WO2019237427A1 - Method, apparatus and system for assisting hearing-impaired people, and augmented reality glasses - Google Patents
- Publication number
- WO2019237427A1 (PCT/CN2018/092812)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound source
- voice
- text
- module
- client
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/017—Head mounted
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/025—Services making use of location information using location based information parameters
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/017—Head mounted
- G02B2027/0178—Eyeglass type
Definitions
- The present invention relates to the field of augmented reality technology, and in particular to a method, device, and system for assisting hearing-impaired people, and to augmented reality glasses.
- Augmented reality (AR) technology calculates the position and angle of an image in real time and superimposes corresponding images, video, and 3D models on that image, thereby fusing the virtual world with the real world.
- An AR client can perform real-time image recognition on the user's offline environment using image recognition material stored locally on the client and, for each offline target identified in the real scene, display the corresponding data with a pre-configured display effect as an augmented overlay.
- Augmented reality technology is in widespread use, but so far it has done little to help hearing-impaired people.
- An object of the present invention is to provide a method, a device, and a system for assisting hearing-impaired people, and augmented reality glasses, which enable a hearing-impaired person to understand what other people say.
- An aspect of the present invention provides a method for assisting a hearing-impaired person, the method comprising: receiving the speech of at least one sound source; recognizing the speech of each sound source in the at least one sound source, so as to convert it into text expressed in a first preset target language; and displaying the text converted from the speech of each sound source in the at least one sound source.
- The method further includes: receiving text; converting the received text into speech expressed in a second preset target language; and playing the converted speech.
- Before receiving the speech of the at least one sound source and/or receiving the text, the method further includes: receiving a setting of the first preset target language and/or the second preset target language.
- The method further comprises: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or client, so that the mobile terminal and/or client obtains the location information in real time.
- Before sending the location information to the mobile terminal and/or client, the method further includes: receiving a setting of a contact, wherein the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact.
- Another aspect of the present invention provides a device for assisting a hearing-impaired person, the device comprising: a voice receiving module for receiving the speech of at least one sound source; a voice recognition module for recognizing the speech of each sound source in the at least one sound source, so as to convert it into text expressed in a first preset target language; and a display module for displaying the text converted from the speech of each sound source in the at least one sound source.
- The device further includes: a text receiving module for receiving text; a text conversion module for converting the received text into speech expressed in a second preset target language; and a voice playback module for playing the converted speech.
- The device further includes a language setting module configured to receive a setting of the first preset target language and/or the second preset target language before the voice receiving module receives the speech of the at least one sound source and/or the text receiving module receives the text.
- The display module is a near-eye display.
- The near-eye display is a see-through near-eye display.
- The device further includes: a positioning module for determining location information of the hearing-impaired person; and a communication module for sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
- The device further includes: a contact setting module configured to receive a setting of a contact before the communication module sends the location information to the mobile terminal and/or the client, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- Another aspect of the present invention provides augmented reality glasses that include the above-mentioned device.
- Another aspect of the present invention provides a system for assisting a hearing-impaired person, the system including the device described above and a client.
- Another aspect of the present invention provides a machine-readable storage medium that stores instructions, the instructions being used to cause a machine to execute the foregoing method.
- The speech of each sound source in the at least one sound source is converted into text, and the converted text of each sound source is displayed.
- The hearing-impaired person can understand what other people say by reading the text, which enables communication and exchange between hearing-impaired people and others.
- FIG. 1 is a flowchart of a method for assisting a hearing-impaired person according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating an example of displaying the text converted from the speech of each sound source of at least one sound source according to another embodiment of the present invention.
- FIG. 3 is an exemplary diagram using arrows to indicate directions according to another embodiment of the present invention.
- FIG. 4 is an exemplary diagram of an orientation provided by another embodiment of the present invention.
- FIG. 5 is an exemplary diagram showing the position of a sound source and the text converted by voice according to another embodiment of the present invention.
- FIG. 6 is an exemplary diagram showing the positions of multiple sound sources and the text converted by voice according to another embodiment of the present invention.
- FIG. 7 is a flowchart of a method for assisting a hearing-impaired person according to another embodiment of the present invention.
- FIG. 8 is a structural block diagram of a device for assisting a hearing impaired person according to another embodiment of the present invention.
- FIG. 1 is a flowchart of a method for assisting a hearing impaired person according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps.
- In step S10, the speech of at least one sound source is received.
- In step S11, the speech of each sound source in the at least one sound source is recognized, so as to convert it into text expressed in a first preset target language.
- Speech recognition technology is used to convert the speech into text.
- In step S12, the text converted from the speech of each sound source in the at least one sound source is displayed.
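Steps S10 to S12 can be pictured as a small pipeline. The following is only a sketch under stated assumptions: the `Utterance` type, the `caption_sources` function, and the `fake_recognize` stand-in are hypothetical names introduced for illustration, and the patent does not name any particular speech recognition engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Utterance:
    source_id: str  # hypothetical label identifying one sound source
    audio: bytes    # audio captured for that sound source


def caption_sources(
    utterances: List[Utterance],
    recognize: Callable[[bytes, str], str],
    target_language: str,
) -> Dict[str, str]:
    """Convert the speech of each sound source into text for display."""
    captions: Dict[str, str] = {}
    for utterance in utterances:  # S10: one received utterance per source
        # S11: recognize the speech and express it in the target language
        captions[utterance.source_id] = recognize(utterance.audio, target_language)
    return captions  # S12: the caller renders these captions on the display


def fake_recognize(audio: bytes, language: str) -> str:
    # Stand-in for a real recognition engine (an assumption, not the patent's method).
    return f"[{language}] {audio.decode('utf-8')}"


captions = caption_sources(
    [Utterance("A", b"hello"), Utterance("B", b"good morning")],
    fake_recognize,
    "en",
)
for source_id, text in captions.items():
    print(f"{source_id}: {text}")
```

Passing the recognizer in as a parameter keeps the sketch independent of whether recognition runs locally or in the cloud.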
- An example of displaying the text converted from the speech of each sound source of at least one sound source is shown in FIG. 2, in which the sound sources are displayed in sequence.
- The sound sources may also be displayed side by side from left to right, or in other arrangements; this is not limited.
- For the text converted from the speech of a given sound source, if the whole text cannot be displayed in one line, the text may wrap onto a new line or be scrolled.
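The two options for long text, wrapping onto new lines or scrolling, could be sketched as follows. The function names and the character-based width are assumptions made for illustration; a real display would measure rendered glyph widths instead.

```python
import textwrap
from typing import List


def wrap_caption(text: str, line_width: int) -> List[str]:
    """Break a caption into lines of at most line_width characters."""
    return textwrap.wrap(text, width=line_width)


def scroll_window(text: str, line_width: int, offset: int) -> str:
    """Return the part of a single-line caption visible at a given scroll offset."""
    return text[offset:offset + line_width]


lines = wrap_caption("Please meet me at the station at nine", 16)
for line in lines:
    print(line)
```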
- The speech of each sound source in the at least one sound source is converted into text, and the converted text of each sound source is displayed.
- The hearing-impaired person can understand what other people say by reading the text, which enables communication and exchange between the hearing-impaired person and others.
- The operating burden on the user is extremely light: he or she can "hear" information beyond his or her ability without operating a technical system at all.
- The method for assisting the hearing impaired provided by the embodiment of the present invention is applicable not only to hearing-impaired persons but also to persons without a hearing impairment.
- The text may be displayed using a preset foreground color and a preset background color, the two being different colors.
- For example, the preset foreground color is white and the preset background color is black, giving white text on a black background; or the preset foreground color is black and the preset background color is white, giving black text on a white background.
- As another example, the preset foreground color is white and the preset background color is green, giving white text on a green background; or the preset foreground color is green and the preset background color is white, giving green text on a white background.
- The text may also be displayed by alternately swapping the preset foreground color and the preset background color, so that the text corresponding to different sound sources is distinguished; here too, the preset foreground color and the preset background color are different colors.
- The two preset colors themselves do not change; only their roles as foreground and background are swapped.
- The preset foreground color and the preset background color can be chosen according to actual conditions.
- For example, the preset foreground color is white and the preset background color is black, giving white text on a black background; or the preset foreground color is black and the preset background color is white, giving black text on a white background.
- As another example, the preset foreground color is white and the preset background color is green, giving white text on a green background; or the preset foreground color is green and the preset background color is white, giving green text on a white background.
- The following uses white as the preset foreground color and green as the preset background color to illustrate how the two colors are alternately swapped to display the text corresponding to different sound sources.
- A certain speech (named the first speech) corresponds to the first sound source, and the text converted from the first speech is displayed as white text on a green background.
- The speech following the first speech (named the second speech) corresponds to a sound source different from the first sound source (named the second sound source), and the text converted from the second speech is displayed as green text on a white background.
- The speech following the second speech (named the third speech) also corresponds to the second sound source, and the text converted from the third speech is still displayed as green text on a white background.
- The speech following the third speech (named the fourth speech) corresponds to a sound source different from the second sound source (named the third sound source).
- The third sound source can be the first sound source or any other sound source, as long as it is not the second sound source.
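The alternation in this example can be sketched as a rule that swaps the two colors whenever the sound source differs from that of the previous speech. This is an illustrative reading, not the patent's stated implementation: the names `SCHEME_A`, `SCHEME_B`, and `assign_schemes` are hypothetical, and the result for the fourth speech (a switch back to white on green) is inferred from the alternation rule rather than stated in the text.

```python
from typing import List, Optional, Tuple

Scheme = Tuple[str, str]  # (foreground color, background color)

SCHEME_A: Scheme = ("white", "green")
SCHEME_B: Scheme = ("green", "white")


def assign_schemes(source_ids: List[str]) -> List[Scheme]:
    """Pick a color scheme per speech, swapping whenever the sound source changes."""
    schemes: List[Scheme] = []
    previous: Optional[str] = None
    current = SCHEME_A
    for source_id in source_ids:
        if previous is not None and source_id != previous:
            # A different source than the previous speech: swap foreground and background.
            current = SCHEME_B if current == SCHEME_A else SCHEME_A
        schemes.append(current)
        previous = source_id
    return schemes


# First, second, third, and fourth speech from the example above.
schemes = assign_schemes(["first", "second", "second", "third"])
for scheme in schemes:
    print(f"{scheme[0]} text on a {scheme[1]} background")
```

Because only a change of source triggers a swap, two colors are enough to separate consecutive speakers regardless of how many sound sources there are.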
- The method for assisting the hearing impaired further includes: determining the position of each sound source in the at least one sound source based on the received speech of each sound source in the at least one sound source.
- The displayed content then includes the position in addition to the text converted from the speech.
- The position of a sound source may be determined based on the time at which the speech emitted by the sound source is received.
- The voice receiving module for receiving the speech of at least one sound source includes a plurality of voice collection modules disposed at different positions, so that the speech from the same sound source arrives at the individual voice collection modules at different times.
- The position of the sound source is determined from the differences between the times at which the speech arrives at the plurality of voice collection modules.
- The voice collection module may be a microphone.
- The voice receiving module may be a microphone array.
- The microphone array may include 2, 4, 6, 7, or 8 microphones.
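As a rough illustration of how an arrival-time difference translates into a direction, the sketch below handles the simplest case of two microphones and a distant source (a far-field assumption that the patent does not state). The function name, parameters, and the speed-of-sound constant are illustrative choices. Two microphones alone cannot distinguish front from back, which is one reason a larger array helps.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def bearing_from_time_difference(
    t_left_s: float, t_right_s: float, mic_spacing_m: float
) -> float:
    """Estimate the bearing of a sound source in degrees from straight ahead.

    Positive values mean the source is toward the right microphone (which
    received the sound first); negative values mean toward the left one.
    """
    path_difference_m = SPEED_OF_SOUND_M_S * (t_left_s - t_right_s)
    # Clamp to the valid asin domain to absorb timing noise.
    ratio = max(-1.0, min(1.0, path_difference_m / mic_spacing_m))
    return math.degrees(math.asin(ratio))


# Hypothetical numbers: microphones 14 cm apart, right one hears the sound 0.2 ms earlier.
bearing = bearing_from_time_difference(0.0102, 0.0100, 0.14)
print(f"source is about {bearing:.1f} degrees to the right")
```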
- The reference point of the orientation may be set according to the actual situation; for example, it may be the position of the voice receiving module. Specifically, it may be the position of any one of the plurality of voice collection modules, or an intermediate position among the plurality of voice collection modules.
- The voice receiving module is worn by the hearing-impaired person or is located close to the hearing-impaired person.
- Using the voice receiving module as the reference point therefore amounts to using the hearing-impaired person as the reference point, so that the hearing-impaired person knows from the orientation where the sound source is relative to himself or herself.
- The orientation may include a direction and/or a distance.
- An arrow may be used to indicate the direction.
- The arrow is located in an area delimited by a circle.
- The starting point of the arrow is the center of the circle.
- The center corresponds to the position of the hearing-impaired person, and the arrow deviates by an angle from the vertical axis passing through the center of the circle.
- The vertical dotted line in the figure is the vertical axis passing through the center of the circle.
- The horizontal axis of the circle is used as a reference, as shown by the horizontal dashed line in FIG. 3.
- When the arrow is located above the horizontal axis, the sound source is in front of the hearing-impaired person; when the arrow is located below the horizontal axis, the sound source is behind the hearing-impaired person. For example, in the direction example shown in FIG. 3, the sound source indicated by the arrow is in front of the hearing-impaired person.
- Indicating the direction of the sound source with an arrow can also be understood as indicating the direction on a clock face.
- The circle represents the dial, and the part of the vertical axis above the horizontal axis indicates the 12 o'clock direction. The approximate clock direction of the sound source is determined from the angle by which the arrow deviates from 12 o'clock.
- The sound source indicated by the arrow is at about 10 o'clock.
- The orientation can be expressed as in the example shown in FIG. 4.
- The position at which the distance is displayed can be set according to the actual situation and is not limited.
- If the orientation includes only the distance, only the distance is displayed.
- Text may also be used to describe the orientation. For example, taking the orientation shown in FIG. 4 as an example, the text "direction is ten o'clock and distance is 50 cm" may be displayed.
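The clock-face reading and the text description of the orientation can be sketched as below. The angle convention (degrees measured clockwise from the 12 o'clock axis) and all function names are assumptions introduced for illustration; the sketch writes the hour as a numeral rather than a word.

```python
import math
from typing import Optional


def clock_direction(angle_deg: float) -> int:
    """Map an arrow angle, measured clockwise from 12 o'clock, to the nearest clock hour."""
    hour = math.floor((angle_deg % 360.0) / 30.0 + 0.5) % 12
    return 12 if hour == 0 else hour


def is_in_front(angle_deg: float) -> bool:
    """True when the arrow points above the horizontal axis (source in front)."""
    return math.cos(math.radians(angle_deg)) > 0.0


def describe_orientation(hour: Optional[int], distance_cm: Optional[float]) -> str:
    """Describe the orientation in words; either part may be absent."""
    parts = []
    if hour is not None:
        parts.append(f"direction is {hour} o'clock")
    if distance_cm is not None:
        parts.append(f"distance is {distance_cm:g} cm")
    return " and ".join(parts)


angle = -60.0  # arrow tilted 60 degrees to the left of 12 o'clock
print(describe_orientation(clock_direction(angle), 50))
```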
- FIG. 5 shows, by way of example only, the positions of the areas in which the orientation and the speech-converted text are displayed.
- The positions of the two display areas can be selected according to the actual situation and are not limited.
- The example shown in FIG. 5 may be used for display.
- The sound sources can be displayed in sequence from top to bottom, as shown in FIG. 6.
- The sound sources may also be displayed side by side from left to right, or in other arrangements; this is not limited.
- The orientation and the text may also be displayed together using the text display manners described above.
- Displaying the orientation at the same time means that the orientation of the sound source is shown alongside its corresponding text.
- The basic principle is the same and is not repeated here.
- FIG. 7 is a flowchart of a method for assisting a hearing-impaired person according to another embodiment of the present invention. The difference from the method shown in FIG. 1 is that the method shown in FIG. 7 further includes the following content.
- In step S73, text is received.
- For example, a connected keyboard enables the hearing-impaired person to enter text through the keyboard.
- Alternatively, the hearing-impaired person can enter text through the interactive interface of a client.
- The client may be a mobile app.
- In step S74, the received text is converted into speech expressed in a second preset target language, for example by using text-to-speech (TTS) technology.
- In step S75, the converted speech is played.
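Steps S73 to S75 can be sketched in the same spirit. The synthesizer and the playback function are passed in as parameters because the patent only says that TTS technology may be used; `speak_text` and `fake_synthesize` are hypothetical names, and the stand-in synthesizer merely encodes the text instead of producing audio.

```python
from typing import Callable, List


def speak_text(
    text: str,
    target_language: str,
    synthesize: Callable[[str, str], bytes],
    play: Callable[[bytes], None],
) -> None:
    """Convert received text into speech in the target language and play it."""
    if not text.strip():  # S73: nothing usable was received
        return
    audio = synthesize(text, target_language)  # S74: text-to-speech conversion
    play(audio)  # S75: play the converted speech


def fake_synthesize(text: str, language: str) -> bytes:
    # Stand-in for a real TTS engine (an assumption for illustration only).
    return f"{language}:{text}".encode("utf-8")


played: List[bytes] = []
speak_text("Where is the exit?", "en", fake_synthesize, played.append)
print(played)
```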
- When the hearing-impaired person cannot speak or has limited ability to speak, he or she can enter text to express his or her meaning and communicate with others.
- Steps S73 to S75 may also be performed before steps S70 to S72; this is not limited.
- Before receiving the speech of the at least one sound source and/or the text, the method further includes: receiving a setting of the first preset target language and/or the second preset target language.
- The "hearing-impaired person" need not be a person with limited hearing: a person who does not understand the language spoken by others is, from a first point of view, equivalent to a hearing-impaired person, and the people who do not understand that person's language are, from a second point of view, equivalent to hearing-impaired persons.
- The method for assisting the hearing impaired may further include the following: determining the location information of the hearing-impaired person; and sending the location information to the mobile terminal and/or the client, so that the mobile terminal and/or the client obtains the location information in real time.
- A contact of the hearing-impaired person can thus obtain the hearing-impaired person's location in real time to confirm that he or she is safe, and can find him or her as soon as possible if something happens.
- The location information of the hearing-impaired person can be determined in real time through GPS positioning technology.
- Before sending the location information to the mobile terminal and/or the client, the method further includes: receiving a setting of a contact, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- The hearing-impaired person's location information can thus be sent directly to contacts who are able to arrive promptly, so that when the hearing-impaired person is in difficulty, a contact can reach his or her location as soon as possible and help solve the problem.
- the correspondence relationship between the hearing-impaired persons and the mobile terminals and / or clients used by them can be set in advance.
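One way to picture sending the location only to selected contacts is a lookup from contact to terminal followed by a send call. Everything named here (`Location`, `share_location`, the endpoint strings, the coordinates) is made up for illustration; the patent does not specify a transport or a data format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float


def share_location(
    location: Location,
    selected_contacts: List[str],
    endpoints: Dict[str, str],
    send: Callable[[str, Location], None],
) -> List[str]:
    """Send the location to the terminal of each selected contact; return who was notified."""
    notified: List[str] = []
    for contact in selected_contacts:
        endpoint = endpoints.get(contact)
        if endpoint is None:
            continue  # no terminal or client registered for this contact
        send(endpoint, location)
        notified.append(contact)
    return notified


sent: List[Tuple[str, Location]] = []
notified = share_location(
    Location(31.2304, 121.4737),  # made-up coordinates
    ["Li", "Wang"],
    {"Li": "terminal-li", "Zhao": "terminal-zhao"},  # preset contact-to-terminal mapping
    lambda endpoint, loc: sent.append((endpoint, loc)),
)
print(notified)
```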
- The method for assisting the hearing impaired may further include the following: recording the position and/or text corresponding to each sound source in the order in which the speech is received, and storing the position and/or text locally or in the cloud, to further help the hearing-impaired person remember and share the conversation afterwards.
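Recording in order of reception could look like the sketch below, which keeps entries in a list and serializes them to JSON. The class and field names are hypothetical, and writing the JSON to a local file or uploading it to the cloud is left to the caller.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    sequence: int                      # order in which the speech was received
    source_id: str                     # hypothetical label for the sound source
    text: Optional[str] = None         # speech-converted text, if recorded
    orientation: Optional[str] = None  # orientation description, if recorded


class ConversationLog:
    def __init__(self) -> None:
        self._entries: List[LogEntry] = []

    def record(
        self,
        source_id: str,
        text: Optional[str] = None,
        orientation: Optional[str] = None,
    ) -> LogEntry:
        """Append an entry; the sequence number follows the order of reception."""
        entry = LogEntry(len(self._entries) + 1, source_id, text, orientation)
        self._entries.append(entry)
        return entry

    def to_json(self) -> str:
        """Serialize the log so it can be stored locally or in the cloud."""
        return json.dumps([asdict(entry) for entry in self._entries], ensure_ascii=False)


log = ConversationLog()
log.record("A", text="hello", orientation="direction is 10 o'clock")
log.record("B", text="good morning")
print(log.to_json())
```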
- FIG. 8 is a structural block diagram of a device for assisting the hearing impaired according to another embodiment of the present invention.
- the device includes a voice receiving module 1, a voice recognition module 2, and a display module 3.
- the voice receiving module 1 is configured to receive voice of at least one sound source.
- the voice recognition module 2 is configured to recognize the voice of each sound source in the at least one sound source, so as to convert the voice of each sound source in the at least one sound source into a text expressed in a first preset target language.
- the display module 3 is configured to display text converted by the voice of each sound source in the at least one sound source.
- The speech of each sound source in the at least one sound source is converted into text, and the converted text of each sound source is displayed.
- The hearing-impaired person can understand what other people say by reading the text, which enables communication and exchange between the hearing-impaired person and others.
- The operating burden on the user is extremely light: he or she can "hear" information beyond his or her ability without operating a technical system at all.
- The method for assisting the hearing impaired provided by the embodiment of the present invention is applicable not only to hearing-impaired persons but also to persons without a hearing impairment.
- The device for assisting the hearing impaired further includes a determining module configured to determine the position of each sound source in the at least one sound source based on the received speech of each sound source in the at least one sound source.
- the display module is further configured to display the position of each sound source in the at least one sound source.
- The device for assisting the hearing impaired further includes: a text receiving module for receiving text; a text conversion module for converting the received text into speech expressed in a second preset target language; and a voice playback module for playing the converted speech.
- The device further includes: a language setting module configured to receive a setting of the first preset target language and/or the second preset target language before the voice receiving module receives the speech of the at least one sound source and/or the text receiving module receives the text.
- the display module may be a near-eye display.
- the distance between the near-eye display and the eyeball may be less than 2 cm.
- the near-eye display may include a see-through near-eye display or a non-see-through near-eye display.
- The display module may be a see-through near-eye display. In this way, without being hindered from observing other things, the hearing-impaired person can read the speech-converted text of each sound source, or the position of each sound source together with the speech-converted text, by watching "subtitles".
- The device further includes: a positioning module for determining the location information of the hearing-impaired person; and a communication module for sending the location information to the mobile terminal and/or the client, so that the mobile terminal and/or the client obtains the location information in real time.
- The device further includes: a contact setting module configured to receive a setting of a contact before the communication module sends the location information to the mobile terminal and/or the client, wherein the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact.
- the device for assisting the hearing impaired further includes a storage module.
- the storage module is used to record the position and text corresponding to each sound source according to the order of receiving voices, to further help the hearing impaired to remember and share afterwards.
- The storage module may record the orientation and text corresponding to each sound source by storing them locally or in the cloud.
- another aspect of the embodiments of the present invention provides a system for assisting the hearing impaired, the system includes: the device described in the above embodiments and a client.
- the client can receive text input by the user; and / or can receive location information of the hearing impaired.
- augmented reality glasses include the devices described in the above embodiments.
- the augmented reality glasses include an electronic circuit system that supports the operation of the device described in the foregoing embodiment.
- the electronic circuit system includes a power supply, a processor, a network connection, and other modules, as well as a voice receiving module, a text receiving module, and a voice playing module.
- the electronic circuit system may further include an externally visible human-machine interface module and buttons and / or a touch control panel.
- the processor includes the determination module, the speech recognition module, and the text conversion module described in the foregoing embodiments.
- the human-machine interface module includes a display module.
- the processor can also perform offline speech recognition locally, or online speech recognition in the cloud via a network connection.
- the touch control panel, buttons, and / or voice receiving module may be provided on the glasses or glasses accessories of the augmented reality glasses, for example, on the temples, frames, or lenses.
- The voice receiving module may be disposed on the frame, on the same temple or on different temples, or at a position close to the ears (both ears or one ear), so as to be as close to the ear as possible.
- When the voice receiving module is a microphone array that includes two microphones, the two microphones may be disposed one on each of the two lens frames, at different positions on the same temple, or one on each of the two temples.
- A plurality of microphones may also be arranged on the frame and/or the temples according to the actual situation.
- The use of a microphone array is of great significance. A microphone array imposes no requirement on the distance between the sound source and the voice receiving module; it can adapt to various distances and meets the needs of most communication scenarios. Here, the distance refers to the distance between the sound source and the microphone array.
- For example, the distance between the sound source and the voice receiving module may be between 50 cm and 1 m; in a multi-person group conversation, it is between 1 m and 2 m; in a conference, it is 3 m; in class, it is 3 m to 5 m; and so on.
- the display module is a near-eye display
- the text converted by each sound source or the position of each sound source and the text converted by the sound are presented in front of the eyes.
- the near-eye display may be transparent or non-transparent.
- the near-eye display is a see-through near-eye display, it does not affect the hearing-impaired person's observation of the display scene, and through the graphical instructions superimposed on the real scene, the hearing-impaired person can see each
- the text converted by a sound source or the position of each sound source and the text converted by a sound source make hearing-impaired people understand the voice information heard while watching "subtitles" or get similar information while understanding the voice information heard.
- the near-eye display may be a monochromatic display, which uses a preset background color and a preset foreground color to display the text or orientation and text corresponding to the sound source.
- the near-eye display can also be a color display. The background color and foreground color are alternately displayed to display the text or orientation and text corresponding to different sound sources.
- for the specific conversion method, refer to the content described in the foregoing embodiment. This fully avoids the difficulties that hearing loss causes the hearing impaired, so that they can focus on the content itself; meanwhile, the hearing impaired can carry on normal real-world communication without the discomfort of being interrupted and without having to shift their focus of attention.
- another aspect of the embodiments of the present invention further provides a machine-readable storage medium, where the machine-readable storage medium stores instructions, and the instructions are used to cause a machine to execute the method described in the foregoing embodiments.
- the sound of each sound source in at least one sound source is converted into text and the converted text of each sound source is displayed.
- the hearing impaired can understand the content of other people's speech by reading the text, thereby achieving communication between hearing-impaired people and others.
- the text input by the hearing impaired is converted into speech and the converted speech is played. In this way, when the hearing impaired has no pronunciation ability or the pronunciation ability is limited, the hearing impaired person can express his meaning by entering text and communicate with others.
- the received speech is converted into text expressed in the language used by the "deemed hearing-impaired person", and/or the text entered by the "deemed hearing-impaired person" is converted into speech expressed in the language used by the others communicating with that person, thus realizing communication between the "deemed hearing-impaired person" and others.
- the program is stored in a storage medium and includes a number of instructions that cause a microcontroller, a chip, or a processor to execute all or part of the steps of the method described in each embodiment of the present application.
- the foregoing storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Optics & Photonics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Embodiments of the present invention relate to the technical field of augmented reality. Provided are a method, apparatus and system for assisting hearing-impaired people, and augmented reality glasses. The method comprises: receiving the voice of at least one sound source; recognizing the voice of each of the at least one sound source, so as to convert the voice of each of the at least one sound source into a text expressed by using a preset target language; and displaying the text obtained by converting the voice of each of the at least one sound source. The apparatus comprises: a voice receiving module, a voice recognition module, and a display module. The system comprises the apparatus and a client. The augmented reality glasses comprise the apparatus. Thus, hearing-impaired people can understand the content of the talk of others, thereby implementing communication between the hearing-impaired people and the others.
Description
The present invention relates to the field of augmented reality technology and, in particular, to a method, apparatus, and system for assisting hearing-impaired people, and to augmented reality glasses.
Augmented reality (AR) is a technology that calculates the position and angle of a camera image in real time and superimposes corresponding images, videos, and 3D models on it, thereby fusing the virtual world with the real world. An AR client can use image recognition material stored locally to perform real-time image recognition on the user's offline environment and, at the position of a recognized offline target in the real scene, display the corresponding presentation data in augmented form according to a preconfigured display effect. As the technology has developed, augmented reality has found wide application, but it has so far done little to help hearing-impaired people.
At present, hearing-impaired people communicate with hearing people mainly in two ways: through a sign language interpreter or by wearing a hearing aid. However, both approaches pose certain problems for hearing-impaired people.
Summary of the Invention
An object of the present invention is to provide a method, apparatus, and system for assisting hearing-impaired people, and augmented reality glasses, which enable hearing-impaired people to understand what others are saying.
To achieve the above object, one aspect of the present invention provides a method for assisting hearing-impaired people, the method comprising: receiving the voice of at least one sound source; recognizing the voice of each of the at least one sound source, so as to convert the voice of each of the at least one sound source into text expressed in a first preset target language; and displaying the text converted from the voice of each of the at least one sound source.
Optionally, the method further comprises: receiving text; converting the received text into voice expressed in a second preset language; and playing the converted voice.
Optionally, before the receiving of the voice of at least one sound source and/or the receiving of text, the method further comprises: receiving a setting of the first preset target language and/or the second preset target language.
Optionally, the method further comprises: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
Optionally, before the location information is sent to the mobile terminal and/or the client, the method further comprises: receiving a setting of a contact, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
Correspondingly, another aspect of the present invention provides an apparatus for assisting hearing-impaired people, the apparatus comprising: a voice receiving module for receiving the voice of at least one sound source; a voice recognition module for recognizing the voice of each of the at least one sound source, so as to convert the voice of each of the at least one sound source into text expressed in a first preset target language; and a display module for displaying the text converted from the voice of each of the at least one sound source.
Optionally, the apparatus further comprises: a text receiving module for receiving text; a text conversion module for converting the received text into voice expressed in a second preset language; and a voice playing module for playing the converted voice.
Optionally, the apparatus further comprises: a language setting module for receiving a setting of the first preset target language and/or the second preset target language before the voice receiving module receives the voice of at least one sound source and/or the text receiving module receives text.
Optionally, the display module is a near-eye display.
Optionally, the near-eye display is a see-through near-eye display.
Optionally, the apparatus further comprises: a positioning module for determining location information of the hearing-impaired person; and a communication module for sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
Optionally, the apparatus further comprises: a contact setting module for receiving a setting of a contact before the communication module sends the location information to the mobile terminal and/or the client, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
In addition, another aspect of the present invention provides augmented reality glasses comprising the above apparatus.
In addition, another aspect of the present invention provides a system for assisting hearing-impaired people, the system comprising the above apparatus and a client.
In addition, another aspect of the present invention provides a machine-readable storage medium storing instructions that cause a machine to execute the above method.
Through the above technical solution, the voice of each of at least one sound source is converted into text and the converted text is displayed, so that hearing-impaired people can understand what others are saying by reading the text, which enables communication between hearing-impaired people and others.
Other features and advantages of the present invention are described in detail in the detailed description that follows.
The drawings provide a further understanding of the embodiments of the present invention and constitute a part of the specification. Together with the detailed description below, they serve to explain the embodiments of the present invention, but do not limit them. In the drawings:
FIG. 1 is a flowchart of a method for assisting hearing-impaired people according to an embodiment of the present invention;
FIG. 2 is an example diagram of displaying the text converted from the voice of each of at least one sound source according to another embodiment of the present invention;
FIG. 3 is an example diagram of using an arrow to indicate direction according to another embodiment of the present invention;
FIG. 4 is an example diagram of an orientation according to another embodiment of the present invention;
FIG. 5 is an example diagram of displaying the orientation of one sound source and the text converted from its voice according to another embodiment of the present invention;
FIG. 6 is an example diagram of displaying the orientations of multiple sound sources and the text converted from their voices according to another embodiment of the present invention;
FIG. 7 is a flowchart of a method for assisting hearing-impaired people according to another embodiment of the present invention; and
FIG. 8 is a structural block diagram of an apparatus for assisting hearing-impaired people according to another embodiment of the present invention.
Description of Reference Signs
1 voice receiving module    2 voice recognition module
3 display module
Specific implementations of the embodiments of the present invention are described in detail below with reference to the drawings. It should be understood that the specific implementations described here serve only to illustrate and explain the embodiments of the present invention and are not intended to limit them.
One aspect of the embodiments of the present invention provides a method for assisting hearing-impaired people. FIG. 1 is a flowchart of a method for assisting hearing-impaired people according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps.
In step S10, the voice of at least one sound source is received.
In step S11, the voice of each of the at least one sound source is recognized, so as to convert the voice of each of the at least one sound source into text expressed in a first preset target language. For example, speech recognition technology is used to convert the voice into text.
In step S12, the text converted from the voice of each of the at least one sound source is displayed. An example of such a display is shown in FIG. 2, where the sound sources are stacked one above another in sequence. The sound sources may also be arranged side by side from left to right, or in any other arrangement; this is not limited. In addition, when the text converted from the voice of a sound source does not fit on one line, it may wrap automatically onto the next line or be displayed by scrolling.
The voice of each of the at least one sound source is converted into text and the converted text is displayed, so that hearing-impaired people can understand what others are saying by reading the text, which enables communication between hearing-impaired people and others. In addition, with the method described in the embodiments of the present invention, the operating burden on the user is extremely light: the user can "hear" information that would otherwise be beyond their ability without having to operate a technical system at all. It should also be noted that the method for assisting hearing-impaired people provided by the embodiments of the present invention is applicable not only to hearing-impaired people but also to people without hearing impairment.
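Steps S10 to S12 can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Utterance` type, the `assist` function, and the `recognize` callable are hypothetical names that do not come from the patent, and the stand-in recognizer simply decodes bytes instead of running real speech recognition.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Utterance:
    source_id: int  # hypothetical identifier of the sound source
    audio: bytes    # audio captured for that source


def assist(utterances: Iterable[Utterance],
           recognize: Callable[[bytes, str], str],
           target_language: str) -> List[str]:
    """Sketch of steps S10-S12: receive voice, convert it to text, build display lines."""
    lines = []
    for utt in utterances:                                 # S10: voice received per source
        text = recognize(utt.audio, target_language)       # S11: voice -> text
        lines.append(f"[source {utt.source_id}] {text}")   # S12: one display line per utterance
    return lines


def fake_recognize(audio: bytes, language: str) -> str:
    """Stand-in for a speech recognition engine (assumption, for demonstration only)."""
    return audio.decode("utf-8")


if __name__ == "__main__":
    demo = [Utterance(1, b"hello"), Utterance(2, b"nice to meet you")]
    for line in assist(demo, fake_recognize, "en"):
        print(line)
```

Passing the recognizer in as a callable keeps the sketch independent of any particular speech recognition engine.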
Optionally, in the embodiments of the present invention, text can be displayed in many ways. For example, the text may be displayed using a preset foreground color and a preset background color, where the two are different colors. For instance, with a white foreground and a black background, white text is displayed on a black background; with a black foreground and a white background, black text is displayed on a white background. Likewise, with a white foreground and a green background, white text is displayed on a green background; with a green foreground and a white background, green text is displayed on a white background. In this way, the user can distinguish the text more clearly.
As another example, the text corresponding to different sound sources may be displayed by alternately swapping the preset foreground color and the preset background color. That is, the preset foreground color and the preset background color are different colors and, following the order in which voices are received, the two colors are swapped when adjacent received voices correspond to different sound sources and remain unchanged when adjacent received voices correspond to the same sound source. The preset foreground and background colors can be chosen according to the actual situation, for example white on black, black on white, white on green, or green on white, as above.
The following illustrates the alternating display with white as the preset foreground color and green as the preset background color. Suppose a voice (called the first voice; the name is only for ease of description and is not limiting) corresponds to a first sound source; its converted text is displayed in white on green. If, in order of reception, the next voice (the second voice) corresponds to a sound source different from the first sound source (call it the second sound source), its converted text is displayed in green on white. If the voice after that (the third voice) also corresponds to the second sound source, its converted text is still displayed in green on white. If the voice after that (the fourth voice) corresponds to a sound source different from the second sound source (call it the third sound source, which may be the first sound source or any other sound source other than the second), its converted text is displayed in white on green. This continues until the information corresponding to all received voices has been displayed, where the information corresponding to a voice includes the text converted from it.
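The alternating color rule can be expressed in a few lines. The sketch below is illustrative only; the function name `assign_colors` and the representation of a color scheme as a `(foreground, background)` tuple are assumptions, not part of the patent.

```python
from typing import Iterable, List, Optional, Tuple

Scheme = Tuple[str, str]  # (foreground, background); representation is an assumption


def assign_colors(source_ids: Iterable[int],
                  first: Scheme = ("white", "green")) -> List[Scheme]:
    """Swap foreground and background whenever adjacent voices come from different sources."""
    schemes: List[Scheme] = []
    current = first
    previous: Optional[int] = None
    for sid in source_ids:
        if previous is not None and sid != previous:
            current = (current[1], current[0])  # speaker changed: swap the two colors
        schemes.append(current)                 # same speaker: colors stay unchanged
        previous = sid
    return schemes


if __name__ == "__main__":
    # First, second, second, then a source other than the second (here the first again).
    for scheme in assign_colors([1, 2, 2, 1]):
        print(scheme)
```

For the four voices in the walk-through above, the result is white on green, green on white, green on white, and white on green.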
Optionally, in the embodiments of the present invention, the method for assisting hearing-impaired people further comprises: determining the orientation of each of the at least one sound source based on the received voice of each sound source. For each of the at least one sound source, the displayed content then includes the orientation in addition to the text converted from the voice.
The orientation of a sound source may be determined based on the time at which the voice emitted by the sound source is received. For example, the voice receiving module used to receive the voice of at least one sound source includes multiple voice collection modules arranged at different positions, so that the multiple voice collection modules receive the voice emitted by the same sound source at different times. For each of the at least one sound source, the orientation of the sound source is determined from the differences between the times at which the voice reaches the multiple voice collection modules. Optionally, in the embodiments of the present invention, a voice collection module may be a microphone and the voice receiving module may be a microphone array. For example, the microphone array may include 2, 4, 6, 7, or 8 microphones.
Optionally, in the embodiments of the present invention, the reference point of the orientation can be set according to the actual situation; for example, it may be the position of the voice receiving module. Specifically, it may be any one of the multiple voice collection modules, or the midpoint of the multiple voice collection modules. In addition, when the voice receiving module is worn by the hearing-impaired person or is close to the hearing-impaired person, using the voice receiving module as the reference point in effect uses the hearing-impaired person as the reference point, so the hearing-impaired person can learn from the determined orientation where the sound source is relative to themself.
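As an illustration of how an arrival-time difference can yield a direction, the sketch below uses the common far-field model for a two-microphone pair, theta = arcsin(c * dt / d), where c is the speed of sound, dt the time difference, and d the microphone spacing. This model, the constant, and the function name are assumptions for illustration; the patent does not specify a particular localization algorithm.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at about 20 °C (assumption)


def direction_from_tdoa(delta_t: float, mic_spacing: float) -> float:
    """Return the source angle in degrees from the array's broadside direction.

    delta_t is the arrival-time difference between the two microphones in seconds;
    mic_spacing is the distance between them in meters. The sign of the result
    follows the sign of delta_t.
    """
    ratio = SPEED_OF_SOUND * delta_t / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # clamp: noise can push the ratio slightly past +/-1
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    spacing = 0.14  # hypothetical spacing between two temple-mounted microphones, in meters
    print(direction_from_tdoa(0.0, spacing))     # equal arrival times: source straight ahead
    print(direction_from_tdoa(0.0002, spacing))  # source off to one side
```

A single pair cannot tell front from back; arrays with more microphones, such as those mentioned above, resolve that ambiguity and can also support distance estimation.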
Optionally, in the embodiments of the present invention, the orientation may include a direction and/or a distance. Optionally, an arrow may be used to indicate the direction. The arrow lies within the area bounded by a circle and starts at the center of the circle, which corresponds to the location of the hearing-impaired person; the arrow deviates by some angle from the vertical axis through the circle, as shown in FIG. 3, where the vertical dashed line is the vertical axis through the circle. Taking the horizontal axis of the circle as the reference (the horizontal dashed line in FIG. 3), an arrow above the horizontal axis indicates that the sound source is in front of the hearing-impaired person, and an arrow below the horizontal axis indicates that the sound source is behind the hearing-impaired person. In the example of FIG. 3, the sound source indicated by the arrow is in front of the hearing-impaired person.
This way of indicating the direction of a sound source with an arrow can also be read as a clock face. The circle represents the dial, the part of the vertical axis above the horizontal axis indicates the 12 o'clock direction, and the angle by which the arrow deviates from 12 o'clock indicates roughly which o'clock direction the sound source lies in. In the example of FIG. 3, the sound source indicated by the arrow is at roughly 10 o'clock. When the orientation includes both a direction and a distance, it can be shown as in the example of FIG. 4. It should be noted that the position at which the distance is displayed can be set according to the actual situation and is not limited. When the orientation includes only a distance, only the distance may be displayed. In particular, when the received voice indicates that the sound source is the hearing-impaired person themself and arrows are used to indicate direction, "O" or "●" is displayed at the center of the circle to indicate the direction of the sound source. In addition, in the embodiments of the present invention, the orientation may also be described in text; for the orientation shown in FIG. 4, for example, the text "direction is ten o'clock, distance is 50 cm" may be displayed.
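The clock-face reading can be sketched as a conversion from an angle to an hour. The sketch assumes the angle is measured clockwise from the 12 o'clock direction (straight ahead), so that 10 o'clock corresponds to -60 or 300 degrees; the function names and the "side" label for an arrow lying exactly on the horizontal axis are assumptions, since the text does not cover that case.

```python
def clock_direction(angle_deg: float) -> int:
    """Map an angle measured clockwise from 12 o'clock to the nearest clock hour (1-12)."""
    hour = round((angle_deg % 360) / 30) % 12  # each hour spans 30 degrees
    return 12 if hour == 0 else hour


def front_or_behind(angle_deg: float) -> str:
    """Arrow above the horizontal axis means front, below means behind."""
    a = angle_deg % 360
    if a < 90 or a > 270:
        return "front"
    if 90 < a < 270:
        return "behind"
    return "side"  # exactly on the horizontal axis; label is an assumption


def describe(angle_deg: float, distance_cm: int) -> str:
    """Text form of an orientation that includes both direction and distance."""
    return f"direction is {clock_direction(angle_deg)} o'clock, distance is {distance_cm} cm"


if __name__ == "__main__":
    print(describe(-60, 50))
    print(front_or_behind(-60))
```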
When the displayed content includes both the orientation and the text converted from the voice, an example of the content displayed for each sound source is shown in FIG. 5. It should be noted that FIG. 5 shows the positions of the areas displaying the orientation and the converted text only by way of example; the positions of the two display areas can be chosen according to the actual situation and are not limited. The orientation and converted text of each of the at least one sound source can be displayed in the manner shown in FIG. 5. When the orientations and converted text of multiple sound sources are displayed, the sound sources may be stacked one above another in sequence, as shown in FIG. 6. They may also be arranged side by side from left to right, or in any other arrangement; this is not limited.
Optionally, in the embodiments of the present invention, the orientation and the text can be displayed together using the text display methods described above. Compared with displaying only the text, displaying the orientation as well simply means attaching the orientation of the sound source to the corresponding text; the basic principle is the same and is not repeated here.
FIG. 7 is a flowchart of a method for assisting hearing-impaired people according to another embodiment of the present invention. The method shown in FIG. 7 differs from the method shown in FIG. 1 in that it further includes the following.
In step S73, text is received. There are many ways for the hearing-impaired person to input text. For example, a keyboard may be connected so that the hearing-impaired person inputs text through the keyboard. As another example, an interactive interface may be connected so that the hearing-impaired person inputs text through the interface. A client may also be connected so that the hearing-impaired person inputs text through the client. Optionally, the client may be a mobile phone app.
In step S74, the received text is converted into voice expressed in a second preset language, for example by using text-to-speech (TTS) technology.
In step S75, the converted voice is played.
In this way, when the hearing-impaired person cannot speak or has limited ability to speak, they can express what they mean by inputting text and thus communicate with others.
It should be noted that steps S73 to S75 may also be performed before steps S70 to S72; this is not limited.
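Steps S73 to S75 can be sketched in the same style as the recognition path. The names `speak_for_user`, `synthesize`, and `play` are hypothetical, and the stand-ins below replace a real TTS engine and audio output for demonstration only.

```python
from typing import Callable, List


def speak_for_user(text: str,
                   synthesize: Callable[[str, str], bytes],
                   play: Callable[[bytes], None],
                   language: str) -> None:
    """Sketch of steps S73-S75: take entered text, convert it to voice, play it."""
    audio = synthesize(text, language)  # S74: text -> voice in the second preset language
    play(audio)                         # S75: play the converted voice


def fake_synthesize(text: str, language: str) -> bytes:
    """Stand-in for a TTS engine (assumption): tags the text with the language code."""
    return f"{language}:{text}".encode("utf-8")


if __name__ == "__main__":
    played: List[bytes] = []
    # S73: the text would come from a keyboard, an interactive interface, or a client app.
    speak_for_user("Where is the station?", fake_synthesize, played.append, "en")
    print(played)
```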
Optionally, in the embodiments of the present invention, before the voice of at least one sound source is received and/or text is received, the method further comprises: receiving a setting of the first preset target language and/or the second preset target language. In this embodiment, the "hearing-impaired person" need not be someone whose hearing is actually limited: it may be a "first deemed hearing-impaired person", who does not understand the language of the other people communicating with them, or a "second deemed hearing-impaired person", whose language is not understood by the other people communicating with them. The first preset target language is set to the language used by the "first deemed hearing-impaired person"; the received voice of at least one sound source is converted into text expressed in the first preset target language, and the "first deemed hearing-impaired person" understands what the others are saying by reading the converted text. The second preset target language is set to the language used by the others communicating with the "second deemed hearing-impaired person"; the text entered by the "second deemed hearing-impaired person" is converted into voice expressed in the second preset target language, and the others understand what the "second deemed hearing-impaired person" means by listening to the voice. In this way, communication between a "deemed hearing-impaired person" and others is achieved.
Optionally, in this embodiment of the present invention, the method for assisting a hearing-impaired person may further include: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time. In this way, contacts associated with the hearing-impaired person can obtain his or her location in real time, confirm whether he or she is safe, and find him or her as quickly as possible if a problem arises. In this embodiment, the location information of the hearing-impaired person may be determined in real time by GPS positioning.
Optionally, in this embodiment of the present invention, before the location information is sent to the mobile terminal and/or the client, the method further includes: receiving a setting of contacts, where the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact. A hearing-impaired person may have many associated contacts, and depending on the circumstances only some of them can reach the hearing-impaired person promptly when he or she is in difficulty. The location information can therefore be sent directly to the contacts who are able to arrive in time, so that they reach the hearing-impaired person's location as soon as possible and help resolve the difficulty. In addition, the correspondence between each contact associated with the hearing-impaired person and the mobile terminal and/or client used by that contact may be set in advance.
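The selection of recipients described above can be sketched as follows. The `Contact` class, the `endpoints_for` function, and the endpoint strings are hypothetical and serve only to illustrate a preset correspondence between contacts and their mobile terminals and/or clients; the embodiment does not prescribe a data format or transport.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Contact:
    name: str
    # Hypothetical identifiers of the contact's mobile terminal and/or client.
    endpoints: List[str]


def endpoints_for(address_book: Dict[str, Contact], selected: List[str]) -> List[str]:
    """Return the endpoints of the selected contacts, in selection order.

    Names that are not in the preset address book are skipped.
    """
    result: List[str] = []
    for name in selected:
        contact = address_book.get(name)
        if contact is not None:
            result.extend(contact.endpoints)
    return result
```

The location information would then be sent only to the returned endpoints rather than to every associated contact.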
In addition, in this embodiment of the present invention, the method for assisting a hearing-impaired person may further include: recording, in the order in which the voices are received, the direction and/or text corresponding to each sound source, and storing the direction and/or text corresponding to each sound source locally or in the cloud, to further aid the hearing-impaired person's memory and allow later sharing.
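A minimal sketch of such a record is shown below. The `Entry` and `TranscriptLog` names, the field layout, and the JSON serialization are assumptions made for illustration; the embodiment states only that direction and/or text are recorded in order of reception and stored locally or in the cloud.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Entry:
    seq: int                      # position in order of reception
    source_id: str                # hypothetical label for the sound source
    bearing_deg: Optional[float]  # direction, if it was determined
    text: str                     # converted text


class TranscriptLog:
    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def record(self, source_id: str, text: str,
               bearing_deg: Optional[float] = None) -> Entry:
        """Append one utterance; the sequence number reflects reception order."""
        entry = Entry(len(self._entries), source_id, bearing_deg, text)
        self._entries.append(entry)
        return entry

    def to_json(self) -> str:
        """Serialize the log, e.g. for writing to a local file or uploading."""
        return json.dumps([asdict(e) for e in self._entries], ensure_ascii=False)
```

The serialized string can be written to local storage or passed to whatever cloud upload mechanism the device uses.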
Correspondingly, another aspect of the embodiments of the present invention provides a device for assisting a hearing-impaired person. FIG. 8 shows a device for assisting a hearing-impaired person according to another embodiment of the present invention. As shown in FIG. 8, the device includes a voice receiving module 1, a voice recognition module 2, and a display module 3. The voice receiving module 1 is configured to receive the voice of at least one sound source. The voice recognition module 2 is configured to recognize the voice of each of the at least one sound source, so as to convert the voice of each sound source into text expressed in a first preset target language. The display module 3 is configured to display the text converted from the voice of each of the at least one sound source.
The voice of each of the at least one sound source is converted into text, and the text converted from the voice of each sound source is displayed, so that the hearing-impaired person can understand what others are saying by reading, which achieves communication between the hearing-impaired person and others. In addition, with the method described in the embodiments of the present invention, the operating burden on the user is minimal: without operating the technical system at all, the user can "hear" information that would otherwise be beyond his or her ability. It should also be noted that the method for assisting a hearing-impaired person provided by the embodiments of the present invention is applicable not only to hearing-impaired persons but also to persons with ordinary hearing.
Optionally, in this embodiment of the present invention, the device for assisting a hearing-impaired person further includes a determination module, configured to determine the direction of each of the at least one sound source based on the received voice of each sound source. The display module is further configured to display the direction of each of the at least one sound source.
Optionally, in this embodiment of the present invention, the device for assisting a hearing-impaired person further includes: a text receiving module, configured to receive text; a text conversion module, configured to convert the received text into speech expressed in a second preset language; and a voice playback module, configured to play the converted speech.
Optionally, in this embodiment of the present invention, the device further includes: a language setting module, configured to receive a setting of the first preset target language and/or the second preset target language before the voice receiving module receives the voice of the at least one sound source and/or the text receiving module receives the text.
Optionally, in this embodiment of the present invention, the display module may be a near-eye display. The distance between the near-eye display and the eyeball may be less than 2 cm. The near-eye display may be a see-through near-eye display or a non-see-through near-eye display. In this way, the text converted from the voice of each sound source, or the direction of each sound source together with that text, is presented in front of the eyes. Preferably, in this embodiment of the present invention, the display module is a see-through near-eye display. The hearing-impaired person can then read the "subtitles" to follow the text converted from the voice of each sound source, or the direction of each sound source together with that text, without being hindered from observing other things.
Optionally, in this embodiment of the present invention, the device further includes: a positioning module, configured to determine location information of the hearing-impaired person; and a communication module, configured to send the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
Optionally, in this embodiment of the present invention, the device further includes: a contact setting module, configured to receive a setting of contacts before the communication module sends the location information to the mobile terminal and/or the client, where the mobile terminal and/or client is the mobile terminal and/or client corresponding to the selected contact.
In addition, in this embodiment of the present invention, the device for assisting a hearing-impaired person further includes a storage module. The storage module is configured to record, in the order in which the voices are received, the direction and text corresponding to each sound source, to further aid the hearing-impaired person's memory and allow later sharing. The storage module may record the direction and text corresponding to each sound source by storing them locally or in the cloud.
The specific working principle and benefits of the device for assisting a hearing-impaired person provided by the embodiments of the present invention are similar to those of the method for assisting a hearing-impaired person provided by the embodiments of the present invention, and are not repeated here.
In addition, another aspect of the embodiments of the present invention provides a system for assisting a hearing-impaired person. The system includes the device described in the foregoing embodiments and a client. The client can receive text entered by a user and/or can receive the location information of the hearing-impaired person.
In addition, another aspect of the embodiments of the present invention provides augmented reality glasses. The augmented reality glasses include the device described in the foregoing embodiments.
The augmented reality glasses include an electronic circuit system that supports operation of the device described in the foregoing embodiments. The electronic circuit system includes modules such as a power supply, a processor, and a network connection, as well as the voice receiving module, the text receiving module, and the voice playback module. The electronic circuit system may further include an externally visible human-machine interface module, as well as buttons and/or a touch control panel. The processor includes the determination module, the voice recognition module, and the text conversion module described in the foregoing embodiments. The human-machine interface module includes the display module. The processor may perform offline speech recognition locally, or may perform online speech recognition in the cloud via the network connection.
Optionally, in this embodiment of the present invention, the touch control panel, the buttons, and/or the voice receiving module may be arranged on the glasses or on an accessory of the augmented reality glasses, for example on a temple, the frame, or a lens. Optionally, the voice receiving module may be arranged on the frame, on the same temple, or on different temples, or at a position close to the ears (both ears or one ear), so as to approximate the position of the ears as closely as possible. For example, where the voice receiving module is a microphone array that includes two microphones, the two microphones may be arranged one on each of the two lens rims, at different positions on the same temple, or one on each of the two temples. When the microphone array includes more than two microphones, the microphones may likewise be distributed over the frame and/or the temples according to the actual situation. When a microphone array is used, the voice reaches each microphone in the array at a different time and with a different intensity; by computing on these differences, a clearer sound that is easier to process can be obtained. Compared with a single microphone or a noise-reduction microphone, a microphone array has the significant advantage that it places no requirement on the distance between the sound source and the voice receiving module. A microphone array can adapt to various distances and can meet the requirements of most communication scenarios, where the distance refers to the distance between the sound source and the microphone array. For example, it can meet the requirements of the following scenarios: a one-to-one conversation, in which the sound source is between 50 cm and 1 m from the voice receiving module; a small-group conversation, in which the sound source is between 1 m and 2 m from the voice receiving module; a meeting, in which the sound source is 3 m from the voice receiving module; a class, in which the sound source is 3 m to 5 m from the voice receiving module; and so on.
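The arrival-time difference mentioned above can also be used to estimate the direction of a sound source. The following sketch applies the standard far-field relation for a two-microphone array, sin θ = c·Δt / d. It is offered as an illustration under stated assumptions, not as the algorithm of the embodiment, which does not specify one: the function name, the speed-of-sound constant, and the example spacing are all assumed here.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air near 20 °C


def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate the far-field angle of arrival, in degrees from broadside.

    delay_s is the arrival time at microphone B minus the arrival time at
    microphone A. A positive angle means the source lies toward microphone A.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

For two microphones placed 0.14 m apart (a hypothetical spacing roughly matching the width of a glasses frame), a delay of half the maximum possible delay corresponds to a source about 30 degrees off the forward direction. The far-field assumption holds better at the meeting and classroom distances listed above than in a close one-to-one conversation.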
In addition, where the display module is a near-eye display, the text converted from the voice of each sound source, or the direction of each sound source together with that text, is presented in front of the eyes. The near-eye display may be see-through or non-see-through. Further, where the near-eye display is a see-through near-eye display, graphical indications superimposed on the real scene allow the hearing-impaired person to see, in real time and without being hindered from observing the real scene, the text converted from the voice of each sound source, or the direction of each sound source together with that text. The hearing-impaired person can thus read the "subtitles" to understand the speech being received, or, while understanding that speech, also gain a sense of position similar to that of a person with ordinary hearing. To avoid distracting the hearing-impaired person, the near-eye display may be a monochrome display that uses a preset background color and a preset foreground color to display the text, or the direction and text, corresponding to a sound source. Alternatively, the near-eye display may be a color display that alternates background and foreground colors to display the text, or the direction and text, corresponding to different sound sources; for the specific manner of alternation, refer to the foregoing embodiments. This likewise avoids distraction, so that the hearing-impaired person can concentrate on the content itself, and it allows the hearing-impaired person to carry on normal face-to-face communication without the discomfort of being interrupted or having to shift the focus of attention.
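One simple way to alternate colors between sound sources is sketched below. The palette, the `SourceColorizer` class, and the rule of assigning colors in order of first appearance are assumptions made for illustration; the specific manner of alternation is described in the earlier embodiments and may differ.

```python
from typing import Dict, List, Optional, Tuple

ColorPair = Tuple[str, str]  # (foreground, background)

# Hypothetical two-entry palette; a real display would use its own colors.
DEFAULT_PALETTE: List[ColorPair] = [("white", "black"), ("black", "white")]


class SourceColorizer:
    """Give each sound source a stable color pair, cycling through a palette."""

    def __init__(self, palette: Optional[List[ColorPair]] = None) -> None:
        self._palette = list(palette) if palette else list(DEFAULT_PALETTE)
        self._assigned: Dict[str, ColorPair] = {}

    def colors_for(self, source_id: str) -> ColorPair:
        # A new source takes the next palette entry; a known source keeps its pair.
        if source_id not in self._assigned:
            index = len(self._assigned) % len(self._palette)
            self._assigned[source_id] = self._palette[index]
        return self._assigned[source_id]
```

Keeping the pair stable for each source lets the wearer associate a color with a speaker without reading any label.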
In addition, another aspect of the embodiments of the present invention provides a machine-readable storage medium storing instructions that cause a machine to execute the method described in the foregoing embodiments.
In summary, the voice of each of the at least one sound source is converted into text, and the text converted from the voice of each sound source is displayed, so that the hearing-impaired person can understand what others are saying by reading, which achieves communication between the hearing-impaired person and others. Text entered by the hearing-impaired person is converted into speech and the converted speech is played, so that a hearing-impaired person who cannot speak, or whose ability to speak is limited, can express his or her meaning by entering text and communicate with others. Furthermore, received speech is converted into text expressed in the language used by a deemed hearing-impaired person, and/or text entered by a deemed hearing-impaired person is converted into speech expressed in the language used by the people communicating with him or her, which achieves communication between the deemed hearing-impaired person and others.
Optional implementations of the embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present invention are not limited to the specific details of the foregoing implementations. Within the scope of the technical concept of the embodiments of the present invention, various simple variations may be made to the technical solutions, and all such simple variations fall within the protection scope of the embodiments of the present invention.
It should also be noted that the specific technical features described in the foregoing implementations may be combined in any suitable manner, provided that no contradiction arises. To avoid unnecessary repetition, the possible combinations are not described separately.
Those skilled in the art will understand that all or some of the steps of the methods in the foregoing embodiments may be carried out by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions that cause a single-chip microcomputer, a chip, or a processor to execute all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
In addition, the different implementations of the embodiments of the present invention may be combined with one another in any manner; as long as a combination does not depart from the idea of the embodiments of the present invention, it should likewise be regarded as content disclosed by the embodiments of the present invention.
Claims (15)
- 1. A method for assisting a hearing-impaired person, characterized in that the method comprises: receiving the voice of at least one sound source; recognizing the voice of each of the at least one sound source, so as to convert the voice of each of the at least one sound source into text expressed in a first preset target language; and displaying the text converted from the voice of each of the at least one sound source.
- 2. The method according to claim 1, characterized in that the method further comprises: receiving text; converting the received text into speech expressed in a second preset language; and playing the converted speech.
- 3. The method according to claim 2, characterized in that, before the receiving of the voice of at least one sound source and/or the receiving of text, the method further comprises: receiving a setting of the first preset target language and/or the second preset target language.
- 4. The method according to any one of claims 1-3, characterized in that the method further comprises: determining location information of the hearing-impaired person; and sending the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
- 5. The method according to claim 4, characterized in that, before the location information is sent to the mobile terminal and/or the client, the method further comprises: receiving a setting of contacts, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- 6. A device for assisting a hearing-impaired person, characterized in that the device comprises: a voice receiving module, configured to receive the voice of at least one sound source; a voice recognition module, configured to recognize the voice of each of the at least one sound source, so as to convert the voice of each of the at least one sound source into text expressed in a first preset target language; and a display module, configured to display the text converted from the voice of each of the at least one sound source.
- 7. The device according to claim 6, characterized in that the device further comprises: a text receiving module, configured to receive text; a text conversion module, configured to convert the received text into speech expressed in a second preset language; and a voice playback module, configured to play the converted speech.
- 8. The device according to claim 7, characterized in that the device further comprises: a language setting module, configured to receive a setting of the first preset target language and/or the second preset target language before the voice receiving module receives the voice of at least one sound source and/or the text receiving module receives the text.
- 9. The device according to any one of claims 6-8, characterized in that the display module is a near-eye display.
- 10. The device according to claim 9, characterized in that the near-eye display is a see-through near-eye display.
- 11. The device according to claim 6, characterized in that the device further comprises: a positioning module, configured to determine location information of the hearing-impaired person; and a communication module, configured to send the location information to a mobile terminal and/or a client, so that the mobile terminal and/or the client obtains the location information in real time.
- 12. The device according to claim 11, characterized in that the device further comprises: a contact setting module, configured to receive a setting of contacts before the communication module sends the location information to the mobile terminal and/or the client, wherein the mobile terminal and/or the client is the mobile terminal and/or client corresponding to the selected contact.
- 13. Augmented reality glasses, characterized in that the augmented reality glasses comprise the device according to any one of claims 6-12.
- 14. A system for assisting a hearing-impaired person, characterized in that the system comprises: the device according to any one of claims 6-12; and a client.
- 15. A machine-readable storage medium storing instructions that cause a machine to execute the method according to any one of claims 1-5.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810597336.9A CN108962254A (en) | 2018-06-11 | 2018-06-11 | For assisting the methods, devices and systems and augmented reality glasses of hearing-impaired people |
CN201810597336.9 | 2018-06-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019237427A1 true WO2019237427A1 (en) | 2019-12-19 |
Family
ID=64488163
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/092812 WO2019237427A1 (en) | 2018-06-11 | 2018-06-26 | Method, apparatus and system for assisting hearing-impaired people, and augmented reality glasses |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108962254A (en) |
WO (1) | WO2019237427A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111768787A (en) * | 2020-06-24 | 2020-10-13 | 中国人民解放军海军航空大学 | Multifunctional auxiliary audio-visual method and system |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109616122A (en) * | 2018-12-25 | 2019-04-12 | 王让利 | A kind of visualization hearing aid |
CN110146988A (en) * | 2019-05-15 | 2019-08-20 | 东北大学 | A kind of wear-type augmented reality glasses system and its implementation |
CN111128180A (en) * | 2019-11-22 | 2020-05-08 | 北京理工大学 | Auxiliary dialogue system for hearing-impaired people |
CN112185415A (en) * | 2020-09-10 | 2021-01-05 | 珠海格力电器股份有限公司 | Sound visualization method and device, storage medium and MR mixed reality equipment |
CN114550430A (en) * | 2022-04-27 | 2022-05-27 | 北京亮亮视野科技有限公司 | Character reminding method and device based on AR technology |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103310683A (en) * | 2013-05-06 | 2013-09-18 | 深圳先进技术研究院 | Intelligent glasses and voice communication system and method basing on intelligent glasses |
CN103869471A (en) * | 2014-01-09 | 2014-06-18 | 盈诺飞微电子(上海)有限公司 | Head voice recognition projector and system |
CN105824137A (en) * | 2016-05-25 | 2016-08-03 | 北京联合大学 | Visualized intelligent glasses |
CN106205293A (en) * | 2016-09-30 | 2016-12-07 | 广州音书科技有限公司 | For speech recognition and the intelligent glasses of Sign Language Recognition |
CN206178272U (en) * | 2016-10-12 | 2017-05-17 | 语联网(武汉)信息技术有限公司 | External multi -lingual smart machine of glasses |
CN107071603A (en) * | 2017-06-30 | 2017-08-18 | 广州音书科技有限公司 | A kind of microphone and system for Real-time speech recognition |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105554662A (en) * | 2015-06-30 | 2016-05-04 | 宇龙计算机通信科技(深圳)有限公司 | Hearing-aid glasses and hearing-aid method |
-
2018
- 2018-06-11 CN CN201810597336.9A patent/CN108962254A/en active Pending
- 2018-06-26 WO PCT/CN2018/092812 patent/WO2019237427A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN108962254A (en) | 2018-12-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18922160 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19/03/2021) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18922160 Country of ref document: EP Kind code of ref document: A1 |