CN109032545A

CN109032545A - Method and apparatus for providing sound source information, and augmented reality glasses

Info

Publication number: CN109032545A
Application number: CN201810596747.6A
Authority: CN
Inventors: 张志扬; 苏进; 苏卓然; 李琦; 杨莉
Original assignee: Beijing Jia Er Medical Technology Co Ltd
Current assignee: Beijing Jia Er Medical Technology Co Ltd
Priority date: 2018-06-11
Filing date: 2018-06-11
Publication date: 2018-12-18
Also published as: WO2019237428A1

Abstract

Embodiments of the present invention provide a method and an apparatus for providing sound source information, and augmented reality glasses, and belong to the field of augmented reality. The method comprises: receiving speech from at least one sound source; determining, based on the received speech of each of the at least one sound source, the orientation of each sound source, wherein the orientation takes as its reference point the speech receiving module that receives the speech of the at least one sound source; recognizing the speech of each sound source and converting it into text; and displaying the orientation of each sound source together with the text converted from its speech. In this way, a hearing-impaired person can clearly understand the content of the speech uttered by a sound source while also learning, clearly and intuitively, the orientation of that sound source, and can therefore better grasp the content of the speech.

Description

Method and apparatus for providing sound source information, and augmented reality glasses

Technical field

The present invention relates to the field of augmented reality, and in particular to a method and an apparatus for providing sound source information, and to augmented reality glasses.

Background art

Augmented reality (AR) is a technology that calculates the position and angle of an image in real time and superimposes corresponding images, video, or 3D models on it, thereby merging the virtual world with the real world. An AR client can use image recognition material stored locally to perform real-time image recognition on the user's offline environment and, according to a preconfigured display effect, display the corresponding presentation data in enhanced form at the position of each recognized offline target in the real scene. As the technology has developed, augmented reality has found very wide application, but it has so far done little to help hearing-impaired people.

At present, hearing-impaired people communicate with hearing people mainly in two ways: through sign language interpreters or by wearing hearing aids. Neither approach, however, lets a hearing-impaired person readily tell where a sound source is located, which makes it harder to grasp the content of the speech the sound source utters.

Summary of the invention

An object of the present invention is to provide a method and an apparatus for providing sound source information, and augmented reality glasses, which enable a hearing-impaired person to understand the content of the speech uttered by a sound source while also knowing the orientation of that sound source.

To achieve the above object, one aspect of the present invention provides a method for providing sound source information, the method comprising: receiving speech from at least one sound source; determining, based on the received speech of each of the at least one sound source, the orientation of each of the at least one sound source, wherein the orientation takes as its reference point the speech receiving module that receives the speech of the at least one sound source; recognizing the speech of each of the at least one sound source and converting it into text; and displaying the orientation of each of the at least one sound source and the text converted from its speech.

Optionally, the orientation includes direction and/or distance.

Optionally, the direction is represented by an arrow, the arrow is located within the region bounded by a circle, and the arrow deviates by an angle from the longitudinal axis passing through the circle.

Correspondingly, another aspect of the present invention provides an apparatus for providing sound source information, the apparatus comprising: a speech receiving module for receiving speech from at least one sound source; a determining module for determining, based on the received speech of each of the at least one sound source, the orientation of each of the at least one sound source, wherein the orientation takes the speech receiving module as its reference point; a speech recognition module for recognizing the speech of each of the at least one sound source so as to convert it into text; and a display module for displaying the orientation of each of the at least one sound source and the text converted from its speech.

Optionally, the orientation includes direction and/or distance.

Optionally, the display module is a near-eye display.

Optionally, the near-eye display is a see-through near-eye display.

In addition, another aspect of the present invention provides augmented reality glasses comprising the apparatus described above.

In addition, another aspect of the present invention provides a machine-readable storage medium storing instructions that cause a machine to perform the method described above.

With the above technical solution, displaying the text converted from the speech of each of the at least one sound source lets a hearing-impaired person clearly understand what each sound source is saying, and displaying the orientation of each sound source indicates to the hearing-impaired person, simply and intuitively, where the speech is coming from. While reading the "subtitles" to follow the speech, the hearing-impaired person thus gains a sense of position similar to that of a hearing person, learns the orientation of each sound source clearly and intuitively, and can better grasp the content of the speech each sound source utters. In particular, in an environment with multiple sound sources, knowing the orientation of each sound source greatly helps the hearing-impaired person communicate with others.

Other features and advantages of the present invention are described in detail in the detailed description that follows.

Brief description of the drawings

The accompanying drawings are provided for a further understanding of the embodiments of the present invention and form part of the specification. Together with the detailed description below, they serve to explain the embodiments of the present invention but do not limit them. In the drawings:

Fig. 1 is a flowchart of a method for providing sound source information according to an embodiment of the present invention;

Fig. 2 is an example diagram of a direction indicated by an arrow according to another embodiment of the present invention;

Fig. 3 is an example diagram of an orientation according to another embodiment of the present invention;

Fig. 4 is an example diagram of displaying the orientation of one sound source and the text converted from its speech according to another embodiment of the present invention;

Fig. 5 is an example diagram of displaying the orientations of multiple sound sources and the texts converted from their speech according to another embodiment of the present invention; and

Fig. 6 is a structural block diagram of an apparatus for providing sound source information according to another embodiment of the present invention.

Description of reference numerals

1 speech receiving module

2 determining module

3 speech recognition module

4 display module

Detailed description of the embodiments

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to illustrate and explain the embodiments of the present invention and are not intended to limit them.

One aspect of the embodiments of the present invention provides a method for providing sound source information. Fig. 1 is a flowchart of the method for providing sound source information according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps.

In step S10, speech from at least one sound source is received.

In step S11, the orientation of each of the at least one sound source is determined based on the received speech of each of the at least one sound source.

The orientation of a sound source may be determined from the times at which the speech uttered by the sound source is received. For example, the speech receiving module that receives the speech of the at least one sound source includes multiple speech acquisition modules arranged at different positions, so that the speech uttered by a given sound source reaches the different speech acquisition modules at different times. For each of the at least one sound source, the orientation is determined from the differences between the times at which its speech arrives at the multiple speech acquisition modules. Optionally, in embodiments of the present invention, each speech acquisition module may be a microphone, and the speech receiving module may be a microphone array. For example, the microphone array may include 2, 4, 6, 7, or 8 microphones.
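As an illustration of the arrival-time-difference idea in the paragraph above, the following is a minimal sketch for the simplest case of two microphones and a distant source. The far-field model, the speed-of-sound constant, the sign convention, and the function name `bearing_from_time_difference` are assumptions made for this example; the patent does not specify how the time differences are turned into an orientation.

```python
import math

# Assumed speed of sound in air at about 20 degrees Celsius (not from the patent).
SPEED_OF_SOUND_M_S = 343.0


def bearing_from_time_difference(delta_t_s: float, mic_spacing_m: float) -> float:
    """Estimate the bearing of a distant source from a two-microphone delay.

    delta_t_s is the arrival time at the left microphone minus the arrival
    time at the right microphone, so a positive value means the right
    microphone heard the sound first. The result is in degrees measured from
    the direction perpendicular to the line joining the microphones:
    0 is straight ahead, +90 is fully to the right, -90 is fully to the left.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    # Far-field model: path difference = spacing * sin(bearing).
    ratio = SPEED_OF_SOUND_M_S * delta_t_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A single pair of microphones cannot tell front from back, which may be one reason to use arrays with more microphones, such as the 4, 6, 7, or 8 microphone arrays mentioned above.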

Optionally, in embodiments of the present invention, the reference point of the orientation may be set according to the actual situation; for example, it may be the position of the speech receiving module. Specifically, it may be any one of the multiple speech acquisition modules, or the midpoint of the multiple speech acquisition modules. Moreover, when the speech receiving module is worn by the hearing-impaired person or is located close to that person, taking the speech receiving module as the reference point in effect takes the hearing-impaired person as the reference point, so the hearing-impaired person can tell from the determined orientation where the sound source is relative to himself or herself.

Optionally, in embodiments of the present invention, the orientation may include direction and/or distance. Optionally, the direction may be represented by an arrow. The arrow is located within the region bounded by a circle and starts at the center of the circle, which corresponds to the position of the hearing-impaired person. The arrow deviates by an angle from the longitudinal axis passing through the circle, as shown in Fig. 2, where the vertical dashed line is the longitudinal axis of the circle. In addition, taking the horizontal axis of the circle (the horizontal dashed line in Fig. 2) as a reference, an arrow in the part above the horizontal axis indicates that the sound source is in front of the hearing-impaired person, and an arrow in the part below the horizontal axis indicates that the sound source is behind the hearing-impaired person. In the direction example of Fig. 2, the sound source indicated by the arrow is in front of the hearing-impaired person. The arrow can also be read as a clock direction: the circle represents a clock face, the half of the longitudinal axis above the horizontal axis points to 12 o'clock, and the angle by which the arrow deviates from 12 o'clock gives the approximate clock direction of the sound source. In the direction example of Fig. 2, the sound source indicated by the arrow is at roughly 10 o'clock. Where the orientation includes both direction and distance, it may be represented as in the example of Fig. 3. The position at which the distance is displayed may be set according to the actual situation and is not limited here. Where the orientation includes only distance, only the distance may be displayed. In particular, when it is determined from the received speech that the sound source is the hearing-impaired person himself or herself, and the direction of a sound source is represented by an arrow, "O" or "●" is displayed at the center of the circle to indicate the direction of the sound source. In embodiments of the present invention, the orientation may also be described in words; taking the orientation shown in Fig. 3 as an example, the text "Direction: 10 o'clock; distance: 50 cm" may be displayed.
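The clock-face reading of the arrow can be sketched as follows. This is only an illustration: the convention that the angle is measured clockwise from the 12 o'clock half of the longitudinal axis, and the function names `clock_direction`, `is_in_front`, and `describe_orientation`, are assumptions of this example rather than details given in the patent.

```python
from typing import Optional


def clock_direction(angle_deg: float) -> int:
    """Map an arrow angle to the nearest clock hour (1 to 12).

    The angle is assumed to be measured clockwise from the 12 o'clock
    half of the longitudinal axis; each hour covers 30 degrees.
    """
    hour = round((angle_deg % 360) / 30) % 12
    return 12 if hour == 0 else hour


def is_in_front(angle_deg: float) -> bool:
    """Return True if the arrow lies above the horizontal axis of the circle.

    Angles exactly on the horizontal axis (90 or 270 degrees) are treated
    as not in front, which is an arbitrary choice for this sketch.
    """
    angle = angle_deg % 360
    return angle < 90 or angle > 270


def describe_orientation(angle_deg: Optional[float] = None,
                         distance_cm: Optional[float] = None) -> str:
    """Build a textual orientation from a direction and/or a distance."""
    parts = []
    if angle_deg is not None:
        parts.append(f"direction: {clock_direction(angle_deg)} o'clock")
    if distance_cm is not None:
        parts.append(f"distance: {distance_cm:g} cm")
    text = "; ".join(parts)
    return text[:1].upper() + text[1:]
```

For instance, an arrow 60 degrees to the left of the longitudinal axis (300 degrees clockwise) maps to 10 o'clock and counts as in front, consistent with the reading of Fig. 2 given in the text.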

In step S12, the speech of each of the at least one sound source is recognized so as to convert it into text. For example, the conversion of speech into text is implemented by speech recognition technology.

In step S13, the orientation of each of the at least one sound source and the text converted from its speech are displayed. An example of displaying the orientation of a sound source and the text converted from its speech is shown in Fig. 4. When the text converted from the speech of a sound source is displayed, if a single line cannot show all of the text, the text may wrap automatically onto a new line or may be displayed by scrolling.
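The wrapping and scrolling options for long text can be sketched with Python's standard `textwrap` module. Interpreting "scrolling" as keeping only the most recent lines that fit the display is an assumption of this example, as are the function name `layout_caption` and its parameters.

```python
import textwrap
from typing import List, Optional


def layout_caption(text: str, chars_per_line: int,
                   max_lines: Optional[int] = None) -> List[str]:
    """Wrap caption text to the display width.

    If max_lines is given and the wrapped text is longer, only the last
    max_lines lines are kept, imitating a display that has scrolled to
    the newest text.
    """
    lines = textwrap.wrap(text, width=chars_per_line)
    if max_lines is not None and len(lines) > max_lines:
        lines = lines[-max_lines:]
    return lines
```

Calling `layout_caption("hello world from the speaker", 11)` wraps the text onto three lines, and adding `max_lines=2` keeps only the last two.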

It should be noted that Fig. 4 shows the positions of the regions displaying the orientation and the converted text only by way of example; the positions of the two display regions may be chosen according to the actual situation and are not limited here. The orientation and converted text of each of the at least one sound source may be displayed as in the example of Fig. 4. When the orientations and converted texts of multiple sound sources are displayed, the sound sources may be arranged one above another, as shown in Fig. 5. Alternatively, they may be arranged side by side from left to right, or in some other arrangement, which is not limited here.

Optionally, in embodiments of the present invention, the orientation and text may be displayed in many ways. For example, they may be displayed using a preset foreground color and a preset background color, where the two colors are different. For example, the preset foreground color is white and the preset background color is black, giving white text on a black background; or the preset foreground color is black and the preset background color is white, giving black text on a white background. As another example, the preset foreground color is white and the preset background color is green, giving white text on a green background; or the preset foreground color is green and the preset background color is white, giving green text on a white background. This lets the user distinguish the text more clearly.

The orientation and text may also be displayed by alternately swapping the preset foreground color and the preset background color, so as to distinguish the orientations and texts of different sound sources. That is, the preset foreground color and the preset background color are different and, following the order in which speech is received, the two colors are swapped when adjacent received utterances come from different sound sources and are left unchanged when adjacent received utterances come from the same sound source. The preset foreground color and the preset background color may be chosen according to the actual situation: for example, white text on a black background, black text on a white background, white text on a green background, or green text on a white background.

The following example, with white as the preset foreground color and green as the preset background color, illustrates how swapping the two colors distinguishes the orientations and texts of different sound sources. Suppose an utterance (called the first utterance; the name is only for ease of description and has no limiting effect) comes from a first sound source; the text converted from the first utterance and the orientation of the first sound source are displayed as white text on a green background. If, in order of reception, the next utterance (the second utterance) comes from a sound source different from the first sound source (called the second sound source), the text converted from the second utterance and the orientation of the second sound source are displayed as green text on a white background. If the utterance after the second (the third utterance) also comes from the second sound source, the text converted from the third utterance and the orientation of the second sound source are still displayed as green text on a white background. If the utterance after the third (the fourth utterance) comes from a sound source different from the second sound source (called the third sound source, which may be the first sound source or any other sound source, as long as it is not the second sound source), the text converted from the fourth utterance and the orientation of the third sound source are displayed as white text on a green background. This continues until the information corresponding to all received utterances has been displayed, where the information corresponding to an utterance includes the text converted from the utterance and the orientation of the corresponding sound source.
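The color-swapping rule described above can be sketched as a small function that walks through the utterances in order of reception. The names `WHITE_ON_GREEN`, `GREEN_ON_WHITE`, and `assign_color_schemes`, and the use of plain source identifiers as input, are illustrative assumptions; the patent describes only the rule itself.

```python
from typing import Hashable, List, Sequence, Tuple

# Each scheme is (foreground color, background color).
WHITE_ON_GREEN: Tuple[str, str] = ("white", "green")
GREEN_ON_WHITE: Tuple[str, str] = ("green", "white")


def assign_color_schemes(source_ids: Sequence[Hashable]) -> List[Tuple[str, str]]:
    """Assign a color scheme to each utterance, in order of reception.

    source_ids[i] identifies the sound source of the i-th utterance.
    The scheme is swapped whenever the source differs from that of the
    previous utterance and is kept when the source is the same.
    """
    schemes: List[Tuple[str, str]] = []
    current = WHITE_ON_GREEN
    previous = None
    for index, source_id in enumerate(source_ids):
        if index > 0 and source_id != previous:
            # Adjacent utterances come from different sources: swap colors.
            current = GREEN_ON_WHITE if current == WHITE_ON_GREEN else WHITE_ON_GREEN
        schemes.append(current)
        previous = source_id
    return schemes
```

With the four utterances of the example (first source, second source, second source, then a source other than the second), the schemes come out as white on green, green on white, green on white, and white on green.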

Displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person clearly understand what each sound source is saying, and displaying the orientation of each sound source indicates to the hearing-impaired person, simply and intuitively, where the speech is coming from. While reading the "subtitles" to follow the speech, the hearing-impaired person thus gains a sense of position similar to that of a hearing person, learns the orientation of each sound source clearly and intuitively, and can better grasp the content of the speech each sound source utters. In particular, in an environment with multiple sound sources, knowing the orientation of each sound source greatly helps the hearing-impaired person communicate with others. In addition, the method described in the embodiments of the present invention places very little operating burden on the user, who can "hear" information beyond his or her ability without having to operate the technical system at all. It should also be noted that the method for providing sound source information provided by the embodiments of the present invention is suitable not only for hearing-impaired people but also for hearing people.

Optionally, in embodiments of the present invention, before the speech of the at least one sound source is received, the method for providing sound source information may further include: receiving a setting of the language of the text into which speech is to be converted. In this embodiment, the "hearing-impaired person" need not be someone whose hearing is actually impaired, but may be a person regarded as hearing-impaired because he or she does not understand the language of the people he or she is communicating with. Once the language used by this person has been set, the received speech of the at least one sound source is converted into text expressed in that language, and the person understands what the others are saying by reading the converted text. This enables communication between the person regarded as hearing-impaired and others.

In addition, in embodiments of the present invention, the method for providing sound source information may further include: recording, in the order in which speech is received, the orientation and text corresponding to each sound source, and storing them locally or in the cloud, so as to further aid the hearing-impaired person's recall and later sharing.
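A minimal sketch of the local recording option is shown here, appending one JSON object per utterance to a file in order of reception. The record fields, the JSON Lines file format, and the names `UtteranceRecord` and `LocalTranscriptLog` are assumptions chosen for illustration; the patent does not specify a storage format, and cloud storage is not covered by this sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class UtteranceRecord:
    sequence: int                   # position in order of reception
    source_id: str                  # hypothetical identifier of the sound source
    text: str                       # text converted from the speech
    direction_deg: Optional[float]  # direction part of the orientation, if known
    distance_cm: Optional[float]    # distance part of the orientation, if known


class LocalTranscriptLog:
    """Append-only log of orientation and text, one JSON object per line."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._next_sequence = 0

    def append(self, source_id: str, text: str,
               direction_deg: Optional[float] = None,
               distance_cm: Optional[float] = None) -> UtteranceRecord:
        record = UtteranceRecord(self._next_sequence, source_id, text,
                                 direction_deg, distance_cm)
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")
        self._next_sequence += 1
        return record

    def read_all(self) -> List[dict]:
        with open(self.path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]
```

Because records are appended in order of reception, reading the file back yields the conversation in its original order, which is what later recall or sharing would need.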

Correspondingly, another aspect of the embodiments of the present invention provides an apparatus for providing sound source information. Fig. 6 is a structural block diagram of the apparatus for providing sound source information according to another embodiment of the present invention. As shown in Fig. 6, the apparatus includes a speech receiving module 1, a determining module 2, a speech recognition module 3, and a display module 4. The speech receiving module 1 is configured to receive speech from at least one sound source. The determining module 2 is configured to determine, based on the received speech of each of the at least one sound source, the orientation of each of the at least one sound source, where the orientation takes the speech receiving module 1 as its reference point. The speech recognition module 3 is configured to recognize the speech of each of the at least one sound source so as to convert it into text. The display module 4 is configured to display the orientation of each of the at least one sound source and the text converted from its speech.
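The way the four modules of Fig. 6 cooperate can be sketched as a simple pipeline. Modelling each module as a plain callable, and the names `SourceInfo`, `SoundSourceInfoDevice`, and `run_once`, are assumptions made for this illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class SourceInfo:
    orientation: str  # orientation relative to the speech receiving module
    text: str         # text converted from the speech


@dataclass
class SoundSourceInfoDevice:
    receive: Callable[[], Sequence[bytes]]    # speech receiving module 1
    locate: Callable[[bytes], str]            # determining module 2
    recognize: Callable[[bytes], str]         # speech recognition module 3
    show: Callable[[List[SourceInfo]], None]  # display module 4

    def run_once(self) -> List[SourceInfo]:
        """Receive speech, then determine, recognize, and display per source."""
        infos = [SourceInfo(self.locate(speech), self.recognize(speech))
                 for speech in self.receive()]
        self.show(infos)
        return infos
```

In a real device each callable would wrap hardware or a recognition engine; here they can be replaced by stubs to exercise the flow.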

Optionally, in embodiments of the present invention, the orientation may include direction and/or distance. Optionally, in embodiments of the present invention, the direction is represented by an arrow, the arrow is located within the region bounded by a circle, and the arrow deviates by an angle from the longitudinal axis passing through the circle.

Optionally, in embodiments of the present invention, the display module 4 may be a near-eye display. The near-eye display may be less than 2 cm from the eyeball. The near-eye display may be a see-through near-eye display or a non-see-through near-eye display. In this way, the sound source information is presented in front of the eyes. Preferably, in embodiments of the present invention, the display module 4 is a see-through near-eye display. This lets the hearing-impaired person learn the sound source information by viewing the "subtitles" without affecting his or her observation of other things.

In addition, in embodiments of the present invention, the apparatus for providing sound source information further includes a storage module. The storage module records, in the order in which speech is received, the orientation and text corresponding to each sound source, so as to further aid the hearing-impaired person's recall and later sharing. The storage module may record the orientation and text corresponding to each sound source by storing them locally or in the cloud.

Displaying the text converted from the speech of each of the at least one sound source lets the hearing-impaired person clearly understand what each sound source is saying, and displaying the orientation of each sound source indicates to the hearing-impaired person, simply and intuitively, where the speech is coming from. While reading the "subtitles" to follow the speech, the hearing-impaired person thus gains a sense of position similar to that of a hearing person, learns the orientation of each sound source clearly and intuitively, and can better grasp the content of the speech each sound source utters. In particular, in an environment with multiple sound sources, knowing the orientation of each sound source greatly helps the hearing-impaired person communicate with others. It should also be noted that the apparatus for providing sound source information provided by the embodiments of the present invention is suitable not only for hearing-impaired people but also for hearing people.

The specific working principle and benefits of the apparatus for providing sound source information provided by the embodiments of the present invention are similar to those of the method for providing sound source information provided by the embodiments of the present invention, and are not repeated here.

In addition, another aspect of the embodiments of the present invention provides augmented reality glasses. The augmented reality glasses include the apparatus described in the above embodiments. The augmented reality glasses include an electronic circuit system that supports the operation of the apparatus described in the above embodiments; the electronic circuit system includes modules such as a power supply, a processor, a network connection, and the speech receiving module. The augmented reality glasses may also include an externally visible human-machine interface module, and buttons and/or a touch panel. The processor includes the determining module and the speech recognition module described in the above embodiments, and the human-machine interface module includes the display module. The processor may perform offline speech recognition locally, or online speech recognition in the cloud via the network connection.

Optionally, in embodiments of the present invention, the touch panel, the buttons, and/or the speech receiving module may be arranged on the augmented reality glasses themselves or on a glasses accessory, for example on a temple, the frame, or a lens. Optionally, in embodiments of the present invention, the speech receiving module may be arranged on the frame, on the same temple or on different temples, or at a position close to the ear (both ears or one ear), so as to sit as close to the ear as possible. For example, where the speech receiving module is a microphone array that includes two microphones, the two microphones may be arranged on the two rims of the frame respectively, at different positions on the same temple, or on the two temples respectively. When the microphone array includes more than two microphones, the microphones may likewise be distributed over the frame and/or the temples according to the actual situation. In addition, when a microphone array is used, the time and intensity with which speech reaches each microphone in the array differ, and computing these differences yields a clearer sound that is easier to process. Compared with a single microphone or a noise-reducing microphone, a microphone array is therefore highly advantageous: it imposes no requirement on the distance between the sound source and the speech receiving module. A microphone array also adapts to a variety of distances between the sound source and the array, and can meet the requirements of most communication scenarios, for example: a one-to-one conversation, in which the sound source is between 50 cm and 1 m from the speech receiving module; a group conversation, in which the sound source is between 1 m and 2 m from the speech receiving module; a meeting, in which the sound source is 3 m from the speech receiving module; a class, in which the sound source is 3 m to 5 m from the speech receiving module; and so on.

In addition, realizing in the case where display module is near-eye display by the corresponding orientation of sound source and text presentation It is before eyes.Wherein, near-eye display can be see-through, be also possible to can not to have an X-rayed.Further, in near-eye display In the case where for perspective formula near-eye display, realize while not influencing hearing-impaired people's observation display scene, through superposition It is indicated in the graphicalization of reality scene, allows hearing-impaired people to see the corresponding orientation of sound source and text in real time, so that listening barrier Personage obtains the perception to position for being similar to ordinary person while voice messaging of the viewing " subtitle " to understand uppick.This Outside, it is contemplated that avoid hearing-impaired people's dispersion attention, near-eye display can be monochrome display, using default background colour and in advance If foreground shows the corresponding orientation of sound source and text.In addition, near-eye display is also possible to color monitor, using background colour Show that the corresponding orientation of different sound sources and text, specific mapping mode may refer to above-mentioned reality with the form of foreground checker Content described in example is applied, this manner it is also possible to hearing-impaired people's dispersion attention sufficiently be avoided, so that hearing-impaired people is absorbed in content Itself；Allow hearing-impaired people to carry out normal outdoor scene exchange simultaneously, is interrupted and needs switch without generating The discomfort of focus.

In addition, another aspect of the embodiments of the present invention provides a machine-readable storage medium on which instructions are stored, the instructions being used to cause a machine to execute the method described in the above embodiments.

In conclusion showing that the text of the voice conversion of each sound source at least one sound source makes hearing-impaired people can be with Understand the content of the voice of each sound source；The orientation for showing each sound source at least one sound source, realizes with simple, intuitive Mode prompt the sounding position of hearing-impaired people's voice.In this way, making hearing-impaired people in viewing " subtitle " to understand uppick The perception to position for being similar to ordinary person is obtained while voice messaging, is realized so that hearing-impaired people is apparent that each sound source The orientation that each sound source can be understood, got information about while the content of the voice of sending, so that hearing-impaired people can be more The content for the voice that the good each sound source of assurance issues.Particularly, in the environment of there are multi-acoustical, so that hearing-impaired people is clear The orientation of each sound source of Chu is remarkably contributing to hearing-impaired people and exchanges with other people.

The optional implementations of the embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present invention are not limited to the specific details of the above implementations. Within the scope of the technical concept of the embodiments of the present invention, a variety of simple variations may be made to the technical solutions of the embodiments, and these simple variations all fall within the protection scope of the embodiments of the present invention.

It should further be noted that the specific technical features described in the above specific implementations may, where not contradictory, be combined in any suitable manner. To avoid unnecessary repetition, the embodiments of the present invention do not further describe the various possible combinations.

Those skilled in the art will appreciate that all or part of the steps of the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions for causing a single-chip microcomputer, a chip or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

In addition, the various different implementations of the embodiments of the present invention may also be combined arbitrarily; as long as such a combination does not depart from the idea of the embodiments of the present invention, it should likewise be regarded as content disclosed by the embodiments of the present invention.

Claims

1. A method for providing sound source information, characterized in that the method comprises:

receiving the voice of at least one sound source;

determining, based on the received voice of each of the at least one sound source, the orientation of each of the at least one sound source, wherein the orientation takes as its reference point the speech reception module that receives the voice of the at least one sound source;

recognizing the voice of each of the at least one sound source and converting the voice of each of the at least one sound source into text; and

displaying the orientation of each of the at least one sound source and the text converted from its voice.

2. The method according to claim 1, characterized in that the orientation includes a direction and/or a distance.

3. The method according to claim 2, characterized in that the direction is represented by an arrow, the arrow is located within a region delimited by a circle, and the arrow deviates by an angle from a longitudinal axis passing through the circle.

4. A device for providing sound source information, characterized in that the device comprises:

a speech reception module for receiving the voice of at least one sound source;

a determining module for determining, based on the received voice of each of the at least one sound source, the orientation of each of the at least one sound source, wherein the orientation takes the speech reception module as its reference point;

a speech recognition module for recognizing the voice of each of the at least one sound source and converting the voice of each of the at least one sound source into text; and

a display module for displaying the orientation of each of the at least one sound source and the text converted from its voice.

5. The device according to claim 4, characterized in that the orientation includes a direction and/or a distance.

6. The device according to claim 5, characterized in that the direction is represented by an arrow, the arrow is located within a region delimited by a circle, and the arrow deviates by an angle from a longitudinal axis passing through the circle.

7. The device according to any one of claims 4-6, characterized in that the display module is a near-eye display.

8. The device according to claim 7, characterized in that the near-eye display is a see-through near-eye display.

9. Augmented reality glasses, characterized in that the augmented reality glasses include the device according to any one of claims 4-8.

10. A machine-readable storage medium having instructions stored thereon, the instructions being used to cause a machine to execute the method according to any one of claims 1-3.