CN106792341A

CN106792341A - Audio output method and device and terminal equipment

Info

Publication number: CN106792341A
Application number: CN201611056298.3A
Authority: CN
Inventors: 汤中良
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2016-11-23
Filing date: 2016-11-23
Publication date: 2017-05-31

Abstract

The embodiment of the invention discloses an audio output method, an audio output device and terminal equipment. The method comprises the following steps: when the loudspeaker is detected to be in an audio output state, determining the direction of a user; and controlling the loudspeaker to output audio to the direction of the user. The embodiment of the invention solves the technical problem that the directional loudspeaker cannot automatically identify the direction of the user by determining the direction of the user and outputting the audio to the direction of the user, and realizes the technical effects of automatically identifying the direction of the user and outputting the audio to the direction in a directional manner.

Description

Audio output method, device and terminal equipment

Technical field

Embodiments of the present invention relate to the technical field of intelligent terminals, and in particular to an audio output method, an audio output device and terminal equipment.

Background art

With the rapid development of intelligent terminals, intelligent terminals (for example, smartphones and smart wearable devices) are widely used in every area of people's work and daily life.

Current intelligent terminals are equipped with a loudspeaker and support sound output through it. The sound emitted by a conventional loudspeaker propagates in all directions. To reduce interference to people nearby, the directional loudspeaker, which works on a different principle from the conventional loudspeaker, has appeared. A directional loudspeaker first loads the low-frequency sound signal onto a highly directional high-frequency signal, which is then amplified and emitted into the air. The air quickly filters out the high-frequency signal, so the audible signal carried on it is recovered naturally, and the sound propagates directionally, like a laser beam.

However, once the position of an existing directional loudspeaker, or of an intelligent terminal equipped with one, is fixed, the direction in which the loudspeaker outputs sound is fixed as well. In many scenarios, for example when the user has their back to the loudspeaker's direction of sound propagation, the user cannot receive the output sound well.

Summary of the invention

The present invention provides an audio output method, an audio output device and terminal equipment, so that the audio output direction is identified automatically and sound is output toward the user.

In a first aspect, an embodiment of the present invention provides an audio output method, the method comprising:

when it is detected that a loudspeaker is in an audio output state, determining the direction of a user;

controlling the loudspeaker to output audio toward the direction of the user.

Further, determining the direction of the user comprises:

performing image acquisition on the space where the loudspeaker is located, and performing image recognition on the acquired image;

if the acquired image contains human body feature information, determining the direction of the human body feature according to the acquired image, and taking the direction of the human body feature as the direction of the user.

Further, determining the direction of the user comprises:

performing image acquisition on the space where the loudspeaker is located using a rotating camera, and recognizing the acquired images in real time while the rotating camera rotates;

if human body feature information is recognized in an acquired image, controlling the rotating camera to stop rotating, and taking the direction the rotating camera faces when it stops as the direction of the user.

Further, determining the direction of the user comprises:

performing image acquisition on the space where the loudspeaker is located, and matching the acquired image against a pre-acquired image of the user;

if the match succeeds, determining the direction of the user according to the acquired image.

Further, determining the direction of the user comprises:

performing image acquisition on the space where the loudspeaker is located, and, if human body feature information of multiple users is recognized in the acquired image, determining the distance between the loudspeaker and each user using a distance sensor;

determining, according to the acquired image, the direction of the user nearest to the loudspeaker.

Further, before determining the direction of the user, the method also comprises:

identifying the user using an iris recognition sensor.

In a second aspect, an embodiment of the present invention further provides an audio output device, the device comprising:

a direction determining module, configured to determine the direction of a user when it is detected that a loudspeaker is in an audio output state;

an audio output module, configured to control the loudspeaker to output audio toward the direction of the user.

Further, the direction determining module is specifically configured to perform image acquisition on the space where the loudspeaker is located and perform image recognition on the acquired image; and, if the acquired image contains human body feature information, to determine the direction of the human body feature according to the acquired image and take that direction as the direction of the user.

Further, the direction determining module is specifically configured to perform image acquisition on the space where the loudspeaker is located using a rotating camera and recognize the acquired images in real time while the rotating camera rotates; and, if human body feature information is recognized in an acquired image, to control the rotating camera to stop rotating and take the direction the rotating camera faces when it stops as the direction of the user.

Further, the direction determining module is specifically configured to perform image acquisition on the space where the loudspeaker is located and match the acquired image against a pre-acquired image of the user; and, if the match succeeds, to determine the direction of the user according to the acquired image.

Further, the direction determining module is specifically configured to perform image acquisition on the space where the loudspeaker is located; if human body feature information of multiple users is recognized in the acquired image, to determine the distance between the loudspeaker and each user using a distance sensor; and to determine, according to the acquired image, the direction of the user nearest to the loudspeaker.

Further, the audio output device also comprises:

an iris recognition module, configured to identify the user using an iris recognition sensor before the direction determining module determines the direction of the user.

In a third aspect, an embodiment of the present invention further provides terminal equipment, comprising a loudspeaker and any one of the audio output devices provided in the second aspect;

the loudspeaker is arranged in the terminal equipment.

Further, the terminal equipment comprises a camera and a distance sensor, or a camera and an iris recognition sensor;

the camera is configured to acquire images of the space where the terminal equipment is located, the direction of the user being determined according to the acquired images;

the distance sensor is configured to determine the distance between the terminal equipment and the user;

the iris recognition sensor is configured to identify the user.

Further, the camera is a rotating camera.

In the technical solution of the embodiments of the present invention, the terminal automatically identifies the direction of the user and outputs audio toward that direction. This solves the technical problem that a directional loudspeaker cannot automatically identify the user's direction, and achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward it.

Brief description of the drawings

Fig. 1 is a flowchart of the audio output method in Embodiment one of the present invention;

Fig. 2 is a flowchart of the audio output method in Embodiment two of the present invention;

Fig. 3 is a flowchart of the audio output method in Embodiment three of the present invention;

Fig. 4 is a flowchart of the audio output method in Embodiment four of the present invention;

Fig. 5 is a flowchart of the audio output method in Embodiment five of the present invention;

Fig. 6 is a schematic structural diagram of the audio output device in Embodiment six of the present invention;

Fig. 7 is a schematic structural diagram of the terminal equipment in Embodiment seven of the present invention.

Detailed description

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Embodiment one

Fig. 1 is a flowchart of an audio output method provided by Embodiment one of the present invention. This embodiment is applicable to cases where audio is output directionally. The method may be performed by the audio output device provided by the embodiments of the present invention. The device may be implemented in software and/or hardware and may be integrated in a terminal with an audio output function, for example a loudspeaker, a mobile terminal (such as a mobile phone or tablet computer), a vehicle-mounted terminal, a notebook computer or a fixed terminal (such as a desktop computer). The method specifically comprises the following steps:

S110: when it is detected that the loudspeaker is in an audio output state, determine the direction of the user.

The loudspeaker may be a loudspeaker built into a terminal, or a standalone loudspeaker device. When the loudspeaker is built into a terminal, the audio output state is the state in which the terminal outputs sound through the loudspeaker, for example a call state or a music playback state. When the loudspeaker is a loudspeaker device, the audio output state is the state in which the loudspeaker device plays a recording or otherwise outputs sound. The direction of the user is the direction, relative to the terminal, of the position of the person receiving the audio.

S120: control the loudspeaker to output audio toward the direction of the user.

In this embodiment, once the direction of the user has been determined, a control instruction is sent to the loudspeaker so that the audio is output toward the user. For example, the user may send the control instruction to the loudspeaker through a control device (for example, a remote control or a mobile phone), specifically over a Wi-Fi network, Bluetooth or a 4G network. After receiving the control instruction, the loudspeaker may rotate so that it outputs audio toward the user.

Directional output may be realized with an active directional loudspeaker or with a matrix loudspeaker array. An active directional loudspeaker loads the low-frequency sound signal onto a highly directional high-frequency signal, which is amplified and emitted into the air; the air quickly filters out the high-frequency signal, the audible signal carried on it is recovered naturally, and the sound propagates directionally, like a laser beam. A matrix loudspeaker array arranges a number of loudspeakers in a matrix at equal intervals. Each loudspeaker unit radiates an in-phase plane wavefront, and the combined units form what is effectively a single extended sound source. Through coupling over the whole audio range, the wavefront of the array produces sound of consistent quality within a certain coverage area and propagates in a particular direction as a beam.
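The text does not say how a loudspeaker array would be steered toward a chosen direction. As an illustration only, the sketch below uses conventional delay-and-sum steering for a uniform line of loudspeakers, which is a standard technique swapped in here and not taken from this document. The function name `steering_delays`, the element spacing and the speed-of-sound constant are all assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C


def steering_delays(num_elements, spacing_m, angle_deg):
    """Per-element delays (seconds) that steer a uniform line array.

    angle_deg is measured from broadside (straight ahead of the array).
    This is textbook delay-and-sum steering, not the patent's own method.
    """
    # Extra travel time between neighbouring elements for the target angle.
    step = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S
    delays = [n * step for n in range(num_elements)]
    # Shift so that the smallest delay is zero; a delay cannot be negative.
    earliest = min(delays)
    return [d - earliest for d in delays]
```

Calling `steering_delays(4, 0.01, 0)` gives four zero delays (the beam points straight ahead), while a non-zero angle gives a linear ramp of delays across the elements.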

It is worth noting that, before the direction of the user is determined, it must be judged whether audio in the output state needs to be output directionally. Specifically, a distance sensor detects the distance between the terminal and the user's face. When the distance is less than a preset distance, the user's direction does not need to be identified and the sound is played normally. Otherwise, when the distance is greater than the preset distance, or when the terminal's loudspeaker is playing in hands-free mode, the positioning mode is turned on, either automatically or manually by the user, to determine the direction of the user, and the loudspeaker is controlled to output audio toward that direction. The preset distance may typically be 10 cm or 20 cm.
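The decision described above can be sketched as a small predicate. The function name `needs_directional_output` and its parameters are hypothetical. The 20 cm default is one of the two example values given in the text, and treating a distance exactly equal to the threshold as "no directional output" is an assumption, since the text does not cover that case.

```python
def needs_directional_output(distance_cm, hands_free, threshold_cm=20.0):
    """Decide whether to locate the user and steer the audio.

    distance_cm: terminal-to-face distance reported by a distance sensor.
    hands_free: True when the loudspeaker is playing in hands-free mode.
    """
    if hands_free:
        # Hands-free playback always enables positioning, per the text.
        return True
    # Closer than the preset distance: play normally, no positioning needed.
    return distance_cm > threshold_cm
```

For example, a phone held 5 cm from the face during an ordinary call would play normally, while the same phone in hands-free mode, or 50 cm away, would trigger positioning.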

In the technical solution of this embodiment, the terminal automatically identifies the direction of the user and outputs audio toward that direction. This solves the technical problem that a directional loudspeaker cannot automatically identify the user's direction, and achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward it.

Embodiment two

Fig. 2 is a flowchart of an audio output method provided by Embodiment two of the present invention. This embodiment refines the audio output method of Embodiment one by providing a method for determining the direction of the user: image acquisition is performed on the space where the loudspeaker is located, and image recognition is performed on the acquired image; if the acquired image contains human body feature information, the direction of the human body feature is determined according to the acquired image and taken as the direction of the user. Accordingly, the method of this embodiment comprises:

S210: perform image acquisition on the space where the loudspeaker is located, and perform image recognition on the acquired image.

Image acquisition of the space where the loudspeaker is located may be performed by a camera. Image recognition means identifying the image information contained in the image to determine whether the acquired image contains a user.

S220: if the acquired image contains human body feature information, determine the direction of the human body feature according to the acquired image, and take the direction of the human body feature as the direction of the user.

Human body feature information is information that confirms the image contains a human body, for example a human head, a face or facial features. If any such human body feature information is recognized in the image, the image can be determined to contain a user. The direction of the user relative to the terminal is calculated from the relative position of the human body feature information in the image.
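The text does not give the calculation that turns a position in the image into a direction. One common way to do it, shown here purely as a sketch, is the pinhole camera model: the horizontal pixel offset of the detected feature from the image centre is converted into an angle using the camera's horizontal field of view. The function name `bearing_from_pixel`, its parameters and the pinhole assumption are not from this document.

```python
import math


def bearing_from_pixel(x_px, image_width_px, horizontal_fov_deg):
    """Horizontal angle (degrees) of an image column relative to the optical axis.

    Negative values are left of centre, positive values right of centre.
    Assumes an ideal pinhole camera with no lens distortion.
    """
    # Focal length expressed in pixels, derived from the field of view.
    focal_px = (image_width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)
    # Signed distance of the feature from the image centre, in pixels.
    offset_px = x_px - image_width_px / 2
    return math.degrees(math.atan2(offset_px, focal_px))
```

With a 640-pixel-wide image and a 60° field of view, a face centred at column 320 lies straight ahead, and a face at the right-hand edge lies about 30° to the right.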

The terminal may, at intervals of for example 30 seconds or 1 minute, repeatedly acquire images of the space where the loudspeaker is located and recognize human body feature information in them, so that the direction of the user is determined in real time.

S230: control the loudspeaker to output audio toward the direction of the user.

In the technical solution of this embodiment, images of the space where the loudspeaker is located are acquired and human body feature information is recognized in them, so that the direction of the user is identified automatically and audio is output directionally toward it. This achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward that direction.

Embodiment three

Fig. 3 is a flowchart of an audio output method provided by Embodiment three of the present invention. This embodiment refines the audio output method of the above embodiments by providing a method for determining the direction of the user: image acquisition is performed on the space where the loudspeaker is located using a rotating camera, and the acquired images are recognized in real time while the rotating camera rotates; if human body feature information is recognized in an acquired image, the rotating camera is controlled to stop rotating, and the direction the rotating camera faces when it stops is taken as the direction of the user. Accordingly, the method of this embodiment comprises:

S310: perform image acquisition on the space where the loudspeaker is located using a rotating camera, and recognize the acquired images in real time while the rotating camera rotates.

A rotating camera is a camera that can shoot while rotating. Specifically, when the terminal detects that it is in an audio output state and judges that the direction of the user needs to be determined, the rotating camera is turned on, either automatically by the terminal or manually by the user. It acquires images of the space where the terminal is located, and the images it acquires are recognized in real time so that human body feature information is captured automatically.

S320: if human body feature information is recognized in an acquired image, control the rotating camera to stop rotating, and take the direction the rotating camera faces when it stops as the direction of the user.

Specifically, while the rotating camera is shooting during rotation, when human body feature information is recognized, for example when a human head appears in the image, the rotating camera is controlled to stop rotating, and the direction it faces at the moment it stops is determined as the direction of the user. Otherwise, the rotating camera continues to acquire images of the space until human body feature information is recognized and the user's direction is determined. It is worth noting that, when the human body feature information disappears from the image, the rotating camera automatically resumes rotating and continues to acquire images of the space until human body feature information is recognized again and the user's direction is determined.
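The scan-and-stop behaviour described above can be sketched as a simple loop. Everything named here is hypothetical: `detect_human` stands in for "capture a frame at this heading and run human-feature recognition", and the fixed step size and the single 360° sweep limit are assumptions (the text itself describes rotating until a person is found).

```python
def scan_for_user(detect_human, step_deg=30, max_deg=360):
    """Rotate in fixed steps and return the first heading where a human is seen.

    detect_human(heading_deg) is a hypothetical callback that returns True
    when human body features are recognized in the frame at that heading.
    Returns None if one full sweep finds nobody.
    """
    heading = 0
    while heading < max_deg:
        if detect_human(heading):
            # Stop rotating: the current heading is taken as the user's direction.
            return heading
        heading += step_deg
    return None
```

A caller that wants the behaviour in the text, where the camera resumes rotating once the person leaves the frame, could call `scan_for_user` again whenever detection at the returned heading starts failing.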

S330: control the loudspeaker to output audio toward the direction of the user.

In the technical solution of this embodiment, the rotating camera acquires and recognizes images of the space where the loudspeaker is located in real time and automatically captures the user's human body features to determine the user's direction. This achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward that direction.

Embodiment four

Fig. 4 is a flowchart of an audio output method provided by Embodiment four of the present invention. This embodiment refines the audio output method of the above embodiments by providing a method for determining the direction of the user: image acquisition is performed on the space where the loudspeaker is located, and the acquired image is matched against a pre-acquired image of the user; if the match succeeds, the direction of the user is determined according to the acquired image. Accordingly, the method of this embodiment comprises:

S410: perform image acquisition on the space where the loudspeaker is located, and match the acquired image against a pre-acquired image of the user.

Here, the pre-acquired image of the user is an image of the user stored in the terminal in advance, against which images acquired by the terminal are compared. It may be, for example, an image taken by the user, or an image of the user that the terminal stored automatically during earlier image matching.

Specifically, the terminal may match the acquired image using a face recognition algorithm. A face recognition algorithm extracts the facial information in the image acquired by the terminal, including the eyes, nose, mouth or ears, and matches it against the facial information in the pre-acquired image of the user. When the similarity reaches a preset value, it is determined that the terminal user is present in the acquired image. The preset similarity value may be a value recommended by the terminal or a value customized by the user, for example 80% or 90%. A higher preset value gives higher matching accuracy but takes longer to match; correspondingly, a lower preset value matches faster but less accurately and is prone to recognition errors.
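The similarity threshold described above can be illustrated with a minimal sketch. The text does not say how facial information is represented or how similarity is measured, so this example assumes that each face has already been reduced to a numeric feature vector and uses cosine similarity as the score. Both choices, and the names `cosine_similarity` and `is_enrolled_user`, are assumptions; the 0.8 default mirrors the 80% example in the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def is_enrolled_user(captured, enrolled, threshold=0.8):
    """True when the captured face is similar enough to the stored one."""
    return cosine_similarity(captured, enrolled) >= threshold
```

Raising `threshold` from 0.8 to 0.99 makes the same pair of vectors less likely to match, which corresponds to the stricter matching the text associates with a higher preset value.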

S420: if the match succeeds, determine the direction of the user according to the acquired image.

A successful match means that the similarity between the facial information in the image acquired by the terminal and that in the pre-acquired image of the user reaches the preset similarity value, confirming that the terminal user is present in the acquired image. The direction of the user relative to the terminal can then be calculated from the relative position of the terminal user's facial information in the image.

In this embodiment, the terminal user is identified by matching the information in the image acquired by the terminal against the image information of the preset user, and the user's direction is then determined, which improves the accuracy of direction determination.

Optionally, before the direction of the user is determined, the method also comprises:

identifying the user using an iris recognition sensor.

Iris recognition is a technique for verifying identity by means of the eye. The iris is the annular part of the human eye between the black pupil and the white sclera. It contains many interlaced detail features such as spots, filaments, crowns, stripes and crypts. Once formed during fetal development, the iris remains unchanged throughout life, so its detail features can uniquely identify a user.

An iris recognition sensor is a sensor that can acquire an image of the human iris and identify the user. It works by acquiring an iris image; preprocessing the image so that it meets the requirements for iris feature extraction; extracting the iris features; and matching the extracted features against a stored model to identify the user.

Specifically, in this embodiment, the iris recognition sensor acquires iris images in the space where the loudspeaker is located. The acquired iris images are recognized in real time and matched against the pre-stored iris image of the terminal user. When the match succeeds, the acquired iris image is determined to belong to the terminal user, and the direction of the terminal user is calculated.

In this embodiment, iris recognition uniquely identifies the terminal user before the user's direction is determined, which improves the accuracy of direction determination.

S430: control the loudspeaker to output audio toward the direction of the user.

In the technical solution of this embodiment, the terminal user is identified by matching the information in the image acquired by the terminal against the image information of the preset user, and the user's direction is then determined. This solves the technical problem that a directional loudspeaker cannot automatically identify the user's direction, and achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward it.

Embodiment five

Fig. 5 is a flowchart of an audio output method provided by Embodiment five of the present invention. This embodiment refines the audio output method of the above embodiments by providing a method for determining the direction of the user: image acquisition is performed on the space where the loudspeaker is located; if human body feature information of multiple users is recognized in the acquired image, the distance between the loudspeaker and each user is determined using a distance sensor; and the direction of the user nearest to the loudspeaker is determined according to the acquired image. Accordingly, the method of this embodiment comprises:

S510: perform image acquisition on the space where the loudspeaker is located, and, if human body feature information of multiple users is recognized in the acquired image, determine the distance between the loudspeaker and each user using a distance sensor.

A distance sensor is a sensor that can detect physical distance. For example, a photoelectric distance sensor or an ultrasonic distance sensor may detect the distance between a user and the terminal's loudspeaker. Specifically, when human body feature information belonging to different users is recognized in several different directions in the image acquired by the terminal, the terminal cannot determine the audio output direction from the image alone. The distance sensor is then used to single out one audio output user, which determines the audio output direction.

S520: determine, according to the acquired image, the direction of the user nearest to the loudspeaker.

Specifically, the distance sensor detects the distance between each user and the terminal's loudspeaker, the distances are compared, and the nearest user is selected as the audio output user. The audio output direction is then determined from the direction of the audio output user.
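The nearest-user selection described above amounts to taking a minimum over the measured distances. In the sketch below, the function name `nearest_user` and the representation of each candidate as a `(bearing_deg, distance_m)` pair are assumptions made for illustration; how ties are resolved is not specified in the text, and here the first candidate with the smallest distance wins.

```python
def nearest_user(candidates):
    """Return the bearing of the candidate closest to the loudspeaker.

    candidates: list of (bearing_deg, distance_m) pairs, one per detected
    person. Returns None when the list is empty.
    """
    if not candidates:
        return None
    # min() keeps the first entry when several share the smallest distance.
    bearing, _distance = min(candidates, key=lambda c: c[1])
    return bearing
```

For three people detected at -20°, 15° and 40° with distances of 2.5 m, 1.2 m and 3.0 m, the audio would be steered toward 15°.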

S530: control the loudspeaker to output audio toward the direction of the user.

In the technical solution of this embodiment, when human body feature information of several people is recognized, the distance sensor detects the distance between each user and the terminal's loudspeaker, and the nearest user is selected as the audio output user to determine the audio output direction. This achieves automatic identification of the audio output direction when multiple users are present.

Embodiment six

Fig. 6 is a schematic structural diagram of the audio output device provided by Embodiment six of the present invention. The device is suitable for performing the audio output method provided by the embodiments of the present invention. As shown in Fig. 6, the device may specifically comprise:

a direction determining module 610, configured to determine the direction of a user when it is detected that a loudspeaker is in an audio output state;

an audio output module 620, configured to control the loudspeaker to output audio toward the direction of the user.

Optionally, the direction determining module 610 is specifically configured to perform image acquisition on the space where the loudspeaker is located and perform image recognition on the acquired image; and, if the acquired image contains human body feature information, to determine the direction of the human body feature according to the acquired image and take that direction as the direction of the user.

Optionally, the direction determining module 610 is specifically configured to perform image acquisition on the space where the loudspeaker is located using a rotating camera and recognize the acquired images in real time while the rotating camera rotates; and, if human body feature information is recognized in an acquired image, to control the rotating camera to stop rotating and take the direction the rotating camera faces when it stops as the direction of the user.

Optionally, the direction determining module 610 is specifically configured to perform image acquisition on the space where the loudspeaker is located and match the acquired image against a pre-acquired image of the user; and, if the match succeeds, to determine the direction of the user according to the acquired image.

Optionally, the direction determining module 610 is specifically configured to perform image acquisition on the space where the loudspeaker is located; if human body feature information of multiple users is recognized in the acquired image, to determine the distance between the loudspeaker and each user using a distance sensor; and to determine, according to the acquired image, the direction of the user nearest to the loudspeaker.

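Optionally, the audio output device also comprises:

an iris recognition module, configured to identify the user using an iris recognition sensor before the direction determining module determines the direction of the user.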

In this embodiment, the terminal automatically identifies the direction of the user and outputs audio toward that direction. This solves the technical problem that a directional loudspeaker cannot automatically identify the user's direction, and achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward it.

Embodiment seven

Fig. 7 is a schematic structural diagram of the terminal equipment provided by Embodiment seven of the present invention. Building on the audio output device provided by the above embodiments, this embodiment provides terminal equipment 700 that contains any one of those audio output devices. The audio output device 600 can control the terminal equipment 700 to automatically identify the user's direction and to output audio directionally toward it. Specifically, the terminal equipment comprises the audio output device 600 and a loudspeaker 710, and the loudspeaker 710 is arranged in the terminal equipment 700.

The terminal equipment 700 may be a smart wearable device such as a smart watch or smart bracelet, a smartphone, a tablet, or the like.

The loudspeaker 710 outputs audio directionally according to the audio output direction instruction generated by the audio output device 600. By way of example, the loudspeaker in this embodiment may be a MEMS matrix loudspeaker array. MEMS loudspeakers are micron-scale in size, and a MEMS matrix loudspeaker array may typically contain 50 to 200 MEMS loudspeakers. In this embodiment the number of MEMS loudspeakers is preferably about 100, and the size of the MEMS matrix loudspeaker array is preferably 10 mm.

Unlike a conventional matrix piezoelectric loudspeaker, a MEMS matrix loudspeaker is small, can be miniaturized and mass-produced, and can be used in the terminal equipment 700.

Optionally, the terminal equipment 700 comprises a camera and a distance sensor, or a camera and an iris recognition sensor;

the iris recognition sensor is configured to identify the user.

Optionally, the camera is a rotating camera.

Building on the above embodiments, this embodiment provides terminal equipment in which the audio output device determines the user's direction and controls the loudspeaker to output audio directionally toward it. This solves the technical problem that a directional loudspeaker cannot automatically identify the user's direction, and achieves the technical effect of automatically identifying the direction of the user and outputting audio directionally toward it.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to those embodiments; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims

1. An audio output method, characterized by comprising:

controlling the loudspeaker to output audio toward the direction in which the user is located.

2. The method according to claim 1, characterized in that determining the direction in which the user is located comprises:

if the collected image contains human body feature information, determining the direction of the human body features according to the collected image, and taking the direction of the human body features as the direction in which the user is located.

3. The method according to claim 1, characterized in that determining the direction in which the user is located comprises:

collecting images of the space in which the loudspeaker is located using a rotating camera, and recognizing the collected images in real time while the rotating camera rotates;

if human body feature information is recognized in a collected image, controlling the rotating camera to stop rotating, and taking the direction the rotating camera faces when it stops rotating as the direction in which the user is located.

4. The method according to claim 1, characterized in that determining the direction in which the user is located comprises:

collecting an image of the space in which the loudspeaker is located, and matching the collected image against a pre-collected image of the user;

5. The method according to claim 1, characterized in that determining the direction in which the user is located comprises:

collecting an image of the space in which the loudspeaker is located, and, if human body feature information of multiple users is recognized in the collected image, determining the distance between the loudspeaker and each user using a distance sensor;

6. The method according to any one of claims 1-3, characterized in that, before determining the direction in which the user is located, the method further comprises:

identifying the user using an iris recognition sensor.

7. An audio output device, characterized by comprising:

8. The device according to claim 7, characterized in that the direction determining module is specifically configured to collect an image of the space in which the loudspeaker is located and perform image recognition on the collected image; and, if the collected image contains human body feature information, to determine the direction of the human body features according to the collected image and take the direction of the human body features as the direction in which the user is located.

9. The device according to claim 7, characterized in that the direction determining module is specifically configured to collect images of the space in which the loudspeaker is located using a rotating camera, and to recognize the collected images in real time while the rotating camera rotates; and, if human body feature information is recognized in a collected image, to control the rotating camera to stop rotating and take the direction the rotating camera faces when it stops rotating as the direction in which the user is located.

10. The device according to claim 7, characterized in that the direction determining module is specifically configured to collect an image of the space in which the loudspeaker is located and match the collected image against a pre-collected image of the user; and, if the match succeeds, to determine the direction of the user according to the collected image.

11. The device according to claim 7, characterized in that the direction determining module is specifically configured to collect an image of the space in which the loudspeaker is located; if human body feature information of multiple users is recognized in the collected image, to determine the distance between the loudspeaker and each user using a distance sensor; and to determine, according to the collected image, the direction in which the user nearest to the loudspeaker is located.

12. The device according to any one of claims 7-9, characterized by further comprising:

an iris recognition module, configured to identify the user using an iris recognition sensor before the direction determining module determines the direction in which the user is located.

13. A terminal device, characterized by comprising the audio output device according to any one of claims 7-12 and a loudspeaker;

the loudspeaker being arranged in the terminal device.

14. The terminal device according to claim 13, characterized by comprising a camera and a distance sensor; or a camera and an iris recognition sensor;

the camera being configured to collect an image of the space in which the terminal device is located, the direction in which the user is located being determined according to the collected image;

the iris recognition sensor being configured to identify the user.

15. The terminal device according to claim 14, characterized in that the camera is a rotating camera.