CN107678793A - Voice assistant startup method and device, terminal, and computer-readable storage medium

Info

Publication number
CN107678793A
Authority
CN
China
Prior art keywords
terminal
voice assistant
microphone
user
ambient sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710828831.1A
Other languages
Chinese (zh)
Inventor
高素雅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meizu Technology Co Ltd
Original Assignee
Meizu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meizu Technology Co Ltd
Priority to CN201710828831.1A
Publication of CN107678793A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The present invention provides a voice assistant startup method and device, a terminal, and a computer-readable storage medium. The voice assistant startup method is applied to a terminal and includes: detecting whether a preset area of the terminal and a microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collecting ambient sound through the microphone; and if the ambient sound meets a preset condition, starting the voice assistant in the terminal and inputting the ambient sound to the voice assistant as a voice operation instruction. The invention can start the voice assistant simply and quickly, avoiding both cumbersome operating procedures and the embarrassment that a stilted wake word causes the user, and making human-computer interaction more natural and convenient.

Description

Voice assistant startup method and device, terminal, and computer-readable storage medium
Technical field
The present invention relates to the field of intelligent voice technology, and in particular to a voice assistant startup method and device, a terminal, and a computer-readable storage medium.
Background
In the prior art, a user usually has to press a physical button or open the application corresponding to the voice assistant in order to wake the voice assistant, which is cumbersome.
In addition, a user can wake the voice assistant with a specific wake word, for example "Hey, Siri". Such a fixed wake word is rather stilted and can easily make the user feel awkward in public, so the human-computer interaction is not natural enough.
Summary of the invention
In view of the above, it is necessary to provide a voice assistant startup method and device, a terminal, and a computer-readable storage medium that can start the voice assistant simply and quickly, avoid both cumbersome operating procedures and the embarrassment that a stilted wake word causes the user, and make human-computer interaction more natural and convenient.
A voice assistant startup method, the method including:
Detecting whether a preset area of a terminal and a microphone of the terminal are blocked;
If the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collecting ambient sound through the microphone; and
If the ambient sound meets a preset condition, starting the voice assistant in the terminal, and inputting the ambient sound to the voice assistant as a voice operation instruction.
According to a preferred embodiment of the present invention, the preset area is located at the top of the terminal and the microphone is located at the bottom of the terminal; or
The preset area and the microphone are located at opposite positions on the terminal.
According to a preferred embodiment of the present invention, the method further includes:
Judging whether the distance between the terminal and the user meets a first preset condition;
Detecting whether the user is speaking;
If the distance between the terminal and the user meets the first preset condition and the user is speaking, determining that the ambient sound meets the preset condition.
According to a preferred embodiment of the present invention, detecting whether the user is speaking includes:
Obtaining a lip image of the user through a camera device of the terminal;
Monitoring changes in the lip image;
If the change in the lip image meets a second preset condition, determining that the user is speaking.
According to a preferred embodiment of the present invention, inputting the ambient sound to the voice assistant as a voice operation instruction includes:
Performing noise reduction on the background sound in the ambient sound using dual-microphone noise reduction technology to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction;
The method further includes:
Controlling the voice assistant to perform the corresponding function according to the voice operation instruction.
A voice assistant startup device, the device including:
A detection unit, configured to detect whether a preset area of a terminal and a microphone of the terminal are blocked;
A collecting unit, configured to collect ambient sound through the microphone if the preset area of the terminal is not blocked and the microphone of the terminal is blocked; and
A start unit, configured to start the voice assistant in the terminal if the ambient sound meets a preset condition, and to input the ambient sound to the voice assistant as a voice operation instruction.
According to a preferred embodiment of the present invention, the preset area is located at the top of the terminal and the microphone is located at the bottom of the terminal; or
The preset area and the microphone are located at opposite positions on the terminal.
According to a preferred embodiment of the present invention, the device further includes:
A judging unit, configured to judge whether the distance between the terminal and the user meets a first preset condition;
The detection unit, further configured to detect whether the user is speaking;
A determining unit, configured to determine that the ambient sound meets the preset condition if the distance between the terminal and the user meets the first preset condition and the user is speaking.
According to a preferred embodiment of the present invention, the detection unit is specifically configured to:
Obtain a lip image of the user through a camera device of the terminal;
Monitor changes in the lip image;
If the change in the lip image meets a second preset condition, determine that the user is speaking.
According to a preferred embodiment of the present invention, the start unit inputting the ambient sound to the voice assistant as a voice operation instruction includes:
Performing noise reduction on the background sound in the ambient sound using dual-microphone noise reduction technology to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction;
The device further includes:
A control unit, configured to control the voice assistant to perform the corresponding function according to the voice operation instruction.
A terminal, the terminal including a processor, where the processor implements the voice assistant startup method when executing a computer program stored in a memory.
A computer-readable storage medium storing a computer program, where the computer program implements the voice assistant startup method when executed by a processor.
As can be seen from the above technical solutions, the present invention detects whether a preset area of a terminal and a microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collects ambient sound through the microphone; and if the ambient sound meets a preset condition, starts the voice assistant in the terminal and inputs the ambient sound to the voice assistant as a voice operation instruction. With the present invention, the voice assistant can be started simply and quickly, avoiding both cumbersome operating procedures and the embarrassment that a stilted wake word causes the user, and making human-computer interaction more natural and convenient.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the voice assistant startup method of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the voice assistant startup device of the present invention.
Fig. 3 is a schematic structural diagram of a terminal in a preferred embodiment of the present invention that implements the voice assistant startup method.
Description of main element symbols
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of a preferred embodiment of the voice assistant startup method of the present invention. Depending on requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
The voice assistant startup method is applied in one or more terminals. A terminal is a device that can automatically perform numerical computation and/or information processing according to instructions set or stored in advance. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), an embedded device, and the like.
The terminal may be any electronic product capable of human-computer interaction with a user, for example a personal computer, a tablet computer, a smartphone, a personal digital assistant (Personal Digital Assistant, PDA), a game console, an interactive network television (Internet Protocol Television, IPTV), a smart wearable device, and the like. The network in which the terminal is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.
S10: Detect whether the preset area of the terminal and the microphone of the terminal are blocked.
In at least one embodiment of the present invention, the preset area may be located at the top of the terminal and the microphone at the bottom of the terminal; alternatively, the preset area and the microphone are located at opposite positions on the terminal. For example, when the microphone is located at the top of the terminal, the preset area is located at the bottom of the terminal.
In at least one embodiment of the present invention, detecting whether the preset area of the terminal and the microphone of the terminal are blocked may be done in one of the following ways or a combination of them:
(1) Detecting whether the preset area of the terminal and the microphone of the terminal are blocked using distance sensors of the terminal. For example, a distance sensor is installed near the preset area of the terminal and another near the microphone of the terminal. Each sensor measures its distance to the human body, and the terminal judges whether that distance is less than or equal to a first preset distance, thereby determining whether the preset area and the microphone are blocked. Preferably, if a distance sensor (for example, the one installed near the preset area of the terminal) detects that the distance to the human body is less than or equal to the first preset distance, the position corresponding to that sensor is judged to be blocked; if the distance sensor detects that the distance to the human body is greater than the first preset distance, the position corresponding to that sensor is judged not to be blocked. For example, if the distance between the sensor installed near the preset area and the human body is greater than the first preset distance, the preset area of the terminal is judged not to be blocked; if the sensor installed near the microphone detects that the distance to the human body is less than or equal to the first preset distance, the microphone of the terminal is judged to be blocked.
(2) Detecting whether the preset area of the terminal and the microphone of the terminal are blocked using light sensors of the terminal. For example, a light sensor is installed near the preset area of the terminal and another near the microphone of the terminal. Each sensor detects whether light is being blocked, thereby determining whether the preset area and the microphone are blocked. Preferably, if a light sensor (for example, the one installed near the preset area of the terminal) detects that light is blocked, the position corresponding to that sensor is judged to be blocked; if the light sensor detects that no light is blocked, the position corresponding to that sensor is judged not to be blocked. For example, if the light sensor installed near the preset area detects that no light is blocked, the preset area of the terminal is judged not to be blocked; if the light sensor installed near the microphone detects that light is blocked, the microphone of the terminal is judged to be blocked.
S11: If the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collect ambient sound through the microphone.
In at least one embodiment of the present invention, when it is detected that the preset area of the terminal is not blocked and the microphone of the terminal is blocked, it can be inferred that someone is close to the microphone of the terminal and may be using the microphone to input speech to the terminal. At this point, ambient sound is collected through the microphone.
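The distance-sensor variant of S10 and the gating rule of S11 can be sketched as follows. This is only an illustration: the threshold value, the `OcclusionReading` type, and the function names are hypothetical, since the patent names a "first preset distance" without giving a number or an API.

```python
from dataclasses import dataclass

# Hypothetical value; the patent only calls this the "first preset distance".
FIRST_PRESET_DISTANCE_CM = 3.0


@dataclass
class OcclusionReading:
    """One sample from the two sensor locations (assumed inputs)."""
    preset_area_distance_cm: float  # distance sensor near the preset area
    microphone_distance_cm: float   # distance sensor near the microphone


def is_blocked(distance_cm: float,
               threshold_cm: float = FIRST_PRESET_DISTANCE_CM) -> bool:
    """A location counts as blocked when the body is at or within the threshold."""
    return distance_cm <= threshold_cm


def should_collect_sound(reading: OcclusionReading) -> bool:
    """S11 gate: preset area clear AND microphone blocked."""
    return (not is_blocked(reading.preset_area_distance_cm)
            and is_blocked(reading.microphone_distance_cm))
```

A light-sensor variant would only swap `is_blocked` for a check on whether the sensor reports blocked light; the gate in `should_collect_sound` stays the same.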
In at least one embodiment of the present invention, noise reduction is performed on the background sound in the ambient sound using dual-microphone noise reduction technology to obtain the speech in the ambient sound.
In at least one embodiment of the present invention, dual-microphone noise reduction means that two microphones are installed in the terminal. One is the microphone normally used during calls and collects the voice; the other is placed at the top of the body and collects ambient noise. By applying dual-microphone noise reduction to the background sound in the ambient sound, interference from the ambient noise around the terminal can be effectively suppressed and the clarity of the speech in the ambient sound greatly improved, so that the speech in the ambient sound is obtained more reliably.
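The patent does not say which algorithm the dual-microphone noise reduction uses. One textbook way to use a voice microphone plus a noise microphone is adaptive noise cancellation with a least-mean-squares (LMS) filter; the sketch below uses that technique as a stand-in, and the tap count, step size, and synthetic signals are all assumptions chosen for illustration.

```python
import math


def lms_noise_cancel(primary, reference, taps=4, mu=0.05):
    """Subtract an adaptively filtered copy of `reference` from `primary`.

    primary:   samples from the voice microphone (speech + noise)
    reference: samples from the noise microphone (mostly noise)
    Returns the error signal, which approximates the speech.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for p, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = p - noise_estimate  # cleaned output sample
        # LMS update: nudge each weight in proportion to error * input.
        weights = [w + mu * error * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def synthetic_signals(n=2000):
    """Toy test data: a slow 'speech' tone plus a faster 'noise' tone."""
    speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
    noise = [math.sin(2 * math.pi * 0.13 * i) for i in range(n)]
    primary = [s + 0.8 * v for s, v in zip(speech, noise)]
    return speech, noise, primary
```

On the toy data, `lms_noise_cancel(primary, noise)` should track `speech` closely once the weights have settled. Real devices typically work on framed audio and have to cope with speech leaking into the noise microphone, which this sketch ignores.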
S12: If the ambient sound meets the preset condition, start the voice assistant in the terminal and input the ambient sound to the voice assistant as a voice operation instruction.
In at least one embodiment of the present invention, before starting the voice assistant in the terminal and inputting the ambient sound to the voice assistant as a voice operation instruction, the terminal judges whether the ambient sound meets the preset condition.
In at least one embodiment of the present invention, the terminal judges whether the ambient sound meets the preset condition as follows: it judges whether the distance between the terminal and the user meets a first preset condition, and it detects whether the user is speaking. If the distance between the terminal and the user meets the first preset condition and the user is speaking, the ambient sound is determined to meet the preset condition.
In at least one embodiment of the present invention, judging whether the distance between the terminal and the user meets the first preset condition may be done in one of the following ways or a combination of them:
(1) Judging whether the distance between the terminal and the user meets the first preset condition using a distance sensor of the terminal.
For example, a distance sensor is installed in the terminal and measures the distance between the terminal and the user. When that distance is less than or equal to a second preset distance, the distance between the terminal and the user is judged to meet the first preset condition.
It should be noted that the second preset distance may be 50 cm, 100 cm, and so on. The second preset distance may be configured at the factory by technical personnel based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
(2) Judging whether the distance between the terminal and the user meets the first preset condition using a light sensor of the terminal.
For example, a light sensor is installed in the terminal. When the light sensor detects that light reaching the terminal is blocked, the distance between the terminal and the user is judged to meet the first preset condition.
(3) Judging whether the distance between the terminal and the user meets the first preset condition using a temperature sensor of the terminal.
For example, a temperature sensor is installed in the terminal. When someone approaches the terminal, the temperature sensor detects a change in the temperature of the terminal, and the distance between the terminal and the user is judged to meet the first preset condition.
(4) Judging whether the distance between the terminal and the user meets the first preset condition using a camera device of the terminal.
For example, the terminal is provided with a camera device, which captures a facial image of the user. If the distance between the user's eyes in the facial image is greater than or equal to a preset value, the distance between the terminal and the user is judged to meet the first preset condition.
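The camera-based check in (4) relies on perspective: the closer the face is to the camera, the farther apart the eyes appear in the image. A minimal sketch, assuming eye centres are already available as pixel coordinates from some face detector (not specified in the patent) and using a hypothetical preset value:

```python
import math

# Hypothetical preset value in pixels; the patent does not give a number.
MIN_EYE_SPACING_PX = 120.0


def eye_spacing_px(left_eye, right_eye):
    """Euclidean distance between two (x, y) eye centres, in pixels."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def distance_condition_met(left_eye, right_eye,
                           min_spacing_px=MIN_EYE_SPACING_PX):
    """First preset condition, camera variant: eyes appear far enough apart."""
    return eye_spacing_px(left_eye, right_eye) >= min_spacing_px
```

For instance, eye centres at `(100, 200)` and `(260, 200)` are 160 pixels apart and would satisfy the assumed 120-pixel threshold. A usable threshold depends on camera resolution and field of view, so it would have to be calibrated per device.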
In at least one embodiment of the present invention, detecting whether the user is speaking includes: obtaining a lip image of the user through the camera device of the terminal; monitoring changes in the lip image; and, if the change in the lip image meets a second preset condition, determining that the user is speaking.
In at least one embodiment of the present invention, the change in the lip image meeting the second preset condition includes one of the following cases or a combination of them:
(1) The change frequency of the lip image is greater than or equal to a preset frequency.
For example, if the terminal detects that the change frequency of the lip image is 20 Hz, which exceeds the preset frequency of 15 Hz, the terminal determines that the change in the lip image meets the second preset condition.
It should be noted that the preset frequency may be configured at the factory by technical personnel based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
(2) The number of times the lip image changes within a preset duration is greater than or equal to a preset count.
For example, if the terminal detects that the number of changes in the lip image within the preset duration of 1 second is greater than or equal to the preset count of 5, the terminal determines that the change in the lip image meets the second preset condition.
It should be noted that, similarly, the preset duration and the preset count may be configured at the factory by technical personnel based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
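Case (2) amounts to a sliding-window count. The sketch below assumes that some upstream image-processing step (not described in the patent) already reports a timestamp each time the lip image changes; the function name and the input format are hypothetical, while the defaults reuse the example values of 1 second and 5 changes.

```python
from collections import deque


def is_speaking(change_times_s, preset_duration_s=1.0, preset_count=5):
    """Return True if any window of `preset_duration_s` seconds contains
    at least `preset_count` lip-image changes.

    change_times_s: timestamps (seconds, ascending) of detected lip changes.
    """
    window = deque()
    for t in change_times_s:
        window.append(t)
        # Drop changes that fall outside the window ending at time t.
        while t - window[0] > preset_duration_s:
            window.popleft()
        if len(window) >= preset_count:
            return True
    return False
```

Changes at 0.0, 0.2, 0.4, 0.6 and 0.8 s give five changes within one second and count as speaking; changes spaced 0.5 s apart never reach five in any one-second window.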
In at least one embodiment of the present invention, inputting the ambient sound to the voice assistant as a voice operation instruction includes: performing noise reduction on the background sound in the ambient sound using dual-microphone noise reduction technology to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction.
In at least one embodiment of the present invention, after obtaining the speech in the ambient sound, the terminal inputs the speech to the voice assistant as the voice operation instruction and controls the voice assistant to perform the corresponding function according to the voice operation instruction.
For example, when the terminal applies dual-microphone noise reduction to the background sound in the ambient sound and obtains the speech "help me check today's weather", the terminal takes "help me check today's weather" as the voice operation instruction, the voice assistant is started, and the terminal controls the voice assistant to execute the instruction "query weather".
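The patent does not describe how a recognized instruction is mapped to a function. As a rough illustration only, a keyword table could route the transcribed text to a handler; the table, handler names, and return values below are all hypothetical.

```python
def query_weather():
    """Placeholder handler; a real assistant would call a weather service."""
    return "query weather"


def set_alarm():
    """Placeholder handler for a second, made-up command."""
    return "set alarm"


# Hypothetical keyword-to-handler table; not taken from the patent.
COMMANDS = {
    "weather": query_weather,
    "alarm": set_alarm,
}


def dispatch(transcript):
    """Run the first handler whose keyword appears in the transcript."""
    text = transcript.lower()
    for keyword, handler in COMMANDS.items():
        if keyword in text:
            return handler()
    return None  # no matching function
```

With this table, `dispatch("Help me check today's weather")` runs `query_weather`. Production assistants generally use intent classification rather than substring matching.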
In summary, the present invention detects whether a preset area of a terminal and a microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collects ambient sound through the microphone; and if the ambient sound meets a preset condition, starts the voice assistant in the terminal and inputs the ambient sound to the voice assistant as a voice operation instruction. The present invention can therefore start the voice assistant simply and quickly, avoiding both cumbersome operating procedures and the embarrassment that a stilted wake word causes the user, and making human-computer interaction more natural and convenient.
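Steps S10 to S12 can be strung together as a single control flow. In this sketch the sensor results are passed in as booleans and the platform-specific parts are injected as callables; every name here is hypothetical and stands in for whatever the terminal actually provides.

```python
from typing import Callable


def try_start_assistant(preset_area_blocked: bool,
                        microphone_blocked: bool,
                        collect_sound: Callable[[], bytes],
                        meets_preset_condition: Callable[[], bool],
                        start_assistant: Callable[[bytes], None]) -> bool:
    """Return True if the voice assistant was started."""
    # S10/S11: proceed only if the preset area is clear and the mic is blocked.
    if preset_area_blocked or not microphone_blocked:
        return False
    sound = collect_sound()
    # S12: e.g. the user is close enough and is speaking.
    if not meets_preset_condition():
        return False
    # Start the assistant and hand it the sound as the voice instruction.
    start_assistant(sound)
    return True
```

Injecting the callables keeps the flow testable without real sensors or audio hardware.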
Fig. 2 is a functional block diagram of a preferred embodiment of the voice assistant startup device of the present invention. The voice assistant startup device 11 includes a detection unit 100, a collecting unit 101, a start unit 102, a judging unit 103, a determining unit 104, and a control unit 105. The function of each unit is described in detail in the embodiments that follow.
The detection unit 100 detects whether the preset area of the terminal and the microphone of the terminal are blocked.
In at least one embodiment of the present invention, the preset area may be located at the top of the terminal and the microphone at the bottom of the terminal; alternatively, the preset area and the microphone are located at opposite positions on the terminal. For example, when the microphone is located at the top of the terminal, the preset area is located at the bottom of the terminal.
In at least one embodiment of the present invention, the detection unit 100 may detect whether the preset area of the terminal and the microphone of the terminal are blocked in one of the following ways or a combination of them:
(1) Detecting whether the preset area of the terminal and the microphone of the terminal are blocked using distance sensors of the terminal. For example, a distance sensor is installed near the preset area of the terminal and another near the microphone of the terminal. Each sensor measures its distance to the human body, and the terminal judges whether that distance is less than or equal to a first preset distance, thereby determining whether the preset area and the microphone are blocked. Preferably, if a distance sensor (for example, the one installed near the preset area of the terminal) detects that the distance to the human body is less than or equal to the first preset distance, the position corresponding to that sensor is judged to be blocked; if the distance sensor detects that the distance to the human body is greater than the first preset distance, the position corresponding to that sensor is judged not to be blocked. For example, if the distance between the sensor installed near the preset area and the human body is greater than the first preset distance, the preset area of the terminal is judged not to be blocked; if the sensor installed near the microphone detects that the distance to the human body is less than or equal to the first preset distance, the microphone of the terminal is judged to be blocked.
(2) Detecting whether the preset area of the terminal and the microphone of the terminal are blocked using light sensors of the terminal. For example, a light sensor is installed near the preset area of the terminal and another near the microphone of the terminal. Each sensor detects whether light is being blocked, thereby determining whether the preset area and the microphone are blocked. Preferably, if a light sensor (for example, the one installed near the preset area of the terminal) detects that light is blocked, the position corresponding to that sensor is judged to be blocked; if the light sensor detects that no light is blocked, the position corresponding to that sensor is judged not to be blocked. For example, if the light sensor installed near the preset area detects that no light is blocked, the preset area of the terminal is judged not to be blocked; if the light sensor installed near the microphone detects that light is blocked, the microphone of the terminal is judged to be blocked.
If the preset area of the terminal is not blocked and the microphone of the terminal is blocked, the collecting unit 101 collects ambient sound through the microphone.
In at least one embodiment of the present invention, when the detection unit 100 detects that the preset area of the terminal is not blocked and the microphone of the terminal is blocked, it can be inferred that someone is close to the microphone of the terminal and may be using the microphone to input speech to the terminal. At this point, the collecting unit 101 collects ambient sound through the microphone.
In at least one embodiment of the present invention, the collecting unit 101 performs noise reduction on the background sound in the ambient sound using dual-microphone noise reduction technology to obtain the speech in the ambient sound.
In at least one embodiment of the present invention, dual-microphone noise reduction means that two microphones are installed in the terminal. One is the microphone normally used during calls and collects the voice; the other is placed at the top of the body and collects ambient noise. By applying dual-microphone noise reduction to the background sound in the ambient sound, the collecting unit 101 can effectively suppress interference from the ambient noise around the terminal and greatly improve the clarity of the speech in the ambient sound, so that the speech in the ambient sound is obtained more reliably.
If the ambient sound meets the preset condition, the start unit 102 starts the voice assistant in the terminal and inputs the ambient sound to the voice assistant as a voice operation instruction.
In at least one embodiment of the present invention, before the start unit 102 starts the voice assistant in the terminal and inputs the ambient sound to the voice assistant as a voice operation instruction, the judging unit 103 judges whether the ambient sound meets the preset condition.
In at least one embodiment of the present invention, the judging unit 103 judges whether the ambient sound meets the preset condition as follows: the judging unit 103 judges whether the distance between the terminal and the user meets a first preset condition, and the detection unit 100 detects whether the user is speaking. If the distance between the terminal and the user meets the first preset condition and the user is speaking, the determining unit 104 determines that the ambient sound meets the preset condition.
In at least one embodiment of the present invention, the judging unit 103 may judge whether the distance between the terminal and the user meets the first preset condition by one or a combination of the following ways:
(1) Judging, by a distance sensor of the terminal, whether the distance between the terminal and the user meets the first preset condition.
For example: a distance sensor is installed in the terminal and detects the distance between the terminal and the user. When the distance between the terminal and the user is less than or equal to a second preset distance, the judging unit 103 judges that the distance between the terminal and the user meets the first preset condition.
It should be noted that the second preset distance may be 50 cm, 100 cm, and so on. It may be configured at the factory by the relevant technicians based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
(2) Judging, by a light sensor of the terminal, whether the distance between the terminal and the user meets the first preset condition.
For example: a light sensor is installed in the terminal. When the light sensor detects that the light reaching the terminal is blocked, the judging unit 103 judges that the distance between the terminal and the user meets the first preset condition.
(3) Judging, by a temperature sensor of the terminal, whether the distance between the terminal and the user meets the first preset condition.
For example: a temperature sensor is installed in the terminal. When someone approaches the terminal, the temperature sensor detects a change in the temperature of the terminal, and the judging unit 103 accordingly judges that the distance between the terminal and the user meets the first preset condition.
(4) Judging, by a camera device of the terminal, whether the distance between the terminal and the user meets the first preset condition.
For example: the terminal is provided with a camera device, which captures a face image of the user. If, in the face image, the distance between the user's eyes is greater than or equal to a preset value, the judging unit 103 judges that the distance between the terminal and the user meets the first preset condition.
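The four ways of judging the first preset condition can be sketched as follows. This is a minimal illustration, not the patented implementation: the `SensorReadings` type, the field names, the temperature and eye-distance thresholds, and the choice to accept any single passing check are all assumptions. Only the 50 cm example distance comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorReadings:
    """Hypothetical sensor snapshot; None means that sensor gave no reading."""
    distance_cm: Optional[float] = None           # way (1): distance sensor
    light_blocked: Optional[bool] = None          # way (2): light sensor
    temperature_change_c: Optional[float] = None  # way (3): temperature sensor
    eye_distance_px: Optional[float] = None       # way (4): face image from the camera


def first_preset_condition_met(
    readings: SensorReadings,
    second_preset_distance_cm: float = 50.0,  # example value given in the text
    min_temperature_change_c: float = 0.5,    # assumed threshold
    preset_eye_distance_px: float = 120.0,    # assumed threshold
) -> bool:
    checks = []
    if readings.distance_cm is not None:
        checks.append(readings.distance_cm <= second_preset_distance_cm)
    if readings.light_blocked is not None:
        checks.append(readings.light_blocked)
    if readings.temperature_change_c is not None:
        checks.append(abs(readings.temperature_change_c) >= min_temperature_change_c)
    if readings.eye_distance_px is not None:
        # Eyes that appear farther apart in the image imply a closer face.
        checks.append(readings.eye_distance_px >= preset_eye_distance_px)
    # Assumption: one passing check is enough; a stricter design could use all().
    return any(checks)
```

For instance, `first_preset_condition_met(SensorReadings(distance_cm=40.0))` is satisfied under the 50 cm example threshold, while a reading of 80 cm alone is not.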
In at least one embodiment of the present invention, the detection unit 100 detecting whether the user is speaking includes: obtaining a lip image of the user through the camera device of the terminal; monitoring changes in the lip image; and, if the change in the lip image meets a second preset condition, the determining unit 104 determining that the user is speaking.
In at least one embodiment of the present invention, the change in the lip image meeting the second preset condition includes one or a combination of the following cases:
(1) The change frequency of the lip image is greater than or equal to a preset frequency.
For example: the detection unit 100 detects that the change frequency of the lip image is 20 Hz, which is greater than the preset frequency of 15 Hz, so the determining unit 104 determines that the change in the lip image meets the second preset condition.
It should be noted that the preset frequency may be configured at the factory by the relevant technicians based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
(2) The number of times the lip image changes within a preset duration is greater than or equal to a preset number of times.
For example: when the detection unit 100 detects that the number of changes of the lip image within the preset duration of 1 second is greater than or equal to the preset number of 5, the determining unit 104 determines that the change in the lip image meets the second preset condition.
It should be noted that, similarly, the preset duration and the preset number of times may be configured at the factory by the relevant technicians based on experience, or customized by the user of the terminal according to usage habits; the present invention places no restriction on this.
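The two cases of the second preset condition can be illustrated with a short sketch. It assumes that an upstream image-processing step (not shown, and not specified in the text) has already produced a list of timestamps, in seconds, at which the lip image changed. The function names are hypothetical; the default thresholds reuse the example values of 15 Hz, 1 second and 5 changes.

```python
from typing import Sequence


def lip_change_frequency_hz(change_times_s: Sequence[float]) -> float:
    """Average number of lip-image changes per second over the observed span."""
    if len(change_times_s) < 2:
        return 0.0
    span = change_times_s[-1] - change_times_s[0]
    return (len(change_times_s) - 1) / span if span > 0 else 0.0


def max_changes_in_window(change_times_s: Sequence[float], window_s: float) -> int:
    """Largest number of changes that fit in any window of length window_s."""
    best = 0
    start = 0
    for end, t in enumerate(change_times_s):
        # Slide the window start forward until the window covers at most window_s.
        while t - change_times_s[start] > window_s:
            start += 1
        best = max(best, end - start + 1)
    return best


def user_is_speaking(
    change_times_s: Sequence[float],
    preset_frequency_hz: float = 15.0,
    preset_duration_s: float = 1.0,
    preset_count: int = 5,
) -> bool:
    times = sorted(change_times_s)
    # Case (1): change frequency reaches the preset frequency, or
    # case (2): enough changes occur within the preset duration.
    return (
        lip_change_frequency_hz(times) >= preset_frequency_hz
        or max_changes_in_window(times, preset_duration_s) >= preset_count
    )
```

Treating the two cases as alternatives is a design choice for this sketch; the text also permits requiring them in combination.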
In at least one embodiment of the present invention, inputting the ambient sound to the voice assistant as a voice operation instruction includes: applying dual-microphone noise reduction to the background sound in the ambient sound to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction.
In at least one embodiment of the present invention, after the speech in the ambient sound is obtained, the control unit 105 inputs the speech to the voice assistant as the voice operation instruction and, according to the voice operation instruction, controls the voice assistant to perform the corresponding function.
For example: when the terminal applies dual-microphone noise reduction to the background sound in the ambient sound and obtains the speech "help me check today's weather", the terminal obtains the voice operation instruction "help me check today's weather" and the voice assistant is started; the control unit 105 can then control the voice assistant to execute the instruction "query weather".
In summary, the present invention can detect whether the preset area of the terminal and the microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collect ambient sound through the microphone; and, if the ambient sound meets the preset condition, start the voice assistant in the terminal and input the ambient sound to the voice assistant as a voice operation instruction. The present invention can therefore start the voice assistant simply and quickly, avoiding a cumbersome operating process and the awkwardness that stilted wake-up words cause the user, and making human-computer interaction more natural and easy.
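The mapping from a recognized voice operation instruction to the function the assistant performs can be sketched as a simple lookup. The text does not describe how the instruction is interpreted, so the keyword table, the function names and the substring matching below are purely hypothetical; the input is assumed to be text already produced by speech recognition.

```python
from typing import Dict, Optional

# Hypothetical keyword table. A real assistant would rely on speech
# recognition and intent parsing rather than substring matching.
COMMANDS: Dict[str, str] = {
    "weather": "query_weather",
    "alarm": "set_alarm",
}


def resolve_function(instruction_text: str) -> Optional[str]:
    """Return the name of the function to perform, or None if nothing matches."""
    lowered = instruction_text.lower()
    for keyword, function_name in COMMANDS.items():
        if keyword in lowered:
            return function_name
    return None
```

With this table, the example instruction "help me check today's weather" resolves to the hypothetical `query_weather` function.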
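The overall flow summarized above can be sketched as one function. This is an assumption-laden outline rather than the actual program: the parameter names and the use of callables for sound collection, the preset-condition check and assistant start-up are choices made for this example.

```python
from typing import Callable


def try_start_voice_assistant(
    preset_area_blocked: bool,
    microphone_blocked: bool,
    collect_ambient_sound: Callable[[], bytes],
    meets_preset_condition: Callable[[bytes], bool],
    start_assistant: Callable[[bytes], None],
) -> bool:
    """Return True if the voice assistant was started."""
    # Detect: proceed only if the preset area is clear and the microphone is covered.
    if preset_area_blocked or not microphone_blocked:
        return False
    # Collect ambient sound through the microphone.
    ambient_sound = collect_ambient_sound()
    # Start the assistant only if the ambient sound meets the preset condition
    # (for example, the user is close enough and is speaking).
    if not meets_preset_condition(ambient_sound):
        return False
    # The collected sound itself is passed on as the voice operation instruction.
    start_assistant(ambient_sound)
    return True
```

Passing the collaborators in as callables keeps the sketch independent of any particular sensor or audio API.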
Fig. 3 is a schematic structural diagram of the terminal of a preferred embodiment of the present invention that implements the voice assistant starting method.
The terminal 1 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The terminal 1 may also be, but is not limited to, any electronic product capable of human-computer interaction with a user through a keyboard, a mouse, a remote control, a touch pad, a voice-control device, or the like, for example a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, and so on.
The terminal 1 may also be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server.
The network in which the terminal 1 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
In one embodiment of the present invention, the terminal 1 includes, but is not limited to, a memory 12, a processor 13, and a computer program stored in the memory 12 and executable on the processor 13, such as a voice assistant starting program. Those skilled in the art will understand that the schematic diagram is merely an example of the terminal 1 and does not limit the terminal 1, which may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal 1 may also include input/output devices, network access devices, buses, and the like.
The processor 13 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 13 is the computing core and control center of the terminal 1; it connects the various parts of the whole terminal 1 using various interfaces and lines.
The processor 13 executes the operating system of the terminal 1 and the various installed application programs, program code, and the like. The processor 13 executes the application program to implement the steps in each of the above embodiments of the voice assistant starting method, such as steps S10, S11 and S12 shown in Fig. 1.
Alternatively, when executing the computer program, the processor 13 implements the functions of the modules/units in each of the above device embodiments, for example: detecting whether the preset area of the terminal and the microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collecting ambient sound through the microphone; and, if the ambient sound meets the preset condition, starting the voice assistant in the terminal and inputting the ambient sound to the voice assistant as a voice operation instruction.
Exemplarily, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments describe the execution process of the computer program in the terminal 1. For example, the computer program may be divided into the detection unit 100, the collecting unit 101, the start unit 102, the judging unit 103, the determining unit 104 and the control unit 105.
The memory 12 may be used to store the computer program and/or modules. The processor 13 implements the various functions of the terminal 1 by running or executing the computer program and/or modules stored in the memory 12 and by calling the data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the phone (such as audio data or a phone book). In addition, the memory 12 may include high-speed random access memory, and may also include non-volatile memory, for example a hard disk, internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
The memory 12 may be an external memory and/or an internal memory of the terminal 1. Further, the memory 12 may be a circuit with a storage function that has no physical form of its own within an integrated circuit, such as RAM (random-access memory) or a FIFO (first in, first out) buffer. Alternatively, the memory 12 may be a memory with a physical form, such as a memory module or a TF card (TransFlash card).
If the integrated modules/units of the terminal 1 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the above method embodiments of the present invention may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The distance sensor 14 is used to detect, by sensing the distance to a human body, whether the preset area of the terminal 1 and the microphone 18 of the terminal are blocked, and to judge, by detecting the distance between the terminal 1 and the user, whether that distance meets the first preset condition.
The light sensor 15 is used to detect, by sensing whether light is blocked, whether the preset area of the terminal 1 and the microphone 18 of the terminal are blocked, and to judge, by detecting whether the light reaching the terminal 1 is blocked, whether the distance between the terminal 1 and the user meets the first preset condition.
The temperature sensor 16 is used to judge, by detecting a change in temperature, whether the distance between the terminal 1 and the user meets the first preset condition.
The camera device 17 is used to capture the face image of the user.
The microphone 18 is used to collect ambient sound.
With reference to Fig. 1, the memory 12 in the terminal 1 stores a plurality of instructions to implement a voice assistant starting method, and the processor 13 can execute the plurality of instructions to: detect whether the preset area of the terminal and the microphone of the terminal are blocked; if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collect ambient sound through the microphone; and, if the ambient sound meets the preset condition, start the voice assistant in the terminal and input the ambient sound to the voice assistant as a voice operation instruction.
According to a preferred embodiment of the present invention, the preset area is located at the top of the terminal and the microphone is located at the bottom of the terminal; or
the preset area and the microphone are located at opposite positions on the terminal.
According to a preferred embodiment of the present invention, the plurality of instructions executed by the processor 13 further include:
judging whether the distance between the terminal and the user meets the first preset condition;
detecting whether the user is speaking;
if the distance between the terminal and the user meets the first preset condition and the user is speaking, determining that the ambient sound meets the preset condition.
According to a preferred embodiment of the present invention, the plurality of instructions executed by the processor 13 further include:
obtaining a lip image of the user through the camera device of the terminal;
monitoring changes in the lip image;
if the change in the lip image meets the second preset condition, determining that the user is speaking.
According to a preferred embodiment of the present invention, the plurality of instructions executed by the processor 13 further include:
applying dual-microphone noise reduction to the background sound in the ambient sound to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction.
According to a preferred embodiment of the present invention, the plurality of instructions executed by the processor 13 further include:
controlling, according to the voice operation instruction, the voice assistant to perform the corresponding function.
Specifically, for how the processor 13 implements the above instructions, refer to the description of the relevant steps in the embodiment corresponding to Fig. 1; details are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be implemented in other specific forms without departing from its spirit or essential characteristics.
Therefore, the embodiments should be regarded in every respect as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the present invention. No reference sign in a claim should be construed as limiting the claim concerned.
Furthermore, the word "include" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be implemented by one unit or device through software or hardware. Words such as "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A voice assistant starting method, characterized in that the method comprises:
detecting whether a preset area of a terminal and a microphone of the terminal are blocked;
if the preset area of the terminal is not blocked and the microphone of the terminal is blocked, collecting ambient sound through the microphone; and
if the ambient sound meets a preset condition, starting a voice assistant in the terminal and inputting the ambient sound to the voice assistant as a voice operation instruction.
2. The voice assistant starting method according to claim 1, characterized in that:
the preset area is located at the top of the terminal and the microphone is located at the bottom of the terminal; or
the preset area and the microphone are located at opposite positions on the terminal.
3. The voice assistant starting method according to claim 1, characterized in that the method further comprises:
judging whether the distance between the terminal and the user meets a first preset condition;
detecting whether the user is speaking;
if the distance between the terminal and the user meets the first preset condition and the user is speaking, determining that the ambient sound meets the preset condition.
4. The voice assistant starting method according to claim 3, characterized in that detecting whether the user is speaking comprises:
obtaining a lip image of the user through a camera device of the terminal;
monitoring changes in the lip image;
if the change in the lip image meets a second preset condition, determining that the user is speaking.
5. The voice assistant starting method according to claim 1, characterized in that inputting the ambient sound to the voice assistant as a voice operation instruction comprises:
applying dual-microphone noise reduction to the background sound in the ambient sound to obtain the speech in the ambient sound, and inputting the speech to the voice assistant as the voice operation instruction;
the method further comprising:
controlling, according to the voice operation instruction, the voice assistant to perform the corresponding function.
6. A voice assistant starting device, characterized in that the device comprises:
a detection unit, configured to detect whether a preset area of a terminal and a microphone of the terminal are blocked;
a collecting unit, configured to collect ambient sound through the microphone if the preset area of the terminal is not blocked and the microphone of the terminal is blocked; and
a start unit, configured to start a voice assistant in the terminal and input the ambient sound to the voice assistant as a voice operation instruction if the ambient sound meets a preset condition.
7. The voice assistant starting device according to claim 6, characterized in that:
the preset area is located at the top of the terminal and the microphone is located at the bottom of the terminal; or
the preset area and the microphone are located at opposite positions on the terminal.
8. The voice assistant starting device according to claim 6, characterized in that the device further comprises:
a judging unit, configured to judge whether the distance between the terminal and the user meets a first preset condition;
the detection unit being further configured to detect whether the user is speaking;
a determining unit, configured to determine that the ambient sound meets the preset condition if the distance between the terminal and the user meets the first preset condition and the user is speaking.
9. A terminal, characterized in that the terminal comprises a processor, and the processor, when executing a computer program stored in a memory, implements the voice assistant starting method according to any one of claims 1-5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the voice assistant starting method according to any one of claims 1-5.
CN201710828831.1A 2017-09-14 2017-09-14 Voice assistant starts method and device, terminal and computer-readable recording medium Pending CN107678793A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710828831.1A CN107678793A (en) 2017-09-14 2017-09-14 Voice assistant starts method and device, terminal and computer-readable recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710828831.1A CN107678793A (en) 2017-09-14 2017-09-14 Voice assistant starts method and device, terminal and computer-readable recording medium

Publications (1)

Publication Number Publication Date
CN107678793A true CN107678793A (en) 2018-02-09

Family

ID=61136436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710828831.1A Pending CN107678793A (en) 2017-09-14 2017-09-14 Voice assistant starts method and device, terminal and computer-readable recording medium

Country Status (1)

Country Link
CN (1) CN107678793A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180209