CN110196900A - Interaction method and device for a terminal - Google Patents
Interaction method and device for a terminal
- Publication number
- CN110196900A (application number CN201910509292.4A)
- Authority
- CN
- China
- Prior art keywords
- wake
- condition
- response
- image sequence
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3343—Query execution using phonetics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
Abstract
The embodiments of the present application disclose an interaction method and device for a terminal. One specific embodiment of the method includes: acquiring an image sequence, where the image sequence includes at least one frame of a user image; determining, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and playing the response voice. This embodiment can wake up the terminal according to user images, adding a new interaction mode.
Description
Technical field
The present application relates to the field of computer technology, and in particular to an interaction method and device for a terminal.
Background
With the continuous progress of science and technology, intelligent terminals have gradually entered people's field of vision. Such intelligent terminals include smart speakers, smart televisions, smart cameras, intelligent robots, and the like. What sets these intelligent terminals apart is that they can not only play audio and video but also carry out voice interaction with users. However, the interaction modes of existing intelligent terminals are relatively simple, and the user experience is poor.
Summary of the invention
The embodiments of the present application propose an interaction method and device for a terminal.
In a first aspect, an embodiment of the present application provides an interaction method for a terminal, comprising: acquiring an image sequence, where the image sequence includes at least one frame of a user image; determining, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and playing the response voice.
In some embodiments, the image sequence includes an eye image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: analyzing the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: in response to determining that the wake-up condition is met, acquiring first voice information within a preset duration; performing semantic parsing on the first voice information; and determining, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a facial image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: performing expression recognition on the facial images in the image sequence; determining, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames in the image sequence differ, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: in response to determining that the wake-up condition is met, determining a response voice corresponding to the expression of the later frame of the two adjacent frames; and using the determined response voice as the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a body image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: analyzing the body images in the image sequence to determine action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: determining a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some embodiments, determining whether the preset wake-up condition is met comprises: in response to receiving a response voice sent by a server side, determining that the wake-up condition is met.
In some embodiments, the method further comprises: in response to the response voice having finished playing, acquiring second voice information within a preset duration; and controlling the terminal according to the second voice information.
In a second aspect, an embodiment of the present application provides an interaction device for a terminal, comprising: an image sequence acquisition unit configured to acquire an image sequence, where the image sequence includes at least one frame of a user image; a condition judgment unit configured to determine, according to the image sequence, whether a preset wake-up condition is met; a response voice determination unit configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition; and a response voice playing unit configured to play the response voice.
In some embodiments, the image sequence includes an eye image of the user; and the condition judgment unit is further configured to: analyze the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: in response to determining that the wake-up condition is met, acquire first voice information within a preset duration; perform semantic parsing on the first voice information; and determine, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a facial image of the user; and the condition judgment unit is further configured to: perform expression recognition on the facial images in the image sequence; determine, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames differ, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: in response to determining that the wake-up condition is met, determine a response voice corresponding to the expression of the later frame of the two adjacent frames; and use the determined response voice as the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a body image of the user; and the condition judgment unit is further configured to: analyze the body images in the image sequence to determine the action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: determine a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some embodiments, the condition judgment unit is further configured to: in response to receiving a response voice sent by a server side, determine that the wake-up condition is met.
In some embodiments, the device further comprises: a voice information acquisition unit configured to acquire, in response to the response voice having finished playing, second voice information within a preset duration; and a terminal control unit configured to control the terminal according to the second voice information.
In a third aspect, an embodiment of the present application provides a terminal, comprising: one or more processors; and a storage device on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method described in any embodiment of the first aspect.
The interaction method and device for a terminal provided by the above embodiments of the present application can acquire an image sequence that includes at least one frame of a user image. Then, according to the image sequence, they determine whether a preset wake-up condition is met. After determining that the wake-up condition is met, a response voice corresponding to the wake-up condition is determined. Finally, the response voice is played. The method of these embodiments can wake up the terminal according to user images, adding a new interaction mode.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the interaction method for a terminal according to the present application;
Fig. 3a is a schematic diagram of one application scenario of the interaction method for a terminal according to the present application;
Fig. 3b is a schematic diagram of another application scenario of the interaction method for a terminal according to the present application;
Fig. 3c is a schematic diagram of still another application scenario of the interaction method for a terminal according to the present application;
Fig. 4 is a flowchart of another embodiment of the interaction method for a terminal according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the interaction device for a terminal according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the terminal of the embodiments of the present application.
Specific embodiment
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It may be understood that the specific embodiments described here are used only to explain the related invention, rather than to limit that invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the accompanying drawings.
It should be noted that, in the absence of conflict, the embodiments in the present application and the features in those embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the interaction method for a terminal or the interaction device for a terminal of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include a smart speaker 101, terminal devices 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the smart speaker 101, the terminal devices 102 and 103, and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
An image acquisition device may be installed on the smart speaker 101 to collect user images. The smart speaker may then interact with the server 105 through the network 104 to receive or send messages, for example sending collected images to the server 105 or receiving information sent by the server 105.
Users may use the terminal devices 102 and 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 102 and 103, such as image processing applications, audio and video playing applications, web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 102 and 103 may be hardware or software. When the terminal devices 102 and 103 are hardware, they may be various electronic devices equipped with a microphone and an image acquisition device, including but not limited to smart phones, tablet computers, smart televisions, laptop portable computers, and desktop computers. When the terminal devices 102 and 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example a background server that processes the images collected by the smart speaker 101 and the terminal devices 102 and 103. The background server may analyze and otherwise process the received data, such as images, and feed the processing results (for example, a response voice) back to the smart speaker 101 and the terminal devices 102 and 103.
It should be noted that the server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
It should be noted that the interaction method for a terminal provided by the embodiments of the present application is generally executed by a terminal (for example, the smart speaker 101 or the terminal devices 102 and 103). Correspondingly, the interaction device for a terminal is generally arranged in the terminal. In some implementations, some steps of the interaction method for a terminal may also be executed by a server side (for example, the server 105). Correspondingly, some units of the interaction device for a terminal may also be arranged on the server side.
It should be understood that the numbers of smart speakers, terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of smart speakers, terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the interaction method for a terminal according to the present application is shown. The interaction method for a terminal of this embodiment comprises the following steps:
Step 201: an image sequence is acquired.
In this embodiment, the executing body of the interaction method for a terminal (for example, the smart speaker 101 or the terminal devices 102 and 103 shown in Fig. 1) may acquire the image sequence through a wired or wireless connection, for example through an image acquisition device connected to the executing body. The image sequence may include at least one frame of a user image. Here, a user image refers to an image containing the user's face, limbs, or eyes. The frames in the image sequence may be multiple images continuously captured by the image acquisition device at preset time intervals.
It should be pointed out that the wireless connection may include but is not limited to 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, ZigBee connections, UWB (ultra-wideband) connections, and other currently known or future-developed wireless connection types.
Step 202: according to the image sequence, determine whether a preset wake-up condition is met.
After acquiring the image sequence, the executing body may judge whether the preset wake-up condition is met. Specifically, the executing body may first apply a series of processing operations to the image sequence and, according to the processing results, judge whether the wake-up condition is met. The processing may include but is not limited to feature extraction, expression recognition, action recognition, and the like. Here, the preset wake-up condition may be a pre-set condition under which the terminal needs to respond in a timely manner. For example, the wake-up condition may be that an image contains a flame, or that an image contains a stranger. It may be understood that there may be multiple wake-up conditions. Using such wake-up conditions, the executing body can be woken up automatically or passively.
Step 203: in response to determining that the wake-up condition is met, determine a response voice corresponding to the wake-up condition.
After determining that the wake-up condition is met, the executing body may determine the response voice corresponding to the wake-up condition. In this embodiment, each wake-up condition may correspond to at least one response voice, with different response voices corresponding to different scenarios. It may be understood that the response voice is intended to provide the user with relevant information, or to actively keep the user company (for example, by actively interacting with the user).
Step 204: play the response voice.
After determining the response voice, the executing body may play it, thereby realizing the interaction with the user.
The interaction method for a terminal provided by the above embodiment of the present application can acquire an image sequence that includes at least one frame of a user image. Then, according to the image sequence, it determines whether a preset wake-up condition is met. After determining that the wake-up condition is met, a response voice corresponding to the wake-up condition is determined. Finally, the response voice is played. The method of this embodiment can wake up the terminal according to user images, adding a new interaction mode.
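The four-step flow of Fig. 2 (steps 201-204) can be sketched as a simple loop. This is an illustrative sketch only, not the patent's implementation: every helper here (`capture_frames`, `wake_condition_met`, `choose_response`, `play_voice`) is a hypothetical stand-in for a component the description leaves abstract.

```python
from typing import Callable, List, Optional

def interaction_loop(
    capture_frames: Callable[[], List[bytes]],          # step 201: acquire image sequence
    wake_condition_met: Callable[[List[bytes]], bool],  # step 202: check wake-up condition
    choose_response: Callable[[List[bytes]], str],      # step 203: pick response voice
    play_voice: Callable[[str], None],                  # step 204: play it
) -> Optional[str]:
    frames = capture_frames()
    if not frames:  # the sequence must hold at least one user image
        return None
    if wake_condition_met(frames):
        response = choose_response(frames)
        play_voice(response)
        return response
    return None

# Toy run with stub components: "wake" whenever any frame is non-empty.
played: List[str] = []
result = interaction_loop(
    capture_frames=lambda: [b"frame0", b"frame1"],
    wake_condition_met=lambda frames: any(frames),
    choose_response=lambda frames: "How can I help?",
    play_voice=played.append,
)
```

The stubs exist only to make the control flow concrete; in the patent, steps 202 and 203 are where the gaze, expression, and action variants described below differ.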
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include an eye image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: analyzing the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
In this implementation, the image sequence may be acquired in real time and include an eye image of the user. The executing body may analyze the eye image to determine whether the user is gazing at the terminal. If it is determined that the user is currently gazing at the terminal, it is concluded that the user wants to interact with the terminal, and the wake-up condition is determined to be met.
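The description does not fix how the eye image is analyzed. The sketch below assumes, purely for illustration, that an upstream detector supplies a normalized pupil offset per frame, and treats a small offset held over several consecutive frames as gazing at the terminal; the threshold values are invented.

```python
from typing import Iterable, Tuple

def is_gazing(
    pupil_offsets: Iterable[Tuple[float, float]],
    max_offset: float = 0.15,   # assumed: pupil within 15% of eye-box centre
    min_frames: int = 3,        # assumed: sustained over 3 consecutive frames
) -> bool:
    """pupil_offsets: per-frame (dx, dy) of the pupil centre from the eye
    centre, normalized to the eye bounding box. Returns True if the offset
    stays small for at least min_frames consecutive frames."""
    run = 0
    for dx, dy in pupil_offsets:
        if (dx * dx + dy * dy) ** 0.5 <= max_offset:
            run += 1
            if run >= min_frames:
                return True
        else:
            run = 0  # glance broken; restart the count
    return False

def gaze_wake_met(pupil_offsets) -> bool:
    # Gaze at the terminal => the wake-up condition is satisfied.
    return is_gazing(pupil_offsets)
```

Requiring several consecutive frames rather than a single one reflects the claim's use of an image sequence: a momentary glance should not wake the terminal.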
Correspondingly, in this implementation, step 203 may be realized through the following steps, not shown in Fig. 2: in response to determining that the wake-up condition is met, acquiring first voice information within a preset duration; performing semantic parsing on the first voice information; and determining, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In this implementation, after determining that the wake-up condition is met, the executing body may monitor the sound in the environment and acquire the first voice information within a preset duration. The executing body may perform semantic parsing on the first voice information and, according to the parsing result, determine the response voice corresponding to the wake-up condition. For example, after being woken up, the terminal may listen to the user's voice. When the user is heard saying "tell me about tomorrow's weather", the terminal may use tomorrow's weather information as the content of the response voice. Alternatively, the user may set a keyword for the terminal in advance, which the terminal can use as its name. For example, if the terminal name set by the user is "Xiao Ming", then after the terminal is woken up and the user is heard saying "Xiao Ming, sing a song", the terminal may play a song.
In some concrete implementations, after acquiring the first voice information, the terminal may send it to the server side, which performs semantic parsing on the first voice information and, according to the parsing result, determines the response voice corresponding to the wake-up condition. After determining the response voice, the server side may send it to the terminal.
In other implementations, after acquiring the first voice information, the terminal may directly perform semantic parsing on it and, according to the parsing result, determine the response voice corresponding to the wake-up condition. This can reduce the processing pressure on the server side.
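The on-device variant (semantic parsing performed by the terminal itself) could, at its crudest, be a keyword-to-intent table. The rule table, intent names, and canned responses below are invented for illustration; a real terminal would use a proper NLU model, or delegate to the server side as the preceding paragraphs describe.

```python
# Hypothetical rule table: (intent, trigger keywords), checked in order.
RULES = [
    ("weather", ["weather", "rain", "sunny"]),
    ("music",   ["sing", "song", "play"]),
]

def parse_intent(text: str) -> str:
    """Crude semantic parse: map transcribed first voice information to an
    intent via keyword spotting, falling back to generic chitchat."""
    words = text.lower()
    for intent, keywords in RULES:
        if any(k in words for k in keywords):
            return intent
    return "chitchat"

def response_for(text: str) -> str:
    """Map the parsed intent to a response voice (here just its text)."""
    responses = {
        "weather":  "Tomorrow will be fine, 25-30 degrees.",
        "music":    "Playing a song for you.",
        "chitchat": "I'm here if you need anything.",
    }
    return responses[parse_intent(text)]
```

For example, `response_for("tell me about tomorrow's weather")` selects the weather response, matching the "tell me about tomorrow's weather" example in the description.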
Referring to Fig. 3a, Fig. 3a is a schematic diagram of an application scenario of the interaction method for a terminal according to this embodiment. In the scenario of Fig. 3a, user A is chatting with user B. User A glances at the smart speaker and says "we are going on a spring outing tomorrow", and user B agrees. User A then says "I just don't know whether tomorrow's weather is suitable". After collecting the above voice information, the smart speaker performs semantic parsing on it and, according to the parsing result, determines that the content of the response voice is "tomorrow's weather is fine, with a temperature of 25-30°C, exactly good weather for a spring outing". The smart speaker plays this response voice to the users. In this way, the smart speaker can actively join the conversation and provide relevant information.
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include a facial image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: performing expression recognition on the facial images in the image sequence; determining, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames differ, determining that the wake-up condition is met.
In this implementation, the image sequence may include facial images of the user. The executing body may perform expression recognition on the facial images and then, according to the recognition result, determine whether the expressions corresponding to any two adjacent frames in the image sequence are identical. If the expressions corresponding to two adjacent frames differ, it is determined that the user's expression has changed and that the wake-up condition is met.
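The adjacent-frame expression comparison can be sketched as follows. The expression recognizer is passed in as a function, since the patent does not fix a model, and the response table is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

def expression_changed(
    frames: List,
    recognize: Callable[[object], str],
) -> Optional[Tuple[str, str]]:
    """Run expression recognition on each frame, then compare adjacent
    frames. Returns (old_expression, new_expression) for the first adjacent
    pair whose expressions differ, or None if no change is found."""
    labels = [recognize(f) for f in frames]
    for prev, curr in zip(labels, labels[1:]):
        if prev != curr:
            return prev, curr  # curr is the expression of the later frame
    return None

# Hypothetical expression-to-response table (step 203: the later frame's
# expression selects the response voice).
RESPONSES = {"pain": "Do you need any help?", "crying": "Would you like to chat?"}

def respond_to_change(frames, recognize) -> Optional[str]:
    change = expression_changed(frames, recognize)
    if change is None:
        return None  # wake-up condition not met; stay silent
    return RESPONSES.get(change[1], "Is everything all right?")
```

A stub recognizer (e.g. a dict lookup) is enough to exercise the logic: a sequence whose labels go neutral, neutral, pain produces the pain response.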
Correspondingly, in this implementation, step 203 may be realized through the following steps, not shown in Fig. 2: in response to determining that the wake-up condition is met, determining the response voice corresponding to the expression of the later frame of the two adjacent frames; and using the determined response voice as the response voice corresponding to the wake-up condition.
In this implementation, after determining that the wake-up condition is met, the executing body may take the later frame of the two adjacent frames as the target image, and use the response voice corresponding to the expression of the target image as the response voice corresponding to the wake-up condition. For example, if the user's expression suddenly becomes pained, the executing body, after detecting this expression change, may determine that the response voice corresponding to the pained expression is "do you need any help?".
Referring to Fig. 3b, Fig. 3b is a schematic diagram of another application scenario of the interaction method for a terminal according to this embodiment. In the scenario of Fig. 3b, user A is home alone and suddenly starts crying. After recognizing the user's expression, the smart speaker determines that the response voice corresponding to crying is "Master, are you in a bad mood today? Would you like to chat with me?". In this way, the smart speaker can initiate a conversation according to the user's expression, improving the user experience.
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include a body image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: analyzing the body images in the image sequence to determine the action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
In this implementation, the image sequence may include body images of the user. The executing subject may analyze the body images to determine the user's action information. Specifically, the executing subject may determine the action information from the position of the user's body in each image of the sequence. The action information may include an action type and a duration, and the action type may include, for example, jumping, falling, or squatting. The executing subject may determine whether the user's action information satisfies a preset condition and, if so, determine that the wake-up condition is met. The preset condition may be, for example, that a fall has lasted for 2 minutes, or that jumping has lasted for 2 minutes.
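The duration-based check on the action information can be sketched as follows. The threshold table is an assumption: only the 2-minute fall and jump examples come from the text, and the action-type labels are illustrative.

```python
# Thresholds in seconds; the 2-minute values follow the examples in the
# description, and the dictionary keys are illustrative action-type labels.
ACTION_THRESHOLDS_S = {"fall": 120, "jump": 120}

def wake_condition_met(action_type, duration_s):
    """Wake when a recognized action persists at least as long as its threshold."""
    limit = ACTION_THRESHOLDS_S.get(action_type)
    return limit is not None and duration_s >= limit
```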
Correspondingly, step 203 may be implemented by the following step not shown in Fig. 2: determining the response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
The executing subject may determine the corresponding response voice from the action information and use the determined response voice as the response voice corresponding to the wake-up condition. For example, if the user's action information is "jumping, 5 minutes", the corresponding response voice may be "Moderate exercise is good for your health".
With continued reference to Fig. 3c, Fig. 3c is a schematic diagram of another application scenario of the interaction method for a terminal according to the present embodiment. In the scenario of Fig. 3c, an elderly person living alone suddenly falls and remains on the ground for more than 2 minutes. After waking up, the smart speaker determines that the corresponding response message is "Do you need help?", thereby proactively assisting the user.
In some optional implementations of the present embodiment, step 202 may be implemented by the following step not shown in Fig. 2: in response to receiving a response voice sent by the server, determining that the wake-up condition is met.
In this implementation, the executing subject may interact with the server. When the executing subject receives a response voice sent by the server, it determines that the wake-up condition is met, and the terminal enters the wake-up state.
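This server-triggered variant amounts to treating the arrival of a pushed response as the wake-up event itself. A minimal sketch, with class and attribute names that are assumptions for illustration:

```python
class Terminal:
    """Minimal sketch of server-triggered wake-up: receiving a response
    voice from the server is itself the wake-up condition."""

    def __init__(self):
        self.awake = False
        self.pending_response = None

    def on_server_response(self, response_voice):
        # The pushed response both wakes the terminal and queues playback.
        self.awake = True
        self.pending_response = response_voice
```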
With continued reference to Fig. 4, a flow 400 of another embodiment of the interaction method for a terminal according to the present application is illustrated. As shown in Fig. 4, the interaction method for a terminal of the present embodiment may include the following steps:
Step 401: obtaining an image sequence.
Step 402: determining, according to the image sequence, whether a preset wake-up condition is met.
Step 403: in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition.
Step 404: playing the response voice.
The principles of steps 401-404 are similar to those of steps 201-204 and are not described here again.
Step 405: in response to the playing of the response voice being finished, obtaining second voice information within a preset duration.
After the response voice finishes playing, the executing subject may obtain second voice information within a preset duration. The preset duration may be set by a technician according to the actual application scenario, for example, 2 minutes.
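Step 405 can be sketched as a bounded listening window. The polling structure, the `listen_once` callable, and the parameter names are assumptions; only the 2-minute default comes from the text.

```python
import time

def collect_follow_up(listen_once, window_s=120.0, poll_s=0.5):
    """Poll for a follow-up utterance until the preset window elapses.

    listen_once is a caller-supplied callable returning an utterance
    string or None; the 120-second default mirrors the 2-minute example.
    """
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        utterance = listen_once()
        if utterance:
            return utterance
        time.sleep(poll_s)  # avoid busy-waiting between checks
    return None  # window expired with no second voice information
```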
Step 406: controlling the terminal according to the second voice information.
After obtaining the second voice information, the executing subject may control the terminal accordingly. Specifically, the executing subject may parse the second voice information to determine the keywords or mood therein, and then control the terminal according to the instruction corresponding to those keywords or that mood. For example, in the scenario shown in Fig. 3b, the smart speaker may obtain the user's reply after the response voice finishes playing. If the reply indicates that the user is willing to chat, the expression recognized by the smart speaker was correct, and the smart speaker may store the mood and the corresponding recognized face image locally. In this way, the next time the smart speaker recognizes the expression in the user's face image, it can compare the feature data of that face image with the feature data of the stored face images, so as to identify the user's expression more quickly and accurately. In the scenario shown in Fig. 3c, the smart speaker may obtain the user's reply after the response voice finishes playing. If the reply contains the keyword "make a phone call", the smart speaker may send the reply to the server for processing.
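The keyword branch of step 406 is essentially a dispatch table over the reply text. In this sketch, only the "make a phone call" keyword comes from the text; the second entry and the command names are illustrative assumptions.

```python
# Hypothetical keyword-to-command table; only "make a phone call" appears
# in the description, the other entry and command names are illustrative.
KEYWORD_COMMANDS = {
    "make a phone call": "forward_to_server",
    "chat": "start_dialogue",
}

def command_for_reply(reply, default="noop"):
    """Map the user's follow-up utterance to a terminal control command."""
    for keyword, command in KEYWORD_COMMANDS.items():
        if keyword in reply:
            return command
    return default
```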
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an interactive device for a terminal. The device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied to various electronic devices.
As shown in Fig. 5, the interactive device 500 for a terminal of the present embodiment includes: an image sequence acquiring unit 501, a condition judging unit 502, a response voice determining unit 503, and a response voice playing unit 504.
The image sequence acquiring unit 501 is configured to obtain an image sequence, where the image sequence includes at least one frame of user image.
The condition judging unit 502 is configured to determine, according to the image sequence, whether a preset wake-up condition is met.
The response voice determining unit 503 is configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition.
The response voice playing unit 504 is configured to play the response voice.
In some optional implementations of the present embodiment, the image sequence includes eye images of the user. The condition judging unit 502 may be further configured to: analyze the eye images to determine whether the user is gazing at the terminal; and, in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: in response to determining that the wake-up condition is met, obtain first voice information within a preset duration; perform semantic parsing on the first voice information; and determine, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the image sequence includes face images of the user. The condition judging unit 502 may be further configured to: perform expression recognition on the face images in the image sequence; determine, according to an expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and, in response to determining that the expressions corresponding to the two adjacent frames differ, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: in response to determining that the wake-up condition is met, determine the response voice corresponding to the expression of the later of the two adjacent frames; and use the determined response voice as the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the image sequence includes body images of the user. The condition judging unit 502 may be further configured to: analyze the body images in the image sequence to determine action information of the user; and, in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: determine the response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the condition judging unit 502 may be further configured to: in response to receiving a response voice sent by the server, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the device 500 may further include a voice information acquiring unit and a terminal control unit, not shown in Fig. 5.
The voice information acquiring unit is configured to obtain, in response to the playing of the response voice being finished, second voice information within a preset duration.
The terminal control unit is configured to control the terminal according to the second voice information.
It should be appreciated that the units 501 to 504 recorded in the interactive device 500 for a terminal correspond respectively to the steps of the method described with reference to Fig. 2. Accordingly, the operations and features described above for the interaction method for a terminal are equally applicable to the device 500 and the units included therein, and are not described here again.
Referring now to Fig. 6, it illustrates a schematic structural diagram of a terminal 600 (such as the smart speaker 101 or the terminal devices 102 and 103 shown in Fig. 1) suitable for implementing embodiments of the present disclosure. The terminal shown in Fig. 6 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 6, the terminal 600 may include a processing device (such as a central processing unit or a graphics processor) 601, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 608 including, for example, a magnetic tape and a hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. Although Fig. 6 shows the electronic device 600 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in Fig. 6 may represent one device or multiple devices, as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above-described functions defined in the methods of the embodiments of the present disclosure are performed. It should be noted that the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the embodiments of the present disclosure, the computer-readable storage medium may be any tangible medium containing or storing a program, where the program may be used by or in combination with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained in the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: a wire, an optical cable, RF (radio frequency), or any appropriate combination of the above.
The above computer-readable medium may be included in the above terminal, or may exist alone without being assembled into the electronic device. The above computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device is caused to: obtain an image sequence, where the image sequence includes at least one frame of user image; determine, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determine a response voice corresponding to the wake-up condition; and play the response voice.
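The four-step flow the programs carry out can be sketched as one function. The three callables stand in for the recognition, lookup, and audio components described above; their names and signatures are assumptions for illustration only.

```python
def interaction_step(image_sequence, wake_met, pick_response, play):
    """One pass of the claimed flow: test the wake-up condition on the
    image sequence, determine the matching response voice, and play it.

    wake_met, pick_response, and play are caller-supplied stand-ins for
    the recognition, lookup, and playback components.
    """
    if not wake_met(image_sequence):
        return None  # wake-up condition not met; stay idle
    response = pick_response(image_sequence)
    play(response)
    return response
```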
The computer program code for executing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, and the module, program segment, or portion of code contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an image sequence acquiring unit, a condition judging unit, a response voice determining unit, and a response voice playing unit. The names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the response voice playing unit may also be described as "a unit for playing the response voice".
The above description is merely a preferred embodiment of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.
Claims (20)
1. An interaction method for a terminal, comprising:
obtaining an image sequence, wherein the image sequence includes at least one frame of user image;
determining, according to the image sequence, whether a preset wake-up condition is met;
in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and
playing the response voice.
2. The method according to claim 1, wherein the image sequence includes eye images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
analyzing the eye images to determine whether the user is gazing at the terminal; and
in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
3. The method according to claim 2, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
in response to determining that the wake-up condition is met, obtaining first voice information within a preset duration;
performing semantic parsing on the first voice information; and
determining, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
4. The method according to claim 1, wherein the image sequence includes face images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
performing expression recognition on the face images in the image sequence;
determining, according to an expression recognition result, whether expressions corresponding to two adjacent frames in the image sequence are identical; and
in response to determining that the expressions corresponding to the two adjacent frames in the image sequence differ, determining that the wake-up condition is met.
5. The method according to claim 4, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
in response to determining that the wake-up condition is met, determining a response voice corresponding to the expression of the later of the two adjacent frames; and
using the determined response voice as the response voice corresponding to the wake-up condition.
6. The method according to claim 1, wherein the image sequence includes body images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
analyzing the body images in the image sequence to determine action information of the user; and
in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
7. The method according to claim 6, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
determining a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
8. The method according to claim 1, wherein the determining whether a preset wake-up condition is met comprises:
in response to receiving a response voice sent by a server, determining that the wake-up condition is met.
9. The method according to any one of claims 1-8, further comprising:
in response to the playing of the response voice being finished, obtaining second voice information within a preset duration; and
controlling the terminal according to the second voice information.
10. An interactive device for a terminal, comprising:
an image sequence acquiring unit, configured to obtain an image sequence, wherein the image sequence includes at least one frame of user image;
a condition judging unit, configured to determine, according to the image sequence, whether a preset wake-up condition is met;
a response voice determining unit, configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition; and
a response voice playing unit, configured to play the response voice.
11. The device according to claim 10, wherein the image sequence includes eye images of the user; and
the condition judging unit is further configured to:
analyze the eye images to determine whether the user is gazing at the terminal; and
in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
12. The device according to claim 11, wherein the response voice determining unit is further configured to:
in response to determining that the wake-up condition is met, obtain first voice information within a preset duration;
perform semantic parsing on the first voice information; and
determine, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
13. The device according to claim 10, wherein the image sequence includes face images of the user; and
the condition judging unit is further configured to:
perform expression recognition on the face images in the image sequence;
determine, according to an expression recognition result, whether expressions corresponding to two adjacent frames in the image sequence are identical; and
in response to determining that the expressions corresponding to the two adjacent frames in the image sequence differ, determine that the wake-up condition is met.
14. The device according to claim 13, wherein the response voice determining unit is further configured to:
in response to determining that the wake-up condition is met, determine a response voice corresponding to the expression of the later of the two adjacent frames; and
use the determined response voice as the response voice corresponding to the wake-up condition.
15. The device according to claim 10, wherein the image sequence includes body images of the user; and
the condition judging unit is further configured to:
analyze the body images in the image sequence to determine action information of the user; and
in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
16. The device according to claim 15, wherein the response voice determining unit is further configured to:
determine a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
17. The device according to claim 10, wherein the condition judging unit is further configured to:
in response to receiving a response voice sent by a server, determine that the wake-up condition is met.
18. The device according to any one of claims 10-17, further comprising:
a voice information acquiring unit, configured to obtain, in response to the playing of the response voice being finished, second voice information within a preset duration; and
a terminal control unit, configured to control the terminal according to the second voice information.
19. A terminal, comprising:
one or more processors; and
a storage device, storing one or more programs thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable medium, storing a computer program thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910509292.4A CN110196900A (en) | 2019-06-13 | 2019-06-13 | Exchange method and device for terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110196900A true CN110196900A (en) | 2019-09-03 |
Family
ID=67754421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910509292.4A Pending CN110196900A (en) | 2019-06-13 | 2019-06-13 | Exchange method and device for terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110196900A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154140A (en) * | 2018-01-22 | 2018-06-12 | 北京百度网讯科技有限公司 | Voice awakening method, device, equipment and computer-readable medium based on lip reading |
CN108733420A (en) * | 2018-03-21 | 2018-11-02 | 北京猎户星空科技有限公司 | Awakening method, device, smart machine and the storage medium of smart machine |
CN109346076A (en) * | 2018-10-25 | 2019-02-15 | 三星电子(中国)研发中心 | Interactive voice, method of speech processing, device and system |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110910887A (en) * | 2019-12-30 | 2020-03-24 | 苏州思必驰信息科技有限公司 | Voice wake-up method and device |
CN113626778A (en) * | 2020-05-08 | 2021-11-09 | 百度在线网络技术(北京)有限公司 | Method, apparatus, electronic device, and computer storage medium for waking up device |
CN113626778B (en) * | 2020-05-08 | 2024-04-02 | 百度在线网络技术(北京)有限公司 | Method, apparatus, electronic device and computer storage medium for waking up device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11158102B2 (en) | Method and apparatus for processing information | |
US11271765B2 (en) | Device and method for adaptively providing meeting | |
CN105320726B (en) | Reduce the demand to manual beginning/end point and triggering phrase | |
US20170277993A1 (en) | Virtual assistant escalation | |
JP2023553101A (en) | Live streaming interaction methods, apparatus, devices and media | |
CN107704169B (en) | Virtual human state management method and system | |
EP3611724A1 (en) | Voice response method and device, and smart device | |
CN110267113B (en) | Video file processing method, system, medium, and electronic device | |
CN109887505A (en) | Method and apparatus for wake-up device | |
US11803579B2 (en) | Apparatus, systems and methods for providing conversational assistance | |
CN108763475B (en) | Recording method, recording device and terminal equipment | |
CN110196900A (en) | Exchange method and device for terminal | |
CN109949793A (en) | Method and apparatus for output information | |
CN111312243B (en) | Equipment interaction method and device | |
CN110288683B (en) | Method and device for generating information | |
CN112309387A (en) | Method and apparatus for processing information | |
CN116437155A (en) | Live broadcast interaction method and device, computer equipment and storage medium | |
US20240171418A1 (en) | Information processing device and information processing method | |
CN110459239A (en) | Role analysis method, apparatus and computer readable storage medium based on voice data | |
CN212588503U (en) | Embedded audio playing device | |
CN113312928A (en) | Text translation method and device, electronic equipment and storage medium | |
KR20190030549A (en) | Method, system and non-transitory computer-readable recording medium for controlling flow of advertising contents based on video chat | |
CN114065056A (en) | Learning scheme recommendation method, server and system | |
CN109348353B (en) | Service processing method and device of intelligent sound box and intelligent sound box | |
CN110188712B (en) | Method and apparatus for processing image |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190903 |