CN110196900A - Interaction method and device for a terminal - Google Patents
Interaction method and device for a terminal
- Publication number
- CN110196900A (application number CN201910509292.4A)
- Authority
- CN
- China
- Prior art keywords
- wake
- condition
- response
- image sequence
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3343—Query execution using phonetics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
Abstract
The embodiments of the present application disclose an interaction method and device for a terminal. One specific embodiment of the method includes: acquiring an image sequence, where the image sequence includes at least one frame of a user image; determining, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and playing the response voice. This embodiment can wake up the terminal according to user images, adding a new interaction mode.
Description
Technical field
The present application relates to the field of computer technology, and in particular to an interaction method and device for a terminal.
Background
With the continuous progress of science and technology, intelligent terminals have gradually entered people's field of vision. Such intelligent terminals include smart speakers, smart televisions, smart cameras, intelligent robots, and the like. What sets these intelligent terminals apart is that they can not only play audio and video but also carry out voice interaction with users. However, the interaction modes of existing intelligent terminals are relatively simple, and the user experience is poor.
Summary of the invention
The embodiments of the present application propose an interaction method and device for a terminal.
In a first aspect, an embodiment of the present application provides an interaction method for a terminal, comprising: acquiring an image sequence, where the image sequence includes at least one frame of a user image; determining, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and playing the response voice.
In some embodiments, the image sequence includes an eye image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: analyzing the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: in response to determining that the wake-up condition is met, acquiring first voice information within a preset duration; performing semantic parsing on the first voice information; and determining, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a facial image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: performing expression recognition on the facial images in the image sequence; determining, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames in the image sequence differ, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: in response to determining that the wake-up condition is met, determining a response voice corresponding to the expression of the later frame of the two adjacent frames; and using the determined response voice as the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a body image of the user; and determining, according to the image sequence, whether the preset wake-up condition is met comprises: analyzing the body images in the image sequence to determine action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
In some embodiments, determining, in response to determining that the wake-up condition is met, the response voice corresponding to the wake-up condition comprises: determining a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some embodiments, determining whether the preset wake-up condition is met comprises: in response to receiving a response voice sent by a server side, determining that the wake-up condition is met.
In some embodiments, the method further comprises: in response to the response voice having finished playing, acquiring second voice information within a preset duration; and controlling the terminal according to the second voice information.
In a second aspect, an embodiment of the present application provides an interaction device for a terminal, comprising: an image sequence acquisition unit configured to acquire an image sequence, where the image sequence includes at least one frame of a user image; a condition judgment unit configured to determine, according to the image sequence, whether a preset wake-up condition is met; a response voice determination unit configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition; and a response voice playing unit configured to play the response voice.
In some embodiments, the image sequence includes an eye image of the user; and the condition judgment unit is further configured to: analyze the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: in response to determining that the wake-up condition is met, acquire first voice information within a preset duration; perform semantic parsing on the first voice information; and determine, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a facial image of the user; and the condition judgment unit is further configured to: perform expression recognition on the facial images in the image sequence; determine, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames differ, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: in response to determining that the wake-up condition is met, determine a response voice corresponding to the expression of the later frame of the two adjacent frames; and use the determined response voice as the response voice corresponding to the wake-up condition.
In some embodiments, the image sequence includes a body image of the user; and the condition judgment unit is further configured to: analyze the body images in the image sequence to determine the action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
In some embodiments, the response voice determination unit is further configured to: determine a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some embodiments, the condition judgment unit is further configured to: in response to receiving a response voice sent by a server side, determine that the wake-up condition is met.
In some embodiments, the device further comprises: a voice information acquisition unit configured to acquire, in response to the response voice having finished playing, second voice information within a preset duration; and a terminal control unit configured to control the terminal according to the second voice information.
In a third aspect, an embodiment of the present application provides a terminal, comprising: one or more processors; and a storage device on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method described in any embodiment of the first aspect.
The interaction method and device for a terminal provided by the above embodiments of the present application can acquire an image sequence that includes at least one frame of a user image. Then, according to the image sequence, they determine whether a preset wake-up condition is met. After determining that the wake-up condition is met, a response voice corresponding to the wake-up condition is determined. Finally, the response voice is played. The method of these embodiments can wake up the terminal according to user images, adding a new interaction mode.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the interaction method for a terminal according to the present application;
Fig. 3a is a schematic diagram of one application scenario of the interaction method for a terminal according to the present application;
Fig. 3b is a schematic diagram of another application scenario of the interaction method for a terminal according to the present application;
Fig. 3c is a schematic diagram of still another application scenario of the interaction method for a terminal according to the present application;
Fig. 4 is a flowchart of another embodiment of the interaction method for a terminal according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the interaction device for a terminal according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the terminal of the embodiments of the present application.
Specific embodiment
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It may be understood that the specific embodiments described here are used only to explain the related invention, rather than to limit that invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the accompanying drawings.
It should be noted that, in the absence of conflict, the embodiments in the present application and the features in those embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the interaction method for a terminal or the interaction device for a terminal of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include a smart speaker 101, terminal devices 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the smart speaker 101, the terminal devices 102 and 103, and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
An image acquisition device may be installed on the smart speaker 101 to collect user images. The smart speaker may then interact with the server 105 through the network 104 to receive or send messages, for example sending collected images to the server 105 or receiving information sent by the server 105.
Users may use the terminal devices 102 and 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 102 and 103, such as image processing applications, audio and video playing applications, web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 102 and 103 may be hardware or software. When the terminal devices 102 and 103 are hardware, they may be various electronic devices equipped with a microphone and an image acquisition device, including but not limited to smart phones, tablet computers, smart televisions, laptop portable computers, and desktop computers. When the terminal devices 102 and 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example a background server that processes the images collected by the smart speaker 101 and the terminal devices 102 and 103. The background server may analyze and otherwise process the received data, such as images, and feed the processing results (for example, a response voice) back to the smart speaker 101 and the terminal devices 102 and 103.
It should be noted that the server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
It should be noted that the interaction method for a terminal provided by the embodiments of the present application is generally executed by a terminal (for example, the smart speaker 101 or the terminal devices 102 and 103). Correspondingly, the interaction device for a terminal is generally arranged in the terminal. In some implementations, some steps of the interaction method for a terminal may also be executed by a server side (for example, the server 105). Correspondingly, some units of the interaction device for a terminal may also be arranged on the server side.
It should be understood that the numbers of smart speakers, terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of smart speakers, terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the interaction method for a terminal according to the present application is shown. The interaction method for a terminal of this embodiment comprises the following steps:
Step 201: an image sequence is acquired.
In this embodiment, the executing body of the interaction method for a terminal (for example, the smart speaker 101 or the terminal devices 102 and 103 shown in Fig. 1) may acquire the image sequence through a wired or wireless connection, for example through an image acquisition device connected to the executing body. The image sequence may include at least one frame of a user image. Here, a user image refers to an image containing the user's face, limbs, or eyes. The frames in the image sequence may be multiple images continuously captured by the image acquisition device at preset time intervals.
It should be pointed out that the wireless connection may include but is not limited to 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, ZigBee connections, UWB (ultra-wideband) connections, and other currently known or future-developed wireless connection types.
Step 202: according to the image sequence, determine whether a preset wake-up condition is met.
After acquiring the image sequence, the executing body may judge whether the preset wake-up condition is met. Specifically, the executing body may first apply a series of processing operations to the image sequence and, according to the processing results, judge whether the wake-up condition is met. The processing may include but is not limited to feature extraction, expression recognition, action recognition, and the like. Here, the preset wake-up condition may be a pre-set condition under which the terminal needs to respond in a timely manner. For example, the wake-up condition may be that an image contains a flame, or that an image contains a stranger. It may be understood that there may be multiple wake-up conditions. Using such wake-up conditions, the executing body can be woken up automatically or passively.
Step 203: in response to determining that the wake-up condition is met, determine a response voice corresponding to the wake-up condition.
After determining that the wake-up condition is met, the executing body may determine the response voice corresponding to the wake-up condition. In this embodiment, each wake-up condition may correspond to at least one response voice, with different response voices corresponding to different scenarios. It may be understood that the response voice is intended to provide the user with relevant information, or to actively keep the user company (for example, by actively interacting with the user).
Step 204: play the response voice.
After determining the response voice, the executing body may play it, thereby realizing the interaction with the user.
The interaction method for a terminal provided by the above embodiment of the present application can acquire an image sequence that includes at least one frame of a user image. Then, according to the image sequence, it determines whether a preset wake-up condition is met. After determining that the wake-up condition is met, a response voice corresponding to the wake-up condition is determined. Finally, the response voice is played. The method of this embodiment can wake up the terminal according to user images, adding a new interaction mode.
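The four-step flow of Fig. 2 (steps 201-204) can be sketched as a simple loop. This is an illustrative sketch only, not the patent's implementation: every helper here (`capture_frames`, `wake_condition_met`, `choose_response`, `play_voice`) is a hypothetical stand-in for a component the description leaves abstract.

```python
from typing import Callable, List, Optional

def interaction_loop(
    capture_frames: Callable[[], List[bytes]],          # step 201: acquire image sequence
    wake_condition_met: Callable[[List[bytes]], bool],  # step 202: check wake-up condition
    choose_response: Callable[[List[bytes]], str],      # step 203: pick response voice
    play_voice: Callable[[str], None],                  # step 204: play it
) -> Optional[str]:
    frames = capture_frames()
    if not frames:  # the sequence must hold at least one user image
        return None
    if wake_condition_met(frames):
        response = choose_response(frames)
        play_voice(response)
        return response
    return None

# Toy run with stub components: "wake" whenever any frame is non-empty.
played: List[str] = []
result = interaction_loop(
    capture_frames=lambda: [b"frame0", b"frame1"],
    wake_condition_met=lambda frames: any(frames),
    choose_response=lambda frames: "How can I help?",
    play_voice=played.append,
)
```

The stubs exist only to make the control flow concrete; in the patent, steps 202 and 203 are where the gaze, expression, and action variants described below differ.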
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include an eye image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: analyzing the eye image to determine whether the user is gazing at the terminal; and in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
In this implementation, the image sequence may be acquired in real time and include an eye image of the user. The executing body may analyze the eye image to determine whether the user is gazing at the terminal. If it is determined that the user is currently gazing at the terminal, it is concluded that the user wants to interact with the terminal, and the wake-up condition is determined to be met.
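The description does not fix how the eye image is analyzed. The sketch below assumes, purely for illustration, that an upstream detector supplies a normalized pupil offset per frame, and treats a small offset held over several consecutive frames as gazing at the terminal; the threshold values are invented.

```python
from typing import Iterable, Tuple

def is_gazing(
    pupil_offsets: Iterable[Tuple[float, float]],
    max_offset: float = 0.15,   # assumed: pupil within 15% of eye-box centre
    min_frames: int = 3,        # assumed: sustained over 3 consecutive frames
) -> bool:
    """pupil_offsets: per-frame (dx, dy) of the pupil centre from the eye
    centre, normalized to the eye bounding box. Returns True if the offset
    stays small for at least min_frames consecutive frames."""
    run = 0
    for dx, dy in pupil_offsets:
        if (dx * dx + dy * dy) ** 0.5 <= max_offset:
            run += 1
            if run >= min_frames:
                return True
        else:
            run = 0  # glance broken; restart the count
    return False

def gaze_wake_met(pupil_offsets) -> bool:
    # Gaze at the terminal => the wake-up condition is satisfied.
    return is_gazing(pupil_offsets)
```

Requiring several consecutive frames rather than a single one reflects the claim's use of an image sequence: a momentary glance should not wake the terminal.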
Correspondingly, in this implementation, step 203 may be realized through the following steps, not shown in Fig. 2: in response to determining that the wake-up condition is met, acquiring first voice information within a preset duration; performing semantic parsing on the first voice information; and determining, according to the semantic parsing result, the response voice corresponding to the wake-up condition.
In this implementation, after determining that the wake-up condition is met, the executing body may monitor the sound in the environment and acquire the first voice information within a preset duration. The executing body may perform semantic parsing on the first voice information and, according to the parsing result, determine the response voice corresponding to the wake-up condition. For example, after being woken up, the terminal may listen to the user's voice. When the user is heard saying "tell me about tomorrow's weather", the terminal may use tomorrow's weather information as the content of the response voice. Alternatively, the user may set a keyword for the terminal in advance, which the terminal can use as its name. For example, if the terminal name set by the user is "Xiao Ming", then after the terminal is woken up and the user is heard saying "Xiao Ming, sing a song", the terminal may play a song.
In some concrete implementations, after acquiring the first voice information, the terminal may send it to the server side, which performs semantic parsing on the first voice information and, according to the parsing result, determines the response voice corresponding to the wake-up condition. After determining the response voice, the server side may send it to the terminal.
In other implementations, after acquiring the first voice information, the terminal may directly perform semantic parsing on it and, according to the parsing result, determine the response voice corresponding to the wake-up condition. This can reduce the processing pressure on the server side.
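The on-device variant (semantic parsing performed by the terminal itself) could, at its crudest, be a keyword-to-intent table. The rule table, intent names, and canned responses below are invented for illustration; a real terminal would use a proper NLU model, or delegate to the server side as the preceding paragraphs describe.

```python
# Hypothetical rule table: (intent, trigger keywords), checked in order.
RULES = [
    ("weather", ["weather", "rain", "sunny"]),
    ("music",   ["sing", "song", "play"]),
]

def parse_intent(text: str) -> str:
    """Crude semantic parse: map transcribed first voice information to an
    intent via keyword spotting, falling back to generic chitchat."""
    words = text.lower()
    for intent, keywords in RULES:
        if any(k in words for k in keywords):
            return intent
    return "chitchat"

def response_for(text: str) -> str:
    """Map the parsed intent to a response voice (here just its text)."""
    responses = {
        "weather":  "Tomorrow will be fine, 25-30 degrees.",
        "music":    "Playing a song for you.",
        "chitchat": "I'm here if you need anything.",
    }
    return responses[parse_intent(text)]
```

For example, `response_for("tell me about tomorrow's weather")` selects the weather response, matching the "tell me about tomorrow's weather" example in the description.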
Referring to Fig. 3a, Fig. 3a is a schematic diagram of an application scenario of the interaction method for a terminal according to this embodiment. In the scenario of Fig. 3a, user A is chatting with user B. User A glances at the smart speaker and says "we are going on a spring outing tomorrow", and user B agrees. User A then says "I just don't know whether tomorrow's weather is suitable". After collecting the above voice information, the smart speaker performs semantic parsing on it and, according to the parsing result, determines that the content of the response voice is "tomorrow's weather is fine, with a temperature of 25-30°C, exactly good weather for a spring outing". The smart speaker plays this response voice to the users. In this way, the smart speaker can actively join the conversation and provide relevant information.
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include a facial image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: performing expression recognition on the facial images in the image sequence; determining, according to the expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and in response to determining that the expressions corresponding to two adjacent frames differ, determining that the wake-up condition is met.
In this implementation, the image sequence may include facial images of the user. The executing body may perform expression recognition on the facial images and then, according to the recognition result, determine whether the expressions corresponding to any two adjacent frames in the image sequence are identical. If the expressions corresponding to two adjacent frames differ, it is determined that the user's expression has changed and that the wake-up condition is met.
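The adjacent-frame expression comparison can be sketched as follows. The expression recognizer is passed in as a function, since the patent does not fix a model, and the response table is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

def expression_changed(
    frames: List,
    recognize: Callable[[object], str],
) -> Optional[Tuple[str, str]]:
    """Run expression recognition on each frame, then compare adjacent
    frames. Returns (old_expression, new_expression) for the first adjacent
    pair whose expressions differ, or None if no change is found."""
    labels = [recognize(f) for f in frames]
    for prev, curr in zip(labels, labels[1:]):
        if prev != curr:
            return prev, curr  # curr is the expression of the later frame
    return None

# Hypothetical expression-to-response table (step 203: the later frame's
# expression selects the response voice).
RESPONSES = {"pain": "Do you need any help?", "crying": "Would you like to chat?"}

def respond_to_change(frames, recognize) -> Optional[str]:
    change = expression_changed(frames, recognize)
    if change is None:
        return None  # wake-up condition not met; stay silent
    return RESPONSES.get(change[1], "Is everything all right?")
```

A stub recognizer (e.g. a dict lookup) is enough to exercise the logic: a sequence whose labels go neutral, neutral, pain produces the pain response.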
Correspondingly, in this implementation, step 203 may be realized through the following steps, not shown in Fig. 2: in response to determining that the wake-up condition is met, determining the response voice corresponding to the expression of the later frame of the two adjacent frames; and using the determined response voice as the response voice corresponding to the wake-up condition.
In this implementation, after determining that the wake-up condition is met, the executing body may take the later frame of the two adjacent frames as the target image, and use the response voice corresponding to the expression of the target image as the response voice corresponding to the wake-up condition. For example, if the user's expression suddenly becomes pained, the executing body, after detecting this expression change, may determine that the response voice corresponding to the pained expression is "do you need any help?".
Referring to Fig. 3b, Fig. 3b is a schematic diagram of another application scenario of the interaction method for a terminal according to this embodiment. In the scenario of Fig. 3b, user A is home alone and suddenly starts crying. After recognizing the user's expression, the smart speaker determines that the response voice corresponding to crying is "Master, are you in a bad mood today? Would you like to chat with me?". In this way, the smart speaker can initiate a conversation according to the user's expression, improving the user experience.
In some optional implementations of the embodiment shown in Fig. 2, the image sequence may include a body image of the user. Step 202 may then be realized through the following steps, not shown in Fig. 2: analyzing the body images in the image sequence to determine the action information of the user; and in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
In this implementation, the image sequence may include body images of the user. The executing subject may analyze the body images to determine the user's action information. Specifically, the executing subject may determine the action information from the position of the user's body in each image of the sequence. The action information may include an action type and a duration, and the action type may include, for example, jumping, falling, or squatting. The executing subject may determine whether the user's action information satisfies a preset condition and, if so, determine that the wake-up condition is met. The preset condition may be, for example, that a fall has lasted for 2 minutes, or that jumping has lasted for 2 minutes.
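The duration-based check on the action information can be sketched as follows. The threshold table is an assumption: only the 2-minute fall and jump examples come from the text, and the action-type labels are illustrative.

```python
# Thresholds in seconds; the 2-minute values follow the examples in the
# description, and the dictionary keys are illustrative action-type labels.
ACTION_THRESHOLDS_S = {"fall": 120, "jump": 120}

def wake_condition_met(action_type, duration_s):
    """Wake when a recognized action persists at least as long as its threshold."""
    limit = ACTION_THRESHOLDS_S.get(action_type)
    return limit is not None and duration_s >= limit
```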
Correspondingly, step 203 may be implemented by the following step not shown in Fig. 2: determining the response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
The executing subject may determine the corresponding response voice from the action information and use the determined response voice as the response voice corresponding to the wake-up condition. For example, if the user's action information is "jumping, 5 minutes", the corresponding response voice may be "Moderate exercise is good for your health".
With continued reference to Fig. 3c, Fig. 3c is a schematic diagram of another application scenario of the interaction method for a terminal according to the present embodiment. In the scenario of Fig. 3c, an elderly person living alone suddenly falls and remains on the ground for more than 2 minutes. After waking up, the smart speaker determines that the corresponding response message is "Do you need help?", thereby proactively assisting the user.
In some optional implementations of the present embodiment, step 202 may be implemented by the following step not shown in Fig. 2: in response to receiving a response voice sent by the server, determining that the wake-up condition is met.
In this implementation, the executing subject may interact with the server. When the executing subject receives a response voice sent by the server, it determines that the wake-up condition is met, and the terminal enters the wake-up state.
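This server-triggered variant amounts to treating the arrival of a pushed response as the wake-up event itself. A minimal sketch, with class and attribute names that are assumptions for illustration:

```python
class Terminal:
    """Minimal sketch of server-triggered wake-up: receiving a response
    voice from the server is itself the wake-up condition."""

    def __init__(self):
        self.awake = False
        self.pending_response = None

    def on_server_response(self, response_voice):
        # The pushed response both wakes the terminal and queues playback.
        self.awake = True
        self.pending_response = response_voice
```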
With continued reference to Fig. 4, a flow 400 of another embodiment of the interaction method for a terminal according to the present application is illustrated. As shown in Fig. 4, the interaction method for a terminal of the present embodiment may include the following steps:
Step 401: obtaining an image sequence.
Step 402: determining, according to the image sequence, whether a preset wake-up condition is met.
Step 403: in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition.
Step 404: playing the response voice.
The principles of steps 401-404 are similar to those of steps 201-204 and are not described here again.
Step 405: in response to the playing of the response voice being finished, obtaining second voice information within a preset duration.
After the response voice finishes playing, the executing subject may obtain second voice information within a preset duration. The preset duration may be set by a technician according to the actual application scenario, for example, 2 minutes.
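Step 405 can be sketched as a bounded listening window. The polling structure, the `listen_once` callable, and the parameter names are assumptions; only the 2-minute default comes from the text.

```python
import time

def collect_follow_up(listen_once, window_s=120.0, poll_s=0.5):
    """Poll for a follow-up utterance until the preset window elapses.

    listen_once is a caller-supplied callable returning an utterance
    string or None; the 120-second default mirrors the 2-minute example.
    """
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        utterance = listen_once()
        if utterance:
            return utterance
        time.sleep(poll_s)  # avoid busy-waiting between checks
    return None  # window expired with no second voice information
```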
Step 406: controlling the terminal according to the second voice information.
After obtaining the second voice information, the executing subject may control the terminal accordingly. Specifically, the executing subject may parse the second voice information to determine the keywords or mood therein, and then control the terminal according to the instruction corresponding to those keywords or that mood. For example, in the scenario shown in Fig. 3b, the smart speaker may obtain the user's reply after the response voice finishes playing. If the reply indicates that the user is willing to chat, the expression recognized by the smart speaker was correct, and the smart speaker may store the mood and the corresponding recognized face image locally. In this way, the next time the smart speaker recognizes the expression in the user's face image, it can compare the feature data of that face image with the feature data of the stored face images, so as to identify the user's expression more quickly and accurately. In the scenario shown in Fig. 3c, the smart speaker may obtain the user's reply after the response voice finishes playing. If the reply contains the keyword "make a phone call", the smart speaker may send the reply to the server for processing.
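The keyword branch of step 406 is essentially a dispatch table over the reply text. In this sketch, only the "make a phone call" keyword comes from the text; the second entry and the command names are illustrative assumptions.

```python
# Hypothetical keyword-to-command table; only "make a phone call" appears
# in the description, the other entry and command names are illustrative.
KEYWORD_COMMANDS = {
    "make a phone call": "forward_to_server",
    "chat": "start_dialogue",
}

def command_for_reply(reply, default="noop"):
    """Map the user's follow-up utterance to a terminal control command."""
    for keyword, command in KEYWORD_COMMANDS.items():
        if keyword in reply:
            return command
    return default
```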
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an interactive device for a terminal. The device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied to various electronic devices.
As shown in Fig. 5, the interactive device 500 for a terminal of the present embodiment includes: an image sequence acquiring unit 501, a condition judging unit 502, a response voice determining unit 503, and a response voice playing unit 504.
The image sequence acquiring unit 501 is configured to obtain an image sequence, where the image sequence includes at least one frame of user image.
The condition judging unit 502 is configured to determine, according to the image sequence, whether a preset wake-up condition is met.
The response voice determining unit 503 is configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition.
The response voice playing unit 504 is configured to play the response voice.
In some optional implementations of the present embodiment, the image sequence includes eye images of the user. The condition judging unit 502 may be further configured to: analyze the eye images to determine whether the user is gazing at the terminal; and, in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: in response to determining that the wake-up condition is met, obtain first voice information within a preset duration; perform semantic parsing on the first voice information; and determine, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the image sequence includes face images of the user. The condition judging unit 502 may be further configured to: perform expression recognition on the face images in the image sequence; determine, according to an expression recognition result, whether the expressions corresponding to two adjacent frames in the image sequence are identical; and, in response to determining that the expressions corresponding to the two adjacent frames differ, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: in response to determining that the wake-up condition is met, determine the response voice corresponding to the expression of the later of the two adjacent frames; and use the determined response voice as the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the image sequence includes body images of the user. The condition judging unit 502 may be further configured to: analyze the body images in the image sequence to determine action information of the user; and, in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the response voice determining unit 503 may be further configured to: determine the response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
In some optional implementations of the present embodiment, the condition judging unit 502 may be further configured to: in response to receiving a response voice sent by the server, determine that the wake-up condition is met.
In some optional implementations of the present embodiment, the device 500 may further include a voice information acquiring unit and a terminal control unit, not shown in Fig. 5.
The voice information acquiring unit is configured to obtain, in response to the playing of the response voice being finished, second voice information within a preset duration.
The terminal control unit is configured to control the terminal according to the second voice information.
It should be appreciated that the units 501 to 504 recorded in the interactive device 500 for a terminal correspond respectively to the steps of the method described with reference to Fig. 2. Accordingly, the operations and features described above for the interaction method for a terminal are equally applicable to the device 500 and the units included therein, and are not described here again.
Referring now to Fig. 6, it illustrates a schematic structural diagram of a terminal 600 (such as the smart speaker 101 or the terminal devices 102 and 103 shown in Fig. 1) suitable for implementing embodiments of the present disclosure. The terminal shown in Fig. 6 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 6, the terminal 600 may include a processing device (such as a central processing unit or a graphics processor) 601, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 608 including, for example, a magnetic tape and a hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. Although Fig. 6 shows the electronic device 600 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in Fig. 6 may represent one device or multiple devices, as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above-described functions defined in the methods of the embodiments of the present disclosure are performed. It should be noted that the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the embodiments of the present disclosure, the computer-readable storage medium may be any tangible medium containing or storing a program, where the program may be used by or in combination with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained in the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: a wire, an optical cable, RF (radio frequency), or any appropriate combination of the above.
The above computer-readable medium may be included in the above terminal, or may exist alone without being assembled into the electronic device. The above computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device is caused to: obtain an image sequence, where the image sequence includes at least one frame of user image; determine, according to the image sequence, whether a preset wake-up condition is met; in response to determining that the wake-up condition is met, determine a response voice corresponding to the wake-up condition; and play the response voice.
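The four-step flow the programs carry out can be sketched as one function. The three callables stand in for the recognition, lookup, and audio components described above; their names and signatures are assumptions for illustration only.

```python
def interaction_step(image_sequence, wake_met, pick_response, play):
    """One pass of the claimed flow: test the wake-up condition on the
    image sequence, determine the matching response voice, and play it.

    wake_met, pick_response, and play are caller-supplied stand-ins for
    the recognition, lookup, and playback components.
    """
    if not wake_met(image_sequence):
        return None  # wake-up condition not met; stay idle
    response = pick_response(image_sequence)
    play(response)
    return response
```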
The computer program code for executing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, and the module, program segment, or portion of code contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an image sequence acquiring unit, a condition judging unit, a response voice determining unit, and a response voice playing unit. The names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the response voice playing unit may also be described as "a unit for playing the response voice".
The above description is merely a preferred embodiment of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.
Claims (20)
1. An interaction method for a terminal, comprising:
obtaining an image sequence, wherein the image sequence includes at least one frame of user image;
determining, according to the image sequence, whether a preset wake-up condition is met;
in response to determining that the wake-up condition is met, determining a response voice corresponding to the wake-up condition; and
playing the response voice.
2. The method according to claim 1, wherein the image sequence includes eye images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
analyzing the eye images to determine whether the user is gazing at the terminal; and
in response to determining that the user is gazing at the terminal, determining that the wake-up condition is met.
3. The method according to claim 2, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
in response to determining that the wake-up condition is met, obtaining first voice information within a preset duration;
performing semantic parsing on the first voice information; and
determining, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
4. The method according to claim 1, wherein the image sequence includes face images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
performing expression recognition on the face images in the image sequence;
determining, according to an expression recognition result, whether expressions corresponding to two adjacent frames in the image sequence are identical; and
in response to determining that the expressions corresponding to the two adjacent frames in the image sequence differ, determining that the wake-up condition is met.
5. The method according to claim 4, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
in response to determining that the wake-up condition is met, determining a response voice corresponding to the expression of the later of the two adjacent frames; and
using the determined response voice as the response voice corresponding to the wake-up condition.
6. The method according to claim 1, wherein the image sequence includes body images of the user; and
the determining, according to the image sequence, whether a preset wake-up condition is met comprises:
analyzing the body images in the image sequence to determine action information of the user; and
in response to determining that the action information of the user satisfies a preset condition, determining that the wake-up condition is met.
7. The method according to claim 6, wherein the determining a response voice corresponding to the wake-up condition in response to determining that the wake-up condition is met comprises:
determining a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
8. The method according to claim 1, wherein the determining whether a preset wake-up condition is met comprises:
in response to receiving a response voice sent by a server, determining that the wake-up condition is met.
9. The method according to any one of claims 1-8, further comprising:
in response to the playing of the response voice being finished, obtaining second voice information within a preset duration; and
controlling the terminal according to the second voice information.
10. An interactive device for a terminal, comprising:
an image sequence acquiring unit, configured to obtain an image sequence, wherein the image sequence includes at least one frame of user image;
a condition judging unit, configured to determine, according to the image sequence, whether a preset wake-up condition is met;
a response voice determining unit, configured to determine, in response to determining that the wake-up condition is met, a response voice corresponding to the wake-up condition; and
a response voice playing unit, configured to play the response voice.
11. The device according to claim 10, wherein the image sequence includes eye images of the user; and
the condition judging unit is further configured to:
analyze the eye images to determine whether the user is gazing at the terminal; and
in response to determining that the user is gazing at the terminal, determine that the wake-up condition is met.
12. The device according to claim 11, wherein the response voice determining unit is further configured to:
in response to determining that the wake-up condition is met, obtain first voice information within a preset duration;
perform semantic parsing on the first voice information; and
determine, according to a semantic parsing result, the response voice corresponding to the wake-up condition.
13. The device according to claim 10, wherein the image sequence includes face images of the user; and
the condition judging unit is further configured to:
perform expression recognition on the face images in the image sequence;
determine, according to an expression recognition result, whether expressions corresponding to two adjacent frames in the image sequence are identical; and
in response to determining that the expressions corresponding to the two adjacent frames in the image sequence differ, determine that the wake-up condition is met.
14. The device according to claim 13, wherein the response voice determining unit is further configured to:
in response to determining that the wake-up condition is met, determine a response voice corresponding to the expression of the later of the two adjacent frames; and
use the determined response voice as the response voice corresponding to the wake-up condition.
15. The device according to claim 10, wherein the image sequence includes body images of the user; and
the condition judging unit is further configured to:
analyze the body images in the image sequence to determine action information of the user; and
in response to determining that the action information of the user satisfies a preset condition, determine that the wake-up condition is met.
16. The device according to claim 15, wherein the response voice determining unit is further configured to:
determine a response voice corresponding to the action information as the response voice corresponding to the wake-up condition.
17. The device according to claim 10, wherein the condition judging unit is further configured to:
in response to receiving a response voice sent by a server, determine that the wake-up condition is met.
18. The device according to any one of claims 10-17, further comprising:
a voice information acquiring unit, configured to obtain, in response to the playing of the response voice being finished, second voice information within a preset duration; and
a terminal control unit, configured to control the terminal according to the second voice information.
19. A terminal, comprising:
one or more processors; and
a storage device, storing one or more programs thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable medium, storing a computer program thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910509292.4A CN110196900A (en) | 2019-06-13 | 2019-06-13 | Exchange method and device for terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110196900A true CN110196900A (en) | 2019-09-03 |
Family
ID=67754421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910509292.4A Pending CN110196900A (en) | 2019-06-13 | 2019-06-13 | Exchange method and device for terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110196900A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154140A (en) * | 2018-01-22 | 2018-06-12 | 北京百度网讯科技有限公司 | Voice awakening method, device, equipment and computer-readable medium based on lip reading |
CN108733420A (en) * | 2018-03-21 | 2018-11-02 | 北京猎户星空科技有限公司 | Awakening method, device, smart machine and the storage medium of smart machine |
CN109346076A (en) * | 2018-10-25 | 2019-02-15 | 三星电子(中国)研发中心 | Interactive voice, method of speech processing, device and system |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110910887A (en) * | 2019-12-30 | 2020-03-24 | 苏州思必驰信息科技有限公司 | Voice wake-up method and device |
CN113626778A (en) * | 2020-05-08 | 2021-11-09 | 百度在线网络技术(北京)有限公司 | Method, apparatus, electronic device, and computer storage medium for waking up device |
CN113626778B (en) * | 2020-05-08 | 2024-04-02 | 百度在线网络技术(北京)有限公司 | Method, apparatus, electronic device and computer storage medium for waking up device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11158102B2 (en) | Method and apparatus for processing information | |
US11271765B2 (en) | Device and method for adaptively providing meeting | |
CN105320726B (en) | Reduce the demand to manual beginning/end point and triggering phrase | |
US20170277993A1 (en) | Virtual assistant escalation | |
JP2023553101A (en) | Live streaming interaction methods, apparatus, devices and media | |
CN107704169B (en) | Virtual human state management method and system | |
EP3611724A1 (en) | Voice response method and device, and smart device | |
CN110267113B (en) | Video file processing method, system, medium, and electronic device | |
CN109887505A (en) | Method and apparatus for wake-up device | |
US11803579B2 (en) | Apparatus, systems and methods for providing conversational assistance | |
CN108763475B (en) | Recording method, recording device and terminal equipment | |
CN110196900A (en) | Exchange method and device for terminal | |
CN109949793A (en) | Method and apparatus for output information | |
CN111312243B (en) | Equipment interaction method and device | |
CN110288683B (en) | Method and device for generating information | |
CN112309387A (en) | Method and apparatus for processing information | |
CN116437155A (en) | Live broadcast interaction method and device, computer equipment and storage medium | |
US20240171418A1 (en) | Information processing device and information processing method | |
CN110459239A (en) | Role analysis method, apparatus and computer readable storage medium based on voice data | |
CN212588503U (en) | Embedded audio playing device | |
CN113312928A (en) | Text translation method and device, electronic equipment and storage medium | |
KR20190030549A (en) | Method, system and non-transitory computer-readable recording medium for controlling flow of advertising contents based on video chat | |
CN114065056A (en) | Learning scheme recommendation method, server and system | |
CN109348353B (en) | Service processing method and device of intelligent sound box and intelligent sound box | |
CN110188712B (en) | Method and apparatus for processing image |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190903 |