CN112487958A - Gesture control method and system

Gesture control method and system

Info

Publication number
CN112487958A
CN112487958A
Authority
CN
China
Prior art keywords
gesture
user
dynamic
action
gestures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011361691.XA
Other languages
Chinese (zh)
Inventor
朱成亚
樊帅
宋洪博
吴卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd filed Critical AI Speech Ltd
Priority to CN202011361691.XA
Publication of CN112487958A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • G06V40/113Recognition of static hand signs
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223Cameras
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223Execution procedure of a spoken command

Abstract

An embodiment of the invention provides a gesture control method. The method comprises the following steps: in response to a wake-up word input by a user, starting a camera to capture the user's gestures; and deciding, based on the captured dynamic gesture of the user, whether to output the corresponding action. An embodiment of the invention also provides a gesture control system for electronic equipment. According to the embodiments of the invention, gesture recognition is started only after voice wake-up, rather than starting the camera as soon as the television is turned on, which effectively reduces the television's CPU usage in the ordinary non-awakened state. Meanwhile, because static gestures are easily triggered by mistake, more precise dynamic gestures are used, avoiding false triggering and misrecognition, improving the gesture hit rate and the user experience. Considering that different scenes may use different gestures, the scenes are divided and dynamic gestures are further constrained by scene, so that different scene modes yield different recognition results and states, realizing multi-modal control.

Description

Gesture control method and system
Technical Field
The invention relates to the field of intelligent voice, in particular to a gesture control method and system.
Background
With the development of intelligent speech technology, more and more smart devices have voice-control functions. To improve the efficiency of interaction with the user, gesture control has been introduced on top of voice interaction to enhance the interactive experience.
For example, television control can be realized by combining voice with static-picture gestures: after the television is woken up by voice, a static "OK" gesture triggers playback of a selected movie or television program. Considering that some users prefer not to speak, television control may also be realized with static-picture gestures alone, without voice. In the voice-plus-static-gesture scheme, after voice wake-up each input picture frame yields one gesture-action output, using image-recognition algorithms; the static-gesture-only scheme likewise uses image-recognition algorithms, just without voice.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the related art:
1. After voice wake-up, if every input picture frame produces one gesture-action output, a gesture action may be triggered multiple times. For the "OK" gesture above, if video is captured continuously and pictures are input continuously, several "OK" gesture outputs are produced and several actions may be triggered;
2. Meanwhile, gesture output based on static pictures cannot accurately trigger two consecutive outputs of the same gesture action, for example triggering two page-turning operations in a row by gesture;
3. Gesture output based on static pictures has no fault tolerance: once one gesture is misrecognized as another, the other gesture is falsely triggered;
4. Generally, gesture recognition is started after voice wake-up; if the algorithm imposes no scene restriction, a gesture result may be triggered at any time, particularly when pictures are misrecognized. For example, after voice wake-up the device is not in a list-selection state, yet a page-turning action is triggered (by picture misrecognition or an inadvertent human gesture).
If television control is realized with static-picture gestures alone, without voice, the camera is generally invoked as soon as the television is turned on. This risks false triggering, raises the television's CPU occupancy and, because pictures are captured continuously, does not protect the user's personal privacy.
Disclosure of Invention
The embodiments of the invention address at least the following problems in the prior art. When television control combines voice with static-picture gestures, gestures are output from static pictures: one picture frame outputs one gesture action, so multiple frames output multiple gestures and trigger multiple actions, and static-picture gestures do not consider the correlation between successive gestures; moreover, once generic gesture recognition is started, no scene-based restriction is applied. When television control uses static-picture gestures alone, without voice, gesture recognition can only be started when the television is turned on, which is prone to false triggering and high CPU occupancy.
In a first aspect, an embodiment of the present invention provides a gesture control method, including:
responding to a wake-up word input by a user, and starting a camera to acquire a gesture of the user;
and judging whether to output corresponding actions according to the collected dynamic gestures of the user.
In a second aspect, an embodiment of the present invention provides a gesture control system for an electronic device, including:
the gesture acquisition program module is used for responding to a wake-up word input by a user and starting a camera to acquire the gesture of the user;
and the action output program module is used for judging whether to output the corresponding action according to the collected dynamic gesture of the user.
In a third aspect, an electronic device is provided, comprising: the gesture control system comprises at least one processor and a memory which is connected with the at least one processor in a communication mode, wherein the memory stores instructions which can be executed by the at least one processor, and the instructions are executed by the at least one processor so as to enable the at least one processor to execute the steps of the gesture control method of any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the gesture control method according to any embodiment of the present invention.
The embodiments of the invention have the following beneficial effects: gesture recognition and the camera are started after voice wake-up rather than when the television is turned on, effectively reducing the television's CPU usage in the ordinary non-awakened state. Meanwhile, because static gestures are easily triggered by mistake, more precise dynamic gestures are used, avoiding false triggering and misrecognition, improving the gesture hit rate and the user experience.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a gesture control method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a dynamic gesture process of a gesture control method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating an operation of a gesture control method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a dynamic gesture state of a gesture control method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a gesture control system for an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a gesture control method according to an embodiment of the present invention, which includes the following steps:
s11: responding to a wake-up word input by a user, and starting a camera to acquire a gesture of the user;
s12: and judging whether to output corresponding actions according to the collected dynamic gestures of the user.
In this embodiment, the method can be adapted to various electronic devices, such as smart televisions, smartphones and smart tablets. Gestures are most often used with smart televisions, so the following examples take a smart television as the example.
For step S11, the smart television is playing media content, and a user who wishes to control it first wakes it up by voice. In consideration of CPU utilization, the camera is not started when the smart television is turned on; instead, gesture recognition is started only after voice wake-up, effectively reducing the smart television's resource use in the ordinary non-awakened state.
For example, while the smart television is playing a video it is in a real-time voice-capture state, continuously recording sound through the microphone (the voice SDK is started, and the recorded audio is passed to it in real time) until the user says the television's wake-up word, e.g. "Hello, Xiao A". The remote controls of some smart televisions also have a wake-up function, so the user can likewise wake the television by pressing a button on the remote.
Considering that the smart television can be loud during playback, a microphone can also be placed on the remote control to pick up the user's voice, so that sound is collected through both the television and the remote, further improving wake-up reliability.
For step S12, after confirming that the user has input the wake-up word, the gesture SDK is activated and the camera is turned on to capture the dynamic gesture made by the user. The captured dynamic gesture is recognized and evaluated to determine whether to output the action corresponding to it.
According to this embodiment, gesture recognition and the camera are started after voice wake-up rather than when the television is turned on, effectively reducing the television's CPU usage in the non-awakened state. Meanwhile, because static gestures are easily triggered by mistake, more precise dynamic gestures are used, avoiding false triggering and misrecognition, improving the gesture hit rate and the user experience.
As an implementation manner, in the present embodiment, the dynamic gesture is composed of a plurality of static gestures.
In this embodiment, a dynamic gesture is composed of a plurality of static states. If no hand is detected, the output is defined as "no gesture"; when a hand is detected, the corresponding static gesture is output, such as "hand left" or "hand right".
A dynamic gesture process is defined, taking the "hand left" gesture as an example, as shown in fig. 2. A dynamic gesture must start with "no gesture" (three consecutive times; the specific number is not limited, three is only an example) and end with "no gesture" (three consecutive times). The first time three consecutive identical gesture outputs occur in between, a single normalized action is output; after that, nothing further is output until the closing "no gesture" run occurs.
For example, the simplest class of dynamic gesture, such as "next episode", consists of several consecutive "no gesture" static images, then several consecutive "one finger right" static gestures, and finally several "no gesture" static images.
Considering complex operations, a single static gesture may be insufficient, for example for operations such as "focus", "like", "throw" or "coin". Such a dynamic gesture can be set as: several consecutive "no gesture" frames to start, then several consecutive "one finger pointing left 45 degrees" static gestures (in practice this can be defined as an interval, which improves the recognition rate), then several consecutive "one finger pointing right 45 degrees" static gestures, ending with several "no gesture" frames.
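The composition described above — a run of "no gesture" frames, then a run of identical static gestures, then "no gesture" again — can be sketched as a simple sequence matcher. This is an illustrative sketch, not the patent's implementation; the labels and the `min_run` parameter are assumptions:

```python
def detect_dynamic_gesture(frames, min_run=3):
    """Detect one dynamic gesture from a stream of static-gesture labels.

    A valid dynamic gesture is: at least min_run consecutive "no_gesture"
    frames, then at least min_run consecutive identical gesture frames,
    then at least min_run "no_gesture" frames again.
    Returns the gesture label, or None if no such pattern occurs.
    """
    # Run-length encode the frame labels into [label, count] pairs
    runs = []
    for label in frames:
        if runs and runs[-1][0] == label:
            runs[-1][1] += 1
        else:
            runs.append([label, 1])
    # Look for the no_gesture / gesture / no_gesture pattern
    for i in range(len(runs) - 2):
        start, middle, end = runs[i], runs[i + 1], runs[i + 2]
        if (start[0] == "no_gesture" and start[1] >= min_run
                and middle[0] != "no_gesture" and middle[1] >= min_run
                and end[0] == "no_gesture" and end[1] >= min_run):
            return middle[0]
    return None
```

With this shape, `detect_dynamic_gesture(["no_gesture"]*3 + ["hand_left"]*3 + ["no_gesture"]*3)` yields `"hand_left"`, while a bare run of `"hand_left"` frames with no delimiting "no gesture" runs yields `None` — which is exactly how the delimiters prevent false triggering.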
In one embodiment, when a predetermined number of consecutive identical static gestures are captured, those consecutive static gestures are normalized.
Considering that the smart television captures in real time, a high capture frequency may yield multiple identical gestures: for example, the user holds a static "one finger right" gesture for 1 second, but the capture interval is 0.2 s, so 5 identical gestures are captured consecutively. To avoid such repeats, they are normalized: the 5 identical gestures are treated as 1 gesture. Instead of "no gesture" → five repetitions of "one finger right" → "no gesture", normalization yields the single motion gesture "no gesture" → "one finger right" → "no gesture".
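The normalization just described is a run-length collapse of consecutive identical samples. A minimal sketch, with illustrative labels (the patent does not fix names or an API):

```python
def normalize(samples):
    """Collapse consecutive identical static-gesture samples into one.

    E.g. at a 0.2 s capture interval, a 1-second "one_finger_right" pose
    yields 5 identical samples; normalization reduces the run to a single
    gesture so that only one action can be triggered.
    """
    normalized = []
    for sample in samples:
        # Keep a sample only when it differs from the previous one
        if not normalized or normalized[-1] != sample:
            normalized.append(sample)
    return normalized
```

For example, `normalize(["no_gesture", "one_finger_right", "one_finger_right", "one_finger_right", "no_gesture"])` returns `["no_gesture", "one_finger_right", "no_gesture"]` — the single motion gesture described in the text.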
In practice, the actions corresponding to dynamic gestures can be preset, and the user can be given permission to customize them, making gestures more comfortable to use and improving the interactive experience. Moreover, normalized dynamic gestures prevent false triggering and misrecognition and improve the hit rate of user gestures.
For gesture normalization, different settings can be made for different scenes. In some gesture-application scenes, normalized output is not required: for example, "two fingers up"/"two fingers down" may be defined to raise and lower the television volume, where continuous gesture-action output is desired, so no normalization is performed there. For the gesture SDK itself, processing is preferably uniform: its function is to take pictures in and output the corresponding gestures, while the application layer applies different processing for different scenes. A gesture-processing module is therefore abstracted, in which some gestures need normalization and some do not.
As an implementation manner, in this embodiment, the determining whether to output the corresponding action according to the collected dynamic gesture of the user includes:
identifying a scene type corresponding to the dynamic gesture of the user;
when the scene type is consistent with the current scene of the electronic equipment, judging whether the dynamic gesture belongs to a registered gesture in the current scene;
and if so, outputting the action corresponding to the dynamic gesture.
And when the scene type is inconsistent with the current scene of the electronic equipment, no corresponding action is output in response to the dynamic gesture.
In this embodiment it is considered that, in some cases, part of the dynamic gestures have no corresponding action, and even if recognized they cannot be acted on. For example, the smart television may be used to watch live CCTV channels such as the children's channel or the science channel. This type of playback differs from video platforms where recorded videos made by others are watched: live cable channels typically have no "like", "focus" or "coin" operations. In consideration of this, it is determined whether the scene type of the dynamic gesture is consistent with the current scene, and whether the gesture is registered in that scene. Otherwise, even a registered dynamic gesture is useless, because it cannot be used in this scene.
If the user's current scene is of the multimedia-control type — for example, recorded video on other platforms is usually presented with many multimedia controls — then the scene corresponding to the user's "focus" gesture (multimedia-control type) is consistent with the current scene (the platform being played has multimedia controls). In this case, the action corresponding to the dynamic gesture is output, and the anchor of the watched video is followed.
In more detail, the scenes can be divided into the following scenes:
text control type: the data representing the display only needs to contain text;
content card control types: the data which represents the display comprises text information, and also comprises additional description information, icon information, display information and the like;
list control type: representing that the current data is a plurality of content card information;
multimedia control type: similar to the list control, except that each item represents multimedia-related information;
Self-defining the control type: the format of the returned data is not limited.
According to the embodiment, the scenes are divided in consideration of the fact that different gestures may exist in different scenes, and the dynamic gestures are further limited through the scenes, so that different recognition results and states exist in different scene modes, and multi-modal control is achieved.
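The scene filtering described above — respond only when the gesture's scene type matches the current scene and the gesture is registered there — can be sketched as follows. The scene names, registry layout and action names are illustrative assumptions, not part of the patent text:

```python
# Registered gestures per scene type; a gesture is only acted on when the
# electronic device is currently in a scene where it is registered.
SCENE_GESTURES = {
    "list_control": {"page_up", "page_down", "select"},
    "multimedia_control": {"like", "focus", "coin", "next_episode"},
}

def dispatch(gesture, current_scene, actions):
    """Output the mapped action only if the gesture is registered in the
    current scene; otherwise the gesture is filtered and nothing happens."""
    registered = SCENE_GESTURES.get(current_scene, set())
    if gesture in registered:
        return actions.get(gesture)
    return None  # scene mismatch or unregistered gesture: do not respond
```

Under this sketch, a "like" gesture dispatched while the device shows a live cable channel (a scene where "like" is not registered) returns `None`, which mirrors the CCTV example above.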
To summarize the method as a whole, fig. 3 shows a flowchart of the complete dynamic-gesture operation.
Starting up: the smart television is turned on;
Continuous recording: the smart television's recorder receives sound; the voice SDK is started after boot, and the recorder continuously sends audio to the voice SDK;
Voice wake-up: the user inputs the wake-up word by far-field voice or by pressing the remote control, e.g. input: "Nihao Xiao S";
Gesture SDK initialization: after the smart television is woken up, the gesture SDK is initialized;
Gesture-result callback registration: the smart television registers a callback function for gesture results; the callback is invoked whenever a gesture result is returned;
Gesture registration: the gestures currently needed are registered; after registration, only registered gestures are responded to, and unregistered gestures are not;
Starting the camera: the smart television starts the camera;
Gesture SDK picture input: pictures are fed through the interface provided by the smart television's gesture SDK;
Gesture-result callback: the gesture result is returned through the callback;
Gesture-result processing: the flow is shown in fig. 4 (taking "hand left" as an example):
State 1: "no gesture"; when three consecutive "no gesture" results are detected, state 1 switches to state 2;
State 2: when three consecutive "hand left" gesture results are then encountered, the "hand left" gesture is output once; in all subsequent cases it is not output again;
State 3: after three consecutive "no gesture" results, the machine automatically switches back to state 1.
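The three states in fig. 4 can be sketched as a small state machine. Counts and labels follow the "hand left" example (three consecutive identical results per transition); the class and method names are illustrative, not the patent's API:

```python
class GestureStateMachine:
    """State 1: idle, waiting for 3 consecutive no-gesture results.
    State 2: armed; after 3 consecutive identical gesture results the
    gesture is output exactly once and the machine enters state 3.
    State 3: silent until 3 consecutive no-gesture results reset it
    to state 1."""

    def __init__(self, required=3):
        self.required = required
        self.state = 1
        self.last = None
        self.count = 0

    def feed(self, result):
        # Track the run length of consecutive identical results
        if result == self.last:
            self.count += 1
        else:
            self.last, self.count = result, 1
        if self.state == 1 and result == "no_gesture" and self.count >= self.required:
            self.state = 2
        elif self.state == 2 and result != "no_gesture" and self.count >= self.required:
            self.state = 3
            return result  # output the dynamic gesture exactly once
        elif self.state == 3 and result == "no_gesture" and self.count >= self.required:
            self.state = 1
        return None
```

Feeding it three "no_gesture" results, five "hand_left" results, then three more "no_gesture" results produces exactly one "hand_left" output (on the third "hand_left"), after which the machine is reset and ready for the next dynamic gesture.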
Scene filtering: the scenes in which gestures respond are defined; when the device is not in the corresponding scene, even if the corresponding gesture is output, it is filtered;
and (3) gesture output: outputting a dynamic gesture of a user;
and (3) output action: and outputting the action mapped by the dynamic gesture (such as next set, next channel, attention and the like).
Fig. 5 is a schematic structural diagram of a gesture control system for an electronic device according to an embodiment of the present invention, which can execute the gesture control method according to any of the above embodiments and is configured in a terminal.
The present embodiment provides a gesture control system 10 for an electronic device, including: a gesture acquisition program module 11 and a motion output program module 12.
The gesture collection program module 11 is configured to respond to a wake-up word input by a user and start a camera to collect a gesture of the user; the action output program module 12 is configured to determine whether to output a corresponding action according to the collected dynamic gesture of the user.
Further, in the system, the dynamic gesture is composed of a plurality of static gestures.
Further, the action output program module is configured to:
identifying a scene type corresponding to the dynamic gesture of the user;
when the scene type is consistent with the current scene of the electronic equipment, judging whether the dynamic gesture belongs to a registered gesture in the current scene;
and if so, outputting the action corresponding to the dynamic gesture.
Further, the action output program module is further configured to:
identifying a scene type corresponding to the dynamic gesture of the user;
and when the scene type is inconsistent with the current scene of the electronic equipment, outputting no corresponding action in response to the dynamic gesture.
Further, the system is also configured to: when a predetermined number of consecutive identical static gestures are captured, normalize those consecutive static gestures.
The embodiment of the invention also provides a nonvolatile computer storage medium, wherein the computer storage medium stores computer executable instructions which can execute the gesture control method in any method embodiment;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
responding to a wake-up word input by a user, and starting a camera to acquire a gesture of the user;
and judging whether to output corresponding actions according to the collected dynamic gestures of the user.
The non-volatile computer-readable storage medium may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the invention. One or more program instructions are stored in the non-volatile computer-readable storage medium and, when executed by a processor, perform the gesture control method of any of the method embodiments described above.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory located remotely from the processor, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides an electronic device, which includes: the gesture control system comprises at least one processor and a memory which is connected with the at least one processor in a communication mode, wherein the memory stores instructions which can be executed by the at least one processor, and the instructions are executed by the at least one processor so as to enable the at least one processor to execute the steps of the gesture control method of any embodiment of the invention.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) mobile communication devices, which are characterized by mobile communication capabilities and are primarily targeted at providing voice and data communications. Such terminals include smart phones, multimedia phones, functional phones, and low-end phones, among others.
(2) The ultra-mobile personal computer equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as tablet computers.
(3) Portable entertainment devices such devices may display and play multimedia content. The devices comprise audio and video players, handheld game consoles, electronic books, intelligent toys and portable vehicle-mounted navigation devices.
(4) Other electronic devices with data processing capabilities.
As used herein, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features equivalently replaced, without such modifications or substitutions departing from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A gesture control method applied to an electronic device, comprising the following steps:
responding to a wake-up word input by a user, and starting a camera to acquire a gesture of the user;
and judging whether to output corresponding actions according to the collected dynamic gestures of the user.
2. The method of claim 1, wherein the dynamic gesture is composed of a plurality of static gestures.
3. The method according to any one of claims 1-2, wherein the determining whether to output the corresponding action according to the collected dynamic gesture of the user comprises:
identifying a scene type corresponding to the dynamic gesture of the user;
when the scene type is consistent with the current scene of the electronic equipment, judging whether the dynamic gesture belongs to a registered gesture in the current scene;
and if so, outputting the action corresponding to the dynamic gesture.
4. The method according to any one of claims 1-2, wherein the determining whether to output the corresponding action according to the collected dynamic gesture of the user comprises:
identifying a scene type corresponding to the dynamic gesture of the user;
and when the scene type is inconsistent with the current scene of the electronic equipment, not responding to the dynamic gesture and not outputting a corresponding action.
5. The method according to any one of claims 1-2, wherein the method comprises: when a preset number of identical consecutive static gestures are collected, performing normalization processing on the plurality of consecutive static gestures.
6. A gesture control system for an electronic device, comprising:
the gesture acquisition program module is used for responding to a wake-up word input by a user and starting a camera to acquire the gesture of the user;
and the action output program module is used for judging whether to output the corresponding action according to the collected dynamic gesture of the user.
7. The system of claim 6, wherein the dynamic gesture is composed of a plurality of static gestures.
8. The system of any of claims 6-7, wherein the action output program module is to:
identifying a scene type corresponding to the dynamic gesture of the user;
when the scene type is consistent with the current scene of the electronic equipment, judging whether the dynamic gesture belongs to a registered gesture in the current scene;
and if so, outputting the action corresponding to the dynamic gesture.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any of claims 1-5.
10. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
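The method claims above (claims 1-5) can be sketched as a minimal control flow. Everything below is an illustrative assumption and not part of the patent: the names `SCENE_REGISTRY`, `DynamicGesture`, `normalize`, `recognize_scene`, and `handle_gesture`, as well as the example scenes, gestures, and actions, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical registry of gestures per scene: scene type -> {gesture name: action}.
# The scene and action names are invented for illustration only.
SCENE_REGISTRY: Dict[str, Dict[str, str]] = {
    "video_playback": {"swipe_left": "previous_video", "swipe_right": "next_video"},
    "music": {"palm_open": "pause_track"},
}


@dataclass
class DynamicGesture:
    name: str                 # e.g. "swipe_left"
    static_frames: List[str]  # claim 2: a dynamic gesture is composed of static gestures


def on_wake_word(word: str, wake_word: str) -> bool:
    """Claim 1: the camera is started to collect gestures only after the wake-up word."""
    return word == wake_word


def normalize(frames: List[str], preset: int = 3) -> List[str]:
    """Claim 5 (one possible reading): when at least `preset` identical consecutive
    static gestures are collected, collapse the run into a single gesture."""
    out: List[str] = []
    i = 0
    while i < len(frames):
        j = i
        while j < len(frames) and frames[j] == frames[i]:
            j += 1  # extend the run of identical frames
        if j - i >= preset:
            out.append(frames[i])      # normalize: keep one representative
        else:
            out.extend(frames[i:j])    # short runs pass through unchanged
        i = j
    return out


def recognize_scene(gesture: DynamicGesture) -> Optional[str]:
    """Claim 3, step 1: identify the scene type corresponding to the gesture."""
    for scene, registered in SCENE_REGISTRY.items():
        if gesture.name in registered:
            return scene
    return None


def handle_gesture(gesture: DynamicGesture, current_scene: str) -> Optional[str]:
    """Claims 3-4: act only when the gesture's scene matches the current scene
    and the gesture is registered in that scene; otherwise do not respond."""
    scene = recognize_scene(gesture)
    if scene != current_scene:
        return None  # claim 4: scene mismatch -> no action is output
    return SCENE_REGISTRY[scene].get(gesture.name)
```

For example, a `swipe_left` dynamic gesture produces `"previous_video"` when the device is in the hypothetical `"video_playback"` scene, and no response at all in the `"music"` scene.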
CN202011361691.XA 2020-11-27 2020-11-27 Gesture control method and system Pending CN112487958A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011361691.XA CN112487958A (en) 2020-11-27 2020-11-27 Gesture control method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011361691.XA CN112487958A (en) 2020-11-27 2020-11-27 Gesture control method and system

Publications (1)

Publication Number Publication Date
CN112487958A true CN112487958A (en) 2021-03-12

Family

ID=74936587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011361691.XA Pending CN112487958A (en) 2020-11-27 2020-11-27 Gesture control method and system

Country Status (1)

Country Link
CN (1) CN112487958A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113934307A (en) * 2021-12-16 2022-01-14 佛山市霖云艾思科技有限公司 Method for starting electronic equipment according to gestures and scenes
CN113934307B (en) * 2021-12-16 2022-03-18 佛山市霖云艾思科技有限公司 Method for starting electronic equipment according to gestures and scenes

Similar Documents

Publication Publication Date Title
CN107172497B (en) Live broadcasting method, apparatus and system
CN109660817B (en) Video live broadcast method, device and system
WO2020010818A1 (en) Video capturing method and apparatus, terminal, server and storage medium
CN106506448A (en) Live display packing, device and terminal
CN108924661B (en) Data interaction method, device, terminal and storage medium based on live broadcast room
CN104618218A (en) Information reminding method and device
CN110446115B (en) Live broadcast interaction method and device, electronic equipment and storage medium
CN107888965B (en) Image gift display method and device, terminal, system and storage medium
CN108881766B (en) Video processing method, device, terminal and storage medium
CN106105172A (en) Highlight the video messaging do not checked
CN106550252A (en) The method for pushing of information, device and equipment
CN109788345B (en) Live broadcast control method and device, live broadcast equipment and readable storage medium
US11200899B2 (en) Voice processing method, apparatus and device
CN109032554B (en) Audio processing method and electronic equipment
WO2021208607A1 (en) Video stream playing control method and apparatus, and storage medium
CN110691281B (en) Video playing processing method, terminal device, server and storage medium
CN112487958A (en) Gesture control method and system
CN111107421A (en) Video processing method and device, terminal equipment and storage medium
CN111596760A (en) Operation control method and device, electronic equipment and readable storage medium
CN109686370A (en) The method and device of fighting landlord game is carried out based on voice control
CN110121106A (en) Video broadcasting method and device
CN106572397A (en) Interaction method and device for live video application
CN110705356A (en) Function control method and related equipment
TWI581626B (en) System and method for processing media files automatically
CN104049758A (en) Information processing method and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Ltd.
