CN114167984A - Device control method, device, storage medium and electronic device - Google Patents
- Publication number
- CN114167984A (application CN202111416104.7A / CN202111416104A)
- Authority
- CN
- China
- Prior art keywords
- action
- classification result
- target
- sensor data
- motion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1016—Earpieces of the intra-aural type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1041—Mechanical or electronic switches, or control elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1091—Details not provided for in groups H04R1/1008 - H04R1/1083
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2201/00—Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
- H04R2201/10—Details of earpieces, attachments therefor, earphones or monophonic headphones covered by H04R1/10 but not provided for in any of its subgroups
- H04R2201/109—Arrangements to adapt hands free headphones for use on both ears
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The embodiment of the application discloses a device control method, apparatus, storage medium, and electronic device. In the embodiment, target sensor data are acquired from a sensor data sequence according to a preset sliding window; the target sensor data are processed according to an action recognition model to obtain an action classification result corresponding to the target sensor data; and when it is determined from the action classification result that an action has occurred, a target device is controlled to perform the operation corresponding to the action. With this scheme, whether the user has performed an action can be recognized directly from the data detected by the motion sensor, and the corresponding operation is then carried out, so the user does not need to control the electronic device manually, which improves the control efficiency of the electronic device.
Description
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to a device control method, apparatus, storage medium, and electronic device.
Background
With the popularization of smart wearable devices and the intelligent upgrading of human-computer interaction in recent years, people have new requirements for more convenient and novel interaction methods. Most existing wearable devices are controlled by touch, and the control efficiency of this approach is low.
Disclosure of Invention
The embodiments of the application provide a device control method and apparatus, a storage medium, and an electronic device, which can improve the control efficiency of wearable devices.
In a first aspect, an embodiment of the present application provides an apparatus control method, including:
acquiring target sensor data from a sensor data sequence according to a preset sliding window;
processing the target sensor data according to a motion recognition model to obtain a motion classification result corresponding to the target sensor data;
and when it is determined, according to the action classification result, that an action has occurred, controlling a target device to perform the operation corresponding to the action.
In a second aspect, an embodiment of the present application further provides an apparatus control device, including:
the data acquisition module is used for acquiring target sensor data from the sensor data sequence according to a preset sliding window;
the action recognition module is used for processing the target sensor data according to an action recognition model to obtain an action classification result corresponding to the target sensor data;
and the equipment control module is used for controlling the target equipment to respond to the operation corresponding to the action when the action is determined to occur according to the action classification result.
In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the apparatus control method provided in any embodiment of the present application.
In a fourth aspect, an embodiment of the present application further provides an electronic device, including a processor and a memory, where the memory has a computer program, and the processor is configured to execute the device control method provided in any embodiment of the present application by calling the computer program.
According to the technical scheme provided by the embodiments of the application, target sensor data are acquired from a sensor data sequence according to a preset sliding window, the target sensor data are then processed according to an action recognition model to obtain an action classification result corresponding to the target sensor data, and when it is determined from the action classification result that an action has occurred, the electronic device is controlled to perform the operation corresponding to the action. In this way, whether the user has performed an action can be recognized directly from the data detected by the motion sensor, and the corresponding operation is then carried out, so the user does not need to control the electronic device manually, which improves the control efficiency of the electronic device.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a first flowchart of an apparatus control method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a preset sliding window and a preset step size in the device control method provided in the embodiment of the present application.
Fig. 3 is a schematic network structure diagram of a motion recognition model in the device control method according to the embodiment of the present application.
Fig. 4 is a schematic diagram of another network structure of a motion recognition model in the device control method according to the embodiment of the present application.
Fig. 5 is a schematic diagram of a head shaking action in a device control method provided in an embodiment of the present application.
Fig. 6 is a schematic diagram of a case where no head-shaking motion occurs, in the device control method provided in an embodiment of the present application.
Fig. 7 is a scene schematic diagram of a user wearing a head-mounted device in an embodiment of the present application.
Fig. 8 is a second flowchart of an apparatus control method according to an embodiment of the present application.
Fig. 9 is a third flowchart of a device control method according to an embodiment of the present application.
Fig. 10 is a schematic structural diagram of an apparatus control device according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Human body actions mainly refer to the ways in which the body moves and reacts to the environment or to objects, with complex actions described or expressed through the motion of the limbs. It can be said that most human actions are reflected in the movement of the limbs, so studying and exploring limb movement is a very effective way of analyzing human actions.
In research on related device control methods, the inventors found that the accuracy of such methods in recognizing human actions needs to be improved.
In order to improve the accuracy of action recognition, an embodiment of the present application provides a device control method. The execution subject of the device control method may be the device control apparatus provided in the embodiment of the present application, or an electronic device integrated with the device control apparatus, where the device control apparatus may be implemented in hardware or software. The electronic device may be an intelligent terminal such as a smartphone, a tablet computer, or a smart large screen, or may be a head-mounted device such as smart glasses, AR glasses, VR glasses, or earphones.
When the electronic device that is the execution subject of the device control method of the present application is a smart terminal, the target device that is the control target may be a head-mounted device or a smart terminal. For example, the device control method according to the embodiment of the present application is provided in the form of a computer program in an intelligent terminal, and the intelligent terminal receives a sensor data sequence transmitted by a motion sensor of a head-mounted device, performs motion recognition, and controls the intelligent terminal to perform an operation matching a head motion of a wearer of the head-mounted device (e.g., VR glasses) according to a motion classification result. Or the intelligent terminal receives a sensor data sequence sent by a motion sensor of the head-mounted device, performs motion recognition, and controls the head-mounted device (such as an earphone) to execute operation matched with the head motion of a wearer of the head-mounted device according to the motion classification result. For example, the intelligent terminal generates a corresponding control instruction according to the action classification result, and sends the control instruction to the head-mounted device to control the head-mounted device to execute a corresponding operation.
When the electronic device that is the execution subject of the device control method of the present application is a head-mounted device, the target device that is the control target may be the head-mounted device or may be an intelligent terminal. For example, the device control method according to the embodiment of the present application is provided in the form of a computer program on a head-mounted device, and the head-mounted device acquires a sensor data sequence acquired by a motion sensor thereof, performs motion recognition, and controls an intelligent terminal to perform an operation matching with a head motion of a wearer of the head-mounted device (e.g., VR glasses) according to a motion classification result. Or the head-mounted equipment acquires a sensor data sequence acquired by a motion sensor of the head-mounted equipment, performs action recognition, and controls the head-mounted equipment (such as VR glasses) to perform operation matched with the head action of a wearer of the head-mounted equipment according to the action classification result.
Electronic equipment such as an intelligent terminal and a tablet computer can be connected with the head-mounted equipment through a network. Such as establishing a network connection via a Wi-Fi connection, via a bluetooth connection, etc.
Referring to fig. 1, fig. 1 is a first flowchart illustrating an apparatus control method according to an embodiment of the present disclosure. The specific process of the device control method provided by the embodiment of the application can be as follows:
101. and acquiring target sensor data from the sensor data sequence according to a preset sliding window.
The scheme of the embodiment of the application can be applied to human-computer interaction scenarios. The motion sensor may be an acceleration sensor, a gyroscope, an inertial measurement unit, or the like, capable of detecting motion data of the device. The sensor data may be acceleration data, angular velocity data, or the like. When the motion sensor is an inertial measurement unit, the sensor data may include one or more of the raw acceleration data, angular velocity data, and magnetometer data output by the inertial measurement unit; the sensor data may also be quaternions calculated from the raw acceleration data, angular velocity data, and magnetometer data output by the inertial measurement unit.
The motion sensor is a motion sensor of a head-mounted device. The motion sensor may collect sensor data at a preset frequency during operation of the head-mounted device, or while the head-mounted device is in a preset operating mode. For example, if the motion sensor collects sensor data at a frequency of 50 Hz, 50 sensor data points are collected per second. The motion sensor collects sensor data at the preset frequency and continuously outputs them in order, and the sensor data form a sensor data sequence according to the time order of collection.
In the following, for ease of understanding, it is assumed that the electronic device and the target device are both the head-mounted device. The head-mounted device recognizes the head action of the current wearer according to the sensor data sequence collected by the motion sensor, and controls execution of the corresponding operation according to the recognition result.
In general, the action a user performs to make the target device execute a certain operation does not last very long. For example, if the target device is a pair of smart glasses, one action performed by the user may be "nodding" or "shaking". When the electronic device performs action recognition according to the sensor data, in order to improve computational efficiency and recognition accuracy, only a recently acquired segment of sensor data may be used as the basis for recognition. For example, the target sensor data are acquired from the sensor data sequence according to a preset sliding window.
In some embodiments, obtaining target sensor data from a sensor data sequence according to a preset sliding window comprises: controlling a motion sensor to acquire sensor data according to a preset frequency, wherein the acquired sensor data form a sensor data sequence according to a time sequence; and acquiring target sensor data from the sensor data sequence according to a preset sliding window and a preset step length.
In this embodiment, the preset sliding window and the preset step length may be set according to user requirements, and in general, a most appropriate value may be determined according to experimental data in a test stage. Assuming that the length of the motion is generally about 80 points, the length of the preset sliding window may be close to the length of one motion or slightly longer than the length of one motion in order to accurately identify a completed motion. For example, the length of the preset sliding window may be 70 to 100 points. Wherein a point refers to a piece of sensor data.
The preset step is a distance of one movement of the preset sliding window, for example, in an embodiment, the preset step is 1 to 10 points. Referring to fig. 2, fig. 2 is a schematic diagram of a preset sliding window and a preset step length in an apparatus control method according to an embodiment of the present application.
The motion sensor continuously outputs sensor data at the preset frequency, and the electronic device receives the sensor data item by item at the rate sent by the motion sensor. When the interval since the previous window reaches the preset step length, the latest sensor data point is taken as the window end point, and the most recent data are extracted from the sensor data sequence according to the preset sliding window as the target sensor data. In this manner, target sensor data are continuously extracted from the sensor data sequence. It can be understood that the target sensor data are a short sub-sequence of the entire sensor data sequence.
According to the method for acquiring the target sensor data, real-time action recognition can be realized according to the received sensor data.
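The acquisition logic above can be sketched as follows. This is a minimal Python illustration only: the window length of 80 points and step of 5 points are example values within the ranges given above, and the class and method names are hypothetical.

```python
from collections import deque

WINDOW_LEN = 80   # preset sliding window length, in sensor data points (example value)
STEP = 5          # preset step length, in sensor data points (example value)

class SlidingWindowBuffer:
    """Accumulates incoming sensor data and emits the latest window every STEP points."""

    def __init__(self, window_len=WINDOW_LEN, step=STEP):
        self.buffer = deque(maxlen=window_len)
        self.step = step
        self.points_since_last_window = 0

    def push(self, sample):
        """Append one sensor data point; return the target sensor data when a window is due."""
        self.buffer.append(sample)
        self.points_since_last_window += 1
        if len(self.buffer) == self.buffer.maxlen and self.points_since_last_window >= self.step:
            self.points_since_last_window = 0
            # the latest point is the window end point; the deque holds the most recent WINDOW_LEN points
            return list(self.buffer)
        return None
```

Each non-None return value is one piece of target sensor data that can be handed to the action recognition model.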
In an embodiment, if the electronic device is an intelligent terminal, before acquiring the target sensor data, it may further be detected whether the electronic device is in a connection state with the head-mounted device, and if the electronic device is in a connection state with the head-mounted device, the electronic device acquires the target sensor data.
As one of the manners, the connection state between the electronic device and the head-mounted device may be determined by looking up the state value of the electronic device, specifically, two different state values may be set for the electronic device in advance, when the electronic device is connected to the head-mounted device, the first state value is returned, and when the electronic device is not connected to the head-mounted device, the second state value is returned, so that whether the electronic device is in the connection state with the head-mounted device may be determined by detecting the first state value and the second state value. For example, the first state value of the electronic device is set to 1 in advance, the second state value of the electronic device is set to 0, if the state value of the electronic device is detected to be 1, it is determined that the electronic device and the head-mounted device are in a connected state, and if the state value returned by the electronic device is detected to be 0, it is determined that the electronic device and the head-mounted device are in a non-connected state. Optionally, if it is detected in the time sequence that the state value returned by the electronic device at the adjacent time is changed from 1 to 0, it is determined that the electronic device and the head-mounted device are in the connection interruption state.
As another mode, the electronic device sends a broadcast both when the headset is connected and when the headset is disconnected, so the electronic device can determine whether the electronic device is connected to the headset by monitoring the broadcast.
102. And processing the target sensor data according to the action recognition model to obtain an action classification result corresponding to the target sensor data.
After the target sensor data is acquired, the data is input into the motion recognition model for motion recognition. The motion recognition model is obtained by training according to sample data. The action recognition model is a classification model built based on a convolutional neural network, and can recognize which action the current action belongs to according to input target sensor data.
For example, in one embodiment, the action recognition model includes a feature extraction network and a fully connected layer; processing the target sensor data according to the action recognition model to obtain an action classification result corresponding to the target sensor data, and the method comprises the following steps: converting target sensor data into a multi-channel input vector; performing feature extraction processing on the multi-channel input vector according to a feature extraction network to obtain a feature vector; and processing the characteristic vectors according to the full connection layer to obtain the action type.
The feature extraction network comprises a plurality of convolution layers, and the plurality of convolution layers perform convolution operation on input target sensor data to obtain feature vectors. And the full connection layer performs classification operation based on the feature vectors to obtain the probability value of the target sensor data on each action, and finally, one action with the highest probability value is used as an action classification result.
Next, the description is given by taking three-axis gyroscope data as an example of the sensor data. The gyroscope data include data on the three coordinate axes of three-dimensional space, that is, each piece of sensor data has three dimensions. Assuming that one piece of target sensor data has a length of 5, it can be converted into three input vectors of length 5, each of size (1 × 5). The input vectors of the three channels are then input into the action recognition model for calculation to obtain the action classification result corresponding to the target sensor data. For the action recognition model, a plurality of actions can be set in the fully connected layer. For example, if the actions include a nodding action and a head-shaking action, the fully connected layer may be set with three categories: head-shaking action, nodding action, and meaningless action. Actions other than the nodding action and the head-shaking action may be recognized as meaningless actions, as may the case where the user performs no action at all.
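A rough sketch of this step is given below, assuming a PyTorch model like the one described later. The channel-first reshape and the argmax over softmax probabilities follow the description; all names are illustrative.

```python
import numpy as np
import torch

def classify_window(model, window):
    """window: array of shape (window_len, 3) - one row per sample, one column per gyroscope axis."""
    x = np.asarray(window, dtype=np.float32).T   # (3, window_len): one input vector per channel
    x = torch.from_numpy(x).unsqueeze(0)         # (1, 3, window_len): add a batch dimension
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)   # probability value for each action category
    return int(probs.argmax(dim=1))              # action with the highest probability
```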
It is understood that the actions listed above are merely exemplary; in other embodiments, the actions may further include nodding once, nodding twice, raising the head, shaking the head, swinging the head to the left, and so on, which are not enumerated here.
Next, a training process of the motion recognition model in the embodiment of the present application will be described.
In some embodiments, the training method of the motion recognition model includes: acquiring a sample sensor data sequence, wherein the sample sensor data sequence comprises a plurality of actions, and a starting point label and an action label of each action; dividing a sample sensor data sequence into a plurality of sample data according to a preset sliding window and a sliding step length; for each sample data, generating an action tag of each sample data according to the starting point tag and the action tag corresponding to the sensor data in the sample data; and training the motion recognition model according to the plurality of sample data and the corresponding motion labels to determine model parameters.
Taking the electronic device being smart glasses as an example, one or more sample sensor data sequences are first collected. For example, motion data of 70 people are collected, and each person performs some specific actions, such as nodding once, nodding twice, raising the head, shaking the head, and swinging the head to the left, so that specific operations can be defined through these five actions, where one action may correspond to one operation. Each action is collected multiple times, for example 50 to 100 times. In addition to these actions, meaningless daily actions are also collected in order to prevent the effects of jitter and other movements and to increase the robustness of the algorithm. The meaningless actions include walking, jogging, going up and down stairs, turning around slowly, slowly lowering the head to look at something, going from standing to sitting and then standing up again, slowly raising the head and stretching after sitting, pushing up the glasses, and the like. Besides the specific actions described above, the 70 people also perform these meaningless actions a certain number of times. A plurality of sample sensor data sequences are obtained in the sample collection manner described above. Labels are then added to the sensor data: the start time of each action can be recorded during data collection, and, when labelling, the label of an action is added to all sensor data within the time period in which that action was performed; optionally, an action start label may also be added to the sensor data corresponding to the action start point, and an action end label to the sensor data corresponding to the action end point. Sensor data other than those corresponding to the specific actions described above are labelled as meaningless actions.
Next, the sample sensor data sequence is processed to obtain a plurality of sample data. Specifically, the sample sensor data sequence is divided into a plurality of sample data according to a preset sliding window and a sliding step length. The division manner is similar to the manner of acquiring the target sensor data described above and is not repeated here. After the plurality of sample data are obtained, each sample datum is labelled according to the labels of the sensor data it contains: when no less than a preset proportion of the data in one sample belongs to a certain action, the label of that action is added to the sample; otherwise, when the proportion of the sensor data in a sample belonging to any action is less than the preset proportion, the label of a meaningless action is added to the sample.
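A possible implementation of this labelling rule is sketched below. The value of the preset proportion is an assumption, and labels are taken to be integers with 0 reserved for the meaningless action.

```python
from collections import Counter

MEANINGLESS = 0        # label of the meaningless action
MIN_PROPORTION = 0.6   # preset proportion (assumed value)

def label_sample(per_point_labels):
    """Label one sample window from the per-point action labels of the sensor data it contains."""
    label, count = Counter(per_point_labels).most_common(1)[0]
    if label != MEANINGLESS and count / len(per_point_labels) >= MIN_PROPORTION:
        return label
    return MEANINGLESS
```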
After the action label of each sample datum is generated, the action recognition model is trained according to the plurality of sample data and the corresponding action labels to determine the model parameters. The model can be trained with an Adam optimizer and a cross-entropy loss function, and evaluation uses a combination of at least two of the F1 score, accuracy, precision, and recall for a comprehensive assessment.
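A condensed training-loop sketch under these choices follows. PyTorch is assumed; `model` stands for the network described below, `train_loader` yields batches of labelled sliding-window samples, and all names and hyper-parameter values are illustrative.

```python
import torch
import torch.nn as nn

def train_model(model, train_loader, epochs=30, lr=1e-3, device="cpu"):
    model.to(device)
    criterion = nn.CrossEntropyLoss()                        # cross-entropy loss function
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam optimizer
    for _ in range(epochs):
        model.train()
        for windows, labels in train_loader:                 # windows: (batch, 3, window_len)
            windows, labels = windows.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(windows), labels)
            loss.backward()
            optimizer.step()
    return model
```

Evaluation (F1 score, accuracy, precision, recall) would then be computed on a held-out split, for example with standard metric utilities.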
In order to help the reader understand the action recognition model of this solution, a specific network structure is described below as an example. Referring to fig. 3, fig. 3 is a schematic network structure diagram of the action recognition model in the device control method according to the embodiment of the present application. In one embodiment, a lightweight CNN network model is used for real-time action recognition. The number of channels in the input layer of the network is 3, corresponding to the three-channel input vector. This is followed by the first dual convolutional layer, which includes two convolutional layers with the same structure and the same hyper-parameters but different weights, where each convolutional layer has 64 channels and a convolution kernel size of 5. The first dual convolutional layer is followed by a batch normalization layer and a pooling layer, both with 64 channels. Then comes the second dual convolutional layer, which includes two convolutional layers with the same structure and the same hyper-parameters but different weights, where each convolutional layer has 128 channels and a convolution kernel size of 5. The second dual convolutional layer is followed by a batch normalization layer and a pooling layer, both with 128 channels. The last layer is a fully connected layer, whose number of channels is set according to the number of actions. The size of the convolution kernel matches the size of the input feature; for example, if the input feature vector of the first convolutional layer has size (1 × 5), the convolution kernel of the first convolutional layer also has size (1 × 5).
It is understood that in other embodiments, no batch normalization layer may be provided in the network.
The first dual convolutional layer extracts features from the collected sensor data of a preset sliding window on the time series; the first pooling layer reduces the dimensionality of the window data on the time series and provides a certain degree of translation invariance; the second dual convolutional layer further extracts higher-order features from the window data on the time series and increases the dimensionality after the feature scale has been reduced, so as to preserve the richness of the information; the second pooling layer aggregates the features detected at each position of the window data on the time series and enhances translation invariance; the fully connected layer converts all the features into logits values for each action category; and the Softmax layer converts the logits values into probability values that sum to 1.
Although not shown in fig. 3, a ReLU activation function is added after each convolutional layer to enhance the non-linear capability of the action recognition model.
In some embodiments, a Dropout layer with a probability of 0.5 is added before the fully connected layer during training of the action recognition model. In the action recognition model, the Dropout layer randomly sets the values of half of the neurons to 0 and predicts the result from the remaining neurons, which enhances the generalization capability of the action recognition model.
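The structure of fig. 3 can be sketched roughly as follows, as a minimal PyTorch illustration under the layer sizes given above; the number of action categories, the window length, the padding, and the pooling size are assumptions.

```python
import torch.nn as nn

class ActionRecognitionModel(nn.Module):
    def __init__(self, in_channels=3, num_classes=3, window_len=80):
        super().__init__()
        self.features = nn.Sequential(
            # first dual convolutional layer: two convolutional layers, 64 channels, kernel size 5
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.BatchNorm1d(64),   # batch normalization layer, 64 channels
            nn.MaxPool1d(2),      # pooling layer
            # second dual convolutional layer: two convolutional layers, 128 channels, kernel size 5
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.BatchNorm1d(128),  # batch normalization layer, 128 channels
            nn.MaxPool1d(2),      # pooling layer
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),      # Dropout before the fully connected layer (active during training only)
            nn.Linear(128 * (window_len // 4), num_classes),  # fully connected layer
        )

    def forward(self, x):         # x: (batch, 3, window_len)
        return self.classifier(self.features(x))  # logits; Softmax is applied by the loss or at inference
```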
In the embodiment of the application, the loss function adopted in the action recognition model is a cross-entropy loss function.
For an observation sample i, the cross-entropy loss is:

$$\mathcal{L}_i = -\sum_{c=1}^{M} y_{ic}\,\log\left(p_{ic}\right)$$

and the overall loss is averaged over all observation samples, where: M is the number of action categories; $y_{ic}$ is an indicator variable (0 or 1) that is 1 if action category c is the true category of observation sample i and 0 otherwise; and $p_{ic}$ is the predicted probability that observation sample i belongs to action category c.
The action recognition model is iteratively trained based on the cross-entropy loss function: the gradients of the action recognition model are computed, and the parameters of the action recognition model are updated by stochastic gradient descent until the maximum number of iterations is reached, giving the trained action recognition model. The action recognition model is a convolutional neural network model.
Alternatively, in another embodiment, the motion recognition model includes a first convolutional layer, a second convolutional layer, a maximum pooling layer, a third convolutional layer, a fourth convolutional layer, a global average pooling layer, a fully-connected layer, and a softmax layer, which are connected in sequence.
The hyper-parameters of the four convolutional layers can be set as required, for example, the first convolutional layer and the second convolutional layer are convolutional layers with convolution kernel of 7 and channel number of 64; the third convolutional layer and the fourth convolutional layer are convolutional layers having a convolutional kernel of 7 and a channel number of 128. In other embodiments, the first convolutional layer and the second convolutional layer may have different structures and hyper-parameters, and the third convolutional layer and the fourth convolutional layer may have different structures and hyper-parameters.
Referring to fig. 4, fig. 4 is a schematic diagram of another network structure of an action recognition model in the device control method according to the embodiment of the present application.
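Under the same assumptions, the alternative structure of fig. 4 might be sketched as below. Global average pooling makes the fully connected layer independent of the window length; the Softmax layer is included because the description lists it explicitly, while the activation functions and the number of action categories are assumptions.

```python
import torch.nn as nn

alternative_model = nn.Sequential(
    nn.Conv1d(3, 64, kernel_size=7, padding=3), nn.ReLU(),     # first convolutional layer
    nn.Conv1d(64, 64, kernel_size=7, padding=3), nn.ReLU(),    # second convolutional layer
    nn.MaxPool1d(2),                                            # maximum pooling layer
    nn.Conv1d(64, 128, kernel_size=7, padding=3), nn.ReLU(),   # third convolutional layer
    nn.Conv1d(128, 128, kernel_size=7, padding=3), nn.ReLU(),  # fourth convolutional layer
    nn.AdaptiveAvgPool1d(1),                                    # global average pooling layer
    nn.Flatten(),
    nn.Linear(128, 3),                                          # fully connected layer (3 = assumed number of actions)
    nn.Softmax(dim=1),                                          # softmax layer
)
```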
It is to be understood that the network structures shown in fig. 3 and 4 are merely exemplary, and in other embodiments, other network structures may be adopted as long as the features on the time axis can be extracted from the target sensor data, and the motion category can be identified based on the features.
It should be noted that this network structure is only an example; in practical applications, the number of convolutional layers, the convolution kernel sizes, the numbers of convolutional-layer channels, and other hyper-parameters may also be set according to the complexity of the input data (such as the number of channels and the vector length).
It should also be understood that, because the input data of the model are one-dimensional vectors, the convolution kernels in the convolutional layers are likewise in the form of one-dimensional vectors.
103. And when determining that the action occurs according to the action classification result, controlling the target equipment to execute the operation corresponding to the action.
After the action classification result is obtained, whether a specific action has occurred is determined according to the action classification result. For example, when the action classification result is a specific action, the current action of the user is determined to be that action, and the target device is controlled to perform the operation corresponding to the action. Otherwise, when the action classification result corresponds to a meaningless action, no operation is performed.
In this embodiment, the association between actions and operations is pre-constructed; for example, in an embodiment, the nodding action is associated with a confirmation operation and the head-shaking action with a cancellation operation. Controlling the target device to perform the operation corresponding to an action, when it is determined from the action classification result that the action has occurred, includes: when it is determined from the action classification result that a nodding action has occurred, controlling the target device to perform the confirmation operation corresponding to the nodding action; and when it is determined from the action classification result that a head-shaking action has occurred, controlling the target device to perform the cancellation operation corresponding to the head-shaking action. When the action classification result is not one of the actions for which an association with a specific operation has been pre-constructed, the current action is judged to be a meaningless action, and no operation is performed.
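As a toy illustration of such a pre-constructed association (the `device` interface and the label strings are hypothetical):

```python
# Pre-constructed association between recognised actions and operations (example mapping).
ACTION_TO_OPERATION = {
    "nod": lambda device: device.confirm(),    # nodding action -> confirmation operation
    "shake": lambda device: device.cancel(),   # head-shaking action -> cancellation operation
}

def handle_action(action, device):
    operation = ACTION_TO_OPERATION.get(action)
    if operation is None:                      # meaningless or unmapped action: do nothing
        return
    operation(device)
```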
Wherein, in different scenarios, the confirm operation and the cancel operation may correspond to different specific operations.
For example, in an embodiment, if the operations corresponding to the nodding motion and the shaking motion are a page turning operation and a return operation, when it is determined that the nodding motion occurs, the electronic device is controlled to respond to the page turning operation corresponding to the nodding motion; and when the shaking motion is determined to occur, controlling the electronic equipment to respond to the return operation corresponding to the shaking motion. Optionally, when it is determined that the nodding action occurs, the electronic device may also be controlled to respond to a return operation corresponding to the nodding action; when the shaking motion is determined to occur, the electronic equipment can also be controlled to respond to the page turning operation corresponding to the shaking motion.
Specifically, a method for giving action judgment according to the action classification result is preset in the electronic device, and is called a decision strategy herein. The electronic equipment judges whether an action occurs or not by combining an action classification result output by the action recognition model and a preset action judgment decision strategy, and controls the electronic equipment to respond to a page turning operation or a returning operation corresponding to the nodding action if the nodding action occurs according to the decision strategy; and if the shaking motion is judged to occur through the decision strategy, controlling the electronic equipment to respond to a return operation or a page turning operation corresponding to the shaking motion.
Optionally, when determining whether an action has occurred, it may further be detected whether the amplitude of the action exceeds a preset amplitude, and the action is determined to have occurred only if it does. For example, the amplitude of the nodding action is set to moving the head 10 degrees downward toward the ground: if the head of the user is detected to move downward by more than 10 degrees, it is determined that a nodding action has occurred; if not, it is determined that no nodding action has occurred. Correspondingly, when determining whether a head-shaking action has occurred, it is detected whether the amplitude of the leftward or rightward movement of the user's head exceeds a preset amplitude, and a head-shaking action is determined to have occurred only when it does. For example, the preset amplitude of the head-shaking action may be set in advance to moving 40 degrees to the left or right: when the amplitude of the leftward or rightward movement of the user's head is detected to exceed 40 degrees, it can be determined that a head-shaking action has occurred, as shown in fig. 5; if the amplitude does not exceed 40 degrees, it is determined that no head-shaking action has occurred, as shown in fig. 6.
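A sketch of this optional amplitude check, using the example thresholds above, is given below; how the pitch and yaw deltas are obtained from the sensor data is left out and would depend on the sensor.

```python
NOD_THRESHOLD_DEG = 10.0     # downward pitch amplitude for a nodding action (example value)
SHAKE_THRESHOLD_DEG = 40.0   # leftward/rightward yaw amplitude for a head-shaking action (example value)

def confirm_amplitude(action, pitch_delta_deg, yaw_delta_deg):
    """Return True only if the classified action also exceeds its preset amplitude."""
    if action == "nod":
        return pitch_delta_deg >= NOD_THRESHOLD_DEG
    if action == "shake":
        return abs(yaw_delta_deg) >= SHAKE_THRESHOLD_DEG
    return False
```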
In some embodiments, in order to improve the accuracy of action recognition and prevent 0-1-0 jumps in the output, after an action classification result is obtained, a comprehensive judgment is made by combining it with the results previously output by the model: when the action classification results output a preset number of consecutive times are the same, it is judged that the action has occurred, and the target device is controlled to perform the operation corresponding to the action.
The target device may be the electronic device itself, or may be another device other than the electronic device, for example, an external device that establishes a connection with the electronic device, or the like. For example, when the electronic device is a pair of smart glasses, when it is determined that an action occurs according to the action classification result, the smart glasses are controlled to execute a corresponding operation; or when determining that an action occurs according to the action classification result, the smart glasses generate a corresponding control instruction according to the judgment result, and send the control instruction to an external device, for example, a smart phone which establishes network connection with the smart glasses, so as to control the smart phone to execute a corresponding operation.
When the user wears smart glasses, control through a button or touch area on the smart glasses is inconvenient because the user cannot directly see the touch position. Moreover, because the user is wearing the smart glasses, connected devices such as a smartphone cannot be operated directly either. With the scheme of the present application, the smart glasses recognize the user's head actions from the sensor data, so that the user can quickly and conveniently control the target device through head actions.
For example, in an embodiment, when it is determined that an action occurs according to the action classification result, the controlling target device performs an operation corresponding to the action, including: when the head action is determined to occur according to the action classification result, controlling a notification bar of the target device to be unfolded; and when the shaking motion is determined to occur according to the motion classification result, controlling the notification bar to retract.
In this embodiment, the preset action categories include a nodding action, a shaking action, and a raising action, where the nodding action is used to control the notification bar of the target device to be expanded downward. The shaking motion is used for controlling the notification bar to be folded upwards, and the shaking motion generally means that the head part rotates leftwards or rightwards.
Further, in another embodiment, when it is determined that an action occurs according to the action classification result, the controlling target device executes an operation corresponding to the action, which may further include: and displaying the state information of the degree of freedom when the head-up action is determined to occur according to the action classification result. The head-up motion is used to control the target device to display DOF (degree of freedom) state information, such as 3DOF state information or 6DOF state information.
For example, please refer to fig. 7, and fig. 7 is a schematic view illustrating a scene in which a user wears a headset according to an embodiment of the present application. The head-mounted device may be VR glasses, AR glasses, or the like. When a user wears the head-mounted equipment, if the user needs to check information in the notification bar, the head-mounted equipment performs head nodding action, acquires sensing data acquired by the motion sensor, intercepts target sensor data with a preset sliding window length from the sensing data, obtains an action recognition result according to the target sensor data and the action recognition model, and controls the notification bar of the head-mounted equipment to be downwards unfolded when the current action is determined as the head nodding action according to the action recognition result so that the user can check the information of the notification bar. In addition, it can be understood that, if the head-mounted device is used in connection with the intelligent terminal, when the head-mounted device determines that the current action is the nodding action, the head-mounted device generates a corresponding control instruction and sends the control instruction to the intelligent terminal, and a notification bar of a user interface on the intelligent terminal is expanded downwards so that a user can view information of the notification bar.
Or, in another embodiment, when it is determined that an action occurs according to the action classification result, the controlling target device may execute an operation corresponding to the action, and the method may further include: and displaying the state information of the degree of freedom when the head-up action is determined to occur according to the action classification result and a touch instruction triggered based on a preset touch area is detected.
For example, if the user wants to view the degree-of-freedom state information, the head-up operation is performed, and meanwhile, the touch instruction is triggered based on the preset touch area, for example, the touch instruction is triggered by pressing the preset touch area.
In particular implementation, the present application is not limited by the execution sequence of the described steps, and some steps may be performed in other sequences or simultaneously without conflict.
As can be seen from the above, the device control method provided in the embodiment of the present application acquires target sensor data from a sensor data sequence according to a preset sliding window, processes the target sensor data according to an action recognition model to obtain an action classification result corresponding to the target sensor data, and, when it is determined from the action classification result that an action has occurred, controls the electronic device to perform the operation corresponding to the action. In this way, whether the user has performed an action can be recognized directly from the data detected by the motion sensor, and the corresponding operation is then carried out, so the user does not need to control the electronic device manually, which improves the control efficiency of the electronic device.
Referring to fig. 8, fig. 8 is a second flowchart illustrating an apparatus control method according to an embodiment of the present disclosure. The method comprises the following steps:
201. and acquiring target sensor data from the sensor data sequence according to a preset sliding window.
During operation of the electronic device, the motion sensor may collect sensor data at a preset frequency. For example, if the motion sensor collects sensor data at a frequency of 50 Hz, 50 sensor data points are collected per second. The motion sensor collects sensor data at the preset frequency and continuously outputs them in order, and the sensor data form a sensor data sequence according to the time order of collection. The electronic device receives the sensor data item by item at the rate sent by the motion sensor; when the interval since the previous window reaches the preset step length, the latest sensor data point is taken as the window end point, and the most recent data are extracted from the sensor data sequence according to the preset sliding window as the target sensor data. In this manner, target sensor data are continuously extracted from the sensor data sequence, and action recognition is then performed in real time based on the received sensor data.
202. And processing the target sensor data according to the action recognition model to obtain an action classification result corresponding to the target sensor data.
After the target sensor data is acquired, the data is input into the motion recognition model for motion recognition. The motion recognition model is obtained by training according to sample data. The action recognition model is a classification model built based on a convolutional neural network, and can recognize which action the current action belongs to according to input target sensor data. For the specific recognition principle of the motion recognition model, please refer to the previous embodiment, which is not described herein again.
203. And when the action classification result is a preset action type, acquiring a preset number of historical action classification results output by the action recognition model before the current action classification result.
204. And taking the classification result with the largest quantity in the preset quantity of historical action classification results and the action classification results of the current time as a target classification result.
In order to improve the accuracy of action recognition and prevent the occurrence of 0-1-0 jump, after obtaining an action classification result, if the action classification result is a preset action category, obtaining a preset number of historical action classification results output by an action recognition model before the action classification result of the current time, for example, obtaining a plurality of adjacent continuous historical action classification results before the action classification result output at the current time, such as 2-10 historical action classification results, and integrating the preset number of historical action classification results and the action classification result of the current time to take the classification result with the largest number as a target classification result. Compared with the scheme of classifying only through a single recognition result, the recognition mode has higher accuracy.
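A minimal sketch of steps 203-204 is given below; the history length of 5 stands in for the "preset number" and is an assumption.

```python
from collections import Counter, deque

HISTORY_LEN = 5                       # preset number of historical classification results (example value)
_history = deque(maxlen=HISTORY_LEN)  # most recent classification results output by the model

def target_classification(current_result):
    """Combine the current result with the stored history and return the majority result."""
    results = list(_history) + [current_result]
    _history.append(current_result)
    return Counter(results).most_common(1)[0][0]   # classification result with the largest count
```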
205. And when determining that the action occurs according to the target classification result, controlling the target equipment to execute the operation corresponding to the action.
And after the target action classification result is obtained, determining whether an action occurs according to the target action classification result. For example, when the target action classification result is a certain action, the current action of the user is determined as the action, and the target device is controlled to respond to the operation corresponding to the action. Otherwise, when the target motion classification result corresponds to a meaningless motion, no operation is performed.
Therefore, when an action is judged to have occurred according to consecutive detection results of the action recognition model, the operation corresponding to the action is responded to; the user does not need to manually control the electronic device, and the control efficiency of the electronic device is improved.
Referring to fig. 9, fig. 9 is a third flowchart illustrating an apparatus control method according to an embodiment of the present disclosure. The method comprises the following steps:
301. and acquiring target sensor data from the sensor data sequence according to a preset sliding window.
302. And processing the target sensor data according to the action recognition model to obtain an action classification result corresponding to the target sensor data.
For the intercepting manner of the target sensor data and the specific recognition principle of the motion recognition model, please refer to the previous embodiment, which is not described herein again.
303. And when the action classification result is a preset action type, acquiring a preset number of historical action classification results output by the action recognition model before the current action classification result.
304. And taking the classification result with the largest quantity in the preset quantity of historical action classification results and the action classification results of the current time as a target classification result.
In order to improve the accuracy of action recognition and prevent 0-1-0 jumps, after the action classification result is obtained, a preset number of historical action classification results output by the action recognition model before the current result are obtained. For example, several adjacent, consecutive historical action classification results preceding the current result are obtained, such as 2-10 historical results. The preset number of historical action classification results and the current action classification result are then considered together, and the classification result that appears most often is taken as the target classification result. Compared with a scheme that classifies based on a single recognition result, this recognition mode has higher accuracy.
305. And when the action is determined to occur according to the target classification result, judging whether the target classification result corresponds to the action starting point.
In this embodiment, in order to further improve the accuracy of motion recognition and prevent misjudgment, when it is determined that a motion occurs according to the target classification result, it is determined whether the target classification result corresponds to the action start point. There are various ways to determine whether the classification result corresponds to the action start point. In the first way, the action recognition model itself is used: since the model training data carry labels for action start points, an extra output dimension can be added to the model, and this output indicates whether the current classification result corresponds to an action start point. In the second way, after the current target classification result is obtained, the previous target classification result is obtained and the two are compared; if the current target classification result differs from the previous one, the current target classification result is judged to correspond to the action start point, otherwise it is judged not to correspond to the action start point.
If the target classification result corresponds to the action start point, then 306 is performed. If the target classification result does not correspond to the action start point, 307 is executed.
306. The recording action length is started.
307. And updating the action length, and judging whether the updated action length is greater than the preset length.
308. And when the updated action length is larger than the preset length, controlling the target equipment to execute the operation corresponding to the action.
If the current target classification result corresponds to the action start point, the target device is not yet controlled to execute the operation corresponding to the action; instead, recording of the action length is started. If the current target classification result does not correspond to the action start point, the action length is updated, and it is judged whether the updated action length is greater than the preset length. If the updated action length is greater than the preset length, that is, when the same target classification result has been detected multiple consecutive times, the target device is controlled to execute the operation corresponding to the action. In this way, the accuracy of controlling the target device by the action can be improved.
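The following sketch puts steps 305-308 together, using the second start-point determination method (comparing the current target classification result with the previous one); the preset length of 3 is an assumed value, not one given in the disclosure:

```python
class ActionGate:
    """Illustrative gate for steps 305-308; min_length stands in for the
    preset length and is an assumption."""
    def __init__(self, min_length=3):
        self.min_length = min_length
        self.previous = None
        self.length = 0

    def update(self, target, execute_operation):
        if target != self.previous:
            # Target result changed: treat this as the action start point
            # and only start recording the action length (step 306).
            self.length = 1
        else:
            # Same result again: update the action length (step 307).
            self.length += 1
            if self.length == self.min_length + 1:
                # Action length now exceeds the preset length: respond to
                # the action exactly once (step 308).
                execute_operation(target)
        self.previous = target
```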
Therefore, when an action is judged to have occurred according to multiple consecutive detection results of the action recognition model, whether the action really occurs is further judged by combining multiple target classification results. When the judgment is positive, the operation corresponding to the action is responded to; the user does not need to manually control the electronic device, and the control efficiency of the electronic device is improved.
In some embodiments, obtaining the target sensor data from the sensor data sequence according to the preset sliding window includes: when a specified event is detected, starting to acquire the sensor data sequence collected by the motion sensor in the target device, and obtaining the target sensor data from the sensor data sequence according to the preset sliding window.
In one embodiment, the sensor data is acquired by an acceleration sensor built in the head-mounted device when the head-mounted device is in a wearing state. Wherein the head-mounted equipment is a wireless Bluetooth headset; the acceleration sensor is a triaxial acceleration sensor.
Therefore, before acquiring the preset sliding-window sensor data in the time sequence, it is necessary to detect whether the wireless Bluetooth headset is in the wearing state. Optionally, whether the wireless Bluetooth headset is in the wearing state may be detected by an infrared sensor arranged in the wireless Bluetooth headset. It should be noted that when the wireless Bluetooth headset is worn on a person's ear, some areas are shielded; the infrared sensor can therefore be arranged in an area that is shielded when the headset is worn. Whether the infrared signal emitted by the infrared sensor is shielded can then be determined by detecting the state value returned by the infrared sensor, so as to determine whether the wireless Bluetooth headset is in the wearing state or the non-wearing state. It will be appreciated that the wireless Bluetooth headset is determined to be in the wearing state when the returned state value indicates that the infrared signal is shielded, and in the non-wearing state when the returned state value indicates that the infrared signal is not shielded.
Furthermore, after the wireless Bluetooth headset is determined to be in the wearing state, whether the wireless Bluetooth headset is in the power-on state can also be acquired. The state of the wireless Bluetooth headset may include at least a locked state, an unlocked state, a power-off state, a power-on state, a dormant state, or a combination of several states, for example a locked and power-off state, an unlocked and power-on state, and the like, which is not limited herein. Specifically, when the wireless Bluetooth headset is in the locked state, its function keys or function buttons are not operable, which prevents accidental touches by the user and use by others without the user's permission; when the wireless Bluetooth headset is in the unlocked state, its function keys or function buttons are operable, so that the user can conveniently adjust functions of the headset such as volume up and volume down; when the wireless Bluetooth headset is in the power-on state, it can currently be used; when it is in the power-off state, it cannot currently be used; and when it is in the dormant state, it is currently in a standby working state.
In the embodiment of the application, whether the wireless bluetooth headset is in the power-on state or not can be determined by detecting whether a plurality of function keys or function buttons of the wireless bluetooth headset are in the normal working state or whether the wireless bluetooth headset can be used currently. Specifically, if it is detected that a plurality of function keys or function buttons of the wireless bluetooth headset are in a normal working state, or it is detected that the wireless bluetooth headset can be currently used, it is determined that the wireless bluetooth headset is in a power-on state.
By the above method, after the wireless Bluetooth headset is determined to be in the power-on state and the wearing state, whether a specified event occurs can be detected; when the specified event is detected, acquisition of data from the acceleration sensor built into the wireless Bluetooth headset is started, so as to acquire the preset sliding-window sensor data in the time sequence. The specified event can be any preset event that triggers the wireless Bluetooth headset to send sensor data to the electronic device, such as an incoming call event.
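As a hedged sketch of this gating, the check below only starts accelerometer collection when the headset is worn (infrared signal shielded), powered on, and the specified event has been detected; all three callables are hypothetical names, since the disclosure does not define such an API:

```python
def should_start_collection(ir_signal_shielded, is_powered_on, specified_event_detected):
    """Return True only when sensor data acquisition should begin:
    wearing state + power-on state + specified event detected."""
    return bool(ir_signal_shielded()) and is_powered_on() and specified_event_detected()
```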
Alternatively, whether a specified event is detected may be determined by detecting whether the electronic device receives a particular identifier. Specifically, different identifiers can be set for different events in advance; when an event occurs, the electronic device first receives the identifier corresponding to that event, so whether an event has been detected can be judged by detecting whether an identifier has been received. Further, the electronic device may recognize the received identifier to determine whether it is the identifier of the specified event, and thereby determine whether the specified event is detected.
For example, different identifiers are set for different events in advance, where an event may include an incoming call event, a power-on event, a message receiving event, and the like. Specifically, the identifiers corresponding to the events may be set as follows: the identifier corresponding to the incoming call event is set to LDSJ, the identifier corresponding to the power-on event is set to KJSJ, and the identifier corresponding to the message receiving event is set to XXSJ. Furthermore, a specified event can be set in the electronic device in advance, and when the specified event is detected, the sensor data acquired by the acceleration sensor in the wireless Bluetooth headset are acquired. For example, if the specified event is set in advance to the incoming call event, the electronic device begins to recognize the identifier after receiving it and determining that an event has occurred; if the identifier is recognized as LDSJ, it is determined that the specified event is detected.
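A small sketch of this identifier-based event detection; the identifier strings follow the example above, while the mapping structure and function names are assumptions:

```python
EVENT_IDS = {
    "LDSJ": "incoming_call",       # incoming call event
    "KJSJ": "power_on",            # power-on event
    "XXSJ": "message_received",    # message receiving event
}

SPECIFIED_EVENT = "incoming_call"  # configured in advance

def on_identifier_received(identifier, start_sensor_collection):
    """Recognize the received identifier and, if it names the specified
    event, start acquiring accelerometer data from the headset."""
    if EVENT_IDS.get(identifier) == SPECIFIED_EVENT:
        start_sensor_collection()
```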
In an embodiment, processing the target sensor data according to the motion recognition model to obtain a motion classification result corresponding to the target sensor data includes: acquiring a value of data to be input, wherein the value of the data to be input is an average value of the values of all channels of the target sensor data; inputting the value of the data to be input into the action recognition model, and acquiring the action classification result output by the action recognition model; and performing zeroing processing on the data to be input.
In the embodiment of the present application, since the acceleration sensor is a triaxial acceleration sensor, when the wireless Bluetooth headset is in the power-on state and the wearing state and the specified event is detected, the triaxial acceleration sensor built into the wireless Bluetooth headset starts to acquire sensor data in the X, Y and Z dimensions.
By the above method, after the preset sliding-window sensor data in the time sequence are acquired, the average value of the X, Y, Z three-channel values of each sensor data item in the preset sliding-window sensor data is obtained, and this average value is used as the value of the data to be input, where the data to be input are the sensor data that need to be input into the action recognition model for action recognition.
The value of the data to be input acquired in the above manner is input into the motion recognition model, and the data to be input is recognized through the motion recognition model, so that a motion classification result output by the motion recognition model can be acquired.
In addition, in order to avoid the previous sensor data influencing the sensor data subsequently input into the motion recognition model, after the value of the data to be input has been input into the motion recognition model, the data to be input are zeroed. When data to be input are acquired again, their value is set to the average value of the values of all channels of the currently acquired preset sliding-window sensor data to obtain new data to be input, and the new data to be input are input into the action recognition model for action recognition; this operation is repeated throughout the action recognition process.
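A minimal sketch of the channel averaging and zeroing described above, assuming the per-sample mean of the X, Y and Z channels is used as the value of the data to be input (one plausible reading of the text); `model` is a hypothetical callable returning the action classification result:

```python
import numpy as np

def recognize_window(window_xyz, model):
    """window_xyz: array of shape (num_samples, 3) from the 3-axis sensor."""
    data_to_input = np.asarray(window_xyz, dtype=np.float32)
    values = data_to_input.mean(axis=1)      # mean of X, Y, Z per sample
    result = model(values[np.newaxis, :])    # action classification result
    data_to_input[:] = 0.0                   # zeroing processing of the input
    return result
```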
In the device control method provided by this embodiment, when a specified event is detected, acquisition of the sensor data collected by the acceleration sensor in the head-mounted device is started so as to acquire the preset sliding-window sensor data in the time sequence; the value of the data to be input is then acquired, where the value of the data to be input is the average value of the values of all channels of the preset sliding-window sensor data; the value of the data to be input is input into the action recognition model, the action classification result output by the action recognition model is acquired, and the data to be input are then zeroed. By this method, the value of the data to be input is input into the action recognition model for action recognition and the data to be input are then zeroed, so that the influence of the previous action recognition result on the next action recognition can be avoided and the accuracy of action recognition is improved.
In some embodiments, the designated event is an incoming call event, and the target device is a communication device; when determining that the action occurs according to the action classification result, controlling the target device to execute an operation corresponding to the action, including: when the nodding action is determined to occur according to the action classification result, controlling the communication equipment to respond to the incoming call access operation corresponding to the nodding action; and when the shaking motion is determined to occur according to the motion classification result, controlling the communication equipment to respond to incoming call rejection operation corresponding to the shaking motion.
In this embodiment, when it is determined according to the decision policy that a nodding action has occurred, the electronic device is controlled to invoke the interface to respond to the incoming call access operation corresponding to the nodding action. When it is determined that a head shaking motion has occurred, the electronic device is controlled to respond to the incoming call rejection operation corresponding to the shaking motion. In this way, the user controls answering and rejecting calls through nodding and head shaking actions, which solves the problem that in certain scenarios it is inconvenient for the user to perform touch screen operations by hand. Such application scenarios may include: operating a remote device, scenarios where the hands are occupied by other matters, and private interaction scenarios for people with disabilities or when voice operation is inconvenient. In addition, this method provides a brand-new basic interaction mode and more interaction choices for the user, which helps improve the expressive power of each electronic device and improves the user experience.
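A hedged sketch of this incoming-call control; the action labels and the call-control callbacks are assumed names rather than APIs defined by the disclosure:

```python
def handle_call_action(target_action, accept_call, reject_call):
    """Map the recognized head action to the incoming-call operation."""
    if target_action == "nod":
        accept_call()      # nodding -> answer the incoming call
    elif target_action == "shake":
        reject_call()      # head shaking -> reject the incoming call
    # any other or meaningless action: do nothing
```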
In one embodiment, an apparatus control device is also provided. Referring to fig. 10, fig. 10 is a schematic structural diagram of an apparatus control device 400 according to an embodiment of the present disclosure. The device control apparatus 400 is applied to an electronic device, and the device control apparatus 400 includes a data acquisition module 401, an action recognition module 402, and a device control module 403, as follows:
a data obtaining module 401, configured to obtain target sensor data from a sensor data sequence according to a preset sliding window;
the action recognition module 402 is configured to process the target sensor data according to an action recognition model to obtain an action classification result corresponding to the target sensor data;
and the device control module 403 is configured to, when it is determined that an action occurs according to the action classification result, control the target device to respond to an operation corresponding to the action.
It should be noted that the device control apparatus provided in this embodiment of the present application and the device control method in the foregoing embodiment belong to the same concept, and any method provided in the device control method embodiment may be implemented by the device control apparatus, and specific implementation processes thereof are described in detail in the device control method embodiment and will not be described herein again.
As can be seen from the above, the device control apparatus provided in the embodiment of the present application obtains the target sensor data from the sensor data sequence according to the preset sliding window, processes the target sensor data according to the motion recognition model to obtain the motion classification result corresponding to the target sensor data, and controls the electronic device to respond to the operation corresponding to the motion when it is determined that the motion occurs according to the motion classification result. By this method, whether the user has performed an action can be identified directly from the data detected by the motion sensor, so that the operation corresponding to the action is responded to; the user does not need to manually control the electronic device, and the control efficiency of the electronic device is improved.
The embodiment of the application also provides the electronic equipment. The electronic device can be a smart phone, a tablet computer and the like. Referring to fig. 11, fig. 11 is a first structural schematic diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 500 comprises a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is a control center of the electronic device 500, connects various parts of the whole electronic device by using various interfaces and lines, and performs various functions of the electronic device and processes data by running or calling a computer program stored in the memory 502 and calling data stored in the memory 502, thereby performing overall monitoring of the electronic device.
The memory 502 may be used to store computer programs and data. The memory 502 stores computer programs containing instructions executable in the processor. The computer program may constitute various functional modules. The processor 501 executes various functional applications and data processing by calling a computer program stored in the memory 502.
In this embodiment, the processor 501 in the electronic device 500 loads instructions corresponding to one or more processes of the computer program into the memory 502, and the processor 501 runs the computer program stored in the memory 502, so as to implement various functions as follows:
acquiring target sensor data from a sensor data sequence according to a preset sliding window;
processing the target sensor data according to a motion recognition model to obtain a motion classification result corresponding to the target sensor data;
and when determining that the action occurs according to the action classification result, the control target device responds to the operation corresponding to the action.
In some embodiments, please refer to fig. 12, and fig. 12 is a second structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 500 further includes: radio frequency circuit 503, display 504, control circuit 505, input unit 506, audio circuit 507, sensor 508, and power supply 509. The processor 501 is electrically connected to the radio frequency circuit 503, the display 504, the control circuit 505, the input unit 506, the audio circuit 507, the sensor 508, and the power supply 509.
The radio frequency circuit 503 is used for transceiving radio frequency signals to communicate with a network device or other electronic devices through wireless communication.
The display screen 504 may be used to display information input by or provided to the user as well as various graphical user interfaces of the electronic device, which may be comprised of images, text, icons, video, and any combination thereof.
The control circuit 505 is electrically connected to the display 504 and is configured to control the display 504 to display information.
The input unit 506 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint), and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 506 may include a fingerprint recognition module.
The sensor 508 is used to collect external environmental information. The sensors 508 may include one or more of ambient light sensors, acceleration sensors, gyroscopes, and the like.
The power supply 509 is used to power the various components of the electronic device 500. In some embodiments, power supply 509 may be logically coupled to processor 501 through a power management system to manage charging, discharging, and power consumption management functions through the power management system.
Although not shown in the drawings, the electronic device 500 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
In this embodiment, the processor 501 in the electronic device 500 loads instructions corresponding to one or more processes of the computer program into the memory 502, and the processor 501 runs the computer program stored in the memory 502 so as to implement the various functions described above.
as can be seen from the above, an embodiment of the present application provides an electronic device, which obtains target sensor data from a sensor data sequence according to a preset sliding window, processes the target sensor data according to a motion recognition model to obtain a motion classification result corresponding to the target sensor data, and controls the electronic device to respond to an operation corresponding to a motion when it is determined that the motion occurs according to the motion classification result. By the method, whether the user takes the action or not can be directly identified through the data detected by the motion sensor, so that the operation corresponding to the action is responded, the user does not need to manually control the electronic equipment, and the control efficiency of the electronic equipment is improved.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program runs on a computer, the computer executes the apparatus control method according to any of the above embodiments.
It should be noted that, all or part of the steps in the methods of the above embodiments may be implemented by hardware related to instructions of a computer program, which may be stored in a computer readable storage medium, which may include, but is not limited to: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Furthermore, the terms "first", "second", and "third", etc. in this application are used to distinguish different objects, and are not used to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules listed, but rather, some embodiments may include other steps or modules not listed or inherent to such process, method, article, or apparatus.
The device control method, the device, the storage medium, and the electronic device provided in the embodiments of the present application are described in detail above. The principle and the implementation of the present application are explained herein by applying specific examples, and the above description of the embodiments is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.
Claims (15)
1. An apparatus control method characterized by comprising:
acquiring target sensor data from a sensor data sequence according to a preset sliding window;
processing the target sensor data according to a motion recognition model to obtain a motion classification result corresponding to the target sensor data;
and when determining that the action occurs according to the action classification result, the control target device responds to the operation corresponding to the action.
2. The method of claim 1, wherein said obtaining target sensor data from a sequence of sensor data according to a predetermined sliding window comprises:
and when a specified event is detected, starting to acquire a sensor data sequence acquired by the motion sensor in the target equipment, and acquiring target sensor data from the sensor data sequence according to a preset sliding window.
3. The method of claim 1, wherein the processing the target sensor data according to the motion recognition model to obtain a motion classification result corresponding to the target sensor data comprises:
acquiring a value of data to be input, wherein the value of the data to be input is an average value of values of all channels of the target sensor data;
inputting the value of the data to be input into the action recognition model, and acquiring an action classification result output by the action recognition model;
and performing zeroing processing on the data to be input.
4. The method of claim 1, wherein when it is determined that an action occurs according to the action classification result, the controlling target device responds to an operation corresponding to the action, and the method comprises the following steps:
when the action classification result is a preset action type, acquiring a preset number of historical action classification results output by the action recognition model before the current action classification result;
taking the classification result with the largest quantity in the preset quantity of historical action classification results and the action classification result of the current time as a target classification result;
and when determining that the action occurs according to the target classification result, controlling the target equipment to respond to the operation corresponding to the action.
5. The method of claim 4, wherein when it is determined that an action occurs according to the target classification result, controlling the target device to respond to an operation corresponding to the action comprises:
when determining that an action occurs according to the target classification result, judging whether the target classification result corresponds to an action starting point;
if the target classification result corresponds to the action starting point, starting to record the action length; and/or
If the target classification result does not correspond to the action starting point, updating the action length, and judging whether the updated action length is larger than a preset length or not;
and when the updated action length is larger than the preset length, the control target device responds to the operation corresponding to the action.
6. The method of claim 1, wherein the action recognition model comprises a feature extraction network and a fully connected layer; the processing the target sensor data according to the action recognition model to obtain an action classification result corresponding to the target sensor data includes:
converting the target sensor data into a multi-channel input vector;
performing feature extraction processing on the multi-channel input vector according to the feature extraction network to obtain a feature vector;
and processing the feature vector according to the full connection layer to obtain the action category.
7. The method of claim 6, wherein the feature extraction network comprises a plurality of dual convolutional layers, each dual convolutional layer followed by a corresponding batch normalization layer and pooling layer.
8. The method of claim 6, wherein the motion recognition model comprises a first convolutional layer, a second convolutional layer, a maximum pooling layer, a third convolutional layer, a fourth convolutional layer, a global average pooling layer, a fully-connected layer, and a softmax layer connected in sequence; the first convolution layer and the second convolution layer are convolution layers with convolution kernels of 7 and channel numbers of 64; the third convolutional layer and the fourth convolutional layer are convolutional layers having a convolutional kernel of 7 and a channel number of 128.
9. The method of claim 1, wherein when it is determined that an action occurs according to the action classification result, the controlling target device responds to an operation corresponding to the action, and the method comprises the following steps:
when the head nodding action is determined to occur according to the action classification result, controlling the target equipment to respond to the confirmation operation corresponding to the head nodding action; and/or
And when the shaking motion is determined to occur according to the motion classification result, controlling the target equipment to respond to the cancellation operation corresponding to the shaking motion.
10. The method of claim 9, wherein the target device is a communication device; when determining that the nodding action occurs according to the action classification result, controlling the target device to respond to a confirmation operation corresponding to the nodding action, including:
when the head-nodding action is determined to occur according to the action classification result, controlling the communication equipment to respond to the incoming call access operation corresponding to the head-nodding action;
when it is determined that a shaking motion occurs according to the motion classification result, controlling the target device to respond to a cancellation operation corresponding to the shaking motion, including:
and when the shaking motion is determined to occur according to the motion classification result, controlling the communication equipment to respond to incoming call rejection operation corresponding to the shaking motion.
11. The method of claim 9, wherein the controlling the target device to respond to a confirmation operation corresponding to the nodding action when the nodding action is determined to occur according to the action classification result comprises:
when the head action is determined to occur according to the action classification result, controlling a notification bar of the target equipment to be expanded;
when it is determined that a shaking motion occurs according to the motion classification result, controlling the target device to respond to a cancellation operation corresponding to the shaking motion, including:
and when the shaking motion is determined to occur according to the motion classification result, controlling the notification bar to be retracted.
12. The method of claim 11, wherein when it is determined from the action classification result that an action has occurred, the control-target device responds to an operation corresponding to the action, further comprising:
and displaying the state information of the degree of freedom when the head-up action is determined to occur according to the action classification result.
13. An apparatus control device, characterized by comprising:
the data acquisition module is used for acquiring target sensor data from the sensor data sequence according to a preset sliding window;
the action recognition module is used for processing the target sensor data according to an action recognition model to obtain an action classification result corresponding to the target sensor data;
and the equipment control module is used for controlling the target equipment to respond to the operation corresponding to the action when the action is determined to occur according to the action classification result.
14. A computer-readable storage medium on which a computer program is stored, characterized in that when the computer program is run on a computer, the computer is caused to execute the apparatus control method according to any one of claims 1 to 12.
15. An electronic device comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to execute the device control method according to any one of claims 1 to 12 by calling the computer program.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110118213.4A CN112817450A (en) | 2021-01-28 | 2021-01-28 | Action recognition method and device, electronic equipment and storage medium |
CN2021101182134 | 2021-01-28 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114167984A true CN114167984A (en) | 2022-03-11 |
CN114167984B CN114167984B (en) | 2024-03-12 |
Family
ID=75860099
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110118213.4A Pending CN112817450A (en) | 2021-01-28 | 2021-01-28 | Action recognition method and device, electronic equipment and storage medium |
CN202111416104.7A Active CN114167984B (en) | 2021-01-28 | 2021-11-25 | Equipment control method and device, storage medium and electronic equipment |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110118213.4A Pending CN112817450A (en) | 2021-01-28 | 2021-01-28 | Action recognition method and device, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN112817450A (en) |
WO (1) | WO2022161026A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112817450A (en) * | 2021-01-28 | 2021-05-18 | Oppo广东移动通信有限公司 | Action recognition method and device, electronic equipment and storage medium |
CN114020382A (en) * | 2021-10-29 | 2022-02-08 | 杭州逗酷软件科技有限公司 | Execution method, electronic equipment and computer storage medium |
CN114067623B (en) * | 2021-12-01 | 2024-03-22 | 云南民族大学 | Non-paper surface contact foreign language learning responder |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106919958A (en) * | 2017-03-21 | 2017-07-04 | 电子科技大学 | A kind of human finger action identification method based on intelligent watch |
CN108062170A (en) * | 2017-12-15 | 2018-05-22 | 南京师范大学 | Multi-class human posture recognition method based on convolutional neural networks and intelligent terminal |
WO2018141409A1 (en) * | 2017-02-06 | 2018-08-09 | Telefonaktiebolaget Lm Ericsson (Publ) | Initiating a control operation in response to a head gesture |
CN109508677A (en) * | 2018-11-15 | 2019-03-22 | 电子科技大学 | A kind of aerial hand-written action recognition based on improvement CNN network |
CN109731302A (en) * | 2019-01-22 | 2019-05-10 | 深圳职业技术学院 | Athletic posture recognition methods, device and electronic equipment |
KR20190098806A (en) * | 2018-01-31 | 2019-08-23 | 계명대학교 산학협력단 | A smart hand device for gesture recognition and control method thereof |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110616A (en) * | 2019-04-19 | 2019-08-09 | 出门问问信息科技有限公司 | A kind of electronic equipment and control method |
CN110348494A (en) * | 2019-06-27 | 2019-10-18 | 中南大学 | A kind of human motion recognition method based on binary channels residual error neural network |
CN110991482B (en) * | 2019-10-31 | 2022-02-18 | 曾剑 | Body-building action recognition method, terminal and computer storage medium |
CN111200745A (en) * | 2019-12-31 | 2020-05-26 | 歌尔股份有限公司 | Viewpoint information acquisition method, apparatus, device and computer storage medium |
CN112817450A (en) * | 2021-01-28 | 2021-05-18 | Oppo广东移动通信有限公司 | Action recognition method and device, electronic equipment and storage medium |
-
2021
- 2021-01-28 CN CN202110118213.4A patent/CN112817450A/en active Pending
- 2021-11-25 CN CN202111416104.7A patent/CN114167984B/en active Active
- 2021-12-20 WO PCT/CN2021/139746 patent/WO2022161026A1/en active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018141409A1 (en) * | 2017-02-06 | 2018-08-09 | Telefonaktiebolaget Lm Ericsson (Publ) | Initiating a control operation in response to a head gesture |
CN106919958A (en) * | 2017-03-21 | 2017-07-04 | 电子科技大学 | A kind of human finger action identification method based on intelligent watch |
CN108062170A (en) * | 2017-12-15 | 2018-05-22 | 南京师范大学 | Multi-class human posture recognition method based on convolutional neural networks and intelligent terminal |
KR20190098806A (en) * | 2018-01-31 | 2019-08-23 | 계명대학교 산학협력단 | A smart hand device for gesture recognition and control method thereof |
CN109508677A (en) * | 2018-11-15 | 2019-03-22 | 电子科技大学 | A kind of aerial hand-written action recognition based on improvement CNN network |
CN109731302A (en) * | 2019-01-22 | 2019-05-10 | 深圳职业技术学院 | Athletic posture recognition methods, device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2022161026A1 (en) | 2022-08-04 |
CN114167984B (en) | 2024-03-12 |
CN112817450A (en) | 2021-05-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114167984B (en) | Equipment control method and device, storage medium and electronic equipment | |
US9235278B1 (en) | Machine-learning based tap detection | |
Wang et al. | Human activity recognition with user-free accelerometers in the sensor networks | |
CN109032734B (en) | Background application program display method and mobile terminal | |
JP6064280B2 (en) | System and method for recognizing gestures | |
CN107193455B (en) | Information processing method and mobile terminal | |
KR20140147557A (en) | Mobile terminal and method for detecting a gesture to control functions | |
Fujinami et al. | Recognizing a Mobile Phone’s Storing Position as a Context of a Device and a User | |
JPWO2017104227A1 (en) | Information processing apparatus, information processing method, and program | |
WO2011092549A1 (en) | Method and apparatus for assigning a feature class value | |
CN113253908B (en) | Key function execution method, device, equipment and storage medium | |
CN113192537B (en) | Awakening degree recognition model training method and voice awakening degree acquisition method | |
CN117130469B (en) | Space gesture recognition method, electronic equipment and chip system | |
CN108765522B (en) | Dynamic image generation method and mobile terminal | |
Raj et al. | Different techniques for human activity recognition | |
CN106055958B (en) | A kind of unlocking method and device | |
CN113971271A (en) | Fingerprint unlocking method and device, terminal and storage medium | |
CN110796015A (en) | Remote monitoring method and device | |
CN110262767B (en) | Voice input wake-up apparatus, method, and medium based on near-mouth detection | |
US20220014683A1 (en) | System and method for ai enhanced shutter button user interface | |
US20170199578A1 (en) | Gesture control method for interacting with a mobile or wearable device | |
Cheng et al. | Finger-worn device based hand gesture recognition using long short-term memory | |
CN109408676A (en) | A kind of method and terminal device showing user information | |
CN109542315B (en) | Control method and system of mobile terminal | |
CN116502203B (en) | User identity recognition method and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |