CN111797655A - User activity identification method and device, storage medium and electronic equipment - Google Patents

User activity identification method and device, storage medium and electronic equipment

Info

Publication number
CN111797655A
Authority
CN
China
Prior art keywords
sequence, panoramic, data, data sequence, active
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910282031.3A
Other languages
Chinese (zh)
Inventor
陈仲铭
何明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910282031.3A priority Critical patent/CN111797655A/en
Publication of CN111797655A publication Critical patent/CN111797655A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing
    • G06F2218/04Denoising
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01DMEASURING NOT SPECIALLY ADAPTED FOR A SPECIFIC VARIABLE; ARRANGEMENTS FOR MEASURING TWO OR MORE VARIABLES NOT COVERED IN A SINGLE OTHER SUBCLASS; TARIFF METERING APPARATUS; MEASURING OR TESTING NOT OTHERWISE PROVIDED FOR
    • G01D21/00Measuring or testing not otherwise provided for
    • G01D21/02Measuring two or more variables by means not covered by a single other subclass
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12Classification; Matching

Abstract

The embodiments of the present application disclose a user activity identification method and device, a storage medium, and an electronic device. A panoramic data sequence of a target user in a target time interval is obtained; feature extraction is performed on the panoramic data sequence to obtain a panoramic feature sequence containing the panoramic features of each sample point; the probability distribution of each sample point in the panoramic feature sequence over a plurality of activity labels is calculated with a temporal recurrent neural network model; and decoding is performed according to a preset connectionist temporal classification (CTC) model and the probability distribution of each sample point over the plurality of activity labels, to determine the activity label sequence of the target user in the target time interval, thereby realizing sequence-to-sequence user activity identification.

Description

User activity identification method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a user activity recognition method, apparatus, storage medium, and electronic device.
Background
Existing activity recognition schemes generally collect user data over a period of time and use it to recognize a single, independent user activity. They cannot process time-series data and therefore struggle to realize sequence-to-sequence user activity recognition.
Disclosure of Invention
The embodiment of the application provides a user activity identification method, a user activity identification device, a storage medium and electronic equipment, which can realize sequence-to-sequence user activity identification.
In a first aspect, an embodiment of the present application provides a user activity identification method, including:
acquiring a panoramic data sequence of a target user in a target time interval;
extracting features of the panoramic data sequence to obtain a panoramic feature sequence, wherein the panoramic feature sequence comprises panoramic features of each sample point;
calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of activity labels according to a time recurrent neural network model;
and performing decoding processing according to a preset connectionist temporal classification (CTC) model and the probability distribution of each sample point in the panoramic data sequence over the plurality of activity labels, and determining the activity label sequence of the target user in the target time interval.
In a second aspect, an embodiment of the present application provides an apparatus for recognizing user activity, including:
the data acquisition module is used for acquiring a panoramic data sequence of a target user in a target time interval;
the feature extraction module is used for extracting features of the panoramic data sequence to obtain a panoramic feature sequence, wherein the panoramic feature sequence comprises panoramic features of each sample point;
the probability calculation module is used for calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of active labels according to a time recurrent neural network model;
and the activity classification module is used for performing decoding processing according to a preset connectionist temporal classification (CTC) model and the probability distribution of each sample point in the panoramic data sequence over the plurality of activity labels, and determining the activity label sequence of the target user in the target time interval.
In a third aspect, an embodiment of the present application provides a storage medium storing a computer program which, when run on a computer, causes the computer to execute the user activity identification method provided in any embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory has a computer program, and the processor is configured to execute the user activity recognition method according to any embodiment of the present application by calling the computer program.
According to the technical solution, a panoramic data sequence of a target user in a target time interval is obtained, and feature extraction is performed on it to obtain a panoramic feature sequence containing the panoramic features of each sample point. The probability distribution of each sample point over a plurality of activity labels is then calculated with a temporal recurrent neural network model, and decoding is performed according to a connectionist temporal classification (CTC) model and these probability distributions to determine the activity label sequence of the target user in the target time interval. In this user activity identification scheme, each sample point of the panoramic data sequence is first classified by the temporal recurrent neural network model to obtain a probability distribution; the probability distributions are then fed into the CTC model and decoded to obtain the activity label sequence, realizing sequence-to-sequence user activity identification.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described cover only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic view of a panoramic sensing architecture of a user activity recognition method according to an embodiment of the present application.
Fig. 2 is a schematic flowchart of a first method for identifying a user activity according to an embodiment of the present application.
Fig. 3 is a schematic flowchart of a second method for identifying user activities according to an embodiment of the present application.
Fig. 4 is a schematic diagram of a data processing flow in the user activity recognition method according to the embodiment of the present application.
Fig. 5 is a third flowchart illustrating a user activity recognition method according to an embodiment of the present application.
Fig. 6 is a schematic diagram of a forward-backward algorithm according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a user activity recognition apparatus according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 9 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from them without inventive effort fall within the scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic view of a panoramic sensing architecture of a user activity recognition method according to an embodiment of the present application. The user activity identification method is applied to the electronic equipment. A panoramic perception framework is arranged in the electronic equipment. The panoramic sensing architecture is an integration of hardware and software for implementing the user activity recognition method in an electronic device.
The panoramic perception architecture comprises an information perception layer, a data processing layer, a feature extraction layer, a scene modeling layer and an intelligent service layer.
The information perception layer is used to acquire information about the electronic device itself or its external environment. It may include a plurality of sensors, such as a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a Hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, and a heart rate sensor.
Among other things, a distance sensor may be used to detect a distance between the electronic device and an external object. The magnetic field sensor may be used to detect magnetic field information of the environment in which the electronic device is located. The light sensor can be used for detecting light information of the environment where the electronic equipment is located. The acceleration sensor may be used to detect acceleration data of the electronic device. The fingerprint sensor may be used to collect fingerprint information of a user. The Hall sensor is a magnetic field sensor manufactured according to the Hall effect, and can be used for realizing automatic control of electronic equipment. The location sensor may be used to detect the geographic location where the electronic device is currently located. Gyroscopes may be used to detect angular velocity of an electronic device in various directions. Inertial sensors may be used to detect motion data of an electronic device. The gesture sensor may be used to sense gesture information of the electronic device. A barometer may be used to detect the barometric pressure of the environment in which the electronic device is located. The heart rate sensor may be used to detect heart rate information of the user.
And the data processing layer is used for processing the data acquired by the information perception layer. For example, the data processing layer may perform data cleaning, data integration, data transformation, data reduction, and the like on the data acquired by the information sensing layer.
Data cleaning refers to cleaning the large amount of data acquired by the information perception layer to remove invalid and duplicate data. Data integration refers to integrating multiple single-dimensional data acquired by the information perception layer into a higher or more abstract dimension, so that data from multiple single dimensions can be processed together. Data transformation refers to converting the type or format of the acquired data so that the transformed data meets the processing requirements. Data reduction means reducing the data volume as much as possible while preserving the essential character of the original data.
The characteristic extraction layer is used for extracting characteristics of the data processed by the data processing layer so as to extract the characteristics included in the data. The extracted features may reflect the state of the electronic device itself or the state of the user or the environmental state of the environment in which the electronic device is located, etc.
The feature extraction layer may extract features or screen the extracted features by methods such as filter methods, wrapper methods, or ensemble methods.
A filter method filters the extracted features to remove redundant feature data. A wrapper method screens candidate feature subsets. An ensemble method combines multiple feature extraction methods to construct a more efficient and more accurate feature extraction method.
The scene modeling layer is used for building a model according to the features extracted by the feature extraction layer, and the obtained model can be used for representing the state of the electronic equipment, the state of a user, the environment state and the like. For example, the scenario modeling layer may construct a key value model, a pattern identification model, a graph model, an entity relation model, an object-oriented model, and the like according to the features extracted by the feature extraction layer.
The intelligent service layer is used for providing intelligent services for the user according to the model constructed by the scene modeling layer. For example, the intelligent service layer can provide basic application services for users, perform system intelligent optimization for electronic equipment, and provide personalized intelligent services for users.
In addition, the panoramic perception architecture may further include a plurality of algorithms, each of which can be used to analyze and process data; together they form an algorithm library. For example, the algorithm library may include Markov models, latent Dirichlet allocation, Bayesian classifiers, support vector machines, K-means clustering, K-nearest neighbors, conditional random fields, residual networks, long short-term memory networks, convolutional neural networks, recurrent neural networks, and the like.
Based on the panoramic sensing architecture, the electronic device collects panoramic data of a target user through the information perception layer and/or other means to form a panoramic data sequence. The data processing layer processes the panoramic data, for example performing data cleaning and data integration. The feature extraction layer then determines the activity label sequence of the target user in the target time interval according to the user activity identification method provided by the embodiments of the present application: a panoramic data sequence of the target user in the target time interval is acquired; feature extraction is performed on it to obtain a panoramic feature sequence containing the panoramic features of each sample point; the probability distribution of each sample point over a plurality of activity labels is calculated with a temporal recurrent neural network model; and decoding is performed according to a connectionist temporal classification (CTC) model and these probability distributions to determine the activity label sequence of the target user in the target time interval. Each sample point of the panoramic data sequence is thus first classified by the temporal recurrent neural network model to obtain a probability distribution, which is fed into the CTC model and decoded into the activity label sequence, realizing sequence-to-sequence user activity identification.
An execution main body of the user activity recognition method may be the user activity recognition device provided in the embodiment of the present application, or an electronic device integrated with the user activity recognition device, where the user activity recognition device may be implemented in a hardware or software manner. The electronic device may be a smart phone, a tablet computer, a palm computer, a notebook computer, or a desktop computer.
Referring to fig. 2, fig. 2 is a first flowchart illustrating a user activity recognition method according to an embodiment of the present disclosure. The specific process of the user activity identification method provided by the embodiment of the application can be as follows:
step 101, acquiring a panoramic data sequence of a target user in a target time interval.
The embodiments of the present application combine a temporal recurrent neural network model with a connectionist temporal classification (CTC) model, using a panoramic data sequence of a user over a period of time as the basis for identifying the user's activity label sequence over that period. The panoramic data sequence is a time series containing temporal features; the temporal recurrent neural network model discovers these temporal features and uses them to compute, for each sample point in the panoramic data sequence, a probability distribution over activity labels, where each sample point corresponds to one time node in the period. The CTC model then decodes these probability distributions, realizing sequence-to-sequence activity recognition and finally producing the user's activity label sequence for the time interval.
In the embodiment of the present application, the panoramic data refers to related data collected by the electronic device during the process of using the electronic device by the user, and includes terminal state data and/or sensor data.
The terminal state data includes the operation mode, display mode, network state, screen-off/lock state, memory occupancy, battery state, and the like of the electronic device. The operation mode (for example a game mode, an entertainment mode, or a video mode) can be determined from the type of the currently running application program, which can be obtained directly from the application installation package.
The sensor data includes signals collected by various sensors on the electronic device, for example, the following sensors are included on the electronic device: a plurality of sensors such as distance sensor, magnetometer, light sensor, acceleration sensor, fingerprint sensor, hall sensor, position sensor, gyroscope, inertial sensor, attitude sensor, barometer, heart rate sensor. The electronic equipment collects sensor data according to a preset frequency.
In an optional embodiment, the panoramic data sequence in the target time interval may be obtained by segmenting the panoramic data sequence of the target user over a longer period with a time-series segmentation method. For example, the collected panoramic data sequence can be divided into panoramic data sequences corresponding to a plurality of consecutive time windows; one of these windows is taken as the target time interval, and the panoramic data sequence it contains is extracted from the full data. In other embodiments, other time-series segmentation methods may be used: sliding segmentation with a preset time window, in which each pair of adjacent windows partially overlaps, divides the full panoramic data sequence into multiple panoramic data sequences with partial data overlap; alternatively, manual rules can divide the full panoramic data sequence into panoramic data sequences of different interval lengths.
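The two segmentation variants described above (consecutive windows, and sliding windows with partial overlap) can be sketched as follows; the helper name and the window/step sizes are illustrative, not specified by the embodiment:

```python
def segment_sequence(seq, window, step):
    """Split a time series into fixed-length windows.

    step == window -> consecutive, non-overlapping windows
    step <  window -> sliding windows where adjacent windows overlap
    """
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, step)]
```

One of the returned windows would then be selected as the target time interval.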
In an optional embodiment, the electronic device collects panoramic data at a plurality of time nodes according to a preset frequency to form the panoramic data sequence. If different types of panoramic data are acquired at different frequencies, the multiple types may be synchronized by timestamp when the panoramic data sequence is constructed, so that their time nodes coincide. For example, if the acceleration sensor and the gyroscope return data at different times, then for a given time node the acceleration sample whose return time is closest to that node can be recorded as the acceleration data for the node, and likewise the gyroscope sample whose return time is closest to that node can be recorded as the gyroscope data for the node.
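The nearest-timestamp synchronization just described can be sketched as follows; the function name and data layout are illustrative assumptions:

```python
def align_to_node(node_time, samples):
    """Pick the sample whose return timestamp is closest to `node_time`.

    `samples` is a list of (timestamp, value) pairs returned by one sensor.
    """
    return min(samples, key=lambda s: abs(s[0] - node_time))[1]
```

Applying this per sensor and per time node yields one synchronized row of panoramic data for each node.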
In addition, some of the above panoramic data may be in non-numeric form; after the electronic device acquires such data, it converts it into a digital representation in a preset manner. For example, index numbering can convert text-valued panoramic data into numbers: taking the operation mode of the electronic device as an example, 1 may denote the game mode, 2 the entertainment mode, and 3 the video mode. After this conversion, the obtained panoramic data sequences are purely numeric, which facilitates subsequent operations.
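The index-numbering conversion can be illustrated with a small sketch; the mode-to-index mapping mirrors the example in the text (1 for the game mode, 2 for the entertainment mode, 3 for the video mode) and is otherwise hypothetical:

```python
# Hypothetical index numbering for text-valued panoramic data.
MODE_INDEX = {"game": 1, "entertainment": 2, "video": 3}

def encode_modes(modes):
    # Convert a sequence of text-valued operation modes into a numeric sequence.
    return [MODE_INDEX[m] for m in modes]
```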
And 102, extracting the features of the panoramic data sequence to obtain a panoramic feature sequence, wherein the panoramic feature sequence comprises the panoramic features of each sample point.
And after acquiring the panoramic data sequence, performing feature extraction on the panoramic data sequence to acquire a panoramic feature sequence, wherein the panoramic feature sequence is used as input data of a subsequent time recursive neural network model. After feature extraction, the generated panoramic feature sequence comprises the panoramic features of each sample point, and one sample point corresponds to one time node in the target time interval.
In an embodiment, step 102 of performing feature extraction on the panoramic data sequence to acquire the panoramic feature sequence includes: preprocessing the sensor data sequence and the terminal state data sequence; generating a first panoramic feature sequence from the preprocessed sensor data sequence and a second panoramic feature sequence from the preprocessed terminal state data sequence; and fusing the first panoramic feature sequence and the second panoramic feature sequence to generate the panoramic feature sequence.
The data preprocessing mainly includes noise removal and missing-value completion. Noise removal can be done with threshold filtering in the time domain or the frequency domain, and missing values can be estimated and filled by interpolation. Preprocessing ensures the reliability and integrity of the data and facilitates subsequent analysis.
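A minimal sketch of these two preprocessing steps, assuming simple threshold filtering and linear interpolation (the thresholds and function names are illustrative, not taken from the embodiment):

```python
def remove_noise(seq, lo, hi):
    # Simple threshold filter: readings outside [lo, hi] are marked invalid.
    return [x if lo <= x <= hi else None for x in seq]

def fill_missing(seq):
    # Estimate each missing (None) value by linear interpolation between the
    # nearest valid neighbours; assumes interior gaps only.
    out = list(seq)
    for i, v in enumerate(out):
        if v is None:
            left = next(j for j in range(i - 1, -1, -1) if out[j] is not None)
            right = next(j for j in range(i + 1, len(out)) if out[j] is not None)
            t = (i - left) / (right - left)
            out[i] = out[left] + t * (out[right] - out[left])
    return out
```

In practice `fill_missing(remove_noise(seq, lo, hi))` would yield a cleaned, gap-free sequence.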
For the sensor data sequence, because different sensors have different characteristics, the sensor data is filtered with a suitable filter when the first panoramic feature sequence is derived from it, so that features matched to each sensor's characteristics can be obtained.
Specifically, the step of generating a first panorama feature sequence from the preprocessed sensor data sequence includes: filtering the sensor data sequence according to a preset filter corresponding to the sensor; generating a first sensor characteristic according to the sensor data sequence after filtering processing; based on a complementary filtering method, carrying out fusion processing on the sensor data sequences with complementary relation to generate a fused sensor data sequence; generating a second sensor feature from the fused sensor data sequence; combining the first sensor features and the second sensor features to generate the first panoramic feature sequence. The manner of combining the first sensor feature and the second sensor feature may be a manner of adding features corresponding to the same time node.
For example, for an acceleration sensor, a low-pass filter is used to remove the high-frequency noise and keep the low-frequency signal; for a gyroscope, a high-pass filter is used to remove the low-frequency drift and keep the high-frequency signal. The first sensor features are then extracted from the filtered sensor data.
Furthermore, complementary relationships may be formed between different sensors, such as acceleration sensors and gyroscopes, acceleration sensors and magnetometers, gyroscopes and magnetometers, acceleration sensors and barometers. The method comprises the steps of presetting a sensor combination with a complementary relation, after a sensor data sequence is obtained, carrying out fusion processing on the sensor data sequence with the complementary relation according to a complementary filtering method to generate a fused sensor data sequence, and then extracting a second sensor characteristic from the fused sensor data sequence.
Taking the acceleration sensor and the gyroscope as an example: the acceleration sensor has good low-frequency characteristics, the inclination angle can be computed directly from it without accumulated error, and it remains accurate over long periods. The gyroscope, by contrast, accumulates integration error over time, so its output error grows. The two sensors thus compensate for each other's defects, and complementary filtering of the pair yields new sensor features. The complementary filter applies a different filter to each sensor according to its characteristics (for example, a low-pass filter to one and the complementary high-pass filter to the other), then adds the filtered results to obtain a signal covering the whole frequency band, with the addition performed as a weighted sum using preset per-sensor weights.
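The fusion just described can be sketched as a classic single-axis complementary filter; the weight `alpha` and the function shape are assumptions for illustration, not values given by the embodiment:

```python
def complementary_filter(accel_angles, gyro_rates, dt, alpha=0.98):
    """Fuse accelerometer-derived angles (drift-free but noisy) with
    integrated gyroscope rates (smooth but drifting).

    fused[t] = alpha * (fused[t-1] + gyro_rate[t] * dt)
             + (1 - alpha) * accel_angle[t]
    """
    fused = [accel_angles[0]]
    for a, g in zip(accel_angles[1:], gyro_rates[1:]):
        fused.append(alpha * (fused[-1] + g * dt) + (1 - alpha) * a)
    return fused
```

The high-pass branch (integrated gyro) and low-pass branch (accelerometer angle) carry complementary weights `alpha` and `1 - alpha`, matching the weighted summation described above.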
For the terminal state data sequence, a pre-trained recurrent neural network model can be used to extract features; the model is trained in advance on panoramic data sequences. A trained recurrent neural network can efficiently learn nonlinear features and therefore mine the features in time-series data well. In this embodiment, the output of the second-to-last layer (i.e., the last hidden layer) of the recurrent neural network model is used as the second panoramic feature sequence.
It can be understood that when the panoramic feature sequence includes both a sensor feature sequence and a terminal state feature sequence, and the electronic device includes a plurality of sensors, each time node in the resulting panoramic feature sequence corresponds to a plurality of panoramic features, which may be represented as a column vector.
And 103, calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of activity labels according to a time recurrent neural network model.
After the panoramic characteristic sequence is obtained, the panoramic characteristic sequence is used as input data of a pre-trained time recursive neural network model, and probability distribution of each sample point in the panoramic characteristic sequence on a plurality of activity labels is calculated.
Fig. 3 is a schematic flowchart of a second method for identifying a user activity according to an embodiment of the present application.
Specifically, in some embodiments, the temporal recurrent neural network model may be a long short-term memory (LSTM) network model comprising a plurality of recurrent neural layers and a classification layer. Step 103, calculating the probability distribution of each sample point in the panoramic feature sequence over a plurality of activity labels according to the temporal recurrent neural network model, includes:
step 1031, calculating the panoramic feature sequence in the recurrent neural layer, and extracting time sequence features;
step 1032, inputting the time sequence characteristics into the classification layer for calculation, and obtaining probability distribution of each sample point in the panoramic characteristic sequence on a plurality of active labels.
The recurrent neural layers in the long short-term memory network model extract time sequence features from the panoramic feature sequence. After calculation by the plurality of recurrent neural layers, the obtained time sequence features are input into the classification layer, yielding the probability distribution of each sample point in the panoramic feature sequence on the plurality of activity labels. In some embodiments, the classification layer may be a softmax classification layer, which performs classification using a softmax function. The plurality of activity labels is preset, for example: dining, listening to music, sports, shopping, meetings, playing games, taking pictures, editing pictures, and so on. Alternatively, in other embodiments, the temporal recurrent neural network model may also be an RNN (recurrent neural network) model.
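For illustration, a softmax classification layer maps each sample point's raw scores to a probability distribution over the activity labels. The label names and score values below are toy examples, not the embodiment's actual outputs:

```python
import math

# Illustrative subset of the preset activity labels.
ACTIVITY_LABELS = ["dining", "listening_to_music", "sports", "shopping"]

def softmax(scores):
    """Turn raw classification-layer scores into a probability distribution."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One score vector per sample point coming out of the recurrent layers.
scores_per_step = [[2.0, 0.5, 0.1, -1.0], [0.2, 3.1, 0.0, 0.4]]
probs_per_step = [softmax(s) for s in scores_per_step]
```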
And step 104, decoding according to a preset connection time sequence classification model and probability distribution of each sample point in the panoramic data sequence on a plurality of active tags, and determining an active tag sequence of the target user in the target time interval.
Fig. 4 is a schematic diagram illustrating the data processing flow in the user activity recognition method according to the embodiment of the present application. The data in the dotted frame in the first row is the panoramic data sequence; after calculation by the recurrent neural layers and the classification layer of the time recurrent neural network, the probability distribution of each sample point on the plurality of activity labels is obtained. The CTC model is connected to the last layer of the time recurrent neural network model and decodes the probability distribution to obtain the active tag sequence of the target user in the target time interval.
Referring to fig. 5, a third flowchart of a user activity recognition method according to an embodiment of the present application is shown.
Step 104, according to a preset connection time sequence classification model and probability distribution of each sample point in the panoramic data sequence on a plurality of active tags, performing decoding processing, and determining an active tag sequence of the target user in the target time interval includes:
step 1041, determining all label paths according to probability distribution of each time point in the panoramic data sequence on the plurality of active labels;
step 1042, according to a preset connection time sequence classification model, decoding all the label paths, and determining an active label sequence of the target user in the target time interval.
In this scheme, there are L + 1 tags in total: L active tags, plus one blank tag used to indicate a blank position in the activity sequence.
Assuming that the sequence lengths of the panoramic data sequence and the panoramic feature sequence are both T, there are $(L+1)^T$ possible label paths over the $L+1$ tags. The conditional probability of each label path is the product of the output prediction probabilities at each time step along the path, and may be represented as follows:

$$p(\pi \mid x) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $p(\pi \mid x)$ represents the probability that the predicted active path is $\pi$ when the input is $x$, and $y_k^t$ is the probability that the active label output at time $t$ is $k$, with $k$ being one of the preset tags (for example A, B, C, or the blank tag).
The probability distribution output by the classification layer defines the probability of all possible paths aligning all possible label sequences with the input sequence, and the total probability of any one active label sequence can be obtained by summing the probabilities of its different permutations.
In particular, during active tag identification, blank tags generally occur between active tags in a sequence; therefore, repeated labels and blank labels in each path are removed when the path is mapped to a label sequence. For example, the label paths $(A, A, -, B, B)$ and $(-, A, -, B, -)$, where $-$ denotes the blank tag (illustrative paths), both correspond to the active tag sequence $(A, B)$. That is, multiple label paths correspond to one correct label sequence $l$, and the length of the sequence is often smaller than the length of the path; the final probability of the sequence can then be expressed as the sum of the probabilities of its paths:

$$p(l \mid x) = \sum_{\pi \in \mathcal{B}^{-1}(l)} p(\pi \mid x)$$

where $\mathcal{B}$ denotes the many-to-one mapping that removes repeated labels and blanks, and $\mathcal{B}^{-1}(l)$ is the set of all paths that map to $l$.
Then, for the CTC model, its output is required to be the most likely active tag sequence corresponding to the panoramic data sequence:

$$h(x) = \arg\max_{l} p(l \mid x)$$
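The path-collapsing mapping and the summation over all paths that produce a given label sequence can be checked by brute force for a tiny sequence. The labels A/B and the per-step probabilities below are toy values, and exhaustive enumeration is tractable only for very small T:

```python
from itertools import product

BLANK = "-"

def collapse(path):
    """The CTC mapping B: merge consecutive repeated labels, then drop blanks."""
    out, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return tuple(out)

def sequence_prob(target, probs, labels):
    """Brute-force p(l | x): sum the probabilities of every label path
    that collapses to the target sequence."""
    T = len(probs)
    idx = {l: i for i, l in enumerate(labels)}
    total = 0.0
    for path in product(labels, repeat=T):
        if collapse(path) == target:
            p = 1.0
            for t, label in enumerate(path):
                p *= probs[t][idx[label]]
            total += p
    return total

labels = ["A", "B", BLANK]
# Toy per-time-step distributions over (A, B, blank) for T = 3 sample points.
probs = [[0.6, 0.1, 0.3],
         [0.2, 0.5, 0.3],
         [0.1, 0.7, 0.2]]
p_ab = sequence_prob(("A", "B"), probs, labels)
```

Here the five paths AAB, ABB, A-B, AB-, and -AB all collapse to (A, B), and their probabilities are summed.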
two decoding approaches are provided below:
In the first mode, all the label paths are decoded with a best path decoding algorithm according to a preset connection time sequence classification model, and the active label sequence with the maximum probability value is obtained as the active label sequence corresponding to the panoramic data sequence.
The best path decoding algorithm may be represented as follows:

$$h(x) \approx \mathcal{B}(\pi^*), \qquad \pi^* = \arg\max_{\pi} p(\pi \mid x)$$

where $\pi^*$ is the path formed by taking the most likely output label at each time node, and $\mathcal{B}$ is the mapping that removes repeated labels and blanks.
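A minimal sketch of best path (greedy) decoding under a toy setup; the labels and probabilities are illustrative:

```python
BLANK = "-"

def best_path_decode(probs, labels):
    """Greedy (best path) CTC decoding: pick the highest-probability label
    at each time node, then merge repeats and remove blanks."""
    path = [labels[max(range(len(row)), key=row.__getitem__)] for row in probs]
    decoded, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

labels = ["A", "B", BLANK]
probs = [[0.6, 0.1, 0.3],   # argmax: A
         [0.5, 0.2, 0.3],   # argmax: A
         [0.1, 0.2, 0.7],   # argmax: blank
         [0.2, 0.7, 0.1]]   # argmax: B
decoded = best_path_decode(probs, labels)   # path A, A, -, B collapses to [A, B]
```

Best path decoding is fast but only approximate, since the most probable single path need not belong to the most probable collapsed sequence.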
In the second mode, all the label paths are decoded with a prefix search decoding algorithm according to a preset connection time sequence classification model, and the active label sequence with the maximum probability value is obtained as the active label sequence corresponding to the panoramic data sequence.
The prefix search decoding algorithm performs the prefix calculation using a forward-backward algorithm. Fig. 6 is a schematic diagram of the forward-backward algorithm provided in the embodiment of the present application. The black circles in the figure represent active labels; the white circles are blank labels inserted at the beginning and end of the sequence and between every two adjacent active labels. The probabilities of multiple paths are computed by the forward-backward algorithm. Suppose a panoramic data sequence contains 100 sample points and the sequence is labeled AB (the active label truth value), where A and B are different active labels. When solving with the forward-backward algorithm, all possible cases are enumerated, and the probability of the truth value after the paths are collapsed is calculated. The forward variable is the probability that the collapsed path up to time t matches a given prefix of the labeled truth value, and the backward variable is the probability that the collapsed path from time t onward matches the remaining suffix; both are solved by dynamic programming. For example, for the prefix A of the truth value, a partial path such as AAA--AA collapses to A and is consistent with the prefix, whereas AAABB collapses to AB and is not. All correct cases can be found by dynamic programming and their probabilities summed; the sequence with the maximum probability sum is the final active label sequence.
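The forward half of this computation can be sketched as the standard CTC alpha recursion over the extended label sequence (blanks at the start, at the end, and between adjacent labels, as in Fig. 6). The toy labels and probabilities below are illustrative:

```python
BLANK = "-"

def ctc_forward_prob(target, probs, labels):
    """Forward (alpha) dynamic program for p(target | x) over the
    extended label sequence with interleaved blanks."""
    ext = [BLANK]
    for c in target:
        ext += [c, BLANK]
    S, T = len(ext), len(probs)
    idx = {l: i for i, l in enumerate(labels)}
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = probs[0][idx[ext[0]]]          # start with the leading blank
    if S > 1:
        alpha[0][1] = probs[0][idx[ext[1]]]      # or with the first label
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s > 0:
                a += alpha[t - 1][s - 1]
            # skipping the blank is allowed only between two different labels
            if s > 1 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * probs[t][idx[ext[s]]]
    # valid paths end on the last label or the trailing blank
    return alpha[T - 1][S - 1] + (alpha[T - 1][S - 2] if S > 1 else 0.0)

labels = ["A", "B", BLANK]
probs = [[0.6, 0.1, 0.3],
         [0.2, 0.5, 0.3],
         [0.1, 0.7, 0.2]]
p_ab = ctc_forward_prob(("A", "B"), probs, labels)
```

The recursion sums path probabilities in O(T·S) time instead of enumerating all paths, which is what makes sequences of 100 sample points tractable.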
Further, the CTC model may be trained as follows:
acquiring a sample panoramic data sequence and an active label sequence corresponding to the sample panoramic data sequence;
extracting characteristics of the sample panoramic data sequence to obtain a sample panoramic characteristic sequence;
calculating the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of activity labels according to the time recurrent neural network model;
taking the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of active labels and the active label sequence corresponding to the sample panoramic data sequence as training data;
and training the connection time sequence classification model according to the training data.
Wherein the step of training the connection timing classification model according to the training data comprises: determining a loss function of the connection timing classification model; and optimizing the loss function according to a gradient descent algorithm based on the training data so as to train the connection time sequence classification model.
In the embodiment of the application, the objective function of the CTC model is derived from the maximum likelihood principle; that is, minimizing the objective function maximizes the log-likelihood of the target active tag sequence. After the objective function is determined, the loss function can be optimized according to a gradient descent algorithm, thereby training the CTC model.
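As a self-contained illustration of training by gradient descent on the CTC objective, the sketch below minimizes the negative log-likelihood $-\log p(l \mid x)$ for a toy three-step sequence, using brute-force path enumeration for the likelihood and numerical gradients in place of backpropagation (a real implementation would use analytic gradients from the forward-backward algorithm; all sizes and hyperparameters here are made up):

```python
import math
from itertools import product

BLANK = 2                      # label indices: 0 = A, 1 = B, 2 = blank
T, C = 3, 3                    # three sample points, three output classes

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def collapse(path):
    out, prev = [], None
    for l in path:
        if l != prev and l != BLANK:
            out.append(l)
        prev = l
    return tuple(out)

def ctc_loss(logits, target):
    """Negative log-likelihood -log p(target | x), computed by brute force."""
    probs = [softmax(row) for row in logits]
    p = 0.0
    for path in product(range(C), repeat=T):
        if collapse(path) == target:
            q = 1.0
            for t in range(T):
                q *= probs[t][path[t]]
            p += q
    return -math.log(p)

# Coordinate-wise numerical gradient descent on the logits.
logits = [[0.0] * C for _ in range(T)]
target, lr, eps = (0, 1), 0.5, 1e-5
initial_loss = ctc_loss(logits, target)
for _ in range(200):
    for t in range(T):
        for c in range(C):
            logits[t][c] += eps
            up = ctc_loss(logits, target)
            logits[t][c] -= 2 * eps
            down = ctc_loss(logits, target)
            logits[t][c] += eps
            logits[t][c] -= lr * (up - down) / (2 * eps)
trained_loss = ctc_loss(logits, target)
```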
In particular implementation, the present application is not limited by the execution sequence of the described steps, and some steps may be performed in other sequences or simultaneously without conflict.
As can be seen from the above, the user activity identification method provided in this embodiment of the present application obtains a panoramic data sequence of a target user in a target time interval, performs feature extraction on the panoramic data sequence to obtain a panoramic feature sequence, where the panoramic feature sequence includes panoramic features of each sample point, calculates probability distribution of each sample point in the panoramic feature sequence on multiple active tags according to a time recurrent neural network model, and then performs decoding processing according to a connection timing classification model and the probability distribution of each sample point in the panoramic data sequence on multiple active tags to determine an active tag sequence of the target user in the target time interval. In the user activity identification scheme, each sample point of the panoramic data sequence is preliminarily classified through the time recursive neural network model, probability distribution is obtained, the probability distribution is used as input of the connection time sequence classification model, decoding operation is carried out, an activity label sequence is obtained, and user activity identification from the panoramic data sequence to the user activity label sequence is achieved.
In one embodiment, a user activity recognition apparatus is also provided. Referring to fig. 7, fig. 7 is a schematic structural diagram of a user activity recognition apparatus 400 according to an embodiment of the present application. The user activity recognition apparatus 400 is applied to an electronic device, and the user activity recognition apparatus 400 includes a data obtaining module 401, a feature extracting module 402, a probability calculating module 403, and an activity classifying module 404, as follows:
a data obtaining module 401, configured to obtain a panoramic data sequence of a target user in a target time interval;
a feature extraction module 402, configured to perform feature extraction on the panoramic data sequence to obtain a panoramic feature sequence, where the panoramic feature sequence includes panoramic features of each sample point;
a probability calculation module 403, configured to calculate probability distribution of each sample point in the panoramic feature sequence on a plurality of active tags according to a time recurrent neural network model;
and an activity classification module 404, configured to perform decoding processing according to a preset connection timing classification model and probability distribution of each sample point in the panoramic data sequence on multiple activity tags, and determine an activity tag sequence of the target user in the target time interval.
In some embodiments, activity classification module 404 is further to: determining all label paths according to the probability distribution of each time point in the panoramic data sequence on the plurality of active labels; and decoding all the label paths according to a preset connection time sequence classification model, and determining the active label sequence of the target user in the target time interval.
In some embodiments, activity classification module 404 is further to: and according to a preset connection time sequence classification model, searching a decoding algorithm according to an optimal path decoding algorithm or a prefix, decoding all the label paths, and acquiring an active label sequence with the maximum probability value as an active label sequence corresponding to the panoramic data sequence.
In some embodiments, the temporal recurrent neural network model is a long short-term memory (LSTM) network model comprising a plurality of recurrent neural layers and a classification layer;
in some embodiments, the probability calculation module 403 is further configured to: calculating the panoramic characteristic sequence in the recurrent neural layer, and extracting time sequence characteristics; and inputting the time sequence characteristics into the classification layer for calculation, and acquiring the probability distribution of each sample point in the panoramic characteristic sequence on a plurality of active labels.
In some embodiments, the panoramic data sequence comprises a sensor data sequence and a terminal state data sequence; the feature extraction module 402 is further configured to: preprocessing the sensor data sequence and the terminal state data sequence; generating a first panoramic characteristic sequence according to the preprocessed sensor data sequence, and generating a second panoramic characteristic sequence according to the preprocessed terminal state data sequence; and fusing the first panoramic feature sequence and the second panoramic feature sequence to generate the panoramic feature sequence.
In some embodiments, the user activity recognition device 400 further comprises a model training module for: acquiring a sample panoramic data sequence and a movable label sequence corresponding to the sample panoramic data sequence; extracting characteristics of the sample panoramic data sequence to obtain a sample panoramic characteristic sequence; calculating the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of activity labels according to the time recurrent neural network model; taking the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of active labels and the active label sequence corresponding to the sample panoramic data sequence as training data; and training the connection time sequence classification model according to the training data.
In some embodiments, the model training module is further to: determining a loss function of the connection timing classification model; and optimizing the loss function according to a gradient descent algorithm based on the training data so as to train the connection time sequence classification model.
In specific implementation, the above modules may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and specific implementation of the above modules may refer to the foregoing method embodiments, which are not described herein again.
As can be seen from the above, in the user activity recognition apparatus provided in this embodiment of the application, the data obtaining module 401 obtains a panoramic data sequence of a target user in a target time interval, the feature extracting module 402 performs feature extraction on the panoramic data sequence to obtain a panoramic feature sequence, where the panoramic feature sequence includes panoramic features of each sample point, the probability calculating module 403 calculates probability distribution of each sample point in the panoramic feature sequence on a plurality of activity tags according to the time recursive neural network model, and then the activity classifying module 404 performs decoding processing according to the connection timing sequence classification model and the probability distribution of each sample point in the panoramic data sequence on a plurality of activity tags, so as to determine an activity tag sequence of the target user in the target time interval. In the user activity identification scheme, each sample point of the panoramic data sequence is preliminarily classified through the time recursive neural network model, probability distribution is obtained, the probability distribution is used as input of the connection time sequence classification model, decoding operation is carried out, an activity label sequence is obtained, and user activity identification from the panoramic data sequence to the user activity label sequence is achieved.
The embodiment of the application also provides the electronic equipment. The electronic device can be a smart phone, a tablet computer and the like. As shown in fig. 8, fig. 8 is a schematic view of a first structure of an electronic device according to an embodiment of the present application. The electronic device 300 comprises a processor 301 and a memory 302. The processor 301 is electrically connected to the memory 302.
The processor 301 is a control center of the electronic device 300, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or calling a computer program stored in the memory 302 and calling data stored in the memory 302, thereby performing overall monitoring of the electronic device.
In this embodiment, the processor 301 in the electronic device 300 loads instructions corresponding to one or more processes of the computer program into the memory 302 according to the following steps, and the processor 301 runs the computer program stored in the memory 302, so as to implement various functions:
acquiring a panoramic data sequence of a target user in a target time interval;
extracting features of the panoramic data sequence to obtain a panoramic feature sequence, wherein the panoramic feature sequence comprises panoramic features of each sample point;
calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of activity labels according to a time recurrent neural network model;
and decoding according to a preset connection time sequence classification model and probability distribution of each sample point in the panoramic data sequence on a plurality of active labels, and determining the active label sequence of the target user in the target time interval.
In some embodiments, when performing decoding processing according to a preset connection timing classification model and a probability distribution of each sample point in the panoramic data sequence on a plurality of active tags, and determining an active tag sequence of the target user in the target time interval, the processor 301 performs the following steps:
determining all label paths according to the probability distribution of each time point in the panoramic data sequence on the plurality of active labels;
and decoding all the label paths according to a preset connection time sequence classification model, and determining the active label sequence of the target user in the target time interval.
In some embodiments, when all the tag paths are decoded according to a preset connection timing classification model to determine an active tag sequence of the target user in the target time interval, the processor 301 performs the following steps:
and according to a preset connection time sequence classification model, searching a decoding algorithm according to an optimal path decoding algorithm or a prefix, decoding all the label paths, and acquiring an active label sequence with the maximum probability value as an active label sequence corresponding to the panoramic data sequence.
In some embodiments, the temporal recurrent neural network model is a long short-term memory (LSTM) network model comprising a plurality of recurrent neural layers and a classification layer; when calculating the probability distribution of each sample point in the panoramic feature sequence over a plurality of activity labels according to the temporal recurrent neural network model, the processor 301 performs the following steps:
calculating the panoramic characteristic sequence in the recurrent neural layer, and extracting time sequence characteristics;
and inputting the time sequence characteristics into the classification layer for calculation, and acquiring the probability distribution of each sample point in the panoramic characteristic sequence on a plurality of active labels.
In some embodiments, the panoramic data sequence comprises a sensor data sequence and a terminal state data sequence; when feature extraction is performed on the panoramic data sequence to obtain a panoramic feature sequence, the processor 301 executes the following steps:
preprocessing the sensor data sequence and the terminal state data sequence;
generating a first panoramic characteristic sequence according to the preprocessed sensor data sequence, and generating a second panoramic characteristic sequence according to the preprocessed terminal state data sequence;
and fusing the first panoramic feature sequence and the second panoramic feature sequence to generate the panoramic feature sequence.
In some embodiments, processor 301 further performs the steps of:
acquiring a sample panoramic data sequence and an active label sequence corresponding to the sample panoramic data sequence;
extracting characteristics of the sample panoramic data sequence to obtain a sample panoramic characteristic sequence;
calculating the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of activity labels according to the time recurrent neural network model;
taking the probability distribution of each sample point in the sample panoramic feature sequence on the plurality of active labels and the active label sequence corresponding to the sample panoramic data sequence as training data;
and training the connection time sequence classification model according to the training data.
In some embodiments, when training the connection timing classification model according to the training data, processor 301 further performs the following steps:
determining a loss function of the connection timing classification model;
and optimizing the loss function according to a gradient descent algorithm based on the training data so as to train the connection time sequence classification model.
Memory 302 may be used to store computer programs and data. The memory 302 stores computer programs containing instructions executable in the processor. The computer program may constitute various functional modules. The processor 301 executes various functional applications and data processing by calling a computer program stored in the memory 302.
In some embodiments, as shown in fig. 9, fig. 9 is a second schematic structural diagram of an electronic device provided in the embodiments of the present application. The electronic device 300 further includes: radio frequency circuit 303, display screen 304, control circuit 305, input unit 306, audio circuit 307, sensor 308, and power supply 309. The processor 301 is electrically connected to the rf circuit 303, the display 304, the control circuit 305, the input unit 306, the audio circuit 307, the sensor 308, and the power source 309, respectively.
The radio frequency circuit 303 is used for transmitting and receiving radio frequency signals so as to communicate with network devices or other electronic devices through wireless communication.
The display screen 304 may be used to display information entered by or provided to the user as well as various graphical user interfaces of the electronic device, which may be comprised of images, text, icons, video, and any combination thereof.
The control circuit 305 is electrically connected to the display screen 304, and is used for controlling the display screen 304 to display information.
The input unit 306 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 306 may include a fingerprint recognition module.
Audio circuitry 307 may provide an audio interface between the user and the electronic device through a speaker, microphone. Where audio circuitry 307 includes a microphone. The microphone is electrically connected to the processor 301. The microphone is used for receiving voice information input by a user.
The sensor 308 is used to collect external environmental information. The sensor 308 may include one or more of an ambient light sensor, an acceleration sensor, a gyroscope, and the like.
The power supply 309 is used to power the various components of the electronic device 300. In some embodiments, the power supply 309 may be logically connected to the processor 301 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system.
Although not shown in fig. 9, the electronic device 300 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
As can be seen from the above, an embodiment of the present application provides an electronic device, where the electronic device obtains a panoramic data sequence of a target user in a target time interval, performs feature extraction on the panoramic data sequence to obtain a panoramic feature sequence, where the panoramic feature sequence includes panoramic features of each sample point, calculates probability distributions of each sample point in the panoramic feature sequence on multiple active tags according to a time recurrent neural network model, and then performs decoding processing according to a connection timing classification model and the probability distributions of each sample point in the panoramic data sequence on the multiple active tags to determine an active tag sequence of the target user in the target time interval. In the user activity identification scheme, each sample point of the panoramic data sequence is preliminarily classified through the time recursive neural network model, probability distribution is obtained, the probability distribution is used as input of the connection time sequence classification model, decoding operation is carried out, an activity label sequence is obtained, and user activity identification from the panoramic data sequence to the user activity label sequence is achieved.
An embodiment of the present application further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer executes the user activity identification method according to any of the above embodiments.
It should be noted that, all or part of the steps in the methods of the above embodiments may be implemented by hardware related to instructions of a computer program, which may be stored in a computer-readable storage medium, which may include, but is not limited to: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Furthermore, the terms "first", "second", and "third", etc. in this application are used to distinguish different objects, and are not used to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules listed, but rather, some embodiments may include other steps or modules not listed or inherent to such process, method, article, or apparatus.
The user activity recognition method, the user activity recognition device, the storage medium and the electronic device provided by the embodiment of the application are described in detail above. The principle and the implementation of the present application are explained herein by applying specific examples, and the above description of the embodiments is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for identifying user activity, comprising:
acquiring a panoramic data sequence of a target user in a target time interval;
extracting features of the panoramic data sequence to obtain a panoramic feature sequence, wherein the panoramic feature sequence comprises panoramic features of each sample point;
calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of activity labels according to a time recurrent neural network model;
and decoding according to a preset connection time sequence classification model and probability distribution of each sample point in the panoramic data sequence on a plurality of active labels, and determining the active label sequence of the target user in the target time interval.
2. The method as claimed in claim 1, wherein the step of performing a decoding process according to a predetermined connection timing classification model and a probability distribution of each sample point in the panoramic data sequence over a plurality of active tags, and determining the active tag sequence of the target user in the target time interval comprises:
determining all label paths according to the probability distribution of each time point in the panoramic data sequence on the plurality of active labels;
and decoding all the label paths according to a preset connection time sequence classification model, and determining the active label sequence of the target user in the target time interval.
3. The method as claimed in claim 2, wherein the step of decoding all the label paths according to a preset connection timing classification model, and determining the active label sequence of the target user in the target time interval comprises:
and according to a preset connection time sequence classification model, searching a decoding algorithm according to an optimal path decoding algorithm or a prefix, decoding all the label paths, and acquiring an active label sequence with the maximum probability value as an active label sequence corresponding to the panoramic data sequence.
4. The user activity recognition method of claim 1, wherein the temporal recurrent neural network model is a long short-term memory (LSTM) network model, the LSTM network model comprising a plurality of recurrent neural layers and a classification layer; according to the temporal recurrent neural network model, the step of calculating the probability distribution of each sample point in the panoramic feature sequence on a plurality of activity labels comprises the following steps:
calculating the panoramic characteristic sequence in the recurrent neural layer, and extracting time sequence characteristics;
and inputting the time sequence characteristics into the classification layer for calculation, and acquiring the probability distribution of each sample point in the panoramic characteristic sequence on a plurality of active labels.
5. The user activity recognition method of claim 1, wherein the panoramic data sequence comprises a sensor data sequence and a terminal state data sequence, and the step of extracting features of the panoramic data sequence to obtain the panoramic feature sequence comprises:
preprocessing the sensor data sequence and the terminal state data sequence;
generating a first panoramic feature sequence from the preprocessed sensor data sequence, and generating a second panoramic feature sequence from the preprocessed terminal state data sequence; and
fusing the first panoramic feature sequence and the second panoramic feature sequence to generate the panoramic feature sequence.
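Claim 5's pipeline — preprocess each sequence, derive a feature sequence from each, then fuse — can be sketched with z-score normalization as the preprocessing step and per-sample-point concatenation as the fusion step; both are common choices assumed here for illustration, not mandated by the claim.

```python
def zscore(seq):
    """Standardize one data sequence (one possible preprocessing choice)."""
    mean = sum(seq) / len(seq)
    var = sum((v - mean) ** 2 for v in seq) / len(seq)
    std = var ** 0.5 or 1.0  # avoid division by zero on constant sequences
    return [(v - mean) / std for v in seq]

def fuse_features(sensor_seq, state_seq):
    """Preprocess the sensor and terminal-state sequences separately,
    then fuse them per sample point by concatenation."""
    first = zscore(sensor_seq)    # first panoramic feature sequence
    second = zscore(state_seq)    # second panoramic feature sequence
    return [[a, b] for a, b in zip(first, second)]


fused = fuse_features([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(fused[0])  # fused feature vector of the first sample point
```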
6. The user activity recognition method of any one of claims 1 to 5, further comprising:
acquiring a sample panoramic data sequence and an activity label sequence corresponding to the sample panoramic data sequence;
extracting features of the sample panoramic data sequence to obtain a sample panoramic feature sequence;
calculating, according to the temporal recurrent neural network model, the probability distribution of each sample point in the sample panoramic feature sequence over the plurality of activity labels;
taking the probability distribution of each sample point in the sample panoramic feature sequence over the plurality of activity labels, together with the activity label sequence corresponding to the sample panoramic data sequence, as training data; and
training the connectionist temporal classification model according to the training data.
7. The method of claim 6, wherein the step of training the connectionist temporal classification model according to the training data comprises:
determining a loss function of the connectionist temporal classification model; and
optimizing the loss function by a gradient descent algorithm based on the training data, so as to train the connectionist temporal classification model.
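The loss function in claim 7 is, in the standard connectionist temporal classification setup, the negative log-likelihood of the labeled sequence, computed by the CTC forward (alpha) recursion. Below is a minimal sketch of that recursion, assuming a blank index of 0; in practice the gradient-descent step of the claim is applied to the network weights through automatic differentiation of this quantity.

```python
import math

def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` given per-time-point label
    probabilities `probs`, via the CTC forward (alpha) recursion."""
    # Extend the target with blanks: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for l in target:
        ext += [l, blank]
    T, S = len(probs), len(ext)
    # alpha[s] = total probability of all length-t paths aligning
    # to the extended-target prefix ext[:s+1].
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s > 0:
                a += alpha[s - 1]
            # Skip over a blank between two different labels.
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][ext[s]]
        alpha = new
    p = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return -math.log(p)


# Two uniform time points, label alphabet {0 = blank, 1}: the paths
# collapsing to [1] are (blank,1), (1,blank), (1,1).
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_loss(uniform, [1]))  # -log(0.75)
```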
8. A user activity recognition apparatus, comprising:
a data acquisition module configured to acquire a panoramic data sequence of a target user in a target time interval;
a feature extraction module configured to extract features of the panoramic data sequence to obtain a panoramic feature sequence, the panoramic feature sequence comprising the panoramic features of each sample point;
a probability calculation module configured to calculate, according to a temporal recurrent neural network model, the probability distribution of each sample point in the panoramic feature sequence over a plurality of activity labels; and
an activity classification module configured to perform decoding according to a preset connectionist temporal classification model and the probability distribution of each sample point in the panoramic data sequence over the plurality of activity labels, to determine the activity label sequence of the target user in the target time interval.
9. A storage medium having a computer program stored thereon, wherein the computer program, when run on a computer, causes the computer to execute the user activity recognition method of any one of claims 1 to 7.
10. An electronic device comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to execute the user activity recognition method of any one of claims 1 to 7 by invoking the computer program.
CN201910282031.3A 2019-04-09 2019-04-09 User activity identification method and device, storage medium and electronic equipment Pending CN111797655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910282031.3A CN111797655A (en) 2019-04-09 2019-04-09 User activity identification method and device, storage medium and electronic equipment


Publications (1)

Publication Number Publication Date
CN111797655A true CN111797655A (en) 2020-10-20

Family

ID=72805306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910282031.3A Pending CN111797655A (en) 2019-04-09 2019-04-09 User activity identification method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111797655A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9785541B1 (en) * 2015-08-17 2017-10-10 Amdocs Software Systems Limited System, method, and computer program for generating test reports showing business activity results and statuses
CN107527313A (en) * 2016-06-20 2017-12-29 同济大学 User Activity mode division and attribute estimation method
CN107644272A (en) * 2017-09-26 2018-01-30 中国科学技术大学 Student's exception learning performance Forecasting Methodology of Behavior-based control pattern
CN108960337A (en) * 2018-07-18 2018-12-07 浙江大学 A kind of multi-modal complicated activity recognition method based on deep learning model
CN109086873A (en) * 2018-08-01 2018-12-25 北京旷视科技有限公司 Training method, recognition methods, device and the processing equipment of recurrent neural network
US20190065960A1 (en) * 2017-08-23 2019-02-28 Sony Interactive Entertainment Inc. Continual selection of scenarios based on identified tags describing contextual environment of a user for execution by an artificial intelligence model of the user by an autonomous personal companion


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU TIANLIANG; QIAO QINGWEI; WAN JUNWEI; DAI XIUBIN; LUO JIEBO: "Human Action Recognition Fusing Spatial-Temporal Dual-Network Streams and Visual Attention", Journal of Electronics & Information Technology, no. 10 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116700463A (en) * 2022-09-22 2023-09-05 荣耀终端有限公司 Activity recognition method and related equipment
CN116700463B (en) * 2022-09-22 2024-04-02 荣耀终端有限公司 Activity recognition method and related equipment

Similar Documents

Publication Publication Date Title
CN112101329B (en) Video-based text recognition method, model training method and model training device
CN113515942A (en) Text processing method and device, computer equipment and storage medium
Wang et al. Sound-based transportation mode recognition with smartphones
CN111797288A (en) Data screening method and device, storage medium and electronic equipment
CN111797302A (en) Model processing method and device, storage medium and electronic equipment
CN111797861A (en) Information processing method, information processing apparatus, storage medium, and electronic device
CN111797850A (en) Video classification method and device, storage medium and electronic equipment
CN111797851A (en) Feature extraction method and device, storage medium and electronic equipment
CN111800445B (en) Message pushing method and device, storage medium and electronic equipment
CN111797870A (en) Optimization method and device of algorithm model, storage medium and electronic equipment
CN114495916A (en) Method, device, equipment and storage medium for determining insertion time point of background music
CN111798367A (en) Image processing method, image processing device, storage medium and electronic equipment
CN111796926A (en) Instruction execution method and device, storage medium and electronic equipment
CN111797079A (en) Data processing method, data processing device, storage medium and electronic equipment
CN111796925A (en) Method and device for screening algorithm model, storage medium and electronic equipment
CN111652181B (en) Target tracking method and device and electronic equipment
CN111797849A (en) User activity identification method and device, storage medium and electronic equipment
CN111796979B (en) Data acquisition strategy determining method and device, storage medium and electronic equipment
CN111798019B (en) Intention prediction method, intention prediction device, storage medium and electronic equipment
CN111797655A (en) User activity identification method and device, storage medium and electronic equipment
CN111797867A (en) System resource optimization method and device, storage medium and electronic equipment
CN111310595A (en) Method and apparatus for generating information
CN111797873A (en) Scene recognition method and device, storage medium and electronic equipment
CN111797261A (en) Feature extraction method and device, storage medium and electronic equipment
CN111797857A (en) Data processing method, data processing device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination