CN108875597B - Large-scale data set-oriented two-layer activity cluster identification method - Google Patents
- Publication number
- CN108875597B (application CN201810538902.9A)
- Authority
- CN
- China
- Prior art keywords
- activity
- group
- activities
- training
- feature selection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
The invention relates to a two-layer activity cluster identification method for large-scale data sets, comprising the following steps: 1) activity clustering based on sparse coding; 2) feature selection and training of a group classifier; 3) feature selection and training of intra-group classifiers. The beneficial effects of the invention are as follows: activities in a large-scale data set are divided into different groups according to their similarity, so that more targeted features are selected and the accuracy of activity recognition is improved; compared with a single-layer classification model, the two-layer activity cluster recognition model classifies markedly better, and its selected features are more targeted; the feature selection method picks out the more important features and reaches satisfactory classification accuracy with fewer features.
Description
Technical Field
The invention relates to a wearable sensor-based activity recognition method, in particular to a two-layer activity cluster recognition method for a large-scale data set.
Background
At present, wearable-sensor-based activity recognition achieves high recognition accuracy, but most research has been carried out on small-scale data sets, often with only a few participants. In practical applications, however, the number of subjects involved in activity recognition is huge, and their activity data cannot be obtained in advance. Establishing an efficient subject-independent activity recognition method on large-scale data sets therefore remains an open problem.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a two-layer activity cluster identification method for a large-scale data set.
The two-layer activity cluster identification method for large-scale data sets comprises the following steps:
1) activity clustering based on sparse coding
1.1) carrying out sparse coding on the sample data according to formula (1) and solving for the sparse coefficient α, wherein A is the training data and D is the dictionary;
A=Dα (1)
1.2) calculating the distance between different activity categories according to formula (2) to obtain a matrix M of size n × n, wherein n is the number of activity categories;
wherein Δ_{i,j} is the distance between activities A_i and A_j (the smaller the distance, the more similar the activities), f is the number of features, N_{i,k} is the number of non-zero coefficients on the k-th feature in the sparse coefficients α_i of activity A_i solved in step 1.1), and S_i is the number of samples of activity A_i;
1.3) according to the matrix M, clustering mutually selected activities into the same activity group G_k to obtain a preliminary activity group set G = {G_1, G_2, ..., G_k}, and removing A_i and A_j from the activity set A;
1.4) searching the activity set A and, for each A_p ∈ A, querying the matrix M for its most similar activity A_q; if A_q already belongs to a group in G, adding A_p to the activity group containing A_q and removing A_p from the activity set A;
1.5) repeating step 1.4) until the activity set A is empty or the number of activities in A no longer changes;
1.6) if A is not empty, clustering all activities remaining in A into a new activity group G_m and adding G_m to G;
1.7) outputting the activity group set G to complete the grouping of activities;
2) feature selection and training of group classifiers
2.1) based on the activity group set obtained in step 1), treating all activities in the same activity group as one class and performing feature selection, wherein the feature selection method is given by formula (3):
wherein W_k is the weight of the k-th feature, var(f_k) is the variance of the k-th feature, and var(f_{k,i}) is the variance of the k-th feature over activity i;
2.2) carrying out group classifier training according to the features selected in the step 2.1) to obtain a first-layer classifier, wherein the classifier is used for classifying activities into a certain activity group;
3) feature selection and training of intra-group classifiers
3.1) for each activity group, selecting features separately within that group using formula (3);
3.2) training an intra-group classifier on the features selected in each activity group to obtain the second-layer classifiers, which classify an activity into its final specific activity.
Preferably, in step 1.3), mutual selection means that the most similar activity of A_i is A_j and the most similar activity of A_j is A_i.
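Step 1.1) does not name a solver for formula (1). As a minimal sketch, assuming the sparse coefficient is obtained by L1-regularized least squares solved with ISTA (iterative soft-thresholding, a common choice that is an assumption here and is not stated in the source), a pure-Python version could look like the following. The names `soft_threshold`, `ista`, `lam`, and `n_iter`, as well as the toy dictionary, are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def ista(D, a, lam=0.1, n_iter=200):
    """Approximately minimise 0.5 * ||a - D alpha||^2 + lam * ||alpha||_1.

    D is the dictionary as a list of rows (m x p); a is one sample of
    length m. Returns the sparse coefficient vector alpha of length p.
    """
    m, p = len(D), len(D[0])
    # 1 / ||D||_F^2 is a safe step size: it never exceeds 1 / L, where L is
    # the largest eigenvalue of D^T D.
    step = 1.0 / sum(v * v for row in D for v in row)
    alpha = [0.0] * p
    for _ in range(n_iter):
        residual = [sum(D[r][c] * alpha[c] for c in range(p)) - a[r]
                    for r in range(m)]
        grad = [sum(D[r][c] * residual[r] for r in range(m))
                for c in range(p)]
        alpha = [soft_threshold(alpha[c] - step * grad[c], step * lam)
                 for c in range(p)]
    return alpha


# Toy data (hypothetical): identity dictionary and one 3-dimensional sample.
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
alpha = ista(D, [1.0, 0.05, -2.0], lam=0.1)
print(alpha)  # the small middle entry is driven to exactly 0
```

The L1 penalty is what makes some coefficients exactly zero, which is the property that the non-zero counts N_{i,k} in step 1.2) rely on.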
The invention has the beneficial effects that:
1. Activities in a large-scale data set are divided into different groups according to their similarity, so that more targeted features are selected and the accuracy of activity recognition is improved.
2. Compared with a single-layer classification model, the two-layer activity cluster recognition model classifies markedly better, and its selected features are more targeted.
3. The feature selection method picks out the more important features and reaches satisfactory classification accuracy with fewer features.
Drawings
FIG. 1 is a general flow diagram of the present method;
FIG. 2 is the two-layer activity cluster recognition model constructed on the HASC-PAC2016 data set;
FIG. 3 is a comparison of different feature selection methods.
Detailed Description
The present invention will be further described with reference to the following examples. The following examples are set forth merely to aid in the understanding of the invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
The large-scale dataset used in the experiment was HASC-PAC2016, which contained behavioral data for 510 different people, with specific activity types and their labels: standing (1), walking (2), jogging (3), jumping (4), going upstairs (5) and going downstairs (6).
The overall training flow of the two-layer activity cluster recognition method for large-scale data sets is shown in FIG. 1, and the specific steps are as follows:
Step one, activity clustering based on sparse coding
1) Sparse coding is carried out on the HASC-PAC2016 data set, and the sparse coefficients are solved.
2) The distance between different activity classes is calculated from the sparse coefficients to obtain a 6 × 6 matrix M, shown in table 1; the lower the value, the more similar the activities.
TABLE 1 Distance between different activity classes
 | Standing | Walking | Jogging | Jumping | Going upstairs | Going downstairs |
Standing | 0 | 6.9563 | 7.6925 | 7.6735 | 6.8843 | 6.9700 |
Walking | 6.9563 | 0 | 1.3304 | 1.0204 | 0.3730 | 0.3739 |
Jogging | 7.6925 | 1.3304 | 0 | 0.8013 | 1.3143 | 1.2329 |
Jumping | 7.6735 | 1.0204 | 0.8013 | 0 | 1.0558 | 0.9272 |
Going upstairs | 6.8843 | 0.3730 | 1.3143 | 1.0558 | 0 | 0.2886 |
Going downstairs | 6.9700 | 0.3739 | 1.2329 | 0.9272 | 0.2886 | 0 |
3) Mutually selected activities are clustered according to the matrix M. Jogging and jumping select each other, as do going upstairs and going downstairs, which gives two preliminary activity groups, G1: {jogging, jumping} and G2: {going upstairs, going downstairs}; these four activities are removed from the activity set.
4) Two activities, standing and walking, now remain in the activity set. Searching the activity set, the activity most similar to standing is going upstairs (distance 6.8843), which belongs to group G2, so standing is added to G2 and removed from the activity set. Likewise, the activity most similar to walking is going upstairs (distance 0.3730), so walking is also added to G2.
5) Step 4) is repeated until the activity set is empty or the number of activities in it no longer changes.
6) The activity group set G is output, completing the grouping of activities.
The final grouping results of the experiment on the HASC-PAC2016 dataset were 2 different groups, as shown in table 2:
TABLE 2 Grouping results for the HASC-PAC2016 data set
Group | Activity set |
First group | Jogging, jumping |
Second group | Standing, walking, going upstairs, going downstairs |
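The grouping in table 2 can be re-derived from the distances in table 1. The following is a minimal sketch of the mutual-selection grouping procedure (seed groups from mutually nearest activities, attach the rest to the group of their nearest activity, and collect any leftovers into a final group). The function names `nearest` and `group_activities` are hypothetical, and ties are broken by index order, which the source does not specify.

```python
def nearest(M, i, candidates):
    """Index of the activity closest to i among candidates, excluding i."""
    return min((j for j in candidates if j != i), key=lambda j: M[i][j])


def group_activities(names, M):
    everyone = range(len(names))
    nn = [nearest(M, i, everyone) for i in everyone]
    groups = []      # each group is a list of activity indices
    member_of = {}   # activity index -> position of its group in `groups`

    # Mutually selected activities seed the preliminary groups.
    for i in everyone:
        j = nn[i]
        if i < j and nn[j] == i:
            member_of[i] = member_of[j] = len(groups)
            groups.append([i, j])

    # Attach each remaining activity to the group of its most similar
    # activity; repeat until nothing changes or nothing remains.
    remaining = [i for i in everyone if i not in member_of]
    changed = True
    while remaining and changed:
        changed = False
        for p in list(remaining):
            q = nn[p]
            if q in member_of:
                member_of[p] = member_of[q]
                groups[member_of[q]].append(p)
                remaining.remove(p)
                changed = True

    # Any activities still left form one new group.
    if remaining:
        groups.append(list(remaining))
    return [[names[i] for i in g] for g in groups]


names = ["standing", "walking", "jogging", "jumping",
         "going upstairs", "going downstairs"]
M = [  # distances from table 1
    [0, 6.9563, 7.6925, 7.6735, 6.8843, 6.9700],
    [6.9563, 0, 1.3304, 1.0204, 0.3730, 0.3739],
    [7.6925, 1.3304, 0, 0.8013, 1.3143, 1.2329],
    [7.6735, 1.0204, 0.8013, 0, 1.0558, 0.9272],
    [6.8843, 0.3730, 1.3143, 1.0558, 0, 0.2886],
    [6.9700, 0.3739, 1.2329, 0.9272, 0.2886, 0],
]
groups = group_activities(names, M)
for g in groups:
    print(g)
```

Note that the most similar activity is looked up over all six activities, not only those still ungrouped, which is why standing can join the group that already contains going upstairs.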
Step two, feature selection and training of group classifier
According to table 2, the activities in the same group are treated as one class of activity. Feature weights are learned and features are selected using formula (3), and the selected features are then used to train a group classifier, which classifies an activity into one of the groups.
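Formula (3) itself is not reproduced in this text; only its ingredients are described: the overall variance var(f_k) of each feature and its per-class variance var(f_{k,i}). The sketch below computes those two quantities and then, purely as a hypothetical stand-in and not as the patent's formula, scores each feature by the ratio of overall variance to summed per-class variance. The function names, the `eps` guard, and the toy data are all assumptions.

```python
from statistics import pvariance


def variance_ingredients(X, y):
    """Return var(f_k) and var(f_{k,i}) for every feature k and class i."""
    n_features = len(X[0])
    labels = sorted(set(y))
    total_var = [pvariance([row[k] for row in X]) for k in range(n_features)]
    class_var = [
        {c: pvariance([row[k] for row, lab in zip(X, y) if lab == c])
         for c in labels}
        for k in range(n_features)
    ]
    return total_var, class_var


def hypothetical_weights(X, y, eps=1e-12):
    """Stand-in score (NOT formula (3)): overall variance divided by the
    summed per-class variance, so class-separating features score high."""
    total_var, class_var = variance_ingredients(X, y)
    return [tv / (sum(cv.values()) + eps)
            for tv, cv in zip(total_var, class_var)]


def select_top_features(weights, m):
    """Indices of the m highest-weighted features."""
    order = sorted(range(len(weights)), key=lambda k: weights[k], reverse=True)
    return order[:m]


# Toy data (hypothetical): feature 0 separates the two groups, feature 1 does not.
X = [[0.0, 1.0], [0.1, 5.0], [1.0, 1.2], [1.1, 4.8]]
y = ["first", "first", "second", "second"]
weights = hypothetical_weights(X, y)
top = select_top_features(weights, 1)
print(weights, top)
```

For the group classifier, `y` would hold group labels from table 2; for an intra-group classifier, `X` and `y` would be restricted to the samples and activity labels of one group.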
Step three, feature selection and training of intra-group classifier
Feature weight learning and feature selection are performed separately on the 2 groups using formula (3), so the features selected differ from group to group. An intra-group classifier is then trained on the features selected for each group; it classifies the activity to be recognized into a specific activity.
The resulting two-layer activity cluster recognition model trained on the HASC-PAC2016 data set is shown in FIG. 2.
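At prediction time the two layers are chained: the group classifier picks a group from its own selected features, and that group's classifier then picks the specific activity from a different feature subset. The following is a minimal sketch of that routing only. The class name `TwoLayerClassifier` and the threshold rules are hypothetical stand-ins; in the experiments the layers are trained SVM or Random forest models.

```python
class TwoLayerClassifier:
    """Route a sample through a group classifier, then that group's classifier."""

    def __init__(self, group_clf, group_feats, intra_clfs, intra_feats):
        self.group_clf = group_clf      # callable: feature list -> group name
        self.group_feats = group_feats  # feature indices selected for layer one
        self.intra_clfs = intra_clfs    # group name -> callable: feature list -> activity
        self.intra_feats = intra_feats  # group name -> feature indices for that group

    def predict(self, x):
        # Layer one sees only the features selected for group classification.
        group = self.group_clf([x[k] for k in self.group_feats])
        # Layer two sees only the features selected inside the chosen group.
        feats = self.intra_feats[group]
        return self.intra_clfs[group]([x[k] for k in feats])


# Hypothetical stand-in rules over a 3-feature sample.
model = TwoLayerClassifier(
    group_clf=lambda v: "first" if v[0] > 1.0 else "second",
    group_feats=[0],
    intra_clfs={
        "first": lambda v: "jogging" if v[0] < 0.5 else "jumping",
        "second": lambda v: "standing" if v[0] < 0.1 else "walking",
    },
    intra_feats={"first": [1], "second": [2]},
)
first_pred = model.predict([2.0, 0.9, 0.0])
second_pred = model.predict([0.2, 0.0, 0.05])
print(first_pred, second_pred)
```

Keeping a separate feature index list per group is what lets each intra-group classifier use features chosen specifically for telling its own activities apart.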
Experiments and results:
The invention aims to divide activities on a large-scale data set into different groups according to similarity, select more targeted features, and improve the accuracy of activity recognition. To measure the effectiveness of the method, experiments were performed on the large-scale data set HASC-PAC2016 under an impersonal (subject-independent) setting with 5-fold cross validation, using SVM and Random forest as the base classifiers. The results are shown in table 3:
TABLE 3 Experimental results of the two-layer activity cluster recognition model
Method | HASC-PAC2016 |
Single-layer SVM | 0.6037 |
Two-layer SVM | 0.6379 |
Single-layer Random forest | 0.7035 |
Two-layer Random forest | 0.7441 |
As can be seen from table 3, the two-layer activity cluster recognition model classifies better than the original single-layer model with both base classifiers: the score rises from 0.6037 to 0.6379 with SVM and from 0.7035 to 0.7441 with Random forest, which indicates that the features selected by the two-layer model are more targeted. To verify the performance of the proposed feature selection method, three other common feature selection methods, Laplacian score, Relief-F, and MCFS, were compared with it on the two-layer classification model. The experimental results are shown in FIG. 3, with the number of selected features set to 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50. Compared with the other three methods, the feature selection method of the invention selects more important features and reaches satisfactory classification accuracy with fewer features.
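The evaluation protocol is only named in the text, not spelled out. One plausible reading, stated here as an assumption, is that the 5 folds are split by person, so that no subject contributes samples to both the training and the test side of a fold. The sketch below builds such folds; the function name `subject_folds` and the toy subject IDs are hypothetical.

```python
def subject_folds(subject_ids, n_folds=5):
    """Yield-style list of (train_indices, test_indices), split by subject.

    subject_ids[i] is the person who produced sample i. Every subject lands
    in the test side of exactly one fold.
    """
    subjects = sorted(set(subject_ids))
    folds = []
    for f in range(n_folds):
        # Every n_folds-th subject, starting at offset f, is held out.
        test_subjects = set(subjects[f::n_folds])
        test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
        folds.append((train_idx, test_idx))
    return folds


# Toy data (hypothetical): 10 subjects with 3 samples each.
subject_ids = [i // 3 for i in range(30)]
folds = subject_folds(subject_ids, n_folds=5)
for train_idx, test_idx in folds:
    held_out = {subject_ids[i] for i in test_idx}
    print(sorted(held_out), len(train_idx), len(test_idx))
```

Splitting by subject rather than by sample matters for the stated goal, because the activity data of new users cannot be obtained in advance.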
Claims (2)
1. A two-layer activity cluster identification method for large-scale data sets, characterized by comprising the following steps:
1) activity clustering based on sparse coding
1.1) carrying out sparse coding on the sample data according to formula (1) and solving for the sparse coefficient α, wherein A is the training data and D is the dictionary;
A=Dα (1)
1.2) calculating the distance between different activity categories according to formula (2) to obtain a matrix M of size n × n, wherein n is the number of activity categories;
wherein Δ_{i,j} is the distance between activities A_i and A_j (the smaller the distance, the more similar the activities), f is the number of features, N_{i,k} is the number of non-zero coefficients on the k-th feature in the sparse coefficients α_i of activity A_i solved in step 1.1), and S_i is the number of samples of activity A_i;
1.3) according to the matrix M, clustering mutually selected activities into the same activity group G_k to obtain a preliminary activity group set G = {G_1, G_2, ..., G_k}, and removing A_i and A_j from the activity set A;
1.4) searching the activity set A and, for each A_p ∈ A, querying the matrix M for its most similar activity A_q; if A_q already belongs to a group in G, adding A_p to the activity group containing A_q and removing A_p from the activity set A;
1.5) repeating step 1.4) until the activity set A is empty or the number of activities in A no longer changes;
1.6) if A is not empty, clustering all activities remaining in A into a new activity group G_m and adding G_m to G;
1.7) outputting the activity group set G to complete the grouping of activities;
2) feature selection and training of group classifiers
2.1) based on the activity group set obtained in step 1), treating all activities in the same activity group as one class and performing feature selection, wherein the feature selection method is given by formula (3):
wherein W_k is the weight of the k-th feature, var(f_k) is the variance of the k-th feature, and var(f_{k,i}) is the variance of the k-th feature over activity i;
2.2) carrying out group classifier training according to the features selected in the step 2.1) to obtain a first-layer classifier, wherein the classifier is used for classifying activities into a certain activity group;
3) feature selection and training of intra-group classifiers
3.1) for each activity group, selecting features separately within that group using formula (3);
3.2) training an intra-group classifier on the features selected in each activity group to obtain the second-layer classifiers, which classify an activity into its final specific activity.
2. The two-layer activity cluster identification method for large-scale data sets according to claim 1, wherein in step 1.3), mutual selection means that the most similar activity of A_i is A_j and the most similar activity of A_j is A_i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810538902.9A CN108875597B (en) | 2018-05-30 | 2018-05-30 | Large-scale data set-oriented two-layer activity cluster identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810538902.9A CN108875597B (en) | 2018-05-30 | 2018-05-30 | Large-scale data set-oriented two-layer activity cluster identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108875597A (en) | 2018-11-23 |
CN108875597B (en) | 2021-03-30 |
Family
ID=64335735
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810538902.9A Active CN108875597B (en) | 2018-05-30 | 2018-05-30 | Large-scale data set-oriented two-layer activity cluster identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108875597B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110751115B (en) * | 2019-10-24 | 2021-01-01 | 北京金茂绿建科技有限公司 | Non-contact human behavior identification method and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010137325A1 (en) * | 2009-05-27 | 2010-12-02 | パナソニック株式会社 | Behavior recognition device |
CN103268495A (en) * | 2013-05-31 | 2013-08-28 | 公安部第三研究所 | Human body behavioral modeling identification method based on priori knowledge cluster in computer system |
CN103440471A (en) * | 2013-05-05 | 2013-12-11 | 西安电子科技大学 | Human body action identifying method based on lower-rank representation |
CN103473539A (en) * | 2013-09-23 | 2013-12-25 | 智慧城市系统服务(中国)有限公司 | Gait recognition method and device |
CN104732208A (en) * | 2015-03-16 | 2015-06-24 | 电子科技大学 | Video human action reorganization method based on sparse subspace clustering |
CN105868779A (en) * | 2016-03-28 | 2016-08-17 | 浙江工业大学 | Method for identifying behavior based on feature enhancement and decision fusion |
CN106203484A (en) * | 2016-06-29 | 2016-12-07 | 北京工业大学 | A kind of human motion state sorting technique based on classification layering |
CN106326906A (en) * | 2015-06-17 | 2017-01-11 | 姚丽娜 | Activity identification method and device |
Non-Patent Citations (3)
Title |
---|
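Complex human activities recognition using interval temporal syntactic model; XIA Li-min et al.; J. Cent. South Univ.; 2016-12-31; pp. 2578-2586 * |
A survey of human activity recognition based on wearable sensors; ZHENG Zengwei et al.; 《计算机应用》 (Journal of Computer Applications); 2018-05-10; Vol. 38, No. 5; pp. 1223-1229 * |
Support vector machine model based on hierarchical K-means clustering; WANG Xiuhua et al.; 《计算机应用与软件》 (Computer Applications and Software); 2014-05-31; Vol. 31, No. 5; pp. 172-176 * |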
Also Published As
Publication number | Publication date |
---|---|
CN108875597A (en) | 2018-11-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20220708. Address after: 310015 No. 51, Huzhou street, Hangzhou, Zhejiang. Patentee after: HANGZHOU City University. Address before: 310015 No. 50 Huzhou Street, Hangzhou City, Zhejiang Province. Patentee before: Zhejiang University City College |