CN108875597B - Large-scale data set-oriented two-layer activity cluster identification method - Google Patents

Large-scale data set-oriented two-layer activity cluster identification method

Info

Publication number
CN108875597B
CN108875597B CN201810538902.9A CN201810538902A CN108875597B CN 108875597 B CN108875597 B CN 108875597B CN 201810538902 A CN201810538902 A CN 201810538902A CN 108875597 B CN108875597 B CN 108875597B
Authority
CN
China
Prior art keywords
activity
group
activities
training
feature selection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810538902.9A
Other languages
Chinese (zh)
Other versions
CN108875597A (en)
Inventor
郑增威
杜俊杰
孙霖
霍梅梅
陈垣毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou City University
Original Assignee
Zhejiang University City College ZUCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University City College ZUCC
Priority to CN201810538902.9A
Publication of CN108875597A
Application granted
Publication of CN108875597B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/23 - Recognition of whole body movements, e.g. for sport training
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 - Selection of the most significant subset of features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a two-layer activity cluster identification method for large-scale data sets, comprising the following steps: 1) activity clustering based on sparse coding; 2) feature selection and training of a group classifier; 3) feature selection and training of intra-group classifiers. The beneficial effects of the invention are as follows: on a large-scale data set, activities are divided into groups according to their similarity and more targeted features are selected, which improves the accuracy of activity recognition; compared with a single-layer classification model, the two-layer activity cluster recognition model classifies markedly better, and its selected features are more targeted; the feature selection method picks out the more important features and reaches satisfactory classification accuracy with fewer features.

Description

Large-scale data set-oriented two-layer activity cluster identification method
Technical Field
The invention relates to a wearable sensor-based activity recognition method, in particular to a two-layer activity cluster recognition method for a large-scale data set.
Background
At present, wearable sensor-based activity recognition achieves high recognition accuracy, but most research has been carried out on small-scale data sets involving few participants. In practical applications, however, activity recognition must serve a very large number of subjects whose activity data cannot be obtained in advance. Establishing an efficient subject-independent activity recognition method on large-scale data sets therefore remains a problem to be solved.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a two-layer activity cluster identification method for a large-scale data set.
The two-layer activity cluster identification method for large-scale data sets comprises the following steps:
1) activity clustering based on sparse coding
1.1) performing sparse coding on the sample data according to formula (1) and solving for the sparse coefficient $\alpha$, where $A$ is the training data, $D$ is the dictionary, and $\alpha$ is the sparse coefficient to be solved;
$A = D\alpha$ (1)
1.2) calculating the distance between different activity categories according to formula (2) to obtain a matrix $M$ of size $n \times n$, where $n$ is the number of activity categories;
[Formula (2): equation image BDA0001678822450000011, not reproduced in this text]
where $\Delta_{i,j}$ is the distance between activities $A_i$ and $A_j$ (the smaller the distance, the more similar the activities), $f$ is the number of features, $N_{i,k}$ is the number of nonzero coefficients on the $k$-th feature in the sparse coefficients $\alpha_{i,k}$ solved in step 1.1), and $S_i$ is the number of samples of activity $A_i$;
1.3) according to the matrix $M$, clustering mutually selected activities $A_i$ and $A_j$ into the same activity group $G_k$ to obtain a preliminary activity group set $G = \{G_1, G_2, \ldots, G_k\}$, and removing $A_i$ and $A_j$ from the activity set $A$;
1.4) searching the activity set $A$: for each $A_p \in A$, querying the matrix $M$ for its most similar activity $A_q$; if $A_q$ already belongs to a group in $G$, adding $A_p$ to the activity group containing $A_q$ and removing $A_p$ from the activity set $A$;
1.5) repeating step 1.4) until the activity set $A$ is empty or the number of activities in $A$ no longer changes;
1.6) if $A$ is not empty, clustering all activities remaining in $A$ into a new activity group $G_m$ and adding $G_m$ to $G$;
1.7) outputting the activity group set $G$ to complete the grouping of activities;
2) feature selection and training of the group classifier
2.1) based on the activity group set obtained in step 1), treating all activities in the same activity group as one class and performing feature selection according to formula (3):
[Formula (3): equation image BDA0001678822450000021, not reproduced in this text]
where $W_k$ is the weight of the $k$-th feature, $\mathrm{var}(f_k)$ is the variance of the $k$-th feature, and $\mathrm{var}(f_{k,i})$ is the variance of the $k$-th feature over activity $i$;
2.2) training the group classifier on the features selected in step 2.1) to obtain the first-layer classifier, which assigns an activity to one of the activity groups;
3) feature selection and training of the intra-group classifiers
3.1) for each activity group, performing feature selection separately within the group using formula (3);
3.2) training an intra-group classifier for each activity group on the features selected for that group to obtain the second-layer classifiers, which assign the activity to its final specific activity class.
Preferably, in step 1.3), mutual selection means that the most similar activity of $A_i$ is $A_j$ and the most similar activity of $A_j$ is $A_i$.
The invention has the beneficial effects that:
1. On a large-scale data set, the invention divides activities into groups according to their similarity and selects more targeted features, which improves the accuracy of activity recognition.
2. Compared with a single-layer classification model, the two-layer activity cluster recognition model classifies markedly better, and its selected features are more targeted.
3. The feature selection method picks out the more important features and reaches satisfactory classification accuracy with fewer features.
Drawings
FIG. 1 is a general flow chart of the method;
FIG. 2 is the two-layer activity cluster recognition model constructed on the HASC-PAC data set;
FIG. 3 is a comparison of different feature selection methods.
Detailed Description
The present invention will be further described with reference to the following examples. The following examples are set forth merely to aid in the understanding of the invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
The large-scale dataset used in the experiment was HASC-PAC2016, which contained behavioral data for 510 different people, with specific activity types and their labels: standing (1), walking (2), jogging (3), jumping (4), going upstairs (5) and going downstairs (6).
The overall training flow of the two-layer activity cluster identification method for large-scale data sets is shown in FIG. 1. The specific steps are as follows:
Step one, activity clustering based on sparse coding
1) Perform sparse coding on the HASC-PAC2016 data set and solve for the sparse coefficients.
2) Calculate the distance between the different activity classes from the sparse coefficients to obtain a 6 × 6 matrix M, shown in Table 1; the lower the value, the more similar the two activities.
TABLE 1 Distance between different activity classes
| | Standing | Walking | Jogging | Jumping | Going upstairs | Going downstairs |
| --- | --- | --- | --- | --- | --- | --- |
| Standing | 0 | 6.9563 | 7.6925 | 7.6735 | 6.8843 | 6.9700 |
| Walking | 6.9563 | 0 | 1.3304 | 1.0204 | 0.3730 | 0.3739 |
| Jogging | 7.6925 | 1.3304 | 0 | 0.8013 | 1.3143 | 1.2329 |
| Jumping | 7.6735 | 1.0204 | 0.8013 | 0 | 1.0558 | 0.9272 |
| Going upstairs | 6.8843 | 0.3730 | 1.3143 | 1.0558 | 0 | 0.2886 |
| Going downstairs | 6.9700 | 0.3739 | 1.2329 | 0.9272 | 0.2886 | 0 |
3) Cluster the mutually selected activities according to the matrix M. Jogging and jumping select each other, and going upstairs and going downstairs select each other, which gives two preliminary activity groups, G1: {jogging, jumping} and G2: {going upstairs, going downstairs}; these four activities are removed from the activity set.
4) Two activities, standing and walking, now remain in the activity set. Searching the activity set, the activity most similar to standing is going upstairs, which belongs to activity group G2, so standing is added to G2 and removed from the activity set.
5) Repeat step 4) until the activity set is empty or the number of activities in it no longer changes. Walking is also closest to going upstairs (0.3730 in Table 1), so it likewise joins G2.
6) Output the activity group set G to complete the grouping of activities.
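Step 1) above does not say which solver is used for the sparse coding in formula (1). As a rough illustration only, the sketch below assumes an L1-regularised least-squares objective solved with ISTA (iterative soft thresholding) in plain Python; the function names, the penalty `lam`, and the iteration count are hypothetical choices, not taken from the patent. It also shows how nonzero coefficients could be counted, which is the raw quantity that the distance in formula (2) is described as using.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of t * |x|)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def sparse_code(a, D, lam=0.1, n_iter=200):
    """Approximately minimise 0.5*||a - D*alpha||^2 + lam*||alpha||_1 with ISTA.

    a: one sample (list of m floats); D: dictionary as m rows by n atom columns.
    The objective and solver are assumptions made for this sketch.
    """
    m, n = len(D), len(D[0])
    # The squared Frobenius norm of D upper-bounds the gradient's Lipschitz constant.
    L = sum(v * v for row in D for v in row) or 1.0
    alpha = [0.0] * n
    for _ in range(n_iter):
        residual = [a[r] - sum(D[r][c] * alpha[c] for c in range(n)) for r in range(m)]
        stepped = [alpha[c] + sum(D[r][c] * residual[r] for r in range(m)) / L
                   for c in range(n)]
        alpha = [soft_threshold(v, lam / L) for v in stepped]
    return alpha


def count_nonzero(alpha, tol=1e-12):
    """Number of coefficients that are not (numerically) zero."""
    return sum(1 for v in alpha if abs(v) > tol)
```

With an identity dictionary the result reduces to soft thresholding of the sample itself, which makes the sketch easy to sanity-check on small inputs.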
The final grouping obtained on the HASC-PAC2016 data set consists of 2 groups, as shown in Table 2:
TABLE 2 Grouping results for the HASC-PAC2016 data set
| Group | Activity set |
| --- | --- |
| First group | Jogging, jumping |
| Second group | Standing, walking, going upstairs, going downstairs |
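The grouping procedure of step one can be traced in a few lines of code. The following is a minimal sketch, assuming the distance matrix is given as a nested list and that ties (none occur in Table 1) are broken by taking the first minimum; the function and variable names are illustrative. Run on the Table 1 values, it reproduces the two groups of Table 2.

```python
def group_activities(names, M):
    """Group activities by mutual nearest neighbours, then attach the remainder."""
    n = len(names)
    # Most similar activity of each activity, ignoring the zero diagonal.
    nearest = [min((j for j in range(n) if j != i), key=lambda j: M[i][j])
               for i in range(n)]
    groups, group_of = [], {}
    # Mutually selected pairs seed the preliminary groups.
    for i in range(n):
        j = nearest[i]
        if i < j and nearest[j] == i:
            group_of[i] = group_of[j] = len(groups)
            groups.append([i, j])
    remaining = [i for i in range(n) if i not in group_of]
    # Attach an activity once its most similar activity is already grouped;
    # stop when nothing changes any more.
    changed = True
    while remaining and changed:
        changed = False
        for p in list(remaining):
            q = nearest[p]
            if q in group_of:
                group_of[p] = group_of[q]
                groups[group_of[q]].append(p)
                remaining.remove(p)
                changed = True
    # Anything still left forms one new group.
    if remaining:
        groups.append(remaining)
    return [[names[i] for i in g] for g in groups]


ACTIVITIES = ["standing", "walking", "jogging", "jumping", "upstairs", "downstairs"]
TABLE_1 = [
    [0, 6.9563, 7.6925, 7.6735, 6.8843, 6.9700],
    [6.9563, 0, 1.3304, 1.0204, 0.3730, 0.3739],
    [7.6925, 1.3304, 0, 0.8013, 1.3143, 1.2329],
    [7.6735, 1.0204, 0.8013, 0, 1.0558, 0.9272],
    [6.8843, 0.3730, 1.3143, 1.0558, 0, 0.2886],
    [6.9700, 0.3739, 1.2329, 0.9272, 0.2886, 0],
]
GROUPS = group_activities(ACTIVITIES, TABLE_1)
# GROUPS == [["jogging", "jumping"], ["upstairs", "downstairs", "standing", "walking"]]
```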
Step two, feature selection and training of the group classifier
According to Table 2, activities in the same group are treated as one class. Feature weights are learned and features are selected using formula (3), and the selected features are then used to train a group classifier that assigns an activity to one of the groups.
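Formula (3) is not reproduced in this text, so the exact weight cannot be shown. The sketch below only computes the two ingredients the description names, the overall variance of each feature and its variance within each class, and leaves the combination as a pluggable `weight_fn`. The `ratio_weight` function is a hypothetical stand-in (overall variance divided by mean within-class variance), not the patent's formula, and all names here are illustrative.

```python
from statistics import pvariance


def variance_terms(X, y):
    """For each feature, return (overall variance, {class: within-class variance})."""
    labels = sorted(set(y))
    terms = []
    for k in range(len(X[0])):
        overall = pvariance([row[k] for row in X])
        per_class = {c: pvariance([row[k] for row, lab in zip(X, y) if lab == c])
                     for c in labels}
        terms.append((overall, per_class))
    return terms


def ratio_weight(overall, per_class, eps=1e-12):
    # HYPOTHETICAL stand-in for formula (3): a feature scores high when it
    # varies a lot overall but little inside each class.
    return overall / (sum(per_class.values()) / len(per_class) + eps)


def select_features(X, y, weight_fn, m):
    """Return the indices of the m highest-weighted features and all weights."""
    weights = [weight_fn(overall, per_class)
               for overall, per_class in variance_terms(X, y)]
    ranked = sorted(range(len(weights)), key=lambda k: weights[k], reverse=True)
    return ranked[:m], weights
```

For the group classifier, `y` would hold group labels (activities in one group share a label); for an intra-group classifier, `X` and `y` would be restricted to the samples of that group.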
Step three, feature selection and training of the intra-group classifiers
Feature weight learning and feature selection are performed separately on each of the 2 groups using formula (3), so the features selected differ from group to group. An intra-group classifier is then trained on the features selected for its group and is used to classify the activity to be recognized as a specific activity.
The two-layer activity cluster recognition model trained on the HASC-PAC2016 data set is shown in FIG. 2.
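The way the two layers cooperate at prediction time can be sketched as follows. This is an illustration under stated assumptions: a tiny nearest-centroid classifier stands in for the SVM and random forest classifiers used in the experiments, the selected feature indices for each layer are taken as given inputs, and all class and function names are hypothetical.

```python
class NearestCentroid:
    """Tiny stand-in classifier that only looks at the selected feature indices."""

    def __init__(self, feature_idx):
        self.feature_idx = list(feature_idx)
        self.centroids = {}

    def _project(self, x):
        return [x[k] for k in self.feature_idx]

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            v = self._project(x)
            acc = sums.setdefault(label, [0.0] * len(v))
            for d, val in enumerate(v):
                acc[d] += val
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        v = self._project(x)
        return min(self.centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(v, self.centroids[c])))


def train_two_layer(X, y, groups, group_features, intra_features):
    """groups: {group: [activities]}; intra_features: {group: feature indices}."""
    activity_to_group = {a: g for g, acts in groups.items() for a in acts}
    # Layer 1 is trained on group labels, layer 2 on activities inside each group.
    layer1 = NearestCentroid(group_features).fit(X, [activity_to_group[a] for a in y])
    layer2 = {}
    for g, acts in groups.items():
        rows = [(x, a) for x, a in zip(X, y) if a in acts]
        layer2[g] = NearestCentroid(intra_features[g]).fit(
            [r[0] for r in rows], [r[1] for r in rows])
    return layer1, layer2


def predict_two_layer(x, layer1, layer2):
    group = layer1.predict_one(x)        # first layer: which activity group
    return layer2[group].predict_one(x)  # second layer: which activity in that group
```

Each layer sees only its own feature subset, which mirrors the point in the text that the features selected for the group classifier and for each intra-group classifier can differ.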
Experiments and results:
The invention aims to divide activities into groups according to their similarity on a large-scale data set, select more targeted features, and improve the accuracy of activity recognition. To measure the effectiveness of the method, experiments were run on the large-scale HASC-PAC2016 data set under an impersonal (subject-independent) setting with 5-fold cross-validation, using SVM and random forest as the base classifiers. The results are shown in Table 3:
TABLE 3 Experimental results of the two-layer activity cluster recognition model
| Method | HASC-PAC2016 |
| --- | --- |
| Single-layer SVM | 0.6037 |
| Two-layer SVM | 0.6379 |
| Single-layer Random forest | 0.7035 |
| Two-layer Random forest | 0.7441 |
As Table 3 shows, the two-layer activity cluster recognition model improves on the single-layer model by 0.0342 for SVM (0.6037 to 0.6379) and by 0.0406 for random forest (0.7035 to 0.7441), which indicates that the features selected by the two-layer classification model are more targeted. To verify the performance of the proposed feature selection method, three other common feature selection methods, Laplacian score, Relief-F, and MCFS, were compared with it on the two-layer classification model. The experimental results are shown in FIG. 3, with the number of selected features set to 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50. The feature selection method of the invention selects more important features than the other three methods and reaches satisfactory classification accuracy with fewer features.
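The text does not spell out how the impersonal 5-fold split was built. One common reading, assumed here, is that folds are formed over subjects rather than over individual samples, so no person contributes data to both the training and the test side of a fold. A minimal sketch with hypothetical names:

```python
def subject_folds(subject_ids, n_folds=5):
    """Yield (train_idx, test_idx) pairs in which each subject is tested exactly once."""
    subjects = sorted(set(subject_ids))
    # Assign whole subjects to folds in round-robin order (an arbitrary choice).
    fold_of = {s: i % n_folds for i, s in enumerate(subjects)}
    for fold in range(n_folds):
        test = [i for i, s in enumerate(subject_ids) if fold_of[s] == fold]
        train = [i for i, s in enumerate(subject_ids) if fold_of[s] != fold]
        yield train, test
```

Here `subject_ids` holds one subject identifier per sample; the returned index lists can be used to slice the feature matrix and labels before training either layer.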

Claims (2)

1. A two-layer activity cluster identification method for large-scale data sets, characterized by comprising the following steps:
1) activity clustering based on sparse coding
1.1) performing sparse coding on the sample data according to formula (1) and solving for the sparse coefficient $\alpha$, where $A$ is the training data, $D$ is the dictionary, and $\alpha$ is the sparse coefficient to be solved;
$A = D\alpha$ (1)
1.2) calculating the distance between different activity categories according to formula (2) to obtain a matrix $M$ of size $n \times n$, where $n$ is the number of activity categories;
[Formula (2): equation image FDA0001678822440000011, not reproduced in this text]
where $\Delta_{i,j}$ is the distance between activities $A_i$ and $A_j$ (the smaller the distance, the more similar the activities), $f$ is the number of features, $N_{i,k}$ is the number of nonzero coefficients on the $k$-th feature in the sparse coefficients $\alpha_{i,k}$ solved in step 1.1), and $S_i$ is the number of samples of activity $A_i$;
1.3) according to the matrix $M$, clustering mutually selected activities $A_i$ and $A_j$ into the same activity group $G_k$ to obtain a preliminary activity group set $G = \{G_1, G_2, \ldots, G_k\}$, and removing $A_i$ and $A_j$ from the activity set $A$;
1.4) searching the activity set $A$: for each $A_p \in A$, querying the matrix $M$ for its most similar activity $A_q$; if $A_q$ already belongs to a group in $G$, adding $A_p$ to the activity group containing $A_q$ and removing $A_p$ from the activity set $A$;
1.5) repeating step 1.4) until the activity set $A$ is empty or the number of activities in $A$ no longer changes;
1.6) if $A$ is not empty, clustering all activities remaining in $A$ into a new activity group $G_m$ and adding $G_m$ to $G$;
1.7) outputting the activity group set $G$ to complete the grouping of activities;
2) feature selection and training of the group classifier
2.1) based on the activity group set obtained in step 1), treating all activities in the same activity group as one class and performing feature selection according to formula (3):
[Formula (3): equation image FDA0001678822440000012, not reproduced in this text]
where $W_k$ is the weight of the $k$-th feature, $\mathrm{var}(f_k)$ is the variance of the $k$-th feature, and $\mathrm{var}(f_{k,i})$ is the variance of the $k$-th feature over activity $i$;
2.2) training the group classifier on the features selected in step 2.1) to obtain the first-layer classifier, which assigns an activity to one of the activity groups;
3) feature selection and training of the intra-group classifiers
3.1) for each activity group, performing feature selection separately within the group using formula (3);
3.2) training an intra-group classifier for each activity group on the features selected for that group to obtain the second-layer classifiers, which assign the activity to its final specific activity class.
2. The large-scale data set-oriented two-layer activity cluster identification method according to claim 1, wherein in step 1.3), mutual selection means that the most similar activity of $A_i$ is $A_j$ and the most similar activity of $A_j$ is $A_i$.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810538902.9A CN108875597B (en) 2018-05-30 2018-05-30 Large-scale data set-oriented two-layer activity cluster identification method

Publications (2)

Publication Number Publication Date
CN108875597A 2018-11-23
CN108875597B 2021-03-30

Family

ID=64335735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810538902.9A Active CN108875597B (en) 2018-05-30 2018-05-30 Large-scale data set-oriented two-layer activity cluster identification method

Country Status (1)

Country Link
CN (1) CN108875597B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110751115B (en) * 2019-10-24 2021-01-01 北京金茂绿建科技有限公司 Non-contact human behavior identification method and system

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010137325A1 (en) * 2009-05-27 2010-12-02 パナソニック株式会社 Behavior recognition device
CN103440471A (en) * 2013-05-05 2013-12-11 西安电子科技大学 Human body action identifying method based on lower-rank representation
CN103268495A (en) * 2013-05-31 2013-08-28 公安部第三研究所 Human body behavioral modeling identification method based on priori knowledge cluster in computer system
CN103473539A (en) * 2013-09-23 2013-12-25 智慧城市系统服务(中国)有限公司 Gait recognition method and device
CN104732208A (en) * 2015-03-16 2015-06-24 电子科技大学 Video human action reorganization method based on sparse subspace clustering
CN106326906A (en) * 2015-06-17 2017-01-11 姚丽娜 Activity identification method and device
CN105868779A (en) * 2016-03-28 2016-08-17 浙江工业大学 Method for identifying behavior based on feature enhancement and decision fusion
CN106203484A (en) * 2016-06-29 2016-12-07 北京工业大学 A kind of human motion state sorting technique based on classification layering

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
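Complex human activities recognition using interval temporal syntactic model; XIA Li-min et al.; J. Cent. South Univ.; 2016-12-31; pp. 2578-2586 *
A survey of human activity recognition based on wearable sensors; ZHENG Zengwei et al.; Journal of Computer Applications; 2018-05-10; Vol. 38, No. 5; pp. 1223-1229 *
Support vector machine model based on hierarchical K-means clustering; WANG Xiuhua et al.; Computer Applications and Software; 2014-05-31; Vol. 31, No. 5; pp. 172-176 *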

Also Published As

Publication number Publication date
CN108875597A (en) 2018-11-23

Similar Documents

Publication Publication Date Title
CN104573046B (en) A kind of comment and analysis method and system based on term vector
CN105005589B (en) A kind of method and apparatus of text classification
CN102289522B (en) Method of intelligently classifying texts
CN109635936A (en) A kind of neural networks pruning quantization method based on retraining
CN109446332B (en) People reconciliation case classification system and method based on feature migration and self-adaptive learning
CN100595780C (en) Handwriting digital automatic identification method based on module neural network SN9701 rectangular array
CN108595913A (en) Differentiate the supervised learning method of mRNA and lncRNA
CN109063719B (en) Image classification method combining structure similarity and class information
CN108197643B (en) Transfer learning method based on unsupervised clustering and metric learning
CN110046634B (en) Interpretation method and device of clustering result
CN103942568A (en) Sorting method based on non-supervision feature selection
CN113407660B (en) Unstructured text event extraction method
CN107545033B (en) Knowledge base entity classification calculation method based on representation learning
CN109858034A (en) A kind of text sentiment classification method based on attention model and sentiment dictionary
CN107895000A (en) A kind of cross-cutting semantic information retrieval method based on convolutional neural networks
CN112231477A (en) Text classification method based on improved capsule network
CN103617203B (en) Protein-ligand bindings bit point prediction method based on query driven
CN110569920A (en) prediction method for multi-task machine learning
CN110728144B (en) Extraction type document automatic summarization method based on context semantic perception
CN104809474B (en) Large data based on adaptive grouping multitiered network is intensive to subtract method
CN106844328A (en) A kind of new extensive document subject matter semantic analysis and system
CN104008187A (en) Semi-structured text matching method based on the minimum edit distance
CN111507224A (en) CNN facial expression recognition significance analysis method based on network pruning
CN108470025A (en) Partial-Topic probability generates regularization own coding text and is embedded in representation method
CN108875597B (en) Large-scale data set-oriented two-layer activity cluster identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220708

Address after: 310015 No. 51, Huzhou street, Hangzhou, Zhejiang

Patentee after: HANGZHOU City University

Address before: 310015 No. 50 Huzhou Street, Hangzhou City, Zhejiang Province

Patentee before: Zhejiang University City College
