CN110781961A - Accurate behavior identification method based on decision tree classification algorithm - Google Patents
- Publication number
- CN110781961A (application CN201911025926.5A)
- Authority
- CN
- China
- Prior art keywords
- base station
- data
- information
- behavior
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Signal Processing (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
The invention discloses an accurate behavior identification method based on a decision tree classification algorithm, which comprises the following steps: S1, data collection: solving for the tag position using the TDOA-based Chan algorithm; S2, feature extraction: selecting feature values and partitioning the feature value intervals; S3, behavior recognition: establishing a user behavior recognition model. Experimental results show that the algorithm has good daily behavior recognition performance under specific conditions.
Description
Technical Field
The invention relates to a behavior recognition method, in particular to an accurate behavior recognition method based on a decision tree classification algorithm.
Background
At present, with the continuous progress of wireless technologies such as Bluetooth, ZigBee, UWB (ultra-wideband) and wireless networks, the rapid development of ad hoc networks, wireless sensor networks and the Internet of Things has attracted extensive attention in academia. Behavior recognition is widely regarded as an indispensable key technology. Behavior recognition information supports early warning before a network emergency, decision-making during it, and handling afterwards. Behavior recognition technology plays a decisive role in the further development of wireless networks, so research on it is important.
Indoors, the signal propagation environment is more complicated than outdoors, so it is difficult to accurately estimate parameters such as signal arrival time or arrival angle. However, with the continuous progress of wireless sensor networks (WSNs), academic research is no longer limited to traditional indoor positioning and location awareness. Radio-based position sensing is now used in many fields, and position sensing based on UWB (ultra-wideband) radar systems has developed and matured most prominently. One reference proposes a UWB channel model based on human body occlusion and studies the influence of human occlusion on TOA ranging errors by measuring and analyzing those errors. Novel indoor positioning technology based on commercial Wi-Fi equipment also shows good prospects in applications such as indoor intrusion detection, campus security, staff detection in shopping malls, patient monitoring, and real-time monitoring of the elderly and children at home.
The 2.4 GHz wireless network band is close to that of Bluetooth, and positioning in this band is also affected by the environment, so the data become inaccurate when obstacles or electromagnetic interference are encountered. Compared with other positioning technologies, UWB has long been a research hotspot in radio frequency communication at home and abroad. Many existing behavior awareness methods rely on image processing: low-level features are extracted from image information to identify human motion and construct human motion patterns. The disadvantage is that a large number of feature values must be extracted, and the safety and privacy of the user are seriously threatened. Therefore, more and more technologies replace image processing with sensors that are small, cheap and simple to deploy, and resistant to interference. Among existing algorithms, one adopts big data parallel classification to guide the power supply mode but does not consider energy consumption. Some methods refine the original classification data and propose a maximum attribute index algorithm for concept refinement within the same level, using a hierarchical geometric distribution mechanism between levels to allocate the privacy budget more reasonably; however, data distribution cannot be implemented in a dynamic data environment. Some methods improve the objective function of the decision tree generation algorithm so that inconsistent data can be classified, and adjust the function's influence factors directly so that node splitting is more accurate and classification is better. Some methods select a compression strategy based on HBase data classification, but the data processing is relatively complex.
Disclosure of Invention
The invention mainly aims to provide an accurate behavior identification method based on a decision tree classification algorithm.
The technical scheme adopted by the invention is as follows: an accurate behavior identification method based on a decision tree classification algorithm comprises the following steps:
S1, data collection: solving for the tag position using the TDOA-based Chan algorithm;
S2, feature extraction: selecting feature values and partitioning the feature value intervals;
S3, behavior recognition: establishing a user behavior recognition model.
Further, the step S1 specifically includes:
considering positioning accuracy and equipment cost, 4 base stations is a suitable choice; in a two-dimensional rectangular coordinate system, the coordinate of the i-th base station is $B_{uwb,i} = [x_i, y_i]^T$ $(i = 1, 2, \ldots, 5)$, the tag coordinate is $T_{uwb} = [x_0, y_0]^T$, and the distance between the i-th base station and the tag is $R_i = \|B_{uwb,i} - T_{uwb}\|_2$ $(i = 1, 2, \ldots, 5)$; using the first base station as the common reference node, a set of TDOA observations $\Delta t_{i,1}$ $(i = 2, 3, 4, 5)$ is obtained, where $\Delta t_{i,1}$ is the signal arrival time difference between the i-th base station and the first base station;
in this model (equation (1)), the observation $\Delta t_{i,1}$ comprises its true value, the error $n_{i,1}$ introduced by system measurement, and the NLOS error $n_{NLOS,i}$;
let the signal propagation speed be c; the difference $R_{i,1}$ between the tag's distance to the i-th base station and its distance to the first base station is then:
$$R_{i,1} = c \cdot \Delta t_{i,1} \quad (i = 2, 3, 4, 5) \tag{2}$$
according to the properties of hyperbolas, three hyperbolic equations $R_{i,1} = R_i - R_1$ $(i = 2, 3, 4, 5)$ are established, from which a system of equations in $T_{uwb}$ can be set up, as shown in equation (3);
a 4-base-station tag positioning framework is adopted, in which one base station acts as the master and the remaining 3 act as slaves; when a tester carrying the positioning tag enters the test area, the signal emitted by the tag is received by one or more sensors; the slave sensors decode the signal and send the angle-of-arrival and timing information to the master sensor; the master sensor collects all the information sent by the base stations and calculates the position of the tag, which completes data collection; the sensor then sends the data through the switch to the server once per second as UDP packets, and by receiving these packets the server obtains the X, Y coordinates of each tag.
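The server side of the data collection step can be sketched as a small UDP receiver. This is only an illustration: the patent does not specify the packet layout, so the ASCII payload `tag_id,x,y`, the port number 9000 and the function names `parse_packet` and `receive_positions` are all assumptions.

```python
import socket

def parse_packet(payload: bytes):
    """Parse a hypothetical 'tag_id,x,y' ASCII payload into (tag_id, x, y)."""
    tag_id, x, y = payload.decode("ascii").strip().split(",")
    return tag_id, float(x), float(y)

def receive_positions(host="0.0.0.0", port=9000, count=1):
    """Receive `count` UDP datagrams and return the parsed tag positions.

    The host, port and payload format are assumptions for illustration.
    """
    positions = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        for _ in range(count):
            payload, _addr = sock.recvfrom(1024)
            positions.append(parse_packet(payload))
    return positions
```

Since the sensors report once per second, a real deployment would likely call the receiver in a loop and timestamp each position on arrival.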
Further, the step S2 specifically includes:
the feature value selection includes: position division; height processing of the head, shoulders, waist and knees; and the distance moved per unit time by the head, shoulders, waist and knee joints;
the position division includes:
in real life, the position of a user is related to the user's behavior and activity; spatial positions are divided into three categories: the first is places where the user can sit or lie down to rest; the second is the area within 0.1-0.3 meters of an object, the exact distance depending on the object, denoted Da; the remaining space is the third, denoted La;
the height processing of the head, shoulders, waist and knees covers head height, shoulder height, waist height and knee height;
the Z-axis data of the tags on the user's head, shoulders, waist and knee joints represent their heights in space and are read directly from the tag coordinates;
the distance moved per unit time covers the head, shoulders, waist and knee joints;
directly calculating the distance moved per unit time by the user's head, shoulders, waist and knee joints is difficult, mainly because the unit time is hard to choose; if the unit time is too large, the computed displacement cannot accurately describe the user's behavior, which reduces recognition accuracy; if it is too small, the amount of computation grows and the delay overhead increases; through multiple experiments, an optimal unit time LS that balances accuracy against computation delay is obtained;
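The three-way position division can be sketched as a lookup against a floor plan. The sketch below assumes that rest places and other objects are modeled as axis-aligned rectangles `(xmin, ymin, xmax, ymax)` and uses a single margin of 0.3 m; the label "Ra" for the first category, the rectangle model and the function names are assumptions made for illustration.

```python
import math

def distance_to_rect(x, y, rect):
    """Shortest distance from point (x, y) to an axis-aligned rectangle (0 if inside)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return math.hypot(dx, dy)

def classify_position(x, y, rest_areas, objects, margin=0.3):
    """Return 'Ra', 'Da' or 'La' for a tag position.

    rest_areas: rectangles where the user can sit or lie (first category).
    objects:    rectangles of other objects; positions within `margin`
                meters of one fall in the second category (Da).
    Everything else is the remaining space (La).
    """
    if any(distance_to_rect(x, y, r) == 0.0 for r in rest_areas):
        return "Ra"
    if any(distance_to_rect(x, y, o) <= margin for o in objects):
        return "Da"
    return "La"
```

Because the text says the 0.1-0.3 m distance depends on the object, a fuller version would store one margin per object instead of a single `margin` argument.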
the partition eigenvalue interval includes:
after the above feature values are determined, classification boundaries must be determined to ensure similarity within classes and differences between classes; based on experiments, the test samples are processed using a layered (level-based) classification method; the key question is how to determine the boundaries of each level; currently, two algorithms are used to determine the boundary values: equal length and equal distribution;
let the range of feature values be $\phi = [c_{min}, c_{max}]$, divided into N levels labeled 1 to N; from $\phi = [c_{min}, c_{max}]$, the span of the sensor values $R = c_{max} - c_{min}$ can be found; to ensure that every interval has the same length, the length of each interval is calculated as $r = R/N$; thus, the value range of each interval can be determined.
Further, the step S3 specifically includes:
assuming that D is the set of training tuples divided by class, the entropy of D is expressed as:
$$info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i \tag{4}$$
where m is the number of classes and $p_i$ is the probability that the i-th class appears in the entire training set, which can be estimated by dividing the number of tuples belonging to that class by the total number of tuples in the training set; in practical terms, the entropy is the average amount of information needed to identify the class label of a tuple in D;
assuming that the training tuples D are partitioned by attribute A into subsets $D_1, \ldots, D_v$, the expected information of the partition of D is:
$$info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, info(D_j) \tag{5}$$
the information gain is the difference between the two:
$$gain(A) = info(D) - info_A(D) \tag{6}$$
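A small worked example may help, assuming the standard Shannon form of the entropy and hypothetical counts: D holds 8 tuples, 4 labeled sitting and 4 labeled standing, and attribute A splits D into two subsets of 4 tuples, each containing 3 tuples of one class and 1 of the other.

```latex
\begin{aligned}
info(D) &= -\tfrac{4}{8}\log_2\tfrac{4}{8} - \tfrac{4}{8}\log_2\tfrac{4}{8} = 1 \\
info(D_1) = info(D_2) &= -\tfrac{3}{4}\log_2\tfrac{3}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \approx 0.811 \\
info_A(D) &= \tfrac{4}{8}\,info(D_1) + \tfrac{4}{8}\,info(D_2) \approx 0.811 \\
gain(A) &= info(D) - info_A(D) \approx 0.189
\end{aligned}
```

An attribute that separated the two classes perfectly would give an expected information of 0 and a gain of 1, the maximum possible for this D.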
establishing a user behavior recognition model; the specific behavior identification steps are as follows:
S31, collect the behaviors in the training data set by class, and compute the entropy $info(D)$ of the training set from the class-divided training tuples;
S32, extract the position and height feature values from the preprocessed data, calculate the feature value intervals and partition the feature values;
S33, for the feature values partitioned in step S32, obtain the expected information $info_A(D)$;
S34, obtain the information gain $gain(A)$ as the difference between the entropy and the expected information;
the maximum gain is output and recorded; the result corresponding to this maximum gain is then the recognized behavior information.
The invention has the advantages that:
the invention discloses a method for realizing behavior recognition based on a decision tree algorithm. Experimental results show that the algorithm has good daily behavior recognition performance under specific conditions.
In addition to the objects, features and advantages described above, other objects, features and advantages of the present invention are also provided. The present invention will be described in further detail below with reference to the drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention.
FIG. 1 is a diagram of a behavior recognition framework of the present invention;
FIG. 2 is a sectional view of a user's body part awaiting testing in accordance with the present invention;
FIG. 3 is an experimental environment plan of the present invention;
FIG. 4 is a comparison graph of the behavior recognition error rates of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, an accurate behavior identification method based on a decision tree classification algorithm includes the following steps:
S1, data collection: solving for the tag position using the TDOA-based Chan algorithm;
S2, feature extraction: selecting feature values and partitioning the feature value intervals;
S3, behavior recognition: establishing a user behavior recognition model.
The step S1 specifically includes:
considering positioning accuracy and equipment cost, 4 base stations is a suitable choice; in a two-dimensional rectangular coordinate system, the coordinate of the i-th base station is $B_{uwb,i} = [x_i, y_i]^T$ $(i = 1, 2, \ldots, 5)$, the tag coordinate is $T_{uwb} = [x_0, y_0]^T$, and the distance between the i-th base station and the tag is $R_i = \|B_{uwb,i} - T_{uwb}\|_2$ $(i = 1, 2, \ldots, 5)$; using the first base station as the common reference node, a set of TDOA observations $\Delta t_{i,1}$ $(i = 2, 3, 4, 5)$ is obtained, where $\Delta t_{i,1}$ is the signal arrival time difference between the i-th base station and the first base station;
in this model (equation (1)), the observation $\Delta t_{i,1}$ comprises its true value, the error $n_{i,1}$ introduced by system measurement, and the NLOS error $n_{NLOS,i}$;
let the signal propagation speed be c; the difference $R_{i,1}$ between the tag's distance to the i-th base station and its distance to the first base station is then:
$$R_{i,1} = c \cdot \Delta t_{i,1} \quad (i = 2, 3, 4, 5) \tag{2}$$
according to the properties of hyperbolas, three hyperbolic equations $R_{i,1} = R_i - R_1$ $(i = 2, 3, 4, 5)$ are established, from which a system of equations in $T_{uwb}$ can be set up, as shown in equation (3).
A 4-base-station tag positioning framework is adopted, in which one base station acts as the master and the remaining 3 act as slaves; when a tester carrying the positioning tag enters the test area, the signal emitted by the tag is received by one or more sensors; the slave sensors decode the signal and send the angle-of-arrival and timing information to the master sensor; the master sensor collects all the information sent by the base stations and calculates the position of the tag, which completes data collection; the sensor then sends the data through the switch to the server once per second as UDP packets, and by receiving these packets the server obtains the X, Y coordinates of each tag.
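The positioning step can be illustrated with a minimal sketch. It implements only the linearization on which Chan's method is built: squaring $R_i = R_{i,1} + R_1$ and subtracting the expression for $R_1^2$ gives one linear equation per non-reference base station in the unknowns $(x_0, y_0, R_1)$. The noise weighting and the second refinement stage of the full Chan algorithm are omitted, the propagation speed is assumed to be the speed of light, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # assumed signal propagation speed c (speed of light, m/s)

def solve3(a, b):
    """Solve a 3x3 linear system a*z = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate geometry: system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    z = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        s = sum(m[r][k] * z[k] for k in range(r + 1, 3))
        z[r] = (m[r][3] - s) / m[r][r]
    return z

def locate_tag(stations, tdoa):
    """Estimate the tag position from 4 base stations and 3 TDOA values.

    stations: [(x1, y1), ..., (x4, y4)], the first being the reference node.
    tdoa:     [dt_21, dt_31, dt_41], arrival time differences in seconds.
    Each row encodes
        2(xi - x1) x + 2(yi - y1) y + 2 R_i1 R_1 = K_i - K_1 - R_i1^2,
    with K_i = xi^2 + yi^2 and R_i1 = c * dt_i1.
    """
    x1, y1 = stations[0]
    k1 = x1 * x1 + y1 * y1
    rows, rhs = [], []
    for (xi, yi), dt in zip(stations[1:], tdoa):
        r_i1 = C * dt
        rows.append([2 * (xi - x1), 2 * (yi - y1), 2 * r_i1])
        rhs.append(xi * xi + yi * yi - k1 - r_i1 * r_i1)
    x, y, _r1 = solve3(rows, rhs)
    return x, y
```

With noise-free time differences this recovers the tag exactly; with measurement and NLOS errors it gives only a first estimate. The system becomes singular for some geometries, for example when the tag is equidistant from all four base stations.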
The step S2 specifically includes:
the feature value selection includes: position division; height processing of the head, shoulders, waist and knees; and the distance moved per unit time by the head, shoulders, waist and knee joints;
the position division includes:
in real life, the position of a user is related to the user's behavior and activity; for example, on a sofa the user may be sitting or lying; in a hallway the user may be walking or running, and is more likely to fall there than near household items; therefore, spatial positions are divided into three categories: the first is places where the user can sit or lie down to rest, such as a sofa, bed or chair, denoted Ra (Rest Area); the second is the area within 0.1-0.3 meters of an object, the exact distance depending on the object, denoted Da (Distance Area); the remaining space is the third, denoted La; the division of the user's body parts to be measured is shown in FIG. 2;
the height processing of the head, shoulders, waist and knees covers head height, shoulder height, waist height and knee height;
the Z-axis data of the tags on the user's head, shoulders, waist and knee joints represent their heights in space and can be read directly from the tag coordinates;
the distance moved per unit time covers the head, shoulders, waist and knee joints;
directly calculating the distance moved per unit time by the user's head, shoulders, waist and knee joints is difficult, mainly because the unit time is hard to choose; if the unit time is too large, the computed displacement cannot accurately describe the user's behavior, which reduces recognition accuracy; if it is too small, the amount of computation grows and the delay overhead increases; through multiple experiments, an optimal unit time LS that balances accuracy against computation delay is obtained;
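The distance-per-unit-time feature can be sketched as follows, assuming one position sample per second for each tag (matching the once-per-second reporting described for data collection) and a unit time expressed as a whole number of samples. The text does not state the value of LS, so the `window` argument and the function name are assumptions.

```python
import math

def displacement_per_window(positions, window):
    """Straight-line distance moved in each consecutive unit-time window.

    positions: one (x, y, z) sample per second for a single tag.
    window:    unit time in samples; a large window hides short movements,
               a small one produces more values to compute.
    """
    return [math.dist(positions[i], positions[i + window])
            for i in range(0, len(positions) - window, window)]
```

The same function would be applied separately to the head, shoulder, waist and knee tracks, and the resulting distances then mapped to levels like the other feature values.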
the partition eigenvalue interval includes:
after the above feature values are determined, classification boundaries must be determined to ensure similarity within classes and differences between classes; based on experiments, the test samples are processed using a layered (level-based) classification method; the key question is how to determine the boundaries of each level; currently, two algorithms are used to determine the boundary values: equal length and equal distribution;
let the range of feature values be $\phi = [c_{min}, c_{max}]$, divided into N levels labeled 1 to N; from $\phi = [c_{min}, c_{max}]$, the span of the sensor values $R = c_{max} - c_{min}$ can be found; to ensure that every interval has the same length, the length of each interval is calculated as $r = R/N$; thus, the value range of each interval can be determined. For example, the value range of the i-th interval is $[c_{min} + (i-1)r,\; c_{min} + ir]$.
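The equal-length partition translates directly into code. The sketch below computes the interval bounds from the range and the level count and maps a value to its 1-based level; the function names are illustrative, and assigning a value that lies exactly on a boundary to the upper interval is a convention chosen here, not stated in the text.

```python
def interval_bounds(c_min, c_max, n):
    """Bounds [c_min + (i-1) r, c_min + i r] of the n equal-length intervals."""
    r = (c_max - c_min) / n
    return [(c_min + (i - 1) * r, c_min + i * r) for i in range(1, n + 1)]

def level_of(value, c_min, c_max, n):
    """Return the 1-based level (1..n) of a value in [c_min, c_max]."""
    if not c_min <= value <= c_max:
        raise ValueError("value outside the feature range")
    r = (c_max - c_min) / n
    # c_max itself would compute as level n + 1, so clamp it to n.
    return min(int((value - c_min) / r) + 1, n)
```

For example, a height range of 0 to 2 meters split into 4 levels gives intervals of length 0.5, and a knee height of 0.7 m falls in level 2.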
The step S3 specifically includes:
the decision tree algorithm is a method for approximating discrete-valued functions and a typical classification method; the basic idea is to preprocess the data, use an induction algorithm to generate readable rules and a decision tree, and then use the tree to analyze new data; essentially, a decision tree classifies data through a series of rules; common decision tree classification algorithms include ID3, C4.5 and CART; the smaller the expected information, the larger the information gain and the higher the purity, and the core idea of the ID3 algorithm is to select the splitting attribute by information gain; here the ID3 algorithm is used for recognition. Assuming that D is the set of training tuples divided by class, the entropy of D is expressed as:
$$info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i \tag{4}$$
where m is the number of classes and $p_i$ is the probability that the i-th class appears in the entire training set, which can be estimated by dividing the number of tuples belonging to that class by the total number of tuples in the training set; in practical terms, the entropy is the average amount of information needed to identify the class label of a tuple in D;
now assuming that the training tuples D are partitioned by attribute A into subsets $D_1, \ldots, D_v$, the expected information of the partition of D is:
$$info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, info(D_j) \tag{5}$$
the information gain is the difference between the two:
$$gain(A) = info(D) - info_A(D) \tag{6}$$
each time a split is required, the ID3 algorithm calculates the information gain of each attribute and selects the attribute with the largest gain for splitting; therefore, finding the maximum gain yields the best split;
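The attribute selection described above can be sketched in a few lines, assuming the standard ID3 definitions of entropy, expected information and information gain. The tuple representation (a dict per tuple) and the function names are illustrative choices.

```python
import math
from collections import Counter

def info(labels):
    """Entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_after_split(rows, labels, attr):
    """Expected information after partitioning the tuples by attribute attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return sum(len(g) / len(labels) * info(g) for g in groups.values())

def gain(rows, labels, attr):
    """Information gain of attr: entropy minus expected information."""
    return info(labels) - info_after_split(rows, labels, attr)

def best_attribute(rows, labels, attrs):
    """Attribute with the largest information gain."""
    return max(attrs, key=lambda a: gain(rows, labels, a))
```

In a toy set where the height level alone determines whether the user is lying or standing, the height attribute has a gain of 1 bit and the area attribute a gain of 0, so height is selected.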
based on the above analysis, the abstract description of user behavior is a behavior classification model, which is also the basis of behavior recognition; a user behavior recognition model is established on the basis of the decision tree classification algorithm; the specific behavior recognition steps are as follows:
S31, collect the behaviors in the training data set by class, and compute the entropy $info(D)$ of the training set from the class-divided training tuples;
S32, extract the position and height feature values from the preprocessed data, calculate the feature value intervals and partition the feature values;
S33, for the feature values partitioned in step S32, obtain the expected information $info_A(D)$;
S34, obtain the information gain $gain(A)$ as the difference between the entropy and the expected information;
the maximum gain is output and recorded; the result corresponding to this maximum gain is then the recognized behavior information.
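Steps S31 to S34 can be put together as a compact ID3 sketch: the tree is grown by repeatedly splitting on the attribute with the largest information gain, and a new sample is classified by following its feature values down the tree. The feature names (`area`, `height`, `move`), the toy training tuples and the majority-class fallback for unseen values are assumptions for illustration, not data or details from the patent.

```python
import math
from collections import Counter

def info(labels):
    """Entropy of the class labels (S31)."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total)
                for n in Counter(labels).values())

def gain(rows, labels, attr):
    """Information gain of attr (S33 and S34)."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    expected = sum(len(g) / len(labels) * info(g) for g in groups.values())
    return info(labels) - expected

def build_tree(rows, labels, attrs):
    """Grow an ID3 tree; leaves are behavior labels, inner nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain(rows, labels, a))
    node = {"attr": best, "children": {}, "default": majority}
    rest = [a for a in attrs if a != best]
    for value in {row[best] for row in rows}:
        sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in sub], [l for _, l in sub], rest)
    return node

def classify(tree, row):
    """Follow the tree; fall back to the majority class on unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical, already discretized training tuples (S32).
ROWS = [
    {"area": "Ra", "height": 2, "move": 1},
    {"area": "La", "height": 3, "move": 1},
    {"area": "Ra", "height": 1, "move": 1},
    {"area": "La", "height": 1, "move": 1},
    {"area": "La", "height": 3, "move": 2},
    {"area": "La", "height": 3, "move": 3},
]
LABELS = ["sitting", "standing", "lying", "falling", "walking", "running"]
TREE = build_tree(ROWS, LABELS, ["area", "height", "move"])
```

On this toy set the root split is on the height level, which has the largest gain; lying and falling are then separated by area, and standing, walking and running by the movement level.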
To verify the performance of the algorithm in behavior recognition, scenario experiments were performed in three scenarios selected from offices, meeting rooms and laboratories. The resulting behavior recognition accuracy was compared with that of other algorithms. In the experiment, the researcher held the positioning tag and simulated six basic actions: sitting, standing, falling, lying, walking and running. The experimental environment plan is shown in FIG. 3:
During measurement, because the tag is hand-held while the behaviors are simulated and the sensors are not perfectly stable, much data that does not match the actual behavior may need to be eliminated during preprocessing. The resulting valid data are shown in Table 1.
TABLE 1 Valid data set
The Area feature takes three values, corresponding to the areas for sitting, lying and walking, respectively. Y represents the user behavior: 1 represents sitting, 2 standing, 3 falling, 4 lying, 5 walking, and 6 running. Height and distance are in meters.
Through algorithm analysis, the recognition accuracy of each behavior is evaluated in three cases. In the first case, all feature values are used, and the recognition results are shown in Table 2; in the second case, the position feature value is excluded, and the recognition results are shown in Table 3; in the third case, the height feature value is excluded, and the recognition results are shown in Table 4.
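The per-behavior accuracy reported in such tables can be computed as sketched below, using the behavior coding given above (1 sitting through 6 running). The function name is illustrative, and any labels fed to it in an example are hypothetical rather than the patent's experimental data.

```python
from collections import defaultdict

def per_behavior_accuracy(true_labels, predicted_labels):
    """Percentage of correctly recognized samples for each true behavior."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        hits[truth] += truth == guess
    return {b: 100.0 * hits[b] / totals[b] for b in totals}
```

Running the same evaluation three times, with all features, without the position feature and without the height feature, gives the three cases compared in the text.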
TABLE 2 behavior accuracy (%)
As can be seen from Table 2, standing and lying are the most easily identified behaviors, because the tag data for these two states have obvious characteristics: height and position do not change significantly, i.e. the distance moved per unit time is 0. Sitting and walking are recognized less well, because the person's position and height change only slightly and the feature values are close together. Behaviors with a larger range of variation are, in turn, better differentiated.
TABLE 3 behavior recognition accuracy (%), excluding position feature value
As can be seen from Table 3, the recognition rate of every behavior is lower than when all feature values are used; standing and lying are still the most easily recognized; standing and sitting are distinguished poorly, because they depend on the position feature value; however, falling and running are still recognized clearly.
TABLE 4 behavior recognition accuracy (%), excluding height eigenvalue
Table 4 shows that standing and lying are still the most easily recognized, because these two states are less affected by the height feature value, while the recognition rates of walking and sitting drop significantly.
A comparison of Tables 3 and 4 shows that the height feature value has a significantly greater influence on recognition accuracy than the position feature value. To further demonstrate the performance of the decision tree algorithm, the invention compares it with the naive Bayesian network (NBN), random forest (RF) and K-nearest neighbor (KNN) algorithms. The behavior recognition error rates for data set sizes D = 50, 100, 150 and 200 are compared in FIG. 4:
According to FIG. 4, the error rate of the decision tree algorithm (DR) is significantly lower than those of the other three algorithms, and as the data set grows its performance improves and its error rate falls further. When the sensors strongly affect the algorithms, the KNN algorithm has the largest error rate; when D is 100, KNN is similar to RF and the KNN error rate is the largest; when D is 100, the RF error rate is the largest, followed by NBN; when D is 200, the RF error rate is the largest.
The invention discloses a method for realizing behavior recognition based on a decision tree algorithm. Experimental results show that the algorithm has good daily behavior recognition performance under specific conditions.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (4)
1. An accurate behavior identification method based on a decision tree classification algorithm is characterized by comprising the following steps:
S1, data collection: solving for the tag position using the TDOA-based Chan algorithm;
S2, feature extraction: selecting feature values and partitioning the feature value intervals;
S3, behavior recognition: establishing a user behavior recognition model.
2. The method for identifying an accurate behavior based on a decision tree classification algorithm according to claim 1, wherein the step S1 specifically comprises:
selecting 4 base stations; in a two-dimensional rectangular plane coordinate system, the coordinate of the ith base station is $B_{uwb,i} = [x_i, y_i]^T$ ($i = 1, 2, \ldots, 5$) and the tag coordinate is $T_{uwb} = [x_0, y_0]^T$; the non-line-of-sight distance between the ith base station and the tag is $R_i = \|B_{uwb,i} - T_{uwb}\|_2$ ($i = 1, 2, \ldots, 5$); using the first base station as the common reference node, a set of TDOA observations $\Delta t_{i,1}$ ($i = 2, 3, 4, 5$) is obtained, where $\Delta t_{i,1}$ denotes the signal arrival time difference between the ith base station and the first base station;
in this model, the observation $\Delta t_{i,1}$ is made up of its true value, the system measurement error $n_{i,1}$, and the NLOS error $n_{NLOS,i}$;
letting the signal propagation speed be c, $R_{i,1}$, the difference between the distance from the tag to the ith base station and the distance from the tag to the first base station, is calculated as:
$R_{i,1} = c \cdot \Delta t_{i,1}$ ($i = 2, 3, 4, 5$) (2)
according to the properties of hyperbolas, three hyperbolic equations $R_{i,1} = R_i - R_1$ ($i = 2, 3, 4, 5$) are established, from which the equation for $T_{uwb}$ shown in equation (3) can be set up;
a 4-base-station tag positioning architecture is adopted, in which one base station serves as the master and the remaining 3 base stations are slaves; when a tester carrying the positioning tag enters the test area, the signal sent by the tag is received by one or more sensors; the sensors decode the signal to obtain angle-of-arrival and timing information and then transmit these data to the master sensor; the master sensor collects all the information sent by the base stations and calculates the position of the tag, thereby realizing data collection; the sensor then sends data through the switch to the server once per second in the form of UDP data packets, and the server receives the UDP data packets to obtain the X and Y coordinate information of the specific tag.
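The positioning step in this claim can be illustrated with a minimal Python sketch. It is not the full Chan algorithm (which adds weighted and refinement stages) and it is not taken from the patent: it only shows the linearized step that follows from $R_{i,1} = R_i - R_1$, where squaring and subtracting gives $2(x_i - x_1)x_0 + 2(y_i - y_1)y_0 + 2R_{i,1}R_1 = K_i - K_1 - R_{i,1}^2$ with $K_i = x_i^2 + y_i^2$. The function names `solve3` and `locate_tag`, the station layout, and the propagation speed value are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # assumed signal propagation speed c, in m/s


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate geometry: system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def locate_tag(stations, tdoa):
    """Estimate the tag position (x0, y0).

    stations: four (x, y) base station coordinates; the first is the reference.
    tdoa: arrival time differences [dt_21, dt_31, dt_41] in seconds.
    """
    x1, y1 = stations[0]
    k1 = x1 * x1 + y1 * y1
    a, b = [], []
    for (xi, yi), dt in zip(stations[1:], tdoa):
        r_i1 = C * dt  # equation (2): R_i1 = c * dt_i1
        a.append([2 * (xi - x1), 2 * (yi - y1), 2 * r_i1])
        b.append(xi * xi + yi * yi - k1 - r_i1 * r_i1)
    x0, y0, _r1 = solve3(a, b)  # unknowns: x0, y0 and the range R_1
    return x0, y0


# Hypothetical layout: noise-free TDOA values generated from a known tag position.
stations = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
true_tag = (3.0, 4.0)
dists = [math.hypot(sx - true_tag[0], sy - true_tag[1]) for sx, sy in stations]
tdoa = [(d - dists[0]) / C for d in dists[1:]]
print(locate_tag(stations, tdoa))
```

With exactly four stations the system has three equations and three unknowns, so measurement and NLOS errors pass straight into the estimate; a practical solver would add more stations or a weighted refinement.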
3. The method for identifying an accurate behavior based on a decision tree classification algorithm according to claim 1, wherein the step S2 specifically comprises:
the feature value selection includes: position division; height processing of the head, shoulder, waist and knee; and the distance moved per unit time by the head, shoulder, waist and knee joint;
the location division includes:
in real life, the position of a user is related to the user's behavior and activity; spatial positions are divided into three categories: the first category is places where the user can sit or lean back to rest; the second category is the area within 0.1-0.3 meters of an object, depending on the object, denoted Da; the remaining space is the third category, denoted La;
the head, shoulder, waist and knee height processing comprises head height, shoulder height, waist height and knee height;
the Z-axis data of the user's head, shoulder, waist and knee joint represent the user's height in space and are read directly from the tag coordinates;
the distance moved per unit time by the head, shoulder, waist and knee joint comprises the distances moved by the head, the shoulder, the waist and the knee joint;
directly calculating the distance moved per unit time by the user's head, shoulder, waist and knee joint is difficult, mainly because the unit time is hard to determine; if the unit time is too large, the calculated displacement cannot accurately describe the user's behavior, which reduces the accuracy of behavior recognition; if it is too small, the amount of computation grows and the delay overhead increases; through multiple experiments, the optimal unit time LS that balances accuracy against computation delay is obtained;
the partition eigenvalue interval includes:
after the above feature values are determined, classification boundaries must be determined so that data within a class are similar and data in different classes differ; based on experiments, the test samples are processed with a hierarchical classification method; the key to classification is how to determine the boundaries of each level; currently, two algorithms are used to determine the boundary values: equal length and equal distribution;
let the range of feature values be $\Phi = [c_{min}, c_{max}]$, divided into N levels with level labels 1 to N; from $\Phi = [c_{min}, c_{max}]$, the range of sensor values $R = c_{max} - c_{min}$ can be found; to ensure that every interval in the range has the same length, the length of each interval is calculated as $r = R/N$; thus, the value range of each interval can be determined.
4. The method for identifying an accurate behavior based on a decision tree classification algorithm according to claim 1, wherein the step S3 specifically comprises:
assuming that D is the training tuples divided by category, the entropy of D is expressed as:
wherein $p_i$ represents the probability that the ith class appears in the entire training tuple, which can be estimated by dividing the number of elements belonging to that class by the total number of elements in the training tuple; the actual meaning of entropy represents the average amount of information needed for tuple class marking in D;
assuming that the training tuple D is divided by the attribute A, the expected information of the partition D is:
the information gain is the difference between the two:
$gain(A) = info(D) - info_A(D)$ (6)
establishing a user behavior recognition model; the specific behavior identification steps are as follows:
S31, collecting and classifying the behaviors in the training data set, and computing the entropy info(D) of the training tuples;
S32, extracting the position and height feature values from the preprocessed data, calculating the feature value intervals and partitioning the feature values;
S33, obtaining, from the partition in step S32, the expected information of the partitioned feature values;
S34, computing the information gain gain(A) as the difference between the entropy and the expected information;
when the maximum value is reached, the maximum gain is recorded; the behavior corresponding to that information gain is then the recognized behavior.
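Steps S31 to S34 can be illustrated with the short Python sketch below. The entropy and expected-information formulas are not reproduced in the text above, so the sketch assumes the standard ID3 definitions, $info(D) = -\sum_i p_i \log_2 p_i$ and $info_A(D) = \sum_j \frac{|D_j|}{|D|} info(D_j)$, together with equation (6). The function names, feature names and behavior labels are hypothetical.

```python
import math
from collections import Counter


def info(labels):
    """Entropy of a list of class labels (assumed standard definition)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def info_a(feature_levels, labels):
    """Expected information after partitioning the tuples by one feature's levels."""
    total = len(labels)
    groups = {}
    for level, label in zip(feature_levels, labels):
        groups.setdefault(level, []).append(label)
    return sum(len(g) / total * info(g) for g in groups.values())


def gain(feature_levels, labels):
    """Equation (6): gain(A) = info(D) - info_A(D)."""
    return info(labels) - info_a(feature_levels, labels)


def best_feature(features, labels):
    """Return the name of the feature with the largest information gain."""
    return max(features, key=lambda name: gain(features[name], labels))


# Hypothetical training tuples: level labels per feature and the observed behavior.
labels = ["sit", "sit", "stand", "stand"]
features = {
    "position_level": [1, 2, 1, 2],
    "height_level": [1, 1, 2, 2],
}
print(best_feature(features, labels))
```

In this toy data the height levels separate the two behaviors completely while the position levels do not, so the height feature has the larger gain; real training data would rarely be this clean.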
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911025926.5A CN110781961B (en) | 2019-10-25 | 2019-10-25 | Accurate behavior recognition method based on decision tree classification algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110781961A (en) | 2020-02-11 |
CN110781961B (en) | 2024-02-23 |
Family
ID=69386768
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911025926.5A Active CN110781961B (en) | 2019-10-25 | 2019-10-25 | Accurate behavior recognition method based on decision tree classification algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110781961B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110781961B (en) | 2024-02-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||