CN110119762B - Human behavior dependency analysis method based on clustering - Google Patents

Human behavior dependency analysis method based on clustering

Info

Publication number
CN110119762B
Authority
CN
China
Prior art keywords
behavior
time
individual
analyzed
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910297813.4A
Other languages
Chinese (zh)
Other versions
CN110119762A (en
Inventor
王晓玲
张欣蕾
李欣
靳远远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201910297813.4A
Publication of CN110119762A
Application granted
Publication of CN110119762B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a clustering-based human behavior dependency analysis method comprising the following steps: S1, normalizing the behavior data of each individual at each time t to obtain a behavior vector and a specific behavior time for each individual at each time t; S2, clustering the behavior vectors of all individuals at all times t; S3, traversing all behavior vectors of the individual to be analyzed, calculating the similarity between each behavior vector and the other behavior vectors in the cluster to which it belongs, and finding all similar behavior vectors within that cluster; S4, traversing all behavior vectors of the individual to be analyzed and, for each selected time, taking the weighted average of the specific behavior times that correspond to the similar behavior vectors as the normal specific behavior time of the individual to be analyzed at that time; S5, dividing the sum, over all periods, of the differences between the actual and the normal specific behavior time of the individual to be analyzed by the sum of the number of time units and a hyperparameter that adjusts the specific behavior dependency, to obtain the specific behavior dependency value of the individual to be analyzed.

Description

Human behavior dependency analysis method based on clustering
Technical Field
The application relates to the field of human behavior analysis, in particular to a method for analyzing the dependency degree of human specific behaviors by adopting a clustering algorithm.
Background
The traditional way of collecting human behavior data, through questionnaires made up of subjective questions, is dated and inefficient. Collecting and tallying behavior data by questionnaire requires respondents to actively fill in answers, so the collection process is cumbersome. More importantly, the questions themselves are often subjective, for example: "Is your performance worse because of the network?" or "Are you often frustrated or tense when you cannot go online?"
As a result, respondents are limited to the specific choices offered and cannot express what they actually mean, which introduces errors into the data. In addition, because the questions are subjective and too direct, the survey data obtained is not objective, which introduces error into the corresponding research.
In summary, there is a need in the art for an objective analysis method capable of specifically and quantitatively determining human specific behavior dependence according to human behavior objective data.
Disclosure of Invention
The application mainly aims to provide a clustering-based human behavior dependency analysis method for realizing specific and quantitative objective analysis of human specific behavior dependency.
In order to achieve the above object, the present application provides a clustering-based human behavior dependency analysis method comprising the following steps: S1, normalizing the behavior data of each individual at each time t to obtain a behavior vector and a specific behavior time for each individual at each time t; S2, clustering the behavior vectors of all individuals at all times t; S3, traversing all behavior vectors of the individual to be analyzed, calculating the similarity between each behavior vector and the other behavior vectors in the cluster to which it belongs, and finding all similar behavior vectors within that cluster; S4, traversing all behavior vectors of the individual to be analyzed and, for each selected time, taking the weighted average of the specific behavior times that correspond to the similar behavior vectors as the normal specific behavior time of the individual to be analyzed at that time; S5, dividing the sum, over all periods, of the differences between the actual and the normal specific behavior time of the individual to be analyzed by the sum of the preset number of time units and a hyperparameter that adjusts the specific behavior dependency, to obtain the specific behavior dependency value of the individual to be analyzed.
Preferably, the behavior vector $b_u(t)$ of the individual u to be analyzed at time t and another behavior vector $b_j$ in the same cluster are judged to be similar when the following condition is satisfied:
$$\cos(b_u(t), b_j) \ge r$$
where $\cos(b_u(t), b_j)$ denotes the cosine similarity between the behavior vector $b_u(t)$ of the individual u to be analyzed at time t and another behavior vector $b_j$ in the same cluster, and $r \in [0, 1]$ is a distance radius threshold.
Preferably, in step S4, the normal specific behavior time $r_u(t)$ of the individual u to be analyzed at time t is calculated as:
$$r_u(t) = \sum_{j \in S} w_j \, y_j$$
where S is the set of behavior vectors similar to the behavior vector $b_u(t)$ of the individual u to be analyzed at time t, $y_j$ is the actual internet time, on the corresponding day, of the individual to whom behavior vector $b_j$ belongs, and the weight $w_j$ is calculated as:
$$w_j = \frac{\cos(b_u(t), b_j)}{\sum_{j' \in S} \cos(b_u(t), b_{j'})}$$
where $\cos(b_u(t), b_j)$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_j$, and $\cos(b_u(t), b_{j'})$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_{j'}$.
Preferably, the specific behavior dependency value of the individual u in step S5 is calculated as:
$$\frac{\sum_{t \in T} \bigl( y_u(t) - r_u(t) \bigr)}{|T| + C}$$
where T is the set of times for which the sample set contains behavior data of the individual u, $C \in \mathbb{R}^+$ is a hyperparameter that adjusts the weight of the specific behavior dependency, $y_u(t)$ is the actual specific behavior time of individual u at time t, and $r_u(t)$ is the normal specific behavior time of individual u at time t.
Preferably, step S1 includes respectively performing normalization processing on different types of behavior data, and unifying the value ranges of all types of behavior data.
Preferably, in step S2, all the behavior vectors are clustered by using a K-means algorithm, so as to group similar behavior vectors into the same cluster.
Preferably, step S1 includes: the behavioural data at individual time t is processed according to the selected granularity.
Preferably, the step S3 includes: and calculating the similarity of each behavior vector and other behavior vectors in the class cluster to which the behavior vector belongs by using a cosine similarity method.
According to the clustering-based human behavior dependency analysis method provided by the application, the characteristics of different behaviors of the individuals can be synthesized to analyze and obtain the estimated value of the possible specific behavior time of the individuals, and even if each individual has the same behavior in the same specific occasion, the accurate specific behavior dependency degree value can be obtained, so that the specific and quantized specific behavior dependency degree value of the current individual can be calculated, and an objective quantized analysis result can be formed.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a flow chart of a cluster-based human behavior dependency analysis method of the present application.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, based on the embodiments of the application, which are obtained without inventive effort by a person of ordinary skill in the art, shall fall within the scope of the application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion.
Referring to fig. 1, in order to implement a specific and quantitative analysis of human specific behavior dependency, the clustering-based human behavior dependency analysis method provided by the present application mainly includes the following steps:
S1, processing the behavior data of each individual at time t according to the selected granularity (for example, daily) and normalizing it, to obtain the daily behavior vector and normalized specific behavior time of each individual;
S2, clustering the daily behavior vectors of all individuals;
S3, traversing all behavior vectors of the individual to be analyzed, calculating the cosine similarity between each behavior vector and the other behavior vectors in the cluster to which it belongs, and finding all similar behavior vectors within that cluster;
S4, traversing all behavior vectors of the individual to be analyzed and, for each day, taking the weighted average of the specific behavior times that correspond to the similar behavior vectors as the normal specific behavior time of the individual to be analyzed on that day;
S5, dividing the sum, over all periods, of the differences between the actual and the normal specific behavior time of the individual to be analyzed by the sum of the number of days and a hyperparameter that adjusts the specific behavior dependency, to obtain the specific behavior dependency value of the individual to be analyzed.
It should be noted that the above embodiment takes daily behavior data as an example of the time t, but the method is not limited to this. In other preferred embodiments, a reasonable period may be set according to the actual situation, for example each hour, each week, or each month. Likewise, the "day" referred to in step S4 may be a random or specifically selected time chosen by the analyst within the period over which the individual's data was collected.
Specifically, in the clustering-based human behavior dependency analysis method, the daily behavior data of each individual, such as internet use, dining, and showering, is processed according to the selected granularity and normalized, which yields the daily behavior vector of each individual and a normalized specific behavior time, for example internet time. The daily behavior vectors of all individuals are then clustered. All behavior vectors of the individual to be analyzed are traversed, the cosine similarity between each behavior vector and the other behavior vectors in its cluster is calculated, and all similar behavior vectors are found within that cluster. Finally, for each day, the weighted average of the specific behavior times that correspond to the similar behavior vectors is taken as the normal internet time of the individual to be analyzed on that day.
The sum, over all periods, of the differences between the actual and the normal internet time of the individual to be analyzed is then divided by the sum of the preset number of time units (such as days) and a hyperparameter that adjusts the network dependency, which gives the specific behavior dependency value of the individual to be analyzed, for example a network dependency value.
Therefore, the scheme of the application can process the behavior data of individuals on the internet, in catering, in showering and the like every day according to the selected granularity, obtain the behavior vector of each individual every day, and flexibly model the behavior regularity of the individuals every day, so that the subsequent searching of similar behaviors is more accurate. Meanwhile, the application also respectively normalizes all types of behavior data, unifies the value ranges of all types of behavior data, thereby effectively reducing the influence on the subsequent clustering and cosine similarity calculation caused by different orders of magnitude of data, and improving the accuracy of subsequent similar behavior searching.
Furthermore, the embodiment of the present application preferably uses the K-means clustering algorithm. That is, in step S2 of the foregoing embodiment, all behavior vectors are preferably clustered with K-means so that similar behavior vectors fall into the same cluster. When similar behavior vectors are searched for later, only the similarity between the behavior vector to be analyzed and the other behavior vectors in its own cluster needs to be calculated. This reduces the number of cosine similarity computations between behavior vectors while still finding similar behaviors accurately, and therefore reduces the time cost.
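The clustering step can be illustrated with a minimal, dependency-free sketch of Lloyd's K-means. The source only names the K-means algorithm; the function name `kmeans`, the choice to seed centroids with the first k vectors, and the sample points are assumptions made here for illustration, not details from the application.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain Lloyd's K-means with Euclidean distance.

    Seeding the centroids with the first k points is an assumption made
    to keep this sketch deterministic; the source does not specify it.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return new_labels, centroids


# Hypothetical normalized behavior vectors (not data from the source).
points = [[0.1, 0.1], [0.9, 0.9], [0.2, 0.1], [0.8, 1.0], [0.0, 0.2]]
labels, centroids = kmeans(points, k=2)
```

The returned labels give the cluster of each vector, so a later similarity search only has to look at vectors that share a label with the vector being analyzed.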
Furthermore, the application also searches similar behavior vectors by setting a radius threshold, thereby effectively limiting the similarity degree of different behaviors, and only behaviors with the similarity degree larger than the threshold are considered to be similar so as to participate in subsequent calculation. By adopting the scheme, the most similar behaviors can be effectively screened out, the accuracy of the subsequent calculation of the normal internet surfing time of the individual is improved, and the situation that errors are caused to the calculation of the normal internet surfing time of the individual because the dissimilar behaviors participate in the subsequent calculation is avoided.
It is worth mentioning that similar behavior vectors represent the similarity of the activities of individuals in specific situations, such as: similarity of individual activities in a particular occasion on a campus. Therefore, the weighted average value of the similar behavior vector corresponding to the surfing time is used as the normal surfing time, the characteristics of different behaviors of the individual are integrated, and the most probable estimated value of the normal surfing time of the day of the individual to be analyzed can be obtained.
Different individuals depend on a specific behavior to different degrees. Because of this, two individuals can have different internet times even if they behave the same way in a particular setting, such as on a campus. The difference between the actual and the normal internet time therefore reflects the individual's degree of dependence on the network, and using the average of that difference over a period of time as the estimate reduces the error present in any single difference. Meanwhile, the hyperparameter adjusts the weight of the influence of network dependency on internet time and improves the accuracy of the network dependency calculation.
In addition, in the preferred embodiment, in the step S1, the granularity of dividing the individual behavior data may be adjusted according to the actual situation. For example: according to the selected granularity, the behavior data of the individuals, such as surfing the Internet, catering, showering and the like, are processed every day, the behavior vector of each individual every day is obtained, and the behavior regularity of the tested individuals every day can be flexibly modeled, so that the subsequent searching of similar behaviors is more accurate.
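A possible way to build behavior vectors at a selectable granularity is sketched below. The record layout `(individual, timestamp, behavior, value)`, the behavior type names, the function names, and the sample records are all hypothetical; the source only states that behavior data is processed according to the selected granularity.

```python
from collections import defaultdict
from datetime import datetime


def bucket_key(timestamp, granularity):
    """Map a timestamp to the period it falls in ('hour', 'day' or 'month')."""
    formats = {"hour": "%Y-%m-%d %H", "day": "%Y-%m-%d", "month": "%Y-%m"}
    return timestamp.strftime(formats[granularity])


def build_behavior_vectors(records, behavior_types, granularity="day"):
    """Sum record values per (individual, period) into one vector per period."""
    totals = defaultdict(lambda: [0.0] * len(behavior_types))
    for individual, timestamp, behavior, value in records:
        key = (individual, bucket_key(timestamp, granularity))
        totals[key][behavior_types.index(behavior)] += value
    return dict(totals)


# Made-up records for illustration only (not the data of Table 1).
records = [
    ("u1", datetime(2018, 11, 6, 8, 30), "online_minutes", 40.0),
    ("u1", datetime(2018, 11, 6, 12, 5), "meal_spend", 12.5),
    ("u1", datetime(2018, 11, 6, 21, 0), "online_minutes", 65.0),
    ("u1", datetime(2018, 11, 7, 9, 0), "online_minutes", 30.0),
]
types = ["online_minutes", "meal_spend", "shower_count"]
daily = build_behavior_vectors(records, types, granularity="day")
hourly = build_behavior_vectors(records, types, granularity="hour")
```

Switching `granularity` changes only the grouping key, so the same records yield two daily vectors or four hourly vectors without touching the rest of the pipeline.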
In another preferred embodiment, in the step S1, normalization processing is further required for different types of data, for example: and respectively carrying out normalization processing on all types of behavior data, and unifying the value ranges of all types of behavior data, so that the influence on the subsequent clustering and cosine similarity calculation caused by different orders of magnitude of the data can be reduced, and the accuracy of subsequent similar behavior searching is improved.
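The source does not name the normalization used. A minimal sketch, assuming per-type min-max scaling to [0, 1] (one common way to unify value ranges), could look as follows; the function name and the raw values are illustrative assumptions.

```python
def min_max_normalize(columns):
    """Rescale each behavior type (one column of values) to [0, 1] independently.

    Min-max scaling is an assumption; the source only says that each type
    of behavior data is normalized to a unified value range.
    """
    normalized = []
    for values in columns:
        low, high = min(values), max(values)
        span = high - low
        # A constant column carries no information; map it to zeros.
        normalized.append([(v - low) / span if span else 0.0 for v in values])
    return normalized


# Hypothetical raw data on very different scales:
# minutes online per day and money spent per day.
raw = [[30.0, 120.0, 300.0], [5.0, 12.5, 20.0]]
scaled = min_max_normalize(raw)
```

After scaling, both behavior types span the same range, so neither dominates the distance used in clustering or the cosine similarity merely because of its unit.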
Further, in order to find all similar behavior vectors within the cluster to which a behavior vector of the individual to be analyzed belongs, the behavior vector $b_u(t)$ of the individual u to be analyzed at time t and another behavior vector $b_j$ in the same cluster are judged to be similar when the following condition is satisfied:
$$\cos(b_u(t), b_j) \ge r$$
where $\cos(b_u(t), b_j)$ denotes the cosine similarity between the behavior vector $b_u(t)$ of the individual u to be analyzed at time t and another behavior vector $b_j$ in the same cluster, and $r \in [0, 1]$ is a distance radius threshold.
The aim of the preferred embodiment is to find similar behavior vectors by setting a radius threshold, so that the similarity degree of different behaviors can be effectively limited, and only behaviors with the similarity degree larger than the threshold can be considered to be similar, thereby participating in subsequent calculation. By adopting the scheme, the most similar behaviors can be effectively screened out, the accuracy of the subsequent calculation of the normal internet surfing time of the individual is improved, and the situation that errors are caused to the calculation of the normal internet surfing time of the individual because the dissimilar behaviors participate in the subsequent calculation is avoided.
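The threshold search can be sketched as below. The function names, the dictionary layout (vector number mapped to vector), and the sample vectors are assumptions for illustration; only the cosine similarity test against the threshold r comes from the source.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def similar_set(target_id, cluster, r=0.3):
    """Numbers of vectors in the same cluster with cosine similarity >= r."""
    target = cluster[target_id]
    return [
        j for j, vec in cluster.items()
        if j != target_id and cosine(target, vec) >= r
    ]


# Hypothetical members of one cluster, keyed by behavior vector number.
cluster = {1: [1.0, 0.0, 0.2], 2: [0.9, 0.1, 0.3], 4: [0.0, 1.0, 0.0]}
strict = similar_set(1, cluster, r=0.3)
loose = similar_set(1, cluster, r=0.0)
```

With these sample vectors, vector 4 is orthogonal to vector 1, so it passes only when r = 0, while vector 2 passes at r = 0.3 as well.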
In the above embodiment, the normal internet time $r_u(t)$ of the individual u to be analyzed at time t in step S4 is preferably calculated as:
$$r_u(t) = \sum_{j \in S} w_j \, y_j$$
where S is the set of behavior vectors similar to the behavior vector $b_u(t)$ of the individual u to be analyzed at time t, $y_j$ is the actual internet time, on the corresponding day, of the individual to whom behavior vector $b_j$ belongs, and the weight $w_j$ is calculated as:
$$w_j = \frac{\cos(b_u(t), b_j)}{\sum_{j' \in S} \cos(b_u(t), b_{j'})}$$
where $\cos(b_u(t), b_j)$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_j$, and $\cos(b_u(t), b_{j'})$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_{j'}$.
In particular, similar behavior vectors represent the similarity of an individual's activities in a particular situation, such as: similarity of activities within the campus, and thus similar behavior vectors correspond to specific behavior times, such as: the weighted average of the surfing time is used as the normal surfing time, and the characteristics of different behaviors of the individual are integrated, so that the estimated value of the normal surfing time most likely in the day of the individual to be analyzed can be obtained.
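The weighted average can be sketched as follows, assuming the weights are the cosine similarities normalized to sum to one over the similar set S, which is how the verbal definitions of $w_j$ read. The function name, the handling of an empty similar set, and the sample numbers are assumptions made here.

```python
def normal_time(similarities, times):
    """Weighted mean of the neighbors' behavior times.

    Each weight is a neighbor's cosine similarity divided by the sum of
    all similarities in the similar set (an assumed reading of w_j).
    """
    total = sum(similarities)
    if total == 0:
        return None  # no usable neighbors; the source does not cover this case
    weights = [s / total for s in similarities]
    return sum(w * y for w, y in zip(weights, times))


# Hypothetical similar set: cosine similarities and normalized internet times.
sims = [0.9, 0.6, 0.5]
ys = [0.10, 0.20, 0.40]
estimate = normal_time(sims, ys)
```

More similar neighbors pull the estimate toward their own behavior time, so here the result sits closer to 0.10 than a plain average of the three times would.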
In the above embodiment, the specific behavior dependency value of the individual u in step S5 is calculated as:
$$\frac{\sum_{t \in T} \bigl( y_u(t) - r_u(t) \bigr)}{|T| + C}$$
where T is the set of times for which the sample set contains behavior data of the individual u, $C \in \mathbb{R}^+$ is a hyperparameter that adjusts the weight of the specific behavior dependency, $y_u(t)$ is the actual specific behavior time of individual u at time t, and $r_u(t)$ is the normal specific behavior time of individual u at time t.
The purpose of the preferred embodiment described above is that different individuals have different degrees of dependence on specific behaviour, and the presence of this feature allows different times of specific behaviour even if two individuals have the same behaviour in a specific situation. Thus, for example, when analyzing the internet surfing behavior and the internet surfing time of an individual, the difference value between the actual internet surfing time and the normal internet surfing time of the individual reflects the degree of dependence of the individual on the network, and the average value of the actual internet surfing time and the normal internet surfing time difference value in a period of time can reduce the error existing in the single difference value. Meanwhile, the setting of the super parameters adjusts the weight of the influence of the network dependence degree on the internet surfing time, avoids over fitting and reduces the calculation error.
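One way to see the role of the hyperparameter is to rewrite the step S5 value as a scaled average. The following is a sketch that assumes the verbal description of S5 means signed differences summed over T and divided by |T| + C; the symbols $d_u$ and $\bar{\delta}_u$ are notation introduced here for illustration and do not appear in the source.

```latex
d_u = \frac{\sum_{t \in T} \bigl( y_u(t) - r_u(t) \bigr)}{|T| + C}
    = \frac{|T|}{|T| + C} \, \bar{\delta}_u,
\qquad
\bar{\delta}_u = \frac{1}{|T|} \sum_{t \in T} \bigl( y_u(t) - r_u(t) \bigr)
```

Under this reading, C = 0 gives the plain average difference, and a larger C shrinks the value toward zero, most strongly for individuals with few observed periods. That is consistent with the statement that the hyperparameter avoids overfitting.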
Example 1
To further demonstrate the feasibility of the analysis scheme of the application, this embodiment takes the quantitative analysis of the online behavior of college students as an example. With the development of the internet, the network has become an indispensable part of college students' study and life. Because college study is flexible and self-directed, students often become engrossed in the network while enjoying the convenience it brings, so more and more college students show a high dependence on it. Excessive use of the network, however, can have a major negative impact on students' mental state and daily life. Finding students' tendency toward network dependence in time is therefore important for guiding them correctly.
Furthermore, analysis of the degree of network dependence of college students is an important piece of content in the field of psychology. Currently, the analysis of college student network dependence in the psychological field is mainly based on questionnaire methods. However, the process is cumbersome, as questionnaires typically need to be manually filled in by the subject. In addition, the questions of the questionnaire are subjective, such as: is your performance worse because of the network? Or is often frustrated or strained by inability to surf the internet? The subjectivity of the problem creates a certain error in the corresponding study.
In summary, when analyzing the college student network dependence problem, the problems of complicated process and research errors caused by subjectivity exist at present. Therefore, in order to solve the problems, the clustering-based human behavior dependence analysis method can well give objective quantitative analysis results.
Referring to fig. 1, in order to further illustrate the implementation process of the clustering-based human behavior dependency analysis method of the present application, the present embodiment is exemplified by analyzing the dependency degree (network dependency degree) of a specific behavior of a college student on daily internet in a campus, but it should be understood by those skilled in the art that the scheme of the present application can also perform quantization analysis on various specific behaviors of a human individual, and the type of the specific behavior that can be analyzed is not limited, so any specific behavior applicable to the scheme of the present application can be performed in the quantization analysis under the scheme and is within the scope of the disclosure of the scheme of the present application.
Specifically, to analyze student network dependency based on a clustering algorithm according to the above embodiment, the following steps are carried out:
S101: Acquiring student behavior data and constructing behavior vectors:
The daily behavior data of students, such as internet use, dining, and showering, is processed according to the selected granularity and normalized to obtain the daily behavior vector and normalized internet time of each student. In this embodiment, the granularity is set to 24 hours, and the behavior data of two students from November 6 to November 9, 2018 is selected.
Table 1 is the behavior data of two students during 24 hours of the four days, including time of access to the campus network, frequency of consumption, amount of consumption, and time of day of surfing the internet.
TABLE 1
Table 2 is the raw daily behavior vector, normalized behavior vector, and normalized surfing time for each student constructed at 24h granularity.
TABLE 2
S102: clustering the behavior vectors:
the behavior vectors are clustered using the K-means algorithm, in this embodiment k=2 is set, and the behavior vectors are divided into two clusters.
Table 3 shows the result of the behavior vector clustering in this embodiment.
Cluster | Behavior vector numbers
1       | 1, 2, 4, 5, 6, 8
2       | 3, 7
TABLE 3
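The saving from restricting the similarity search to a cluster can be checked against the cluster sizes in Table 3. The short sketch below counts unordered pairs of behavior vectors; counting pairs is an illustrative simplification chosen here, and the variable names are assumptions.

```python
from math import comb

# Cluster membership taken from Table 3.
clusters = {1: [1, 2, 4, 5, 6, 8], 2: [3, 7]}

total_vectors = sum(len(ids) for ids in clusters.values())
# Pairs to compare if every vector were compared with every other one.
all_pairs = comb(total_vectors, 2)
# Pairs to compare if comparisons stay inside each cluster.
within_cluster_pairs = sum(comb(len(ids), 2) for ids in clusters.values())

print(all_pairs, within_cluster_pairs)  # 28 16
```

Even with only eight vectors, staying inside clusters drops the number of candidate pairs from 28 to 16; the gap widens as the number of vectors and clusters grows.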
S103: searching similar behavior vectors:
All behavior vectors of the student to be analyzed are traversed, and the cosine similarity between each behavior vector and the other behavior vectors in the cluster to which it belongs is calculated. Table 4 shows the similarity between each behavior vector of student $u_1$ and the other behavior vectors in its cluster.
TABLE 4
All similar behavior vectors are then found within the cluster to which each behavior vector of the student to be analyzed belongs.
The behavior vector $b_u(t)$ of student u at time t and another behavior vector $b_j$ in the same cluster are judged to be similar when the following condition is satisfied:
$$\cos(b_u(t), b_j) \ge r$$
where $\cos(b_u(t), b_j)$ denotes the cosine similarity between the behavior vector $b_u(t)$ of the student u to be analyzed at time t and another behavior vector $b_j$ in the same cluster, and $r \in [0, 1]$ is a distance radius threshold. r = 0 means that every behavior vector in the cluster other than the vector itself satisfies the condition, and r = 1 means that only behavior vectors identical to the behavior vector to be analyzed satisfy the condition. In this embodiment, r = 0.3 is preferably set.
Table 5 shows the set S of similar behavior vector numbers for each behavior vector of student $u_1$ in this embodiment.
TABLE 5
S104: calculating normal internet surfing time:
All behavior vectors of the student to be analyzed are traversed and, for each day, the weighted average of the internet times that correspond to the similar behavior vectors is taken as the normal internet time of the student to be analyzed on that day. The normal internet time $r_u(t)$ of student u at time t is calculated as:
$$r_u(t) = \sum_{j \in S} w_j \, y_j$$
where S is the set of behavior vectors similar to the behavior vector $b_u(t)$ of the student u to be analyzed at time t, $y_j$ is the actual internet time, on the corresponding day, of the student to whom behavior vector $b_j$ belongs, and the weight $w_j$ is calculated as:
$$w_j = \frac{\cos(b_u(t), b_j)}{\sum_{j' \in S} \cos(b_u(t), b_{j'})}$$
where $\cos(b_u(t), b_j)$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_j$, and $\cos(b_u(t), b_{j'})$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_{j'}$.
Table 6 lists, for student $u_1$ to be analyzed in this embodiment, the weight $w_j$ of each similar behavior vector of every behavior vector, together with the actual internet time $y_j$ corresponding to that similar behavior vector.
TABLE 6
Table 7 shows the actual internet time $y_u(t)$ and the normal internet time $r_u(t)$ corresponding to every behavior vector of student $u_1$ to be analyzed in this embodiment.
Behavior vector number    Actual internet time y_u(t)    Normal internet time r_u(t)
1                         0.09                           0.16303
2                         0                              0.15807
3                         1                              0.15
4                         0.44                           0.09
TABLE 7
S105: calculating the network dependency degree:
The sum, over all times, of the differences between the actual internet time and the normal internet time of the student to be analyzed is divided by the sum of the number of days and a hyperparameter that regulates the network dependency degree, giving the network dependency value of the student to be analyzed. The network dependency value of student $u$ is calculated as:
$$\frac{\sum_{t \in T} \bigl(y_u(t) - r_u(t)\bigr)}{|T| + C}$$
where $T$ is the set of times for which the sample set contains behavior data of student $u$, and $C \in \mathbb{R}^+$ is a hyperparameter that adjusts the weight of the network dependency: the greater $C$ is, the less the network dependency affects the internet time. $y_u(t)$ is the actual internet time of student $u$ at time $t$, and $r_u(t)$ is the normal internet time of student $u$ at time $t$. In this embodiment, $C = 0$ is preferred, which yields the final network dependency value of student $u_1$.
This yields an objective, quantitative analysis of how strongly different students depend on the network. Even when two students behave identically on campus, their internet times may differ; the difference between the actual and the normal internet time therefore reflects how dependent a student is on the network, and averaging that difference over a period of time reduces the error present in any single difference. In addition, the hyperparameter adjusts how strongly the network dependency degree influences the internet time, which avoids overfitting and reduces the calculation error.
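As a worked illustration of step S105, the sketch below applies the rule stated in the text (the sum of the differences between actual and normal internet time, divided by the number of days plus the hyperparameter C) to the four rows of Table 7. It assumes the differences are signed rather than absolute; the function name is hypothetical, and the printed value is plain arithmetic on the Table 7 rows, not a figure quoted from the source.

```python
def network_dependency(actual, normal, c=0.0):
    """Sum of (actual - normal) over all days, divided by (number of days + c)."""
    if len(actual) != len(normal):
        raise ValueError("actual and normal times must cover the same days")
    denominator = len(actual) + c
    if denominator <= 0:
        raise ValueError("need at least one day or a positive c")
    return sum(y - r for y, r in zip(actual, normal)) / denominator


# Rows of Table 7 for student u1: actual time y_u(t) and normal time r_u(t).
actual_times = [0.09, 0.0, 1.0, 0.44]
normal_times = [0.16303, 0.15807, 0.15, 0.09]

value_c0 = network_dependency(actual_times, normal_times, c=0.0)
value_c1 = network_dependency(actual_times, normal_times, c=1.0)
print(round(value_c0, 6))  # 0.242225
print(round(value_c1, 5))  # 0.19378
```

The second call shows the damping effect of the hyperparameter: with only four days of data, raising C from 0 to 1 shrinks the value toward zero, so a student with few observations is not assigned an extreme dependency value on thin evidence.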
In summary, the clustering-based human behavior dependency analysis method provided by the application combines the different behavior characteristics of individuals to estimate the time an individual would be expected to spend on a specific behavior. Even when individuals behave identically in the same setting, an accurate dependency value for the specific behavior can be obtained for each of them, so that the dependency of the current individual on the specific behavior is calculated concretely and quantitatively and an objective quantitative analysis result is formed.
The preferred embodiments of the application disclosed above are intended only to help explain the application. They are not exhaustive and do not limit the application to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and its practical use, thereby enabling others skilled in the art to best understand and utilize the application. The application is limited only by the following claims and their full scope and equivalents, and any modifications, equivalents, improvements, etc., that fall within the spirit and principles of the application are intended to be included within its scope.
Those skilled in the art will appreciate that all or part of the steps of the methods of the embodiments described above may be implemented by a program stored in a storage medium, the program including instructions for causing a single-chip microcomputer, a chip, or a processor to perform all or part of the steps of the methods of the embodiments described herein. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In addition, the various embodiments of the present application may be combined in any manner; provided that a combination does not depart from the concept of the embodiments of the present application, it should likewise be regarded as part of their disclosure.

Claims (7)

1. A clustering-based human behavior dependency analysis method, comprising the steps of:
S1, normalizing the behavior data of each individual at time t to obtain, for each individual at time t, a behavior vector and a specific behavior time;
S2, clustering the behavior vectors of each individual at time t;
S3, traversing all the behavior vectors of the individual to be analyzed, calculating the similarity between each behavior vector and the other behavior vectors in the class cluster to which it belongs, and selecting all similar behavior vectors from the class cluster to which the behavior vector of the individual to be analyzed belongs;
S4, traversing all the behavior vectors of the individual to be analyzed, and taking, as the normal specific behavior time, the weighted average of the specific behavior times, on the corresponding days, of the individuals associated with the behavior vectors similar to the behavior vector of the individual to be analyzed at the selected time;
S5, dividing the sum, over all periods, of the differences between the actual specific behavior time and the normal specific behavior time of the individual to be analyzed by the sum of the preset time units and the hyperparameter for regulating the specific behavior dependency degree, to obtain the specific behavior dependency degree value of the individual to be analyzed;
the specific behavior dependency degree value of the individual u in the step S5 is calculated as:
$$\frac{\sum_{t \in T} \bigl(y_u(t) - r_u(t)\bigr)}{|T| + C}$$
where $T$ is the set of times for which the sample set contains behavior data of the individual $u$, $C \in \mathbb{R}^+$ is a hyperparameter that adjusts the weight of the specific behavior dependency, $y_u(t)$ is the actual specific behavior time of individual $u$ at time $t$, and $r_u(t)$ is the normal specific behavior time of individual $u$ at time $t$.
2. The cluster-based human behavior dependency analysis method according to claim 1, wherein the condition to be satisfied for the behavior vector $b_u(t)$ of the individual $u$ to be analyzed at time $t$ and another behavior vector $b_j$ in the class cluster in which it is located to be judged similar is:
$$\cos(b_u(t), b_j) \ge r$$
where $\cos(b_u(t), b_j)$ is the cosine similarity between the behavior vector $b_u(t)$ of the individual $u$ to be analyzed at time $t$ and another behavior vector $b_j$ in the class cluster in which it is located, and $r \in [0, 1]$ is the distance radius threshold.
3. The cluster-based human behavior dependency analysis method according to claim 1, wherein the normal specific behavior time $r_u(t)$ of the individual $u$ to be analyzed at time $t$ in the step S5 is calculated as:
$$r_u(t) = \sum_{j \in S} w_j \, y_j$$
where $S$ is the set of behavior vectors similar to the behavior vector $b_u(t)$ of the individual $u$ to be analyzed at time $t$, $y_j$ is the actual internet time, on the corresponding day, of the individual to whom behavior vector $b_j$ belongs, and the weight $w_j$ is calculated as:
$$w_j = \frac{\cos(b_u(t), b_j)}{\sum_{j' \in S} \cos(b_u(t), b_{j'})}$$
where $S$ is the set of behavior vectors similar to the behavior vector $b_u(t)$ of the individual $u$ to be analyzed at time $t$, $\cos(b_u(t), b_j)$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_j$, and $\cos(b_u(t), b_{j'})$ is the cosine similarity between $b_u(t)$ and its similar behavior vector $b_{j'}$.
4. The method for analyzing human behavior dependency based on clustering according to claim 1, wherein the step S1 comprises respectively normalizing the different types of behavior data, and unifying the value ranges of all types of behavior data.
5. The clustering-based human behavior dependency analysis method as claimed in claim 1, wherein the K-means algorithm is used to cluster all the behavior vectors in step S2 to group similar behavior vectors into the same cluster.
6. The cluster-based human behavior dependency analysis method as claimed in claim 1, wherein the step S1 includes: processing the behavior data of the individual at time t according to the selected granularity.
7. The cluster-based human behavior dependency analysis method as claimed in claim 1, wherein the step S3 includes: and calculating the similarity of each behavior vector and other behavior vectors in the class cluster to which the behavior vector belongs by using a cosine similarity method.
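For orientation only, steps S1 and S2 as recited in claims 4 and 5 can be sketched as follows. The claims specify normalization to a unified value range and K-means clustering; the choice of min-max scaling to [0, 1], Euclidean distance, seeding the centroids with the first k vectors, the function names, and the sample data are all assumptions of this sketch and are not stated in the claims.

```python
import math


def min_max_normalize(rows):
    """Scale every column of `rows` to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_means(vectors, k, iterations=20):
    """Plain K-means; returns one cluster label per input vector."""
    centroids = [list(v) for v in vectors[:k]]  # assumed seeding: first k vectors
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical raw behavior data: two features per (individual, day) record.
raw = [[120, 3], [10, 1], [110, 3], [0, 1]]
normalized = min_max_normalize(raw)
labels = k_means(normalized, k=2)
```

In this sample the first and third records end up in one class cluster and the second and fourth in the other, so the similarity search of step S3 would only compare records within each pair.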
CN201910297813.4A 2019-04-15 2019-04-15 Human behavior dependency analysis method based on clustering Active CN110119762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910297813.4A CN110119762B (en) 2019-04-15 2019-04-15 Human behavior dependency analysis method based on clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910297813.4A CN110119762B (en) 2019-04-15 2019-04-15 Human behavior dependency analysis method based on clustering

Publications (2)

Publication Number Publication Date
CN110119762A CN110119762A (en) 2019-08-13
CN110119762B true CN110119762B (en) 2023-09-26

Family

ID=67520895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910297813.4A Active CN110119762B (en) 2019-04-15 2019-04-15 Human behavior dependency analysis method based on clustering

Country Status (1)

Country Link
CN (1) CN110119762B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339417A (en) * 2016-08-15 2017-01-18 浙江大学 Detection method for user group behavior rules based on stay places in mobile trajectory
CN107608992A (en) * 2016-07-12 2018-01-19 上海视畅信息科技有限公司 A kind of personalized recommendation method based on time shaft
CN107622072A (en) * 2016-07-15 2018-01-23 阿里巴巴集团控股有限公司 A kind of recognition methods and server, terminal for web page operation behavior
CN108572984A (en) * 2017-03-13 2018-09-25 阿里巴巴集团控股有限公司 A kind of active user interest recognition methods and device
CN109065028A (en) * 2018-06-11 2018-12-21 平安科技(深圳)有限公司 Speaker clustering method, device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Intelligent Behavior Data Analysis for Internet Addiction"; Xin Li et al.; ACM; 2019-01-01; pp. 1-12 *

Also Published As

Publication number Publication date
CN110119762A (en) 2019-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant