CN111860661B

CN111860661B - Data analysis method and device based on user behaviors, electronic equipment and medium

Info

Publication number: CN111860661B
Application number: CN202010723062.0A
Authority: CN
Inventors: 林峰; 尹钏
Original assignee: Ping An Property and Casualty Insurance Company of China Ltd
Current assignee: Ping An Property and Casualty Insurance Company of China Ltd
Priority date: 2020-07-24
Filing date: 2020-07-24
Publication date: 2024-04-30
Anticipated expiration: 2040-07-24
Also published as: CN111860661A

Abstract

The invention relates to data processing technology and discloses a data analysis method based on user behavior, comprising: acquiring a behavior data set of a user and performing smoothing filtering on it to obtain a noise-free data set; identifying switching points in the noise-free data set and dividing the noise-free data set into a plurality of noise-free data subsets according to the switching points; performing feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set; performing feature optimization on the feature data in the feature data set to obtain a preferred feature set; classifying the preferred features in the preferred feature set by using a classifier to obtain a behavior feature set of the user; and calculating a score for each behavior feature in the behavior feature set and determining a behavior type of the user based on the scores. The invention further relates to blockchain technology, in that the behavior data set and/or the feature data set may be stored in blockchain nodes. The invention addresses the low efficiency and low accuracy of data analysis based on user behavior.

Description

Data analysis method and device based on user behaviors, electronic equipment and medium

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a data analysis method and apparatus based on user behavior, an electronic device, and a computer readable storage medium.

Background

Existing techniques for analyzing user behavior data rely mainly on video. For example, driving videos stored in a vehicle data recorder are analyzed to obtain data such as driving habits: the number of sudden braking events in the driving videos within a period of time is counted to judge whether the user habitually brakes suddenly, or the duration of overspeed driving within a period of time is counted to judge whether the user habitually drives over the speed limit. However, video analysis requires processing massive amounts of image data, which occupies a large amount of computing resources, and the accuracy of the result is limited by external environmental factors or low video definition. How to improve the efficiency and accuracy of analyzing user behavior data has therefore become an increasingly important requirement.

Disclosure of Invention

The invention provides a data analysis method and device based on user behavior, an electronic device and a computer readable storage medium, whose main purpose is to improve the efficiency and accuracy of analyzing the behavior data of a user.

In order to achieve the above object, the present invention provides a data analysis method based on user behavior, including:

Acquiring a behavior data set of a user, and performing smooth filtering processing on the behavior data set to obtain a noise-free data set;

Identifying a switching point in the noiseless data set, and dividing the noiseless data set into a plurality of noiseless data subsets according to the switching point;

Performing feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set;

Performing feature optimization on the feature data in the feature data set to obtain a preferred feature set;

Classifying the preferred features in the preferred feature set by using a preset classifier to obtain a behavior feature set of the user;

Calculating the score of each behavior feature in the behavior feature set, and determining the behavior type of the user based on the scores.

Optionally, the identifying a switching point in the noise-free data set includes:

Performing data sampling on the noise-free data set by using a plurality of time windows with preset adjustable lengths to obtain a sampled data set;

calculating the energy value of the sampling data set by using a preset energy value algorithm;

judging whether the difference of energy values of first sampling data and second sampling data in the sampling data set is larger than a preset energy threshold value, wherein the first sampling data and the second sampling data are continuous data acquired respectively in different time windows;

and if the difference between the energy values of the first sampling data and the second sampling data in the sampling data set is larger than a preset energy threshold value, determining the midpoint position of the continuous different time windows as a switching point.

Optionally, the energy value algorithm computes the energy value E from the following quantities: s(k) is the value of any one of the sampled data in the sampled data set, μ is the average value of all the sampled data in the sampled data set, δ is the standard deviation of all the sampled data in the sampled data set, and m is the length of the time window.

Optionally, performing feature optimization on the feature data in the feature data set to obtain a preferred feature set includes:

performing weight calculation on the feature data in the feature data set to obtain feature weights;

selecting, from the feature data set, target feature data whose feature weight is greater than a weight threshold, and collecting the target feature data into the preferred feature set.

Optionally, the calculating the score of each behavior feature in the behavior feature set, and determining the behavior type of the user based on the score, includes:

calculating the score of each behavior feature in the behavior feature set by using a scoring algorithm;

weighted average is carried out on the score of each behavior feature in the behavior feature set to obtain the overall behavior score of the user;

and determining the behavior type of the user according to the overall behavior score of the user.

Optionally, the performing smoothing filtering on the behavior data set to obtain a noise-free data set includes:

Eliminating, from the behavior data set, the data whose frequency is greater than the cut-off frequency threshold of the filtering model, and obtaining the noise-free data set after the elimination.

Optionally, the feature extraction of the plurality of noise-free data subsets using the feature extraction network includes:

describing data characteristics of noiseless data in the plurality of noiseless data subsets;

And carrying out feature extraction on the data features by using a machine learning algorithm to obtain feature data.

In order to solve the above problems, the present invention also provides a data analysis device based on user behavior, the device comprising:

The data denoising module is used for acquiring a behavior data set of a user, and performing smooth filtering processing on the behavior data set to obtain a noise-free data set;

The data dividing module is used for identifying switching points in the noiseless data set and dividing the noiseless data set into a plurality of noiseless data subsets according to the switching points;

The feature extraction module is used for performing feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set;

The feature optimization module is used for performing feature optimization on the feature data in the feature data set to obtain a preferred feature set;

The feature classification module is used for classifying the preferred features in the preferred feature set by using a preset classifier to obtain a behavior feature set of the user;

The type discrimination module is used for calculating the score of each behavior feature in the behavior feature set and determining the behavior type of the user based on the scores.

In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:

A memory storing at least one instruction; and

A processor executing the instructions stored in the memory to implement the user behavior-based data analysis method of any one of the above.

In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium comprising a data storage area storing created data and a program storage area storing a computer program; wherein the computer program, when executed by a processor, implements the user behavior-based data analysis method of any one of the above.

According to the embodiment of the invention, the behavior data set is subjected to smooth filtering, so that noise points in the data are removed, and the accuracy of data analysis is improved; by identifying the switching point, the change of the user behavior can be accurately identified, and the accuracy of data analysis is further improved; by extracting the characteristics of a plurality of noise-free data subsets and optimizing, more representative characteristics are screened, invalid data is reduced, effective data is increased, the accuracy of data analysis is improved, and the efficiency of data processing is improved; the user behavior classification is determined based on the scores by calculating the scores of different behavior features, and the user behavior classification is accurately determined in a quantitative mode, so that the accuracy of analysis results is improved. Therefore, the data analysis method, the data analysis device and the computer readable storage medium based on the user behaviors can achieve the aim of improving the efficiency and the accuracy of analyzing the behavior data of the user.

Drawings

FIG. 1 is a flow chart of a data analysis method based on user behavior according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of a data analysis device based on user behavior according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of an internal structure of an electronic device for implementing a data analysis method based on user behavior according to an embodiment of the present invention;

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

The execution subject of the data analysis method based on user behavior provided by the embodiment of the application includes, but is not limited to, at least one of the electronic devices, such as a server or a terminal, that can be configured to execute the method. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.

The invention provides a data analysis method based on user behaviors. Referring to fig. 1, a flow chart of a data analysis method based on user behavior according to an embodiment of the invention is shown. The method may be performed by an apparatus, which may be implemented in software and/or hardware.

In this embodiment, the data analysis method based on user behavior includes:

S1, acquiring a behavior data set of a user, and performing smooth filtering processing on the behavior data set to obtain a noise-free data set.

In the embodiment of the invention, the behavior data set of the user comprises a behavior data set of a certain class of behavior of the user, for example, the user behavior data set comprises driving behavior data of the user when driving a vehicle such as an automobile, wherein the driving behavior data comprises, but is not limited to, real-time driving speed, braking speed, vehicle lane change speed and vehicle steering speed.

In the following description, driving behavior data is taken as the example of behavior data.

Optionally, in the embodiment of the present invention, the behavior data set may be obtained from a vehicle data recorder on the automobile and/or a street camera that monitors the automobile, or from a blockchain node that stores the behavior data set; the high data throughput of the blockchain improves the efficiency of obtaining the behavior data.

Optionally, in the embodiment of the present invention, the smoothing filtering processing is performed on the behavioral data set by using a filtering model, where the filtering model may be an instrument with a smoothing filtering function, such as a low-pass filter.

Specifically, performing smoothing filtering on the behavior data set to obtain a noise-free data set includes:

eliminating, from the behavior data set, the data whose frequency is greater than the cut-off frequency threshold of the filtering model, and obtaining the noise-free data set after the elimination.

In this embodiment, smoothing filtering removes the noise data from the behavior data set, which improves the accuracy of data analysis.
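The cut-off frequency idea can be illustrated with a minimal sketch. It assumes a brick-wall low-pass filter built on a naive discrete Fourier transform; the function name `low_pass`, the sampling rate, the cut-off value and the synthetic speed trace are all hypothetical and are not taken from the patent, which only says that a filtering model such as a low-pass filter may be used.

```python
import cmath
import math

def low_pass(samples, sample_rate_hz, cutoff_hz):
    """Zero every DFT bin whose frequency exceeds cutoff_hz, then invert."""
    n = len(samples)
    # Forward DFT (naive O(n^2); adequate for a short illustrative signal).
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins k and n-k carry the same physical frequency for real input.
        freq_hz = min(k, n - k) * sample_rate_hz / n
        if freq_hz > cutoff_hz:
            spectrum[k] = 0
    # Inverse DFT; the imaginary part is only numerical residue for real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

# Hypothetical speed trace: a slow 1 Hz trend plus 20 Hz jitter, sampled at 100 Hz.
N = 100
speed = [
    math.sin(2 * math.pi * 1 * t / N) + 0.3 * math.sin(2 * math.pi * 20 * t / N)
    for t in range(N)
]
smooth = low_pass(speed, sample_rate_hz=100, cutoff_hz=5)
```

With a 5 Hz cut-off the 20 Hz jitter is removed and only the slow trend remains. A production filter would more likely use an FFT or a recursive filter design; the naive transform is used here only to keep the sketch dependency-free.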

S2, identifying switching points in the noiseless data set, and dividing the noiseless data set into a plurality of noiseless data subsets according to the switching points.

In detail, the switching point refers to a time node that switches from one behavior to another behavior. In this embodiment, the switching point may refer to a time node when the user switches from one driving state to another driving state during driving, such as a time node when the driving state of the user switches from a normal driving state to an acceleration driving state, or a time node when the driving state of the user switches from a normal driving state to a flameout state, etc.

In this embodiment, dividing the noiseless data set into a plurality of noiseless data subsets according to the switching points refers to taking the switching points as boundaries and taking the noiseless data between the switching points as one noiseless data subset. For example, the noise-free data set from the start time to the first switching point is a noise-free data subset, the noise-free data set from the first switching point to the second switching point is a noise-free data subset, and so on, a plurality of noise-free data subsets are obtained.

In actual driving, each driving state is associated with a corresponding data waveform, and the shape of the waveform changes as the driving state changes. The driving state may change at any time, yet the changes between different driving states look very similar over a short period. To accurately find the time nodes at which the driving state switches, the embodiment of the invention preferably uses a preset algorithm to identify the switching points in the noise-free data set.

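Preferably, identifying the switching points in the noise-free data set includes:

performing data sampling on the noise-free data set by using a plurality of time windows with preset adjustable lengths to obtain a sampled data set;

calculating the energy value of the sampled data set by using a preset energy value algorithm;

judging whether the difference between the energy values of first sampled data and second sampled data in the sampled data set is greater than a preset energy threshold, wherein the first sampled data and the second sampled data are continuous data acquired in different time windows;

if the difference is greater than the preset energy threshold, determining the midpoint position of the consecutive time windows as a switching point.

In particular, a sampler may be used to set a time window of adjustable length, with which the noise-free data set is sampled.

Preferably, the energy value algorithm computes the energy value E from the following quantities: s(k) is the value of any one of the sampled data in the sampled data set, μ is the average value of all the sampled data in the sampled data set, δ is the standard deviation of all the sampled data in the sampled data set, and m is the length of the time window.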

According to the embodiment of the invention, whether the driving state of the user is changed is judged according to the obtained energy value, a judging result is obtained, a switching point for switching the driving state of the user in the driving process is determined according to the judging result, and then the noiseless data set is segmented according to the switching point, so that a plurality of noiseless data subsets are obtained.
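The window-and-threshold procedure can be sketched as follows. The energy formula itself is not reproduced in this text, so the sketch assumes one plausible form, the mean squared standardized value of a window, using the mean and standard deviation of the whole data set. That form, the function names, the window length and the threshold are hypothetical choices for illustration only.

```python
import statistics

def window_energy(window, mean, std):
    # Assumed energy form: average of ((s(k) - mean) / std) ** 2 over the window.
    return sum(((s - mean) / std) ** 2 for s in window) / len(window)

def find_switching_points(data, window_len, energy_threshold):
    """Return indices where consecutive windows differ in energy by more than the threshold."""
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    if std == 0:
        return []  # constant signal: no state change to detect
    # Non-overlapping windows of equal length; a trailing partial window is ignored.
    starts = range(0, len(data) - window_len + 1, window_len)
    energies = [window_energy(data[i:i + window_len], mean, std) for i in starts]
    points = []
    for i in range(len(energies) - 1):
        if abs(energies[i + 1] - energies[i]) > energy_threshold:
            # Midpoint of two adjacent equal-length windows is their shared boundary.
            points.append((i + 1) * window_len)
    return points

def split_at(data, points):
    """Divide the data into subsets bounded by the switching points."""
    bounds = [0, *points, len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical speed trace: steady cruising followed by a sharp slowdown.
data = [60.0] * 12 + [20.0] * 4
points = find_switching_points(data, window_len=4, energy_threshold=1.0)
subsets = split_at(data, points)
```

In this example the only large energy jump lies between the third and fourth windows, so the data is split at index 12 into a 12-sample subset and a 4-sample subset.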

S3, performing feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set.

Specifically, feature extraction is performed on each of the noise-free data in the plurality of noise-free data subsets.

Preferably, the feature extraction network includes a plurality of visual layers and a plurality of hidden layers, wherein the visual layers include a plurality of visual units, the hidden layers include a plurality of hidden units, and the plurality of visual layers and the plurality of hidden layers correspond to each other, and the plurality of visual units and the plurality of hidden units correspond to each other.

Performing feature extraction on the plurality of noise-free data subsets by using the feature extraction network comprises:

describing the data features of the noise-free data in the plurality of noise-free data subsets;

performing feature extraction on the data features by using a machine learning algorithm to obtain feature data.

In detail, during feature extraction, each visual unit in a visual layer of the feature extraction network describes the features of one noise-free datum in a noise-free data subset, and each hidden unit in a hidden layer uses a machine learning algorithm to extract the features described by the visual unit matched with that hidden unit.

Specifically, the states of the visual units and the hidden units are represented by Boolean values, e.g., 0 and 1, where 0 represents an inactive state and 1 represents an active state.

Specifically, the activation function of the visual unit and/or the hidden unit yields an activation value E(v, h, θ), wherein I is the number of visual units in the visual layer, J is the number of hidden units in the hidden layer, a is the bias vector of the visual layer, b is the bias vector of the hidden layer, w is the weight matrix between the visual layer and the hidden layer, v is any visual unit in the visual layer, h is any hidden unit in the hidden layer, and θ is a preset error parameter.

When the activation value of the activation function is greater than an activation threshold, the visual unit and/or the hidden unit is activated by the activation function. After the visual unit and/or the hidden unit is activated, the data contained in the visual unit is transmitted to the hidden unit matched with that visual unit.

Preferably, in the embodiment of the present invention, the visual units in the visual layer and the hidden units in the hidden layer are matched by a matching algorithm that yields a matching value P(v, h, θ), wherein v is any visual unit in the visual layer, h is any hidden unit in the hidden layer, θ is a preset error parameter, Z is a normalization factor of the feature extraction network, and exp(-E(v, h, θ)) is the expectation that the visual unit v is matched with the hidden unit h.

Preferably, the activated visual layer may transmit data to the hidden layer that is matched with the visual layer and has been activated only after the visual unit in the visual layer is matched with the hidden unit in the hidden layer.

Further, when a visual unit in a given visual layer is activated, the probability that the corresponding hidden unit in the hidden layer is also activated is P(v_j = 1 | h; θ), wherein v_j is the j-th hidden unit in the hidden layer, h is any hidden unit in the hidden layer, θ is a preset error parameter, J is the number of hidden units in the hidden layer, w is the weight matrix between the visual layer and the hidden layer, b is the bias vector of the hidden layer, and δ is a preset probability coefficient.

When a hidden unit in a given hidden layer is activated, the probability that the corresponding visual unit in the visual layer is also activated is P(h_i = 1 | v; θ), wherein h_i is the i-th visual unit in the visual layer, v is any visual unit in the visual layer, θ is a preset error parameter, I is the number of visual units in the visual layer, w is the weight matrix between the visual layer and the hidden layer, a is the bias vector of the visual layer, and δ is a preset probability coefficient.

In the embodiment of the present invention, after a visual unit (or hidden unit) is activated, the hidden unit (or visual unit) matched with it is considered activated only when its activation probability is 1.
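The symbols defined above (bias vectors a and b, weight matrix w, normalization factor Z, and the term exp(-E(v, h, θ))) match the usual notation of a restricted Boltzmann machine (RBM). Assuming that reading, which this text does not state explicitly, the standard RBM expressions are sketched below for orientation. They follow the conventional naming, with v_i the i-th unit of the visual (visible) layer and h_j the j-th hidden unit, which may differ from the symbol naming used in the passage above, and σ denotes the logistic sigmoid, which is an assumption about the role of the coefficient δ.

```latex
\begin{aligned}
E(v, h; \theta) &= -\sum_{i=1}^{I} a_i v_i - \sum_{j=1}^{J} b_j h_j
                   - \sum_{i=1}^{I}\sum_{j=1}^{J} w_{ij}\, v_i h_j \\
P(v, h; \theta) &= \frac{\exp\bigl(-E(v, h; \theta)\bigr)}{Z},
  \qquad Z = \sum_{v}\sum_{h} \exp\bigl(-E(v, h; \theta)\bigr) \\
P(h_j = 1 \mid v; \theta) &= \sigma\Bigl(b_j + \sum_{i=1}^{I} w_{ij}\, v_i\Bigr) \\
P(v_i = 1 \mid h; \theta) &= \sigma\Bigl(a_i + \sum_{j=1}^{J} w_{ij}\, h_j\Bigr),
  \qquad \sigma(x) = \frac{1}{1 + e^{-x}}
\end{aligned}
```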

In the embodiment of the invention, the visual layers and hidden layers of a plurality of feature extraction networks are stacked, so that features are extracted more accurately in the time domain.

Specifically, the hidden layer performing feature extraction on the data features described by the visual layer by using a machine learning algorithm to obtain feature data includes:

computing the feature data h from the data features described by the visual layer, wherein Y is the noise-free data in the noise-free data subset, w is the weight matrix between the visual layer and the hidden layer, and b is the bias vector of the hidden layer.

After feature extraction has been completed for the noise-free data in the plurality of noise-free data subsets, a plurality of feature data are obtained and collected into the feature data set.
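As a rough illustration of how feature data h could be computed from Y, w and b, the sketch below assumes a sigmoid applied to a weighted sum, as in a typical hidden layer. The formula is not reproduced in this text, so the sigmoid, the function names and the example numbers are assumptions rather than the patent's algorithm.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def extract_features(y, w, b):
    """Compute one hidden activation per hidden unit.

    y: list of I input values (noise-free data described by the visual layer)
    w: I x J weight matrix as nested lists, w[i][j]
    b: list of J hidden-layer biases
    """
    return [
        sigmoid(sum(y[i] * w[i][j] for i in range(len(y))) + b[j])
        for j in range(len(b))
    ]

# Hypothetical weights and input, chosen only to exercise the function.
w = [[0.5, -1.0], [0.25, 2.0]]
b = [0.0, 0.0]
h_zero = extract_features([0.0, 0.0], w, b)   # zero input and bias give 0.5 per unit
h = extract_features([1.0, 2.0], w, b)
```

Each value in `h` lies strictly between 0 and 1, so it can be read as an activation strength for the corresponding hidden unit.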

S4, performing feature optimization on the feature data in the feature data set to obtain a preferred feature set.

In an embodiment of the present invention, performing feature optimization on the feature data in the feature data set to obtain a preferred feature set includes:

performing weight calculation on the feature data in the feature data set to obtain feature weights;

selecting, from the feature data set, target feature data whose feature weight is greater than a weight threshold, and collecting the target feature data into the preferred feature set.

Specifically, the feature weight X of the feature data is calculated by a preset weight algorithm, wherein V is any one of the feature data in the feature data set, f is a preset weight function, and the weight coefficient is preset for the different behavior states of the user.

Preferably, a plurality of features, such as a speed feature and an intersection congestion feature during driving, may be extracted from one noise-free data subset; however, not all of these features are representative, which is why only the features whose weight exceeds the weight threshold are retained.
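The weight-then-threshold selection can be sketched as below. The weight function f is not specified in this text, so the sketch accepts any caller-supplied function and uses population variance purely as a hypothetical stand-in; the feature names, values and threshold are likewise invented for illustration.

```python
import statistics

def select_preferred(features, weight_fn, weight_threshold):
    """Keep only the features whose computed weight exceeds the threshold.

    features: mapping of feature name to a list of feature values
    weight_fn: callable returning a numeric weight for a list of values
    """
    weights = {name: weight_fn(values) for name, values in features.items()}
    return {
        name: features[name]
        for name, weight in weights.items()
        if weight > weight_threshold
    }

# Hypothetical feature data: one informative feature and one constant feature.
features = {
    "speed": [0.1, 0.9, 0.5],
    "intersection_congestion": [0.5, 0.5, 0.5],
}
# Variance is a placeholder weight function, not the patent's preset function f.
preferred = select_preferred(features, statistics.pvariance, weight_threshold=0.05)
```

Here the constant feature receives a weight of zero and is dropped, while the varying speed feature is retained in the preferred feature set.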

S5, classifying the preferred features in the preferred feature set by using a preset classifier to obtain a behavior feature set of the user.

In an embodiment of the present invention, the classification includes, but is not limited to: normal running, accelerating running, overspeed running, lane changing running, braking and flameout.

In detail, the classifying the preferred features in the preferred feature set by using a preset classifier to obtain the behavior features of the user includes:

The preferred features in the preferred feature set are classified using a softmax classifier. For example, the softmax classifier assigns the preferred features whose running speed is greater than a preset speed threshold to overspeed running, and so on. After all the preferred features in the preferred feature set are classified, the behavior feature set of the user is obtained.
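A softmax classification step can be sketched as follows. The class labels are taken from the list given in this embodiment; the raw class scores (logits) would normally come from a trained model, and the values used here are hypothetical.

```python
import math

LABELS = [
    "normal running",
    "accelerating running",
    "overspeed running",
    "lane changing running",
    "braking",
    "flameout",
]

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical scores for one preferred feature; the third class dominates.
logits = [0.1, 0.2, 3.0, 0.0, -1.0, 0.5]
label, prob = classify(logits)
```

The classifier picks the label with the highest probability, so the example feature is assigned to overspeed running.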

S6, respectively calculating the scores of the behavior features in the behavior feature set, and determining the behavior type of the user based on the scores.

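Specifically, calculating the score of each behavior feature in the behavior feature set and determining the behavior type of the user based on the scores includes:

calculating the score of each behavior feature in the behavior feature set by using a scoring algorithm;

performing a weighted average of the scores of the behavior features in the behavior feature set to obtain the overall behavior score of the user;

determining the behavior type of the user according to the overall behavior score of the user.

In detail, in the embodiment of the present invention, the scoring algorithm is:

Score = asg(ks, -log(psi))

where Score is the score of the behavior feature, ks is the behavior feature, psi is the score error factor, and asg is the score operator.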

After obtaining the score of each behavior feature in the behavior feature set, the embodiment of the invention takes the weighted average of these scores as the overall behavior score of the user, and classifies the user according to that overall behavior score.

For example, a user whose overall behavior score falls in the interval (a, b) is assigned the behavior type with more dangerous behaviors, a user whose score falls in (b, c) is assigned the behavior type with some dangerous behaviors, and a user whose score falls in (c, d) is assigned the behavior type with almost no dangerous behaviors.
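The weighted average and the interval lookup can be sketched as below. The per-feature scores, the weights and the interval bounds a, b, c and d are hypothetical values, the treatment of scores that fall exactly on a bound is an assumption, and the score operator asg is not implemented because it is not specified in this text.

```python
def overall_score(scores, weights):
    """Weighted average of per-feature scores (weights need not sum to 1)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

def behavior_type(score, a, b, c, d):
    """Map an overall score to a behavior type using interval bounds a < b < c < d."""
    if a <= score < b:
        return "more dangerous behaviors"
    if b <= score < c:
        return "some dangerous behaviors"
    if c <= score <= d:
        return "almost no dangerous behaviors"
    raise ValueError(f"score {score} is outside the range [{a}, {d}]")

# Hypothetical per-feature scores and weights.
scores = {"overspeed running": 40, "braking": 70, "normal running": 90}
weights = {"overspeed running": 5, "braking": 3, "normal running": 2}
total = overall_score(scores, weights)            # (200 + 210 + 180) / 10
kind = behavior_type(total, a=0, b=60, c=80, d=100)
```

With these numbers the overall score is 59, which falls in the lowest interval and therefore maps to the type with more dangerous behaviors.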

Further, the embodiment of the invention further comprises pushing the score of each behavior feature in the behavior feature set and the user behavior type determined based on the score to the user, for example, pushing by means of mobile phone short messages and/or apps on the mobile phones of the user. Through the message pushing mode, the user can improve the driving habit of the user according to the received pushing data.

According to the embodiment of the invention, the behavior data set is subjected to smooth filtering, so that noise points in the data are removed, and the accuracy of data analysis is improved; by identifying the switching point, the change of the user behavior can be accurately identified, and the accuracy of data analysis is further improved; by extracting the characteristics of a plurality of noise-free data subsets and optimizing, more representative characteristics are screened, invalid data is reduced, effective data is increased, the accuracy of data analysis is improved, and the efficiency of data processing is improved; the user behavior classification is determined based on the scores by calculating the scores of different behavior features, and the user behavior classification is accurately determined in a quantitative mode, so that the accuracy of analysis results is improved. Therefore, the data analysis method based on the user behavior can achieve the purpose of improving the efficiency and accuracy of analyzing the behavior data of the user.

Fig. 2 is a schematic block diagram of a data analysis device based on user behavior according to the present invention.

The data analysis device 100 based on user behavior according to the present invention may be installed in an electronic device. According to the functions implemented, the data analysis device based on user behavior may include a data denoising module 101, a data dividing module 102, a feature extraction module 103, a feature optimization module 104, a feature classification module 105, and a type discrimination module 106. A module of the present invention may also be referred to as a unit, meaning a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform fixed functions.

In the present embodiment, the functions concerning the respective modules/units are as follows:

The data denoising module 101 is configured to obtain a behavior data set of a user, and perform smoothing filtering processing on the behavior data set to obtain a noise-free data set;

the data dividing module 102 is configured to identify a switching point in the noise-free data set, and divide the noise-free data set into a plurality of noise-free data subsets according to the switching point;

The feature extraction module 103 is configured to perform feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set;

The feature optimization module 104 is configured to perform feature optimization on feature data in the feature data set to obtain a preferred feature set;

The feature classification module 105 is configured to classify preferred features in the preferred feature set by using a preset classifier, so as to obtain a behavior feature set of the user;

The type discriminating module 106 is configured to calculate a score of each behavior feature in the behavior feature set, and determine a behavior type of the user based on the score.

In detail, the specific implementation steps of each module of the data analysis device based on user behavior are as follows:

the data denoising module 101 is configured to obtain a behavior data set of a user, and perform smoothing filtering processing on the behavior data set to obtain a noise-free data set.

Optionally, in the embodiment of the present invention, the behavior data set may be obtained from a vehicle recorder on the automobile and/or a camera on the street for monitoring the automobile.

The data dividing module 102 is configured to identify a switching point in the noise-free data set, and divide the noise-free data set into a plurality of noise-free data subsets according to the switching point.

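Preferably, the data dividing module 102 is specifically configured to:

perform data sampling on the noise-free data set by using a plurality of time windows with preset adjustable lengths to obtain a sampled data set;

calculate the energy value of the sampled data set by using a preset energy value algorithm;

judge whether the difference between the energy values of first sampled data and second sampled data in the sampled data set is greater than a preset energy threshold, wherein the first sampled data and the second sampled data are continuous data acquired in different time windows;

if the difference between the energy values of the first sampled data and the second sampled data is greater than the preset energy threshold, determine the midpoint position of the consecutive time windows as a switching point;

divide the noise-free data set into a plurality of noise-free data subsets according to the switching points.

Preferably, the energy value algorithm computes the energy value E from the following quantities: s(k) is the value of any one of the sampled data in the sampled data set, μ is the average value of all the sampled data in the sampled data set, δ is the standard deviation of all the sampled data in the sampled data set, and m is the length of the time window.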

The feature extraction module 103 is configured to perform feature extraction on the plurality of noise-free data subsets by using a feature extraction network, so as to obtain a feature data set.

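Performing feature extraction on the plurality of noise-free data subsets by using the feature extraction network includes: describing the data features of the noise-free data in the plurality of noise-free data subsets; and performing feature extraction on the data features by using a machine learning algorithm to obtain feature data.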

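The feature optimization module 104 is configured to perform feature optimization on feature data in the feature data set to obtain a preferred feature set.

Specifically, the feature weight X of each feature data is calculated by the preset weight algorithm, and the target feature data whose feature weight is greater than the weight threshold are collected into the preferred feature set.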

The feature classification module 105 is configured to classify the preferred features in the preferred feature set by using a preset classifier, so as to obtain a behavior feature set of the user.

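Specifically, the type discrimination module 106 is configured to:

calculate the score of each behavior feature in the behavior feature set by using the scoring algorithm Score = asg(ks, -log(psi));

perform a weighted average of the scores of the behavior features in the behavior feature set to obtain the overall behavior score of the user;

determine the behavior type of the user according to the overall behavior score of the user.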

Fig. 3 is a schematic structural diagram of an electronic device implementing a data analysis method based on user behavior according to the present invention.

The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a data analysis program 12 based on user behavior.

The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash memory card (Flash Card) or the like provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as the code of the data analysis program 12 based on user behavior, but also for temporarily storing data that has been output or is to be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit (Control Unit) of the electronic device: it connects the parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (for example, the data analysis program based on user behavior), and invokes data stored in the memory 11 to perform the various functions of the electronic device 1 and to process data.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection and communication between the memory 11 and the at least one processor 10, among other components.

The electronic device 1 in the embodiment of the invention acquires a behavior data set of a user and performs smoothing filtering on the behavior data set to obtain a noise-free data set; identifies a switching point in the noise-free data set and divides the noise-free data set into a plurality of noise-free data subsets according to the switching point; performs feature extraction on the plurality of noise-free data subsets by using a feature extraction network to obtain a feature data set; performs feature optimization on the feature data in the feature data set to obtain a preferred feature set; classifies the preferred features in the preferred feature set by using a preset classifier to obtain a behavior feature set of the user; and calculates the score of each behavior feature in the behavior feature set and determines the behavior type of the user based on the scores. Smoothing filtering removes noise points from the behavior data set, which improves the accuracy of data analysis. Identifying the switching point allows changes in user behavior to be recognized accurately, which further improves the accuracy of data analysis. Extracting features from the plurality of noise-free data subsets and optimizing them screens out more representative features, reducing invalid data and increasing effective data, which improves both the accuracy of data analysis and the efficiency of data processing. Calculating the scores of the different behavior features and determining the behavior type from those scores classifies user behavior quantitatively, which improves the accuracy of the analysis results. Therefore, the data analysis device based on user behavior can improve the efficiency and accuracy of analyzing the behavior data of the user.
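The switching-point step summarized above (sample the noise-free data in time windows, compute an energy value per window, and mark a switching point where consecutive windows differ by more than a preset energy threshold) can be sketched in Python as follows. This is a minimal sketch under loud assumptions, not the patented implementation: the energy formula is not reproduced in this text, so `window_energy` assumes the mean squared deviation from the overall mean μ, measured in units of the overall standard deviation δ. The function names and the use of fixed, non-overlapping windows are also assumptions.

```python
from statistics import mean, pstdev


def window_energy(window, mu, delta):
    """Assumed energy: mean squared deviation from mu, in units of delta."""
    return sum(((x - mu) / delta) ** 2 for x in window) / len(window)


def find_switch_points(data, window_len, energy_threshold):
    """Return indices where consecutive windows differ in energy by more than the threshold."""
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    mu = mean(data)
    delta = pstdev(data)
    if delta == 0:
        return []  # constant signal: no change in behavior to detect
    # Fixed-length, non-overlapping windows; a trailing partial window is ignored.
    windows = [data[i:i + window_len]
               for i in range(0, len(data) - window_len + 1, window_len)]
    energies = [window_energy(w, mu, delta) for w in windows]
    points = []
    for k in range(len(energies) - 1):
        if abs(energies[k + 1] - energies[k]) > energy_threshold:
            # Boundary shared by the two windows, i.e. the midpoint of their combined span.
            points.append((k + 1) * window_len)
    return points


def split_at(data, points):
    """Divide the data into subsets at the switching points."""
    bounds = [0, *points, len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]
```

With equal-length adjacent windows, the midpoint of the two windows coincides with their shared boundary, which is why the sketch returns boundary indices. Windows of adjustable length, as the method allows, would need the midpoint computed from the actual window spans.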

Fig. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in Fig. 3 does not limit the electronic device 1, which may comprise fewer or more components than shown, may combine certain components, or may use a different arrangement of components.

For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component. Preferably, the power source is logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power source may also include any one or more of a direct current or alternating current power supply, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described here.

Further, the electronic device 1 may also comprise a network interface. Optionally, the network interface may comprise a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further comprise a user interface, which may be a display, an input unit such as a keyboard, or a standard wired or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the electronic device 1 and to display a visual user interface.

It should be understood that the embodiments described are for illustrative purposes only, and that the scope of the patent application is not limited to this configuration.

The data analysis program 12 based on user behavior stored in the memory 11 in the electronic device 1 is a combination of instructions which, when run in the processor 10, can implement:

Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. The computer readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a Read-Only Memory (ROM).

Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.

In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in the claims should not be construed as limiting the claim concerned.

A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, in which each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
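To make the notion of cryptographically linked data blocks concrete, the following minimal Python sketch builds a chain in which each block stores the hash of its predecessor, so that altering any stored batch invalidates the chain. It is an illustration only: the block layout, the use of SHA-256, and the names `block_hash`, `build_chain` and `is_valid` are assumptions, and the sketch omits consensus, networking, and everything else a real blockchain platform provides.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, transactions, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Create one block per batch, each linked to the previous block's hash."""
    chain = []
    prev = GENESIS_PREV_HASH
    for index, batch in enumerate(batches):
        block = {
            "index": index,
            "transactions": batch,
            "prev_hash": prev,
            "hash": block_hash(index, batch, prev),
        }
        chain.append(block)
        prev = block["hash"]
    return chain


def is_valid(chain):
    """Check every link and recompute every hash to detect tampering."""
    prev = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        expected = block_hash(block["index"], block["transactions"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev = block["hash"]
    return True
```

A behavior data set or feature data set stored this way could be checked for tampering by calling `is_valid` on the chain before use.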

Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by a single unit or means, in software or in hardware. Terms such as "first" and "second" are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. A method of data analysis based on user behavior, the method comprising:

respectively calculating the scores of all the behavior features in the behavior feature set, and determining the behavior types of the user based on the scores;

wherein said identifying a switching point in said noise-free data set comprises: performing data sampling on the noise-free data set by using a plurality of time windows with preset adjustable lengths to obtain a sampled data set; calculating the energy value of the sampled data set by using a preset energy value algorithm; judging whether the difference between the energy values of first sampled data and second sampled data in the sampled data set is larger than a preset energy threshold, wherein the first sampled data and the second sampled data are data acquired respectively in consecutive, different time windows; if the difference between the energy values of the first sampled data and the second sampled data in the sampled data set is larger than the preset energy threshold, determining the midpoint position of the consecutive, different time windows as a switching point;

the energy value algorithm is as follows:

wherein E is the energy value, μ is the mean value of all the sampled data in the sampled data set, δ is the standard deviation of all the sampled data in the sampled data set, and the remaining two quantities in the formula are the value of any one item of sampled data in the sampled data set and the length of the time window;

said performing feature optimization on the feature data in the feature data set to obtain a preferred feature set comprises: performing weight calculation on the feature data in the feature data set to obtain a feature weight X; selecting, from the feature data set, target feature data whose feature weight X is larger than a weight threshold, and collecting the target feature data into the preferred feature set;

said respectively calculating the scores of the behavior features in the behavior feature set and determining the behavior type of the user based on the scores comprises: calculating the score of each behavior feature in the behavior feature set by using a scoring algorithm; performing a weighted average of the scores of the behavior features in the behavior feature set to obtain the overall behavior score of the user; determining the behavior type of the user according to the overall behavior score of the user;

the scoring algorithm is as follows: Score = asg(ks, -log(psi))

wherein Score is the score of the behavior feature, ks is the behavior feature, psi is the score error factor, and asg is the score operator;

the weight algorithm of the feature weight X is as follows: X = phi(f, V)

wherein V is any item of feature data in the feature data set, phi is a weight coefficient preset for different behavior states of the user, and f is a preset weight function.

2. The method for analyzing data based on user behavior according to claim 1, wherein the smoothing filtering the behavior data set to obtain a noise-free data set comprises:

3. The method of user behavior based data analysis according to claim 1, wherein the feature extraction of the plurality of noise-free data subsets using a feature extraction network comprises:

4. A user behavior based data analysis apparatus for implementing the user behavior based data analysis method according to any one of claims 1 to 3, characterized in that the apparatus comprises:

5. An electronic device, the electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the user behavior based data analysis method of any one of claims 1 to 3.

6. A computer-readable storage medium comprising a storage data area storing created data and a storage program area storing a computer program; wherein the computer program, when executed by a processor, implements a data analysis method based on user behavior as claimed in any one of claims 1 to 3.