CA3132346A1 - User abnormal behavior recognition method and device and computer readable storage medium - Google Patents

Publication number
CA3132346A1
Authority
CA
Canada
Prior art keywords
data
user
neural network
neuron
winning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA3132346A
Other languages
French (fr)
Other versions
CA3132346C (en)
Inventor
Yiwen Li
Xin Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3132346A1
Application granted
Publication of CA3132346C
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses a user anomaly behavior identification method, an apparatus, and a computer readable storage medium, in the field of computer technology. The method comprises: obtaining time series data and spatial series data associated with user behavior; according to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, a confidence interval of the indicator for the user at the pre-set time point; comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, to obtain a first detection result for the user behavior; according to the spatial series data, performing anomaly detection through a pre-trained SOM neural network model, to obtain a second detection result for the user behavior; and, according to the first detection result and the second detection result, performing anomaly identification on the user behavior. The implementations of the present invention can achieve accurate and reliable identification of user anomaly behavior.

Description

USER ABNORMAL BEHAVIOR RECOGNITION METHOD AND DEVICE AND COMPUTER READABLE STORAGE MEDIUM
Field
[0001] The present disclosure relates to the field of computer technology, and particularly to a user anomaly behavior identification method, apparatus, and computer readable storage medium.
Background
[0002] Information security is an increasingly prominent concern. The theft of commonly used network and app accounts may cause information leakage or transfer of funds, or the stolen account may be used as a springboard for a series of attacks on important assets. Many industries do not have a clear identification and tracking method, so the biggest victims are often the users themselves. Because account permissions differ, it is difficult to judge simply from the extent of activities whether behavior is illegal; and because business is complex, it is difficult to accurately determine whether an account is in a normal status or an anomaly status. An anomaly status is a phenomenon or an event, generated by various anomaly activities, that is not consistent with the user's routine.
[0003] At present, the identification of user anomaly behavior usually adopts K-Means clustering, an unsupervised machine learning algorithm. However, the K-Means algorithm needs the number of classes (k) to be determined in advance, and after finding the most similar class for each input data, it updates only the parameters of that class. The result of each run is therefore unstable under the influence of the initial values and noise data, and as a result risky users cannot be accurately and reliably identified.
Invention Content
[0004] To solve the problems described in the background above, the present invention provides a user anomaly behavior identification method, apparatus and computer readable storage medium, to achieve accurate and reliable identification of user anomaly behavior.
[0005] The technical solutions provided in implementations of the present invention are as follows:

Date recue / Date received 2021-11-29
[0006] The first aspect provides a user anomaly behavior identification method, comprising:
[0007] Obtaining time series data and spatial series data associated with user behavior;
[0008] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, the confidence interval of the indicator for the user at the pre-set time point;
[0009] Comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior;
[0010] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0011] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0012] Furthermore, the ARIMA model is constructed by following methods:
[0013] Obtaining time series sample data associated with sample user behavior;
[0014] Performing a stationarity test on the time series sample data, and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0015] For the stationary time series sample data, establishing an initial ARIMA model, and determining ranges for the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0016] Using the AIC information criterion, determining the optimal combination of autoregressive order and moving average order for the initial ARIMA model, and constructing the ARIMA model.

[0017] Furthermore, the SOM neural network model is trained by following methods to obtain:
[0018] S1, initializing the weight of each neuron in the pre-set SOM neural network;
[0019] S2, obtaining spatial series sample data associated with behavior of sample user, normalization processing each spatial series sample data, obtaining training sample set;
[0020] S3, randomly selecting training samples from the training sample set to be input into the SOM neural network input layer, obtaining input vector;
[0021] S4, according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network, searching for winning neuron corresponding to the input vector;
[0022] S5, using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron;
[0023] S6, iteratively executing steps S3 to S5 until a pre-set end condition is reached, at which point the training ends, obtaining the SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0024] Furthermore, according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining a second detection result for the user behavior, comprising:
[0025] Normalization processing the spatial series data, regarding the normalization processed spatial series data as input parameters, the parameters are input to the SOM neural network model, according to the Euclidean distance from the input parameters to each neuron, determining the winning neurons corresponding to the input parameters and cluster of the winning neurons;
[0026] Calculating cluster area of the winning neurons, comparing the cluster area with an area threshold, wherein, only when the cluster area is less than the area threshold, the cluster is an anomaly cluster;

[0027] According to the comparing result, generating a second detection result for user behavior.
[0028] Furthermore, according to the first detection result and the second detection result, after anomaly recognition of user behavior, the method comprises:
[0029] If the identification result of user behavior is the user anomaly behavior, then performing identification authentication on the user, or restricting the user's operations and behavior.
[0030] The second aspect provides an apparatus for identifying user anomaly behavior, the apparatus comprises:
[0031] A data obtaining module configured to obtain time series data and spatial series data associated with user behavior;
[0032] A first detection module configured to predict, through the ARIMA model, the confidence interval of the indicator for the user at the pre-set time point according to a plurality of actual indicator values before the pre-set time point in the time series data, and to compare the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, to obtain a first detection result for the user behavior;
[0033] A second detection module configured to perform anomaly detection through the pre-trained SOM neural network model according to the spatial series data and obtain a second detection result for the user behavior;
[0034] An anomaly identification module configured to perform anomaly identification on the user behavior according to the first detection result and the second detection result.
[0035] Furthermore, the apparatus also comprises construction module, the construction module is specifically for:
[0036] Obtaining time series sample data associated with sample user behavior;

[0037] Performing a stationarity test on the time series sample data, and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0038] For the stationary time series sample data, establishing an initial ARIMA model, and determining ranges for the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0039] Using the AIC information criterion, determining the optimal combination of autoregressive order and moving average order for the initial ARIMA model, and constructing the ARIMA model.
[0040] Furthermore, the apparatus also comprises training module, the training module comprises:
[0041] An initializing submodule configured to initialize weight of each neuron in the pre-set SOM neural network;
[0042] A pre-processing submodule configured to obtain spatial series sample data associated with behavior of sample user, normalization processing each spatial series sample data, obtaining training sample set;
[0043] A training submodule configured to randomly select training samples from the training sample set to be input into the SOM neural network input layer and obtain an input vector; according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network, to search for the winning neuron corresponding to the input vector; and, using the gradient descent method, to perform a weight update on the winning neuron and on each neuron in the set of neurons around the winning neuron;
[0044] An iteration submodule configured to iteratively execute steps in training submodule, the training ends until reaching the pre-set end condition, obtaining the SOM neural network, and obtaining a plurality of clusters output by the SOM neural network model.
[0045] Furthermore, the second detection module is specifically for:
[0046] Normalization processing the spatial series data, regarding the normalization processed spatial series data as input parameters, the parameters are input to the SOM neural network model, according to the Euclidean distance from the input parameters to each neuron, determining the winning neurons corresponding to the input parameters and cluster of the winning neurons;
[0047] Calculating cluster area of the winning neurons, comparing the cluster area with an area threshold, wherein, only when the cluster area is less than the area threshold, the cluster is an anomaly cluster;
[0048] According to the comparing result, generating a second detection result for user behavior.
[0049] Furthermore, the apparatus also comprises anomaly processing module, the anomaly processing module is specifically for:
[0050] if the identification result of user behavior is the user anomaly behavior, then performing identification authentication on the user, or restricting the user's operations and behavior.
[0051] The third aspect provides a computer device, comprising:
[0052] one or a plurality of processors;
[0053] A storage apparatus configured to store one or a plurality of programs;
[0054] When one or a plurality of programs are executed by one or a plurality of processors, the processors achieve following operation steps:
[0055] Obtaining time series data and spatial series data associated with user behavior;
[0056] According to a plurality of actual indicators values before pre-set time point in the time series data, predicting confidence interval of the indicator through ARIMA model when user is at the pre-set time point;
[0057] Comparing actual indicator value when user is at the pre-set time point with correspondingly confidence interval of indicator, obtaining first detection result for the user behavior;

[0058] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0059] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0060] The fourth aspect provides a computer readable storage medium stored with a computer program configured to achieve following operation steps when the processor executes the computer program:
[0061] Obtaining time series data and spatial series data associated with user behavior;
[0062] According to a plurality of actual indicators values before pre-set time point in the time series data, predicting confidence interval of the indicator through ARIMA model when user is at the pre-set time point;
[0063] Comparing actual indicator value when user is at the pre-set time point with correspondingly confidence interval of indicator, obtaining first detection result for the user behavior;
[0064] According to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining second detection result for the user behavior;
[0065] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0066] Comparing to prior art, the technical effects achieved by the technical solutions of the present invention are:
[0067] 1. Using the SOM neural network clustering algorithm which has non-linearity, robustness and strong adaptive learning ability, also has the outstanding ability to manage uncertainty or fuzzy information, and overcome the limitation of K-Means algorithm which is influenced by the pre-determined K value and noise data, then improve the accuracy and reliability of the identification of user anomaly behavior;

[0068] 2. Using the combination of the ARIMA model and the SOM neural network model to mine user anomaly points in both the time and space dimensions improves the ability and accuracy of identifying anomaly points compared with a single traditional method.
Drawing Description
[0069] In order to describe the technical solutions in the implementations of the present application or the prior art more clearly, the drawings that need to be used are briefly introduced below. Obviously, the drawings in the following description show only some implementations of the application; those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.
[0070] Figure 1 is a process diagram of a user anomaly behavior identification method in the implementation of the present invention;
[0071] Figure 2 is a structural diagram of a user anomaly behavior identification apparatus in the implementation of the present invention;
[0072] Figure 3 is an internal structure of a computer device in the implementation of the present invention.
Specific implementation methods
[0073] In order to make the purpose, technical solutions and benefits of the present invention clearer, the technical solutions of the implementations in the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described implementations are only a part of the implementations in the present application. Based on the implementations in the present application, all other implementations obtained by those of ordinary skill in the art fall within the protection scope of the present application.
[0074] It should be noted that, unless the context clearly requires otherwise, words such as "comprising" and "contains" throughout the specification and claims should be interpreted in an inclusive rather than an exclusive or exhaustive sense; in other words, they mean "including but not limited to".
[0075] In addition, in the description of the present invention, the terms "first", "second", etc. are only used for descriptive purpose, they can not be understood as indicating or implying relative importance. In addition, in the description of the present invention, unless indicated, 'plurality' means two or more than two.
[0076] As described in the background, the current identification of user anomaly behavior usually adopts the unsupervised K-Means clustering machine learning algorithm. However, the K-Means algorithm needs the number of classes (k) to be determined in advance, and after finding the most similar class for each input data, it updates only the parameters of that class; the result of each run is therefore unstable under the influence of the initial values and noise data, and as a result risky users cannot be accurately and reliably identified. For this reason, the implementations of the present invention provide a user anomaly behavior identification method that combines the ARIMA model and the SOM neural network model to mine user anomaly points in both the time and space dimensions, which improves the ability and accuracy of identifying anomaly points compared with a single traditional method. Meanwhile, the SOM neural network clustering algorithm has non-linearity, robustness and strong adaptive learning ability, has an outstanding ability to manage uncertain or fuzzy information, and overcomes the limitation of the K-Means algorithm, which is influenced by the pre-determined K value and noise data.
[0077] Implementation one
[0078] The implementation of the present invention provides a user anomaly behavior identification method. The method is applied to a user anomaly behavior identification apparatus, which can be configured in any computer device, wherein the computer device can be a server, and the server can be an independent server or a server cluster consisting of a plurality of servers.
[0079] As shown in Figure 1, the method for user anomaly behavior identification provided by the implementation of the present invention can comprise following steps:
[0080] 101, obtaining time series data and spatial series data associated with user behavior;

[0081] Specifically, user data within a pre-set time period can be obtained, then pre-processing the user data to extract time series data and spatial series data associated with user behavior.
[0082] Among them, the user data comprises user attribute data and user behavior data, the user attribute data can comprise: name, age, mailing address, etc.; the user behavior data can comprise IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, the user device information can comprise device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O and other information.
[0083] Wherein, the time series data is a series of indicator values obtained by sorting the actual indicator values of the user within a pre-set time period in chronological order. The indicator value refers to the value of a parameter indicator obtained from statistics on the user's numerical data related to user behavior in the pre-set time period. The parameter indicator can be one of online duration, device moving distance and change in screen temperature, or it can be another indicator.
[0084] The spatial series data refers to user behavior trajectory data arranged in spatial order during use of the application, where there is a connection of sequence, flow, and direction between the spaces. For example, the behavior trajectory data involved when the user logs in to the application to perform a transfer operation forms the user's spatial series data.
[0085] 102, according to a plurality of actual indicators values before pre-set time point in the time series data, predicting confidence interval of the indicator through ARIMA model when user is at the pre-set time point.
[0086] Wherein, the pre-set time point can be the time point corresponding to the Nth data in the M data included in the time series data, where N is greater than 1 and N is less than M.
[0087] Specifically, the actual values of a plurality of indicators before a pre-set time point in the time series data are substituted into the ARIMA model for predicting, obtaining the predicted values of indicators at the pre-set time point and the confidence interval of the predicted values of indicators when the confidence level is a.
[0088] Among them, ARIMA (Autoregressive Integrated Moving Average) is a model that predicts future values from past and present values; it regards the time series as a random series and finds an optimal function to fit it.
[0089] Wherein, the ARIMA(p, d, q) model is defined as follows:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \dots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \dots - \theta_q e_{t-q};$$
[0090] Wherein, $p$ is the autoregressive order, $d$ is the differencing order of the series, $q$ is the moving average order, $y_t$ is the time series observation at time $t$, $e_t$ is a white noise series, and $\varphi_i$, $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
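As a rough illustration of the definition above, the following sketch evaluates its right-hand side for a single time step on a series that is assumed to be already differenced. The function name `arma_one_step` and the coefficient values are hypothetical; in practice the coefficients come from fitting the model to sample data.

```python
def arma_one_step(y_hist, e_hist, phi, theta, e_t=0.0):
    """Evaluate y_t = sum(phi_i * y_{t-i}) + e_t - sum(theta_j * e_{t-j}).

    y_hist and e_hist hold past observations and past noise terms,
    ordered oldest to newest. All names here are illustrative only.
    """
    p, q = len(phi), len(theta)
    if len(y_hist) < p or len(e_hist) < q:
        raise ValueError("not enough history for the requested orders")
    # Autoregressive part: phi[0] multiplies the most recent observation.
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(p))
    # Moving average part: theta[0] multiplies the most recent noise term.
    ma_part = sum(theta[j] * e_hist[-1 - j] for j in range(q))
    return ar_part + e_t - ma_part


# Hypothetical coefficients with p = 2 and q = 1.
prediction = arma_one_step([1.0, 2.0, 3.0], [0.1, -0.2],
                           phi=[0.5, 0.25], theta=[0.4])
```

For a point forecast the current noise term is unknown, so the sketch defaults `e_t` to its expected value of zero.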
[0091] Furthermore, the ARIMA model can be constructed by the following steps a to d:
[0092] a. Obtaining time series sample data associated with user behavior.
[0093] Specifically, the implementation process of this step can refer to the time series data obtaining process in step 101, and is not repeated here.
[0094] b. Performing a stationarity test on the time series sample data, and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data.
[0095] Specifically, a unit root test is adopted to test the stationarity of the time series sample data. If the data is non-stationary, it needs to be made stationary, which means that the series is differenced repeatedly until it meets the stationarity test conditions; this eliminates the data trend and yields the stationary time series sample data. The differencing order d of the ARIMA model is the number of times the series was differenced before it became stationary.
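The repeated differencing described in step b can be sketched as follows. The stationarity check is passed in as a predicate, because a real unit root test needs a statistics library; the helper names and the toy predicate are assumptions made for illustration and are not part of the disclosure.

```python
def difference(series):
    """Return the first difference x[t] - x[t-1] of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def difference_until_stationary(series, is_stationary, max_d=3):
    """Difference repeatedly until is_stationary(series) holds.

    is_stationary is a caller-supplied predicate (for example a wrapper
    around a unit root test); it is an assumption of this sketch.
    Returns the stationary series and the differencing order d.
    """
    d = 0
    current = list(series)
    while not is_stationary(current):
        if d >= max_d:
            raise ValueError("series still non-stationary after max_d differences")
        current = difference(current)
        d += 1
    return current, d


# Toy predicate for demonstration only: treat a constant series as stationary.
def toy_is_constant(xs):
    return len(set(xs)) <= 1


stationary, d = difference_until_stationary([1, 4, 9, 16, 25], toy_is_constant)
```

The returned `d` plays the role of the differencing order of the ARIMA model.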
[0096] c. For the stationary time series sample data, establishing an initial ARIMA model, and determining ranges for the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data.

[0097] d. Using the AIC information criterion, determining the optimal combination of autoregressive order and moving average order for the initial ARIMA model, and constructing the ARIMA model.
[0098] Specifically, after the differencing order d of the model is determined, the ranges of both the autoregressive order p and the moving average order q are defined, the combinations of (p, q) are traversed, and the combination of (p, q) with the minimum AIC value is identified based on the AIC information criterion. In the end, the optimal p, d and q are applied in the ARIMA model for predicting.
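The traversal over (p, q) combinations can be sketched as a plain grid search. The callable `aic_of` stands in for "fit a model with these orders and return its AIC" and is hypothetical, as is the toy AIC surface used to exercise the search.

```python
from itertools import product


def select_orders(aic_of, p_range, q_range):
    """Return ((p, q), aic) with the smallest AIC over the given ranges.

    aic_of(p, q) is a caller-supplied function that fits a model with the
    given orders and returns its AIC; it is hypothetical in this sketch.
    """
    best_orders, best_aic = None, float("inf")
    for p, q in product(p_range, q_range):
        aic = aic_of(p, q)
        if aic < best_aic:
            best_orders, best_aic = (p, q), aic
    return best_orders, best_aic


# Stand-in AIC surface for demonstration; real values come from model fitting.
def toy_aic(p, q):
    return (p - 2) ** 2 + (q - 1) ** 2 + 100.0


best_orders, best_aic = select_orders(toy_aic, range(0, 4), range(0, 4))
```

With a real fitting routine, the ranges passed in would be the ones read off the autocorrelation and partial autocorrelation coefficients in step c.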
[0099] 103, comparing the actual indicator value of the user at the pre-set time point with the corresponding confidence interval of the indicator, obtaining a first detection result for the user behavior.
[0100] Specifically, it is determined whether the actual value of the indicator at the pre-set time point is within the confidence interval of the predicted indicator, and a first detection result for the user behavior is generated according to the determination. When the actual value of the indicator falls outside the confidence interval, the first detection result indicates that the actual value of the indicator at the pre-set time point is an anomaly value; when the actual value of the indicator falls within the confidence interval, the first detection result indicates that the actual value of the indicator at the pre-set time point is a normal value.
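A minimal sketch of this comparison is given below. It assumes normally distributed forecast errors when building the interval and treats values exactly on the boundary as normal; both choices, and the function names, are assumptions rather than details stated in the text.

```python
from statistics import NormalDist


def confidence_interval(predicted, std_error, confidence=0.95):
    """Two-sided interval around a forecast, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return predicted - z * std_error, predicted + z * std_error


def first_detection(actual, lower, upper):
    """Return 'normal' if lower <= actual <= upper, otherwise 'anomaly'."""
    return "normal" if lower <= actual <= upper else "anomaly"


# Hypothetical forecast of 10.0 with a standard error of 1.0.
lower, upper = confidence_interval(predicted=10.0, std_error=1.0, confidence=0.95)
result_inside = first_detection(10.5, lower, upper)
result_outside = first_detection(13.0, lower, upper)
```

In a full pipeline, the predicted value and its standard error would come from the fitted ARIMA model rather than being supplied by hand.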
[0101] 104, according to the spatial series data, performing anomaly detection through the pre-trained SOM neural network model, obtaining a second detection result for the user behavior.
[0102] Among them, SOM (Self-Organizing Map, a self-organizing map neural network) is an unsupervised artificial neural network. The network structure of a SOM has two layers: an input layer and an output layer (also called the competition layer). Usually, a neural network is trained by backpropagating a loss function, while the SOM uses a competitive learning strategy, relying on competition among the neurons to gradually optimize the network. The neurons form a matrix of equidistant nodes arranged in two dimensions, which constitutes the output layer; each node has a corresponding weight vector whose dimension equals that of the input data, and a nearest-neighbor relationship function is used to maintain the topology of the input space.

[0103] Among them, the SOM neural network model can be obtained by training with the following method, comprising steps S1 to S6:
[0104] S1, initializing the weight of each neuron in the pre-set SOM neural network;
[0105] Specifically, when the pre-set SOM neural network is initialized, the weight of each neuron can be initialized to a very small random number greater than 0 and less than 1. In addition, the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized; for example, the iteration number can be set to i = 1000, the initial learning rate to rate_max = 0.2 with rate_min = 0.05, and the initial neighborhood radius to zone_max = 1.5 with zone_min = 0.8. Each model parameter can be adjusted according to different data or requirements. A learning rate that is too small reduces the speed of network optimization and increases training time, while a learning rate that is too large can cause the network parameters to swing back and forth around the final optimal value, so that the network fails to converge. In a specific implementation, the learning rate is set to 0.2 at the beginning of SOM training and then decreased quickly, which helps capture the general structure of the input vectors quickly; once the learning rate has been reduced to a small value, the neuron weights can be adjusted to conform to the distribution of the samples in the input space. In addition, during training a neighborhood radius R is set with the winning neuron as the center; R starts at the initial neighborhood radius, and the area within the radius is called the winning neighborhood. The range of the winning neighborhood shrinks as the number of training iterations increases and finally settles at a fixed neighborhood radius.
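The text gives start and end values for the learning rate and neighborhood radius but does not state the decay law. The sketch below assumes a simple linear schedule purely for illustration; a faster early decay, as hinted at above, would need a different curve. The helper name `linear_decay` and the constant names are hypothetical.

```python
def linear_decay(start, end, step, total_steps):
    """Interpolate linearly from start (step 0) to end (last step)."""
    if total_steps <= 1:
        return end
    fraction = min(step, total_steps - 1) / (total_steps - 1)
    return start + (end - start) * fraction


# Example values quoted in the text.
ITERATIONS = 1000
RATE_MAX, RATE_MIN = 0.2, 0.05
ZONE_MAX, ZONE_MIN = 1.5, 0.8

rates = [linear_decay(RATE_MAX, RATE_MIN, i, ITERATIONS) for i in range(ITERATIONS)]
radii = [linear_decay(ZONE_MAX, ZONE_MIN, i, ITERATIONS) for i in range(ITERATIONS)]
```

On a unit-spaced grid, a radius of 1.5 covers the eight surrounding nodes of a winner, while a radius of 0.8 covers the winner only.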
[0106] S2, obtaining spatial series sample data associated with the behavior of a sample user, and normalizing each spatial series sample data to obtain a training sample set.
[0107] Wherein, the process of obtaining the spatial series sample data can refer to the data obtaining process in step 101, and is not repeated here.
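The normalization method is not specified in the text. One common choice, shown here only as an assumption, is column-wise min-max scaling to the range [0, 1], which also matches the range of the initial neuron weights. The function name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(rows):
    """Scale each column of `rows` to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical raw feature rows, one row per sample.
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
training_set = min_max_normalize(raw)
```

At detection time, new inputs would need to be scaled with the same per-column minimum and maximum that were computed on the training data.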
[0108] S3, randomly selecting training samples from the training sample set to be input into the SOM neural network input layer, obtaining input vector.

[0109] S4, according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network, searching for winning neuron corresponding to the input vector.
[0110] Specifically, the Euclidean distance between the input vector X and each neuron is calculated, and the neuron with the smallest Euclidean distance to the input vector X is the winning neuron. All neurons in the output layer of the SOM neural network compete with each other, and only one winning neuron can be activated each time.
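The winner search in step S4 amounts to a nearest-neighbor lookup over the neuron weight vectors. In the sketch below the competition layer is represented as a dictionary from grid coordinates to weight vectors; that representation and the name `find_winner` are illustrative assumptions.

```python
import math


def find_winner(x, weights):
    """Return the (row, col) of the neuron whose weight vector is closest to x.

    weights maps grid coordinates (row, col) to weight vectors.
    """
    return min(weights, key=lambda pos: math.dist(x, weights[pos]))


# A hypothetical 2 x 2 competition layer with 2-dimensional weights.
weights = {
    (0, 0): [0.1, 0.1],
    (0, 1): [0.9, 0.1],
    (1, 0): [0.1, 0.9],
    (1, 1): [0.9, 0.9],
}
winner = find_winner([0.8, 0.2], weights)
```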
[0111] S5, using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron.
[0112] Specifically, a neighborhood radius is set with the winning neuron as the center, and the area within the radius is called winning neighborhood, according to the coordinates of winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
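A sketch of the neighborhood update in step S5 follows. It uses the usual SOM rule w <- w + rate * (x - w), which is one gradient descent step on 0.5 * ||x - w||^2, and applies it with equal strength to every neuron inside the radius. The uniform neighborhood, the dictionary layout and the function name are assumptions of this sketch.

```python
import math


def update_neighborhood(weights, winner, x, rate, radius):
    """Move every neuron within `radius` grid units of the winner toward x.

    Each step w <- w + rate * (x - w) is a gradient descent step on
    0.5 * ||x - w||^2. Returns the list of updated grid positions.
    """
    updated = []
    for pos in weights:
        # Grid distance between this neuron and the winning neuron.
        if math.dist(pos, winner) <= radius:
            w = weights[pos]
            weights[pos] = [wi + rate * (xi - wi) for wi, xi in zip(w, x)]
            updated.append(pos)
    return updated


# Hypothetical three-neuron layer; only two neurons lie within radius 1.0.
weights = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 0.0], (1, 1): [1.0, 1.0]}
touched = update_neighborhood(weights, winner=(0, 0), x=[1.0, 1.0],
                              rate=0.5, radius=1.0)
```

Many SOM variants instead weight the update by a Gaussian of the grid distance, so that neurons farther from the winner move less.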
[0113] S6, iteratively executing steps S3 to S5 until the pre-set end condition is reached, at which point training ends, obtaining the trained SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0114] Specifically, a new input sample is read from the training sample set, and the process from step S3 to step S5 is executed iteratively until all training samples have been trained; after the weight values of all winning neurons have been updated, the learning rate and the neighborhood function are updated. When the number of training iterations of the SOM neural network reaches the pre-set maximum, the training and learning process exits, yielding the trained SOM neural network model and a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., a winning neighborhood) and each neighborhood contains at least one neuron.
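Steps S1 to S6 can be put together in a compact sketch such as the one below. The grid size, number of rounds, linear decay, radius floor of 1.0, and the function name `train_som` are all assumptions chosen for brevity; the initial learning rate of 0.2 and the random weight initialization between 0 and 1 follow the text.

```python
import math
import random


def train_som(samples, rows, cols, epochs, lr0=0.2, radius0=2.0, seed=0):
    """Train a small SOM and return its weights as rows x cols x dim lists."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # S1: initialize every weight to a small random number in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    for epoch in range(epochs):
        # Learning rate and radius are updated once per pass over all samples
        # (linear decay and the radius floor of 1.0 are assumptions).
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(1.0, radius0 * frac)
        order = samples[:]
        rng.shuffle(order)  # S3: random selection of training samples
        for x in order:
            # S4: winning neuron by smallest Euclidean distance.
            winner = min(cells, key=lambda rc: math.dist(x, weights[rc[0]][rc[1]]))
            # S5: update every neuron inside the winning neighborhood.
            for r, c in cells:
                if math.dist((r, c), winner) <= radius:
                    w = weights[r][c]
                    weights[r][c] = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]
    return weights  # S6: training stops after the pre-set number of rounds


# Hypothetical normalized samples forming two groups.
samples = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(samples, rows=3, cols=3, epochs=20)
print(len(som), len(som[0]), len(som[0][0]))
```

A fixed seed is used here only so that repeated runs of the sketch give the same map.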
[0115] In the present implementation, the SOM neural network is used to uncover the correlations among the various influencing factors in the spatial series data, which is more useful for the classification and study of anomaly user behavior and provides high generalization ability.
[0116] The implementation process of the above step 104 can comprise:
[0117] 1041, normalizing the spatial series data and inputting the normalized spatial series data into the SOM neural network model as input parameters.
[0118] 1042, determining the winning neurons corresponding to the input parameters, and the cluster of the winning neurons, according to the Euclidean distance from the input parameters to each neuron.
[0119] Specifically, the Euclidean distance between the input vector X and each neuron is calculated, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, and the neighborhood to which this winning neuron belongs is determined.
[0120] 1043, calculating the cluster area of the winning neurons and comparing the cluster area with an area threshold, wherein the cluster is an anomaly cluster only when the cluster area is less than the area threshold.
[0121] The area threshold can be set according to actual needs. In general, a small cluster area indicates an isolated cluster of very small size, and such a cluster is set as an anomaly cluster.
[0122] Specifically, the neighborhood radius of the winning neighborhood in which the winning neuron is located is determined, the area of the circle with that neighborhood radius as its radius is calculated and used as the cluster area of the cluster to which the winning neuron belongs, and the cluster area is compared with the area threshold.
[0123] 1044, generating a second detection result for the user behavior according to the comparison result.
[0124] When the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result indicates that the user spatial series data is anomaly data; when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result indicates that the user spatial series data is normal data.
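The normalization in step 1041 and the area comparison in steps 1043 and 1044 can be sketched as follows. The text does not specify the normalization method, so min-max scaling is an assumption, as are the function names, the string labels, and the threshold value.

```python
import math


def min_max_normalize(values):
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def second_detection(neighborhood_radius: float, area_threshold: float) -> str:
    """Compare the circular cluster area with the threshold (steps 1043-1044)."""
    cluster_area = math.pi * neighborhood_radius ** 2
    return "anomaly" if cluster_area < area_threshold else "normal"


print(min_max_normalize([2.0, 4.0, 6.0]))
# Hypothetical threshold of 3.0 area units.
print(second_detection(0.5, 3.0))
print(second_detection(1.0, 3.0))
```

The strict comparison matches the rule that a cluster is an anomaly cluster only when its area is less than the threshold.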
[0125] It should be noted that the implementation of the present invention does not specifically limit the order in which step 102 and step 104 are performed; concurrent execution is the preferred solution.
[0126] 105, according to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0127] Specifically, anomaly user behavior is identified as follows:
[0128] If the first detection result and the second detection result are both normal, the user behavior is determined to be normal; if the first detection result and the second detection result are both anomaly, the user behavior is determined to be anomaly; if only one of the first detection result and the second detection result is normal, the user behavior is determined to be a suspicious anomaly behavior, which can then be identified manually.
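The decision rule of step 105 is small enough to state directly in code. The sketch below assumes that each detection result is represented by the string "normal" or "anomaly"; the function name `identify` and the label "suspicious" are illustrative choices.

```python
def identify(first: str, second: str) -> str:
    """Combine the ARIMA-based and SOM-based detection results (step 105)."""
    for result in (first, second):
        if result not in ("normal", "anomaly"):
            raise ValueError(f"unexpected detection result: {result!r}")
    if first == "normal" and second == "normal":
        return "normal"
    if first == "anomaly" and second == "anomaly":
        return "anomaly"
    # Exactly one detector flagged the behavior: leave it for manual review.
    return "suspicious"


print(identify("normal", "normal"))
print(identify("anomaly", "normal"))
print(identify("anomaly", "anomaly"))
```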
[0129] Furthermore, after step 105, the method also comprises:
[0130] If the identification result of the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0131] The restriction operation comprises disabling key functions on key pages of the application; the key functions include but are not limited to viewing, inputting, and submitting.
[0132] In the present implementation, when the user is determined to be a risk user, authenticating the user's identity or restricting the user's operations accordingly can effectively control and prevent network security risks.
[0133] The method for identifying user anomaly behavior provided by the implementation of the present invention uses the SOM neural network clustering algorithm, which is non-linear and robust, has strong adaptive learning ability, and has an outstanding ability to handle uncertain or fuzzy information. It overcomes the limitation of the K-Means algorithm, which is affected by the pre-determined K value and by noise data, and thereby improves the accuracy and reliability of identifying user anomaly behavior. In addition, the combination of the ARIMA model and the SOM neural network model is used to mine user anomaly points in both the time and space dimensions; compared with a single traditional method, this improves the ability and accuracy of identifying anomaly points.
[0134] Implementation two
[0135] As shown in Figure 2, the apparatus for identifying user anomaly behavior provided by the implementation of the present invention comprises:
[0136] A data obtaining module 202 configured to obtain time series data and spatial series data associated with user behavior;
[0137] A first detection module 204 configured to predict, through an ARIMA model, the confidence interval of an indicator for the user at a pre-set time point according to a plurality of actual indicator values before the pre-set time point in the time series data; compare the user's actual indicator value at the pre-set time point with the corresponding confidence interval of the indicator; and obtain a first detection result for the user behavior;
[0138] A second detection module 206 configured to perform anomaly detection through a pre-trained SOM neural network model according to the spatial series data and obtain a second detection result for the user behavior;
[0139] An anomaly identification module 208 configured to perform anomaly identification on the user behavior according to the first detection result and the second detection result.
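The comparison performed by the first detection module can be sketched as below. The sketch assumes that a fitted ARIMA model has already supplied a predicted indicator value and a forecast standard error, and that forecast errors are approximately normal; those inputs, the function names, and the example numbers are assumptions for illustration.

```python
from statistics import NormalDist


def confidence_interval(predicted, std_error, confidence=0.95):
    """Two-sided interval around the prediction, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return predicted - z * std_error, predicted + z * std_error


def first_detection(actual, predicted, std_error, confidence=0.95):
    """Flag the actual indicator value if it falls outside the interval."""
    low, high = confidence_interval(predicted, std_error, confidence)
    return "normal" if low <= actual <= high else "anomaly"


# Hypothetical forecast: predicted online duration 10.0 with standard error 1.0.
print(first_detection(10.5, 10.0, 1.0))
print(first_detection(13.0, 10.0, 1.0))
```

The resulting label can then be passed, together with the second detection result, to the anomaly identification module.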
[0140] Furthermore, the apparatus also comprises a construction module, wherein the construction module is specifically for:
[0141] Obtaining time series sample data associated with sample user behavior;
[0142] Performing a stationarity test on the time series sample data and, for time series sample data that fails the test, differencing the data to obtain stationary time series sample data;
[0143] For the stationary time series sample data, establishing an initial ARIMA model, and determining the ranges of the initial ARIMA model's autoregressive order and moving average order according to the autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
[0144] Using the AIC information criterion, determining the optimal combination of the initial ARIMA model's autoregressive order and moving average order, and constructing the ARIMA model.
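The construction procedure can be outlined in code as follows. This is a sketch only: `is_stationary` stands in for a unit root test and `aic_of` stands in for fitting a candidate model and returning its AIC value; both are hypothetical callables supplied by the caller, and the toy versions used in the example are not real statistical tests.

```python
from itertools import product


def difference(series):
    """First-order differencing of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def make_stationary(series, is_stationary, max_d=3):
    """Difference until `is_stationary` passes; return (series, order d)."""
    d = 0
    while not is_stationary(series) and d < max_d:
        series = difference(series)
        d += 1
    return series, d


def select_orders(p_range, q_range, aic_of):
    """Traverse all (p, q) combinations and keep the one with minimum AIC."""
    return min(product(p_range, q_range), key=lambda pq: aic_of(*pq))


# Toy stand-ins (assumptions): a constant series counts as stationary, and the
# AIC surface is a simple bowl with its minimum at (p, q) = (1, 2).
toy_series = [t * t for t in range(10)]
stationary, d = make_stationary(toy_series, lambda s: max(s) == min(s))
p, q = select_orders(range(3), range(3), lambda p, q: (p - 1) ** 2 + (q - 2) ** 2)
print(d, p, q)
```

The number of differencing passes gives the differential order d, and the selected pair gives the autoregressive and moving average orders used to construct the model.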
[0145] Furthermore, the apparatus also comprises a training module, wherein the training module comprises:
[0146] An initializing submodule configured to initialize the weight of each neuron in the pre-set SOM neural network;
[0147] A pre-processing submodule configured to obtain spatial series sample data associated with the behavior of sample users and normalize each item of spatial series sample data to obtain a training sample set;
[0148] A training submodule configured to randomly select a training sample from the training sample set and input it into the input layer of the SOM neural network to obtain an input vector; search for the winning neuron corresponding to the input vector according to the Euclidean distance between the input vector and each neuron in the competition layer of the SOM neural network; and use a gradient descent method to update the weights of the winning neuron and of each neuron in the set of neurons around the winning neuron;
[0149] An iteration submodule configured to iteratively execute the steps of the training submodule until the pre-set end condition is reached, at which point training ends, obtaining the trained SOM neural network model and a plurality of clusters output by the SOM neural network model.
[0150] Furthermore, the second detection module 206 is specifically for:
[0151] Normalizing the spatial series data, inputting the normalized spatial series data into the SOM neural network model as input parameters, and determining, according to the Euclidean distance from the input parameters to each neuron, the winning neurons corresponding to the input parameters and the cluster of the winning neurons;
[0152] Calculating the cluster area of the winning neurons and comparing the cluster area with an area threshold, wherein the cluster is an anomaly cluster only when the cluster area is less than the area threshold;
[0153] Generating a second detection result for the user behavior according to the comparison result.
[0154] Furthermore, the apparatus also comprises an anomaly processing module, which is specifically for:
[0155] If the identification result of the user behavior is user anomaly behavior, performing identity authentication on the user, or restricting the user's operations and behavior.
[0156] The anomaly behavior identification apparatus provided by the implementation of the present invention shares the same inventive concept as the anomaly behavior identification method provided by the implementation of the present invention; it can execute that method and has the functional modules and beneficial effects corresponding to it. For technical details that are not described in this implementation, please refer to the method for identifying anomaly user behavior provided in the implementation of the present invention, which will not be repeated here.
[0157] Figure 3 is the internal structure diagram of the computer device provided by the implementation of the present invention. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is configured to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for running the operating system and the computer programs stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a user anomaly behavior identification method.

[0158] Those skilled in the art can understand that the structure shown in Figure 3 is only a partial structural diagram related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device can include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
[0159] In an implementation, a computer device is provided which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. The processor implements the following steps when executing the computer program:
[0160] Obtaining time series data and spatial series data associated with user behavior.
[0161] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, the confidence interval of the indicator for the user at the pre-set time point.
[0162] Comparing the user's actual indicator value at the pre-set time point with the corresponding confidence interval of the indicator to obtain a first detection result for the user behavior.
[0163] According to the spatial series data, performing anomaly detection through a pre-trained SOM neural network model to obtain a second detection result for the user behavior.
[0164] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0165] In an implementation, a computer readable storage medium is provided which stores a computer program; the following steps are implemented when the computer program is executed by a processor:
[0166] Obtaining time series data and spatial series data associated with user behavior.
[0167] According to a plurality of actual indicator values before a pre-set time point in the time series data, predicting, through an ARIMA model, the confidence interval of the indicator for the user at the pre-set time point.
[0168] Comparing the user's actual indicator value at the pre-set time point with the corresponding confidence interval of the indicator to obtain a first detection result for the user behavior.
[0169] According to the spatial series data, performing anomaly detection through a pre-trained SOM neural network model to obtain a second detection result for the user behavior.
[0170] According to the first detection result and the second detection result, performing anomaly identification on the user behavior.
[0171] Those skilled in the art can understand that all or part of the procedures in the above-mentioned methods can be carried out by computer program instructions controlling related hardware. The computer program can be stored in a non-volatile computer readable storage medium and, when executed, can include the procedures of the implementations of the above-mentioned methods. Any reference to memory, storage, database, or other media used in the implementations provided in the current application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
[0172] The above-mentioned implementations are only several implementations of this disclosure; although their description is relatively specific and detailed, they shall not be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make various modifications and variations to the disclosure without departing from its spirit and scope. Therefore, the scope of protection of the present invention patent shall be subject to the appended claims.


Claims (112)

Claims:
1. A device for identifying user anomaly behavior, the device comprising:
a data obtaining module configured to obtain time series data and spatial series data associated with user behavior;
a first detection module configured to:
predict confidence interval of the indicator through ARIMA model when a user is at the pre-set time point according to a plurality of actual indicators values before pre-set time point in the time series data;
compare actual indicator value when user is at the pre-set time point with correspondingly confidence interval of indicator;
obtain first detection result for the user behavior;
a second detection module configured to:
perform anomaly detection through pre-trained SOM neural network model according to the spatial series data;
obtain second detection result for the user behavior; and
an anomaly identification module configured to perform anomaly identification on the user behavior according to the first detection result and the second detection result.
2. The device of claim 1, further comprising a construction module for:
obtaining time series sample data associated with sample user behavior;
performing stationarity test on the time series sample data and, for time series sample data failing the test, differential processing the data to obtain stationary time series sample data;
establishing an initial ARIMA model for the stationary time series sample data;
determining the initial ARIMA model's autoregressive order and range of moving average order according to autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial ARIMA model's optimal autoregressive order and range of moving average order using AIC information criterion; and
constructing the ARIMA model.

Date Recue/Date Received 2022-02-07
3. The device of any one of claims 1 to 2, further comprising a training module comprising:
an initializing submodule configured to initialize weight of each neuron in the pre-set SOM neural network;
a pre-processing submodule configured to:
obtain spatial series sample data associated with behavior of sample user;
normalization process each spatial series sample data; and
obtain training sample set;
a training submodule configured to:
randomly select training samples from the training sample set to be input into the SOM neural network input layer;
obtain input vector;
search for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network; and
perform weight update, using a gradient descent method, on the winning neuron and each neuron of neurons set around the winning neuron; and
an iteration submodule configured to:
iteratively execute steps in training submodule, wherein the training ends upon reaching the pre-set end condition;
obtain the SOM neural network; and
obtain a plurality of clusters output by the SOM neural network model.
4. The device of claim 3, wherein the second detection module is further configured for:
normalization processing the spatial series data, wherein the normalization processed spatial series data are used as input parameters and the parameters are input to the SOM neural network model;
determining, according to the Euclidean distance from the input parameters to each neuron, the winning neurons corresponding to the input parameters and cluster of the winning neurons;
calculating cluster area of the winning neurons;
comparing the cluster area with an area threshold, wherein when the cluster area is less than the area threshold, the cluster is an anomaly cluster; and
generating a second detection result for the user behavior according to the comparing result.
5. The device of any one of claims 1 to 4, further comprising an anomaly processing module for performing identity authentication on the user, or restricting the user's operations and behavior, if the identification result of the user behavior is the user anomaly behavior.

6. The device of any one of claims 1 to 5, wherein the user data within a pre-set time period can be obtained, pre-processing the user data to extract time series data and the spatial series data associated with the user behavior.
7. The device of any one of claims 1 to 6, wherein the user data includes user attribute data and user behavior data, wherein the user attribute data includes any one or more of name, age, mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O and other information.
8. The device of any one of claims 1 to 7, wherein the time series data is an indicator values series obtained by sorting the actual indicator values of users in a pre-set time period in chronological order, wherein the indicator value refers to the value of parameter indicator obtained by statistics of user numerical data related to the user behavior in a pre-set time period, wherein the parameter indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
9. The device of any one of claims 1 to 8, wherein the spatial series data refers to user behavior trajectory data in a spatial order during the application, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform transfer operation forms the user spatial series data.
10. The device of any one of claims 1 to 9, wherein the pre-set time point can be a time point corresponding to the Nth data of M data included in the time series data, N is greater than 1, N is less than M.
11. The device of any one of claims 1 to 10, wherein the actual values of a plurality of indicators before the pre-set time point in the time series data are substituted into the ARIMA model for predicting, wherein obtaining the predicted values of indicators at the pre-set time point and the confidence interval of the predicted values of indicators when confidence level is a.
12. The device of any one of claims 1 to 11, wherein the ARIMA (Autoregressive Integrated Moving Average) model is an autoregressive integrated moving average model that predicts the future from past and present values, wherein it regards the time series data as a random series and finds an optimal function to fit it.
13. The device of any one of claims 1 to 12, wherein the ARIMA(p, q, d) model is defined as the following:
$$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$$
14. The device of claim 13, wherein p refers to the autoregressive order, d refers to the series differential order, q refers to the moving average order, $y_t$ is the time series observation value at time t, $e_t$ is a white noise series, and $\varphi_i$ and $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
15. The device of any one of claims 1 to 14, wherein a unit root detection method is adopted to test stationarity of the time series sample data to determine whether the data is stationary, wherein when the data is non-stationary, the data needs to be stationary processed, wherein the series continues to be differentiated until the series meets the stationary test conditions, thereby obtaining the stationary time series sample data and eliminating data trend, wherein the differential order d of the ARIMA model is the number of times of differentiating made when the time series becomes a stationary time series.
16. The device of any one of claims 1 to 15, wherein after determining the differential order d of the model, based on the AIC information criterion, the ranges of both autoregressive order p and moving average order q are defined, traversing the combination of (p, q), identifying the combination of (p, q) with minimum AIC value, wherein, in the end, the optimal p, d and q are determined to apply in the ARIMA model for predicting.
17. The device of any one of claims 1 to 16, wherein determining whether the actual value of the indicator at the pre-set time point is within the confidence interval of the predicted indicator, obtaining the determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual value of indicator falls outside the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is an anomaly value, and wherein when the actual value of indicator falls within the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is a normal value.
18. The device of any one of claims 1 to 17, wherein the SOM (Self Organizing Maps, self-organizing map neural network) is an unsupervised artificial neural network, wherein the network structure of the SOM has two layers including an input layer and an output layer (also called competition layer), wherein, unlike a general neural network that is trained by back-propagation of a loss function, the SOM uses a competitive learning strategy, relying on the competition between neurons to gradually optimize the network, wherein the neurons are a matrix of equidistant nodes arranged in a two-dimensional form on the neural network to constitute the output layer, wherein each node has a corresponding weight vector with the same dimension as the input data and uses the nearest neighbor relationship function to maintain the topology of the input space.
19. The device of any one of claims 1 to 18, wherein initializing the pre-set SOM neural network, wherein the weight of each neuron of the SOM neural network can be initialized to a very small random number, the random number is greater than 0 and less than 1, wherein number of model iterations, learning rate, and neighborhood radius also need to be initialized.
20. The device of any one of claims 1 to 20, wherein in the training process of the SOM neural network, a neighborhood radius R is set with the winning neuron as the center, the neighborhood radius R is initialized as an initial neighborhood radius, and the area within the radius is called winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.

21. The device of any one of claims 1 to 20, wherein calculating the Euclidean distance between the input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the SOM neural network compete with each other, only one winning neuron can be activated each time.
22. The device of any one of claims 1 to 21, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called winning neighborhood, according to the coordinates of winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.
23. The device of any one of claims 1 to 22, wherein a new input sample is read from the training sample set, and the process is executed iteratively, until completing the training of all training samples, after updating the weight values of all winning neurons, updating the learning rate and neighborhood function, wherein when the number of training times of the SOM neural network reaches a pre-set maximum number of times, the training and learning process is exited, obtaining the trained SOM neural network model, and obtaining a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., the winning neighborhood), the neighborhood contains at least one neuron.
24. The device of any one of claims 1 to 23, wherein the area threshold can be set according to the actual needs, wherein a small cluster area means that an isolated cluster with a very small cluster size is set as the anomaly cluster.
25. The device of any one of claims 1 to 24, wherein determining the neighborhood radius of the winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.

26. The device of any one of claims 1 to 25, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is normal data.
27. The device of any one of claims 1 to 26, wherein predicting confidence interval of the indicator through the ARIMA model and performing anomaly detection through pre-trained SOM neural network model are executed concurrently.
28. The device of any one of claims 1 to 27, wherein when the first detection result and the second detection result are both normal, determining the user behavior as normal, wherein when the first detection result and the second detection result are both anomaly, determining the user behavior as anomaly, wherein when only one of the first detection result and the second detection result is normal, determining the user behavior as a suspicious anomaly behavior, the suspicious anomaly behavior can be manually identified.
29. A computer device comprising:
one or a plurality of processors;
a storage apparatus configured to store one or a plurality of programs;
a network interface;
a database connected through a system bus;
wherein one or a plurality of programs are executed by one or a plurality of processors, one or a plurality of processors configured to:
obtain time series data and spatial series data associated with user behavior;

predict confidence interval of the indicator through ARIMA model when user is at the pre-set time point, according to a plurality of actual indicators values before pre-set time point in the time series data;
compare actual indicator value when the user is at the pre-set time point with correspondingly confidence interval of indicator;
obtain first detection result for the user behavior;
perform anomaly detection through pre-trained SOM neural network model according to the spatial series data, obtain second detection result for the user behavior; and
perform anomaly identification on the user behavior according to the first detection result and the second detection result;
wherein the memory of the computer device includes non-volatile storage medium and internal memory, wherein the non-volatile storage medium stores an operating system, computer programs and a database;
wherein the internal memory provides an environment for running the operating system and computer programs stored in the non-volatile storage medium; and
wherein the network interface of the computer device is used to communicate with an external terminal through a network connection.
30. The device of claim 29, wherein the ARIIVIA model is configured by:
obtaining time series sample data associated with sample user behavior;
performing stationarity test on the time series sample data, for failing the test's time series sample data;
differential processing the data to obtain stationary time series sample data;
establishing an initial ARIMA model for the stationary time series sample data;
determining the initial ARIMA model's autoregressive order and range of moving average order according to autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial ARIMA model's optimal autoregressive order and range of moving average order using AIC information criterion; and constructing the ARIMA model.
31. The device of claim 29, wherein the SOM neural network model is trained by:
initializing weight of each neuron in the pre-set SOM neural network;
obtaining spatial series sample data associated with behavior of sample user;
normalization processing each spatial series sample data;
obtaining training sample set;
randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;
using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and iteratively execute:

randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;
using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and wherein the training ends until reaching the pre-set end condition, obtaining the SOM neural network, and obtaining a plurality of clusters output by the SOM neural network model.
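The training steps recited above can be sketched roughly as follows. This is only an illustrative sketch, not the claimed implementation: the function name `train_som`, the linear decay of the learning rate and radius, and the grid-distance neighborhood test are all assumptions made for the example.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def train_som(samples, rows, cols, iterations, lr0=0.5, radius0=None, seed=0):
    """Train a small SOM; returns {(row, col): weight vector}. Hypothetical helper."""
    rng = random.Random(seed)
    dim = len(samples[0])
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0
    # Initialize every neuron weight to a small random number between 0 and 1.
    weights = {(r, c): [rng.uniform(0.0, 1.0) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        # Assumed schedule: learning rate and radius shrink linearly.
        frac = 1.0 - t / iterations
        lr = lr0 * frac
        radius = max(1.0, radius0 * frac)
        # Randomly select a (normalized) training sample as the input vector.
        x = rng.choice(samples)
        # Winning neuron: smallest Euclidean distance to the input vector.
        winner = min(weights, key=lambda pos: euclidean(weights[pos], x))
        # Update the winner and every neuron inside the winning neighborhood.
        for pos, w in weights.items():
            if euclidean(pos, winner) <= radius:
                for i in range(dim):
                    # Gradient step on 0.5 * ||x - w||^2.
                    w[i] += lr * (x[i] - w[i])
    return weights
```

Here the pre-set end condition is assumed to be a fixed number of iterations; other stopping rules would fit the same loop.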
32. The device of claim 29, wherein according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining a second detection result for the user behavior, comprising:
normalization processing the spatial series data, wherein the normalization processed spatial series data as input parameters, wherein the parameters are input to the SOM neural network model;
determining the winning neurons corresponding to the input parameters and cluster of the winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating cluster area of the winning neurons;
comparing the cluster area with an area threshold, wherein the cluster area is less than the area threshold, the cluster is an anomaly cluster; and generating a second detection result for user behavior according to the comparing result.

33. The device of claim 29, wherein the first detection result and the second detection result is anomaly identification of user behavior, perform identification authentication on the user, or restricting the user's operations and behavior, wherein the restriction operation comprises disabling the key function on the key page of the application, wherein the key function includes viewing, inputting, submitting.
34. The device of any one of claims 29 to 33, wherein the user data within the pre-set time period can be obtained, pre-processing the user data to extract time series data and the spatial series data associated with the user behavior.
35. The device of any one of claims 29 to 34, wherein the user data includes user attribute data and user behavior data, wherein the user attribute data includes any one or more of name, age, mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O and other information.
36. The device of any one of claims 29 to 35, wherein the time series data is an indicator values series obtained by sorting the actual indicator values of users in a pre-set time period in chronological order, wherein the indicator value refers to the value of parameter indicator obtained by statistics of user numerical data related to the user behavior in a pre-set time period, wherein the parameter indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
37. The device of any one of claims 29 to 36, wherein the spatial series data refers to user behavior trajectory data in a spatial order during the application, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform transfer operation forms the user spatial series data.

38. The device of any one of claims 29 to 37, wherein the pre-set time point can be a time point corresponding to a Nth data in a M data included in the time series data, N is greater than 1, N is less than M.
39. The device of any one of claims 29 to 38, wherein the actual values of a plurality of indicators before the pre-set time point in the time series data are substituted into the ARIMA model for predicting, wherein obtaining the predicted values of indicators at the pre-set time point and the confidence interval of the predicted values of indicators when confidence level is a.
40. The device of any one of claims 29 to 39, wherein the ARIMA (Autoregressive Integrated Moving Average Model) is an Auto-Regressive Moving Average model, predicting future with past and present values, wherein it regards the time series data as a random series and finds optimal function to fit.
41. The device of any one of claims 29 to 40, wherein, the ARIMA(p, q, d) model is defined as the following:
$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$.
42. The device of claim 41, wherein p refers to autoregressive order, d refers to series differential order, q refers to moving average order, $y_t$ is time series observation value at time t, $e_t$ is white noise series, and $\varphi_i$, $\theta_i$ are coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
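As a worked illustration of the model definition, the sketch below evaluates a one-step prediction from the autoregressive and moving-average coefficients. The function name `arma_one_step` and the convention of setting the current noise term to its expected value of zero are assumptions for the example, not part of the claims.

```python
def arma_one_step(y_hist, e_hist, phi, theta):
    """One-step prediction of the current observation (hypothetical helper).

    y_hist: past observations, most recent last (at least len(phi) values).
    e_hist: past noise terms, most recent last (at least len(theta) values).
    phi:    autoregressive coefficients, phi[0] applies to lag 1.
    theta:  moving-average coefficients, theta[0] applies to lag 1.
    The current noise term is replaced by its expected value, zero.
    """
    if len(y_hist) < len(phi) or len(e_hist) < len(theta):
        raise ValueError("not enough history for the given orders")
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * e_hist[-1 - j] for j in range(len(theta)))
    # Autoregressive terms are added, moving-average terms are subtracted.
    return ar_part - ma_part
```

For example, with p = 2 and q = 1, coefficients (0.5, 0.2) and (0.3), past observations 1.0 then 2.0, and a last noise term of 0.1, the prediction is 0.5·2.0 + 0.2·1.0 − 0.3·0.1 = 1.17.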
43. The device of any one of claims 29 to 42, wherein adopting unit root detection method to test stationarity of the time series sample data to determine whether the data is stationary, wherein the data is non-stationary, the data needs to be stationary processed, wherein the series continues to be differenced until the series meets the stationary test conditions, wherein obtaining the stationary time series sample data to eliminate data trend, wherein the differential order d of the ARIMA model is the number of times differencing is applied until the time series becomes a stationary time series.
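The repeated differencing described above might look like the following sketch. The stationarity check is passed in as a callable because the claims name only a unit root test in general terms; `is_stationary`, `max_d`, and both function names are hypothetical.

```python
def difference(series):
    """First-order difference: each value minus its predecessor."""
    return [b - a for a, b in zip(series, series[1:])]

def difference_until_stationary(series, is_stationary, max_d=3):
    """Difference the series until is_stationary(series) is true.

    Returns (stationary_series, d), where d is the number of differencing
    passes applied. is_stationary stands in for a unit root test and must
    be supplied by the caller.
    """
    current = list(series)
    d = 0
    while not is_stationary(current):
        if d >= max_d or len(current) < 2:
            raise ValueError("series did not become stationary")
        current = difference(current)
        d += 1
    return current, d
```

With a toy check that accepts only constant series, the quadratic series 0, 1, 4, 9, 16, 25 needs two passes, so d = 2.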

44. The device of any one of claims 29 to 43, wherein determining the differential order d of the model, based on the AIC information criterion, the ranges of both autoregressive order p and moving average order q are defined, traversing the combination of (p, q), identifying the combination of (p, q) with minimum AIC value. In the end, the optimal p, d and q are determined to apply in the ARIMA model for predicting.
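The traversal of (p, q) combinations can be sketched as a plain grid search. The fitting step is deliberately left out: `aic_of` is a hypothetical caller-supplied function that fits a model for the given orders and returns its AIC value.

```python
from itertools import product

def select_orders(aic_of, p_range, q_range):
    """Return (p, q, aic) for the combination with the minimum AIC.

    aic_of(p, q) is assumed to fit a candidate model and return its AIC;
    it is a placeholder, not a real library call.
    """
    best = None
    for p, q in product(p_range, q_range):
        score = aic_of(p, q)
        if best is None or score < best[2]:
            best = (p, q, score)
    if best is None:
        raise ValueError("p_range and q_range must not be empty")
    return best
```

Ties are resolved in favor of the first combination visited, which is a design choice of this sketch.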
45. The device of any one of claims 29 to 44, wherein determining whether the actual value of the indicator at the pre-set time point is within the confidence interval of the predicted indicator, obtaining the determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual value of indicator falls outside the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is an anomaly value, wherein when the actual value of indicator falls within the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is a normal value.
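A minimal sketch of the interval comparison follows. It assumes the forecast error is Gaussian with a known standard error and that `alpha` is a significance level (so the interval has confidence 1 − alpha); neither assumption is stated in the claims, and the function names are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(predicted, std_error, alpha=0.05):
    """Two-sided interval around the predicted value (Gaussian assumption)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return predicted - z * std_error, predicted + z * std_error

def first_detection_result(actual, predicted, std_error, alpha=0.05):
    """'normal' if the actual value lies within the interval, else 'anomaly'."""
    low, high = confidence_interval(predicted, std_error, alpha)
    return "normal" if low <= actual <= high else "anomaly"
```

In this sketch a value exactly on the interval boundary is treated as normal.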
46. The device of any one of claims 29 to 45, wherein the SOM (Self Organizing Maps, self-organizing map neural network) is an unsupervised artificial neural network, wherein network structure of SOM has two layers including input layer and output layer (also called competition layer), wherein, unlike a general neural network that is trained by back-propagation of a loss function, the SOM uses a competitive learning strategy, relying on the competition between neurons to gradually optimize the network, wherein the neurons are a matrix of equidistant nodes arranged in a two-dimensional form on the neural network, to constitute the output layer, wherein each node has a corresponding weight vector with the same dimension as the dimension length of the input data and uses the nearest neighbor relationship function to maintain the topology of input space.
47. The device of any one of claims 29 to 46, wherein initializing the pre-set SOM neural network, wherein the weight of each neuron of the SOM neural network can be initialized to a very small random number, the random number is greater than 0 and less than 1, wherein number of model iterations, learning rate, and neighborhood radius also need to be initialized.
48. The device of any one of claims 29 to 47, wherein the training process of the SOM neural network, setting a neighborhood radius R with the winning neuron as the center, the neighborhood radius R is initialized as an initial neighborhood radius, a fixed radius is called winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.
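One way to realize a neighborhood radius that shrinks with training and settles at a fixed value is an exponential decay with a floor. The decay law, the `time_constant` parameter, and the function name are assumptions for illustration; the claims do not fix a particular schedule.

```python
import math

def neighborhood_radius(step, initial_radius, final_radius, time_constant):
    """Radius of the winning neighborhood at a given training step.

    Decays exponentially from initial_radius and never drops below
    final_radius (assumed schedule, not taken from the claims).
    """
    if final_radius > initial_radius:
        raise ValueError("final_radius must not exceed initial_radius")
    decayed = initial_radius * math.exp(-step / time_constant)
    return max(final_radius, decayed)
```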
49. The device of any one of claims 29 to 48, wherein calculating the Euclidean distance between the input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the SOM neural network compete with each other, only one winning neuron can be activated each time.
50. The device of any one of claims 29 to 49, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called winning neighborhood, according to the coordinates of winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.

51. The device of any one of claims 29 to 50, wherein a new input sample is read from the training sample set, and the process is executed iteratively, until completing the training of all training samples, after updating the weight values of all winning neurons, updating the learning rate and neighborhood function, wherein the number of training times of the SOM neural network reaches a pre-set maximum number of times, the training and learning process is exited, obtaining the trained SOM neural network model, and obtaining a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., the winning neighborhood), the neighborhood contains at least one neuron.
52. The device of any one of claims 29 to 51, wherein the area threshold can be set according to the actual needs, wherein the cluster area is small, which means that an isolated cluster with a very small cluster size is set as the anomaly cluster.
53. The device of any one of claims 29 to 52, wherein determining the neighborhood radius of the winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.
54. The device of any one of claims 29 to 53, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is normal data.
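The area comparison reduces to a few lines. The sketch below takes the neighborhood radius of the winning neuron's cluster as given and treats the cluster area as the area of a circle with that radius; the function name and the string labels are hypothetical.

```python
import math

def second_detection_result(neighborhood_radius, area_threshold):
    """'anomaly' if the circular cluster area is below the threshold."""
    cluster_area = math.pi * neighborhood_radius ** 2
    return "anomaly" if cluster_area < area_threshold else "normal"
```

A cluster whose area equals the threshold is reported as normal, matching the "not less than" wording.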
55. The device of any one of claims 29 to 54, wherein the order in which predicting confidence interval of the indicator through the ARIMA model and performing anomaly detection through pre-trained SOM neural network model are executed concurrently.

56. The device of any one of claims 29 to 55, wherein the first detection result and the second detection result are both normal, determining the user behavior as normal, wherein the first detection result and the second detection result are both anomaly, determining the user behavior as anomaly, wherein only one of the first detection result and the second detection result is normal, determining the user behavior as a suspicious anomaly behavior, the suspicious anomaly behavior can be manually identified.
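The three-way decision over the two detection results can be sketched as a small function. The labels "normal", "anomaly", and "suspicious" and the function name are illustrative choices, not terms fixed by the claims.

```python
def identify_behavior(first_result, second_result):
    """Combine the time-series and spatial-series detection results."""
    allowed = {"normal", "anomaly"}
    if first_result not in allowed or second_result not in allowed:
        raise ValueError("each result must be 'normal' or 'anomaly'")
    if first_result == second_result:
        # Both detectors agree: report their shared verdict.
        return first_result
    # Exactly one detector reports normal: flag for manual identification.
    return "suspicious"
```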
57. A computer readable physical memory having stored thereon a computer program executed by a computer configured to:
obtain time series data and spatial series data associated with user behavior;
predict confidence interval of the indicator through ARIMA model when user is at the pre-set time point, according to a plurality of actual indicators values before pre-set time point in the time series data;
compare actual indicator value when user is at the pre-set time point with correspondingly confidence interval of indicator, obtaining first detection result for the user behavior;
obtain second detection result for the user behavior according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model; and perform anomaly identification on the user behavior, according to the first detection result and the second detection result.
58. The memory of claim 57, wherein the ARIMA model is configured by:
obtaining time series sample data associated with sample user behavior;
performing stationarity test on the time series sample data, for failing the test's time series sample data;

differential processing the data to obtain stationary time series sample data;

establishing an initial ARIMA model for the stationary time series sample data;
determining the initial ARIMA model's autoregressive order and range of moving average order according to autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial ARIMA model's optimal autoregressive order and range of moving average order using AIC information criterion; and constructing the ARIMA model.
59. The memory of claim 57, wherein the SOM neural network model is trained by:
initializing weight of each neuron in the pre-set SOM neural network;
obtaining spatial series sample data associated with behavior of sample user;
normalization processing each spatial series sample data;
obtaining training sample set;
randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;
using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and iteratively execute:

randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;
using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and wherein the training ends until reaching the pre-set end condition, obtaining the SOM neural network, and obtaining a plurality of clusters output by the SOM neural network model.
60. The memory of claim 57, wherein according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining a second detection result for the user behavior, comprising:
normalization processing the spatial series data, wherein the normalization processed spatial series data as input parameters, wherein the parameters are input to the SOM neural network model;
determining the winning neurons corresponding to the input parameters and cluster of the winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating cluster area of the winning neurons;
comparing the cluster area with an area threshold, wherein the cluster area is less than the area threshold, the cluster is an anomaly cluster; and generating a second detection result for user behavior according to the comparing result.
61. The memory of claim 57, wherein the first detection result and the second detection result is anomaly identification of user behavior, perform identification authentication on the user, or restricting the user's operations and behavior, wherein the restriction operation comprises disabling the key function on the key page of the application, wherein the key function includes viewing, inputting, submitting.
62. The memory of any one of claims 57 to 61, wherein the user data within a pre-set time period can be obtained, pre-processing the user data to extract time series data and spatial series data associated with the user behavior.
63. The memory of any one of claims 57 to 62, wherein the user data includes user attribute data and user behavior data, wherein the user attribute data includes any one or more of name, age, mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration and other related information, wherein the user device information includes any one or more of device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O and other information.
64. The memory of any one of claims 57 to 63, wherein the time series data is an indicator values series obtained by sorting the actual indicator values of users in a pre-set time period in chronological order, wherein the indicator value refers to the value of parameter indicator obtained by statistics of user numerical data related to the user behavior in a pre-set time period, wherein the parameter indicator can be any one or more of values in online duration, device moving distance and change value of screen temperature.
65. The memory of any one of claims 57 to 64, wherein the spatial series data refers to user behavior trajectory data in a spatial order during the application, wherein there is a connection of sequence, flow, and direction between each space, wherein the user behavior trajectory data involved in the user logging in to the application to perform transfer operation forms the user spatial series data.

66. The memory of any one of claims 57 to 65, wherein the pre-set time point can be a time point corresponding to a Nth data in a M data included in the time series data, N is greater than 1, N is less than M.
67. The memory of any one of claims 57 to 66, wherein the actual values of a plurality of indicators before the pre-set time point in the time series data are substituted into the ARIMA model for predicting, wherein obtaining the predicted values of indicators at the pre-set time point and the confidence interval of the predicted values of indicators when confidence level is a.
68. The memory of any one of claims 57 to 67, wherein the ARIMA (Autoregressive Integrated Moving Average Model) is an Auto-Regressive Moving Average model, predicting future with past and present values, wherein it regards the time series data as a random series and finds optimal function to fit.
69. The memory of any one of claims 57 to 68, wherein, the ARIMA(p, q, d) model is defined as the following:
$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$.
70. The memory of claim 69, wherein p refers to autoregressive order, d refers to series differential order, q refers to moving average order, $y_t$ is time series observation value at time t, $e_t$ is white noise series, and $\varphi_i$, $\theta_i$ are coefficients of $y_{t-i}$ and $e_{t-i}$ respectively.
71. The memory of any one of claims 57 to 70, wherein adopting unit root detection method to test stationarity of the time series sample data to determine whether the data is stationary, wherein the data is non-stationary, the data needs to be stationary processed, wherein the series continues to be differenced until the series meets the stationary test conditions, wherein obtaining the stationary time series sample data to eliminate data trend, wherein the differential order d of the ARIMA model is the number of times differencing is applied until the time series becomes a stationary time series.

72. The memory of any one of claims 57 to 71, wherein determining the differential order d of the model, based on the AIC information criterion, the ranges of both autoregressive order p and moving average order q are defined, traversing the combination of (p, q), identifying the combination of (p, q) with minimum AIC value. In the end, the optimal p, d and q are determined to apply in the ARIMA model for predicting.
73. The memory of any one of claims 57 to 72, wherein determining whether the actual value of the indicator at the pre-set time point is within the confidence interval of the predicted indicator, obtaining the determining result;
generating the first detection result for user behavior according to the determining result, wherein when the actual value of indicator falls outside the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is an anomaly value, wherein when the actual value of indicator falls within the confidence interval, the first detection result is used for indicating that the actual value of indicator at the pre-set time point is a normal value.
74. The memory of any one of claims 57 to 73, wherein the SOM (Self Organizing Maps, self-organizing map neural network) is an unsupervised artificial neural network, wherein network structure of SOM has two layers including input layer and output layer (also called competition layer), wherein, unlike a general neural network that is trained by back-propagation of a loss function, the SOM uses a competitive learning strategy, relying on the competition between neurons to gradually optimize the network, wherein the neurons are a matrix of equidistant nodes arranged in a two-dimensional form on the neural network, to constitute the output layer, wherein each node has a corresponding weight vector with the same dimension as the dimension length of the input data and uses the nearest neighbor relationship function to maintain the topology of input space.

75. The memory of any one of claims 57 to 74, wherein initializing the pre-set SOM neural network, wherein the weight of each neuron of the SOM neural network can be initialized to a very small random number, the random number is greater than 0 and less than 1, wherein number of model iterations, learning rate, and neighborhood radius also need to be initialized.
76. The memory of any one of claims 57 to 75, wherein the training process of the SOM neural network, setting a neighborhood radius R with the winning neuron as the center, the neighborhood radius R is initialized as an initial neighborhood radius, a fixed radius is called winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training increases and finally shrinks to a fixed value of the neighborhood radius.
77. The memory of any one of claims 57 to 76, wherein calculating the Euclidean distance between the input vector X and each neuron, the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the SOM neural network compete with each other, only one winning neuron can be activated each time.
78. The memory of any one of claims 57 to 77, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called winning neighborhood, according to the coordinates of winning neuron and the radius of neighborhood, determining all neurons in the winning neighborhood, and using the gradient descent method to update the weight of each neuron in the winning neighborhood.

79. The memory of any one of claims 57 to 78, wherein a new input sample is read from the training sample set, and the process is executed iteratively, until completing the training of all training samples, after updating the weight values of all winning neurons, updating the learning rate and neighborhood function, wherein the number of training times of the SOM neural network reaches a pre-set maximum number of times, the training and learning process is exited, obtaining the trained SOM neural network model, and obtaining a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., the winning neighborhood), the neighborhood contains at least one neuron.
80. The memory of any one of claims 57 to 79, wherein the area threshold can be set according to the actual needs, wherein the cluster area is small, which means that an isolated cluster with a very small cluster size is set as the anomaly cluster.
81. The memory of any one of claims 57 to 80, wherein determining the neighborhood radius of the winning neighborhood where the winning neuron is located, calculating the area of circle with the radius of the neighborhood as the radius and using it as the cluster area of the cluster to which the winning neuron belongs, comparing the cluster area with the area threshold.
82. The memory of any one of claims 57 to 81, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result is used to indicate that the user spatial series data is anomaly data, when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result is used to indicate that the user spatial series data is normal data.
83. The memory of any one of claims 57 to 82, wherein the order in which predicting confidence interval of the indicator through the ARIMA model and performing anomaly detection through pre-trained SOM neural network model are executed concurrently.
84. The memory of any one of claims 57 to 83, wherein the first detection result and the second detection result are both normal, determining the user behavior as normal, wherein the first detection result and the second detection result are both anomaly, determining the user behavior as anomaly, wherein only one of the first detection result and the second detection result is normal, determining the user behavior as a suspicious anomaly behavior, the suspicious anomaly behavior can be manually identified.
85. An identification method for user anomaly behavior, the method comprises:
obtaining time series data and spatial series data associated with user behavior;
predicting confidence interval of the indicator through ARIMA model when user is at the pre-set time point according to a plurality of actual indicators values before pre-set time point in the time series data;
comparing actual indicator value when user is at the pre-set time point with correspondingly confidence interval of indicator;
obtaining first detection result for the user behavior;
performing anomaly detection through pre-trained SOM neural network model according to the spatial series data;
obtaining second detection result for the user behavior; and performing anomaly identification on the user behavior according to the first detection result and the second detection result.
86. The method of claim 85, wherein the ARIMA model is configured by:
obtaining time series sample data associated with sample user behavior;

performing stationarity test on the time series sample data, for failing the test's time series sample data;
differential processing the data to obtain stationary time series sample data;

establishing an initial ARIMA model for the stationary time series sample data;
determining the initial ARIMA model's autoregressive order and range of moving average order according to autocorrelation coefficient and partial autocorrelation coefficient of the stationary time series sample data;
determining the combination of the initial ARIMA model's optimal autoregressive order and range of moving average order using AIC information criterion; and constructing the ARIMA model.
87. The method of claim 85, wherein the SOM neural network model is trained by:
initializing weight of each neuron in the pre-set SOM neural network;
obtaining spatial series sample data associated with behavior of sample user;
normalization processing each spatial series sample data;
obtaining training sample set;
randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;

using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and iteratively execute:
randomly selecting training samples from the training sample set to be input into the SOM neural network input layer;
obtaining input vector;
searching for winning neuron corresponding to the input vector according to Euclidean distance between the input vector and each neuron in competition layer of the SOM neural network;
using gradient descent method, performing weight update on the winning neuron and each neuron of neurons set around the winning neuron; and wherein the training ends until reaching the pre-set end condition, obtaining the SOM neural network, and obtaining a plurality of clusters output by the SOM neural network model.
88. The method of claim 87, wherein according to the spatial series data, performing anomaly detection through pre-trained SOM neural network model, obtaining a second detection result for the user behavior, comprising:
normalization processing the spatial series data, wherein the normalization processed spatial series data as input parameters, wherein the parameters are input to the SOM neural network model;
determining the winning neurons corresponding to the input parameters and cluster of the winning neurons according to the Euclidean distance from the input parameters to each neuron;
calculating cluster area of the winning neurons;

Date Recue/Date Received 2022-02-07 comparing the cluster area with an area threshold, wherein the cluster area is less than the area threshold, the cluster is an anomaly cluster; and generating a second detection result for user behavior according to the comparing result.
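By way of non-limiting illustration, the normalization step of claim 88 might be sketched as below. The claims do not state which normalization is used; min-max scaling of each feature into [0, 1] and the function name `min_max_normalize` are assumptions made for this example.

```python
def min_max_normalize(rows):
    """Scale each column of a list of equal-length numeric rows into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column carries no information, so map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized
```

In practice the minimum and maximum learned from the training sample set would be reused at detection time, so that new inputs are scaled consistently with the data the SOM was trained on.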
89. The method of claim 85, wherein when the first detection result and the second detection result identify the user behavior as an anomaly, identity authentication is performed on the user or the user's operations and behavior are restricted, wherein the restriction comprises disabling key functions on key pages of the application, wherein the key functions include viewing, inputting, and submitting.
90. The method of any one of claims 85 to 89, wherein user data within a pre-set time period is obtained and pre-processed to extract the time series data and the spatial series data associated with the user behavior.
91. The method of any one of claims 85 to 90, wherein the user data includes user attribute data and user behavior data, wherein the user attribute data includes any one or more of name, age, and mailing address, wherein the user behavior data includes any one or more of IP address of account registration, IP address of each login, time information of each login, page click information, user device information, online duration, and other related information, wherein the user device information includes any one or more of device MAC address, device gyroscope data, device acceleration data, CPU, memory, disk I/O, and other information.
92. The method of any one of claims 85 to 91, wherein the time series data is a series of indicator values obtained by sorting the actual indicator values of a user within a pre-set time period in chronological order, wherein an indicator value is the value of a parameter indicator obtained from statistics on numerical user data related to the user behavior within the pre-set time period, wherein the parameter indicator can be any one or more of online duration, device moving distance, and change in screen temperature.
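By way of non-limiting illustration, building the time series of claim 92 from timestamped indicator values might be sketched as below. The record layout (timestamp, value) and the function name `build_time_series` are assumptions made for this example.

```python
def build_time_series(records):
    """Sort (timestamp, indicator_value) pairs chronologically and keep the values.

    Timestamps only need to be mutually comparable, for example ISO 8601
    strings or datetime objects.
    """
    return [value for _, value in sorted(records, key=lambda rec: rec[0])]
```

For example, records of daily online duration collected out of order are returned as a chronologically ordered list of durations, ready to be passed to a forecasting model.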

93. The method of any one of claims 85 to 92, wherein the spatial series data refers to user behavior trajectory data arranged in spatial order during use of the application, wherein there is a connection of sequence, flow, and direction between the spaces, wherein the user behavior trajectory data involved in the user logging in to the application to perform a transfer operation forms the user spatial series data.
94. The method of any one of claims 85 to 93, wherein the pre-set time point can be the time point corresponding to the Nth of the M data points included in the time series data, where N is greater than 1 and less than M.
95. The method of any one of claims 85 to 94, wherein the actual indicator values before the pre-set time point in the time series data are substituted into the ARIMA model for predicting, to obtain the predicted value of the indicator at the pre-set time point and the confidence interval of the predicted value of the indicator at confidence level α.
96. The method of any one of claims 85 to 95, wherein the ARIMA (Autoregressive Integrated Moving Average) model is a differenced autoregressive moving average model that predicts future values from past and present values, wherein it treats the time series data as a random series and finds an optimal function to fit it.
97. The method of any one of claims 85 to 96, wherein the ARIMA(p, q, d) model is defined as follows:
$y_t = \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \cdots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q}$.
98. The method of claim 97, wherein p is the autoregressive order, d is the differencing order of the series, q is the moving average order, $y_t$ is the time series observation at time $t$, $e_t$ is a white noise series, and $\varphi_i$ and $\theta_i$ are the coefficients of $y_{t-i}$ and $e_{t-i}$, respectively.
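By way of non-limiting illustration, the model equation of claim 97, with the symbols defined in claim 98, might be evaluated for a single time step as below. The function name `arma_one_step` and the convention that histories are ordered from oldest to newest are assumptions made for this example; the coefficients are taken as given rather than estimated.

```python
def arma_one_step(y_hist, e_hist, phi, theta, e_t=0.0):
    """Evaluate y_t = sum(phi_i * y_{t-i}) + e_t - sum(theta_i * e_{t-i}).

    y_hist and e_hist hold past observations and past noise terms, oldest
    first, so y_hist[-1] is y_{t-1}. phi has p entries and theta has q entries.
    """
    ar_part = sum(p * y_hist[-i] for i, p in enumerate(phi, start=1))
    ma_part = sum(th * e_hist[-i] for i, th in enumerate(theta, start=1))
    return ar_part + e_t - ma_part
```

Setting `e_t` to zero gives the point forecast, since the white noise term has zero mean.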
99. The method of any one of claims 85 to 98, wherein a unit root test is adopted to test the stationarity of the time series sample data, wherein when the data is non-stationary, the series is differenced repeatedly until it meets the stationarity test conditions, thereby eliminating the data trend and obtaining stationary time series sample data, wherein the differencing order d of the ARIMA model is the number of times the series is differenced before it becomes a stationary time series.
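By way of non-limiting illustration, the repeated differencing of claim 99 might be sketched as below. The stationarity test itself is passed in as `is_stationary`, a hypothetical predicate that in practice could wrap a unit root test; the function names and the `max_d` cap are assumptions made for this example.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out

def difference_until_stationary(series, is_stationary, max_d=3):
    """Return (d, differenced_series) for the smallest d that passes the test."""
    current = list(series)
    d = 0
    while not is_stationary(current):
        if d == max_d:
            raise ValueError("series is not stationary after max_d differences")
        current = difference(current)
        d += 1
    return d, current
```

The returned `d` is the differencing order that the claim assigns to the ARIMA model.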
100. The method of any one of claims 85 to 99, wherein, after the differencing order d of the model is determined, the ranges of the autoregressive order p and the moving average order q are defined, the combinations of (p, q) are traversed, and the combination of (p, q) with the minimum AIC value is identified based on the AIC information criterion, whereby the optimal p, d, and q are determined and applied in the ARIMA model for predicting.
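By way of non-limiting illustration, the traversal of (p, q) combinations in claim 100 might be sketched as below, using the standard definition AIC = 2k - 2 ln L. The callable `fit` is hypothetical: it stands for whatever routine fits an ARIMA model and reports its log-likelihood and parameter count. The function names are likewise assumptions made for this example.

```python
import itertools

def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * num_params - 2 * log_likelihood

def select_orders(fit, p_range, q_range, d):
    """Traverse every (p, q) pair and return (p, d, q) with the minimum AIC.

    fit(p, d, q) is a hypothetical callable returning
    (log_likelihood, num_params) for the fitted model.
    """
    best = None
    for p, q in itertools.product(p_range, q_range):
        log_likelihood, num_params = fit(p, d, q)
        score = aic(log_likelihood, num_params)
        if best is None or score < best[0]:
            best = (score, p, q)
    return best[1], d, best[2]
```

Because the AIC penalizes the parameter count, a larger order is selected only when it improves the likelihood enough to offset the added complexity.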
101. The method of any one of claims 85 to 100, wherein determining whether the actual value of the indicator at the pre-set time point is within the confidence interval of the predicted value of the indicator, obtaining a determination result;
generating the first detection result for the user behavior according to the determination result, wherein when the actual value of the indicator falls outside the confidence interval, the first detection result indicates that the actual value of the indicator at the pre-set time point is an anomaly value, and when the actual value of the indicator falls within the confidence interval, the first detection result indicates that the actual value of the indicator at the pre-set time point is a normal value.
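By way of non-limiting illustration, the interval check of claim 101 might be sketched as below. The claims do not state how the confidence interval is computed; the symmetric interval under normally distributed forecast errors, the function names, and the string labels are assumptions made for this example.

```python
from statistics import NormalDist

def confidence_interval(predicted, std_error, level=0.95):
    """Symmetric interval around a prediction, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return predicted - z * std_error, predicted + z * std_error

def first_detection_result(actual, lower, upper):
    """Label the actual value 'normal' inside the interval, 'anomaly' outside."""
    return "normal" if lower <= actual <= upper else "anomaly"
```

A value exactly on the boundary is treated here as falling within the interval, which is a design choice of the sketch rather than something the claim specifies.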

102. The method of any one of claims 85 to 101, wherein the SOM (Self-Organizing Map) neural network is an unsupervised artificial neural network, wherein the network structure of the SOM has two layers, an input layer and an output layer (also called the competition layer), wherein, unlike general neural networks that are trained by back-propagating a loss function, the SOM uses a competitive learning strategy that relies on competition among the neurons to gradually optimize the network, wherein the neurons form a matrix of equidistant nodes arranged in two dimensions to constitute the output layer, wherein each node has a corresponding weight vector whose dimension equals that of the input data, and the nearest neighbor relationship function is used to maintain the topology of the input space.
103. The method of any one of claims 85 to 102, wherein, in initializing the pre-set SOM neural network, the weight of each neuron of the SOM neural network can be initialized to a very small random number greater than 0 and less than 1, wherein the number of model iterations, the learning rate, and the neighborhood radius also need to be initialized.
104. The method of any one of claims 85 to 103, wherein, in the training process of the SOM neural network, a neighborhood radius R is set with the winning neuron as the center and is initialized to an initial neighborhood radius, the area within the radius being called the winning neighborhood, wherein the range of the winning neighborhood shrinks as the number of training iterations increases and finally shrinks to a fixed value of the neighborhood radius.
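By way of non-limiting illustration, the shrinking winning neighborhood of claim 104 might follow a schedule such as the one below. The claims do not specify the shape of the decay; the linear schedule, the function name, and the default final radius are assumptions made for this example.

```python
def neighborhood_radius(step, total_steps, initial_radius, final_radius=1.0):
    """Shrink the radius linearly from initial_radius to final_radius.

    Steps beyond total_steps keep the fixed final radius.
    """
    progress = min(step, total_steps) / total_steps
    return initial_radius + (final_radius - initial_radius) * progress
```

A large early radius lets many neurons move together and order the map globally, while the small final radius fine-tunes individual neurons.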
105. The method of any one of claims 85 to 104, wherein the Euclidean distance between the input vector X and each neuron is calculated, and the neuron with the smallest Euclidean distance to the input vector X is the winning neuron, wherein all neurons in the output layer of the SOM neural network compete with each other, and only one winning neuron can be activated each time.

106. The method of any one of claims 85 to 105, wherein the neighborhood radius is set with the winning neuron as the center, and the area within the radius is called the winning neighborhood, wherein all neurons in the winning neighborhood are determined according to the coordinates of the winning neuron and the neighborhood radius, and the gradient descent method is used to update the weight of each neuron in the winning neighborhood.
107. The method of any one of claims 85 to 106, wherein a new input sample is read from the training sample set and the process is executed iteratively until the training of all training samples is completed, wherein after the weight values of all winning neurons are updated, the learning rate and the neighborhood function are updated, wherein when the number of training iterations of the SOM neural network reaches a pre-set maximum number, the training and learning process is exited, obtaining the trained SOM neural network model and a plurality of clusters output by the SOM neural network model, wherein each cluster corresponds to a neighborhood scope (i.e., the winning neighborhood), and the neighborhood contains at least one neuron.
108. The method of any one of claims 85 to 107, wherein the area threshold can be set according to actual needs, wherein a small cluster area indicates an isolated cluster with a very small cluster size, which is set as the anomaly cluster.
109. The method of any one of claims 85 to 108, wherein the neighborhood radius of the winning neighborhood in which the winning neuron is located is determined, the area of the circle whose radius is the neighborhood radius is calculated and used as the cluster area of the cluster to which the winning neuron belongs, and the cluster area is compared with the area threshold.
110. The method of any one of claims 85 to 109, wherein when the cluster area of the cluster to which the winning neuron belongs is less than the area threshold, the second detection result indicates that the user spatial series data is anomaly data, and when the cluster area of the cluster to which the winning neuron belongs is not less than the area threshold, the second detection result indicates that the user spatial series data is normal data.
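By way of non-limiting illustration, the area comparison of claims 109 and 110 might be sketched as below. The function name and the string labels are assumptions made for this example; the radius and the threshold are taken as given.

```python
import math

def second_detection_result(neighborhood_radius, area_threshold):
    """Compare the circle area pi * R^2 of the winning neighborhood to a threshold.

    Returns 'anomaly' when the cluster area is less than the threshold,
    and 'normal' otherwise.
    """
    cluster_area = math.pi * neighborhood_radius ** 2
    return "anomaly" if cluster_area < area_threshold else "normal"
```

For instance, with an area threshold of 4, a neighborhood radius of 1 gives an area of about 3.14 and is flagged, while a radius of 2 gives about 12.57 and is not.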

111. The method of any one of claims 85 to 110, wherein predicting the confidence interval of the indicator through the ARIMA model and performing anomaly detection through the pre-trained SOM neural network model are executed concurrently.
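By way of non-limiting illustration, the concurrent execution of claim 111 might be sketched with a thread pool as below. The two detector callables are hypothetical placeholders for the ARIMA-based and SOM-based detection steps, and the function name `run_detectors` is an assumption made for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def run_detectors(arima_detector, som_detector, time_series, spatial_series):
    """Run both detectors concurrently and return (first_result, second_result).

    arima_detector and som_detector are hypothetical callables that each take
    one series and return a detection result.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(arima_detector, time_series)
        second = pool.submit(som_detector, spatial_series)
        return first.result(), second.result()
```

Because neither detector depends on the other's output, the two results can be collected in any order once both tasks finish.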
112. The method of any one of claims 85 to 111, wherein when the first detection result and the second detection result are both normal, the user behavior is determined as normal, wherein when the first detection result and the second detection result are both anomaly, the user behavior is determined as anomaly, and wherein when only one of the first detection result and the second detection result is normal, the user behavior is determined as a suspicious anomaly behavior, which can be manually identified.
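By way of non-limiting illustration, the decision rule of claim 112 might be sketched as below. The function name and the string labels are assumptions made for this example.

```python
def combine_results(first, second):
    """Combine the two detection results into a final judgment.

    Both 'normal' gives 'normal', both 'anomaly' gives 'anomaly', and a
    disagreement gives 'suspicious', which is left for manual review.
    """
    if first == "normal" and second == "normal":
        return "normal"
    if first == "anomaly" and second == "anomaly":
        return "anomaly"
    return "suspicious"
```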

CA3132346A 2020-09-29 2021-09-29 User abnormal behavior recognition method and device and computer readable storage medium Active CA3132346C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011047099.2 2020-09-29
CN202011047099.2A CN111898758B (en) 2020-09-29 2020-09-29 User abnormal behavior identification method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CA3132346A1 (en) 2022-03-29
CA3132346C (en) 2024-03-19

Family

ID=73224018

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3132346A Active CA3132346C (en) 2020-09-29 2021-09-29 User abnormal behavior recognition method and device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN111898758B (en)
CA (1) CA3132346C (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742102A (en) * 2022-03-30 2022-07-12 中国人民解放军战略支援部队航天工程大学 NLOS signal identification method and system
CN115018053A (en) * 2022-06-16 2022-09-06 河南工业大学 Air quality monitoring data calibration method and device for self-organizing robust width network
CN115565623A (en) * 2022-10-19 2023-01-03 中国矿业大学(北京) Method and system for analyzing coal geological components, electronic equipment and storage medium
CN116204805A (en) * 2023-04-24 2023-06-02 青岛鑫屋精密机械有限公司 Micro-pressure oxygen cabin and data management system
CN117034179A (en) * 2023-10-10 2023-11-10 国网山东省电力公司营销服务中心(计量中心) Abnormal electric quantity identification and tracing method and system based on graph neural network
CN117130016A (en) * 2023-10-26 2023-11-28 深圳市麦微智能电子有限公司 Personal safety monitoring system, method, device and medium based on Beidou satellite
CN117455555A (en) * 2023-12-25 2024-01-26 厦门理工学院 Big data-based electric business portrait analysis method and system
CN117828688A (en) * 2024-01-29 2024-04-05 北京亚鸿世纪科技发展有限公司 Data security processing method and system
CN117906726A (en) * 2024-03-19 2024-04-19 西安艺琳农业发展有限公司 Abnormal detection system for weight data of live cattle body ruler

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288571B (en) * 2020-11-24 2022-06-10 重庆邮电大学 Personal credit risk assessment method based on rapid construction of neighborhood coverage
CN112907622A (en) * 2021-01-20 2021-06-04 厦门市七星通联科技有限公司 Method, device, equipment and storage medium for identifying track of target object in video
CN113052314B (en) * 2021-05-27 2021-09-14 华中科技大学 Authentication radius guide attack method, optimization training method and system
CN113569910B (en) * 2021-06-25 2024-06-21 石化盈科信息技术有限责任公司 Account type identification method, account type identification device, computer equipment and storage medium
CN113971119B (en) * 2021-10-21 2023-02-07 云纷(上海)信息科技有限公司 Unsupervised model-based user behavior anomaly analysis and evaluation method and system
CN114419528B (en) * 2022-04-01 2022-07-08 浙江口碑网络技术有限公司 Anomaly identification method and device, computer equipment and computer readable storage medium
CN115618247A (en) * 2022-09-26 2023-01-17 中电金信软件(上海)有限公司 Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium
CN117033052B (en) * 2023-08-14 2024-05-24 企口袋(重庆)数字科技有限公司 Object abnormality diagnosis method and system based on model identification

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789149B (en) * 2016-11-18 2020-08-14 北京工业大学 Intrusion detection method adopting improved self-organizing characteristic neural network clustering algorithm
CN109587713B (en) * 2018-12-05 2022-01-11 广州数锐智能科技有限公司 Network index prediction method and device based on ARIMA model and storage medium
CN111178523B (en) * 2019-08-02 2023-06-06 腾讯科技(深圳)有限公司 Behavior detection method and device, electronic equipment and storage medium

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742102A (en) * 2022-03-30 2022-07-12 中国人民解放军战略支援部队航天工程大学 NLOS signal identification method and system
CN115018053A (en) * 2022-06-16 2022-09-06 河南工业大学 Air quality monitoring data calibration method and device for self-organizing robust width network
CN115565623A (en) * 2022-10-19 2023-01-03 中国矿业大学(北京) Method and system for analyzing coal geological components, electronic equipment and storage medium
CN115565623B (en) * 2022-10-19 2023-06-09 中国矿业大学(北京) Analysis method, system, electronic equipment and storage medium for coal geological composition
CN116204805A (en) * 2023-04-24 2023-06-02 青岛鑫屋精密机械有限公司 Micro-pressure oxygen cabin and data management system
CN117034179B (en) * 2023-10-10 2024-02-02 国网山东省电力公司营销服务中心(计量中心) Abnormal electric quantity identification and tracing method and system based on graph neural network
CN117034179A (en) * 2023-10-10 2023-11-10 国网山东省电力公司营销服务中心(计量中心) Abnormal electric quantity identification and tracing method and system based on graph neural network
CN117130016A (en) * 2023-10-26 2023-11-28 深圳市麦微智能电子有限公司 Personal safety monitoring system, method, device and medium based on Beidou satellite
CN117130016B (en) * 2023-10-26 2024-02-06 深圳市麦微智能电子有限公司 Personal safety monitoring system, method, device and medium based on Beidou satellite
CN117455555A (en) * 2023-12-25 2024-01-26 厦门理工学院 Big data-based electric business portrait analysis method and system
CN117455555B (en) * 2023-12-25 2024-03-08 厦门理工学院 Big data-based electric business portrait analysis method and system
CN117828688A (en) * 2024-01-29 2024-04-05 北京亚鸿世纪科技发展有限公司 Data security processing method and system
CN117906726A (en) * 2024-03-19 2024-04-19 西安艺琳农业发展有限公司 Abnormal detection system for weight data of live cattle body ruler
CN117906726B (en) * 2024-03-19 2024-06-04 西安艺琳农业发展有限公司 Abnormal detection system for weight data of live cattle body ruler

Also Published As

Publication number Publication date
CA3132346C (en) 2024-03-19
CN111898758A (en) 2020-11-06
CN111898758B (en) 2021-03-02
