CN114978602A

CN114978602A - Cloud security account management method and security platform based on big data

Info

Publication number: CN114978602A
Application number: CN202210442926.0A
Authority: CN
Inventors: 沈莲; 刘艺鼎; 马亚超
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-04-25
Filing date: 2022-04-25
Publication date: 2022-08-30

Abstract

The invention discloses a cloud security account management method and a security platform based on big data, relating to the technical field of machine learning. The platform comprises a cloud platform client, a data collection module, a personal data training module, a user data training module, an account anomaly monitoring module and a cloud platform; the data collection module comprises a personal data collection unit and a user data collection unit. A machine learning judgment model suited to each individual user is trained by collecting the operation information of each of that user's logins; a second machine learning judgment model suited to all users is trained by collecting the login habit information of all users; combining the two models yields more accurate recognition of login anomalies.

Description

Cloud security account management method and security platform based on big data

Technical Field

The invention belongs to the field of cloud account security, relates to a machine learning technology, and particularly relates to a cloud security account management method and a security platform based on big data.

Background

At present, internet technology is advancing rapidly, internet apps are emerging in an endless stream, and users need to register accounts on these apps; account security has therefore become a major problem to be solved urgently, and one of its central issues is how to prevent account theft. The prior art judges whether a login is abnormal mainly from the historical login and operation records of the user; however, the prior art has the following problems:

1. it relies only on each user's own login data and does not take into account the similarities between users, even though these similarities could be used to judge more accurately whether a logged-in account is abnormal;

2. abnormal logins are judged only from historical data, without considering that each new login by the user can serve as new input data for training the machine learning model;

therefore, a cloud security account management method and a security platform based on big data are provided.

Disclosure of Invention

The present invention aims to solve at least one of the problems of the prior art. To this end, it provides a cloud security account management method and a security platform based on big data, which train a machine learning judgment model suited to each individual user by collecting the operation information of each of that user's logins, train a second machine learning judgment model suited to all users by collecting the login habit information of all users, and combine the two models to recognize login anomalies more accurately.

In order to achieve the above object, embodiments according to a first aspect of the present invention provide a cloud security account management method and a security platform based on big data, including a cloud platform client, a data collection module, a personal data training module, a user data training module, an account anomaly monitoring module, and a cloud platform; the data collection module comprises a personal data collection unit and a user data collection unit; the modules are connected to one another wirelessly and/or electrically;

the cloud platform client is mainly used by a user to register and log in to a cloud platform account and to communicate with the cloud platform; the user may optionally fill in his or her real gender and age when registering;

the data collection module is mainly used for collecting operation information of a user at a cloud platform client;

the personal data collection unit is mainly used for collecting the operation information of each login of each user; the operation information includes, but is not limited to, login time, login IP address, login duration, and the degree of attention to each data type accessed and/or operated on;

the personal data collection unit sends the collected operation information of each individual user to the personal data training module;

the user data collection unit is mainly used for collecting the login habit information of each user; the login habit information includes, but is not limited to, the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses, and data operation speed;

the user data collection unit sends the login habit information of each user to the user data training module;

the personal data training module is mainly used for training a machine learning judgment model belonging to each user; specifically, the training of the machine learning judgment model of each user includes the following steps:

step S1: converting the operation information into an operation vector, that is, combining the login time, login IP address, login duration and degree of attention to each data type accessed and/or operated on into one operation vector;

step S2: until the number of logins of the user reaches a login count threshold t, marking every login state of the user as normal; 0 is used as the flag for an abnormal login and 1 as the flag for a normal login; the login count threshold t is set according to practical experience;

step S3: training a machine learning model with the operation vectors and login state flags of the user's first t logins as the training set; the machine learning model may be a deep neural network, a linear regression model, or a support vector machine;

saving the trained machine learning model belonging to each user, and marking the machine learning model as Mi, wherein i represents the user;
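Steps S2 and S3 can be illustrated with a minimal sketch that assembles the training set for one user's model Mi. The function name `build_personal_training_set` and the toy vectors are assumptions made for illustration and do not come from the source. Note that a set built this way contains only the "normal" class; the source does not say how abnormal examples are obtained before user feedback arrives in step Q7, so the sketch stops at assembling the labelled pairs.

```python
NORMAL, ABNORMAL = 1, 0  # flag convention stated in step S2


def build_personal_training_set(operation_vectors, t):
    """Pair each of the first t operation vectors with the 'normal' flag.

    Returns None while fewer than t logins have been recorded, because
    step S3 trains on exactly the first t logins. (Hypothetical helper.)
    """
    if len(operation_vectors) < t:
        return None
    return [(vec, NORMAL) for vec in operation_vectors[:t]]


# Toy operation vectors: [hour slot, IP code, duration in seconds]
logins = [[9, 0, 1800.0], [10, 0, 1200.0], [21, 1, 600.0]]
not_ready = build_personal_training_set(logins, t=5)
training_set = build_personal_training_set(logins, t=2)
```

Here `not_ready` is `None` because only three logins exist, while `training_set` holds the first two logins, each flagged as normal.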

the user data training module is mainly used for training login habit models of all users by using a machine learning model; the training of the login habit models of all users comprises the following steps:

step P1: converting the login habit information into a login habit vector;

combining the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses and the data operation speed into a login habit vector;

step P2: collecting the login habit vectors of all users and labelling each of them as 1; in addition, the cloud platform creates a batch of virtual accounts and arranges for several persons to log in to and operate them according to their own habits; the login habit vectors generated by this batch of virtual accounts are labelled as 0;

step P3: training a machine learning model with the login habit vectors of the users and the login habit vectors generated by the virtual accounts as the training set, wherein the machine learning model may be a deep neural network, a linear regression model, or a support vector machine;

storing the trained machine learning model, and marking the machine learning model as N;
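Steps P2 and P3 can be sketched as follows. This is a minimal illustration, not the patented implementation: it swaps in a small logistic regression trained by batch gradient descent (the source names deep neural networks, linear regression and support vector machines), and it assumes the habit vectors have already been scaled to comparable ranges. All names and the toy vectors are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(vectors, labels, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(vectors), len(vectors[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(vectors, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probability_abnormal(model, x):
    """One minus the predicted probability of label 1 (real user)."""
    w, b = model
    return 1.0 - sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy, pre-scaled habit vectors: [gender code, scaled age, scaled speed].
real_users = [[0.0, 0.30, 0.2], [1.0, 0.40, 0.1], [0.0, 0.50, 0.3]]
virtual_accounts = [[1.0, 0.30, 0.9], [0.0, 0.40, 0.8]]

# Step P2: real users are labelled 1, virtual accounts are labelled 0.
vectors = real_users + virtual_accounts
labels = [1] * len(real_users) + [0] * len(virtual_accounts)

# Step P3: train and keep the model as N.
model_n = train_logistic(vectors, labels)
```

Because label 1 marks real users, the abnormal-login probability used later is taken here as the complement of the predicted probability of label 1; that mapping is an assumption of this sketch.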

the account anomaly monitoring module is mainly used for judging whether a user account is abnormal;

specifically, judging whether the user account is abnormal includes the following steps:

step Q1: recording operation information and login habit information generated by each login of a user;

step Q2: converting the login habit information into a login habit vector, wherein if the login IP address is a new IP address, the number of commonly used login IP addresses is taken as the corresponding user's existing number of commonly used login IP addresses plus one;

step Q3: taking the login habit vector as the input of a machine learning model N to obtain the probability p1 of abnormal login;

step Q4: converting the operation information into an operation vector;

step Q5: taking the operation vector as the input of the machine learning model Mi to obtain the probability p2 of abnormal login;

step Q6: calculating the combined probability of a login anomaly P = α × p1 + β × p2, and judging whether P > γ holds; if so, sending login anomaly information to the user, for example as a short message to the mobile phone number registered by the user, and proceeding to step Q7; wherein α and β are preset fixed proportionality coefficients with α + β = 1, and γ is an anomaly probability threshold set according to experience;

step Q7: updating the user's login habit vector and operation vector according to the user's feedback, and retraining the machine learning model Mi and the machine learning model N.
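The IP-count rule of step Q2 and the decision rule of step Q6 can be sketched as below. The function names and the example values are illustrative assumptions; restricting α to the interval [0, 1] is also an assumption, since the source only requires that α and β sum to 1.

```python
def common_ip_count_for_login(common_ips, login_ip):
    """Step Q2: a login from a previously unseen IP adds one to the count."""
    return len(common_ips) + (0 if login_ip in common_ips else 1)


def combined_anomaly_probability(p1, p2, alpha):
    """Step Q6: P = alpha * p1 + beta * p2, with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:  # assumed range, not stated in the source
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    return alpha * p1 + beta * p2


def is_login_abnormal(p1, p2, alpha, gamma):
    """True when the combined probability exceeds the threshold gamma."""
    return combined_anomaly_probability(p1, p2, alpha) > gamma


known_ips = {"203.0.113.7", "198.51.100.2"}
ip_count = common_ip_count_for_login(known_ips, "192.0.2.1")

# p1 comes from model N (habit vector), p2 from model Mi (operation vector).
P = combined_anomaly_probability(p1=0.2, p2=0.9, alpha=0.4)
alert = is_login_abnormal(p1=0.2, p2=0.9, alpha=0.4, gamma=0.5)
```

With these example values, P = 0.4 × 0.2 + 0.6 × 0.9 = 0.62, which exceeds γ = 0.5, so the login would be reported to the user and step Q7 would follow.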

Compared with the prior art, the invention has the beneficial effects that:

1. the method takes into account both the operation habits of each individual user and the similarities between the operations of different users, and trains two groups of machine learning models, thereby improving the accuracy of identifying abnormal accounts;

2. in addition to using the user's historical login and operation data, the method adds each new login of the user to the training set of the machine learning models as new data, further ensuring the accuracy of identifying abnormal accounts.

Drawings

FIG. 1 is a schematic diagram of the present invention;

FIG. 2 is a flow chart of the present invention.

Detailed Description

The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in fig. 1, a cloud security account management security platform based on big data includes a cloud platform client, a data collection module, a personal data training module, a user data training module, an account anomaly monitoring module, and a cloud platform; the data collection module comprises a personal data collection unit and a user data collection unit; the modules are connected to one another wirelessly and/or electrically;

the cloud platform client is mainly used by a user to register and log in to a cloud platform account and to communicate with the cloud platform;

it is understood that the user may register and log in by entering an account name and password and/or by face or fingerprint recognition; further, in order to better protect the security of the cloud account, the user may optionally fill in his or her real gender and age during registration; in addition, the user communicates with the cloud platform through the cloud platform client and uses the client to operate on cloud platform data;

the data types are determined by the services that the cloud platform provides; for example, for a cloud platform providing film and television services, the data types may be martial arts, science fiction, cartoon, suspense and the like; for a cloud platform providing shopping services, the data types may be books, food, home decoration, daily goods and the like;

the degree of attention to a data type is the number of times the user has accessed and/or operated on data of that type; it can be understood that the more often a data type is accessed and/or operated on, the more attention the user pays to it;
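As an illustration of this definition, the degree of attention can be derived by counting access events per data type. The sketch below uses Python's `collections.Counter`; the event format `(data_type, action)` and the function names are assumptions made for the example.

```python
from collections import Counter


def attention_counts(events):
    """Count accesses/operations per data type from (data_type, action) pairs."""
    return Counter(data_type for data_type, _action in events)


def top_three(counts):
    """Return the three data types with the highest counts."""
    return [data_type for data_type, _ in counts.most_common(3)]


# Toy event log for a film and television platform.
events = (
    [("science fiction", "view")] * 4
    + [("cartoon", "view")] * 3
    + [("suspense", "view")] * 2
    + [("martial arts", "view")]
)
counts = attention_counts(events)
favourites = top_three(counts)
```

A data type that never appears in the event log has a count of 0, which matches the rule that a type accessed 0 times is marked as 0.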

the user data collection unit is mainly used for collecting the login habit information of each user; the login habit information includes, but is not limited to, the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses, and data operation speed; the data operation speed is the average speed at which the user moves the mouse or swipes the screen when accessing data;

step S1: converting the operation information into an operation vector;

the login time can be assigned to one of several time periods; for example, the day is divided into 24 one-hour periods, and a login falling in the i-th period is marked as i;

the login IP address can be marked in order of first appearance: the first IP to appear is marked as 0, and each new IP that appears afterwards is marked as 1, 2, … in sequence;

the login duration is the length of time from the user logging in to the user going offline;

the degree of attention to each data type accessed and/or operated on may be marked as the number of times that data type was accessed and/or operated on; a data type accessed and/or operated on 0 times is marked as 0;

the login time, login IP address, login duration and degree of attention to each data type accessed and/or operated on are then combined into an operation vector;
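The encoding described for step S1 can be sketched as follows. This is a minimal illustration under stated assumptions: the hour of the day (0 to 23) is used as the time-period index, durations are measured in seconds, and the class `IpIndexer` and function `operation_vector` are hypothetical names, not part of the source.

```python
from datetime import datetime


class IpIndexer:
    """Marks the first IP seen as 0 and each later new IP as 1, 2, ..."""

    def __init__(self):
        self._index = {}

    def encode(self, ip):
        # A previously seen IP keeps its mark; a new IP gets the next one.
        if ip not in self._index:
            self._index[ip] = len(self._index)
        return self._index[ip]


def operation_vector(login_at, logout_at, ip, access_counts, data_types, indexer):
    """Combine one login's operation information into a flat numeric vector."""
    hour_slot = login_at.hour  # assumed zero-based index of the one-hour period
    ip_code = indexer.encode(ip)
    duration_s = (logout_at - login_at).total_seconds()
    attention = [access_counts.get(t, 0) for t in data_types]  # 0 if never accessed
    return [hour_slot, ip_code, duration_s] + attention


indexer = IpIndexer()
types = ["books", "food", "home decoration"]
vec = operation_vector(
    datetime(2022, 4, 25, 9, 30),
    datetime(2022, 4, 25, 10, 0),
    "203.0.113.7",
    {"books": 4, "food": 1},
    types,
    indexer,
)
```

Keeping one `IpIndexer` per user preserves the order-of-first-appearance marking across that user's logins.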

step P1: converting the login habit information into a login habit vector;

wherein the gender may be represented by 0 for male and 1 for female;

age may be expressed using real age;

the three data types receiving the most attention can be labelled with 1; for example, if there are five data types arranged in a fixed order, a vector of length 5 may be used, with [1,1,1,0,0] indicating that the first three data types are those receiving the most attention;

the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses and the data operation speed are then combined into a login habit vector;
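The encoding described for step P1 can be sketched as below. The function name `habit_vector`, the string keys "male" and "female", and the toy values are assumptions for illustration; the sketch also assumes that gender and age have been provided, although the source makes them optional at registration.

```python
def habit_vector(gender, age, counts, data_types, common_ip_count, operation_speed):
    """Combine login habit information into one flat numeric vector."""
    gender_code = {"male": 0, "female": 1}[gender]
    # Rank data types by attention count and keep the three highest.
    ranked = sorted(data_types, key=lambda t: counts.get(t, 0), reverse=True)
    top = set(ranked[:3])
    # One flag per data type, in the fixed order given by data_types.
    top_flags = [1 if t in top else 0 for t in data_types]
    return [gender_code, age] + top_flags + [common_ip_count, operation_speed]


types = ["books", "food", "home decoration", "daily goods", "electronics"]
counts = {"books": 9, "food": 7, "home decoration": 5, "daily goods": 1}
habit = habit_vector("female", 30, counts, types, 2, 1.5)
```

In this example the first three data types receive the most attention, so the middle of the vector reproduces the [1,1,1,0,0] pattern from the text.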

step Q4: converting the operation information into an operation vector;

step Q6: calculating the combined probability of a login anomaly P = α × p1 + β × p2, and judging whether P > γ holds; if so, sending login anomaly information to the user, for example as a short message to the mobile phone number registered by the user, and proceeding to step Q7; wherein α and β are preset fixed proportionality coefficients with α + β = 1, and γ is an anomaly probability threshold set according to experience;

As shown in fig. 2, a cloud security account management method based on big data specifically includes the following steps:

step one: a user registers with and logs in to the cloud platform through the client;

step two: the data collection module collects the operation information of each login of each user and the login habit information of each user;

step three: the personal data training module is used for training a machine learning model Mi for judging personal operation according to the operation information of each user;

step four: the user data training module trains a machine learning model N for judging user login habits according to the login habit information of all users and the login habit information generated by the virtual accounts;

step five: each time a user logs in, the account anomaly monitoring module collects the operation information and login habit information of that login, inputs them into the machine learning model Mi and the machine learning model N respectively, and calculates the probability that the login is abnormal.

The above formulas are all calculated on dimensionless numerical values; each formula is the one closest to the real situation, obtained by collecting a large amount of data and performing software simulation, and the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation or obtained by simulation on a large amount of data.

Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims

1. A cloud security account management security platform based on big data, characterized by comprising: a cloud platform client, a data collection module, a personal data training module, a user data training module, an account anomaly monitoring module and a cloud platform; the data collection module comprises a personal data collection unit and a user data collection unit; the modules are connected to one another wirelessly and/or electrically;

the cloud platform client is used by a user to register and log in to a cloud platform account and to communicate with the cloud platform; wherein the user has the option of filling in his or her real gender and age when registering;

the personal data collection unit is used for collecting the operation information of each login of each user; the operation information comprises login time, login IP address, login duration and the degree of attention to each data type accessed and/or operated on; the personal data collection unit sends the collected operation information of each individual user to the personal data training module;

the user data collection unit is used for collecting the login habit information of each user; the login habit information comprises the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses and the data operation speed; the user data collection unit sends the login habit information of each user to the user data training module;

the personal data training module is used for training a machine learning judgment model belonging to each user;

the user data training module is used for training login habit models of all users by using a machine learning model;

the account anomaly monitoring module is used for judging whether the user account is abnormal.

2. The big data-based cloud security account management security platform according to claim 1, wherein the personal data training module trains a machine learning judgment model of each user, comprising the steps of:

step S2: until the number of logins of the user reaches a login count threshold t, marking every login state of the user as normal; using 0 as the flag for an abnormal login and 1 as the flag for a normal login; the login count threshold t is set according to practical experience;

step S3: training a machine learning model with the operation vectors and login state flags of the user's first t logins as the training set;

the trained machine learning model belonging to each user is saved and labeled as Mi, where i represents the user.

3. The big-data-based cloud security account management security platform according to claim 2, wherein the converting the operation information into an operation vector comprises:

dividing the day into different time periods, and marking the login time as the number of the time period in which it falls;

the login IP address is marked in order of the first appearance of each new IP;

the login duration is the length of time from the user logging in to the user going offline;

the degree of attention to each data type accessed and/or operated on is marked as the number of times that data type was accessed and/or operated on.

4. The big data based cloud security account management security platform according to claim 1, wherein the training of login habit models of all users comprises the following steps:

step P1: converting the login habit information into a login habit vector, that is, combining the user's gender, age, the three data types receiving the most attention, the number of commonly used login IP addresses and the data operation speed into a login habit vector;

step P3: training by taking the login habit vector of the user and the login habit vector generated by the virtual account as a training set of a machine learning model;

and saving the trained machine learning model, and marking the machine learning model as N.

5. The big-data-based cloud security account management security platform according to claim 4, wherein the converting the login habit information into a login habit vector comprises:

gender is represented by 0 for male and 1 for female;

age is expressed using real age;

the three data types receiving the most attention are labelled with 1.

6. A cloud security account management method based on big data, characterized in that judging whether a user account is abnormal comprises the following steps:

step Q2: converting the login habit information into a login habit vector;

step Q4: converting the operation information into an operation vector;

step Q6: calculating the combined probability of a login anomaly P = α × p1 + β × p2, and judging whether P > γ holds; if so, sending login anomaly information to the user and proceeding to step Q7; wherein α and β are preset fixed proportionality coefficients with α + β = 1, and γ is an anomaly probability threshold set according to experience;