CN115296855A - User behavior baseline generation method and related device

Info

Publication number: CN115296855A
Application number: CN202210811244.2A
Authority: CN
Inventors: 何立维; 蔡达龙; 刘国平; 吕文俊; 但柯锐
Original assignee: Nsfocus Technologies Inc; Nsfocus Technologies Group Co Ltd
Current assignee: Nsfocus Technologies Inc; Nsfocus Technologies Group Co Ltd
Priority date: 2022-07-11
Filing date: 2022-07-11
Publication date: 2022-11-04
Anticipated expiration: 2042-07-11
Also published as: CN115296855B

Abstract

The application relates to a user behavior baseline generation method and a related device, which are used to improve the accuracy of a user behavior baseline and reduce computational complexity. The method comprises the following steps: for a target attribute, current attribute information is added to a historical access record; target weights corresponding to the current attribute information and to each piece of historical attribute information contained in the historical access record are then determined, based on the order in which the historical attribute information and the current attribute information were added and on the adding time interval between each piece of historical attribute information and the current attribute information; the target weights corresponding to attribute information with the same content among the current attribute information and the historical attribute information are accumulated; and a user behavior baseline is generated based on the accumulated weight values.

Description

User behavior baseline generation method and related device

Technical Field

The present application relates to the field of network security technologies, and in particular, to a method for generating a user behavior baseline and a related device.

Background

With the rapid development of computer networks, how to identify abnormal behaviors from a large number of behaviors to be identified based on a user behavior baseline is particularly important for ensuring the security of the network, wherein the user behavior baseline refers to a common attribute information set corresponding to various attributes, and the attributes may be Internet Protocol (IP) addresses, physical addresses, device identifiers and the like.

In the related art, the user behavior baseline is generally generated in the following two ways:

The first is a machine-learning-based method for generating a user behavior baseline.

In the first method, the machine learning algorithm may be an isolation forest, K-means clustering, time-series analysis, or the like. However, such algorithms are complex to implement and require a large amount of sample data, so it is difficult to ensure accurate identification of abnormal behaviors when behavior data samples are insufficient. In addition, to meet real-time requirements in practice, the baseline must be recalculated every time the user behavior changes, which degrades identification performance.

The second is a statistics-based method for generating a user behavior baseline.

In the second method, for a given attribute, either the attribute information corresponding to the login time closest to the current time or the attribute information with the largest number of logins is used as the common attribute information. However, a behavior baseline with a single dimension (login time or number of logins) cannot truly reflect abnormal behaviors, so the calculated common attribute information is inconsistent with the actual situation, and identification accuracy is difficult to guarantee.

Disclosure of Invention

The application provides a user behavior baseline generation method and a related device, which are used for improving the accuracy of a user behavior baseline and reducing the calculation complexity.

In a first aspect, an embodiment of the present application provides a user behavior baseline generation method, including:

responding to a login success operation triggered by a target object aiming at a target service, and acquiring current attribute information corresponding to a target attribute in the login success operation;

adding the current attribute information into a historical access record, wherein the historical access record comprises: historical attribute information corresponding to the target attribute that was triggered by the target object for the target service;

determining target weights corresponding to the current attribute information and to each piece of historical attribute information, based on the adding order of the current attribute information and the historical attribute information and on the adding time interval between each piece of historical attribute information and the current attribute information;

accumulating target weights corresponding to attribute information with the same content in the current attribute information and the historical attribute information, and determining at least one piece of common attribute information from the current attribute information and the historical attribute information on the basis of a weight accumulated value;

generating a user behavior baseline based on the target attribute and the at least one common attribute information.

In a second aspect, an embodiment of the present application provides a user behavior baseline generation apparatus, including:

an acquisition unit, configured to, in response to a login success operation triggered by a target object for a target service, acquire current attribute information corresponding to a target attribute in the login success operation;

a recording unit, configured to add the current attribute information to a historical access record, where the historical access record includes: historical attribute information corresponding to the target attribute that was triggered by the target object for the target service;

a determining unit, configured to determine target weights corresponding to the current attribute information and the historical attribute information, based on the adding order of the current attribute information and the historical attribute information, and based on adding time intervals between the historical attribute information and the current attribute information, respectively;

an accumulation unit, configured to accumulate the target weights corresponding to attribute information with the same content among the current attribute information and the historical attribute information, and to determine at least one piece of common attribute information from the current attribute information and the historical attribute information based on the weight accumulated value;

a generating unit, configured to generate a user behavior baseline based on the target attribute and the at least one piece of common attribute information.

In a third aspect, an embodiment of the present application provides an electronic device, which includes at least a processor and a memory, where the processor is configured to implement the steps of the user behavior baseline generation method as described above when executing a computer program stored in the memory.

In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the user behavior baseline generation method as described above.

In a fifth aspect, an embodiment of the present application provides a computer program product, where the computer program product includes: computer program code which, when run on a computer, causes the computer to perform the steps of the user behavior baseline generation method described above.

In the embodiment of the application, after current attribute information is added into a historical access record aiming at target attributes, target weights corresponding to the current attribute information and the historical attribute information are determined based on the adding sequence of the historical attribute information and the current attribute information contained in the historical access record and the adding time interval between the historical attribute information and the current attribute information respectively, then the target weights corresponding to the attribute information with the same content in the current attribute information and the historical attribute information are accumulated, and a user behavior baseline is generated based on the weight accumulated value.

Therefore, through the adding sequence and the corresponding time interval of each historical attribute information and the current attribute information, the behavior baseline on the time dimension can be obtained, the behavior baseline on the use frequency dimension can be obtained by accumulating the target weights corresponding to the attribute information with the same content, and the generated user behavior baseline has sensitivity to the time and the use frequency at the same time, so that the accuracy of the user behavior baseline is improved, the abnormal behavior identification accuracy is improved, meanwhile, the calculation complexity is reduced, the generation efficiency of the user behavior baseline is improved, and the abnormal behavior identification performance is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic flowchart of a user behavior baseline generation method provided in an embodiment of the present application;

Fig. 2 is a schematic diagram of a queue provided in an embodiment of the present application;

Fig. 3 is a schematic flowchart of a method for determining target weights provided in an embodiment of the present application;

Fig. 4 is a schematic flowchart of a method for determining weight correction values provided in an embodiment of the present application;

Fig. 5 is a mapping relationship between target weights and index numbers provided in an embodiment of the present application;

Fig. 6 is a logic diagram for determining target weights provided in an embodiment of the present application;

Fig. 7 is a logic diagram for determining the maximum number of pieces of common attribute information provided in an embodiment of the present application;

Fig. 8 is a logic diagram of a null attribute mechanism provided in an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a user behavior baseline generation apparatus provided in an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments, but not all embodiments, of the technical solutions of the present application. All other embodiments obtained by a person skilled in the art without any inventive step based on the embodiments described in the present application are within the scope of the protection of the present application.

For the sake of understanding, the technical terms referred to in this application will be described first:

Software Defined Perimeter (SDP): a new-generation network security model based on zero trust, which establishes a secure and trusted virtual perimeter on the Internet for an enterprise or organization in a software-defined manner.

Zero trust: a security concept that assumes the network environment has already been compromised and aims to reduce the uncertainty in the accuracy of the decision made for each access request in information systems and services.

User behavior baseline: a set of common attributes compiled from a user's behaviors during login, access, and similar processes, for example, a common IP set, a common login address set, and a common device set.

In the SDP framework, each terminal typically must be authenticated before connecting to the server, to ensure that every terminal connected to the server is allowed access. The SDP architecture hides core network assets and facilities so that they are not directly exposed on the Internet, which protects them from external security threats. With the SDP framework, distributed denial-of-service (DDoS) attacks, man-in-the-middle attacks, vulnerability scanning, and advanced persistent threats (APT) can all be prevented.

However, for abnormal authentication and authorization behaviors of a user after establishing a Transmission Control Protocol (TCP) connection (i.e., connecting with a server), such as abnormal network login, abnormal time slot access, and the like, it is difficult for the SDP framework based on zero trust to identify and intercept in real time, and how to improve the identification capability of the abnormal behaviors becomes a problem that needs to be solved at present.

In the related art, the user behavior baseline is generally generated in one of the following two ways.

The first is a machine-learning-based method for generating a user behavior baseline.

In the first method, the machine learning algorithm may be an isolation forest, K-means clustering, time-series analysis, or the like. However, such algorithms are complex to implement and require a large amount of sample data, so it is difficult to ensure accurate identification of abnormal behaviors when behavior data samples are insufficient. In addition, to meet real-time requirements in practice, the baseline must be recalculated every time the user behavior changes, which degrades identification performance.

The second is a statistics-based method for generating a user behavior baseline.

In the second method, for a given attribute, either the attribute information corresponding to the login time closest to the current time or the attribute information with the largest number of logins is used as the common attribute information. However, a behavior baseline with a single dimension (login time or number of logins) often cannot truly reflect abnormal behaviors, so the calculated common attribute information is inconsistent with the actual situation, and identification accuracy is difficult to guarantee.

In view of this, the embodiment of the present application provides a scheme for generating a user behavior baseline. In this scheme, for a target attribute, the current attribute information is added to the historical access record; target weights corresponding to the current attribute information and to each piece of historical attribute information contained in the historical access record are determined based on the order in which they were added and on the adding time interval between each piece of historical attribute information and the current attribute information; the target weights corresponding to attribute information with the same content are accumulated; and a user behavior baseline is generated based on the weight accumulated value.

Referring to fig. 1, a schematic flow chart of a user behavior baseline generation method provided in an embodiment of the present application is shown, where the method flow may be executed by a hardware facility with an arithmetic function, such as a computer, a chip, a processor, or a server, and the method flow specifically includes:

S101: in response to a login success operation triggered by a target object for a target service, acquiring current attribute information corresponding to a target attribute in the login success operation.

In the embodiment of the present application, the target object may be a single user or a user group, where the user group includes one or more users. If the target object is a user, the terminal, in response to the login success operation triggered by the user for the target service, acquires the current attribute information corresponding to the target attribute. If the target object is a group, the terminal, in response to the login success operation triggered by any user in the group for the target service, acquires the current attribute information corresponding to the target attribute.

The target service includes, but is not limited to, services such as authentication and authorization performed after the terminal is connected to the server. For the target service, when the user logs in successfully, the terminal determines that the target object has triggered a login success operation for the target service. A successful login may be, but is not limited to, successful verification of a password entered by the target object or of information such as the permissions of the target object.

The target attribute may be, but is not limited to, an IP address, a terminal identifier, or a MAC address. The attribute information may also be referred to as an attribute value. Herein, the description takes IP as the only example of the target attribute.

For example, assume that the target object is user A, who uses the IP 1.1.1.1. When the login succeeds, the current attribute information 1.1.1.1 corresponding to the target attribute IP is acquired in response to the login success operation triggered by user A for the target service.

For another example, assume that the target object is group x, user B is a user in group x, and user B uses the IP 2.2.2.2. When the login succeeds, the current attribute information 2.2.2.2 corresponding to the target attribute IP is acquired in response to the login success operation triggered by user B in group x for the target service.

S102: adding the current attribute information into a historical access record, where the historical access record includes the historical attribute information corresponding to the target attribute that was triggered by the target object for the target service.

As a possible implementation, the target object is a user, and the historical access record includes each piece of historical attribute information corresponding to the target attribute triggered by the user for the target service. Still taking user A as an example, the historical access record includes the historical IPs triggered by user A for the target service, such as 1.1.2.1, 1.1.1.1, and 1.1.6.1.

As another possible implementation, the target object is a group, and the historical access record includes each piece of historical attribute information corresponding to the target attribute triggered by all users in the group for the target service. Still taking group x as an example, assume that group x includes user B and user C; the historical access record then includes each historical IP triggered by user B for the target service and each historical IP triggered by user C for the target service.
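As a minimal sketch of the two implementations above, the historical access record can be keyed by the target object, so that a single user and a group each own one record per target attribute. The names `record_key` and `record_login`, the tuple key layout, and the IP 3.3.3.3 for user C are hypothetical and are not taken from the application.

```python
from collections import defaultdict

def record_key(user, group=None):
    """Return the owner of the record: the group if one is given, else the user."""
    return ("group", group) if group is not None else ("user", user)

def record_login(histories, attribute, value, user, group=None):
    """Append one attribute value to the record owned by the target object."""
    histories[(record_key(user, group), attribute)].append(value)

histories = defaultdict(list)
record_login(histories, "ip", "1.1.1.1", user="A")             # target object: user A
record_login(histories, "ip", "2.2.2.2", user="B", group="x")  # target object: group x
record_login(histories, "ip", "3.3.3.3", user="C", group="x")  # hypothetical IP for user C
```

With this layout, logins by user B and user C land in the same record for group x, while user A keeps a separate record.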

In this embodiment of the present application, the history access record may be implemented by a First In First Out (FIFO) queue, and may also be implemented in a table format, but is not limited thereto. In this context, the queue is only used as an example for illustration.

Referring to Fig. 2, which is a schematic diagram of a queue provided in this embodiment, the queue holds 20 pieces of IP attribute information (IP0, IP1, IP2, and so on) whose index numbers run from 0 to 19. Taking only IP0, IP1, and IP2 as examples, they are 1.1.2.1, 1.1.1.1, and 1.1.2.1, respectively. When the current attribute information is added to the queue and the queue is full, the IP attribute information that has been in the queue the longest is removed from the queue.
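The queue behavior described here can be sketched with a bounded double-ended queue. This is only an illustration under stated assumptions: the newest record is placed at index 0 (matching the later examples in which IP0 is the current attribute information), each record stores a hypothetical `(value, joined_at)` pair, and the sample IPs and timestamps are invented.

```python
from collections import deque

QUEUE_LENGTH = 20  # queue length from the Fig. 2 example

def add_to_history(history, value, joined_at):
    """Add the newest record at index 0; a full deque drops its oldest record."""
    history.appendleft((value, joined_at))

history = deque(maxlen=QUEUE_LENGTH)
# Hypothetical logins: 25 IPs arriving one second apart.
for second in range(25):
    add_to_history(history, f"10.0.0.{second}", joined_at=float(second))

newest_value, _ = history[0]   # most recently added record
oldest_value, _ = history[-1]  # record that has been queued the longest
print(len(history), newest_value, oldest_value)
```

Because `deque(maxlen=...)` discards items from the opposite end when a new item is added to a full queue, no explicit eviction step is needed.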

It should be noted that, in the embodiment of the present application, the index numbers of the queues are sorted from small to large, and in the actual application process, the index numbers of the queues may also be sorted from large to small, which is not limited.

S103, determining target weights corresponding to the current attribute information and the historical attribute information respectively based on the adding sequence of the current attribute information and the historical attribute information and the adding time interval between the historical attribute information and the current attribute information respectively.

S104: accumulating the target weights corresponding to the attribute information with the same content in the current attribute information and the historical attribute information, and determining at least one piece of common attribute information from the current attribute information and the historical attribute information based on the weight accumulated value.

S105: generating a user behavior baseline based on the target attribute and the at least one piece of common attribute information.

Specifically, when the common attribute information is determined in S104, the following manners may be adopted, but are not limited thereto:

regarding the attribute information with the same content in the current attribute information and each historical attribute information, when the weight accumulated value is not less than a preset weight threshold value, taking the attribute information with the same content as the common attribute information;

and regarding other attribute information except the attribute information with the same content in the current attribute information and the historical attribute information, when the target weight of the other attribute information is not less than a preset weight threshold value, taking the other attribute information as the common attribute information.

For example, the weight accumulation value may be calculated by the following formula (1):

$$\mathrm{sum} = \sum \mathrm{weight} \tag{1}$$

Here, sum represents the accumulated value of the target weights corresponding to the attribute information with the same content, and weight represents the target weight of each piece of attribute information with that content.

For example, assume that in the queue shown in Fig. 2, IP0 is the current attribute information and IP1 to IP19 are the historical attribute information. If IP0, IP2, and IP3 are all 1.1.2.1, then for the IP address 1.1.2.1 the weight accumulated value is the sum of the target weights corresponding to IP0, IP2, and IP3, and this accumulated value is compared with the preset weight threshold. If only IP1 in the queue is 1.1.1.1, the target weight of IP1 is compared directly with the preset weight threshold.

The preset weight threshold may be set to the queue length. In that case, a piece of attribute information can become common attribute information as soon as it enters the queue; as new attribute information enters the queue, its target weight decreases, and it may then become uncommon attribute information.
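The accumulation and threshold comparison can be sketched as follows. The function name `common_attributes` is hypothetical, and the sample weights are illustrative values rather than figures from the application. A value that appears only once is handled by the same code path, since its accumulated value is simply its own target weight.

```python
from collections import defaultdict

def common_attributes(values, weights, threshold):
    """Sum the target weights of entries with identical content and keep
    every value whose accumulated weight is not less than the threshold."""
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    return {value for value, total in totals.items() if total >= threshold}

# Illustrative queue head: IP0, IP2 and IP3 share the content 1.1.2.1.
values = ["1.1.2.1", "1.1.1.1", "1.1.2.1", "1.1.2.1"]
weights = [20, 19, 18, 17]  # assumed target weights for index 0..3
baseline_ips = common_attributes(values, weights, threshold=20)
print(baseline_ips)
```

In this example 1.1.2.1 accumulates 20 + 18 + 17 = 55 and is kept, while 1.1.1.1 has a single weight of 19, which is below the threshold of 20.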

In the embodiment of the application, after current attribute information is added into a historical access record aiming at target attributes, target weights corresponding to the current attribute information and the historical attribute information are determined based on the adding sequence of the historical attribute information and the current attribute information contained in the historical access record and the adding time interval between the historical attribute information and the current attribute information, then the target weights corresponding to the attribute information with the same content in the current attribute information and the historical attribute information are accumulated, and a user behavior baseline is generated based on the weight accumulated value. Therefore, through the adding sequence and the corresponding time interval of each historical attribute information and the current attribute information, the behavior baseline on the time dimension can be obtained, the behavior baseline on the use frequency dimension can be obtained by accumulating the target weights corresponding to the attribute information with the same content, and the generated user behavior baseline has sensitivity to the time and the use frequency at the same time, so that the accuracy of the user behavior baseline is improved, the abnormal behavior identification accuracy is improved, meanwhile, the calculation complexity is reduced, the generation efficiency of the user behavior baseline is improved, and the abnormal behavior identification performance is improved.

In some embodiments, considering that the longer a piece of attribute information goes unused, the faster its weight should decrease, a weight correction value may be introduced so that the influence of time on the weight is adjusted dynamically. Specifically, when S103 is executed, referring to Fig. 3, the following steps are performed for the current attribute information and for each piece of historical attribute information:

S1031: taking one of the current attribute information and the historical attribute information as target attribute information, and determining an initial weight corresponding to the target attribute information based on the adding sequence.

Because the attribute information in the queue is sorted according to the enqueue time, the adding sequence of the current attribute information and the historical attribute information in the queue is the sequence of the current attribute information and the historical attribute information in the queue. For example, the initial weight corresponding to the target attribute information may be a difference between the queue length and the index number.

For example, in the queue in fig. 2, IP0 is current attribute information, IP1 to IP19 are historical attribute information, the initial weight corresponding to IP0 is 20-0=20, the initial weight corresponding to IP1 is 20-1=19, the initial weight corresponding to IP2 is 20-2=18, and similarly, the initial weights corresponding to IP3 to IP19 are 17 to 1, respectively.

S1032: determining the weight correction value corresponding to the target attribute information based on the adding time interval between the current attribute information and the target attribute information.

In some embodiments, considering the influence of the adding time interval between the current attribute information and each piece of historical attribute information on the target weight, referring to Fig. 4, the following steps may be adopted when executing S1032:

S10321: determining a reference time interval corresponding to the target attribute information based on the adding time interval between the target attribute information and adjacent attribute information, where the adjacent attribute information is the attribute information adjacent to the target attribute information in the adding order.

Specifically, when S10321 is executed, there are two cases:

Case one: the target attribute information is the current attribute information or the historical attribute information that was added to the historical access record earliest, that is, the target attribute information is located at the head or the tail of the queue. In this case, the target attribute information has only one piece of adjacent attribute information, and the adding time interval between the target attribute information and that adjacent attribute information is used as the reference time interval corresponding to the target attribute information.

For example, in the queue of fig. 2, the historical attribute information added to the historical access record at the earliest time is IP19, the current attribute information is IP0, the target attribute information is IP0, the adjacent attribute information of IP0 is IP1, and the adding time interval between IP0 and IP1 is taken as the reference time interval corresponding to IP 0.

For another example, assume that the target attribute information is IP19, the adjacent attribute information of IP19 is IP18, and the joining time interval between IP19 and IP18 is set as the reference time interval corresponding to IP 19.

Case two: the target attribute information is neither the current attribute information nor the historical attribute information that was added to the historical access record earliest, that is, the target attribute information is located anywhere other than the head or the tail of the queue. In this case, the target attribute information has two pieces of adjacent attribute information, and the average value of the adding time interval between the two corresponding adjacent pieces of attribute information is used as the reference time interval corresponding to the target attribute information.

For example, the reference time interval may be determined using the following formula (2):

$$T(\mathrm{index}) = \begin{cases} |t_{0} - t_{1}|, & \mathrm{index} = 0 \\ \dfrac{|t_{\mathrm{index}-1} - t_{\mathrm{index}+1}|}{2}, & 0 < \mathrm{index} < L-1 \\ |t_{L-2} - t_{L-1}|, & \mathrm{index} = L-1 \end{cases} \tag{2}$$

where $T(\mathrm{index})$ represents the reference time interval corresponding to the target attribute information, index is the index number of the target attribute information in the queue, and $t_{\mathrm{index}-1}$ and $t_{\mathrm{index}+1}$ respectively represent the adding times of the two adjacent pieces of attribute information. If the length of the queue is $L$, the value range of index is $[0, L-1]$.

For example, in the queue of Fig. 2, if the target attribute information is IP1, the adjacent attribute information of IP1 is IP0 and IP2, and the average value of the adding time interval between IP0 and IP2 is used as the reference time interval corresponding to IP1. Assuming that the adding time of IP0 is $t_0$ and the adding time of IP2 is $t_2$, the reference time interval corresponding to IP1 is $T(1) = |t_0 - t_2| / 2$.
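The two cases can be sketched as one small function. It assumes that `join_times[0]` belongs to the current attribute information, that head and tail entries use the interval to their single neighbour, and that interior entries use half of the span between their two neighbours. The function name and the sample timestamps are hypothetical.

```python
def reference_interval(join_times, index):
    """Reference time interval for the entry at `index`.

    join_times[0] is the adding time of the current (newest) entry and
    join_times[-1] that of the earliest entry still in the queue.
    """
    last = len(join_times) - 1
    if last < 1:
        raise ValueError("at least two entries are needed to form an interval")
    if index == 0:     # head of the queue: only one neighbour
        return abs(join_times[0] - join_times[1])
    if index == last:  # tail of the queue: only one neighbour
        return abs(join_times[last] - join_times[last - 1])
    # interior entry: average interval across the two neighbours
    return abs(join_times[index - 1] - join_times[index + 1]) / 2

# Hypothetical adding times in seconds, newest first.
join_times = [100.0, 90.0, 70.0, 65.0]
intervals = [reference_interval(join_times, i) for i in range(len(join_times))]
print(intervals)
```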

S10322, determining a weight correction value corresponding to the target attribute information based on the joining time interval between the current attribute information and the target attribute information and based on the reference time interval.

Specifically, when S10322 is executed, there are two cases:

Case A: if the adding time interval between the current attribute information and the target attribute information is less than the reference adding duration, the weight correction value corresponding to the target attribute information is determined to be a first set value. For example, the first set value may be 0.

Case B: if the adding time interval between the current attribute information and the target attribute information is not less than the reference adding duration, the smaller of a second set value and a difference ratio is used as the weight correction value corresponding to the target attribute information, where the difference ratio is the ratio of the difference between the corresponding adding time interval and the reference adding duration to the reference time interval.

The reference adding duration is determined according to the reference time interval and the position of the target attribute information in the adding sequence, namely the reference adding duration is determined according to the reference time interval and the index number of the target attribute information.

For example, the weight correction value corresponding to the target attribute information may be determined by using the following formula (3):

where $c(index, t_{index})$ denotes the weight correction value corresponding to the target attribute information, index is the index number of the target attribute information in the queue, $t_{index}$ denotes the joining time of the target attribute information, T denotes the reference time interval corresponding to the target attribute information, $t_0$ denotes the joining time of the current attribute information (index number 0), and min denotes the minimum function.

For example, in the queue of fig. 2, it is assumed that the target attribute information is IP0, and the weight correction value corresponding to IP0 is 0.

For another example, in the queue of fig. 2, assume that the target attribute information is IP3 and that $|t_0 - t_3| - 3T \ge 2T$ holds; the weight correction value corresponding to IP3 is then 2.

For another example, in the queue of fig. 2, assume that the target attribute information is IP4 and that $0 < |t_0 - t_4| - 4T \le T$ holds; the weight correction value corresponding to IP4 is then 1.
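A minimal Python sketch of Cases A and B follows. It rests on assumptions chosen so that the examples above are reproduced: the reference adding duration is taken as index × T, the excess over that duration is measured in units of T and rounded up, and the first and second set values are 0 and 2. The function name, parameters and sample times are hypothetical, not taken from the application.

```python
import math


def weight_correction(t_current: float, t_target: float, index: int,
                      ref_interval: float, first_value: int = 0,
                      second_value: int = 2) -> int:
    """Weight correction value c(index, t_index) under the stated assumptions."""
    if ref_interval <= 0:
        raise ValueError("reference time interval must be positive")
    elapsed = abs(t_current - t_target)        # |t_0 - t_index|
    reference_duration = index * ref_interval  # assumed to be index * T
    if elapsed < reference_duration:
        # Case A: the entry joined more recently than the reference duration.
        return first_value
    # Case B: excess in units of T, rounded up (assumption), then capped.
    excess = elapsed - reference_duration
    return min(second_value, math.ceil(excess / ref_interval))


T = 10.0  # hypothetical reference time interval
print(weight_correction(100.0, 100.0, 0, T))  # IP0
print(weight_correction(100.0, 50.0, 3, T))   # IP3: excess is 2T
print(weight_correction(100.0, 55.0, 4, T))   # IP4: excess is between 0 and T
```

Rounding up is what makes an excess in (0, T] yield 1, as in the IP4 example; a different rounding rule would change that boundary.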

And S1033, determining the target weight corresponding to the target attribute information based on the initial weight and the weight correction value.

As a possible implementation manner, the target weight corresponding to the target attribute information may be determined directly based on the initial weight and the weight correction value, and for example, the target weight corresponding to the target attribute information may be calculated by using the following formula (4):

$weight = f(index, t_{index}) = length - index - c(index, t_{index})$    Formula (4)

where weight denotes the target weight corresponding to the target attribute information, $c(index, t_{index})$ denotes the weight correction value corresponding to the target attribute information, length denotes the queue length, index denotes the index number of the target attribute information, and the difference between length and index is the initial weight corresponding to the target attribute information.

For example, in the queue in fig. 2, if the target attribute information is IP0, the weight correction value corresponding to IP0 is 0, the initial weight corresponding to IP0 is 20, and the target weight corresponding to IP0 is 20-0-0=20.

For another example, in the queue in fig. 2, the target attribute information is IP3, the initial weight corresponding to IP3 is 17, and it is assumed that the weight correction value corresponding to IP3 is 2, and the target weight corresponding to IP3 is 20-3-2=15.

For another example, in the queue in fig. 2, the target attribute information is IP4, the initial weight corresponding to IP4 is 16, and it is assumed that the weight correction value corresponding to IP4 is 1, and the target weight corresponding to IP4 is 20-4-1=15.
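The three worked examples can be checked with a short Python sketch of formula (4). The correction value is passed in directly, as in the examples; the function name `target_weight` is illustrative only.

```python
def target_weight(queue_length: int, index: int, correction: int) -> int:
    """Formula (4): weight = length - index - c(index, t_index)."""
    initial_weight = queue_length - index  # depends only on the adding order
    return initial_weight - correction


# (index, assumed correction value) for IP0, IP3 and IP4 in a queue of length 20.
for index, correction in [(0, 0), (3, 2), (4, 1)]:
    print(f"IP{index}: {target_weight(20, index, correction)}")
```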

Clearly, length - index reflects the influence of the number of newly entered IPs on the weights of the IPs already in the queue, while $c(index, t_{index})$ corrects the weight change to reflect the influence of the time intervals between all enqueued IPs.

It is to be understood that the current attribute information in the present application may also be referred to as new IP attribute information, and the information content of the new IP attribute information may be the same as one or more pieces of historical attribute information in each piece of historical attribute information, or may be different from each piece of historical attribute information.

Referring to fig. 5, when the c function is not considered, the index number is the only parameter in the calculation of the target weight, and the target weight decreases linearly. This suits the case where the time interval Δt between entries of IP attribute information is constant, that is: one new piece of IP attribute information entering + entry at equal time intervals = each IP weight in the queue minus 1. Because new IP attribute information can only enter the queue one piece at a time, the number of entered pieces grows linearly, and the influence of each new piece on the target weight can be regarded as constant.

However, the decrease of the target weight is affected both by the number of entries of new IP attribute information and by the variance $\sigma^2$ of the time intervals Δt between entries of all IP attribute information; the closer $\sigma^2$ is to 0, the closer the entries are to equal-interval entry. Here,

$\sigma^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i - \mu)^2$

where x is the variable value, i.e., Δt, M is the number of variables, and μ is the mean of the time intervals Δt. The time interval Δt of a piece of IP attribute information may be the interval between the time it joined the queue and the time the previous piece of attribute information joined the queue.
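As an illustration, the variance of the joining intervals can be computed with Python's standard `statistics` module. The sketch assumes the population form of the variance (division by M) and joining times listed oldest first; the function name and the sample times are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def interval_variance(join_times: Sequence[float]) -> float:
    """Population variance of the intervals between consecutive joining times.

    `join_times` must be ordered oldest first and hold at least two values.
    """
    intervals: List[float] = [
        later - earlier for earlier, later in zip(join_times, join_times[1:])
    ]
    return pvariance(intervals)


# Equal intervals: variance 0, so weights decay linearly with the index number.
print(interval_variance([0.0, 10.0, 20.0, 30.0]))
# A long gap followed by a burst of logins: large variance.
print(interval_variance([0.0, 28.0, 29.0, 30.0]))
```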

When the actual time at which a piece of IP attribute information enters the queue differs greatly from the estimated time, $\sigma^2$ is considered to be large. If $\sigma^2$ is large, the target weight is strongly affected by the time interval, so that: new IP attribute information entering within a short time + a large time-interval variance $\sigma^2$ = the target weight of the corresponding IP attribute information in the queue is unchanged. Therefore, when the time interval Δt of incoming new IP attribute information is short and the variance of the time intervals of the historical IP attribute information is large, the weight of the historical IP attribute information in the queue does not change even though new IP attribute information has entered.

Referring to fig. 6, take the IP 10.10.0.1 as an example. Suppose its current index value is 3, i.e., IP3 is 10.10.0.1, and $|t_0 - t_3| - 3T \ge 2T$ holds; the target weight corresponding to IP3 is then L-3-2. After one new piece of IP attribute information enters, the index value of 10.10.0.1 becomes 4, i.e., IP4 is 10.10.0.1, and $0 < |t_0 - t_4| - 4T \le T$ holds; the target weight corresponding to IP4 is then L-4-1. After another new piece of IP attribute information enters, the index value of 10.10.0.1 becomes 5, i.e., IP5 is 10.10.0.1, and $|t_0 - t_5| \le 5T$ holds; the target weight corresponding to IP5 is then L-5-0. It can be seen that although two new pieces of IP attribute information entered in succession, the target weight of 10.10.0.1 remained L-5 throughout and did not change.

This situation does not continue indefinitely. After new IP attribute information has entered at small time intervals Δt, $\sigma^2$ gradually decreases toward zero, approaching the case of entry at equal time intervals; therefore, after remaining constant for a period of time, the target weight of the IP attribute information in the queue starts to decrease linearly.

In some embodiments, different practical scenarios call for different degrees of difficulty in becoming a common IP. For example, users with a low authorization level are numerous and pose a low security risk, so lowering the difficulty of becoming a common IP for them improves user experience and operation and maintenance efficiency. To allow this adjustment, performing S2033 specifically includes:

acquiring preset easiness for target attribute information;

and determining the target weight corresponding to the target attribute information based on the initial weight and the weight correction value and based on the easiness.

For example, the target weight corresponding to the target attribute information may be calculated by using the following formula (5):

$weight = f(index, t_{index}) + N$    Formula (5)

where weight denotes the target weight, $f(index, t_{index})$ is calculated using formula (4), and N is an integer not less than 0. N represents how easily a piece of attribute information becomes common attribute information: the larger the value of N, the more easily a piece of attribute information becomes common attribute information. When N is larger, the target weight of each piece of IP attribute information in the queue is larger and reaches the threshold more easily, so the IP attribute information in the queue becomes a common IP more easily and the maximum number of common IPs is larger.

Referring to fig. 7, when the preset weight threshold is equal to the queue length L, suppose that in a queue of length L, without considering the c function, N+1 pieces of attribute information have target weights of at least L (the weights from L+N down to L). Among the remaining L-(N+1) pieces of attribute information, the initial weights of the head and tail entries sum to (1+N)+(L-1) = L+N > L, so the remaining L-(N+1) pieces can be paired head-to-tail one by one, giving at most (L-(N+1))/2 pairs. Therefore, the maximum number of common attribute information can be calculated by the following formula (6):

where max denotes the maximum number of pieces of common attribute information; when the target attribute is IP, max may also be called the maximum number of common IPs. Clearly, the number of common IPs varies dynamically within the interval [0, max].
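The pairing argument can be checked numerically. The Python sketch below compares a closed form implied by that argument, (N+1) plus the number of head-to-tail pairs, against an explicit pairing of the weights. It assumes the pair count is rounded down, that the c function is ignored, and that an accumulated weight not less than the threshold counts as common; both function names are hypothetical.

```python
from typing import Optional


def max_common_closed_form(length: int, ease: int) -> int:
    """(N + 1) single entries plus the head-to-tail pairs among the rest."""
    singles = min(ease + 1, length)
    return singles + (length - singles) // 2


def max_common_by_pairing(length: int, ease: int,
                          threshold: Optional[int] = None) -> int:
    """Count common attributes by pairing the heaviest and lightest entries."""
    if threshold is None:
        threshold = length  # preset weight threshold equal to the queue length
    # Weights without the c function: L - index + N, in descending order.
    weights = [length - index + ease for index in range(length)]
    count = sum(1 for w in weights if w >= threshold)
    rest = [w for w in weights if w < threshold]
    head, tail = 0, len(rest) - 1
    while head < tail:
        if rest[head] + rest[tail] >= threshold:
            count += 1   # the two entries share one content and qualify together
            head += 1
        tail -= 1        # the lightest remaining entry is used up or discarded
    return count


print(max_common_closed_form(20, 0), max_common_by_pairing(20, 0))
```

With L = 20 and N = 0 both functions give 10, which agrees with an upper bound of about L/2.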

In some embodiments, since the number of commonly used IPs is dynamically changed, in order to solve the problem that the meaning of the commonly used IPs is lost due to an excessive number of commonly used IPs, a null attribute mechanism is proposed, and specifically, when S202 is executed, the following manner may be adopted:

determining the number of the current-time common attribute information based on the number of the previous-time common attribute information, and determining a common information number threshold based on the number of the current-time common attribute information, wherein the current time refers to the time of acquiring the current attribute information, and the previous time refers to the latest entry time of each historical attribute information;

if the number of the common attribute information at the current moment is larger than the threshold value of the number of the common attribute information, adding the current attribute information into the historical access record after adding the specified attribute information into the historical access record.

For example, the specified attribute information may be null attribute information.

In the embodiment of the present application, a common information number threshold H is set. When count is greater than or equal to H and new IP attribute information is about to enter the queue, a piece of IP attribute information whose value is a set value (the specified attribute information) is entered first, and then the new IP attribute information is entered. The specified attribute information thus affects the calculation of the target weights of all IP attribute information in the queue and makes the target weights decay faster, which controls the number of common IP attribute information and makes the determination of common IPs more accurate.
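A minimal Python sketch of this enqueue rule, using a bounded `collections.deque` as the queue. It assumes the newest entry sits at index 0, that `None` stands in for the null attribute information, and that the comparison is count ≥ H; `add_access`, `NULL_ATTRIBUTE` and the sample addresses are hypothetical names for illustration.

```python
from collections import deque
from typing import Deque, Optional

NULL_ATTRIBUTE = None  # hypothetical marker for the null attribute information


def add_access(history: Deque[Optional[str]], attribute: str,
               common_count: int, threshold_h: float) -> None:
    """Add `attribute` at the head of the queue (index 0 is the newest).

    If the current number of common attributes has reached the threshold H,
    a null entry is pushed first, which shifts every existing entry one
    extra position toward the tail and lowers its weight by one more.
    """
    if common_count >= threshold_h:
        history.appendleft(NULL_ATTRIBUTE)
    # A full deque with maxlen drops its oldest entry from the right end.
    history.appendleft(attribute)


history: Deque[Optional[str]] = deque(["10.0.0.2", "10.0.0.1"], maxlen=4)
add_access(history, "10.0.0.3", common_count=3, threshold_h=3)
print(list(history))  # ['10.0.0.3', None, '10.0.0.2', '10.0.0.1']
```

Because the deque is bounded, the null entry eventually falls off the tail on its own, so its effect on the weights is temporary.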

Specifically, the common data number threshold is calculated by using the following formula (7) and formula (8):

where H denotes the common information number threshold, $\overline{count}_t$ denotes the mean number of common attribute information at the current time, $count_{max}$ denotes the historical maximum number of common attribute information, α and β are preset parameters, $\overline{count}_{t-1}$ denotes the mean number of common attribute information at the previous time, and count denotes the number of common attribute information at the current time. Exemplarily, α = 0.125 and β = 5.

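As can be seen from the formulas above, $\overline{count}_t$ is a mean determined by the historical numbers of common IP attribute information, and the more recently a count was recorded, the more it affects $\overline{count}_t$. Assume that $\overline{count}_0$ is 0, $\overline{count}_1$ is the mean number of common attribute information at the first time, $\overline{count}_2$ is the mean number of common attribute information at the second time, and $count_1$ is the number of common attribute information at the first time. The derivation is as follows:

$\overline{count}_1 = (1-\alpha)\,\overline{count}_0 + \alpha\,count_1 = \alpha\,count_1$

$\overline{count}_2 = (1-\alpha)\,\overline{count}_1 + \alpha\,count_2 = (1-\alpha)\alpha\,count_1 + \alpha\,count_2$

$\overline{count}_3 = (1-\alpha)\,\overline{count}_2 + \alpha\,count_3 = (1-\alpha)^2\alpha\,count_1 + (1-\alpha)\alpha\,count_2 + \alpha\,count_3$

Since α < 1 and (1-α) < 1, it follows that $(1-\alpha)^2\alpha < (1-\alpha)\alpha < \alpha$; clearly, the most recently recorded $count_3$ has the greatest influence on $\overline{count}_3$.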
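The recency weighting can be illustrated with a short Python sketch of an exponentially weighted running mean. It assumes the update rule mean_t = (1 - α)·mean_(t-1) + α·count with an initial mean of 0, which yields the coefficients α, (1-α)α and (1-α)²α; the threshold H itself is not computed here, and the function names and sample counts are hypothetical.

```python
from typing import List, Sequence


def update_mean(previous_mean: float, count: int, alpha: float = 0.125) -> float:
    """One step of the running mean of the common-attribute count."""
    return (1 - alpha) * previous_mean + alpha * count


def count_coefficients(steps: int, alpha: float = 0.125) -> List[float]:
    """Weight that each of the last `steps` counts carries, oldest first."""
    return [alpha * (1 - alpha) ** (steps - 1 - k) for k in range(steps)]


def running_mean(counts: Sequence[int], alpha: float = 0.125) -> float:
    mean = 0.0  # assumed initial mean of 0
    for count in counts:
        mean = update_mean(mean, count, alpha)
    return mean


counts = [4, 6, 8]  # hypothetical counts at the first, second and third time
coefficients = count_coefficients(len(counts))
print(coefficients)          # increasing: the newest count weighs the most
print(running_mean(counts))
```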

It should be noted that, in the embodiment of the present application, one weighting function may be set to reduce the target weights of all the IP attribute information, but the null attribute mechanism has the following advantages over the case where one weighting function is set to reduce the target weights of all the IP attribute information:

First, the influence of the null attribute mechanism on the target weights is time-limited. When the number of common IP attribute information reaches the time-related threshold (i.e., H), a null IP is entered into the queue, so that the target weights of all IP attribute information are reduced by 1. As new IP attribute information enters, the null IP moves toward the tail of the queue and its influence on the whole queue decreases as its index number grows; the null IP is eventually moved out of the queue, which automatically clears its influence. By contrast, directly reducing the target weights of all IP attribute information would require setting the time during which the change takes effect, which increases the design complexity of the scheme.

Second, the null attribute mechanism is dynamically adjustable: it varies with the number of common IP attribute information. When the number of common IP attribute information reaches the threshold, a null IP is entered and the target weights of all IP attribute information are reduced by 1; if the number of common IP attribute information has not decreased afterwards, another null IP is entered and the target weights are affected again.

Finally, referring to fig. 8, the impact of the null attribute mechanism on the target weights of all IP attribute information is not uniform. Since each empty IP will only affect the IP attribute information that entered the queue before it, if there are multiple empty IPs in this queue, their effect on the target weight of the IP attribute information in the queue is staged, with the degree of overlap increasing with increasing index number. If the segmentation is directly performed according to the index number, a segmentation function which affects the target weight of the IP attribute information is set, and the complexity of scheme design is also increased.

In the embodiment of the present application, the time complexity of the scheme is O(n) and the space complexity is O(n). The scheme uses a simple queue model and an uncomplicated weight calculation algorithm, so the technical scheme is relatively simple. In addition, the counted number of common attribute information changes dynamically, which can meet users' personalized requirements.

Furthermore, because the size of the queue that stores the data is limited, the number of common attribute information lies in the range [1, L/2] according to the foregoing analysis. Meanwhile, owing to the null attribute mechanism, once the number of common attribute information reaches the corresponding threshold, its growth can be slowed or even reversed, so that the number is kept within a reasonable range and the accuracy of the determined common attribute information is improved.

Furthermore, common attribute information for other attributes, such as login devices and login addresses, can also be screened with this user behavior baseline generation method, which gives high reusability and avoids designing redundant schemes for different attributes.

Based on the same inventive concept, referring to fig. 9, an embodiment of the present application provides a user behavior baseline generation apparatus, including:

an obtaining unit 901, configured to obtain, in response to a login success operation triggered by a target object for a target service, current attribute information corresponding to a target attribute in the login success operation;

a recording unit 902, configured to add the current attribute information to a historical access record, where the historical access record includes: each piece of historical attribute information corresponding to the target attribute that the target object has triggered for the target service;

a determining unit 903, configured to determine, based on the adding order of the current attribute information and each of the historical attribute information, and based on an adding time interval between each of the historical attribute information and the current attribute information, a target weight corresponding to each of the current attribute information and each of the historical attribute information;

an accumulation unit 904, configured to accumulate target weights corresponding to attribute information with the same content in the current attribute information and the historical attribute information, and determine, based on a weight accumulation value, at least one piece of frequently-used attribute information from the current attribute information and the historical attribute information;

a generating unit 905, configured to generate a user behavior baseline based on the target attribute and the at least one piece of common attribute information.

As a possible implementation manner, when determining the target weights corresponding to the current attribute information and the historical attribute information respectively based on the adding order of the current attribute information and the historical attribute information and based on the adding time interval between each piece of historical attribute information and the current attribute information, the determining unit 903 is specifically configured to:

for the current attribute information and the historical attribute information, respectively executing the following operations:

taking the current attribute information and one attribute information in the historical attribute information as target attribute information, and determining an initial weight corresponding to the target attribute information based on the adding sequence;

determining a weight correction value corresponding to the target attribute information based on the adding time interval between the current attribute information and the target attribute information;

and determining the target weight corresponding to the target attribute information based on the initial weight and the weight correction value.

As a possible implementation manner, when determining the weight correction value corresponding to the target attribute information based on the joining time interval between the current attribute information and the target attribute information, the determining unit 903 is specifically configured to:

determining a reference time interval corresponding to the target attribute information based on the joining time interval between the target attribute information and the adjacent attribute information; the adjacent attribute information refers to attribute information adjacent to the target attribute information according to the adding sequence;

and determining a weight correction value corresponding to the target attribute information based on the joining time interval between the current attribute information and the target attribute information and based on the reference time interval.

As a possible implementation manner, when determining the reference time interval corresponding to the target attribute information based on the joining time interval between the target attribute information and the adjacent attribute information, the determining unit 903 is specifically configured to:

if the target attribute information is the current attribute information or the historical attribute information added into the historical access record at the earliest time, taking the adding time interval between the target attribute information and one corresponding adjacent attribute information as a reference time interval corresponding to the target attribute information;

and if the target attribute information is not the current attribute information or the historical attribute information added into the historical access record at the earliest time, taking the average value of the enqueue time intervals between the target attribute information and two corresponding adjacent attribute information as the reference time interval corresponding to the target attribute information.

As a possible implementation manner, when determining the weight correction value corresponding to the target attribute information based on the joining time interval between the current attribute information and the target attribute information and based on the reference time interval, the determining unit 903 is specifically configured to:

if the adding time interval between the current attribute information and the target attribute information is less than the reference adding time length, determining the weight correction value corresponding to the target attribute information as a first set numerical value;

if the adding time interval between the current attribute information and the target attribute information is not less than the reference adding time length, taking the minimum value of a second set value and a difference value as a weight correction value corresponding to the target attribute information, wherein the difference value is the ratio of the difference value between the corresponding adding time interval and the reference adding time length to the reference adding time length;

wherein the reference join duration is determined according to the reference time interval and a position of the target attribute information in the join order.

As a possible implementation manner, when determining the target weight corresponding to the target attribute information based on the initial weight and the weight correction value, the determining unit 903 is specifically configured to:

acquiring the preset easiness degree aiming at the target attribute information;

and determining the target weight corresponding to the target attribute information based on the initial weight, the weight correction value and the easiness.

As a possible implementation manner, when the current attribute information is added to the historical access record, the recording unit 902 is specifically configured to:

determining the number of the current-time common attribute information based on the number of the previous-time common attribute information, and determining a common data information number threshold based on the number of the current-time common attribute information, wherein the current time refers to the time of acquiring the current attribute information, and the previous time refers to the latest entry time of each historical attribute information;

if the number of the common attribute information at the current moment is larger than the threshold value of the number of the common data information, adding the current attribute information into the historical access record after adding the specified attribute information into the historical access record.

As a possible implementation manner, the common data number threshold is calculated by using the following formula:

where H denotes the common data number threshold, $\overline{count}_t$ denotes the mean number of common attribute information at the current time, $count_{max}$ denotes the historical maximum number of common attribute information, α and β are preset parameters, $\overline{count}_{t-1}$ denotes the mean number of common attribute information at the previous time, and count denotes the number of common attribute information at the current time.

As a possible implementation manner, when determining at least one piece of common attribute information from the current attribute information and the historical attribute information based on the weight accumulated value, the accumulating unit 904 is specifically configured to:

for attribute information with the same content among the current attribute information and the historical attribute information, when the weight accumulated value is not less than a preset weight threshold, taking the attribute information with the same content as common attribute information;

and for the other attribute information among the current attribute information and the historical attribute information, apart from the attribute information with the same content, when the target weight of the other attribute information is not less than the preset weight threshold, taking the other attribute information as common attribute information.
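The accumulation and threshold rule can be sketched in Python as follows. This is a simplified illustration, assuming the queue is a list ordered newest first, that `None` marks a null attribute entry, that the queue length is the current number of entries, and that the weight correction values are supplied by the caller (zero if omitted); the function name and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Set


def common_attributes(history: Sequence[Optional[str]], weight_threshold: int,
                      ease: int = 0,
                      corrections: Optional[Sequence[int]] = None) -> Set[str]:
    """Accumulate target weights per content and keep those reaching the threshold."""
    length = len(history)
    totals: Dict[str, int] = defaultdict(int)
    for index, value in enumerate(history):
        if value is None:
            continue  # a null entry occupies a slot but is never a common attribute
        correction = corrections[index] if corrections is not None else 0
        # Target weight: length - index - correction, plus the ease N.
        totals[value] += length - index - correction + ease
    return {value for value, total in totals.items() if total >= weight_threshold}


# Hypothetical queue, newest first; the weights are 5, 4, 3, 2, 1.
history = ["A", "B", "A", "C", "B"]
print(sorted(common_attributes(history, weight_threshold=5)))  # ['A', 'B']
```

Here "A" accumulates 5 + 3 = 8 and "B" accumulates 4 + 1 = 5, so both reach the threshold, while "C" with a weight of 2 does not.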

Based on the same inventive concept, referring to fig. 10, which is a schematic structural diagram of an electronic device provided in an embodiment of the present application, the electronic device includes: a processor 101, a communication interface 102, a memory 103 and a communication bus 104, where the processor 101, the communication interface 102 and the memory 103 communicate with each other through the communication bus 104;

the memory 103 has stored therein a computer program which, when executed by the processor 101, causes the processor 101 to perform a user behavior baseline generation method as in fig. 1, 3 or 4.

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface 102 is used for communication between the above-described electronic device and other devices.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application-specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

Based on the same inventive concept, embodiments of the present application provide a computer-readable storage medium, in which a computer program executable by a processor is stored, and when the program runs on the processor, the processor is enabled to execute the user behavior baseline generation method as described in fig. 1, fig. 3, or fig. 4.

Based on the same inventive concept, the embodiment of the present application provides a computer program product, which includes: computer program code which, when run on a computer, causes the computer to perform the user behaviour baseline generation method described above in relation to fig. 1, 3 or 4.

For the system/apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.

It is to be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or operation from another entity or operation without necessarily requiring or implying any actual such relationship or order between such entities or operations.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A method for generating a user behavior baseline, comprising:

2. The method of claim 1, wherein the determining target weights corresponding to the current attribute information and the historical attribute information based on the adding order of the current attribute information and the historical attribute information and based on adding time intervals between the historical attribute information and the current attribute information respectively comprises:

determining a weight correction value corresponding to the target attribute information based on the adding time interval between the current attribute information and the target attribute information;

3. The method of claim 2, wherein determining the weight correction value corresponding to the target attribute information based on the adding time interval between the current attribute information and the target attribute information comprises:

4. The method of claim 3, wherein determining the reference time interval corresponding to the target attribute information based on the adding time interval between the target attribute information and the adjacent attribute information comprises:

and if the target attribute information is neither the current attribute information nor the historical attribute information that was added to the historical access record earliest, taking the average value of the adding time intervals between the target attribute information and its two adjacent pieces of attribute information as the reference time interval corresponding to the target attribute information.

5. The method of claim 3, wherein determining the weight correction value corresponding to the target attribute information based on the adding time interval between the current attribute information and the target attribute information and based on the reference time interval comprises:

if the adding time interval between the current attribute information and the target attribute information is smaller than the reference time interval, determining the weight correction value corresponding to the target attribute information as a first set numerical value;

6. The method according to any one of claims 2 to 5, wherein the determining the target weight corresponding to the target attribute information based on the initial weight and the weight correction value includes:

acquiring the preset easiness for the target attribute information;

7. The method of any of claims 1-5, wherein the adding the current attribute information to a historical access record comprises:

8. The method of claim 7, wherein the common data number threshold is calculated using the following formula:

wherein H represents a common data number threshold,

represents the average number of pieces of common attribute information at the previous moment, and count represents the number of pieces of common attribute information at the current moment.

9. The method of any of claims 1-5, wherein determining at least one piece of common attribute information from the current attribute information and the historical attribute information based on the weight accumulated value comprises:

for attribute information with the same content in the current attribute information and the historical attribute information, when the weight accumulated value is not less than a preset weight threshold, taking that attribute information as common attribute information;

10. An apparatus for generating a user behavior baseline, the apparatus comprising:

an accumulation unit, configured to accumulate the target weights corresponding to attribute information with the same content in the current attribute information and the historical attribute information, and to determine at least one piece of common attribute information from the current attribute information and the historical attribute information based on the weight accumulated value;

11. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to implement the steps of the user behavior baseline generation method according to any of claims 1-9 when executing a computer program stored in the memory.

12. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the user behavior baseline generation method according to any one of claims 1 to 9.

13. A computer program product, the computer program product comprising: computer program code which, when run on a computer, causes the computer to perform the steps of the user behavior baseline generation method as claimed in any of claims 1 to 9.
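
As a hedged illustration of the reference time interval and weight correction value recited in claims 4 and 5, the following minimal Python sketch computes both for one entry of a historical access record. It reads the average in claim 4 as the mean of the adding time intervals to the two neighbouring entries. The treatment of the earliest and current entries, the second value returned when the condition of claim 5 is not met, and all names and numbers (`reference_interval`, `weight_correction`, `first_value`, `second_value`) are assumptions for illustration, since the claim text reproduced here does not state them.

```python
from typing import List


def reference_interval(add_times: List[float], index: int) -> float:
    """Reference time interval for the entry at `index`.

    add_times: times at which entries were added, in ascending order;
    the last element is the current attribute information.
    """
    if len(add_times) < 2:
        raise ValueError("need at least two entries")
    if index == 0:
        # Assumption: the earliest entry only has a later neighbour.
        return add_times[1] - add_times[0]
    if index == len(add_times) - 1:
        # Assumption: the current entry only has an earlier neighbour.
        return add_times[-1] - add_times[-2]
    before = add_times[index] - add_times[index - 1]
    after = add_times[index + 1] - add_times[index]
    # Middle entry: average of the intervals to its two neighbours.
    return (before + after) / 2


def weight_correction(
    add_times: List[float],
    index: int,
    first_value: float = 1.0,   # hypothetical "first set numerical value"
    second_value: float = 0.5,  # hypothetical value for the other branch
) -> float:
    """Weight correction value for the entry at `index`."""
    gap_to_current = add_times[-1] - add_times[index]
    if gap_to_current < reference_interval(add_times, index):
        return first_value
    return second_value


# Illustrative adding times, in arbitrary units.
times = [0.0, 10.0, 40.0, 45.0]
print(reference_interval(times, 2), weight_correction(times, 2))
print(reference_interval(times, 1), weight_correction(times, 1))
```

Under these assumptions, an entry added shortly before the current one keeps the first value, while an entry whose distance to the current one is at least its reference interval receives the second.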