CN113810335A

CN113810335A - Method and system for identifying target IP, storage medium and equipment

Info

Publication number: CN113810335A
Application number: CN202010533071.3A
Authority: CN
Inventors: 王璐
Original assignee: Wuhan Douyu Network Technology Co Ltd
Current assignee: Wuhan Douyu Network Technology Co Ltd
Priority date: 2020-06-12
Filing date: 2020-06-12
Publication date: 2021-12-17
Anticipated expiration: 2040-06-12
Also published as: CN113810335B

Abstract

The invention discloses a method for identifying a target IP (Internet Protocol address). Because the current time window is adjacent to the previous time window and both windows are short (0.5-1 h), the features of the login IPs in the two windows are highly similar overall. Therefore, a first target parameter value, computed for a current IP of the current time window from the eigenvalues and eigenvectors of the feature matrix of the previous time window, accurately reflects how far the features of the current IP deviate from the features in that matrix; by comparison with a threshold, an IP whose first target parameter value is greater than the threshold is identified as a target IP. The method identifies login IPs in a timely and accurate manner and avoids misidentifying normal IPs; because identification is timely, the target IP can be intercepted and restricted promptly and the live broadcast network resources it occupies can be released promptly.

Description

Method and system for identifying target IP, storage medium and equipment

Technical Field

The invention relates to the technical field of network live broadcast, in particular to a method, a system, a storage medium and equipment for identifying a target IP.

Background

On a live webcast platform, malicious network attacks from certain target IPs often occur, such as obtaining the platform's free virtual props in batches or flooding the bullet-screen comments with advertisements in batches, which occupies live broadcast network resources. In the prior art, suspicious logins are intercepted by an IP frequency rule: if the number of logins or the number of accounts under the same IP is too large, logins under that IP are considered abnormal. This identification method can misidentify IPs such as those of base stations or public internet cafes. Therefore, the existing method for identifying a target IP has low accuracy and can wrongly restrict normal IPs.

Disclosure of Invention

In view of the above, the present invention has been made to provide a method and system, storage medium, and device for identifying a target IP that overcome or at least partially solve the above problems.

On one hand, the present application provides the following technical solutions through an embodiment of the present application:

a method for identifying a target IP for a live webcast platform, the method comprising:

obtaining, based on a log of login events of the live webcast platform, the m IPs that logged in during a previous time window and a feature matrix formed from n feature values of each IP; where m and n are positive integers and the previous time window is 0.5-1 h long;

obtaining a matrix eigenvalue and an eigenvector based on the characteristic matrix;

acquiring n characteristic values of a current IP logged in a current time window, wherein the current time window is adjacent to the previous time window and is 0.5-1 h;

obtaining a first target parameter value representing the deviation degree of the features of the current IP and the features in the feature matrix based on the n feature values of the current IP, the matrix feature value and the feature vector;

judging whether the first target parameter value is larger than a first target parameter threshold value;

and if the first target parameter value is larger than the first target parameter threshold value, identifying the current IP as a target IP.

Optionally, after identifying the current IP as the target IP, the method further includes:

obtaining login information of each login event of a target IP, wherein the login information comprises a login timestamp T, a login nickname N and whether login is successful or not S;

obtaining, based on historical login events, a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful;

obtaining, based on the feature weights β_T, β_N and β_S and the login information of each login event, a second target parameter value representing the similarity between two login events;

and obtaining a target login event based on the second target parameter value and a second target parameter threshold value.

Optionally, the obtaining, based on the feature weights β_T, β_N and β_S and the login information of each login event, a second target parameter value representing the similarity between two login events specifically includes:

obtaining the second target parameter value using the following equation:

sim(E_i,E_j)＝1-dist(E_i,E_j)；

wherein:

sim(E_i, E_j) is the second target parameter value for login events i and j; dist(E_i, E_j) is the distance between login events i and j; T_i and T_j are the login timestamps of login events i and j; N_i and N_j are the login nickname character strings of login events i and j; I(S_i = S_j) indicates whether login events i and j agree on whether the login was successful, taking the value 1 if they agree and 0 if they do not.

Optionally, the obtaining, based on historical login events, the feature weight β_T of the login timestamp, the feature weight β_N of the login nickname, and the feature weight β_S of whether the login was successful specifically includes:

acquiring a plurality of first login event pairs from a plurality of target IPs, wherein the two login events in each first login event pair belong to the same target IP;

randomly extracting a plurality of second login event pairs from login events which do not belong to a target IP, wherein two login events in the second login event pairs do not belong to the same IP;

respectively obtaining a first average distance of each login information of the plurality of first login event pairs and a second average distance of each login information of the plurality of second login event pairs;

obtaining, based on the first average distance and the second average distance, the feature weight β_T of the login timestamp, the feature weight β_N of the login nickname, and the feature weight β_S of whether the login was successful.

Optionally, the obtaining a matrix eigenvalue and an eigenvector based on the feature matrix specifically includes:

zero-centering each column of the feature matrix to obtain a centered feature matrix;

obtaining a covariance matrix based on the centered feature matrix according to the following formula:

where C is the covariance matrix, X is the centered feature matrix, and X^T is the transpose of the centered feature matrix;

obtaining, based on the covariance matrix, the matrix eigenvalues λ₁, λ₂, ..., λ_k and the eigenvectors e₁, e₂, ..., e_k, where k is the number of eigenvalues that contribute the most.

Optionally, the obtaining, based on the n feature values of the current IP, the matrix eigenvalues, and the eigenvectors, a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix specifically includes:

obtaining the first target parameter value using the following equation:

where score(x) is the first target parameter value of a current IP whose features are x; e_y is the y-th eigenvector; λ_y is the corresponding y-th eigenvalue; and y = 1, 2, ..., k.

Optionally, after obtaining the target login event, the method further includes:

and limiting the functions of the target IP and/or the account related to the target login event.

On the other hand, the present application provides a system for identifying a target IP through another embodiment of the present application, where the system is used for a live webcast platform, and the system includes:

the first obtaining module is used for obtaining m IPs logged in a previous time window and a feature matrix formed by n feature values of each IP based on a log of logging events of the live webcast platform; wherein m and n are positive integers, and the previous time window is 0.5-1 h;

a second obtaining module, configured to obtain a matrix eigenvalue and an eigenvector based on the feature matrix;

the first acquisition module is used for acquiring n characteristic values of a current IP logged in a current time window, wherein the current time window is adjacent to the previous time window, and the current time window is 0.5-1 h;

a third obtaining module, configured to obtain, based on the n feature values of the current IP, the matrix eigenvalues, and the eigenvectors, a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix;

the judging module is used for judging whether the first target parameter value is larger than a first target parameter threshold value or not;

and the identification module is used for identifying the current IP as the target IP if the first target parameter value is greater than the first target parameter threshold value.

The invention discloses a readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.

The invention discloses an apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor performing the steps of the method.

One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:

The method obtains, based on a log of login events of the live webcast platform, the m IPs that logged in during a previous time window and a feature matrix formed from n feature values of each IP; obtains matrix eigenvalues and eigenvectors based on the feature matrix; acquires n feature values of a current IP that logged in during a current time window, where the current time window is adjacent to the previous time window and is 0.5-1 h long; obtains, based on the n feature values of the current IP, the matrix eigenvalues and the eigenvectors, a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix; judges whether the first target parameter value is greater than a first target parameter threshold; and, if so, identifies the current IP as a target IP. Because the current time window is adjacent to the previous time window and the windows are short (0.5-1 h), the features of the login IPs in the two adjacent windows are highly similar overall. The first target parameter value, computed for the current IP from the eigenvalues and eigenvectors of the previous window's feature matrix, can therefore accurately reflect the deviation of the current IP, and an IP whose first target parameter value is greater than the threshold is identified as a target IP.
After a target IP is identified, login information of each login event of the target IP is obtained, the login information comprising a login timestamp T, a login nickname N, and whether the login was successful S; a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful are obtained based on historical login events; a second target parameter value representing the similarity between two login events is obtained based on the feature weights β_T, β_N and β_S and the login information of each login event; and a target login event is obtained based on the second target parameter value and a second target parameter threshold. For an identified target login event, the functions of the account related to that event can be restricted, which avoids the wrongful restriction caused by restricting the whole target IP uniformly, releases the occupied network resources, and increases live broadcast traffic.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.

FIG. 1 is a flow diagram of a method of identifying a target IP in one embodiment of the invention;

FIG. 2 is a system architecture diagram for identifying a target IP in one embodiment of the invention.

Detailed Description

The embodiment of the application provides a method, a system, a storage medium and equipment for identifying a target IP, and solves the technical problem that the method for identifying the target IP is low in accuracy rate and causes error limitation on a normal IP.

In order to solve the technical problems, the general idea of the embodiment of the application is as follows:

In a method for identifying a target IP, the m IPs that logged in during a previous time window and a feature matrix formed from n feature values of each IP are obtained based on a log of login events of the live webcast platform; matrix eigenvalues and eigenvectors are obtained based on the feature matrix; n feature values of a current IP that logged in during a current time window are acquired, where the current time window is adjacent to the previous time window and is 0.5-1 h long; a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix is obtained based on the n feature values of the current IP, the matrix eigenvalues and the eigenvectors; it is judged whether the first target parameter value is greater than a first target parameter threshold; and, if so, the current IP is identified as a target IP. Because the current time window is adjacent to the previous time window and the windows are short (0.5-1 h), the features of the login IPs in the two adjacent windows are highly similar overall. The first target parameter value, computed for the current IP from the eigenvalues and eigenvectors of the previous window's feature matrix, can therefore accurately reflect the deviation of the current IP, and an IP whose first target parameter value is greater than the threshold is identified as a target IP.
After a target IP is identified, login information of each login event of the target IP is obtained, the login information comprising a login timestamp T, a login nickname N, and whether the login was successful S; a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful are obtained based on historical login events; a second target parameter value representing the similarity between two login events is obtained based on the feature weights β_T, β_N and β_S and the login information of each login event; and a target login event is obtained based on the second target parameter value and a second target parameter threshold. For an identified target login event, the functions of the account related to that event can be restricted, which avoids the wrongful restriction caused by restricting the whole target IP uniformly, releases the occupied network resources, and increases live broadcast traffic.

In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.

First, it should be noted that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.

Example one

The present embodiment provides a method for identifying a target IP, which is used for a live webcast platform, and referring to fig. 1, the method of the present embodiment includes the following steps:

S100, obtaining, based on a log of login events of the live webcast platform, the m IPs that logged in during a previous time window and a feature matrix formed from n feature values of each IP; where m and n are positive integers and the previous time window is 0.5-1 h long;

S200, obtaining matrix eigenvalues and eigenvectors based on the feature matrix;

S300, acquiring n feature values of a current IP that logged in during a current time window, where the current time window is adjacent to the previous time window and is 0.5-1 h long;

S400, obtaining, based on the n feature values of the current IP, the matrix eigenvalues and the eigenvectors, a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix;

S500, judging whether the first target parameter value is greater than a first target parameter threshold;

S600, if the first target parameter value is greater than the first target parameter threshold, identifying the current IP as a target IP.

It should be noted that the IP in this embodiment refers to a network IP address used by a user to log in a live webcast platform, and the user may be a person participating in live webcast or an electronic device participating in live webcast interaction, such as an intelligent robot.

The method for identifying a target IP provided by this embodiment can be applied to scenarios in which it is necessary to identify target IPs that take part in live-room activities through illegitimate, malicious network attacks, such as obtaining the platform's free virtual props in batches or flooding bullet-screen comments with advertisements in batches. The method may be performed by a device for identifying a target IP, which may be implemented in software and/or hardware and is typically integrated in a terminal, such as a server corresponding to the live broadcast platform.

Referring to fig. 1, the method of the present embodiment is performed as follows:

First, S100 is executed: based on a log of login events of the live webcast platform, the m IPs that logged in during a previous time window and a feature matrix formed from n feature values of each IP are obtained, where m and n are positive integers and the previous time window is 0.5-1 h long.

It can be understood that, to identify the current IP, the feature matrix must be obtained first so that the matrix eigenvalues and eigenvectors can be obtained subsequently. The behavior of users logging in to the platform is recorded in the login event log, which may contain information such as the login account id, the login account nickname, the login timestamp, and whether the login was successful.

From the information in the behavior log, the following 3 features can be computed over the login events under each login IP:

Feature 1: the standard deviation of nickname length (nickname length is the number of characters in the nickname text);

Feature 2: the proportion of the most frequent nickname pattern (the nickname pattern is obtained by converting each character of the nickname text: a lowercase English letter to L, an uppercase English letter to U, a digit to D, a Chinese character to C, and any other character to O);

Feature 3: the number of accounts logged in (deduplicated, i.e. the same account id is counted only once).
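
The three features can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `nickname_pattern` and `ip_features`, the `(account_id, nickname)` tuple layout, the use of the population standard deviation, and the choice of the CJK Unified Ideographs block to detect Chinese characters are all hypothetical and do not come from the source.

```python
from collections import Counter
from statistics import pstdev


def nickname_pattern(nickname: str) -> str:
    """Map each character to L/U/D/C/O as described for Feature 2."""
    out = []
    for ch in nickname:
        if "a" <= ch <= "z":
            out.append("L")  # lowercase English letter
        elif "A" <= ch <= "Z":
            out.append("U")  # uppercase English letter
        elif "0" <= ch <= "9":
            out.append("D")  # digit
        elif "\u4e00" <= ch <= "\u9fff":
            out.append("C")  # Chinese character (assumed: CJK Unified Ideographs)
        else:
            out.append("O")  # any other character
    return "".join(out)


def ip_features(events):
    """events: list of (account_id, nickname) login events of one IP in one window.

    Returns (nickname length std dev, share of the most frequent pattern,
    number of distinct accounts).
    """
    nicknames = [nick for _, nick in events]
    # Population standard deviation is an assumption; the source does not say which.
    length_std = pstdev([len(n) for n in nicknames])
    patterns = Counter(nickname_pattern(n) for n in nicknames)
    top_share = patterns.most_common(1)[0][1] / len(nicknames)
    distinct_accounts = len({acc for acc, _ in events})
    return length_std, top_share, distinct_accounts
```

With machine-registered nicknames such as `user01`, `user02`, `user03`, every nickname maps to the same pattern `LLLLDD`, so the length standard deviation is 0 and the pattern share is 1, which is the kind of profile the description associates with an abnormal IP.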

It should be noted that the login event herein refers to a login behavior occurring under each login IP, that is, an event of logging in a platform by using the IP. Each login event can comprise information such as login account id, login account nickname, login timestamp, whether login is successful and the like.

Because the data in the log of the login event refers to objective data generated by the login event recorded on the live webcast platform, in order to identify the target IP subsequently, the information such as the login account id, the nickname of the login account, the timestamp of the login, whether the login is successful and the like in the log of the login event is selected, and the 3 characteristics are obtained.

The reason for selecting the above 3 features is as follows. The accounts of abnormal users are usually registered by a registration machine, so their nickname patterns are very similar. The nickname lengths of accounts under a normal IP are relatively random, so the standard deviation of nickname length is larger, whereas under an abnormal IP it is smaller. The proportion of the most frequent nickname pattern and the standard deviation of nickname length can therefore represent the degree of abnormality of an IP. In addition, because of resource limitations, an abnormal user usually logs in more accounts under the same login IP than are seen under a normal IP, so the number of logged-in accounts can also objectively represent the degree of abnormality of the IP.

Based on this, it is clear to those skilled in the art that the three features selected in step S100 of this embodiment, namely the standard deviation of nickname length, the proportion of the most frequent nickname pattern, and the number of logged-in accounts, are all parameters needed to further improve identification accuracy. They are traces left by users' actual use and exist objectively; they are not chosen on the basis of subjective human factors but are obtained objectively from log data (that is, selected in conformity with natural laws) to solve the technical problem, and they provide the data basis for the subsequent S200-S600.

To make it convenient to extract the matrix eigenvalues and eigenvectors subsequently, the m IPs that logged in during the previous time window and the n feature values of each IP can be arranged into the feature matrix. In this embodiment, the n feature values are the values of the 3 features of each IP described above.

Next, S200 is executed: the matrix eigenvalues and eigenvectors are obtained based on the feature matrix.

In this embodiment, the matrix eigenvalues and eigenvectors may be obtained according to the following steps:

zero-centering each row of the feature matrix, that is, subtracting the mean of each row from every value in that row, to obtain the centered feature matrix;
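
As a rough illustration of S200, the following NumPy sketch centers the feature matrix, forms a covariance matrix, and keeps the k largest eigenvalues with their eigenvectors. Several points are assumptions rather than statements of the source: the matrix layout (one row per IP, one column per feature, so that centering subtracts each feature's mean), the 1/m scaling of the covariance matrix, and the function name `fit_window` are all hypothetical.

```python
import numpy as np


def fit_window(features: np.ndarray, k: int):
    """features: m x n array, one row per IP and one column per feature (assumed layout).

    Returns (feature means, k largest eigenvalues, matching eigenvectors as columns).
    """
    mean = features.mean(axis=0)
    centered = features - mean  # zero-center every feature
    # n x n covariance matrix; the 1/m scaling is an assumption.
    cov = centered.T @ centered / features.shape[0]
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest eigenvalues
    return mean, eigvals[order], eigvecs[:, order]
```

The feature means are returned alongside the eigenpairs so that a current IP from the next window can be centered with the same means before it is compared against the previous window.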

Then S300 is executed: n feature values of the current IP that logged in during the current time window are acquired, where the current time window is adjacent to the previous time window and is 0.5-1 h long.

It can be understood that, after the eigenvalues and eigenvectors of the feature matrix of the previous time window have been obtained, the n feature values of the current IP that logged in during the current time window must be acquired first in order to identify the current IP in time. In addition, to ensure accuracy as well as timeliness, in this embodiment the current time window is adjacent to the previous time window and both windows are 0.5-1 h long. Because the windows are adjacent and short (0.5-1 h), the features of the login IPs in the two adjacent windows are highly similar overall, which provides the theoretical basis for the subsequent S400.

Next, S400 is executed: based on the n feature values of the current IP, the matrix eigenvalues and the eigenvectors, a first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix is obtained.

In a specific implementation process, in order to identify whether the current IP is a target IP with abnormal features, a first target parameter value which characterizes the deviation degree of the features of the current IP from the features in the feature matrix needs to be obtained firstly.

Illustratively, the first target parameter value is obtained using the following formula:

where score(x) is the first target parameter value of a current IP whose features are x; e_y is the y-th eigenvector; λ_y is the corresponding y-th eigenvalue; and y = 1, 2, ..., k. Here, the feature x consists of the 3 feature values corresponding to the 3 features described above.

It should be noted that the principle of the above method and calculation formula is:

The eigenvectors represent the directions along which the feature data vary, and each matrix eigenvalue is the variance of the feature data along the corresponding direction. The variance along the different directions reflects the intrinsic characteristics of the data, so if a single data sample differs from the characteristics shown by the data as a whole, deviating greatly from the other samples along certain directions, that sample is an outlier.

In the formula, the deviation of the feature x along direction y is normalized so that the deviations along different directions can be compared. After the deviations of a data sample along all directions have been calculated, they are summed to obtain the first target parameter value representing the degree of deviation between the features of the current IP and the features in the feature matrix.
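
The scoring formula itself is not reproduced in the text, so the sketch below only illustrates the principle just described, using one common form of eigen-decomposition anomaly score as an assumption: the squared projection of the centered feature vector onto each eigenvector, divided by the matching eigenvalue, summed over all kept directions. The function name `deviation_score`, the centering step, and the list-of-vectors layout are hypothetical.

```python
def deviation_score(x, mean, eigvals, eigvecs):
    """Assumed form: sum over y of (projection of centered x onto e_y)^2 / lambda_y.

    x, mean: sequences of n feature values.
    eigvals: k eigenvalues; eigvecs: k eigenvectors, each a sequence of n numbers.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    score = 0.0
    for lam, vec in zip(eigvals, eigvecs):
        # Deviation of x along direction e_y.
        proj = sum(c * v for c, v in zip(centered, vec))
        # Dividing by the variance along that direction makes directions comparable.
        score += proj * proj / lam
    return score
```

For example, with eigenvalues 4 and 1 along the two coordinate axes, the point (2, 1) deviates by one standard deviation along each axis and receives a score of 2 under this assumed form.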

After the first target parameter value is obtained, S500 and S600 are executed: it is judged whether the first target parameter value is greater than a first target parameter threshold, and if it is, the current IP is identified as a target IP.

In this embodiment, IPs with abnormal features, such as proxy IPs, are identified among the login IPs, and the first target parameter values of these IPs are calculated and sorted by magnitude. After sorting, a quantile is chosen according to the required identification accuracy: the higher the quantile, the fewer target IPs are identified and the more accurate the identified target IPs are, but some may be missed. To obtain accurate results while missing as few target IPs as possible, this embodiment takes the 95% quantile as the first target parameter threshold.
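
A minimal sketch of the thresholding in S500 and S600 follows. The source does not say which quantile definition is used, so the nearest-rank definition here is an assumption, as are the function names `quantile_threshold` and `flag_targets` and the dictionary layout of the scores.

```python
import math


def quantile_threshold(reference_scores, q=0.95):
    """Nearest-rank q-quantile of the reference scores (assumed quantile definition)."""
    ordered = sorted(reference_scores)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def flag_targets(scores_by_ip, threshold):
    """Return the IPs whose first target parameter value is greater than the threshold."""
    return [ip for ip, score in scores_by_ip.items() if score > threshold]
```

Raising `q` trades recall for precision in the way the paragraph above describes: fewer IPs are flagged, and those that are flagged deviate more strongly.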

The above describes the complete process of identifying one current IP. It can be understood that the login IPs in any time period can be identified according to the above steps, yielding the set of target IPs for that time period.

After the target IP is identified, subsequent behavior restrictions can be applied to it. However, normal user login events may also exist under the target IP, and restricting all of them uniformly would also cause wrongful restriction. For this reason, the target login events under the target IP need to be further identified.

As an optional embodiment, after the identifying the current IP as the target IP, the method further comprises:

First, obtaining login information of each login event of the target IP, the login information comprising a login timestamp T, a login nickname N, and whether the login was successful S;

Second, obtaining, based on historical login events, a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful;

Third, obtaining, based on the feature weights β_T, β_N and β_S and the login information of each login event, a second target parameter value representing the similarity between two login events;

Fourth, obtaining a target login event based on the second target parameter value and a second target parameter threshold.

It should be noted that, in order to identify a target login event, login information of each login event of a target IP needs to be first acquired, and in this embodiment, the login information includes a login timestamp T, a login nickname N, and whether login is successful S. The login information of the embodiment is also obtained from the login event log, and objectivity is achieved. And because the subsequent process is only to obtain the similarity of the two login events, the typical login information such as the login timestamp T, the login nickname N, whether the login is successful S and the like in the login events can be selected as basic data according to needs.

Next, based on the feature weights β_T, β_N and β_S and the login information of each login event, a second target parameter value representing the similarity between two login events is obtained.

For example, the second target parameter value may be obtained by using the following formula:

sim(E_i,E_j)＝1-dist(E_i,E_j)；

wherein:

The principle of the above formula is as follows. The timestamp at which a login event occurs is a numerical variable, so its distance is computed as the difference between the two values, i.e. the Manhattan distance; this difference is passed through a function that normalizes it to the range 0-1, giving 1 when the difference is 0 and tending to 0 as the difference grows. The nickname of a login account can be regarded as a set of characters, so the Jaccard distance between the sets is used, which expresses the string similarity of the account nicknames. Whether the login was successful is a discrete variable with only two states, yes or no, so the distance is 0 when the two states are the same and 1 when they differ.
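
The distance formula is not reproduced in the text, so the following is only a sketch of how the three ingredients described above might be combined. The weighted-sum form of dist, the exponential squashing of the timestamp difference, the `scale` parameter, and all function names are assumptions; only sim = 1 - dist, the Jaccard distance on character sets, and the 0/1 treatment of the success flag come from the description.

```python
import math


def jaccard_distance(a: str, b: str) -> float:
    """Jaccard distance between the character sets of two nicknames."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def time_distance(t_i: float, t_j: float, scale: float = 60.0) -> float:
    """Assumed squashing: exp(-|dt| / scale) is 1 at dt = 0 and tends to 0 for
    large gaps, so one minus it is a distance in the range 0-1."""
    return 1.0 - math.exp(-abs(t_i - t_j) / scale)


def similarity(e_i, e_j, beta_t, beta_n, beta_s):
    """e = (timestamp, nickname, success). Returns sim = 1 - dist.

    The weighted sum below is an assumed form; the weights are expected to sum to 1.
    """
    d_t = time_distance(e_i[0], e_j[0])
    d_n = jaccard_distance(e_i[1], e_j[1])
    d_s = 0.0 if e_i[2] == e_j[2] else 1.0  # 0 when the success flags agree
    dist = beta_t * d_t + beta_n * d_n + beta_s * d_s
    return 1.0 - dist
```

Because each per-feature distance lies in the range 0-1 and the weights sum to 1, the resulting similarity also stays in that range, with identical login events scoring 1.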

However, because each item of login information contributes differently to the similarity of login events, a weight is added to the feature of each item of login information to improve identification accuracy. At present such weights are commonly set manually. This embodiment instead provides a method for determining the feature weight of each item of login information:

obtaining, based on the first average distance and the second average distance, a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful.

It should be noted that the first average distance of each item of login information over the plurality of first login event pairs, and the second average distance of each item of login information over the plurality of second login event pairs, are obtained by following the calculation principle for the distance between login events i and j in this embodiment: the distance of each login event pair is calculated first, and then the average over the plurality of login event pairs is taken.

Next, the feature weight of each login information can be obtained by using the following formula:

wherein w_a is the feature weight of login information a, which in this embodiment is the weight β_T of the login timestamp feature, the weight β_N of the login nickname feature, or the weight β_S of the feature indicating whether the login was successful; S_a is the first average distance of login information a, and D_a is the second average distance of login information a; S_b is the first average distance of login information b, and D_b is the second average distance of login information b.

The principle of the above formula is as follows. The plurality of first login event pairs are selected under the same target IP; because these login events fall under the same abnormal IP, they can be considered similar. The plurality of second login event pairs do not belong to the same IP and can therefore be considered dissimilar. If a feature is important, its similarity is generally low across dissimilar login events and higher across similar login events. The weight is therefore derived from the first average distance and the second average distance. To make the weights sum to 1, the ratio of the first average distance to the second average distance of login information a is normalized by dividing it by the sum of the corresponding ratios of all login information.
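The normalization described above can be sketched as follows. The formula itself is not reproduced in this text, so the sketch follows the wording literally (first average distance divided by second average distance, then divided by the sum of those ratios over all features); the orientation of the ratio, the function name `feature_weights` and the sample distances are assumptions made for illustration only.

```python
from statistics import mean


def feature_weights(first_pair_distances, second_pair_distances):
    """Return one weight per feature, normalized so the weights sum to 1.

    first_pair_distances:  feature -> per-pair distances for pairs under the same target IP
    second_pair_distances: feature -> per-pair distances for pairs from different IPs
    """
    ratios = {}
    for feature, same_ip_distances in first_pair_distances.items():
        first_avg = mean(same_ip_distances)                # S_a
        second_avg = mean(second_pair_distances[feature])  # D_a
        if second_avg == 0:
            raise ValueError(f"second average distance is zero for feature {feature!r}")
        # Ratio as worded in the text; its orientation is an assumption.
        ratios[feature] = first_avg / second_avg
    total = sum(ratios.values())
    if total == 0:
        raise ValueError("all ratios are zero; weights are undefined")
    return {feature: ratio / total for feature, ratio in ratios.items()}


# Hypothetical per-pair distances for the three features T, N and S.
first = {"T": [0.2, 0.4], "N": [0.3, 0.3], "S": [0.1, 0.1]}
second = {"T": [0.6, 0.6], "N": [0.6, 0.6], "S": [0.5, 0.5]}
weights = feature_weights(first, second)
```

Whatever orientation is chosen for the ratio, dividing by the sum of the ratios is what guarantees that β_T + β_N + β_S = 1.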

After a second target parameter value accurately representing the similarity between the two login events is obtained, a target login event is obtained based on the second target parameter value and a second target parameter threshold value.

Specifically, a login event pair whose second target parameter value is higher than the second target parameter threshold is identified as a target login event pair, and the login events in that pair are target login events.

The second target parameter threshold is set as follows: compute the average second target parameter value of the login event pairs under each target IP, sort these averages from small to large, and take the 95% quantile as the threshold. If more anomalies need to be found, the threshold can be lowered; otherwise the threshold is raised.
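The threshold-setting rule can be sketched as below. The text does not say which quantile definition is used, so the nearest-rank definition is an assumption, and `quantile_threshold` is a hypothetical name.

```python
def quantile_threshold(average_similarities, percent=95):
    """Nearest-rank quantile of the per-IP average second target parameter values."""
    ordered = sorted(average_similarities)  # sort from small to large
    if not ordered:
        raise ValueError("at least one value is required")
    # Integer arithmetic for ceil(percent * n / 100) avoids floating-point rounding.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]
```

Lowering `percent` lowers the threshold and flags more login event pairs; raising it flags fewer.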

For an identified target login event, the related functions of the account involved in that event can be restricted. This avoids the erroneous restriction caused by restricting the whole target IP uniformly, releases the occupied network resources, and increases live broadcast traffic.

The following describes the implementation process of the method of this embodiment by using a practical example:

Assume that there are 5 log entries:

(1) the IP is 12.30.34.124, the nickname is sadt001, the login time is 12:34, and the login fails;

(2) the IP is 12.30.34.124, the nickname is sadt002, the login time is 12:36, and the login fails;

(3) the IP is 12.30.34.124, the nickname is support, the login time is 12:39, and the login is successful;

(4) the IP is 35.68.90.11, the nickname is rajan, the login time is 12:41, and the login is successful;

(5) the IP is 35.68.90.11, the nickname is jordan, the login time is 13:41, and the login is successful.

First, the elements of the feature matrix include:

For IP 12.30.34.124:

the nickname length standard deviation is 0;

the share of the most frequent nickname pattern is 0.67;

the number of logged-in accounts is 3.

For IP 35.68.90.11:

the nickname length standard deviation is 0.5;

the share of the most frequent nickname pattern is 0.5;

the number of logged-in accounts is 2.

The two IPs then correspond to the feature rows (0, 0.67, 3; 0.5, 0.5, 2).

Normalizing the features, here by subtracting each feature's mean over the two IPs, yields (-0.25, 0.085, 0.5; 0.25, -0.085, -0.5).
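The numbers in this example can be checked with a short sketch. It assumes the standard deviation is the population standard deviation (which reproduces the stated 0.5 for the nicknames rajan and jordan) and that the normalization is column-wise mean centering (which reproduces the stated normalized values); `center_columns` is a hypothetical helper name.

```python
from statistics import pstdev


def center_columns(rows):
    """Subtract each column's mean from every entry of that column."""
    count = len(rows)
    means = [sum(column) / count for column in zip(*rows)]
    return [[value - m for value, m in zip(row, means)] for row in rows]


# Nickname length standard deviation per IP (population standard deviation).
std_ip_a = pstdev(len(name) for name in ["sadt001", "sadt002", "support"])
std_ip_b = pstdev(len(name) for name in ["rajan", "jordan"])

# Feature rows taken from the example: (length std, top pattern share, account count).
features = [
    [0.0, 0.67, 3.0],  # IP 12.30.34.124
    [0.5, 0.5, 2.0],   # IP 35.68.90.11
]
centered = center_columns(features)
rounded = [[round(value, 3) for value in row] for row in centered]
```

Here `std_ip_a` is 0, `std_ip_b` is 0.5, and `rounded` matches the normalized rows given in the example.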

Then, a first target parameter value is calculated:

Assume that the calculation yields λ₁ = 0.6, λ₂ = 0.3, e₁ = (-0.7, 0.2, 0.9), e₂ = (0.1, -0.5, 0.3).

The first target parameter values for IP 12.30.34.124 are then:

The first target parameter values for IP 35.68.90.11 are:

Assume a first target parameter threshold of 1; 12.30.34.124 is therefore the target IP.

And aiming at the target IP, identifying a target login event:

The second target parameter value is calculated for each pair of the three login events under IP 12.30.34.124.

Assuming that the feature weights of the three items of login information are 0.4, 0.4 and 0.2, respectively, then:

The second target parameter values are then 0.73, 0.32 and 0.35, respectively. With the second target parameter threshold set to 0.7, the event pair (E₁, E₂) is found to exceed the threshold, so E₁ and E₂ are target login events.
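The selection step in this example can be sketched as follows. The text only confirms that the pair (E₁, E₂) exceeds the threshold; assigning 0.32 and 0.35 to the pairs (E₁, E₃) and (E₂, E₃) is an assumption, and `target_login_events` is a hypothetical name.

```python
def target_login_events(pair_similarity, threshold):
    """Collect every event that belongs to a pair whose similarity exceeds the threshold."""
    targets = set()
    for (event_a, event_b), similarity in pair_similarity.items():
        if similarity > threshold:
            targets.update((event_a, event_b))
    return sorted(targets)


# Second target parameter values from the example; the pair order is assumed.
pair_similarity = {
    ("E1", "E2"): 0.73,
    ("E1", "E3"): 0.32,
    ("E2", "E3"): 0.35,
}
flagged = target_login_events(pair_similarity, threshold=0.7)
```

With these inputs only E1 and E2 are flagged, so the account behind the third login under the same IP is left unrestricted.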

The technical scheme in the embodiment of the application at least has the following technical effects or advantages:

In the method of this embodiment, based on the login event log of the live webcast platform, m IPs that logged in during a previous time window and a feature matrix formed by n feature values of each IP are obtained; matrix eigenvalues and eigenvectors are obtained based on the feature matrix; n feature values of a current IP that logged in during a current time window are obtained, wherein the current time window is adjacent to the previous time window and the time window is 0.5-1 h long; a first target parameter value representing the degree of deviation of the features of the current IP from the features in the feature matrix is obtained based on the n feature values of the current IP, the matrix eigenvalues and the eigenvectors; whether the first target parameter value is greater than a first target parameter threshold value is judged; and if the first target parameter value is greater than the first target parameter threshold value, the current IP is identified as a target IP. Because the current time window is adjacent to the previous time window and the time windows are short (0.5-1 h), the features of the login IPs in the two adjacent time windows are highly similar overall. Therefore, the first target parameter value obtained for the current IP of the current time window using the eigenvalues and eigenvectors of the feature matrix of the previous time window accurately reflects the degree of deviation of the current IP, and an IP whose first target parameter value is greater than the threshold value is identified as the target IP by comparison with the threshold value. The method thus identifies login IPs promptly and accurately and avoids misidentifying normal IPs; because the identification is timely, the target IP can be intercepted and restricted in time and the occupied live broadcast network resources released in time.

Example two

Based on the same inventive concept as the first embodiment, this embodiment provides a system for identifying a target IP, which is used for a live webcast platform. Referring to fig. 2, the system includes:

an identification module, configured to identify the current IP as the target IP if the first target parameter value is greater than the first target parameter threshold value.

Since the system for identifying a target IP described in this embodiment is a system adopted to implement the method for identifying a target IP described in the first embodiment of this application, a person skilled in the art can understand the specific implementation manner of the system described in this embodiment and various variations thereof based on the method for identifying a target IP described in the first embodiment of this application, and therefore, how to implement the method in the first embodiment using the system described in this embodiment is not described in detail here. The system adopted by a person skilled in the art for implementing the method for identifying the target IP in the embodiment of the present application is within the protection scope of the present application.

Based on the same inventive concept as in the previous embodiments, embodiments of the present invention further provide a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of any of the methods described above.

Based on the same inventive concept as in the previous embodiments, an embodiment of the present invention further provides an apparatus, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of any of the methods described above when executing the program.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for identifying a target IP (Internet protocol) is used for a network live platform, and is characterized by comprising the following steps:

if the first target parameter value is greater than the first target parameter threshold value, identifying the current IP as a target IP;

obtaining login information of each login event of the target IP, wherein the login information comprises a login timestamp T, a login nickname N and whether login is successful or not S;

obtaining, based on historical login events, a feature weight β_T of the login timestamp, a feature weight β_N of the login nickname, and a feature weight β_S of whether the login was successful;

2. The method of claim 1, wherein the obtaining, based on the weight β_T, the weight β_N, the weight β_S and the login information of each login event, a second target parameter value representing the similarity between two login events specifically comprises:

obtaining the second target parameter value using the following equation:

sim(E_i, E_j) = 1 - dist(E_i, E_j);

wherein:

3. The method of claim 2, wherein the obtaining, based on historical login events, the weight β_T of the login timestamp feature, the weight β_N of the login nickname feature, and the weight β_S of the feature indicating whether the login was successful specifically comprises:

4. The method according to claim 1, wherein the obtaining matrix eigenvalues and eigenvectors based on the eigen matrix specifically comprises:

5. The method as claimed in claim 4, wherein said obtaining a first target parameter value characterizing a degree of deviation of the feature of the current IP from the features in the feature matrix based on the n feature values of the current IP, the matrix feature values and the feature vectors specifically comprises:

obtaining the first target parameter value using the following equation:

6. The method of claim 1, wherein after the obtaining the target login event, the method further comprises:

7. A system for identifying a target IP for a live webcast platform, the system comprising:

8. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.

9. An apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any of claims 1-7 are implemented when the program is executed by the processor.