CN109492394B - Abnormal service request identification method and terminal equipment - Google Patents

Abnormal service request identification method and terminal equipment

Info

Publication number
CN109492394B
Authority
CN
China
Prior art keywords
matrix
service
preset
service request
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811249394.9A
Other languages
Chinese (zh)
Other versions
CN109492394A (en
Inventor
唐振华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201811249394.9A priority Critical patent/CN109492394B/en
Priority to PCT/CN2018/124341 priority patent/WO2020082588A1/en
Publication of CN109492394A publication Critical patent/CN109492394A/en
Application granted granted Critical
Publication of CN109492394B publication Critical patent/CN109492394B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Debugging And Monitoring (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention is applicable to the technical field of data processing, and provides an abnormal service request identification method and terminal equipment, wherein the method and terminal equipment are used for updating a preset database in real time by storing the service request received in a preset time period into the preset database and deleting the service request received before the preset time period; if the receiving time of each service request in the preset database accords with the preset time distribution standard, converting the corresponding relation between the data type and the data value contained in the service request into a service matrix corresponding to each service request; if the similarity between one clustering center matrix and all preset reference matrices is smaller than a similarity threshold, judging that the service request in a preset time period is abnormal, so that a user can master the abnormal condition of the service request in real time by shortening the preset time period, and accordingly corresponding measures are taken timely, and normal operation of a server is ensured.

Description

Abnormal service request identification method and terminal equipment
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an abnormal service request identification method and terminal equipment.
Background
Currently, the servers of large organizations must process a large number of service requests in a short time. Malicious attacks, or faults in a service requester's system, can produce a large number of abnormal service requests within a short time; such requests can disrupt the normal operation of the server and waste its processing resources.
Existing security defense systems have difficulty determining in real time whether a given short time period contains a large number of abnormal service requests. Engineers often have to identify the abnormal service requests through after-the-fact analysis following a long delay, and only then pin down the time period in which they occurred. The current process for identifying abnormal service requests therefore has poor real-time performance, which can threaten the normal operation of the server.
Disclosure of Invention
In view of the above, the embodiment of the invention provides a method for identifying an abnormal service request and a terminal device, so as to solve the problem that in the prior art, the real-time performance is poor when the abnormal service request is identified.
A first aspect of an embodiment of the present invention provides a method for identifying an abnormal service request, including:
Storing the service request received in a preset time period into a preset database, deleting the service request stored in the preset database and received before the preset time period, so as to update the preset database, wherein the service request comprises a plurality of corresponding relations between data types and data values, and the data types comprise the receiving time of the service request; judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard; if the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request; and calculating the preset number of cluster center matrixes of all the service matrixes, and if the similarity between one cluster center matrix and all the preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal.
A second aspect of an embodiment of the present invention provides a terminal device, including a memory and a processor, where the memory stores a computer program executable on the processor, and when the processor executes the computer program, the processor implements the following steps:
Storing the service request received in a preset time period into a preset database, deleting the service request stored in the preset database and received before the preset time period, so as to update the preset database, wherein the service request comprises a plurality of corresponding relations between data types and data values, and the data types comprise the receiving time of the service request; judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard; if the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request; and calculating the preset number of cluster center matrixes of all the service matrixes, and if the similarity between one cluster center matrix and all the preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal.
A third aspect of an embodiment of the present invention provides an apparatus for identifying an abnormal service request, including:
The updating module is used for storing the service requests received in the preset time period into a preset database, deleting the service requests stored in the preset database and received before the preset time period so as to update the preset database, wherein the service requests comprise the corresponding relation between a plurality of data types and data values, and the data types comprise the receiving time of the service requests; the judging module is used for judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard; the conversion module is used for converting the corresponding relation between the data types and the data values contained in each service request in the preset database into a service matrix corresponding to each service request if the receiving time of each service request in the preset database accords with a preset time distribution standard; the calculation module is used for calculating the preset number of the clustering center matrixes of all the service matrixes, and if the similarity between one clustering center matrix and all the preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal.
A fourth aspect of the embodiments of the present invention provides a computer readable storage medium storing a computer program which when executed by a processor performs the steps of:
Storing the service request received in a preset time period into a preset database, deleting the service request stored in the preset database and received before the preset time period, so as to update the preset database, wherein the service request comprises a plurality of corresponding relations between data types and data values, and the data types comprise the receiving time of the service request; judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard; if the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request; and calculating the preset number of cluster center matrixes of all the service matrixes, and if the similarity between one cluster center matrix and all the preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal.
In the embodiment of the invention, the service request received in the preset time period is stored in the preset database, and the service request received before the preset time period is deleted, so that the preset database is updated in real time; if the receiving time of each service request in the preset database accords with the preset time distribution standard, converting the corresponding relation between the data type and the data value contained in the service request into a service matrix corresponding to each service request; if the similarity between one clustering center matrix and all preset reference matrices is smaller than a similarity threshold, judging that the service request in a preset time period is abnormal, so that a user can master the abnormal condition of the service request in real time by shortening the preset time period, and accordingly corresponding measures are taken timely, and normal operation of a server is ensured.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an implementation of a method for identifying an abnormal service request according to an embodiment of the present invention;
FIG. 2 is a flowchart of a specific implementation of step S102 of the method for identifying an abnormal service request according to an embodiment of the present invention;
FIG. 3 is a flowchart of a specific implementation of step S105 of the method for identifying an abnormal service request according to an embodiment of the present invention;
FIG. 4 is a block diagram of an apparatus for identifying an abnormal service request according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a server according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
Fig. 1 shows an implementation flow of a method for identifying an abnormal service request according to an embodiment of the present invention, where the method flow includes steps S101 to S107. The specific implementation principle of each step is as follows.
S101, storing the service request received in the preset time period into a preset database, and deleting the service request stored in the preset database and received before the preset time period so as to update the preset database.
In the embodiment of the present invention, the service request includes a correspondence between a plurality of data types and data values, where the data types include a receiving time of the service request.
The embodiment of the invention mainly analyzes the service requests stored in the preset database; to keep the analysis real-time, the service requests in the preset database need to be updated at regular intervals. Adjusting the duration of the preset time period changes how close to real time abnormal service requests are identified: the shorter the preset time period, the better the real-time performance, because the time period containing a large number of abnormal service requests is narrowed to a smaller range, which makes it easier for the user to take countermeasures promptly.
As can be appreciated, the method for updating the preset database is as follows: storing the service request received in the preset time period into a preset database, and deleting the service request stored in the preset database and received before the preset time period. For example, the preset time period may be a time period between 10 seconds before the current time and the current time.
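The update described in S101 can be sketched as a sliding window over received requests. This is only a minimal illustration, not the patented implementation: the class name `RequestWindow`, the `receive_time` key, and the use of an in-memory deque in place of the preset database are all assumptions made for the example.

```python
from collections import deque


class RequestWindow:
    """Hypothetical stand-in for the preset database: keeps only the
    requests received within the last `window_seconds`."""

    def __init__(self, window_seconds=10.0):
        self.window_seconds = window_seconds
        self._requests = deque()  # assumed to be ordered by receiving time

    def update(self, new_requests, now):
        # Store the requests received in the preset time period.
        for req in new_requests:
            self._requests.append(req)
        # Delete the requests received before the preset time period.
        cutoff = now - self.window_seconds
        while self._requests and self._requests[0]["receive_time"] < cutoff:
            self._requests.popleft()
        return list(self._requests)
```

Shortening `window_seconds` narrows the period to which an anomaly can be attributed, which is the trade-off the text describes.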
In the embodiment of the present invention, the service request includes a plurality of data values of data types, and optionally, the data types include: the receiving time of the service request, the request object of the service request, the request type of the service request, the priority of the service request, and the like.
It will be appreciated that the situation of the service request can be understood from multiple dimensions by means of the data contained in the service request and further analyzed in a subsequent process.
S102, judging whether the receiving time of each service request in the preset database meets a preset time distribution standard.
In the embodiment of the invention, this step does not comprehensively analyze the data values of all data types contained in the service requests; it analyzes only one data type, namely the receiving time of the service request. The reason is that a malicious attack on the service system generally involves sending a large number of service requests in a short time. Performing the subsequent comprehensive analysis of all data types only after the receiving times have been judged to meet the preset time distribution standard therefore saves computing resources and allows abnormal requests caused by a malicious attack to be found as early as possible.
Optionally, whether the receiving time of each service request in the preset database meets the preset time distribution standard may be determined as follows: divide the preset time period into a plurality of unit time periods and, according to the receiving time of each service request, calculate the number of service requests corresponding to each unit time period. If the number of service requests corresponding to any unit time period is larger than a preset conventional number threshold, the receiving times of the service requests in the preset database are judged not to meet the preset time distribution standard; if no unit time period has a number of service requests larger than the preset conventional number threshold, the receiving times are judged to meet the preset time distribution standard.
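The per-unit counting check above might look like the following sketch. The function name, parameter names, and the choice to return the counts alongside the verdict are assumptions made for illustration.

```python
def meets_count_standard(receive_times, window_start, unit_seconds,
                         num_units, max_per_unit):
    """Return (ok, counts): ok is True when no unit time period holds more
    requests than the preset conventional number threshold."""
    counts = [0] * num_units
    for t in receive_times:
        # Index of the unit time period that this receiving time falls into.
        idx = int((t - window_start) // unit_seconds)
        if 0 <= idx < num_units:
            counts[idx] += 1
    return all(c <= max_per_unit for c in counts), counts
```

For example, with three one-second unit periods and a threshold of 2, a burst of three requests in the third second causes the check to fail.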
Optionally, in some systems the number of service requests normally increases or decreases over time along a linear trajectory. In this case, a normal linear trajectory can first be fitted, and it is then calculated whether, within the preset time period, the actual number of service requests corresponding to any unit time period deviates from that trajectory. If the number of service requests corresponding to any unit time period deviates from the linear trajectory by more than a preset deviation factor, the receiving times of the service requests in the preset database are judged not to meet the preset time distribution standard; if no unit time period deviates by more than the preset deviation factor, the receiving times are judged to meet the preset time distribution standard. The specific calculation process is described in detail in the embodiments below.
S103, if the receiving time of each service request in the preset database does not accord with the preset time distribution standard, judging that the service request in the preset time period is abnormal.
S104, if the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request.
In the embodiment of the present invention, as described above, once one of the data types in the service requests, namely the receiving time, has been verified to show no abnormality, all the data types of each service request are verified comprehensively.
In the embodiment of the invention, the corresponding relation between the data types and the data values contained in each service request is required to be converted into a service matrix. Specifically, each data type has an interval in the matrix corresponding to the data type, and after the data value corresponding to each data type is converted into binary, the binary data is stored in the interval in the matrix corresponding to each data type to generate the service matrix.
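One possible reading of this conversion is sketched below. The source does not specify the matrix layout, so several details are assumptions: each data type is given one row as its "interval", each row is 16 bits wide, the field names in `FIELD_ORDER` are illustrative, and every data value is assumed to have already been mapped to a non-negative integer.

```python
import numpy as np

BITS_PER_FIELD = 16  # assumed width of the interval reserved for each data type
# Assumed row order; the field names are illustrative only.
FIELD_ORDER = ["receive_time", "request_object", "request_type", "priority"]


def to_service_matrix(request):
    """Encode one service request as a 0/1 matrix, one row per data type."""
    matrix = np.zeros((len(FIELD_ORDER), BITS_PER_FIELD), dtype=np.uint8)
    for row, field in enumerate(FIELD_ORDER):
        # Truncate to the interval width, then write the binary digits.
        value = int(request[field]) % (1 << BITS_PER_FIELD)
        bits = format(value, f"0{BITS_PER_FIELD}b")
        matrix[row] = [int(b) for b in bits]
    return matrix
```

Because every request yields a matrix of the same shape, the matrices can later be compared element by element.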
S105, calculating the clustering center matrixes of the preset number of all the service matrixes.
In the embodiment of the invention, one cluster center can be determined for a group of mutually similar service matrices, and the preset number of cluster centers, which is set in advance by engineers, is determined over all the service matrices.
S106, if the similarity between one clustering center matrix and all the preset reference matrices is smaller than a similarity threshold, judging that the service request in the preset time period is abnormal.
In the embodiment of the present invention, the preset reference matrices are the cluster center matrices calculated from the service requests in the preset database before the update. The essence of the judgment in this step is therefore whether each cluster center matrix generated from the service matrices of the service requests currently in the preset database is sufficiently similar to at least one cluster center matrix generated from the service matrices of the service requests in the preset database before the update.
It can be understood that, because the service requests in the preset database change in real time, the service requests in the preset database before the update differ to some extent from those currently in it, so the calculated preset number of cluster center matrices are only the cluster center matrices for one particular preset time period. When the preset database is updated, the cluster centers move as new service matrices are added and old service matrices are removed; according to our data statistics, however, this movement should stay within a reasonable range under normal conditions. Therefore, if after the preset database is updated the similarity between one cluster center matrix and all the preset reference matrices is smaller than the similarity threshold, that is, at least one cluster center matrix has moved too far, the embodiment of the invention judges that the service requests in the preset time period are abnormal.
Notably, after the determining that the service request within the preset period of time has an abnormality, the method further includes: and setting the clustering center matrix as the updated reference matrix.
It will be appreciated that by updating the reference matrix, preparation is made for the next comparison of whether an anomaly exists in the cluster center matrix.
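S107, if no cluster center matrix has a similarity smaller than the similarity threshold with all the preset reference matrices, that is, every cluster center matrix has a similarity greater than or equal to the similarity threshold with at least one preset reference matrix, judging that the service request in the preset time period is not abnormal.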
Notably, after the determining that there is no abnormality in the service request within the preset period of time, the method further includes: and setting the clustering center matrix as the updated reference matrix.
It will be appreciated that by updating the reference matrix, preparation is made for the next comparison of whether an anomaly exists in the cluster center matrix.
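The decision in S106 and S107 can be sketched as follows. The source does not define the similarity measure, so the mapping used here, 1 / (1 + Euclidean distance), is purely an assumption, as are the function names.

```python
import numpy as np


def similarity(a, b):
    # Assumed measure (not given in the source): maps the Euclidean
    # (Frobenius) distance between two matrices into the range (0, 1].
    return 1.0 / (1.0 + np.linalg.norm(a - b))


def window_is_abnormal(centers, references, threshold):
    """True if at least one cluster center matrix is less similar than
    `threshold` to every preset reference matrix (S106); False otherwise (S107)."""
    for center in centers:
        if all(similarity(center, ref) < threshold for ref in references):
            return True
    return False
```

In either outcome the text then sets the current cluster center matrices as the reference matrices for the next comparison.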
It can be understood that, in the embodiment of the present invention, the preset database is updated in real time by storing the service request received in the preset time period into the preset database and deleting the service request received before the preset time period; if the receiving time of each service request in the preset database accords with the preset time distribution standard, converting the corresponding relation between the data type and the data value contained in the service request into a service matrix corresponding to each service request; if the similarity between one clustering center matrix and all preset reference matrices is smaller than a similarity threshold, judging that the service request in a preset time period is abnormal, so that a user can master the abnormal condition of the service request in real time by shortening the preset time period, and accordingly corresponding measures are taken timely, and normal operation of a server is ensured.
As an embodiment of the present invention, as shown in fig. 2, the step S102 includes:
S1021, according to the receiving time of each service request, calculating the number of the service requests corresponding to each of the plurality of unit time periods, and generating the corresponding relation between the unit time periods and the number of the service requests.
For example, assume that in the embodiment of the present invention the preset time period runs from 10:00:01 to 10:00:10 and the duration of each unit time period is 1 second. The preset time period is then equally divided into 10 unit time periods, and the number of service requests in each unit time period is counted, which generates the correspondence between the unit time periods and the number of service requests.
S1022, fitting a corresponding relation between the unit time period and the number of service requests in a preset time period through a linear regression model, and calculating a linear regression coefficient of the linear regression equation according to a least square method to generate the linear regression equation.
The linear regression model is $Y(n) = aX(n) + b$, where $Y(n)$ is the number of service requests corresponding to the nth unit time period in the preset time period, $X(n)$ is the nth unit time period in the preset time period, $a$ is the linear regression coefficient, and $b$ is an error coefficient (the intercept of the line).
For example, assuming that the preset time period is 10 seconds, the unit time periods are ordered chronologically: $X(n)$ of the first unit time period is set to 1, $X(n)$ of the second unit time period is set to 2, $X(n)$ of the third unit time period is set to 3, and so on, so that $X(n)$, the independent variable, represents the nth unit time period within the 10 seconds.
After the correspondence between the unit time periods and the number of service requests in the preset time period has been counted, the linear regression coefficient and the error coefficient of the linear regression equation can be calculated by the least squares method.
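For reference, the standard closed-form least squares solution for a straight-line fit is given below. The symbols $N$ (the number of unit time periods), $\bar{X}$ and $\bar{Y}$ (the means of $X(n)$ and $Y(n)$) are introduced here for illustration and do not appear in the source.

$$a = \frac{\sum_{n=1}^{N} \bigl(X(n) - \bar{X}\bigr)\bigl(Y(n) - \bar{Y}\bigr)}{\sum_{n=1}^{N} \bigl(X(n) - \bar{X}\bigr)^{2}}, \qquad b = \bar{Y} - a\,\bar{X}$$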
S1023, calculating the theoretical quantity of the service requests corresponding to each unit time period through the linear regression equation, and calculating the difference value between the quantity of the service requests corresponding to each unit time period and the theoretical quantity of the service requests.
It can be appreciated that, after calculating the linear regression equation, the theoretical number of service requests corresponding to each unit time period can be calculated by substituting the value of X (n). The difference between the number of the service requests corresponding to each unit time period and the theoretical number of the service requests is the difference between the true value and the theoretical value.
S1024, if the difference value corresponding to all the unit time periods is smaller than a preset difference value threshold value, judging that the receiving time of each service request in the preset database meets a preset time distribution standard.
It will be appreciated that in this case it may be proved that there is no deviation of the number of service requests corresponding to a unit time period from the linear variation trajectory by more than a preset deviation factor, and that the reception time of each of the service requests in the preset database is determined to meet a preset time distribution criterion.
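Steps S1021 to S1024 can be sketched with NumPy as below. This is an illustrative reading rather than the definitive procedure: the function name is hypothetical, `np.polyfit` stands in for the least squares fit, and comparing the absolute difference against the threshold is an assumption, since the source speaks only of "the difference value".

```python
import numpy as np


def meets_linear_standard(counts, diff_threshold):
    """Fit Y(n) = a*X(n) + b to the per-unit request counts and check that
    every unit time period stays within diff_threshold of the fitted line."""
    y = np.asarray(counts, dtype=float)
    x = np.arange(1, len(y) + 1, dtype=float)   # X(n) = 1, 2, 3, ...
    a, b = np.polyfit(x, y, 1)                  # least squares slope and intercept
    theoretical = a * x + b                     # theoretical number per unit period
    diffs = np.abs(y - theoretical)             # assumed: absolute difference
    return bool(np.all(diffs < diff_threshold)), diffs
```

A steadily growing sequence such as 10, 12, 14, 16, 18 passes, while a sudden spike in one unit period pulls that period far from the fitted line and fails the check.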
As an embodiment of the present invention, as shown in fig. 3, the step S105 includes:
S1051, selecting the preset number of service matrixes from all the service matrixes as an initial clustering center matrix.
In the embodiment of the invention, n service matrixes are arbitrarily selected from all service matrixes as a clustering center matrix, and it can be understood that n is an integer greater than 1 and less than the total number of the service matrixes.
S1052, calculating Euclidean distance from each service matrix to each clustering center matrix, and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the minimum Euclidean distance.
Illustratively, assume that the full set of service matrices includes K1, K2, K3, K4, K5, K6, K7, K8, K9 and K10, and that the current cluster center matrices are n1, n2 and n3. If calculating the Euclidean distance from each service matrix to each cluster center matrix shows that service matrices K1, K2 and K9 are closer to n1 than to n2 or n3, then K1, K2 and K9 are classified into the matrix set corresponding to n1.
S1053, calculating the average value of the elements at the same positions in all the service matrixes in the matrix set corresponding to each cluster center matrix to generate an average matrix corresponding to each cluster center matrix, and taking the average matrix corresponding to the cluster center matrix as the updated cluster center matrix.
Illustratively, continuing the example above, assume that the matrix set corresponding to the cluster center matrix n1 includes K1, K2 and K9. The average of the elements at each position of service matrices K1, K2 and K9 is calculated to generate the average matrix n4 of the three service matrices, and n4 replaces n1 as the new cluster center.
S1054, judging whether the updated cluster center matrix meets the termination condition.
Optionally, the determining whether the updated cluster center matrix meets the termination condition includes:
Calculating the sum of the average value of Euclidean distances of each cluster center matrix and each business matrix in a matrix set corresponding to each cluster center matrix as a cluster error;
if the clustering error is larger than a preset error threshold, the updated clustering center matrix does not meet the termination condition;
If the clustering error is smaller than or equal to a preset error threshold, the updated clustering center matrix meets the termination condition.
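One possible reading of this termination condition is sketched below: the clustering error is the sum, over all cluster center matrices, of the mean distance from a center to the members of its matrix set. Skipping empty sets is an assumption of this sketch, and the function names and sample values are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Element-wise Euclidean distance between two equally shaped matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def clustering_error(centers, matrix_sets):
    """Sum over clusters of the mean distance from the center to its members."""
    total = 0.0
    for center, members in zip(centers, matrix_sets):
        if members:  # assumption: an empty set contributes nothing
            total += sum(euclidean_distance(center, m) for m in members) / len(members)
    return total

def meets_termination(centers, matrix_sets, error_threshold):
    """True when the clustering error is at or below the preset threshold."""
    return clustering_error(centers, matrix_sets) <= error_threshold

# Hypothetical 1x2 matrices: two members for the first center, one for the second
centers = [[[0, 0]], [[10, 10]]]
matrix_sets = [[[[3, 4]], [[0, 0]]], [[[10, 10]]]]
error = clustering_error(centers, matrix_sets)
print(error, meets_termination(centers, matrix_sets, 3.0))
```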
S1055, if the cluster center matrix does not meet the termination condition, returning to execute the operation of calculating the Euclidean distance from each service matrix to each cluster center matrix and classifying each service matrix into a matrix set corresponding to the cluster center matrix with the minimum Euclidean distance.
S1056, if the cluster center matrix meets the termination condition, outputting all cluster center matrices.
In the embodiment of the invention, the preset number of cluster center matrices is obtained by repeating the above calculation until the termination condition is met.
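Steps S1051 to S1056 can be put together roughly as follows. This is only a sketch under stated assumptions: matrices are nested lists of equal shape, an empty matrix set keeps its previous center, and a `max_iterations` guard (not part of the described method) prevents an endless loop when the error threshold is never reached. The `seed` parameter and all names are hypothetical.

```python
import math
import random

def distance(a, b):
    """Element-wise Euclidean distance between two equally shaped matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def mean_matrix(matrices):
    """Average the elements at the same position across the matrices."""
    n = len(matrices)
    return [[sum(m[r][c] for m in matrices) / n for c in range(len(matrices[0][0]))]
            for r in range(len(matrices[0]))]

def cluster_centers(matrices, k, error_threshold, max_iterations=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(matrices, k)                    # S1051: random initial centers
    for _ in range(max_iterations):
        sets = [[] for _ in range(k)]                    # S1052: nearest-center assignment
        for m in matrices:
            nearest = min(range(k), key=lambda i: distance(m, centers[i]))
            sets[nearest].append(m)
        centers = [mean_matrix(s) if s else c            # S1053: update each center
                   for s, c in zip(sets, centers)]
        error = sum(sum(distance(c, m) for m in s) / len(s)
                    for c, s in zip(centers, sets) if s)  # S1054: clustering error
        if error <= error_threshold:
            break                                        # S1056: output the centers
    return centers                                       # otherwise S1055: loop again

# Hypothetical 1x1 service matrices forming two well-separated groups
data = [[[0]], [[1]], [[10]], [[11]]]
result = sorted(cluster_centers(data, k=2, error_threshold=1.0))
print(result)
```

For this toy data the loop settles on one center per group regardless of which two matrices are drawn initially.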
Corresponding to the method for identifying an abnormal service request described in the foregoing embodiments, fig. 4 is a block diagram illustrating a structure of an apparatus for identifying an abnormal service request according to an embodiment of the present invention, and for convenience of explanation, only a portion related to the embodiment of the present invention is shown.
Referring to fig. 4, the apparatus includes:
An updating module 401, configured to store a service request received in a preset time period into a preset database, and delete a service request stored in the preset database and received before the preset time period, so as to update the preset database, where the service request includes a plurality of corresponding relations between data types and data values, and the data types include a time of receiving the service request;
a judging module 402, configured to judge whether a receiving time of each service request in the preset database meets a preset time distribution standard;
A conversion module 403, configured to convert, if the receiving time of each service request in the preset database meets a preset time distribution criterion, a corresponding relationship between a data type and a data value included in each service request in the preset database into a service matrix corresponding to each service request;
the calculating module 404 is configured to calculate a preset number of cluster center matrices of all service matrices, and if there is a cluster center matrix with similarity to all preset reference matrices being smaller than a similarity threshold, determine that the service request in the preset time period is abnormal.
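The decision made by the calculating module 404 can be sketched as below. The similarity measure is not specified in this passage, so the sketch assumes cosine similarity over the flattened matrices; that choice, the function names, and the sample matrices are all assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equally shaped matrices, flattened row by row."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(fa, fb))
    norm = math.sqrt(sum(x * x for x in fa)) * math.sqrt(sum(y * y for y in fb))
    return dot / norm if norm else 0.0

def requests_are_abnormal(centers, references, similarity_threshold):
    """Abnormal if some center is below the threshold against every reference matrix."""
    return any(all(cosine_similarity(c, r) < similarity_threshold for r in references)
               for c in centers)

# Hypothetical reference matrices and cluster center matrices
references = [[[1, 0], [0, 0]], [[0, 1], [0, 0]]]
normal_centers = [[[2, 0], [0, 0]]]
odd_centers = [[[2, 0], [0, 0]], [[0, 0], [0, 3]]]
print(requests_are_abnormal(normal_centers, references, 0.9))
print(requests_are_abnormal(odd_centers, references, 0.9))
```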
Optionally, the determining whether the receiving time of each service request in the preset database meets a preset time distribution standard includes:
Calculating, according to the receiving time of each service request, the number of service requests corresponding to each of a plurality of unit time periods, and generating a correspondence between the unit time periods and the number of service requests; fitting the correspondence between the unit time periods and the number of service requests in the preset time period with the linear regression model Y(n) = aX(n) + b, and calculating the linear regression coefficient by the least square method to generate the linear regression equation, where Y(n) is the number of service requests corresponding to the nth unit time period in the preset time period, X(n) is the nth unit time period in the preset time period, a is the linear regression coefficient, and b is the error coefficient; calculating the theoretical number of service requests corresponding to each unit time period through the linear regression equation, and calculating the difference between the actual number of service requests corresponding to each unit time period and the theoretical number of service requests; and if the differences corresponding to all the unit time periods are smaller than a preset difference threshold, judging that the receiving time of each service request in the preset database meets the preset time distribution standard.
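The time distribution check can be sketched as follows, assuming that X(n) is simply the period index n = 1, 2, ..., that the difference is taken as an absolute value, and that at least two unit time periods are available. The function names, the numeric receive times, and the thresholds are hypothetical.

```python
def count_per_period(receive_times, start, period, num_periods):
    """Count the requests that fall into each unit time period of the window."""
    counts = [0] * num_periods
    for t in receive_times:
        idx = int((t - start) // period)
        if 0 <= idx < num_periods:
            counts[idx] += 1
    return counts

def fit_line(xs, ys):
    """Ordinary least squares fit of Y = a*X + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def meets_time_distribution(counts, diff_threshold):
    """True if every period's count is within the threshold of the fitted line."""
    xs = list(range(1, len(counts) + 1))
    a, b = fit_line(xs, counts)
    return all(abs(y - (a * x + b)) < diff_threshold for x, y in zip(xs, counts))

counts = count_per_period([0, 1, 61, 125], start=0, period=60, num_periods=3)
a, b = fit_line([1, 2, 3, 4], [10, 12, 14, 16])
steady = meets_time_distribution([10, 12, 14, 16], diff_threshold=1)
spiky = meets_time_distribution([10, 12, 50, 16], diff_threshold=5)
print(counts, a, b, steady, spiky)
```

A steadily growing request count passes the check, while a single period with a sharp spike pulls several periods away from the fitted line and fails it.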
Optionally, the calculating a preset number of cluster center matrices of the total service matrix includes:
The preset number of service matrixes are selected randomly from all the service matrixes to serve as initial clustering center matrixes; calculating Euclidean distance from each service matrix to each clustering center matrix, and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the minimum Euclidean distance; calculating the average value of elements at the same positions in all service matrixes in a matrix set corresponding to each cluster center matrix to generate an average matrix corresponding to each cluster center matrix, and taking the average matrix corresponding to the cluster center matrix as an updated cluster center; judging whether the updated cluster center matrix meets the termination condition; if the clustering center matrix does not meet the termination condition, returning to execute the operation of calculating the Euclidean distance from each service matrix to each clustering center matrix and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the smallest Euclidean distance; and if the clustering center matrix meets the termination condition, outputting all the clustering center matrices.
Optionally, the determining whether the updated cluster center matrix meets the termination condition includes:
For each cluster center matrix, calculating the average value of the Euclidean distances between the cluster center matrix and each service matrix in the matrix set corresponding to the cluster center matrix, and taking the sum of the average values over all cluster center matrices as the clustering error; if the clustering error is larger than a preset error threshold, the updated cluster center matrix does not meet the termination condition; if the clustering error is smaller than or equal to the preset error threshold, the updated cluster center matrix meets the termination condition.
Optionally, after the determining that the service request within the preset time period is abnormal, the method further includes: and setting the clustering center matrix as the updated reference matrix.
In the embodiment of the invention, the service request received in the preset time period is stored in the preset database, and the service request received before the preset time period is deleted, so that the preset database is updated in real time; if the receiving time of each service request in the preset database accords with the preset time distribution standard, converting the corresponding relation between the data type and the data value contained in the service request into a service matrix corresponding to each service request; if the similarity between one clustering center matrix and all preset reference matrices is smaller than a similarity threshold, judging that the service request in a preset time period is abnormal, so that a user can master the abnormal condition of the service request in real time by shortening the preset time period, and accordingly corresponding measures are taken timely, and normal operation of a server is ensured.
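The database update summarized above can be sketched as a sliding window over the stored requests. The sketch assumes an in-memory list of dictionaries in place of the preset database and a numeric `receive_time` field; both are stand-ins chosen for illustration, as is the function name.

```python
def update_database(database, new_requests, window_start):
    """Drop requests received before the window and store the newly received ones."""
    kept = [r for r in database if r["receive_time"] >= window_start]
    kept.extend(r for r in new_requests if r["receive_time"] >= window_start)
    return kept

# Hypothetical stored requests and one newly received request
database = [{"id": 1, "receive_time": 100}, {"id": 2, "receive_time": 160}]
new_requests = [{"id": 3, "receive_time": 190}]
updated = update_database(database, new_requests, window_start=150)
print([r["id"] for r in updated])
```

Shortening the preset time period corresponds to moving `window_start` closer to the current time, so that fewer, more recent requests are analyzed in each round.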
Fig. 5 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 5, the terminal device 5 of this embodiment includes: a processor 50, a memory 51, and a computer program 52, such as a program for identifying abnormal service requests, that is stored in the memory 51 and executable on the processor 50. When executing the computer program 52, the processor 50 implements the steps of the above embodiments of the method for identifying abnormal service requests, such as steps 101 to 107 shown in fig. 1. Alternatively, when executing the computer program 52, the processor 50 performs the functions of the modules/units in the above apparatus embodiments, such as the functions of the modules 401 to 404 shown in fig. 4.
By way of example, the computer program 52 may be partitioned into one or more modules/units that are stored in the memory 51 and executed by the processor 50 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 52 in the terminal device 5.
The terminal device 5 may be a computing device such as a desktop computer, a notebook computer, a palm computer, a cloud terminal device, etc. The terminal device may include, but is not limited to, a processor 50, a memory 51. It will be appreciated by those skilled in the art that fig. 5 is merely an example of the terminal device 5 and does not constitute a limitation of the terminal device 5, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the terminal device may further include an input-output device, a network access device, a bus, etc.
The processor 50 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 51 may be an internal storage unit of the terminal device 5, such as a hard disk or a memory of the terminal device 5. The memory 51 may also be an external storage device of the terminal device 5, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 5. Further, the memory 51 may include both an internal storage unit and an external storage device of the terminal device 5. The memory 51 is used for storing the computer program as well as other programs and data required by the terminal device. The memory 51 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method of the above embodiments may be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer-readable storage medium.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (8)

1. A method for identifying an abnormal service request, comprising:
Storing the service request received in a preset time period into a preset database, deleting the service request stored in the preset database and received before the preset time period, so as to update the preset database, wherein the service request comprises a plurality of corresponding relations between data types and data values, and the data types comprise the receiving time of the service request;
Judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard;
If the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request;
Calculating a preset number of cluster center matrixes of all service matrixes, and if the similarity between one cluster center matrix and all preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal;
the calculating the preset number of clustering center matrixes of all the service matrixes comprises the following steps:
The preset number of service matrixes are selected randomly from all the service matrixes to serve as initial clustering center matrixes;
calculating Euclidean distance from each service matrix to each clustering center matrix, and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the minimum Euclidean distance;
Calculating the average value of elements at the same positions in all service matrixes in a matrix set corresponding to each cluster center matrix to generate an average matrix corresponding to each cluster center matrix, and taking the average matrix corresponding to the cluster center matrix as an updated cluster center matrix;
Judging whether the updated cluster center matrix meets the termination condition;
If the clustering center matrix does not meet the termination condition, returning to execute the operation of calculating the Euclidean distance from each service matrix to each clustering center matrix and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the smallest Euclidean distance;
and if the clustering center matrix meets the termination condition, outputting all the clustering center matrices.
2. The method for identifying abnormal service requests according to claim 1, wherein said determining whether the reception time of each of the service requests in the preset database meets a preset time distribution criterion comprises:
Calculating the number of the service requests corresponding to each of a plurality of unit time periods according to the receiving time of each service request, and generating a corresponding relation between the unit time periods and the number of the service requests;
By the linear regression model Y(n) = aX(n) + b, fitting the corresponding relation between the unit time period and the number of service requests in the preset time period, and calculating the linear regression coefficient of the linear regression equation according to a least square method to generate the linear regression equation; wherein Y(n) is the number of service requests corresponding to the nth unit time period in the preset time period, X(n) is the nth unit time period in the preset time period, a is the linear regression coefficient, and b is the error coefficient;
Calculating the theoretical number of service requests corresponding to each unit time period through the linear regression equation, and calculating the difference value between the number of service requests corresponding to each unit time period and the theoretical number of service requests;
And if the difference value corresponding to all the unit time periods is smaller than a preset difference value threshold value, judging that the receiving time of each service request in the preset database accords with a preset time distribution standard.
3. The method for identifying an abnormal service request according to claim 1, wherein the determining whether the updated cluster center matrix satisfies the termination condition comprises:
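For each cluster center matrix, calculating the average value of the Euclidean distances between the cluster center matrix and each service matrix in the matrix set corresponding to the cluster center matrix, and taking the sum of the average values over all cluster center matrices as the clustering error;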
if the clustering error is larger than a preset error threshold, the updated clustering center matrix does not meet the termination condition;
If the clustering error is smaller than or equal to a preset error threshold, the updated clustering center matrix meets the termination condition.
4. The method for identifying an abnormal service request according to claim 1, further comprising, after said determining that the service request within the preset time period is abnormal:
and setting the clustering center matrix as the updated reference matrix.
5. A terminal device comprising a memory and a processor, said memory storing a computer program executable on said processor, characterized in that said processor, when executing said computer program, performs the steps of:
Storing the service request received in a preset time period into a preset database, deleting the service request stored in the preset database and received before the preset time period, so as to update the preset database, wherein the service request comprises a plurality of corresponding relations between data types and data values, and the data types comprise the receiving time of the service request;
Judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard;
If the receiving time of each service request in the preset database accords with a preset time distribution standard, converting the corresponding relation between the data type and the data value contained in each service request in the preset database into a service matrix corresponding to each service request;
Calculating a preset number of cluster center matrixes of all service matrixes, and if the similarity between one cluster center matrix and all preset reference matrixes is smaller than a similarity threshold value, judging that the service request in the preset time period is abnormal;
the calculating the preset number of clustering center matrixes of all the service matrixes comprises the following steps:
The preset number of service matrixes are selected randomly from all the service matrixes to serve as initial clustering center matrixes;
calculating Euclidean distance from each service matrix to each clustering center matrix, and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the minimum Euclidean distance;
Calculating the average value of elements at the same positions in all service matrixes in a matrix set corresponding to each cluster center matrix to generate an average matrix corresponding to each cluster center matrix, and taking the average matrix corresponding to the cluster center matrix as an updated cluster center matrix;
Judging whether the updated cluster center matrix meets the termination condition;
If the clustering center matrix does not meet the termination condition, returning to execute the operation of calculating the Euclidean distance from each service matrix to each clustering center matrix and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the smallest Euclidean distance;
and if the clustering center matrix meets the termination condition, outputting all the clustering center matrices.
6. The terminal device according to claim 5, wherein the determining whether the reception time of each service request in the preset database meets a preset time distribution criterion comprises:
Calculating the number of the service requests corresponding to each of a plurality of unit time periods according to the receiving time of each service request, and generating a corresponding relation between the unit time periods and the number of the service requests;
By the linear regression model Y(n) = aX(n) + b, fitting the corresponding relation between the unit time period and the number of service requests in the preset time period, and calculating the linear regression coefficient of the linear regression equation according to a least square method to generate the linear regression equation; wherein Y(n) is the number of service requests corresponding to the nth unit time period in the preset time period, X(n) is the nth unit time period in the preset time period, a is the linear regression coefficient, and b is the error coefficient;
Calculating the theoretical number of service requests corresponding to each unit time period through the linear regression equation, and calculating the difference value between the number of service requests corresponding to each unit time period and the theoretical number of service requests;
And if the difference value corresponding to all the unit time periods is smaller than a preset difference value threshold value, judging that the receiving time of each service request in the preset database accords with a preset time distribution standard.
7. An apparatus for identifying an abnormal service request, the apparatus comprising:
The updating module is used for storing the service requests received in the preset time period into a preset database, deleting the service requests stored in the preset database and received before the preset time period so as to update the preset database, wherein the service requests comprise the corresponding relation between a plurality of data types and data values, and the data types comprise the receiving time of the service requests;
the judging module is used for judging whether the receiving time of each service request in the preset database accords with a preset time distribution standard;
The conversion module is used for converting the corresponding relation between the data types and the data values contained in each service request in the preset database into a service matrix corresponding to each service request if the receiving time of each service request in the preset database accords with a preset time distribution standard;
The calculation module is used for calculating the preset number of the clustering center matrixes of all the service matrixes, and judging that the service request in the preset time period is abnormal if the similarity between one clustering center matrix and all the preset reference matrixes is smaller than a similarity threshold value;
the calculating the preset number of clustering center matrixes of all the service matrixes comprises the following steps:
The preset number of service matrixes are selected randomly from all the service matrixes to serve as initial clustering center matrixes;
calculating Euclidean distance from each service matrix to each clustering center matrix, and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the minimum Euclidean distance;
Calculating the average value of elements at the same positions in all service matrixes in a matrix set corresponding to each cluster center matrix to generate an average matrix corresponding to each cluster center matrix, and taking the average matrix corresponding to the cluster center matrix as an updated cluster center matrix;
Judging whether the updated cluster center matrix meets the termination condition;
If the clustering center matrix does not meet the termination condition, returning to execute the operation of calculating the Euclidean distance from each service matrix to each clustering center matrix and classifying each service matrix into a matrix set corresponding to the clustering center matrix with the smallest Euclidean distance;
and if the clustering center matrix meets the termination condition, outputting all the clustering center matrices.
8. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 4.
CN201811249394.9A 2018-10-25 2018-10-25 Abnormal service request identification method and terminal equipment Active CN109492394B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811249394.9A CN109492394B (en) 2018-10-25 2018-10-25 Abnormal service request identification method and terminal equipment
PCT/CN2018/124341 WO2020082588A1 (en) 2018-10-25 2018-12-27 Method and apparatus for identifying abnormal service request, electronic device, and medium

Publications (2)

Publication Number Publication Date
CN109492394A CN109492394A (en) 2019-03-19
CN109492394B true CN109492394B (en) 2024-05-03

Family

ID=65691882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811249394.9A Active CN109492394B (en) 2018-10-25 2018-10-25 Abnormal service request identification method and terminal equipment

Country Status (2)

Country Link
CN (1) CN109492394B (en)
WO (1) WO2020082588A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059013B (en) * 2019-04-24 2023-06-23 北京百度网讯科技有限公司 Method and device for determining normal operation after software upgrading
CN110222504B (en) * 2019-05-21 2024-02-13 平安银行股份有限公司 User operation monitoring method, device, terminal equipment and medium
CN110619019B (en) * 2019-08-07 2024-03-15 平安科技(深圳)有限公司 Distributed storage method and system for data
CN111079653B (en) * 2019-12-18 2024-03-22 中国工商银行股份有限公司 Automatic database separation method and device
CN111506829B (en) * 2020-03-20 2023-08-25 微梦创科网络科技(中国)有限公司 Abnormal attention behavior batch real-time identification method and device
CN112232771B (en) * 2020-10-17 2021-06-01 力合科创集团有限公司 Big data analysis method and big data cloud platform applied to smart government-enterprise cloud service
CN112560085B (en) * 2020-12-10 2023-09-19 支付宝(杭州)信息技术有限公司 Privacy protection method and device for business prediction model
CN113630425B (en) * 2021-10-08 2022-01-07 国网浙江省电力有限公司金华供电公司 Financial data safe transmission method for multiple power bodies
CN116484230B (en) * 2023-06-20 2023-09-01 世优(北京)科技有限公司 Method for identifying abnormal business data and training method of AI digital person

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469276A (en) * 2015-08-19 2017-03-01 阿里巴巴集团控股有限公司 The kind identification method of data sample and device
CN107256257A (en) * 2017-06-12 2017-10-17 上海携程商务有限公司 Abnormal user generation content identification method and system based on business datum
CN107302547A (en) * 2017-08-21 2017-10-27 深信服科技股份有限公司 A kind of web service exceptions detection method and device
CN108491301A (en) * 2018-02-01 2018-09-04 平安科技(深圳)有限公司 Electronic device, the abnormity early warning method based on redis and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104065526B (en) * 2013-03-22 2019-04-23 腾讯科技(深圳)有限公司 A kind of method and apparatus of server failure alarm
WO2014188500A1 (en) * 2013-05-20 2014-11-27 富士通株式会社 Data stream processing parallelization program, and data stream processing parallelization system
CN104917643B (en) * 2014-03-11 2019-02-01 腾讯科技(深圳)有限公司 Abnormal account detection method and device
CN108289077B (en) * 2017-01-09 2021-09-21 中兴通讯股份有限公司 Method and device for carrying out fuzzy detection analysis on WEB server security
CN108595300A (en) * 2018-03-21 2018-09-28 北京奇艺世纪科技有限公司 A kind of method and device of configurable monitoring and alarm

Also Published As

Publication number Publication date
CN109492394A (en) 2019-03-19
WO2020082588A1 (en) 2020-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant