WO2023109046A1 - Anomaly detection method and apparatus, electronic device, and storage medium - Google Patents

Anomaly detection method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023109046A1
Authority
WO
WIPO (PCT)
Prior art keywords
access
feature value
dimension
access address
set dimension
Prior art date
Application number
PCT/CN2022/098734
Other languages
French (fr)
Chinese (zh)
Inventor
杨春保
卢道和
谢波
朱敏毅
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023109046A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/552: Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462: Approximate or statistical queries

Definitions

  • the present application relates to the field of computer technology, and in particular to an anomaly detection method, device, electronic equipment and storage medium.
  • the log software development kit (SDK, Software Development Kit) reports user behavior logs to the log server; the log server analyzes the user behavior logs reported by each business system, determines the abnormal behavior corresponding to each business system, and sends the determined abnormal behavior to the corresponding business system so that the corresponding business system can handle the abnormal behavior.
  • SDK Software Development Kit
  • embodiments of the present application provide an anomaly detection method, device, electronic equipment, and storage medium.
  • the embodiment of the present application provides an anomaly detection method, including:
  • the first access address represents an access address in any access information in the first access log, and is used to access any one of at least two set business systems; the first access log is used to record, in real time, the access information of access requests passing through the built-in application programming interface (Application Programming Interface, API) gateway within a unit duration;
  • API Application Programming Interface
  • the first feature value corresponding to each unit time length in the set statistical period is determined based on the history log corresponding to the corresponding unit time length.
  • the access information at least includes user identification, access time and access address; determining the characteristic value corresponding to the access address includes:
  • the second access log includes the first access log or the historical access log corresponding to each unit duration within the set statistical period; the first sequence includes all access addresses corresponding to the first user in the second access log;
  • the second sequence includes at least one access address
  • the hash value is calculated to obtain the feature value corresponding to the access address at the set position in the second sequence corresponding to the first user.
  • the calculation of the hash value includes:
  • when the time interval between the third access address and the second access address is greater than or equal to the set duration, the third access address in the second sequence is replaced with the set character string; wherein the second access address represents the access address at the set position in the second sequence, and the third access address represents any access address adjacent to the second access address in the second sequence;
  • a hash value is calculated based on the updated second sequence.
  • the second sequence includes three access addresses, and the set position represents an intermediate position.
  • the access information also includes the department to which the user belongs, the position of the user, and the identification of the terminal device sending the access address;
  • the at least one set dimension includes at least one of the following:
  • the first dimension represents counting the number of occurrences of feature values by user
  • the second dimension represents counting the number of occurrences of feature values by department
  • the third dimension represents counting the number of occurrences of feature values by position
  • the fourth dimension represents counting the number of occurrences of feature values by the terminal device used by the user.
  • the determining whether the first access address is abnormal includes:
  • a plurality of second scores corresponding to the first access address are determined based on the first score corresponding to the first feature value in each set dimension within each unit duration; wherein each second score is determined from the first scores corresponding to the first feature value in all set dimensions within one unit duration;
  • the determination of the first score corresponding to the first feature value in each set dimension within each unit duration includes:
  • the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration is determined; the first set dimension represents any set dimension in the at least one set dimension; the first unit duration represents any unit duration in the set statistical period;
  • the first score corresponding to the first feature value in the first set dimension is determined based on the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration.
  • the determination of the second occurrence times corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
  • the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the ratio of the historical number of occurrences of the first feature value in the first set dimension within the first unit duration to the total number of unit durations within the set statistical period;
  • the weight corresponding to the first unit duration is greater than the weight corresponding to the second unit duration.
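  • The ratio-based option above can be illustrated with a minimal sketch. The function name `second_occurrences` and its parameter names are hypothetical, and the weighted alternative is not shown because its weights are not specified here.

```python
def second_occurrences(historical_count: int, total_unit_durations: int) -> float:
    """Ratio of one unit duration's historical count to the number of
    unit durations in the set statistical period (hypothetical helper)."""
    if total_unit_durations <= 0:
        raise ValueError("total_unit_durations must be positive")
    return historical_count / total_unit_durations


# Assumed example: a feature value seen 14 times on one day of a 7-day period.
print(second_occurrences(14, 7))  # 2.0
```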
  • the determination of the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
  • when the first difference is equal to zero, a first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the set score;
  • when the first difference is not equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the ratio of the second difference to the first difference;
  • the first difference represents the difference between the maximum historical number of occurrences corresponding to the first feature value in the set statistical period and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration;
  • the second difference represents the difference between the first number of occurrences corresponding to the first feature value in the first set dimension and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration.
  • the determination of the first score corresponding to the first feature value in the first set dimension within the first unit time length based on the set score includes one of the following:
  • the first set score corresponding to the first set dimension is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration;
  • the second set score is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • the determination of the first score corresponding to the first feature value in the first set dimension within the first unit duration based on the ratio of the second difference to the first difference includes one of the following:
  • when the ratio of the second difference to the first difference is less than or equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined to be zero;
  • when the ratio of the second difference to the first difference is greater than zero, the quotient of the second difference and the first difference is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
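  • The scoring rules above can be read as a small piecewise function. The following is a sketch under assumptions, not the applicant's implementation: the name `first_score`, the parameter names, and the default `set_score=1.0` are hypothetical, since the source does not give the value of the set score.

```python
def first_score(first_count: float,
                second_count: float,
                max_historical: float,
                set_score: float = 1.0) -> float:
    """First score of a feature value in one set dimension for one unit duration.

    first_count    -- first number of occurrences (current access log)
    second_count   -- second number of occurrences (derived from history)
    max_historical -- maximum historical number of occurrences in the period
    set_score      -- fallback when the first difference is zero (assumed value)
    """
    first_difference = max_historical - second_count
    second_difference = first_count - second_count
    if first_difference == 0:
        # The ratio is undefined, so the set score is used instead.
        return set_score
    ratio = second_difference / first_difference
    # A ratio at or below zero maps to a score of zero.
    return ratio if ratio > 0 else 0.0


print(first_score(8, 2, 10))   # 0.75
print(first_score(1, 2, 10))   # 0.0
print(first_score(5, 10, 10))  # 1.0
```

  • In this reading, a current count far above the historical level pushes the score up, while a count at or below the historical level scores zero.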
  • the multiple second scores corresponding to the first access address are determined based on the first scores corresponding to the first feature value in each set dimension within each unit duration, including:
  • the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined as the second score corresponding to the first access address in the first unit duration;
  • the method also includes:
  • the access operation corresponding to the first access address is blocked, and/or, access restriction is performed on the terminal device sending the first access address.
  • the embodiment of the present application also provides an abnormality detection device, including:
  • the first determination unit is configured to determine, based on the first access log, the first feature value corresponding to the first access address, and to determine the first number of occurrences of the first feature value corresponding to each set dimension in at least one set dimension; the first access address represents the access address in any access information in the first access log, and is used to access any one of the at least two set business systems; the first access log is used to record, in real time, the access information of access requests passing through the built-in application programming interface (API) gateway within the unit duration;
  • the second determining unit is configured to determine whether the first access address is abnormal based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration in the set statistical period; wherein the first feature value corresponding to each unit duration in the set statistical period is determined based on the historical log corresponding to that unit duration.
  • An embodiment of the present application also provides an electronic device, including: a processor and a memory configured to store a computer program that can run on the processor,
  • the processor is configured to execute the steps of the above abnormality detection method when running the computer program.
  • the embodiment of the present application also provides a storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above abnormality detection method are implemented.
  • based on the first access log, the first feature value corresponding to the first access address is determined, and the first number of occurrences of the first feature value corresponding to each set dimension in at least one set dimension is determined;
  • based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration in the set statistical period, whether the first access address is abnormal is determined. In this way, access requests to the business systems via the built-in API gateway can be detected directly, and the first access log is generated based on the access information of the detected access requests.
  • the timeliness of collecting the access log is improved, and therefore the timeliness of anomaly detection is improved; since the number of occurrences corresponding to the first feature value can reflect the user's behavior habits, this solution can improve the accuracy of anomaly detection results.
  • FIG. 1 is a schematic diagram of an implementation process of an anomaly detection method in the related art
  • FIG. 2 is a schematic diagram of the implementation flow of the abnormality detection method provided by the embodiment of the present application.
  • FIG. 3 is a schematic diagram of an implementation flow for determining a characteristic value corresponding to an access address provided in an embodiment of the present application
  • FIG. 4 is a schematic diagram of an implementation flow for determining whether the first access address is abnormal according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of an implementation process for determining the first score provided by the embodiment of the present application.
  • FIG. 6 is a schematic diagram of the implementation flow of the abnormality detection method provided by the application embodiment of the present application.
  • FIG. 7 is a schematic diagram of an abnormality detection system provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an abnormality detection device provided in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a hardware composition structure of an electronic device provided by an embodiment of the present application.
  • each business system needs to integrate an SDK for collecting user behavior logs, which makes the cost of collecting user behavior logs relatively high; each business system needs to report user behavior logs to the log server through its own integrated SDK, which makes the timeliness of collecting user behavior logs poor.
  • the log server analyzes the user behavior logs reported by various business systems based on rules set according to human experience, so as to determine abnormal behaviors, which leads to low accuracy in determining the analyzed abnormal behaviors.
  • the present application provides an anomaly detection method. Based on the first access log, the first feature value corresponding to the first access address is determined, and the first number of occurrences of the first feature value corresponding to each set dimension in at least one set dimension is determined; based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration in the set statistical period, whether the first access address is abnormal is determined. In this way, access requests to the business systems via the built-in API gateway can be detected directly, and the first access log is generated based on the access information of the detected access requests.
  • the timeliness of collecting the access log is improved, and therefore the timeliness of anomaly detection is improved; since the number of occurrences corresponding to the first feature value can reflect the user's behavior habits, this solution can improve the accuracy of anomaly detection results.
  • FIG. 2 is a schematic diagram of an implementation flow of an anomaly detection method provided by an embodiment of the present application, wherein the execution subject of the flow is an electronic device such as a terminal device or a server.
  • the anomaly detection method includes:
  • Step 201: Based on the first access log, determine the first feature value corresponding to the first access address, and determine the first number of occurrences of the first feature value corresponding to each set dimension in at least one set dimension;
  • the first access address is an access address in any access information in the first access log, and is used to access any one of the at least two set business systems;
  • the first access log is used to record, in real time, the access information of access requests passing through the built-in application programming interface (API) gateway within a unit duration.
  • the electronic device is provided with an API gateway, and the user triggers a login request or an access request related to the setting service system through the terminal device, and reaches the server managing the corresponding setting service system through the API gateway.
  • the access request may be a cross-system access request or a non-cross-system access request. That is to say, an access request triggered on the interactive interface of any set business system can be used to request access to related functions in that set business system, or to request access to related functions of a set business system other than that one.
  • the API gateway can be APISIX.
  • the user may click on a relevant function or button on the interactive page of the setting business system to trigger an access request.
  • when the electronic device detects an access request via the API gateway, it writes the access information of the access request into a first access log; the first access log is used to record, in real time, the access information carried in access requests passing through the API gateway within a unit duration, and the access information in the first access log is updated in real time.
  • the access information includes at least user identification, access time and access address.
  • the unit duration is one day. That is to say, the electronic device creates an access log every day, which is used to record the access information carried in all the access requests detected on that day.
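  • As an illustration of one access log per unit duration, the sketch below keeps access information in per-day buckets. The class and field names (`AccessInfo`, `DailyAccessLogs`, `user_id`, and so on) are hypothetical, and an in-memory dictionary stands in for whatever log storage the electronic device actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class AccessInfo:
    """Minimum access information named in the text (field names assumed)."""
    user_id: str
    access_time: datetime
    access_address: str


class DailyAccessLogs:
    """One access log per day, appended to as requests pass the gateway."""

    def __init__(self) -> None:
        self._logs: dict[date, list[AccessInfo]] = defaultdict(list)

    def record(self, info: AccessInfo) -> None:
        # The calendar day of the access time selects the log to write to.
        self._logs[info.access_time.date()].append(info)

    def for_day(self, day: date) -> list[AccessInfo]:
        return list(self._logs.get(day, []))


logs = DailyAccessLogs()
logs.record(AccessInfo("u1", datetime(2022, 6, 1, 9, 0), "http://xxxx/1/2"))
logs.record(AccessInfo("u1", datetime(2022, 6, 2, 9, 5), "http://xxxx/1/3"))
```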
  • the implementation of determining the first feature value corresponding to the first access address includes:
  • the electronic device determines a first sequence corresponding to each user identifier based on the access information in the first access log, where the first sequence corresponding to each user identifier includes all access addresses corresponding to that user identifier;
  • based on the first sequence corresponding to each user identifier, the electronic device determines the feature value corresponding to each access address in that first sequence.
  • the feature value corresponding to each access address of each user identifier may be determined based on that access address alone, or based on that access address together with at least one access address adjacent to it in the first sequence.
  • based on every three adjacent access addresses in the first sequence, the feature value corresponding to the access address in the middle position of those three adjacent access addresses is determined. When calculating the feature value corresponding to the access address at the first position in the first sequence, the set empty string is used to replace the missing access address, or the feature value is determined based on that access address and the access address adjacent to it.
  • it should be noted that the feature value represents the characteristic of the access address.
  • the feature value corresponding to the access address includes a hash value.
  • the electronic device determines the first feature value corresponding to the first access address from the determined feature values corresponding to the access addresses; based on the first feature value corresponding to each user, it determines the first number of occurrences of the first feature value corresponding to each set dimension in at least one set dimension.
  • the first data table can be used to record the access information corresponding to each user, the feature value corresponding to the access address in the access information, and the occurrence times of each feature value within a unit time length.
  • the first data table is as follows:
  • the access information also includes the department to which the user belongs, the position of the user, and the identification of the terminal device that sends the access address;
  • the at least one set dimension includes at least one of the following:
  • the first dimension represents the number of occurrences of feature values by user statistics; that is, based on all feature values corresponding to each user, the number of occurrences of each feature value corresponding to each user is counted;
  • the second dimension represents the number of occurrences of feature values by department; that is, counts the number of occurrences of each feature value based on all feature values corresponding to each department;
  • the third dimension represents the number of occurrences of feature values by post; that is, counts the number of occurrences of each feature value based on all feature values corresponding to each post;
  • the fourth dimension represents the number of occurrences of feature values based on the terminal devices used by users; that is, based on all feature values corresponding to each terminal device, the number of occurrences of each feature value corresponding to each terminal device is counted.
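  • The four set dimensions amount to grouping the same feature values by different keys. A minimal sketch follows, assuming each record is a dictionary whose keys (`user_id`, `department`, `position`, `device_id`, `feature_value`) are hypothetical names for the fields described above.

```python
from collections import Counter

# Hypothetical record keys for the four set dimensions.
DIMENSIONS = ("user_id", "department", "position", "device_id")


def count_by_dimension(records):
    """Count occurrences of each feature value per dimension member.

    Returns {dimension: Counter({(member, feature_value): count})}.
    """
    counts = {dim: Counter() for dim in DIMENSIONS}
    for rec in records:
        for dim in DIMENSIONS:
            counts[dim][(rec[dim], rec["feature_value"])] += 1
    return counts


records = [
    {"user_id": "u1", "department": "risk", "position": "analyst",
     "device_id": "d1", "feature_value": "abc"},
    {"user_id": "u2", "department": "risk", "position": "manager",
     "device_id": "d2", "feature_value": "abc"},
]
counts = count_by_dimension(records)
print(counts["department"][("risk", "abc")])  # 2
print(counts["user_id"][("u1", "abc")])       # 1
```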
  • Embodiment 2: Considering that, in the application scenario of tracking a user's cross-system behavior, the sequence of access addresses produced when a user accesses any function in any business system is fixed, in order to determine abnormal requests more accurately, as shown in FIG. 3, in some embodiments the following steps 301 to 303 are used to determine the feature value corresponding to each access address:
  • Step 301: Based on the access information in the second access log, determine the first sequence corresponding to the first user; wherein the second access log includes the first access log or the historical access log corresponding to each unit duration within the set statistical period; the first sequence includes all access addresses corresponding to the first user in the second access log.
  • the electronic device determines the first sequence corresponding to the first user in the second access log based on the user identifier and access time included in each access information in the second access log.
  • the first user refers to any user in the second access log.
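  • Step 301 can be sketched as a group-by on the user identifier followed by a sort on access time. The tuple layout `(user_id, access_time, access_address)` and the function name `first_sequences` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def first_sequences(access_log):
    """Group access addresses by user and order them by access time.

    access_log -- iterable of (user_id, access_time, access_address) tuples
    Returns {user_id: [(access_time, access_address), ...]}.
    """
    per_user = defaultdict(list)
    for user_id, access_time, address in access_log:
        per_user[user_id].append((access_time, address))
    # sorted() is stable, so equal timestamps keep their log order.
    return {user: sorted(entries, key=lambda e: e[0])
            for user, entries in per_user.items()}


log = [
    ("u1", datetime(2022, 6, 1, 9, 2), "http://xxxx/1/3"),
    ("u2", datetime(2022, 6, 1, 9, 1), "http://xxxx/2/1"),
    ("u1", datetime(2022, 6, 1, 9, 0), "http://xxxx/1/2"),
]
sequences = first_sequences(log)
```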
  • Step 302 From the first sequence corresponding to the first user, determine the second sequence corresponding to the first user; the second sequence includes at least one access address.
  • the number of access addresses included in the first sequence corresponding to the first user is greater than or equal to the number of access addresses included in the corresponding second sequence.
  • Step 303 Calculate the hash value based on the second sequence corresponding to the first user, and obtain the characteristic value corresponding to the access address at the set position in the second sequence corresponding to the first user.
  • the electronic device can calculate a hash value based on each access address in the second sequence corresponding to the first user through a set algorithm, and obtain the feature value corresponding to the access address at the set position in the second sequence corresponding to the first user.
  • the set algorithm is an algorithm used to calculate a hash value, including a message digest algorithm or a hash algorithm.
  • the access address includes a domain name and a service system identifier, and may also include a function identifier.
  • the access address may be http://xxxx/1/2. Among them, xxxx represents the domain name, 1 represents the service system identification, and 2 represents the function identification.
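  • For the example address above, the three parts can be pulled apart with the standard library. This sketch assumes the service system identifier and function identifier are always the first and second path segments, which the source shows only for this one example; the function name `parse_access_address` is hypothetical.

```python
from urllib.parse import urlsplit


def parse_access_address(address: str):
    """Split an access address into (domain, system_id, function_id).

    Assumes the layout http://<domain>/<system_id>/<function_id>;
    the function identifier is optional and may be None.
    """
    parts = urlsplit(address)
    segments = [s for s in parts.path.split("/") if s]
    system_id = segments[0] if segments else None
    function_id = segments[1] if len(segments) > 1 else None
    return parts.hostname, system_id, function_id


print(parse_access_address("http://xxxx/1/2"))  # ('xxxx', '1', '2')
```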
  • the electronic device determines the first characteristic value corresponding to the first access address by performing steps 301 to 303 .
  • the first access address is the access address at the set position in the second sequence.
  • the electronic device determines the feature value corresponding to each access address for each unit duration within the set statistical period by performing steps 301 to 303.
  • said calculating the hash value includes:
  • when the time interval between the third access address and the second access address is greater than or equal to the set duration, the third access address in the second sequence is replaced with the set character string; wherein the second access address represents the access address at the set position in the second sequence, and the third access address represents an access address adjacent to the second access address in the second sequence;
  • a hash value is calculated based on each access address in the updated second sequence.
  • the electronic device determines the time interval between the third access address and the second access address based on the access time corresponding to the third access address and the access time corresponding to the second access address; when the calculated time interval is greater than or equal to the set duration, the third access address in the second sequence is replaced with the set string; and a hash value is calculated based on each access address and set string in the updated second sequence.
  • the set string is an empty string set according to the format of the access address.
  • the set duration is 5 minutes, of course, the set duration can also be set according to the actual situation in actual application.
  • the second sequence includes three access addresses, and the set position represents an intermediate position. In this way, both the calculation amount for calculating the feature value and the accuracy of the abnormality detection result can be taken into account.
  • the second sequence is $V_1 V_2 V_3$;
  • $\mathrm{MD5}(V_1 + V_2 + V_3)$ represents the feature value corresponding to the access address $V_2$, calculated through Message-Digest Algorithm 5 (MD5).
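  • The three-address window, the 5-minute set duration, and the MD5 step can be combined in one sketch. Several points are assumptions rather than statements from the source: "+" is read as plain string concatenation, the set string is taken to be `""`, a missing neighbour at either end of the sequence is treated like a replaced one, and the function name `feature_values` is hypothetical.

```python
import hashlib
from datetime import datetime, timedelta

SET_STRING = ""                       # assumed form of the set (empty) string
SET_DURATION = timedelta(minutes=5)   # set duration given in the text


def feature_values(sequence):
    """MD5 feature value for each address, using its two neighbours.

    sequence -- list of (access_time, access_address), ordered by time
    Returns a list of (access_address, md5_hex_digest).
    """
    result = []
    for i, (time_i, address) in enumerate(sequence):
        window = []
        for j in (i - 1, i, i + 1):
            if j == i:
                window.append(address)
            elif 0 <= j < len(sequence) and abs(sequence[j][0] - time_i) < SET_DURATION:
                window.append(sequence[j][1])
            else:
                # Missing neighbour, or one at least SET_DURATION away.
                window.append(SET_STRING)
        digest = hashlib.md5("".join(window).encode("utf-8")).hexdigest()
        result.append((address, digest))
    return result


seq = [
    (datetime(2022, 6, 1, 9, 0), "http://xxxx/1/1"),
    (datetime(2022, 6, 1, 9, 1), "http://xxxx/1/2"),
    (datetime(2022, 6, 1, 9, 10), "http://xxxx/1/3"),
]
values = feature_values(seq)
```

  • With an empty set string, plain concatenation cannot tell a replaced predecessor from a replaced successor in every case; a separator between window elements would avoid that, at the cost of departing from the $V_1 + V_2 + V_3$ form.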
  • API gateway can also be used to call related plug-ins or related services in the electronic device to verify the login request or access request of the set business system.
  • the access information carried in the access request is written into the access log.
  • the user enters information such as user name and password on the login interface of the first business system to trigger a login request; when the electronic device detects a login request via the API gateway, it calls the first plug-in or the first service in the electronic device through the API gateway, so as to authenticate the login request through the first plug-in or the first service; when the identity verification passes, the user is allowed to log in to the first business system, and the login request is sent to the server that manages the first business system; when the identity verification fails, the user is not allowed to log in to the first business system, and a prompt message indicating that the identity verification failed is returned to the terminal device that sent the login request.
  • the first service system generally refers to any set service system in at least one set service system; the first plug-in represents a plug-in used for identity verification, and the first service represents a service used for identity verification.
  • the user can click a relevant function or button on the interactive page of the first business system to trigger an access request.
  • when the electronic device detects an access request via the API gateway, it invokes a second plug-in or a second service in the electronic device through the API gateway, and performs permission verification on the detected access request through the second plug-in or the second service;
  • when the permission verification passes, the user is allowed to access the corresponding function of the first business system, and the access request is sent to the server that manages the first business system;
  • when the permission verification fails, the user is not allowed to access the corresponding function of the first business system, and prompt information indicating that there is no access right is returned to the terminal device that sent the access request.
  • the second plug-in represents a plug-in for authorization verification
  • the second service represents a service for authorization verification.
  • the electronic device stores the user's set access path corresponding to each set business system
  • the set access path refers to the access path with access authority
  • the set access path can be dynamically updated.
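  • One possible reading of the set access paths is a per-user, per-system allow-list that can be replaced at runtime. The sketch below is illustrative only: the class `AccessPathRegistry`, its methods, and the exact-match comparison are assumptions and are not taken from the source or from any gateway plug-in API.

```python
class AccessPathRegistry:
    """Allow-list of access paths per (user, business system); hypothetical."""

    def __init__(self) -> None:
        self._allowed: dict[tuple[str, str], set[str]] = {}

    def update(self, user_id: str, system_id: str, paths) -> None:
        # Replacing the whole set models a dynamic update of the set paths.
        self._allowed[(user_id, system_id)] = set(paths)

    def is_allowed(self, user_id: str, system_id: str, path: str) -> bool:
        # Exact match is an assumption; prefix or pattern rules are possible.
        return path in self._allowed.get((user_id, system_id), set())


registry = AccessPathRegistry()
registry.update("u1", "1", ["/1/2", "/1/3"])
print(registry.is_allowed("u1", "1", "/1/2"))  # True
print(registry.is_allowed("u1", "1", "/1/9"))  # False
```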
  • the process of verifying the authority of the detected access request can be as follows:
  • Step 202 Based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the first feature value corresponding to each unit duration in the set statistical period in each set dimension The number of historical occurrences determines whether the first access address is abnormal; wherein, the first characteristic value corresponding to each unit time length in the set statistical period is determined based on the history log corresponding to the corresponding unit time length.
  • the electronic device determines the first sequence corresponding to each user identifier in each unit duration based on the access information in the historical access log corresponding to each unit duration in the set statistical period;
  • based on the first sequence corresponding to each user identifier in each unit duration, the electronic device determines the feature value corresponding to each access address of each user identifier in that first sequence; based on the feature value corresponding to each access address of each user identifier in the first sequence corresponding to each unit duration, it determines the historical number of occurrences of each feature value in each set dimension within each unit duration; and from the historical numbers of occurrences of each feature value in each set dimension within each unit duration, it determines the historical number of occurrences of the first feature value in each set dimension within each unit duration.
  • the electronic device determines whether the first access address is abnormal.
  • the electronic device may also calculate the score corresponding to the first access address based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the historical number of occurrences of the first feature value in each set dimension within each unit duration, and determine, based on the score corresponding to the first access address, whether the first access address is an abnormal address.
  • the setting threshold is set based on the abnormal access address.
  • the method for determining the characteristic value corresponding to the access address based on the historical access log is the same as the method for determining the characteristic value corresponding to the access address based on the first access log, and details are not described here.
  • the determining whether the first access address is abnormal includes the following steps 401 to 403:
  • Step 401 Based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period, determine the first score corresponding to the first feature value in each set dimension within each unit duration.
  • Based on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period, the electronic device determines the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in each set dimension within each unit duration; then, based on the first number of occurrences of the first feature value in each set dimension, the maximum historical number of occurrences of the first feature value in the first set dimension within the set statistical period, and the second number of occurrences of the first feature value in each set dimension within each unit duration, it determines the first score corresponding to the first feature value in each set dimension within each unit duration.
  • the first set dimension is any set dimension in at least one set dimension.
  • the second number of occurrences is related to the total number of unit time lengths included in the set statistical period, and/or, the weight corresponding to each unit time length of the first feature value. The later the time corresponding to the unit duration in the set statistical period, the greater the weight corresponding to the unit duration.
  • Determining the first score corresponding to the first feature value in each set dimension within each unit duration includes the following steps 501 to 502:
  • Step 501 Based on the historical number of occurrences of the first feature value in the first set dimension for the first unit duration, determine the second number of occurrences of the first feature value in the first set dimension within the first unit duration; the first set dimension represents any set dimension in the at least one set dimension; the first unit duration represents any unit duration in the set statistical period.
  • The electronic device determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on the historical number of occurrences of the first feature value in the first set dimension for the first unit duration and the total number of unit durations included in the set statistical period.
  • Determining the second number of occurrences of the first feature value in the first set dimension within the first unit duration includes one of the following:
  • the weight corresponding to the first unit duration is greater than the weight corresponding to the second unit duration.
  • The electronic device determines the ratio of the historical number of occurrences of the first feature value in the first set dimension within the first unit duration to the total number of unit durations included in the set statistical period, and, based on this ratio, determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
  • C represents the second occurrence number corresponding to the first feature value in the first set dimension within the first unit duration
  • Cn represents the historical occurrence number corresponding to the first feature value of the first unit duration in the first set dimension
  • m represents the total number of unit duration included in the set statistical period.
  • the calculated C may also be adjusted by adjusting parameters, so as to obtain the adjusted C.
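  • As a non-limiting illustration, the ratio described above (C = Cn / m) can be sketched as follows. The function name and the multiplicative form of the `adjust` parameter are assumptions made for this sketch; the text only states that the calculated C may be adjusted by adjusting parameters.

```python
def second_occurrence_count(history_count: float, total_units: int,
                            adjust: float = 1.0) -> float:
    """Ratio-based second number of occurrences: C = Cn / m.

    history_count -- Cn, historical occurrences in the unit duration
    total_units   -- m, number of unit durations in the statistical period
    adjust        -- hypothetical multiplicative adjustment (assumption)
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return adjust * history_count / total_units


# 14 historical occurrences over a 7-unit statistical period
print(second_occurrence_count(14, 7))
```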
  • the electronic device may also determine the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration according to the following method:
  • The electronic device determines the historical number of occurrences of the first feature value in the first set dimension for the first unit duration; based on that historical number of occurrences and the weight corresponding to the first feature value in the first unit duration, it determines the third number of occurrences of the first feature value in the first set dimension within the first unit duration; in the same way, the third number of occurrences of the first feature value in the first set dimension is determined for each unit duration.
  • The electronic device then determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration, wherein:
  • The electronic device may determine the quotient of the third number of occurrences of the first feature value in the first set dimension within the first unit duration and the calculated third sum as the second number of occurrences of the first feature value in the first set dimension within the first unit duration. The electronic device may also adjust the determined quotient using set adjustment parameters to obtain the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
  • C represents the second number of occurrences of the first feature value in the first set dimension within the first unit duration
  • C_n represents the historical number of occurrences of the first feature value in the first set dimension within the n-th unit duration
  • g represents the weight corresponding to the first feature value in the n-th unit duration
  • m represents the total number of unit durations included in the set statistical period.
  • Step 502 Based on the first number of occurrences of the first feature value in the first set dimension, the second number of occurrences of the first feature value in the first set dimension for the first unit duration, and the maximum historical number of occurrences of the first feature value in the set statistical period, determine the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • The electronic device determines the maximum historical number of occurrences of the first feature value in the set statistical period based on the historical number of occurrences of the first feature value in the first set dimension for each unit duration within the set statistical period.
  • The electronic device determines the first difference based on the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration; determines the second difference based on the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration; and, based on the first difference and the second difference, determines the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • the first difference represents the difference between the maximum historical number of occurrences corresponding to the first feature value in the set statistical period and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration.
  • the second difference represents the difference between the first number of appearances corresponding to the first feature value in the first set dimension and the second number of appearances corresponding to the first feature value in the first set dimension within the first unit time length.
  • Determining the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
  • when the first difference is equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the set score;
  • when the first difference is not equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the ratio of the second difference to the first difference; wherein,
  • the first difference represents the difference between the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration;
  • the second difference represents the difference between the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
  • When the electronic device has calculated the first difference, it judges whether the first difference is equal to zero. If the first difference is equal to zero, it determines the corresponding set score based on the maximum historical number of occurrences of the first feature value in the set statistical period, and takes that set score as the first score corresponding to the first feature value in the first set dimension.
  • The set scores corresponding to different maximum historical numbers of occurrences may be different.
  • When the electronic device has calculated the second difference, it judges whether the second difference is equal to zero. If the second difference is not equal to zero, it determines the ratio of the second difference to the first difference and, based on that ratio, determines the first score corresponding to the first feature value in the first set dimension within the first unit duration. The first scores corresponding to different ratios may be different.
  • Determining, based on the set score, the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
  • the first set score corresponding to the first set dimension is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration;
  • the second set score is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • the electronic device determines the first set score corresponding to the first set dimension as the first score corresponding to the first feature value within the first unit time duration in the first set dimension.
  • the first set score is a default score corresponding to the first set dimension. In practice, this default score is zero.
  • When the first difference is equal to zero and the maximum historical number of occurrences of the first feature value is equal to zero, there is no first access address in the historical access log. In this case, it is judged whether the first access address exists among the access addresses corresponding to the service system; if it does, this indicates that the first access address is used to access a new function in the set service system, and the second set score is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration. The second set score represents the initial score corresponding to a new access address.
  • Determining the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
  • when the ratio of the second difference to the first difference is less than or equal to zero, it is determined that the first score corresponding to the first feature value in the first set dimension within the first unit duration is zero;
  • when the ratio of the second difference to the first difference is greater than zero, the quotient of the second difference and the first difference is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • It is judged whether the second difference is equal to zero. If the second difference is equal to zero, it is determined that the first score corresponding to the first feature value in the first set dimension within the first unit duration is zero.
  • If the second difference is not equal to zero, it is judged whether the ratio of the second difference to the first difference is greater than zero. If the ratio is less than zero, the first score corresponding to the first feature value in the first set dimension is zero; if the ratio is greater than zero, the quotient of the second difference and the first difference (that is, second difference / first difference) is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • S_j represents the first score corresponding to the j-th set dimension within the first unit duration;
  • C_ij represents the first number of occurrences of the first feature value in the j-th set dimension;
  • C_uj represents the second number of occurrences of the first feature value in the j-th set dimension;
  • C_jmax represents the maximum historical number of occurrences of the first feature value in the j-th set dimension.
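  • As a non-limiting illustration, the branching described for the first score can be sketched as follows, with the first difference taken as C_jmax - C_uj and the second difference as C_ij - C_uj. The function name and the `set_score` parameter are assumptions made for this sketch; which set score applies (the first or the second set score) is left to the caller.

```python
def first_score(c_i: float, c_u: float, c_max: float,
                set_score: float = 0.0) -> float:
    """First score of a feature value in one set dimension.

    c_i       -- first number of occurrences (current unit duration)
    c_u       -- second number of occurrences (derived from history)
    c_max     -- maximum historical number of occurrences
    set_score -- score used when the first difference is zero (assumed input)
    """
    first_diff = c_max - c_u
    if first_diff == 0:
        # No usable denominator: fall back to the configured set score.
        return set_score
    second_diff = c_i - c_u
    if second_diff == 0:
        return 0.0
    ratio = second_diff / first_diff
    # Ratios at or below zero are scored as zero.
    return ratio if ratio > 0 else 0.0


print(first_score(c_i=5, c_u=1, c_max=3))  # second diff 4, first diff 2
```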
  • Step 402 Determine a plurality of second scores corresponding to the first access address based on the first score corresponding to the first feature value in each set dimension within each unit duration; wherein each second score is determined based on the first scores corresponding to the first feature value in all set dimensions within one unit duration.
  • the electronic device determines the second score corresponding to the first access address in the first unit duration based on the first scores corresponding to the first characteristic value of the first unit duration in all set dimensions.
  • the second score corresponding to each unit duration of the first access address within the set statistical period may be calculated, so as to obtain a plurality of second scores corresponding to the first access address.
  • the first unit duration is any unit duration within the set statistical period.
  • the quantity of the first score is the same as the quantity of the unit time in the set statistical period.
  • The number of set dimensions can be 1 or greater than 1. To improve the accuracy of the determined second score, and thereby the accuracy of the anomaly detection result, in some embodiments, determining the plurality of second scores corresponding to the first access address based on the first score corresponding to the first feature value in each set dimension within each unit duration includes:
  • the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined as the second score corresponding to the first access address in the first unit duration;
  • When the number of set dimensions is 1, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined as the second score corresponding to the first access address in the first unit duration.
  • When the number of set dimensions is greater than 1, a weighted summation is performed over the first scores of the first feature value in each set dimension within the first unit duration, using the set weight corresponding to each set dimension, to obtain the second score corresponding to the first access address in the first unit duration.
  • In the same way, the electronic device can determine the second score corresponding to the first access address within each unit duration in the set statistical period.
  • the electronic device may use the following formula to calculate the second score corresponding to the first access address in the first unit duration:
  • S represents the second score corresponding to the first access address in the first unit duration
  • S_1 represents the first score corresponding to the first dimension
  • W_1 represents the set weight corresponding to the first dimension
  • S_2 represents the first score corresponding to the second dimension
  • W_2 represents the set weight corresponding to the second dimension
  • S_3 represents the first score corresponding to the third dimension
  • W_3 represents the set weight corresponding to the third dimension
  • S_4 represents the first score corresponding to the fourth dimension
  • W_4 represents the set weight corresponding to the fourth dimension.
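  • As a non-limiting illustration, the weighted summation of first scores into a second score can be sketched as follows. The dimension keys and the weight values are hypothetical and chosen only for this example; the text does not specify concrete weights.

```python
from typing import Mapping


def second_score(first_scores: Mapping[str, float],
                 weights: Mapping[str, float]) -> float:
    """Second score of an access address for one unit duration.

    With a single set dimension the first score is used directly;
    otherwise the first scores are combined as sum(S_j * W_j).
    """
    if len(first_scores) == 1:
        return next(iter(first_scores.values()))
    return sum(score * weights[dim] for dim, score in first_scores.items())


# Hypothetical first scores and weights for the four dimensions
scores = {"user": 2.0, "department": 0.5, "post": 0.0, "device": 1.0}
weights = {"user": 100.0, "department": 200.0, "post": 50.0, "device": 10.0}
print(second_score(scores, weights))
```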
  • Step 403 Based on the plurality of second scores corresponding to the first access address, determine whether the first access address is abnormal.
  • The electronic device can compare each second score with a set threshold to obtain a comparison result; if any comparison result indicates that the second score is greater than or equal to the set threshold, it is determined that the first access address is abnormal; if all comparison results indicate that the second score is less than the set threshold, it is determined that the first access address is not an abnormal address.
  • the set threshold may be 400.
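  • As a non-limiting illustration, the comparison of the second scores with the set threshold can be sketched as follows. The function name is an assumption made for this sketch; the default of 400 is the example value given above.

```python
from typing import Iterable


def is_abnormal(second_scores: Iterable[float], threshold: float = 400.0) -> bool:
    """Return True if any second score reaches the set threshold."""
    return any(score >= threshold for score in second_scores)


# One second score per unit duration in the set statistical period
print(is_abnormal([120.0, 400.0, 35.5]))
print(is_abnormal([120.0, 399.9, 35.5]))
```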
  • the method further includes:
  • the access operation corresponding to the first access address is blocked, and/or, access restriction is performed on the terminal device sending the first access address.
  • When the electronic device determines that the first access address is abnormal, it blocks the access operation corresponding to the first access address, thereby preventing the terminal device that sent the first access address from continuing to access the corresponding set service system.
  • Restricting access for the terminal device that sent the first access address means restricting the access operations of that terminal device, thereby prohibiting it from accessing the corresponding service system or the corresponding function of that service system. Because the electronic device can directly perform exception processing on the first access address once it determines that the address is abnormal, without notifying the corresponding business system to do so, the time consumed by sending exception warnings to the business system is reduced and the timeliness of exception handling is improved.
  • Based on the first access log, the first feature value corresponding to the first access address is determined, and the first number of occurrences of the first feature value in each set dimension of at least one set dimension is determined; based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period, it is determined whether the first access address is abnormal. In this way, access requests to the business system via the built-in API gateway can be detected directly, and the first access log is generated based on the access information of the detected access requests, which improves the timeliness of collecting the access log and therefore the timeliness of anomaly detection. Since the number of occurrences of the first feature value can reflect the user's behavior habits, this solution can also improve the accuracy of anomaly detection results.
  • FIG. 6 is a schematic diagram of an implementation flow of an anomaly detection method provided by an application embodiment of the present application. As shown in Figure 6, the anomaly detection method includes:
  • Step 601 Based on the historical access log corresponding to each unit duration in the set statistical period, determine the feature value corresponding to each historical access address, and determine the historical number of occurrences of each feature value in each set dimension for each unit duration in the set statistical period.
  • As shown in Figure 7, in actual application, the electronic device contains an API gateway, a login plug-in, a permission plug-in and a log plug-in, and supports a rights management service, a log storage service, a user behavior analysis service, a user historical behavior analysis service, and so on, wherein:
  • the API gateway can call login plug-ins, permission plug-ins and log plug-ins, etc.
  • the login plug-in is used for the API gateway to authenticate the login request;
  • the permission plug-in is used for the API gateway to verify the permission of the access request, and
  • the log plug-in is used for the API gateway to write the access information in the access request into the access log.
  • the rights management service is used to update and store the set access addresses corresponding to each user in each business system, and to synchronize abnormal access addresses.
  • the user behavior analysis service is used to detect abnormal access addresses based on the first access log, for example, to implement steps 101 to 102; the user historical behavior analysis service is used to determine the characteristic value corresponding to each access address in each device based on the historical access log.
  • The set dimensions include a user dimension, a department dimension, a post dimension and a device dimension, corresponding to the first dimension, second dimension, third dimension and fourth dimension above, respectively.
  • For the implementation process of step 601, please refer to the relevant description above; details are not repeated here.
  • Step 602 Based on the first access log, determine the first feature value corresponding to the first access address, and determine the first occurrence times of the first feature value corresponding to each set dimension in at least one set dimension;
  • the first access address represents an access address in any access information in the first access log, and is used to access any one of the at least two set service systems; the first access log is used to record, in real time within the unit duration, the access information of access requests passing through the built-in API gateway.
  • Step 602 is the same as step 101; for the implementation process, please refer to the relevant description of step 101, and details are not repeated here.
  • Step 603 Determine whether the first access address is abnormal based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period.
  • Step 603 is the same as step 102; for the implementation process, please refer to the relevant description of step 102, and details are not repeated here.
  • Step 604 When the first access address is abnormal, block the access operation corresponding to the first access address, and/or restrict access to the terminal device sending the first access address.
  • the embodiment of the present application also provides an abnormality detection device, as shown in Figure 8, the abnormality detection device includes:
  • the first determining unit 81 is configured to determine the first feature value corresponding to the first access address based on the first access log, and determine the first number of occurrences of the first feature value in each set dimension of at least one set dimension; the first access address represents the access address in any access information in the first access log, and is used to access any one of the at least two set service systems; the first access log is used to record, in real time within the unit duration, the access information of access requests passing through the built-in application program interface (API) gateway;
  • the second determination unit 82 is configured to determine whether the first access address is abnormal based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period; wherein the first feature value corresponding to each unit duration in the set statistical period is determined based on the historical access log corresponding to that unit duration.
  • the access information includes at least user identification, access time and access address
  • the first determining unit 81 is further configured to:
  • the second access log includes the first access log or the historical access log corresponding to each unit duration within the set statistical period; the first sequence includes all access addresses corresponding to the first user in the second access log;
  • the second sequence includes at least one access address
  • the hash value is calculated to obtain the feature value corresponding to the access address at the set position in the second sequence corresponding to the first user.
  • the first determining unit 81 is specifically configured as:
  • the third access address in the second sequence is replaced with the set character string; wherein the second access address represents the access address at the set position in the second sequence, and the third access address represents the access address adjacent to the second access address in the second sequence;
  • a hash value is calculated based on the updated second sequence.
  • the second sequence includes three access addresses, the set position representing an intermediate position.
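  • As a non-limiting illustration, deriving a feature value for the access address at the middle position of a three-address second sequence can be sketched as follows. Several points are assumptions made for this sketch and are not stated in the text: that second sequences are taken as a sliding window over the user's first sequence, that SHA-256 is the hash function, and that addresses are joined with a newline before hashing. The replacement of an adjacent address with a set character string is not modeled here.

```python
import hashlib
from typing import Dict, List


def feature_values(first_sequence: List[str]) -> Dict[int, str]:
    """Map the index of each middle address to the hash of its 3-address window.

    first_sequence -- a user's access addresses, assumed ordered by access time
    """
    features: Dict[int, str] = {}
    for i in range(1, len(first_sequence) - 1):
        window = first_sequence[i - 1:i + 2]      # previous, current, next
        payload = "\n".join(window).encode("utf-8")
        features[i] = hashlib.sha256(payload).hexdigest()
    return features


# Hypothetical access addresses for one user within one unit duration
sequence = ["/login", "/accounts", "/transfer", "/logout"]
for index, value in feature_values(sequence).items():
    print(sequence[index], value[:12])
```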
  • the access information also includes the department to which the user belongs, the position of the user, and the identification of the terminal device sending the access address;
  • the at least one set dimension includes at least one of the following:
  • the first dimension represents the number of occurrences of statistical feature values by user
  • the second dimension represents the number of occurrences of statistical feature values by department
  • the third dimension represents the number of occurrences of statistical feature values by position
  • the fourth dimension represents the statistics of the number of occurrences of the same feature value based on the terminal equipment used by the user.
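  • As a non-limiting illustration, counting the occurrences of feature values in the four set dimensions can be sketched as follows. The record layout (a dict with `user`, `department`, `post`, `device` and `feature` keys) and the function name are assumptions made for this sketch.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

DIMENSIONS = ("user", "department", "post", "device")


def count_by_dimension(records: Iterable[Mapping[str, str]]) -> Dict[str, Counter]:
    """Count (dimension value, feature value) pairs for each set dimension."""
    counts: Dict[str, Counter] = {dim: Counter() for dim in DIMENSIONS}
    for record in records:
        for dim in DIMENSIONS:
            counts[dim][(record[dim], record["feature"])] += 1
    return counts


# Hypothetical access records for one unit duration
records = [
    {"user": "u1", "department": "d1", "post": "p1", "device": "t1", "feature": "f1"},
    {"user": "u2", "department": "d1", "post": "p2", "device": "t2", "feature": "f1"},
    {"user": "u1", "department": "d1", "post": "p1", "device": "t1", "feature": "f1"},
]
counts = count_by_dimension(records)
print(counts["user"][("u1", "f1")], counts["department"][("d1", "f1")])
```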
  • the second determination unit 82 is specifically configured as:
  • a plurality of second scores corresponding to the first access address are determined based on the first score corresponding to the first feature value in each set dimension within each unit duration; wherein each second score is determined based on the first scores corresponding to the first feature value in all set dimensions within one unit duration;
  • the second determination unit 82 is specifically configured as:
  • the second number of occurrences of the first feature value in the first set dimension within the first unit duration is determined; the first set dimension represents any set dimension in the at least one set dimension; the first unit duration represents any unit duration in the set statistical period;
  • based on the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension for the first unit duration, the first score corresponding to the first feature value in the first set dimension is determined.
  • the second determination unit 82 is specifically configured as:
  • the second number of occurrences of the first feature value in the first set dimension within the first unit duration is determined based on the ratio of the historical number of occurrences of the first feature value in the first set dimension within the first unit duration to the total number of unit durations within the set statistical period;
  • the weight corresponding to the first unit duration is greater than the weight corresponding to the second unit duration.
  • the second determination unit 82 is specifically configured as:
  • when the first difference is equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the set score;
  • when the first difference is not equal to zero, the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined based on the ratio of the second difference to the first difference; wherein,
  • the first difference represents the difference between the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration;
  • the second difference represents the difference between the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
  • the second determination unit 82 is specifically configured as:
  • the first set score corresponding to the first set dimension is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration;
  • the second set score is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • the second determination unit 82 is specifically configured as:
  • when the ratio of the second difference to the first difference is less than or equal to zero, it is determined that the first score corresponding to the first feature value in the first set dimension within the first unit duration is zero;
  • when the ratio of the second difference to the first difference is greater than zero, the quotient of the second difference and the first difference is determined as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
  • the second determination unit 82 is specifically configured as:
  • the first score corresponding to the first feature value in the first set dimension within the first unit duration is determined as the first access address in the first unit duration the corresponding second fraction;
  • the anomaly detection device also includes:
  • the exception processing unit is configured to block the access operation corresponding to the first access address when the first access address is abnormal, and/or restrict access to the terminal device sending the first access address.
  • the first determining unit 81, the second determining unit 82 and the exception processing unit may be implemented by a processor in the anomaly detection device, such as a central processing unit (CPU, Central Processing Unit), a digital signal processor (DSP, Digital Signal Processor), a microcontroller unit (MCU, Microcontroller Unit) or a field-programmable gate array (FPGA, Field-Programmable Gate Array).
  • when the anomaly detection device provided by the above embodiments performs anomaly detection, the division into the above program modules is only an example. In practical applications, the above processing may be allocated to different program modules as needed; that is, the internal structure of the device may be divided into different program modules to complete all or part of the processing described above.
  • the anomaly detection device provided by the above embodiments and the anomaly detection method embodiments are based on the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
  • FIG. 9 is a schematic diagram of the hardware composition structure of the electronic device provided by the embodiment of the present application. As shown in FIG. 9, the electronic device 9 includes:
  • Communication interface 91 capable of information exchange with other devices such as network devices;
  • the processor 92 is connected to the communication interface 91 to exchange information with other devices, and is configured to execute the methods provided by one or more of the above technical solutions when running a computer program. The computer program is stored in the memory 93.
  • bus system 94 is configured to enable connection communication between these components.
  • bus system 94 also includes a power bus, a control bus and a status signal bus.
  • the various buses are labeled as bus system 94 in FIG. 9 for clarity of illustration.
  • the memory 93 in the embodiment of the present application is configured to store various types of data to support the operation of the electronic device 9 .
  • Examples of such data include: any computer program configured to operate on electronic device 9 .
  • the memory 93 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memories.
  • the non-volatile memory can be read-only memory (ROM, Read-Only Memory), programmable read-only memory (PROM, Programmable Read-Only Memory), erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), ferromagnetic random access memory (FRAM, Ferromagnetic Random Access Memory), flash memory (Flash Memory), magnetic surface memory, optical disc, or compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory can be disk storage or tape storage.
  • the volatile memory may be random access memory (RAM, Random Access Memory), which is used as an external cache.
  • many forms of RAM are available, such as static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), SyncLink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory), and direct Rambus random access memory (DRRAM, Direct Rambus Random Access Memory).
  • the memory 93 described in the embodiments of the present application is intended to include but not limited to these and any other suitable types of memory.
  • the methods disclosed in the foregoing embodiments of the present application may be applied to the processor 92 or implemented by the processor 92 .
  • the processor 92 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 92 or instructions in the form of software.
  • the aforementioned processor 92 may be a general-purpose processor, DSP, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the processor 92 may implement or execute various methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
  • a general purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium, the storage medium is located in the memory 93, and the processor 92 reads the program in the memory 93, and completes the steps of the foregoing method in combination with its hardware.
  • when the processor 92 executes the program, it implements the corresponding process of each method of the embodiments of the present application. For the sake of brevity, details are not repeated here.
  • the embodiment of the present application also provides a storage medium, that is, a computer storage medium, specifically a computer-readable storage medium, for example, the memory 93 storing a computer program, where the computer program can be executed by the processor 92 to complete the steps described in the foregoing method.
  • the computer-readable storage medium can be memories such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disk, or CD-ROM.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division.
  • the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or in other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing module, or each unit can be used as a single unit, or two or more units can be integrated into one unit; the above-mentioned integrated unit can be realized in the form of hardware or in the form of hardware plus software functional units.
  • the term "and/or" in the embodiments of the present application only describes an association relationship between associated objects and means that three relationships are possible; for example, A and/or B may mean that A exists alone, that both A and B exist, or that B exists alone.
  • the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set formed by A, B, and C.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application discloses an anomaly detection method and apparatus, an electronic device, and a storage medium. The method comprises: on the basis of a first access log, determining a first feature value corresponding to a first access address, and determining a first number of occurrences corresponding to the first feature value in each set dimension among at least one set dimension, the first access address representing an access address in any access information in the first access log and being used for accessing any set service system among at least two set service systems, and the first access log being used for recording, in real time, access information of an access request via a built-in application program interface (API) gateway within a unit duration; and on the basis of the first number of occurrences corresponding to the first feature value in each set dimension, and on the basis of a historical number of occurrences corresponding to the first feature value in each set dimension corresponding to each unit duration within a set statistical period, determining whether the first access address is abnormal.

Description

Anomaly detection method and apparatus, electronic device, and storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202111526821.5, filed on December 14, 2021, the entire content of which is hereby incorporated into this application by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to an anomaly detection method and apparatus, an electronic device, and a storage medium.
Background
With the development of computer technology, more and more technologies, such as blockchain, big data, and distributed computing, are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (fintech). However, because of the security and real-time requirements of the financial industry, fintech also places higher demands on technology. In the fintech field, in scenarios that track user behavior across systems, each business system records the user's operations when it detects that the user accesses the corresponding system page, and reports user behavior logs to a log server through a software development kit (SDK, Software Development Kit) that is integrated into the business system to collect user behavior logs. The log server analyzes the user behavior logs reported by each business system, determines the abnormal behavior corresponding to each business system, and delivers the determined abnormal behavior to the corresponding system so that the corresponding business system can handle it. In the related art, however, the determined abnormal behavior is inaccurate and the timeliness is poor.
Summary
To solve the related technical problems, embodiments of the present application provide an anomaly detection method and apparatus, an electronic device, and a storage medium.
An embodiment of the present application provides an anomaly detection method, including:
determining, based on a first access log, a first feature value corresponding to a first access address, and determining a first number of occurrences corresponding to the first feature value in each set dimension of at least one set dimension; the first access address represents the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems; the first access log is used to record, in real time, the access information of access requests passing through a built-in application programming interface (API, Application Programming Interface) gateway within a unit duration;
determining whether the first access address is abnormal, based on the first number of occurrences corresponding to the first feature value in each set dimension, and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within a set statistical period; wherein the first feature value corresponding to each unit duration within the set statistical period is determined based on the historical log of that unit duration.
In the above solution, the access information includes at least a user identifier, an access time, and an access address; determining the feature value corresponding to an access address includes:
determining, based on the access information in a second access log, a first sequence corresponding to a first user; wherein the second access log includes the first access log or the historical access log corresponding to each unit duration within the set statistical period, and the first sequence includes all access addresses corresponding to the first user in the second access log;
determining, within the first sequence corresponding to the first user, a second sequence corresponding to the first user; the second sequence includes at least one access address;
calculating a hash value based on the second sequence corresponding to the first user, to obtain the feature value corresponding to the access address located at a set position in the second sequence corresponding to the first user.
In the above solution, calculating the hash value includes:
in the case where the time interval between a third access address and a second access address is greater than or equal to a set duration, replacing the third access address in the second sequence with a set character string; wherein the second access address represents the access address located at the set position in the second sequence, and the third access address represents any access address adjacent to the second access address in the second sequence;
calculating the hash value based on the updated second sequence.
In the above solution, the second sequence includes three access addresses, and the set position is the middle position.
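The feature-value steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the sliding window over consecutive triples, the use of SHA-256, the newline separator, and the names `feature_values`, `GAP_PLACEHOLDER`, and `MAX_GAP_SECONDS` are all assumptions made for the example. The text only states that a three-address second sequence is hashed and that a neighbour too far away in time is replaced by a set character string.

```python
import hashlib
from collections import defaultdict

GAP_PLACEHOLDER = "<GAP>"   # hypothetical "set character string"
MAX_GAP_SECONDS = 1800      # hypothetical "set duration", in seconds


def feature_values(records, max_gap=MAX_GAP_SECONDS):
    """Compute a feature value for each access address that has two neighbours.

    records: iterable of (user_id, access_time_in_seconds, access_address).
    Returns a list of (user_id, middle_access_address, feature_value).
    """
    # First sequence: all access addresses of each user, ordered by access time.
    by_user = defaultdict(list)
    for user, ts, addr in records:
        by_user[user].append((ts, addr))

    results = []
    for user, visits in by_user.items():
        visits.sort()
        # Second sequence (assumed here to be a sliding window of three
        # addresses); the set position is the middle one.
        for i in range(1, len(visits) - 1):
            (t_prev, a_prev), (t_mid, a_mid), (t_next, a_next) = visits[i - 1:i + 2]
            # Replace a neighbour whose time gap to the middle address
            # is greater than or equal to the set duration.
            if t_mid - t_prev >= max_gap:
                a_prev = GAP_PLACEHOLDER
            if t_next - t_mid >= max_gap:
                a_next = GAP_PLACEHOLDER
            joined = "\n".join((a_prev, a_mid, a_next))
            digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
            results.append((user, a_mid, digest))
    return results
```

Under these assumptions, the same address visited in a different navigation context yields a different feature value, which is what lets the later occurrence counts reflect a user's habitual access paths.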
In the above solution, the access information further includes the department to which the user belongs, the user's job position, and the identifier of the terminal device that sends the access address;
the at least one set dimension includes at least one of the following:
a first dimension, in which the number of occurrences of a feature value is counted per user;
a second dimension, in which the number of occurrences of a feature value is counted per department;
a third dimension, in which the number of occurrences of a feature value is counted per job position;
a fourth dimension, in which the number of occurrences of the same feature value is counted per terminal device used by the user.
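The four counting dimensions can be illustrated with a small sketch. The record layout (dictionaries with `user`, `department`, `position`, `device`, and `feature` keys) and the function name `count_occurrences` are assumptions for illustration only.

```python
from collections import Counter

# The four set dimensions named in the text; the key names are hypothetical.
DIMENSIONS = ("user", "department", "position", "device")


def count_occurrences(entries):
    """Count how often each feature value occurs in each dimension.

    entries: iterable of dicts holding one value per dimension plus the
    feature value computed for the access address, under the key "feature".
    Returns a Counter keyed by (dimension, dimension_value, feature_value).
    """
    counts = Counter()
    for entry in entries:
        for dim in DIMENSIONS:
            counts[(dim, entry[dim], entry["feature"])] += 1
    return counts
```

Applied to the log of the current unit duration, a count of this kind plays the role of the first number of occurrences; applied to the log of each past unit duration, it plays the role of the historical number of occurrences.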
In the above solution, determining whether the first access address is abnormal includes:
determining the first score corresponding to the first feature value in each set dimension within each unit duration, based on the first number of occurrences corresponding to the first feature value in each set dimension, and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within the set statistical period;
determining a plurality of second scores corresponding to the first access address, based on the first score corresponding to the first feature value in each set dimension within each unit duration; wherein each second score is determined based on the first scores corresponding to the first feature value in all set dimensions within one unit duration;
determining whether the first access address is abnormal based on the plurality of second scores corresponding to the first access address.
In the above solution, determining the first score corresponding to the first feature value in each set dimension within each unit duration includes:
determining, based on the historical number of occurrences, in a first set dimension, of the first feature value corresponding to a first unit duration, a second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration; the first set dimension represents any one of the at least one set dimension; the first unit duration represents any unit duration in the set statistical period;
determining the first score corresponding to the first feature value in the first set dimension within the first unit duration, based on the first number of occurrences corresponding to the first feature value in the first set dimension and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration, and on the maximum historical number of occurrences corresponding to the first feature value in the set statistical period and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration.
In the above solution, determining the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
determining the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration, based on the ratio of the historical number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration to the total number of unit durations within the set statistical period;
determining the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration, based on the historical number of occurrences corresponding to the first feature value in the first set dimension within each unit duration and on the weight corresponding to the first feature value in each unit duration; wherein
in the case where the time corresponding to the first unit duration is later than the time corresponding to a second unit duration, the weight corresponding to the first unit duration is greater than the weight corresponding to the second unit duration.
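The two alternatives for the second number of occurrences can be sketched as below. The first function follows the stated ratio directly. The text does not give a formula for the weighted alternative, so multiplying each historical count by its weight and using linearly increasing weights are assumptions, as are all function names.

```python
def second_counts_uniform(history):
    """Alternative 1: historical count of each unit duration divided by the
    total number of unit durations in the statistical period.

    history: historical counts ordered from the oldest to the newest unit duration.
    """
    n = len(history)
    return [h / n for h in history]


def linear_weights(n):
    """Hypothetical weights that grow with recency and sum to 1."""
    total = n * (n + 1) / 2
    return [(i + 1) / total for i in range(n)]


def second_counts_weighted(history, weights):
    """Alternative 2 (assumed form): historical count times the weight of its
    unit duration, where later unit durations carry larger weights."""
    if len(history) != len(weights):
        raise ValueError("history and weights must have the same length")
    return [h * w for h, w in zip(history, weights)]
```

With later unit durations weighted more heavily, recent behaviour contributes more to the baseline than behaviour from the start of the statistical period.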
In the above solution, determining the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
in the case where the first difference is equal to zero, determining, based on a set score, the first score corresponding to the first feature value in the first set dimension within the first unit duration;
in the case where the first difference is not equal to zero, determining, based on the ratio of the second difference to the first difference, the first score corresponding to the first feature value in the first set dimension within the first unit duration;
wherein the first difference represents the difference between the maximum historical number of occurrences corresponding to the first feature value in the set statistical period and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration;
the second difference represents the difference between the first number of occurrences corresponding to the first feature value in the first set dimension and the second number of occurrences corresponding to the first feature value in the first set dimension within the first unit duration.
In the above solution, determining, based on the set score, the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
in the case where the maximum historical number of occurrences corresponding to the first feature value is greater than zero, determining the first set score corresponding to the first set dimension as the first score corresponding to the first feature value in the first set dimension within the first unit duration;
in the case where the maximum historical number of occurrences corresponding to the first feature value is equal to zero and the first access address exists, determining the second set score as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
In the above solution, determining, based on the ratio of the second difference to the first difference, the first score corresponding to the first feature value in the first set dimension within the first unit duration includes one of the following:
in the case where the ratio of the second difference to the first difference is less than or equal to zero, determining that the first score corresponding to the first feature value in the first set dimension within the first unit duration is zero;
in the case where the ratio of the second difference to the first difference is greater than zero, determining the quotient of the second difference and the first difference as the first score corresponding to the first feature value in the first set dimension within the first unit duration.
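The case analysis for the first score can be collected into one function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the two set scores are passed in because the text gives no values for them, and the return value for the one combination the text does not cover (first difference zero, maximum zero, address absent) is an assumption.

```python
def first_score(first_count, second_count, max_hist, *,
                first_set_score, second_set_score, address_present=True):
    """First score of a feature value in one set dimension for one unit duration.

    first_count:  first number of occurrences (current access log)
    second_count: second number of occurrences (derived from history)
    max_hist:     maximum historical number of occurrences in the period
    """
    first_diff = max_hist - second_count
    second_diff = first_count - second_count

    if first_diff == 0:
        if max_hist > 0:
            return first_set_score
        if address_present:          # max_hist == 0 and the address exists
            return second_set_score
        return 0.0                   # case not covered by the text (assumption)

    ratio = second_diff / first_diff
    # A ratio of zero or below scores zero; otherwise the quotient is the score.
    return ratio if ratio > 0 else 0.0
```

For example, with illustrative inputs of 7 current occurrences, a second number of occurrences of 2, and a historical maximum of 12, the first difference is 10, the second difference is 5, and the score is 0.5.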
In the above solution, determining the plurality of second scores corresponding to the first access address, based on the first score corresponding to the first feature value in each set dimension within each unit duration, includes:
in the case where the number of set dimensions is 1, determining the first score corresponding to the first feature value in the first set dimension within the first unit duration as the second score corresponding to the first access address in the first unit duration;
in the case where the number of set dimensions is greater than 1, determining the second score corresponding to the first access address in the first unit duration, based on the first score corresponding to the first feature value in each set dimension within the first unit duration and on the set weight of each set dimension.
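A possible way to combine the per-dimension first scores into a second score is sketched below. The text only says that the second score is based on the first scores and the set weights, so the weighted sum used here is an assumption, as are the function name and the dictionary layout.

```python
def second_score(first_scores, weights=None):
    """Second score of an access address for one unit duration.

    first_scores: dict mapping dimension name -> first score
    weights:      dict mapping dimension name -> set weight (needed when
                  more than one dimension is used)
    """
    if not first_scores:
        raise ValueError("at least one set dimension is required")
    if len(first_scores) == 1:
        # Single dimension: the first score is used as the second score.
        return next(iter(first_scores.values()))
    if weights is None:
        raise ValueError("weights are required for more than one dimension")
    # Assumed combination rule: weighted sum over the set dimensions.
    return sum(weights[dim] * score for dim, score in first_scores.items())
```

Calling this once per unit duration in the statistical period yields the plurality of second scores from which the anomaly decision is made.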
In the above solution, the method further includes:
in the case where the first access address is abnormal, blocking the access operation corresponding to the first access address, and/or restricting access by the terminal device that sent the first access address.
An embodiment of the present application also provides an anomaly detection apparatus, including:
a first determining unit configured to determine, based on a first access log, a first feature value corresponding to a first access address, and to determine a first number of occurrences corresponding to the first feature value in each set dimension of at least one set dimension; the first access address represents the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems; the first access log is used to record, in real time, the access information of access requests passing through a built-in application programming interface (API) gateway within a unit duration;
a second determining unit configured to determine whether the first access address is abnormal, based on the first number of occurrences corresponding to the first feature value in each set dimension, and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within a set statistical period; wherein the first feature value corresponding to each unit duration within the set statistical period is determined based on the historical log of that unit duration.
An embodiment of the present application also provides an electronic device, including a processor and a memory configured to store a computer program that can run on the processor,
wherein the processor is configured to execute the steps of the above anomaly detection method when running the computer program.
An embodiment of the present application also provides a storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above anomaly detection method are implemented.
In the embodiments of the present application, a first feature value corresponding to a first access address is determined based on a first access log, and a first number of occurrences corresponding to the first feature value in each set dimension of at least one set dimension is determined; whether the first access address is abnormal is then determined based on the first number of occurrences corresponding to the first feature value in each set dimension, and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within the set statistical period. In this way, access requests that reach the business systems through the built-in API gateway can be detected directly, and the first access log is generated from the access information of the detected access requests. Because the business systems do not need to report access logs, access logs are collected in a more timely manner, which makes anomaly detection more timely. Because the number of occurrences corresponding to the first feature value can reflect the user's behavior habits, this solution can also improve the accuracy of the anomaly detection result.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an anomaly detection method in the related art;
FIG. 2 is a schematic flowchart of the anomaly detection method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of determining the feature value corresponding to an access address, provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of determining whether the first access address is abnormal, provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of determining the first score, provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of the anomaly detection method provided by an application embodiment of the present application;
FIG. 7 is a schematic diagram of the anomaly detection system provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of the anomaly detection apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the hardware composition of the electronic device provided by an embodiment of the present application.
Detailed Description
As shown in FIG. 1, in the related art each business system must integrate an SDK for collecting user behavior logs, which makes collecting those logs costly. Each business system must also report user behavior logs to a log server through its own integrated SDK, which makes log collection slow. In addition, the log server analyzes the user behavior logs reported by the business systems using rules set from human experience in order to identify abnormal behavior, so the abnormal behavior it identifies has low accuracy.
On this basis, the present application provides an anomaly detection method. A first feature value corresponding to a first access address is determined based on a first access log, and a first occurrence count of the first feature value is determined for each set dimension of at least one set dimension. Whether the first access address is abnormal is then determined based on the first occurrence count of the first feature value in each set dimension and on the historical occurrence count, in each set dimension, of the first feature value for each unit duration within a set statistical period. In this way, access requests that reach the business systems through the built-in API gateway can be detected directly, and the first access log is generated from the access information of the detected access requests. Because the business systems do not need to report access logs, access logs are collected in a more timely manner, which in turn makes anomaly detection more timely. Because the occurrence counts of the first feature value reflect the user's behavior habits, this solution can also improve the accuracy of the anomaly detection result.
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
FIG. 2 is a schematic flowchart of the anomaly detection method provided by an embodiment of the present application. The method is executed by an electronic device such as a terminal device or a server. As shown in FIG. 2, the anomaly detection method includes:
Step 201: Based on a first access log, determine a first feature value corresponding to a first access address, and determine a first occurrence count of the first feature value for each set dimension of at least one set dimension. The first access address is the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems. The first access log records, in real time, the access information of access requests that pass through a built-in application programming interface (API) gateway within a unit duration.
Here, an API gateway is provided in the electronic device. A login request or access request relating to a set business system, triggered by a user through a terminal device, passes through the API gateway to the server that manages the corresponding set business system. The access request may be a cross-system or a non-cross-system access request. That is, an access request triggered on the interactive interface of any set business system may request access to a function of that business system, or to a function of another set business system. In practice, the API gateway may be APISIX.
After logging in to a set business system, the user can trigger an access request by clicking a function or button on an interactive page of that business system. When the electronic device detects an access request passing through the API gateway, it writes the access information of the request into the first access log. The first access log records, in real time, the access information carried by access requests that pass through the API gateway within a unit duration, and the access information in the first access log is updated in real time. The access information includes at least a user identifier, an access time, and an access address.
In practice, the unit duration is one day. That is, the electronic device creates one access log per day to record the access information carried by all access requests detected that day.
In the embodiments of the present application, the first feature value corresponding to the first access address may be determined based on the first access log in the following ways:
1) Implementation 1: the scenario in which user behavior within the current business system is tracked:
The electronic device determines, based on the access information in the first access log, a first sequence for each user identifier; the first sequence of a user identifier includes all access addresses corresponding to that user identifier. Based on each first sequence, the electronic device determines the feature value corresponding to each access address in that sequence. The feature value of an access address may be determined from that access address alone, or from that access address together with at least one access address adjacent to it in the first sequence. In practice, except for the first access address in the first sequence, the feature value of the middle access address of every three adjacent access addresses is determined from those three access addresses. When the feature value of the first access address in the first sequence is calculated, either a set empty string is used in place of the missing access address, or the feature value is determined from the first access address and the access address adjacent to it. Note that a feature value characterizes an access address. For example, the feature value corresponding to an access address includes a hash value.
From the feature values determined for the access addresses, the electronic device determines the first feature value corresponding to the first access address; based on the first feature value corresponding to each user, it determines the first occurrence count of the first feature value for each of the at least one set dimension.
In practice, a first data table can record, for each user, the access information, the feature values corresponding to the access addresses in that access information, and the occurrence count of each feature value within the unit duration. For example, the first data table is as follows:
Figure PCTCN2022098734-appb-000001
To make the counted occurrence numbers of the feature values more accurate and thereby improve the accuracy of the anomaly detection result, in some embodiments the access information further includes the department to which the user belongs, the user's position, and the identifier of the terminal device that sends the access address;
The at least one set dimension includes at least one of the following:
A first dimension, in which occurrence counts of feature values are collected per user; that is, based on all feature values corresponding to each user, the occurrence count of each feature value is counted for each user;
A second dimension, in which occurrence counts of feature values are collected per department; that is, based on all feature values corresponding to each department, the occurrence count of each feature value is counted;
A third dimension, in which occurrence counts of feature values are collected per position; that is, based on all feature values corresponding to each position, the occurrence count of each feature value is counted;
A fourth dimension, in which occurrence counts of feature values are collected per terminal device used by users; that is, based on all feature values corresponding to each terminal device, the occurrence count of each feature value is counted for each terminal device.
Of course, other dimensions may be set as needed in practice.
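The per-dimension counting described above can be illustrated with a minimal Python sketch. The field names (`user_id`, `department`, `position`, `device_id`, `feature`) and the sample records are assumptions made for illustration; the application does not prescribe a record layout.

```python
from collections import Counter

# Hypothetical field names standing in for the four dimensions described above.
DIMENSIONS = ("user_id", "department", "position", "device_id")


def count_by_dimension(records):
    """Return {dimension: Counter{(dimension value, feature value): count}}."""
    counts = {dim: Counter() for dim in DIMENSIONS}
    for record in records:
        for dim in DIMENSIONS:
            counts[dim][(record[dim], record["feature"])] += 1
    return counts


# Illustrative records only; "f1" stands for a feature value such as a hash.
records = [
    {"user_id": "u1", "department": "ops", "position": "dev", "device_id": "d1", "feature": "f1"},
    {"user_id": "u1", "department": "ops", "position": "dev", "device_id": "d2", "feature": "f1"},
    {"user_id": "u2", "department": "ops", "position": "qa", "device_id": "d3", "feature": "f1"},
]
counts = count_by_dimension(records)
print(counts["user_id"][("u1", "f1")])      # occurrences of f1 for user u1
print(counts["department"][("ops", "f1")])  # occurrences of f1 for department ops
```

Keeping one counter per dimension means that adding a further dimension only requires a new entry in `DIMENSIONS`.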
2) Implementation 2: When a user's cross-system behavior is tracked, the sequence of access addresses produced when the user accesses any function of any business system is fixed. To identify abnormal requests more accurately, in some embodiments, as shown in FIG. 3, the following steps 301 to 303 are used to determine the feature value corresponding to each access address:
Step 301: Based on the access information in a second access log, determine a first sequence corresponding to a first user. The second access log is either the first access log or the historical access log for each unit duration within the set statistical period; the first sequence includes all access addresses corresponding to the first user in the second access log.
Here, the electronic device determines the first sequence corresponding to the first user in the second access log based on the user identifier and access time included in each piece of access information in the second access log. The first user is any user in the second access log.
Step 302: From the first sequence corresponding to the first user, determine a second sequence corresponding to the first user; the second sequence includes at least one access address.
Here, the first sequence corresponding to the first user contains at least as many access addresses as the corresponding second sequence.
Step 303: Calculate a hash value based on the second sequence corresponding to the first user, to obtain the feature value corresponding to the access address at a set position in that second sequence.
Here, the electronic device can use a set algorithm to calculate a hash value from every access address in the second sequence corresponding to the first user, obtaining the feature value corresponding to the access address at the set position in that second sequence. The set algorithm is an algorithm for calculating hash values, such as a message digest algorithm or a hash algorithm. In practice, an access address includes a domain name and a business system identifier, and may also include a function identifier. For example, the access address may be http://xxxx/1/2, where xxxx represents the domain name, 1 represents the business system identifier, and 2 represents the function identifier.
Note that when the second access log is the first access log, the electronic device determines the first feature value corresponding to the first access address by performing steps 301 to 303. In this case, the first access address is the access address at the set position in the second sequence.
When the second access log is the historical access log for each unit duration within the set statistical period, the electronic device determines, by performing steps 301 to 303, the feature values corresponding to the access addresses for each unit duration within the set statistical period.
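Step 301 above, which groups access information by user identifier and orders it by access time, might be sketched as follows. The record keys (`user_id`, `access_time`, `access_address`) and the sample log are hypothetical; only the grouping and ordering come from the text.

```python
from collections import defaultdict
from datetime import datetime


def build_first_sequences(access_log):
    """Group records by user and order each user's accesses by access time."""
    by_user = defaultdict(list)
    for record in access_log:
        by_user[record["user_id"]].append(
            (record["access_time"], record["access_address"])
        )
    # Each first sequence is a list of (access_time, access_address) pairs.
    return {user: sorted(entries, key=lambda e: e[0]) for user, entries in by_user.items()}


# Illustrative log entries, deliberately out of chronological order.
access_log = [
    {"user_id": "u1", "access_time": datetime(2022, 6, 1, 9, 5), "access_address": "http://xxxx/1/3"},
    {"user_id": "u2", "access_time": datetime(2022, 6, 1, 9, 1), "access_address": "http://xxxx/2/1"},
    {"user_id": "u1", "access_time": datetime(2022, 6, 1, 9, 0), "access_address": "http://xxxx/1/2"},
]
first_sequences = build_first_sequences(access_log)
print([address for _, address in first_sequences["u1"]])
```

The access time is kept next to each address so that later steps can compare the interval between neighbouring accesses.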
In practice, if the interval between two consecutive access requests triggered by a user is long, the access addresses in the two requests may belong to different functions accessed by the user. Therefore, to improve the accuracy of the anomaly detection result, in some embodiments calculating the hash value includes:
when the time interval between a third access address and a second access address is greater than or equal to a set duration, replacing the third access address in the second sequence with a set string, where the second access address is the access address at the set position in the second sequence and the third access address is an access address adjacent to the second access address in the second sequence;
calculating the hash value based on every access address in the updated second sequence.
Here, the electronic device determines the time interval between the third access address and the second access address based on the access time corresponding to the third access address and the access time corresponding to the second access address. When the calculated time interval is greater than or equal to the set duration, it replaces the third access address in the second sequence with the set string, and calculates the hash value based on every access address and the set string in the updated second sequence. In practice, the set string is an empty string set according to the format of an access address. For example, the set duration is 5 minutes; of course, the set duration may also be chosen according to the actual situation.
In practice, the second sequence includes three access addresses and the set position is the middle position. This balances the computational cost of calculating feature values against the accuracy of the anomaly detection result.
For example, the electronic device may use the formula $V_s = \mathrm{MD5}(V_1 + V_2 + V_3)$ to calculate the feature value corresponding to the access address $V_2$ at the middle position of the second sequence, where the second sequence is $V_1 V_2 V_3$ and $\mathrm{MD5}(V_1 + V_2 + V_3)$ denotes calculating the feature value corresponding to $V_2$ with Message-Digest Algorithm 5 (MD5).
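The three-address window and the interval rule described above can be combined in a short Python sketch using the standard `hashlib` module. Several points are assumptions rather than statements of the application: "+" in the formula is read as string concatenation, `PLACEHOLDER` is a hypothetical form of the set string (its exact form is not given), and a missing neighbour at the end of the sequence is treated the same way as the missing neighbour at the start.

```python
import hashlib
from datetime import datetime, timedelta

PLACEHOLDER = "http://none/0/0"    # hypothetical "set string" in access-address format
SET_DURATION = timedelta(minutes=5)  # the example set duration from the text


def feature_values(sequence):
    """sequence: list of (access_time, access_address), ordered by access time.

    Returns a list of (access_address, feature_value), where each feature value
    is the MD5 of the previous, current and next address concatenated.
    """
    features = []
    for i, (t_mid, v_mid) in enumerate(sequence):
        window = []
        for j in (i - 1, i, i + 1):
            if j == i:
                window.append(v_mid)
            elif 0 <= j < len(sequence) and abs(sequence[j][0] - t_mid) < SET_DURATION:
                window.append(sequence[j][1])
            else:
                # Neighbour is missing, or the interval is >= the set duration.
                window.append(PLACEHOLDER)
        digest = hashlib.md5("".join(window).encode("utf-8")).hexdigest()
        features.append((v_mid, digest))
    return features


t0 = datetime(2022, 6, 1, 9, 0)
seq = [
    (t0, "http://xxxx/1/1"),
    (t0 + timedelta(minutes=1), "http://xxxx/1/2"),
    (t0 + timedelta(minutes=11), "http://xxxx/2/1"),  # 10 minutes after the previous access
]
features = feature_values(seq)
for address, value in features:
    print(address, value)
```

Because the third access occurs ten minutes after the second, it is replaced by the placeholder when the feature value of the second address is computed, and the second address is likewise replaced when the feature value of the third is computed.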
Note that the API gateway can also call plug-ins or services in the electronic device to verify login requests or access requests for a set business system. When an access request passes verification, the access information carried by the request is written into the access log.
For example, when a user logs in to a first business system from a terminal device, the user enters information such as a user name and password on the login interface of the first business system to trigger a login request. When the electronic device detects a login request passing through the API gateway, it calls a first plug-in or first service in the electronic device through the API gateway to authenticate the login request. If authentication passes, the user is allowed to log in to the first business system and the login request is sent to the server that manages the first business system. If authentication fails, the user is not allowed to log in to the first business system, and a prompt indicating that authentication failed is returned to the terminal device that sent the login request. Here, the first business system refers generally to any one of the at least one set business system; the first plug-in is a plug-in used for identity verification, and the first service is a service used for identity verification.
After logging in to the first business system, the user can trigger an access request by clicking a function or button on an interactive page of the first business system. When the electronic device detects an access request passing through the API gateway, it calls a second plug-in or second service in the electronic device through the API gateway to perform permission verification on the detected access request. If permission verification passes, the user is allowed to access the corresponding function of the first business system and the access request is sent to the server that manages the first business system. If permission verification fails, the user is not allowed to access the corresponding function of the first business system, and a prompt indicating that the user has no access permission is returned to the terminal device that sent the access request. The second plug-in is a plug-in used for permission verification; the second service is a service used for permission verification.
The electronic device stores, for each user, the set access paths in each set business system. A set access path is an access path the user has permission to access, and the set access paths can be updated dynamically. The permission verification of a detected access request may proceed as follows:
Determine whether the detected access request includes an access address, to obtain a first determination result. If the first determination result indicates that the access request includes an access address, look up the set access paths corresponding to the user identifier included in the access request and, among them, look for a set access path that matches the access address included in the access request; if a matching set access path is found, the permission verification passes. If the first determination result indicates that the access request does not include an access address, or no matching set access path is found, the permission verification fails. An access request that does not include an access address indicates that the business system is under attack by a hacker.
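The permission verification described above might look like the following Python sketch. The request layout, the `allowed_paths` mapping, and the use of exact string equality as the matching rule are all assumptions; the application only states that a matching set access path must be found.

```python
def verify_permission(request, allowed_paths):
    """Return True if the request's access address matches one of the
    set access paths stored for the request's user identifier."""
    address = request.get("access_address")
    if not address:
        # No access address in the request: verification fails.
        return False
    user_paths = allowed_paths.get(request.get("user_id"), set())
    # Assumption: "matching" is taken to mean exact equality.
    return address in user_paths


# Hypothetical set access paths per user identifier.
allowed_paths = {"u1": {"http://xxxx/1/2", "http://xxxx/1/3"}}

print(verify_permission({"user_id": "u1", "access_address": "http://xxxx/1/2"}, allowed_paths))
print(verify_permission({"user_id": "u1", "access_address": "http://xxxx/2/9"}, allowed_paths))
print(verify_permission({"user_id": "u1"}, allowed_paths))
```

Since the set access paths can be updated dynamically, passing the mapping in as an argument lets the caller supply the latest version on each check.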
Step 202: Determine whether the first access address is abnormal based on the first occurrence count of the first feature value in each set dimension and on the historical occurrence count, in each set dimension, of the first feature value for each unit duration within the set statistical period. The first feature value for each unit duration within the set statistical period is determined from the historical access log for that unit duration.
Here, based on the access information in the historical access log for each unit duration within the set statistical period, the electronic device determines the first sequence of each user identifier for each unit duration. Based on those first sequences, it determines the feature value corresponding to each access address in the first sequence of each user identifier for each unit duration. Based on those feature values, it determines the historical occurrence count of each feature value in each set dimension within each unit duration. From these counts, it determines the historical occurrence count of the first feature value in each set dimension within each unit duration.
The electronic device determines whether the first access address is abnormal based on the first occurrence count of the first feature value in each set dimension and on the historical occurrence count of the first feature value in each set dimension within each unit duration.
If the historical occurrence count of the first feature value in any set dimension is zero, the first access address is determined to be abnormal. Alternatively, if the first occurrence count of the first feature value in a first set dimension differs greatly from the corresponding historical occurrence count, the first access address is determined to be abnormal. The electronic device may also calculate a score for the first access address based on the first occurrence count of the first feature value in each set dimension and on the historical occurrence count of the first feature value in each set dimension within each unit duration, and determine from that score whether the first access address is an abnormal address. For example, if the score corresponding to the first access address is greater than or equal to a set threshold, the first access address is determined to be an abnormal address; if the score is less than the set threshold, the first access address is determined not to be an abnormal address. The set threshold is set based on abnormal access addresses.
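The first two decision rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: a historical count of zero is read as "the feature value never appeared in that dimension during the statistical period", and "differs greatly" is quantified by a hypothetical `max_ratio` against the mean historical count, which the application does not specify.

```python
def is_abnormal(first_counts, history_counts, max_ratio=3.0):
    """first_counts: {dimension: first occurrence count in the current unit duration}.
    history_counts: {dimension: [historical count per unit duration]}.
    max_ratio is a hypothetical tolerance, not a value from the application.
    """
    for dim, current in first_counts.items():
        history = history_counts.get(dim, [])
        total = sum(history)
        if total == 0:
            # The feature value never appeared historically in this dimension.
            return True
        mean = total / len(history)
        if current > max_ratio * mean:
            # The current count is far above the historical level.
            return True
    return False


history = {"user": [3, 5, 4], "department": [9, 11, 10]}
print(is_abnormal({"user": 4, "department": 10}, history))
print(is_abnormal({"user": 40, "department": 10}, history))
print(is_abnormal({"user": 4, "department": 10}, {"user": [0, 0, 0], "department": [9, 11, 10]}))
```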
Note that the feature values corresponding to access addresses are determined from the historical access logs in the same way as from the first access log; the details are not repeated here.
To improve the accuracy of the anomaly detection result, in some embodiments, as shown in FIG. 4, determining whether the first access address is abnormal includes the following steps 401 to 403:
Step 401: Based on the first occurrence count of the first feature value in each set dimension and on the historical occurrence count, in each set dimension, of the first feature value for each unit duration within the set statistical period, determine a first score of the first feature value in each set dimension within each unit duration.
Here, based on the historical occurrence counts, in each set dimension, of the first feature value for each unit duration within the set statistical period, the electronic device determines the maximum historical occurrence count of the first feature value in a first set dimension within the set statistical period. Based on the same historical occurrence counts, it determines a second occurrence count of the first feature value in each set dimension within each unit duration. It then determines the first score of the first feature value in each set dimension within each unit duration based on the first occurrence count of the first feature value in each set dimension, the maximum historical occurrence count of the first feature value in the first set dimension within the set statistical period, and the second occurrence count of the first feature value in each set dimension for each unit duration.
Here, the first set dimension is any one of the at least one set dimension.
The second occurrence count is related to the total number of unit durations contained in the set statistical period and/or to the weight of the first feature value for each unit duration. The later a unit duration falls within the set statistical period, the greater its weight.
为了更准确地确定出的第一分数,从而提高异常检测结果的准确度,如图5所示,在一些实施例中,所述确定出每个单位时长内的所述第一特征值在每个设定维度对应的第一分数,包括以下步骤501至步骤502:In order to determine the first score more accurately, thereby improving the accuracy of the abnormality detection result, as shown in FIG. 5 , in some embodiments, the determined first feature value within each unit time length is The first score corresponding to a set dimension includes the following steps 501 to 502:
步骤501:基于第一单位时长对应的所述第一特征值在第一设定维度对应的历史出现次数,确定出第一单位时长内的所述第一特征值在第一设定维度对应的第二出现次数;第一设定维度表征所述至少一个设定维度中的任一设定维度;第一单位时长表征所述设定统计周期中的任一单位时长。Step 501: Based on the historical occurrence times of the first feature value corresponding to the first unit time length in the first set dimension, determine the number of the first feature value corresponding to the first set dimension within the first unit time length The second number of occurrences; the first set dimension characterizes any set dimension in the at least one set dimension; the first unit duration represents any unit time length in the set statistical cycle.
Here, the electronic device determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on the historical number of occurrences of the first feature value in the first set dimension for the first unit duration and the total number of unit durations in the set statistical period.
To determine the second number of occurrences more accurately, in some embodiments, determining the second number of occurrences of the first feature value in the first set dimension within the first unit duration includes one of the following:
determining the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on the ratio of the historical number of occurrences of the first feature value in the first set dimension within the first unit duration to the total number of unit durations in the set statistical period;
determining the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on the historical number of occurrences of the first feature value in the first set dimension within each unit duration and the weight of the first feature value for each unit duration; wherein,
when the time corresponding to the first unit duration is later than the time corresponding to a second unit duration, the weight of the first unit duration is greater than the weight of the second unit duration.
Here, the electronic device determines the ratio of the historical number of occurrences of the first feature value in the first set dimension within the first unit duration to the total number of unit durations in the set statistical period, and determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on that ratio.
In practical application, the second number of occurrences of the first feature value in the first set dimension within the first unit duration can be calculated based on the following formula:
Figure PCTCN2022098734-appb-000002
Here, $C$ represents the second number of occurrences of the first feature value in the first set dimension within the first unit duration; $C_n$ represents the historical number of occurrences of the first feature value in the first set dimension for the first unit duration; and $m$ represents the total number of unit durations in the set statistical period. In some embodiments, the calculated $C$ may also be adjusted by an adjustment parameter to obtain an adjusted $C$.
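The formula image above is not reproduced in this text. Read together with the ratio-based option described earlier and the symbol definitions for $C$, $C_n$, and $m$, one consistent reconstruction (an assumption, not the published figure) is:

```latex
C = \frac{C_n}{m}
```

As a worked example under that assumption, $C_n = 12$ historical occurrences in the first unit duration and $m = 30$ unit durations in the set statistical period give $C = 12 / 30 = 0.4$.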
The electronic device may also determine the second number of occurrences of the first feature value in the first set dimension within the first unit duration as follows:
From the historical numbers of occurrences of the first feature value in the first set dimension for each unit duration in the set statistical period, the electronic device identifies the historical number of occurrences for the first unit duration. Based on that historical number of occurrences and the weight of the first feature value for the first unit duration, it determines a third number of occurrences of the first feature value in the first set dimension within the first unit duration. In the same way, it determines the third number of occurrences of the first feature value in the first set dimension within each unit duration.
Once the third number of occurrences of the first feature value in the first set dimension has been determined for each unit duration in the set statistical period, the electronic device sums all the third numbers of occurrences of the first feature value to obtain a first sum.
The electronic device determines the second number of occurrences of the first feature value in the first set dimension within the first unit duration based on the ratio of the third number of occurrences of the first feature value in the first set dimension within the first unit duration to the calculated first sum. Here,
the electronic device may take the quotient of the third number of occurrences of the first feature value in the first set dimension within the first unit duration and the calculated first sum as the second number of occurrences of the first feature value in the first set dimension within the first unit duration; the electronic device may also adjust this quotient with a set adjustment parameter to obtain the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
For example, the following formula may be used to determine the second number of occurrences of the first feature value in the first set dimension for the first unit duration:
Figure PCTCN2022098734-appb-000003
Here, $C$ represents the second number of occurrences of the first feature value in the first set dimension within the first unit duration; $C_n$ represents the historical number of occurrences of the first feature value in the first set dimension within the n-th unit duration; $g$ represents the weight of the first feature value for the n-th unit duration; and $m$ represents the total number of unit durations in the set statistical period. Note that, in practical application, the above formula may be adjusted with set coefficients or constants.
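The weighted procedure described above (third number of occurrences per unit duration, first sum, then a quotient) can be sketched as follows. This is a minimal illustration, not the published formula: the function name is hypothetical, and it assumes that the third number of occurrences is the product of the historical number of occurrences and the weight, which the text does not state explicitly.

```python
from typing import Sequence


def weighted_second_occurrences(
    history: Sequence[float],
    weights: Sequence[float],
    unit_index: int,
) -> float:
    """Second number of occurrences for one unit duration (weighted variant).

    history[n] is the historical number of occurrences in the n-th unit
    duration; weights[n] is the weight for that unit duration, with later
    unit durations expected to carry larger weights.
    """
    if len(history) != len(weights):
        raise ValueError("history and weights must have the same length")

    # Assumption: third number of occurrences = count * weight.
    third = [count * weight for count, weight in zip(history, weights)]

    # First sum: total of all third numbers of occurrences.
    first_sum = sum(third)
    if first_sum == 0:
        # Assumption: with no weighted history at all, report zero.
        return 0.0

    # Quotient of the unit duration's third number and the first sum.
    return third[unit_index] / first_sum
```

For instance, with counts `[4, 2, 6]` and weights `[1, 2, 3]`, the weighted counts are `[4, 4, 18]` and the first sum is 26, so the latest unit duration contributes 18/26 of the total.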
Step 502: Based on the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration, and based on the maximum historical number of occurrences of the first feature value in the set statistical period and that second number of occurrences, determine the first score of the first feature value in the first set dimension within the first unit duration.
Here, the electronic device determines the maximum historical number of occurrences of the first feature value in the set statistical period based on the historical numbers of occurrences of the first feature value in the first set dimension for each unit duration in the set statistical period.
The electronic device determines a first difference based on the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration; determines a second difference based on the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration; and, based on the first difference and the second difference, determines the first score of the first feature value in the first set dimension within the first unit duration.
Here, the first difference represents the difference between the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
The second difference represents the difference between the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
To improve the accuracy of the determined first score, and in turn the accuracy of the anomaly detection result, in some embodiments, determining the first score of the first feature value in the first set dimension within the first unit duration includes one of the following:
when the first difference equals zero, determining the first score of the first feature value in the first set dimension within the first unit duration based on a set score;
when the first difference is not equal to zero, determining the first score of the first feature value in the first set dimension within the first unit duration based on the ratio of the second difference to the first difference; wherein,
the first difference represents the difference between the maximum historical number of occurrences of the first feature value in the set statistical period and the second number of occurrences of the first feature value in the first set dimension within the first unit duration;
the second difference represents the difference between the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension within the first unit duration.
Here, after calculating the first difference, the electronic device judges whether the first difference equals zero. If it does, the device determines the corresponding set score based on the maximum historical number of occurrences of the first feature value in the set statistical period, and takes that set score as the first score of the first feature value in the first set dimension. Different maximum historical numbers of occurrences may correspond to different set scores.
After calculating the second difference, the electronic device judges whether the second difference equals zero. If it does not, the device determines the ratio of the second difference to the first difference and, based on that ratio, determines the first score of the first feature value in the first set dimension within the first unit duration. Different ratios may correspond to different first scores.
To determine the first score accurately and thereby improve the accuracy of the anomaly detection result, in some embodiments, when the first difference equals zero, determining the first score of the first feature value in the first set dimension within the first unit duration based on the set score includes one of the following:
when the maximum historical number of occurrences of the first feature value is greater than zero, taking the first set score corresponding to the first set dimension as the first score of the first feature value in the first set dimension within the first unit duration;
when the maximum historical number of occurrences of the first feature value equals zero and the first access address exists, taking the second set score as the first score of the first feature value in the first set dimension within the first unit duration.
Here, when the first difference equals zero and the maximum historical number of occurrences of the first feature value is greater than zero, the first access address exists in the historical access logs, and the maximum historical number of occurrences of the first feature value equals the second number of occurrences of the first feature value in the first set dimension. In this case, the electronic device takes the first set score corresponding to the first set dimension as the first score of the first feature value in the first set dimension within the first unit duration. The first set score is the default score for the first set dimension; in practical application, this default score is zero.
When the first difference equals zero and the maximum historical number of occurrences of the first feature value equals zero, the first access address does not exist in the historical access logs. In this case, the electronic device judges whether the first access address exists among the access addresses corresponding to the set business systems. If the first access address exists among the access addresses of any set business system, the first access address is used to access a new function in that set business system, and the second set score is taken as the first score of the first feature value in the first set dimension within the first unit duration. The second set score represents the initial score for a new access address.
To determine the first score accurately and thereby improve the accuracy of the anomaly detection result, in some embodiments, when the first difference is not equal to zero, determining the first score of the first feature value in the first set dimension within the first unit duration based on the ratio of the second difference to the first difference includes one of the following:
when the ratio of the second difference to the first difference is less than or equal to zero, determining that the first score of the first feature value in the first set dimension within the first unit duration is zero;
when the ratio of the second difference to the first difference is greater than zero, taking the quotient of the second difference and the first difference as the first score of the first feature value in the first set dimension within the first unit duration.
Here, when the first difference is not equal to zero, it is judged whether the second difference equals zero. If the second difference equals zero, the first score of the first feature value in the first set dimension within the first unit duration is determined to be zero. If the second difference is not equal to zero, it is judged whether the ratio of the second difference to the first difference is greater than zero: if the ratio is less than zero, the first score of the first feature value in the first set dimension is determined to be zero; if the ratio is greater than zero, the quotient of the second difference and the first difference (that is, second difference / first difference) is taken as the first score of the first feature value in the first set dimension within the first unit duration. In practical application, the first score of the first feature value in each set dimension within the first unit duration is calculated with the following formula:
Figure PCTCN2022098734-appb-000004
Here, $S_j$ represents the first score for the j-th set dimension within the first unit duration; $C_{ij}$ represents the first number of occurrences of the first feature value in the j-th set dimension; $C_{uj}$ represents the second number of occurrences of the first feature value in the j-th set dimension; and $C_{jmax}$ represents the maximum historical number of occurrences of the first feature value in the j-th set dimension.
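The case analysis for the first score can be sketched as a small function. This is an illustration of the rules stated in the text, not a reproduction of the formula image: the function and parameter names are hypothetical, the default value of `second_set_score` is a placeholder (the text gives no value), and any scaling factor the published formula may apply is not modeled.

```python
from typing import Optional


def first_score(
    first_count: float,        # C_ij: first number of occurrences
    second_count: float,       # C_uj: second number of occurrences
    max_history: float,        # C_jmax: maximum historical number of occurrences
    address_known: bool = True,     # first access address exists in a set business system
    first_set_score: float = 0.0,   # default score; the text says zero in practice
    second_set_score: float = 1.0,  # hypothetical initial score for a new address
) -> Optional[float]:
    """First score of a feature value in one set dimension for one unit duration."""
    first_diff = max_history - second_count
    second_diff = first_count - second_count

    if first_diff == 0:
        if max_history > 0:
            # Address already appears in the historical access logs.
            return first_set_score
        if address_known:
            # No history, but the address belongs to a set business system.
            return second_set_score
        # The text does not specify this case; report it as undetermined.
        return None

    ratio = second_diff / first_diff
    # A non-positive ratio scores zero; otherwise the quotient is the score.
    return ratio if ratio > 0 else 0.0
```

With `first_count=8`, `second_count=2`, and `max_history=5`, the differences are 6 and 3, so the sketch returns 2.0; a current count below the second number of occurrences yields 0.0.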
Step 402: Based on the first score of the first feature value in each set dimension for each unit duration, determine a plurality of second scores corresponding to the first access address, where each second score is determined from the first scores of the first feature value in all set dimensions within one unit duration.
Here, the electronic device determines the second score of the first access address for the first unit duration based on the first scores of the first feature value in all set dimensions for the first unit duration. In the same way, the second score of the first access address can be calculated for each unit duration in the set statistical period, yielding the plurality of second scores corresponding to the first access address.
Here, the first unit duration is any unit duration in the set statistical period. The number of first scores equals the number of unit durations in the set statistical period.
In practical application, the number of set dimensions may be 1 or greater than 1. To improve the accuracy of the determined second scores, and thereby the accuracy of the anomaly detection result, in some embodiments, determining the plurality of second scores corresponding to the first access address based on the first score of the first feature value in each set dimension within each unit duration includes:
when the number of set dimensions is 1, taking the first score of the first feature value in the first set dimension within the first unit duration as the second score of the first access address for the first unit duration;
when the number of set dimensions is greater than 1, determining the second score of the first access address for the first unit duration based on the first score of the first feature value in each set dimension within the first unit duration and the set weight of each set dimension.
Here, when the number of set dimensions is 1, the first score of the first feature value in the first set dimension within the first unit duration is taken as the second score of the first access address for the first unit duration.
When the number of set dimensions is greater than 1, a weighted sum is computed from the first score of the first feature value in each set dimension within the first unit duration and the set weight of each set dimension, giving the second score of the first access address for the first unit duration.
Following the above method, the electronic device can determine the second score of the first access address for each unit duration in the set statistical period.
For example, when the number of set dimensions is 4, the electronic device may use the following formula to calculate the second score of the first access address for the first unit duration:
$$S = S_1 W_1 + S_2 W_2 + S_3 W_3 + S_4 W_4$$
Here, $S$ represents the second score of the first access address for the first unit duration; $S_1$ represents the first score for the first dimension and $W_1$ the set weight of the first dimension; $S_2$ represents the first score for the second dimension and $W_2$ the set weight of the second dimension; $S_3$ represents the first score for the third dimension and $W_3$ the set weight of the third dimension; and $S_4$ represents the first score for the fourth dimension and $W_4$ the set weight of the fourth dimension.
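The second-score rule (pass-through for a single dimension, weighted sum otherwise) can be sketched as below. The function name and the example weights are hypothetical; the text does not give concrete weight values.

```python
from typing import Sequence


def second_score(first_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Second score of an access address for one unit duration.

    first_scores[j] is the first score in the j-th set dimension and
    weights[j] is the set weight of that dimension.
    """
    if not first_scores:
        raise ValueError("at least one set dimension is required")

    # One set dimension: the first score is used directly as the second score.
    if len(first_scores) == 1:
        return first_scores[0]

    if len(first_scores) != len(weights):
        raise ValueError("each set dimension needs exactly one weight")

    # Several set dimensions: S = S_1*W_1 + S_2*W_2 + ...
    return sum(score * weight for score, weight in zip(first_scores, weights))
```

For four dimensions with first scores `[1.0, 2.0, 3.0, 4.0]` and an illustrative weight of 100 for each dimension, the weighted sum is 1000.0.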
Step 403: Based on the plurality of second scores corresponding to the first access address, determine whether the first access address is abnormal.
Here, the electronic device may compare each second score with a set threshold to obtain a comparison result. If any comparison result indicates that a second score is greater than or equal to the set threshold, the first access address is determined to be abnormal; if all comparison results indicate that the second scores are less than the set threshold, the first access address is determined not to be abnormal.
In practical application, the set threshold may be 400.
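The threshold comparison can be sketched in a few lines. The function name is hypothetical; the default threshold of 400 is the example value given in the text.

```python
from typing import Iterable


def is_abnormal(second_scores: Iterable[float], threshold: float = 400.0) -> bool:
    """Return True if any second score reaches or exceeds the set threshold."""
    # One score at or above the threshold is enough to flag the address;
    # the address is normal only if every score stays below the threshold.
    return any(score >= threshold for score in second_scores)
```

Because a single unit duration can trigger the flag, the check is sensitive to a deviation from any one day (or other unit duration) in the set statistical period.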
To improve the timeliness of anomaly handling, in some embodiments, after the first access address is determined to be abnormal, the method further includes:
when the first access address is abnormal, blocking the access operation corresponding to the first access address, and/or restricting access by the terminal device that sent the first access address.
Here, when the electronic device determines that the first access address is abnormal, it blocks the access operation corresponding to the first access address, thereby preventing the terminal device that sent the first access address from continuing to access the corresponding set business system. Restricting access by the terminal device that sent the first access address means restricting that terminal device's access operations, so that it is prohibited from accessing the corresponding business system or the corresponding function of that business system. Because the electronic device can handle the abnormal first access address directly, without notifying the corresponding business system to do so, the time otherwise spent on sending an anomaly warning to the business system is saved, which improves the timeliness of anomaly handling.
In the embodiments of the present application, the first feature value corresponding to the first access address is determined based on the first access log, and the first number of occurrences of the first feature value in each of the at least one set dimension is determined; whether the first access address is abnormal is then determined based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences of the first feature value in each set dimension for each unit duration in the set statistical period. In this way, access requests that reach the business systems through the built-in API gateway can be detected directly, and the first access log is generated from the access information of the detected access requests. Because the business systems do not need to report access logs, access logs are collected in a more timely manner, which improves the timeliness of anomaly detection. Because the number of occurrences of the first feature value reflects the user's behavior habits, this solution can also improve the accuracy of the anomaly detection result.
FIG. 6 is a schematic flowchart of an implementation of the anomaly detection method provided by an application embodiment of the present application. As shown in FIG. 6, the anomaly detection method includes:
Step 601: Based on the historical access log for each unit duration in the set statistical period, determine the feature value corresponding to each historical access address, and determine the historical number of occurrences of the feature value in each set dimension for each unit duration in the set statistical period.
As shown in FIG. 7, in practical application, the electronic device is provided with an API gateway, a login plug-in, a permission plug-in, and a log plug-in, and supports a permission management service, a log storage service, a user behavior analysis service, a user historical behavior analysis service, and the like. Here,
the API gateway can call the login plug-in, the permission plug-in, the log plug-in, and so on. The login plug-in is used by the API gateway to authenticate login requests; the permission plug-in is used by the API gateway to verify the permissions of access requests; and the log plug-in is used by the API gateway to write the access information in access requests into the access log.
The permission management service is used to update and store the set access addresses corresponding to each user in each business system, and to synchronize abnormal access addresses. The user behavior analysis service is used to detect abnormal access addresses based on the first access log, for example by implementing steps 101 to 102. The user historical behavior analysis service is used to determine, based on the historical access logs, the historical number of occurrences of the feature value corresponding to each access address in each set dimension. The set dimensions include the user dimension, the department dimension, the position dimension, and the set dimension, which correspond respectively to the first, second, third, and fourth dimensions described above.
For the implementation of step 601, refer to the related description above; details are not repeated here.
Step 602: Based on the first access log, determine the first feature value corresponding to the first access address, and determine the first number of occurrences of the first feature value in each of the at least one set dimension. The first access address represents the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems. The first access log is used to record, in real time, the access information of access requests that pass through the built-in API gateway within a unit duration.
其中,步骤602与步骤101相同,实现过程请参照步骤101的相关描述,此处不赘述。Wherein, step 602 is the same as step 101, please refer to the relevant description of step 101 for the implementation process, and details are not repeated here.
步骤603:基于所述第一特征值在每个设定维度对应的第一出现次数,以及基于设定统计周期内每个单位时长对应的所述第一特征值在每个设定维度对应的历史出现次数,确定出所述第一访问地址是否异常。Step 603: Based on the first number of occurrences corresponding to the first feature value in each set dimension, and based on the first feature value corresponding to each unit duration in the set statistical period in each set dimension The number of historical occurrences determines whether the first access address is abnormal.
其中,步骤603与步骤102相同,实现过程请参照步骤102的相关描述,此处不赘述。Wherein, step 603 is the same as step 102, please refer to the relevant description of step 102 for the implementation process, and details are not repeated here.
步骤604:在所述第一访问地址异常的情况下,阻断所述第一访问地址对应的访问操作,和/或,对发送所述第一访问地址的终端设备进行访问限制。Step 604: When the first access address is abnormal, block the access operation corresponding to the first access address, and/or restrict access to the terminal device sending the first access address.
To implement the method of the embodiments of this application, an embodiment of this application further provides an anomaly detection apparatus. As shown in FIG. 8, the anomaly detection apparatus includes:
a first determining unit 81, configured to determine, based on a first access log, a first feature value corresponding to a first access address, and to determine a first number of occurrences of the first feature value in each of at least one set dimension, where the first access address is the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems, and the first access log records, in real time, the access information of access requests that pass through the built-in application programming interface (API) gateway within a unit duration; and
a second determining unit 82, configured to determine whether the first access address is abnormal based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences, in each set dimension, of the first feature value for each unit duration within a set statistical period, where the first feature value for each unit duration within the set statistical period is determined from the historical log of that unit duration.
In some embodiments, the access information includes at least a user identifier, an access time, and an access address, and the first determining unit 81 is further configured to:
determine, based on the access information in a second access log, a first sequence corresponding to a first user, where the second access log is the first access log or the historical access log of each unit duration within the set statistical period, and the first sequence includes all access addresses of the first user in the second access log;
determine, within the first sequence corresponding to the first user, a second sequence corresponding to the first user, where the second sequence includes at least one access address; and
compute a hash value from the second sequence corresponding to the first user, thereby obtaining the feature value of the access address located at a set position in the second sequence.
In some embodiments, the first determining unit 81 is specifically configured to:
when the time interval between a third access address and a second access address is greater than or equal to a set duration, replace the third access address in the second sequence with a set character string, where the second access address is the access address at the set position in the second sequence, and the third access address is an access address adjacent to the second access address in the second sequence; and
compute the hash value from the updated second sequence.
In some embodiments, the second sequence includes three access addresses, and the set position is the middle position.
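The feature-value computation described above can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the unspecified hash function, `GAP_PLACEHOLDER` and `MAX_GAP_SECONDS` are hypothetical values for the "set character string" and the "set duration", and the first and last addresses of a user's sequence are simply skipped because they have no complete three-address window.

```python
import hashlib
from typing import List, Tuple

GAP_PLACEHOLDER = "<GAP>"   # hypothetical "set character string"
MAX_GAP_SECONDS = 1800.0    # hypothetical "set duration"


def feature_values(visits: List[Tuple[float, str]]) -> List[Tuple[str, str]]:
    """Compute feature values for one user's access records.

    visits: (access_time, access_address) pairs for a single user.
    Returns (middle_address, feature_value) for each window of three.
    """
    ordered = sorted(visits)  # chronological order gives the first sequence
    result = []
    for i in range(1, len(ordered) - 1):
        (t_prev, a_prev), (t_mid, a_mid), (t_next, a_next) = ordered[i - 1:i + 2]
        # Replace a neighbour that is too far away in time from the middle address.
        if t_mid - t_prev >= MAX_GAP_SECONDS:
            a_prev = GAP_PLACEHOLDER
        if t_next - t_mid >= MAX_GAP_SECONDS:
            a_next = GAP_PLACEHOLDER
        window = "\n".join([a_prev, a_mid, a_next])  # the (updated) second sequence
        digest = hashlib.sha256(window.encode("utf-8")).hexdigest()
        result.append((a_mid, digest))
    return result
```

Hashing the neighbours together with the middle address means the same address reached through a different navigation path yields a different feature value, which is what lets the later counting steps distinguish usual from unusual access paths.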
In some embodiments, the access information further includes the department to which the user belongs, the user's post, and the identifier of the terminal device that sent the access address;
the at least one set dimension includes at least one of the following:
a first dimension, in which occurrences of a feature value are counted per user;
a second dimension, in which occurrences of a feature value are counted per department;
a third dimension, in which occurrences of a feature value are counted per post;
a fourth dimension, in which occurrences of the same feature value are counted per terminal device used by the user.
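Counting a feature value in the four dimensions listed above can be sketched as below. The record layout (dictionary keys `user`, `department`, `post`, `device`, `feature`) is a hypothetical one chosen for the example; the text only states which fields the access information contains.

```python
from collections import Counter
from typing import Dict, Iterable

# Keys of the hypothetical access record that define the four set dimensions.
DIMENSIONS = ("user", "department", "post", "device")


def count_by_dimension(records: Iterable[dict]) -> Dict[str, Counter]:
    """Count occurrences of each feature value per dimension.

    Each counter is keyed by (dimension_value, feature_value), e.g.
    counts["department"][("d1", "f1")] is how often feature f1 was
    seen for department d1 in the given log.
    """
    counts = {dim: Counter() for dim in DIMENSIONS}
    for rec in records:
        for dim in DIMENSIONS:
            counts[dim][(rec[dim], rec["feature"])] += 1
    return counts
```

Running the same function over the first access log and over each historical log would give the first number of occurrences and the historical numbers of occurrences, respectively.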
In some embodiments, the second determining unit 82 is specifically configured to:
determine, based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences, in each set dimension, of the first feature value for each unit duration within the set statistical period, a first score of the first feature value in each set dimension for each unit duration;
determine, based on the first score of the first feature value in each set dimension for each unit duration, a plurality of second scores corresponding to the first access address, where each second score is determined from the first scores of the first feature value in all set dimensions for one unit duration; and
determine, based on the plurality of second scores corresponding to the first access address, whether the first access address is abnormal.
In some embodiments, the second determining unit 82 is specifically configured to:
determine, based on the historical number of occurrences in a first set dimension of the first feature value for a first unit duration, a second number of occurrences in the first set dimension of the first feature value within the first unit duration, where the first set dimension is any one of the at least one set dimension, and the first unit duration is any unit duration in the set statistical period; and
determine the first score in the first set dimension of the first feature value within the first unit duration based on (i) the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences in the first set dimension of the first feature value within the first unit duration, and (ii) the maximum historical number of occurrences of the first feature value in the set statistical period and that second number of occurrences.
In some embodiments, the second determining unit 82 is specifically configured to:
determine the second number of occurrences in the first set dimension of the first feature value for the first unit duration based on the ratio of the historical number of occurrences in the first set dimension of the first feature value within the first unit duration to the total number of unit durations in the set statistical period;
determine the second number of occurrences in the first set dimension of the first feature value within the first unit duration based on the historical number of occurrences in the first set dimension of the first feature value within each unit duration and on the weight of the first feature value for each unit duration, where
when the time of the first unit duration is later than the time of a second unit duration, the weight of the first unit duration is greater than the weight of the second unit duration.
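The two ways of deriving the second number of occurrences can be sketched as follows. The text gives only the inputs (a ratio to the total number of unit durations, or per-unit weights that grow with recency), so the concrete formulas here are assumptions: the ratio variant divides one unit's historical count by the number of units, the weighted variant takes a weighted sum, and `recency_weights` is a hypothetical linear weighting that sums to 1.

```python
from typing import List, Sequence


def second_count_ratio(hist_count: float, total_units: int) -> float:
    """Ratio variant: historical count of one unit duration / number of units."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return hist_count / total_units


def recency_weights(total_units: int) -> List[float]:
    """Hypothetical weights: later unit durations get larger weights, sum is 1."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    total = total_units * (total_units + 1) / 2
    return [(i + 1) / total for i in range(total_units)]


def second_count_weighted(hist_counts: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted variant: sum of per-unit historical counts times their weights.

    hist_counts and weights are ordered from the oldest to the newest unit.
    """
    if len(hist_counts) != len(weights):
        raise ValueError("hist_counts and weights must have the same length")
    return sum(c * w for c, w in zip(hist_counts, weights))
```

With these weights, six occurrences in the most recent unit duration contribute three times as much as six occurrences in the oldest of three units, so recent behavior dominates the baseline.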
In some embodiments, the second determining unit 82 is specifically configured to:
when a first difference is equal to zero, determine the first score in the first set dimension of the first feature value within the first unit duration based on a set score;
when the first difference is not equal to zero, determine the first score in the first set dimension of the first feature value within the first unit duration based on the ratio of a second difference to the first difference, where
the first difference is the maximum historical number of occurrences of the first feature value in the set statistical period minus the second number of occurrences in the first set dimension of the first feature value within the first unit duration; and
the second difference is the first number of occurrences of the first feature value in the first set dimension minus the second number of occurrences in the first set dimension of the first feature value within the first unit duration.
In some embodiments, the second determining unit 82 is specifically configured to:
when the maximum historical number of occurrences of the first feature value is greater than zero, take a first set score corresponding to the first set dimension as the first score in the first set dimension of the first feature value within the first unit duration;
when the maximum historical number of occurrences of the first feature value is equal to zero and the first access address exists, take a second set score as the first score in the first set dimension of the first feature value within the first unit duration.
In some embodiments, the second determining unit 82 is specifically configured to:
when the ratio of the second difference to the first difference is less than or equal to zero, determine that the first score in the first set dimension of the first feature value within the first unit duration is zero;
when the ratio of the second difference to the first difference is greater than zero, take the quotient of the second difference divided by the first difference as the first score in the first set dimension of the first feature value within the first unit duration.
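The first-score rules above can be collected into one function. The sketch follows the stated cases directly; the default values of `first_set_score` and `second_set_score` are placeholders rather than values from the text, the tolerance used for the zero test is an added safeguard for floating-point baselines, and the condition that the first access address exists is assumed to hold because the address being scored was read from the first access log.

```python
import math


def first_score(first_count: float,
                second_count: float,
                max_hist: float,
                first_set_score: float = 1.0,
                second_set_score: float = 1.0) -> float:
    """First score of a feature value in one set dimension for one unit duration.

    first_count:  occurrences in the current (first) access log
    second_count: baseline derived from the historical log of the unit duration
    max_hist:     maximum historical occurrences in the set statistical period
    """
    first_diff = max_hist - second_count      # first difference
    second_diff = first_count - second_count  # second difference

    if math.isclose(first_diff, 0.0, abs_tol=1e-12):
        # Set-score cases: seen before (max_hist > 0) versus never seen.
        return first_set_score if max_hist > 0 else second_set_score

    ratio = second_diff / first_diff
    return ratio if ratio > 0 else 0.0
```

For example, with a current count of 8, a baseline of 2, and a historical maximum of 5, the score is (8 - 2) / (5 - 2) = 2; a current count below the baseline gives a non-positive ratio and therefore a score of 0.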
In some embodiments, the second determining unit 82 is specifically configured to:
when the number of set dimensions is 1, take the first score in the first set dimension of the first feature value within the first unit duration as the second score of the first access address for the first unit duration;
when the number of set dimensions is greater than 1, determine the second score of the first access address for the first unit duration based on the first score of the first feature value in each set dimension within the first unit duration and on the set weight of each set dimension.
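Combining the per-dimension first scores into a second score can be sketched as below. The text says only that the first scores and the set weights are used when there is more than one dimension; the weighted sum is an assumed combination rule, and the dictionary layout is hypothetical.

```python
from typing import Dict


def second_score(first_scores: Dict[str, float],
                 weights: Dict[str, float]) -> float:
    """Second score of an access address for one unit duration.

    first_scores: dimension name -> first score in that dimension
    weights:      dimension name -> set weight (ignored for a single dimension)
    """
    if not first_scores:
        raise ValueError("at least one set dimension is required")
    if len(first_scores) == 1:
        # Single dimension: the first score is used directly.
        return next(iter(first_scores.values()))
    # Several dimensions: assumed weighted sum of the first scores.
    return sum(weights[dim] * score for dim, score in first_scores.items())
```

Calling this once per unit duration in the set statistical period yields the plurality of second scores from which the abnormality decision is made.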
In some embodiments, the anomaly detection apparatus further includes:
an exception handling unit, configured to, when the first access address is abnormal, block the access operation corresponding to the first access address, and/or restrict access by the terminal device that sent the first access address.
In practice, the first determining unit 81, the second determining unit 82, and the exception handling unit may be implemented by a processor in the anomaly detection apparatus, such as a central processing unit (CPU), a digital signal processor (DSP), a microcontroller unit (MCU), or a field-programmable gate array (FPGA).
It should be noted that the division into program modules described above is only an example of how the anomaly detection apparatus of the foregoing embodiments performs anomaly detection. In practice, the processing may be assigned to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the anomaly detection apparatus of the foregoing embodiments and the anomaly detection method embodiments are based on the same concept; for the specific implementation, see the method embodiments, which are not repeated here.
Based on the hardware implementation of the above program modules, and to implement the method of the embodiments of this application, an embodiment of this application further provides an electronic device. FIG. 9 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of this application. As shown in FIG. 9, the electronic device 9 includes:
a communication interface 91, capable of exchanging information with other devices such as network devices; and
a processor 92, connected to the communication interface 91 to exchange information with other devices, and configured to perform, when running a computer program, the method provided by one or more of the above technical solutions. The computer program is stored in the memory 93.
In practice, the components of the electronic device 9 are coupled together through a bus system 94. The bus system 94 is configured to enable connection and communication among these components. In addition to a data bus, the bus system 94 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 94 in FIG. 9.
The memory 93 in the embodiments of this application is configured to store various types of data to support the operation of the electronic device 9. Examples of such data include any computer program to be run on the electronic device 9.
The memory 93 may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory 93 described in the embodiments of this application is intended to include, without limitation, these and any other suitable types of memory.
The methods disclosed in the foregoing embodiments of this application may be applied to, or implemented by, the processor 92. The processor 92 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor 92 or by instructions in the form of software. The processor 92 may be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 92 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may reside in a storage medium located in the memory 93; the processor 92 reads the program in the memory 93 and completes the steps of the foregoing methods in combination with its hardware.
Optionally, when executing the program, the processor 92 implements the corresponding processes implemented by the terminal in the methods of the embodiments of this application; for brevity, they are not repeated here.
In an exemplary embodiment, an embodiment of this application further provides a storage medium, namely a computer storage medium, and specifically a computer-readable storage medium, for example a first memory 93 that stores a computer program. The computer program can be executed by the processor 92 of the terminal to complete the steps of the foregoing methods. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or CD-ROM.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may all be integrated into one processing module, or each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
It should be noted that "first", "second", and the like are used to distinguish similar objects and do not necessarily describe a specific order or sequence. The technical solutions described in the embodiments of this application may be combined arbitrarily provided that they do not conflict.
It should be noted that the term "and/or" in the embodiments of this application merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, the term "at least one" herein denotes any one of multiple items or any combination of at least two of them; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
The foregoing is only a specific implementation of this application, and the protection scope of this application is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be defined by the protection scope of the claims.

Claims (16)

  1. An anomaly detection method, comprising:
    determining, based on a first access log, a first feature value corresponding to a first access address, and determining a first number of occurrences of the first feature value in each of at least one set dimension; wherein the first access address is the access address in any piece of access information in the first access log and is used to access any one of at least two set business systems; and the first access log records, in real time, the access information of access requests that pass through a built-in application programming interface (API) gateway within a unit duration;
    determining whether the first access address is abnormal based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences, in each set dimension, of the first feature value for each unit duration within a set statistical period; wherein the first feature value for each unit duration within the set statistical period is determined from the historical log of that unit duration.
  2. The method according to claim 1, wherein the access information comprises at least a user identifier, an access time, and an access address; and determining the feature value corresponding to an access address comprises:
    determining, based on the access information in a second access log, a first sequence corresponding to a first user; wherein the second access log is the first access log or the historical access log of each unit duration within the set statistical period; and the first sequence comprises all access addresses of the first user in the second access log;
    determining, within the first sequence corresponding to the first user, a second sequence corresponding to the first user; wherein the second sequence comprises at least one access address;
    computing a hash value from the second sequence corresponding to the first user, thereby obtaining the feature value of the access address located at a set position in the second sequence corresponding to the first user.
  3. The method according to claim 2, wherein computing the hash value comprises:
    when the time interval between a third access address and a second access address is greater than or equal to a set duration, replacing the third access address in the second sequence with a set character string; wherein the second access address is the access address at the set position in the second sequence, and the third access address is an access address adjacent to the second access address in the second sequence;
    computing the hash value from the updated second sequence.
  4. The method according to claim 2, wherein the second sequence comprises three access addresses, and the set position is the middle position.
  5. The method according to claim 2, wherein the access information further comprises the department to which the user belongs, the user's position, and an identifier of the terminal device that sent the access address;
    the at least one set dimension comprises at least one of the following:
    a first dimension, in which occurrences of a feature value are counted per user;
    a second dimension, in which occurrences of a feature value are counted per department;
    a third dimension, in which occurrences of a feature value are counted per position;
    a fourth dimension, in which occurrences of the same feature value are counted per terminal device used by the user.
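The four dimensions of claim 5 amount to grouping the same feature value by different keys. The sketch below assumes each log record is a dictionary with the hypothetical keys `user`, `department`, `position`, `device` and `feature`; none of these names come from the application.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical record keys, one per set dimension.
DIMENSIONS = ("user", "department", "position", "device")


def count_occurrences(records: Iterable[dict]) -> Dict[str, Counter]:
    """Count how often each feature value occurs within each dimension.

    The result maps a dimension name to a Counter keyed by
    (dimension_value, feature_value).
    """
    counts = {dim: Counter() for dim in DIMENSIONS}
    for rec in records:
        for dim in DIMENSIONS:
            counts[dim][(rec[dim], rec["feature"])] += 1
    return counts
```

For example, `count_occurrences(log)["department"][("risk", "f1")]` gives the number of times feature value `f1` was produced by anyone in the `risk` department during the unit duration covered by `log`.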
  6. The method according to claim 1, wherein determining whether the first access address is abnormal comprises:
    determining a first score of the first feature value in each set dimension for each unit duration, based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within the set statistical period;
    determining, based on the first score of the first feature value in each set dimension for each unit duration, a plurality of second scores corresponding to the first access address; wherein each second score is determined from the first scores, across all set dimensions, of the first feature value within one unit duration;
    determining, based on the plurality of second scores corresponding to the first access address, whether the first access address is abnormal.
  7. The method according to claim 6, wherein determining the first score of the first feature value in each set dimension for each unit duration comprises:
    determining, based on the historical number of occurrences, in a first set dimension, of the first feature value corresponding to a first unit duration, a second number of occurrences of the first feature value in the first set dimension for the first unit duration; wherein the first set dimension denotes any one of the at least one set dimension, and the first unit duration denotes any unit duration within the set statistical period;
    determining the first score of the first feature value in the first set dimension for the first unit duration, based on the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension for the first unit duration, and based on the maximum historical number of occurrences of the first feature value within the set statistical period and the second number of occurrences of the first feature value in the first set dimension for the first unit duration.
  8. The method according to claim 7, wherein determining the second number of occurrences of the first feature value in the first set dimension for the first unit duration comprises one of the following:
    determining the second number of occurrences of the first feature value in the first set dimension for the first unit duration based on the ratio of the historical number of occurrences, in the first set dimension, of the first feature value within the first unit duration to the total number of unit durations within the set statistical period;
    determining the second number of occurrences of the first feature value in the first set dimension for the first unit duration based on the historical number of occurrences, in the first set dimension, of the first feature value within each unit duration and on the weight of the first feature value for each unit duration; wherein,
    when the time corresponding to the first unit duration is later than the time corresponding to a second unit duration, the weight corresponding to the first unit duration is greater than the weight corresponding to the second unit duration.
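The two alternatives of claim 8 can be illustrated as two ways of turning a history of per-unit counts into a baseline. The claim only says the value is determined "based on" the ratio or the weights, so both functions below are one possible reading: the first treats the baseline as an arithmetic mean over the statistical period, and the second uses linearly increasing weights, which is an assumed scheme that merely satisfies the requirement that later unit durations weigh more.

```python
from typing import List


def mean_baseline(history: List[int]) -> float:
    """Baseline as the historical counts divided by the number of unit durations."""
    n = len(history)
    return sum(history) / n if n else 0.0


def recency_weighted_baseline(history: List[int]) -> float:
    """Baseline as a weighted average in which later unit durations count more.

    history is ordered from oldest to newest. The weights 1, 2, ..., n
    are a hypothetical choice; any increasing sequence would satisfy the claim.
    """
    n = len(history)
    if n == 0:
        return 0.0
    weights = [k + 1 for k in range(n)]
    return sum(w * c for w, c in zip(weights, history)) / sum(weights)
```

With a rising history such as `[2, 4, 6]`, the recency-weighted baseline is higher than the plain mean, so a recent increase in legitimate traffic raises the baseline faster.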
  9. The method according to claim 7, wherein determining the first score of the first feature value in the first set dimension for the first unit duration comprises one of the following:
    when a first difference is equal to zero, determining the first score of the first feature value in the first set dimension for the first unit duration based on a set score;
    when the first difference is not equal to zero, determining the first score of the first feature value in the first set dimension for the first unit duration based on the ratio of a second difference to the first difference;
    wherein the first difference denotes the difference between the maximum historical number of occurrences of the first feature value within the set statistical period and the second number of occurrences of the first feature value in the first set dimension for the first unit duration;
    the second difference denotes the difference between the first number of occurrences of the first feature value in the first set dimension and the second number of occurrences of the first feature value in the first set dimension for the first unit duration.
  10. The method according to claim 9, wherein determining, based on the set score, the first score of the first feature value in the first set dimension for the first unit duration comprises one of the following:
    when the maximum historical number of occurrences corresponding to the first feature value is greater than zero, determining a first set score corresponding to the first set dimension as the first score of the first feature value in the first set dimension for the first unit duration;
    when the maximum historical number of occurrences corresponding to the first feature value is equal to zero and the first access address exists, determining a second set score as the first score of the first feature value in the first set dimension for the first unit duration.
  11. The method according to claim 9, wherein determining, based on the ratio of the second difference to the first difference, the first score of the first feature value in the first set dimension for the first unit duration comprises one of the following:
    when the ratio of the second difference to the first difference is less than or equal to zero, determining that the first score of the first feature value in the first set dimension for the first unit duration is zero;
    when the ratio of the second difference to the first difference is greater than zero, determining the quotient of the second difference and the first difference as the first score of the first feature value in the first set dimension for the first unit duration.
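The scoring rule of claims 9 to 11 can be written as a single function. In this sketch `current` is the first number of occurrences, `baseline` is the second number of occurrences, and `max_hist` is the maximum historical number of occurrences; the parameter names and the default values of the two set scores are hypothetical, since the claims do not fix them.

```python
def first_score(current: float, baseline: float, max_hist: float,
                dim_default: float = 1.0, unseen_default: float = 1.0) -> float:
    """Score one feature value in one set dimension for one unit duration.

    dim_default:    assumed "first set score" for the dimension.
    unseen_default: assumed "second set score" for a feature value with no history.
    """
    first_diff = max_hist - baseline   # historical peak minus baseline
    second_diff = current - baseline   # current count minus baseline
    if first_diff == 0:
        # The history is flat, so the ratio is undefined; use a set score.
        # The function is only called for an observed access address,
        # so the "address exists" condition of claim 10 holds here.
        return dim_default if max_hist > 0 else unseen_default
    ratio = second_diff / first_diff
    # A count at or below the baseline is treated as normal.
    return ratio if ratio > 0 else 0.0
```

A score of 1.0 means the current count has reached the historical peak; values above 1.0 mean it has exceeded the peak by a proportional amount.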
  12. The method according to claim 6, wherein determining, based on the first score of the first feature value in each set dimension for each unit duration, the plurality of second scores corresponding to the first access address comprises:
    when the number of set dimensions is 1, determining the first score of the first feature value in the first set dimension for the first unit duration as the second score of the first access address for the first unit duration;
    when the number of set dimensions is greater than 1, determining the second score of the first access address for the first unit duration based on the first score of the first feature value in each set dimension for the first unit duration and a set weight of each set dimension.
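Claim 12 combines the per-dimension first scores into one second score per unit duration. The claim does not state how the set weights are applied, so the sketch below assumes a plain weighted sum; the function name and the dictionary layout are likewise illustrative.

```python
from typing import Dict


def second_score(first_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine the first scores of one unit duration into a second score.

    first_scores: dimension name -> first score.
    weights:      dimension name -> set weight (unused for a single dimension).
    """
    if len(first_scores) == 1:
        # With one set dimension the first score is used directly.
        return next(iter(first_scores.values()))
    # Assumption: the set weights are applied as a weighted sum.
    return sum(weights[dim] * score for dim, score in first_scores.items())
```

Calling this once per unit duration in the statistical period yields the plurality of second scores on which the final decision of claim 6 is based.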
  13. The method according to any one of claims 1 to 12, wherein the method further comprises:
    when the first access address is abnormal, blocking the access operation corresponding to the first access address, and/or restricting access by the terminal device that sent the first access address.
  14. An anomaly detection apparatus, comprising:
    a first determination unit configured to determine, based on a first access log, a first feature value corresponding to a first access address, and to determine a first number of occurrences of the first feature value in each of at least one set dimension; wherein the first access address denotes the access address in any piece of access information in the first access log and is used to access any one of at least two set service systems, and the first access log is used to record, in real time, the access information of access requests passing through a built-in application programming interface (API) gateway within a unit duration;
    a second determination unit configured to determine whether the first access address is abnormal, based on the first number of occurrences of the first feature value in each set dimension and on the historical number of occurrences, in each set dimension, of the first feature value corresponding to each unit duration within a set statistical period; wherein the first feature value corresponding to each unit duration within the set statistical period is determined from the historical log corresponding to that unit duration.
  15. An electronic device, comprising a processor and a memory configured to store a computer program capable of running on the processor,
    wherein the processor is configured to perform the steps of the method according to any one of claims 1 to 13 when running the computer program.
  16. A storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 13.
PCT/CN2022/098734 2021-12-14 2022-06-14 Anomaly detection method and apparatus, electronic device, and storage medium WO2023109046A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111526821.5 2021-12-14
CN202111526821.5A CN114386025A (en) 2021-12-14 2021-12-14 Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2023109046A1 (en) 2023-06-22

Family

ID=81196608

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098734 WO2023109046A1 (en) 2021-12-14 2022-06-14 Anomaly detection method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114386025A (en)
WO (1) WO2023109046A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386025A (en) * 2021-12-14 2022-04-22 深圳前海微众银行股份有限公司 Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140244572A1 (en) * 2006-11-27 2014-08-28 Alex T. Hill Qualification of website data and analysis using anomalies relative to historic patterns
WO2017071551A1 (en) * 2015-10-30 2017-05-04 北京奇虎科技有限公司 Method and device for preventing malicious access to login/registration interface
CN109246116A (en) * 2018-09-26 2019-01-18 北京云端智度科技有限公司 A kind of Network anomaly detection system based on DNS log analysis
CN110071941A (en) * 2019-05-08 2019-07-30 北京奇艺世纪科技有限公司 A kind of network attack detecting method, equipment, storage medium and computer equipment
CN111756679A (en) * 2019-03-29 2020-10-09 北京数安鑫云信息技术有限公司 Log analysis method and device, storage medium and computer equipment
CN114386025A (en) * 2021-12-14 2022-04-22 深圳前海微众银行股份有限公司 Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium

Also Published As

Publication number Publication date
CN114386025A (en) 2022-04-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22905812

Country of ref document: EP

Kind code of ref document: A1