WO2022037299A1 - Abnormal behavior detection method, apparatus, electronic device, and computer-readable storage medium - Google Patents

Abnormal behavior detection method, apparatus, electronic device, and computer-readable storage medium

Info

Publication number
WO2022037299A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
model
sub
data volume
detected
Prior art date
Application number
PCT/CN2021/104999
Other languages
English (en)
French (fr)
Inventor
程哲豪
董井然
陈守志
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP21857394.7A (EP4120167A4)
Priority to JP2022554811A (JP7430816B2)
Publication of WO2022037299A1
Priority to US17/898,324 (US20230004979A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/382 Payment protocols; Details thereof insuring higher security of transaction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Definitions

  • The present application relates to information processing technology in the field of computer applications, and in particular to an abnormal behavior detection method, apparatus, electronic device, and computer-readable storage medium.
  • Abnormal behavior detection is usually carried out in an unsupervised manner; for example, historical behavior information is clustered to obtain multiple clusters, and the relationship between the behavior information to be detected and the multiple clusters is used to determine whether the behavior information to be detected is abnormal.
  • However, when the feature dimension of the behavior information to be detected is low and the detection result is determined by clustering on the low-dimensional features, the detection result is likely to be wrong, which makes abnormal behavior detection less accurate.
  • Embodiments of the present application provide an abnormal behavior detection method, apparatus, electronic device, and computer-readable storage medium, which can improve the accuracy of abnormal behavior detection.
  • An embodiment of the present application provides a method for detecting abnormal behavior, and the method is executed by an electronic device, including:
  • the behavior information to be detected includes a first target object, a second target object, and a target data volume;
  • an abnormal data volume is determined from the first target sub-model, and a first detection result corresponding to the behavior information to be detected is determined based on a comparison result between the target data volume and the abnormal data volume;
  • the target detection result of the behavior information to be detected is determined by combining the first detection result and the second detection result.
  • the embodiment of the present application provides an abnormal behavior detection device, including:
  • an information acquisition module configured to acquire behavior information to be detected, where the behavior information to be detected includes a first target object, a second target object, and a target data volume;
  • a first detection module configured to obtain a first target sub-model corresponding to the first target object from a first preset object model;
  • the first detection module is further configured to determine an abnormal data volume from the first target sub-model based on preset model parameters, and to determine, based on a comparison result between the target data volume and the abnormal data volume, the first detection result corresponding to the behavior information to be detected;
  • a second detection module configured to obtain, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity with the first target sub-model;
  • the second detection module is further configured to obtain the target maximum data volume corresponding to the second target sub-model, and to determine, based on a comparison result between the target data volume and the target maximum data volume, the second detection result corresponding to the behavior information to be detected;
  • a result determination module configured to combine the first detection result and the second detection result to determine the target detection result of the behavior information to be detected.
  • the embodiment of the present application provides an electronic device for abnormal behavior detection, including:
  • the processor is configured to implement the abnormal behavior detection method provided by the embodiment of the present application when executing the executable instructions stored in the memory.
  • Embodiments of the present application provide a computer-readable storage medium storing executable instructions for causing a processor to execute the abnormal behavior detection method provided by the embodiments of the present application.
  • The embodiments of the present application have at least the following beneficial effects: when the behavior information to be detected includes three dimensions of features (the first target object, the second target object, and the target data volume), the target data volume is compared with the abnormal data volume and with the target maximum data volume, respectively.
  • The abnormal data volume is the abnormality judgment condition for the first target object, determined based on the first preset object model, and the target maximum data volume is the abnormality judgment condition for the second target object, determined based on the second preset object model.
  • Therefore, even with low-dimensional features, whether the target data volume lies within the preset interval can be judged from the two dimensions of the first target object and the second target object, so that the target detection result indicating whether the behavior information to be detected is abnormal is obtained accurately; thus, the accuracy of abnormal behavior detection can be improved.
  • FIG. 1 is an optional schematic diagram of the architecture of an abnormal behavior detection system provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of the composition and structure of the server in FIG. 1 provided by an embodiment of the present application;
  • FIG. 3 is an optional schematic flowchart of the abnormal behavior detection method provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of an exemplary determination of an abnormal data amount provided by an embodiment of the present application.
  • FIG. 6a is an exemplary schematic diagram of the amount of data to be converted provided by an embodiment of the present application.
  • FIG. 6b is a schematic diagram of an exemplary data volume conversion provided by an embodiment of the present application.
  • FIG. 7 is an exemplary schematic diagram of obtaining similarity provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an exemplary acquisition of a merged sub-model provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an exemplary abnormal behavior detection process provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an exemplary acquisition model provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an exemplary model provided by an embodiment of the present application.
  • The terms "first/second" involved are only used to distinguish similar objects and do not represent a specific ordering of the objects. It is understood that, where permitted, the specific order or sequence of "first/second" may be interchanged, so that the embodiments of the application described herein can be implemented in sequences other than those illustrated or described herein.
  • Abnormal behavior detection refers to detecting whether the data corresponding to a user's operation behavior conforms to a preset operation process or the actual process; for example, detecting payments made through a hacked account, or detecting volume inflation through cheating.
  • Offline environment: a platform that processes large amounts of data (for example, hundreds of millions of records) with data mining tools (for example, "hadoop" or "spark"); it usually has a large delay (for example, a one-day delay), so its real-time performance is low.
  • Real-time/online environment: a platform for real-time and efficient storage and computation of the data to be processed, usually with a delay of milliseconds, low complexity, and high real-time performance.
  • When an unsupervised method is used for abnormal behavior detection, an unsupervised algorithm is applied to the detection; for example, if the behavior information in the application scenario obeys a Gaussian mixture distribution (Mixture Gaussian Distribution), whether the behavior information to be detected is abnormal can be determined by judging whether it obeys that distribution.
  • For another example, the historical behavior information is clustered to obtain multiple clusters, and the relationship between the behavior information to be detected and the multiple clusters is used to determine whether the behavior information to be detected is abnormal.
  • the behavior information is determined to be abnormal behavior information.
  • However, when the behavior information to be detected includes only three dimensions of features, namely the first target object, the second target object, and the target data volume (for example, user, merchant, and amount; user, commodity, and amount; user, article, and reading volume), the feature dimension is low; when such low-dimensional features are used for unsupervised detection, the detection result is likely to be wrong, so the accuracy of abnormal behavior detection is low.
  • the detection time will be longer due to the high dimension/complexity of the features, so the real-time detection is low.
  • a supervised method for abnormal behavior detection refers to annotating samples, using the characteristics of the samples and the labeled information to train a network model, and then using the network model to detect whether the behavior information to be detected is abnormal.
  • The samples need to be labeled, and when the amount of sample data is large, for example on the order of 100 million, labeling is hardly feasible; for example, when detecting whether payments are abnormal, the number of payments made every day reaches the 100 million level, so manual labeling is impractical, and a longer labeling time also leads to a longer training time for the network model.
  • As a result, the trained network model may no longer suit the current application scenario; the approach is therefore hardly feasible and cannot be applied to application scenarios with high timeliness requirements.
  • The embodiments of the present application provide an abnormal behavior detection method, apparatus, electronic device, and computer-readable storage medium, which can detect abnormal behavior quickly and accurately and can be applied to application scenarios with high timeliness requirements.
  • The device for abnormal behavior detection may be implemented as any of various types of terminals, such as a notebook computer, a tablet computer, a desktop computer, a set-top box, a smart TV, a smart speaker, or a mobile device (e.g., a mobile phone, portable music player, personal digital assistant, dedicated messaging device, portable game device, in-vehicle device, smart phone, or smart watch), and may also be implemented as a server.
  • FIG. 1 is an optional schematic structural diagram of an abnormal behavior detection system provided by an embodiment of the present application. As shown in FIG. 1, in order to support an abnormal behavior detection application, in the abnormal behavior detection system 100 the terminal 200 (the terminal 200-1 and the terminal 200-2 are shown as examples) is connected to the server 400 (the abnormal behavior detection device) through the network 300; the network 300 may be a wide area network, a local area network, or a combination of the two.
  • the abnormal behavior detection system 100 further includes a database 500 .
  • the database 500 is used to store the first preset object model and the second preset object model, and provide the first preset object model and the second preset object model to the server 400 to realize abnormal behavior detection.
  • The terminal 200-1 is configured to receive the user's payment operation through the control 200-111 (a payment button is shown as an example) on the graphical interface 200-11 and, in response to the payment operation, to send to the server 400 through the network 300 the behavior information to be detected, which includes the merchant (first target object), the user (second target object), and the amount (target data volume). It is also configured to receive the target detection result sent by the server 400 through the network 300 and to display it on the graphical interface 200-12.
  • The terminal 200-2 is configured to receive the user's reading operation through the control 200-211 (a reading button is shown as an example) on the graphical interface 200-21 and, in response to the reading operation, to send to the server 400 through the network 300 the behavior information to be detected, which includes the article (first target object), the user (second target object), and the reading volume (target data volume). It is also configured to receive the target detection result sent by the server 400 through the network 300 and to display it on the graphical interface 200-22.
  • The server 400 is configured to: obtain the behavior information to be detected from the terminal 200 through the network 300, where the behavior information to be detected includes the first target object, the second target object, and the target data volume; obtain, from the first preset object model provided by the database 500, the first target sub-model corresponding to the first target object; determine the abnormal data volume from the first target sub-model based on the preset model parameters, compare the target data volume with the abnormal data volume, and determine the first detection result corresponding to the behavior information to be detected; obtain, from the second preset object model provided by the database 500, the second target sub-model that corresponds to the second target object and has the highest similarity with the first target sub-model; obtain the target maximum data volume corresponding to the second target sub-model, compare the target data volume with the target maximum data volume, and determine the second detection result corresponding to the behavior information to be detected; and combine the first detection result and the second detection result to determine the target detection result of the behavior information to be detected. It is also configured to send the target detection result to the terminal 200 through the network 300.
  • The server 400 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
  • the terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in this embodiment of the present invention.
  • FIG. 2 is a schematic diagram of the composition and structure of a server in FIG. 1 provided by an embodiment of the present application.
  • the server 400 shown in FIG. 2 includes: at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430.
  • the various components in server 400 are coupled together by bus system 440 .
  • bus system 440 is used to implement the connection communication between these components.
  • the bus system 440 also includes a power bus, a control bus, and a status signal bus.
  • the various buses are labeled as bus system 440 in FIG. 2 .
  • the processor 410 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc., where a general-purpose processor may be a microprocessor or any conventional processor or the like.
  • User interface 430 includes one or more output devices 431 that enable presentation of media content, including one or more speakers and/or one or more visual display screens.
  • User interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, and other input buttons and controls.
  • Memory 450 may be removable, non-removable, or a combination thereof.
  • Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like.
  • Memory 450 optionally includes one or more storage devices that are physically remote from processor 410 .
  • Memory 450 includes volatile memory or non-volatile memory, and may also include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory).
  • the memory 450 described in the embodiments of the present application includes any suitable type of memory.
  • memory 450 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
  • the operating system 451 includes system programs for processing various basic system services and performing hardware-related tasks, such as framework layer, core library layer, driver layer, etc., for implementing various basic services and processing hardware-based tasks;
  • a presentation module 453 for enabling presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more output devices 431 (e.g., a display screen, speakers, etc.) associated with the user interface 430;
  • An input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.
  • the abnormal behavior detection apparatus provided by the embodiments of the present application may be implemented in software.
  • FIG. 2 shows the abnormal behavior detection apparatus 455 stored in the memory 450, which may be software in the form of programs, plug-ins, and the like, and includes the following software modules: information acquisition module 4551, first detection module 4552, second detection module 4553, result determination module 4554, and model acquisition module 4555. These modules are logical, so they can be combined arbitrarily or further split according to the functions implemented. The function of each module is explained below.
  • the abnormal behavior detection apparatus provided by the embodiments of the present application may be implemented in hardware.
  • For example, the abnormal behavior detection apparatus provided by the embodiments of the present application may be a processor in the form of a hardware decoding processor that is programmed to execute the abnormal behavior detection method provided by the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may adopt one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
  • the abnormal behavior detection method provided by the embodiment of the present application will be described with reference to an exemplary application when the abnormal behavior detection device is implemented as a server provided by the embodiment of the present application.
  • FIG. 3 is an optional schematic flowchart of the abnormal behavior detection method provided by the embodiment of the present application, which will be described with reference to the steps shown in FIG. 3 .
  • When a user operates a function application on the terminal, such as making a payment, reading an article, or clicking an advertisement, the terminal generates operation data in response to the operation and sends the operation data to the server; the server thereby receives the behavior information to be detected, which completes the acquisition of the behavior information to be detected.
  • The behavior information to be detected is the detection object and includes the first target object, the second target object, and the target data volume; the first target object is the operated object, such as an advertisement, a merchant, a commodity, or an article;
  • the second target object is the object that performs the operation, such as a user;
  • the target data volume is the data volume generated by the operation of the first target object by the second target object, such as an amount, a click or a reading volume.
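  • As an illustration only, the three-field behavior information described above could be represented as follows. This is a minimal sketch; the class name `BehaviorInfo`, its field names, and the sample values are assumptions introduced here and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorInfo:
    """Hypothetical container for the behavior information to be detected."""
    first_target_object: str    # operated object, e.g. a merchant or an article
    second_target_object: str   # object performing the operation, e.g. a user
    target_data_volume: float   # e.g. an amount, a click count, or a reading volume


# Example: user C pays an amount of 120.0 to merchant A (illustrative values).
payment = BehaviorInfo(
    first_target_object="merchant_A",
    second_target_object="user_C",
    target_data_volume=120.0,
)
```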
  • The server pre-stores the first preset object model, or can obtain it in advance. The first preset object model consists of a model corresponding to each first object, where each model is data volume distribution information. Since the first target object is one of the first objects, the server can obtain the model corresponding to the first target object from the first preset object model; the first target sub-model is thereby obtained.
  • That is, the first preset object model is the correspondence between first objects and data volume distribution information. The server matches the first target object against each first object in this correspondence and determines the data volume distribution information corresponding to the matched first object as the first target sub-model.
  • the first target sub-model is the data volume distribution information corresponding to the first target object, for example, the histogram of merchant A about the amount, or the distribution of article B about the reading volume.
  • When the server does not find a model corresponding to the first target object in the first preset object model, it builds a first target sub-model for the first target object and adds that sub-model to the first preset object model.
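  • The lookup with build-on-miss described above can be sketched as a dictionary keyed by first object. This is only a sketch under stated assumptions: the fixed-width histogram representation, the helper names `build_sub_model` and `get_first_target_sub_model`, the bin settings, and the idea of building a missing sub-model from historical data volumes are all hypothetical and not specified by the application.

```python
from typing import Dict, List, Sequence

Histogram = List[float]  # probability mass per fixed-width bin (assumed representation)


def build_sub_model(volumes: Sequence[float], bin_width: float, n_bins: int) -> Histogram:
    """Build a normalized histogram; values beyond the last bin are clipped into it."""
    counts = [0] * n_bins
    for v in volumes:
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def get_first_target_sub_model(
    first_preset_object_model: Dict[str, Histogram],
    first_target_object: str,
    history: Sequence[float],
    bin_width: float = 10.0,
    n_bins: int = 5,
) -> Histogram:
    """Return the stored sub-model, building and storing one if none is found."""
    if first_target_object not in first_preset_object_model:
        first_preset_object_model[first_target_object] = build_sub_model(
            history, bin_width, n_bins
        )
    return first_preset_object_model[first_target_object]


model: Dict[str, Histogram] = {}
sub_model = get_first_target_sub_model(model, "merchant_A", [5, 15, 15, 45])
```

  • After the call, `model` contains an entry for `"merchant_A"`, so a later lookup for the same first target object returns the stored distribution without rebuilding it.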
  • S303 Determine the abnormal data volume from the first target sub-model based on the preset model parameters, and determine the first detection result corresponding to the behavior information to be detected based on the comparison result between the target data volume and the abnormal data volume.
  • the server stores the preset model parameters, or the server can obtain the preset model parameters in advance;
  • The preset model parameter is a preset quantile point in the data volume distribution information and is used to determine the preset interval of the data volume corresponding to the first target object; this preset interval is the abnormality judgment condition for the first target object. The server therefore determines, from the first target sub-model, the target position corresponding to the preset model parameter; the data volume at the target position is the abnormal data volume. Next, the server compares the target data volume with the abnormal data volume; from their size relationship in the comparison result, it can determine whether the target data volume lies within the preset interval of the data volume corresponding to the first target object (the interval from zero to the abnormal data volume). The first detection result corresponding to the behavior information to be detected is thereby obtained.
  • The first detection result indicates whether the behavior information to be detected is abnormal with respect to the first target object. When the target data volume is greater than the abnormal data volume, the target data volume lies outside the preset interval of the data volume corresponding to the first target object, so the server determines a first detection result indicating that the behavior information to be detected is abnormal with respect to the first target object; when the target data volume is less than or equal to the abnormal data volume, the target data volume lies within that preset interval, so the server determines a first detection result indicating that the behavior information to be detected is normal with respect to the first target object.
  • FIG. 4 is an exemplary schematic diagram of determining the abnormal data volume provided by an embodiment of the present application. As shown in FIG. 4, in the histogram 4-1 (the first target sub-model), the horizontal axis is the amount value and the vertical axis is the probability value.
  • When the preset model parameter is the 99th percentile, the probabilities of the amount segments in the histogram 4-1 are accumulated in ascending order of amount value.
  • The position at which the accumulated probability reaches 99% is the target position 14-2; that is, the probability that an amount is smaller than the amount value corresponding to the position 14-2 reaches 99%.
  • The amount value corresponding to the position 14-2 is the abnormal data volume.
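  • The percentile lookup and the comparison in S303 can be sketched as follows. This is a minimal sketch, assuming the first target sub-model is a histogram given as bin upper edges plus per-bin probabilities, and assuming the upper edge of the bin where the accumulated probability first reaches the quantile is taken as the abnormal data volume; the function names and sample numbers are hypothetical.

```python
from typing import Sequence


def abnormal_data_volume(
    bin_upper_edges: Sequence[float],
    probabilities: Sequence[float],
    quantile: float = 0.99,
) -> float:
    """Accumulate bin probabilities in ascending order of data volume and return
    the upper edge of the first bin where the running total reaches `quantile`."""
    cumulative = 0.0
    for edge, p in zip(bin_upper_edges, probabilities):
        cumulative += p
        if cumulative >= quantile:
            return edge
    return bin_upper_edges[-1]  # fallback if rounding keeps the total below quantile


def first_detection_result(target_data_volume: float, abnormal_volume: float) -> str:
    """Abnormal when the target data volume exceeds the abnormal data volume."""
    return "abnormal" if target_data_volume > abnormal_volume else "normal"


# Illustrative histogram of amounts for one merchant (assumed numbers).
edges = [100.0, 200.0, 300.0, 400.0]
probs = [0.5, 0.3, 0.195, 0.005]
threshold = abnormal_data_volume(edges, probs, quantile=0.99)
result = first_detection_result(350.0, threshold)
```

  • With these sample numbers the accumulated probability reaches the 99th percentile in the third bin, so the preset interval runs from zero to 300.0 and an amount of 350.0 falls outside it.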
  • A second preset object model is pre-stored in the server, or the server can obtain it in advance.
  • The second preset object model is the set of models formed by the data volume distribution information corresponding to each second object. Since the second target object is one of the second objects, the server can obtain the model group corresponding to the second target object from the second preset object model and match, from that model group, the model with the highest similarity to the first target sub-model; the second target sub-model is thereby obtained.
  • That is, the server matches the second target object against each second object in the second preset object model and obtains the data volume distribution information of each first object associated with the matched second object (the model group); from these, it then selects the data volume distribution information most similar to the first target sub-model, which is the second target sub-model.
  • The data volume distribution information corresponding to each first object in the first preset object model is different from that in the second preset object model; the latter is obtained by processing the former.
  • the second target sub-model is the data volume distribution information about the first target object corresponding to the second target object, for example, user C's amount histogram about merchant A, and user D's reading volume distribution about article B.
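  • Selecting the most similar model from the model group can be sketched as an argmax over a similarity score. The similarity measure is not defined in this passage, so the sketch below substitutes histogram intersection purely as an assumption; the function names and the sample model group are hypothetical as well.

```python
from typing import Dict, List, Tuple

Histogram = List[float]


def histogram_intersection(p: Histogram, q: Histogram) -> float:
    """Assumed similarity measure: sum of bin-wise minima (1.0 for identical
    normalized histograms, 0.0 for disjoint ones)."""
    return sum(min(a, b) for a, b in zip(p, q))


def select_second_target_sub_model(
    model_group: Dict[str, Histogram],
    first_target_sub_model: Histogram,
) -> Tuple[str, Histogram]:
    """Return the key and histogram in the model group that is most similar
    to the first target sub-model."""
    best_key = max(
        model_group,
        key=lambda k: histogram_intersection(model_group[k], first_target_sub_model),
    )
    return best_key, model_group[best_key]


# Model group of one user (second target object), keyed by first object.
user_c_models = {
    "merchant_X": [1.0, 0.0, 0.0],
    "merchant_Y": [0.4, 0.6, 0.0],
}
first_sub_model = [0.5, 0.5, 0.0]
best_key, second_sub_model = select_second_target_sub_model(user_c_models, first_sub_model)
```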
  • S305 Acquire the target maximum data volume corresponding to the second target sub-model, and determine the second detection result corresponding to the behavior information to be detected based on the comparison result between the target data volume and the target maximum data volume.
  • the server determines the target maximum data volume based on the maximum data volume corresponding to the second target object in the second preset object model; that is, the target maximum data volume may be the maximum data volume corresponding to the second target object itself, or may be derived from that maximum data volume, and so on, which is not specifically limited in this embodiment of the present application.
  • once the server has obtained the target maximum data volume, it has also determined the preset interval of the data volume corresponding to the second target object (the interval from zero to the target maximum data volume), and this preset interval is the abnormality judgment condition corresponding to the second target object; thus, by comparing the target data volume with the target maximum data volume, the server can determine from their size relationship whether the target data volume lies within the preset interval, and thereby obtains the second detection result corresponding to the behavior information to be detected.
  • the second detection result indicates whether the behavior information to be detected is abnormal with respect to the second target object: when the target data volume is greater than the target maximum data volume, the target data volume lies outside the preset interval of the data volume corresponding to the second target object, so the server determines a second detection result stating that the behavior information to be detected is abnormal with respect to the second target object; when the target data volume is less than or equal to the target maximum data volume, the target data volume lies within the preset interval, so the server determines a second detection result stating that the behavior information to be detected is normal with respect to the second target object.
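  • as a minimal sketch of the comparison in S305, the check reduces to testing whether the target data volume exceeds the target maximum data volume. The function name and the string labels below are hypothetical and chosen only for illustration.

```python
def second_detection_result(target_volume: float, target_max_volume: float) -> str:
    """Compare the target data volume with the target maximum data volume.

    The preset interval is [0, target_max_volume]; a value above the upper
    bound is reported as abnormal with respect to the second target object.
    The labels 'abnormal' and 'normal' are illustrative, not from the source.
    """
    if target_volume > target_max_volume:
        return "abnormal"
    return "normal"
```

  • a value equal to the target maximum data volume is treated as normal, matching the "less than or equal to" condition stated above.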
  • the server determines the target detection result of the behavior information to be detected by combining the first detection result and the second detection result.
  • the target detection result refers to whether the behavior information to be detected is abnormal.
  • when the server determines the target detection result of the behavior information to be detected by combining the first detection result and the second detection result, four cases arise. When the first detection result is that the behavior information to be detected is abnormal with respect to the first target object, and the second detection result is that it is abnormal with respect to the second target object, the server determines a target detection result stating that the behavior information to be detected is abnormal. When the first detection result is that the behavior information to be detected is normal with respect to the first target object, and the second detection result is that it is abnormal with respect to the second target object, the server may determine a target detection result stating that the behavior information to be detected is abnormal, may determine a target detection result stating that it is normal, or may mark the behavior information to be detected as behavior information to be reviewed, which is then examined further by manual means or by intelligent detection processing. When the first detection result is that the behavior information to be detected is normal with respect to the first target object, and the second detection result is that it is normal with respect to the second target object, the server determines a target detection result stating that the behavior information to be detected is normal. When the first detection result is that the behavior information to be detected is abnormal with respect to the first target object, and the second detection result is that it is normal with respect to the second target object, the server may determine a target detection result stating that the behavior information to be detected is normal, may determine a target detection result stating that it is abnormal, or may mark the behavior information to be detected as behavior information to be reviewed, which is then examined further by manual means or by intelligent detection processing.
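  • the combination of the two detection results can be sketched as a small decision function. This is only one possible decision matrix: the `Verdict` enum, the function name and the `on_disagreement` parameter are hypothetical, and sending disagreeing cases to review by default is an assumption made for the example.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"
    REVIEW = "to be reviewed"


def combine_results(first_abnormal: bool,
                    second_abnormal: bool,
                    on_disagreement: Verdict = Verdict.REVIEW) -> Verdict:
    """Combine the first and second detection results into a target result.

    Both abnormal -> abnormal; both normal -> normal. When the two results
    disagree, the outcome is configurable (abnormal, normal, or review),
    because the text allows any of the three in that situation.
    """
    if first_abnormal and second_abnormal:
        return Verdict.ABNORMAL
    if not first_abnormal and not second_abnormal:
        return Verdict.NORMAL
    return on_disagreement
```

  • passing `on_disagreement=Verdict.ABNORMAL` or `Verdict.NORMAL` reproduces the stricter or more lenient alternatives.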
  • the server may also determine target processing information according to the target detection result; the target processing information is the processing method applied to the behavior information to be detected. For example, when the behavior information to be detected is a payment operation and the target detection result is that the behavior information to be detected is abnormal, the target processing information is to intercept the payment operation; when the behavior information to be detected is an advertisement click and the target detection result is that the behavior information to be detected is abnormal, the target processing information is to block the advertisement click.
  • the behavior information to be detected includes features in three dimensions, namely the first target object, the second target object and the target data volume, and the target detection result is determined by combining the results of comparing the target data volume with the abnormal data volume and with the target maximum data volume, respectively. The abnormal data volume is the abnormality judgment condition for the first target object, and the target maximum data volume is the abnormality judgment condition for the second target object determined from the second preset object model; therefore, by judging from the two dimensions of the first target object and the second target object whether the target data volume is within the preset range, the target detection result of whether the behavior information to be detected is abnormal can be obtained accurately from low-dimensional features; thus, the accuracy of abnormal behavior detection is high.
  • FIG. 5 is another optional schematic flowchart of the abnormal behavior detection method provided by the embodiment of the present application; as shown in FIG. 5, in the embodiment of the present application, S307-S311 are further included before S302; that is, before the server obtains the first target sub-model corresponding to the first target object from the first preset object model, the abnormal behavior detection method further includes S307-S311, and each step is described below.
  • the server obtains the behavior information in the current cycle, and thereby obtains the behavior information samples; for example, the payment orders of the past week, or the reading records of the past month.
  • the current cycle refers to the most recent preset cycle.
  • the behavior information samples are the set formed by the behavior information, in the current cycle, corresponding to first objects, second objects and data volumes; each piece of behavior information in the behavior information samples therefore includes a first object, a second object and a data volume.
  • the server aggregates the behavior information samples according to different preset types to obtain the first preset object model and the second preset object model.
  • the different preset types include a first preset object type (e.g., a merchant type or a commodity type) and a second preset object type (e.g., a user); the first preset object type is the object type to which the first objects belong, and the second preset object type is the object type to which the second objects belong. Therefore, when the server aggregates the behavior information samples by the first preset object type, it obtains the data volumes corresponding to each first object, that is, the data volume set corresponding to each first object; here, the object type of each first object is the first preset object type, and the data volume set is the set formed by the data volumes of that first object with respect to the second objects.
  • the server determines the data volume distribution information corresponding to each first object according to the data volume set, and thus completes the construction of the first sub-model corresponding to each first object.
  • once the first sub-model corresponding to each first object has been obtained, the first preset object model formed by these first sub-models is also obtained; here, "each first object" refers to any one of the first objects, the first preset object model is the set of the first sub-models of the first objects, and the first target sub-model is one of the first sub-models.
  • when the server aggregates the behavior information samples by the second preset object type, it obtains the first objects corresponding to each second object, that is, the first object set corresponding to each second object; at the same time, it obtains the data volumes corresponding to each second object, and selects the largest of them as the maximum data volume corresponding to that second object.
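  • the two aggregations described above can be sketched in a single pass over the samples. The tuple layout `(first_object, second_object, data_volume)` and all names below are assumptions made for illustration; the source does not prescribe a data format.

```python
from collections import defaultdict


def aggregate_samples(samples):
    """Aggregate behavior information samples by first and second object.

    `samples` is an iterable of (first_object, second_object, data_volume)
    tuples (a hypothetical layout). Returns:
      - the data volume set of each first object,
      - the first object set of each second object,
      - the maximum data volume of each second object.
    """
    volumes_by_first = defaultdict(list)
    firsts_by_second = defaultdict(set)
    max_by_second = {}
    for first, second, volume in samples:
        volumes_by_first[first].append(volume)
        firsts_by_second[second].add(first)
        # Keep the largest data volume seen so far for this second object.
        max_by_second[second] = max(volume, max_by_second.get(second, volume))
    return dict(volumes_by_first), dict(firsts_by_second), max_by_second
```

  • in a payment scenario, for instance, the first object would be a merchant, the second object a user, and the data volume an amount value.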
  • after the server obtains the first object set, it traverses the first objects in the set and matches each traversed first object against the first objects in the first preset object model; the model of the matched first object is the first sub-model corresponding to the traversed first object. The server then uses the first sub-models corresponding to the traversed first objects to construct the at least one second object sub-model corresponding to each second object.
  • the at least one second object sub-model is the set of data volume distributions of the first objects associated with each second object, for example, user C's amount histograms for merchant A, merchant E and merchant F.
  • the traversed first object is any first object in the first object set.
  • after obtaining the at least one second object sub-model corresponding to each second object, the server combines the at least one second object sub-model with the maximum data volume, and the combination is the second sub-model corresponding to that second object; once the second sub-model corresponding to each second object has been obtained, the second preset object model formed by these second sub-models is also obtained.
  • each second object refers to any one of the respective second objects
  • the second preset object model refers to a set composed of second sub-models of each second object.
  • the server constructs a first sub-model corresponding to each first object according to the data volume set, including S3081-S3085, and each step is described below.
  • the server extracts the minimum data volume and the maximum data volume from each data volume in the data volume set, and the range corresponding to the minimum data volume and the maximum data volume is the data volume range.
  • the server segments the data volume range according to a preset segment size or a preset number of segments, thereby obtaining multiple target segments.
  • the server determines the target segment to which each data volume in the data volume set belongs, and then counts the number of data volumes belonging to each of the multiple target segments; the count for each target segment is the target number corresponding to that target segment.
  • the server counts the data volumes in the data volume set, and this count is the number of set elements of the data volume set; then, taking the target number as the numerator and the number of set elements as the denominator, it computes the ratio, and the resulting ratio is the probability value of each target segment; once the probability value of every target segment has been obtained, the multiple probability values corresponding to the multiple target segments are obtained, and the multiple target segments correspond one-to-one with the multiple probability values.
  • the first sub-model is a plurality of probability values corresponding to a plurality of target segments associated with the first object.
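  • steps S3081-S3085 amount to building a normalized histogram. The sketch below assumes equal-width segments and a preset number of segments; the function name, the default of 10 segments and the handling of a zero-width range are assumptions for illustration only.

```python
def build_first_submodel(volumes, num_segments=10):
    """Build a first sub-model as (segment edges, probability values).

    The range [min, max] of the data volume set is cut into `num_segments`
    equal-width target segments; each probability value is the target
    number of a segment divided by the number of set elements.
    """
    lo, hi = min(volumes), max(volumes)
    # Fall back to a unit width when all volumes are identical (assumption).
    width = (hi - lo) / num_segments or 1.0
    counts = [0] * num_segments
    for v in volumes:
        # The maximum value is placed in the last segment.
        idx = min(int((v - lo) / width), num_segments - 1)
        counts[idx] += 1
    total = len(volumes)
    probs = [c / total for c in counts]
    edges = [lo + k * width for k in range(num_segments + 1)]
    return edges, probs
```

  • the probability values of a sub-model built this way sum to 1, and there is exactly one probability value per target segment.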
  • S3081 can be implemented through S30811 and S30812; that is, the server obtains the data volume range corresponding to each data volume in the data volume set, including S30811 and S30812, and each step is described below.
  • the server transforms each data volume in the data volume set to eliminate the smooth part, so that the distribution of the transformed data volumes in the transformed data volume set obeys the standard normal distribution.
  • the transformation may be a logarithmic transformation, a reduction of all data volumes by the same preset multiple, a multiplication of each data volume by a different weight, and so on, which is not specifically limited in this embodiment of the present application. It should also be noted that, when the data volumes are transformed during model acquisition, the data volumes referred to in both the model acquisition process and the model application process are the transformed data volumes.
  • for example, when the data volume is an amount value, the amount value is small for small payments (for example, a few yuan) and large for large payments (for example, hundreds of thousands of yuan), so the data volumes span a wide range; in the corresponding distribution, the horizontal axis is the amount value and the vertical axis is the probability value. Here, the natural logarithm (ln) is used to transform the data volumes.
  • the server extracts the smallest transformed data volume and the largest transformed data volume from the transformed data volumes in the transformed data volume set, and the range between them is the data volume range.
  • correspondingly, when the server counts, from the data volume set, the target number of data volumes belonging to each of the multiple target segments, it counts from the transformed data volume set the target number of transformed data volumes belonging to each target segment. That is, the server determines the target segment to which each transformed data volume in the transformed data volume set belongs, and then counts the number of transformed data volumes in each target segment; this count is the target number corresponding to that target segment.
  • other data volumes, such as the target data volume, are likewise the corresponding transformed data volumes.
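  • the transformation step can be sketched with the natural logarithm mentioned above. The function names are hypothetical, and the sketch assumes every data volume is strictly positive, since the logarithm is undefined otherwise.

```python
import math


def log_transform(volumes):
    """Apply the natural logarithm to each data volume (must be > 0)."""
    return [math.log(v) for v in volumes]


def transformed_volume_range(volumes):
    """Return the data volume range of the transformed data volume set."""
    transformed = log_transform(volumes)
    return min(transformed), max(transformed)
```

  • the same transformation would then be applied to the target data volume at detection time, so that it is compared on the same scale as the model.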
  • S310 can be implemented through S3101-S3105; that is, the server traverses the first object set, and builds at least one second object sub-model based on the first sub-model corresponding to the traversed first object, including Steps S3101-S3105 are described below.
  • the server takes the first sub-model corresponding to the first traversed first object as a current sub-model, and uses it as the only element to construct the 1st current sub-model set.
  • the server compares the first sub-model corresponding to the traversed i-th first object with each current sub-model in the (i-1)-th current sub-model set, and, according to the comparison results, selects from the (i-1)-th current sub-model set the current sub-model most similar to the first sub-model corresponding to the i-th first object, thereby obtaining the similar sub-model.
  • the server compares the similarity between the similar sub-model and the first sub-model corresponding to the i-th first object with the first preset similarity; when this similarity is greater than the first preset similarity, it indicates that the first object corresponding to the first sub-model of the i-th first object (for example, convenience store A) is similar, in terms of data volume, to the first object corresponding to the similar sub-model (for example, convenience store B); in this case, the server merges the first sub-model corresponding to the i-th first object with the similar sub-model, and the merged result is the merged sub-model; the server then replaces the similar sub-model in the (i-1)-th current sub-model set with the merged sub-model, and the (i-1)-th current sub-model set after the replacement is the i-th current sub-model set.
  • when the similarity between the similar sub-model and the first sub-model corresponding to the i-th first object is less than or equal to the first preset similarity, it indicates that the first object corresponding to the first sub-model of the i-th first object (for example, convenience store A) is not similar, in terms of data volume, to the first object corresponding to the similar sub-model (for example, convenience store B); in this case, the server inserts the first sub-model corresponding to the i-th first object into the (i-1)-th current sub-model set, and the (i-1)-th current sub-model set after the insertion is the i-th current sub-model set.
  • the process of S3102 to S3104 updates the current sub-model set for the currently traversed first object; the current sub-model set obtained once the traversal and updating are complete is the at least one second object sub-model constructed.
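  • the traversal in S3101-S3105 can be sketched as a loop that either merges a sub-model into its most similar neighbor or inserts it as a new element. The `similarity` and `merge` callables are left as parameters because this passage does not fix them; the function name and the default threshold of 0.8 are assumptions for the example.

```python
def build_second_object_submodels(first_submodels, similarity, merge, threshold=0.8):
    """Build the current sub-model set for one second object.

    `first_submodels` are the first sub-models of the traversed first
    objects, in traversal order. `similarity(a, b)` returns a number and
    `merge(a, b)` returns a merged sub-model; both are supplied by the caller.
    """
    current = []
    for model in first_submodels:
        if not current:
            # The first traversed sub-model seeds the 1st current set.
            current.append(model)
            continue
        # Find the most similar current sub-model (the similar sub-model).
        scores = [similarity(model, c) for c in current]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > threshold:
            # Similar enough: replace the similar sub-model by the merge.
            current[best] = merge(model, current[best])
        else:
            # Not similar: insert as a new current sub-model.
            current.append(model)
    return current
```

  • the number of sub-models kept per second object therefore depends on how many groups of mutually similar first objects that second object has interacted with.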
  • S3102 further includes S31021-S31024; that is, the server comparing the sub-model to be updated, which corresponds to the current first object in the first preset object model, with each current sub-model in the current sub-model set corresponding to each second object to obtain the similar sub-model includes S31021-S31024, and each step is described below.
  • the first sub-model corresponding to each first object is multiple probability values corresponding to multiple target segments; thus, the first sub-model corresponding to the traversed i-th first object also includes multiple probability values corresponding to the multiple target segments, referred to here as the multiple first target probability values corresponding to the multiple target segments; that is, the multiple first target probability values corresponding to the multiple target segments are the probability values of the sub-model to be updated.
  • S31022 Acquire multiple second target probability values corresponding to the multiple target segments from each current sub-model in the (i-1)-th current sub-model set.
  • each current sub-model in the (i-1)-th current sub-model set also includes multiple probability values corresponding to the multiple target segments, referred to here as the multiple second target probability values corresponding to the multiple target segments; that is, they express the relationship between the multiple target segments of each current sub-model and its probability values, wherein the multiple target segments correspond one-to-one with the multiple second target probability values.
  • the server compares the multiple first target probability values of the first sub-model corresponding to the i-th first object with the multiple second target probability values of each current sub-model, and thereby determines the similarity between the first sub-model corresponding to the traversed i-th first object and each current sub-model; once this similarity has been obtained for every current sub-model, at least one similarity corresponding to the at least one current sub-model in the (i-1)-th current sub-model set is obtained.
  • for each target segment, the smaller of the first target probability value and the second target probability value is the minimum probability value of that segment; the multiple minimum probability values therefore correspond one-to-one with the multiple first target probability values and with the multiple second target probability values, and the at least one current sub-model corresponds one-to-one with the at least one similarity.
  • FIG. 7 is an exemplary schematic diagram of obtaining the similarity provided by an embodiment of the present application; as shown in FIG. 7, in a coordinate system whose abscissa is the logarithmic amount value and whose ordinate is the probability value, the similarity between the first sub-model 7-1 corresponding to the i-th first object and a current sub-model 7-2 is the area of the region 7-3.
  • in the similarity computation, n is the number of the multiple target segments.
  • S31024 Select the highest similarity from the determined at least one similarity corresponding to the (i-1)-th current sub-model set, and determine the current sub-model corresponding to the highest similarity in the (i-1)-th current sub-model set as the similar sub-model.
  • the server obtains the highest similarity from the at least one similarity, and takes the current sub-model in the (i-1)-th current sub-model set that corresponds to the highest similarity as the similar sub-model.
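  • steps S31021-S31024 can be sketched as follows. The sketch assumes the similarity is the sum, over the n target segments, of the per-segment minimum probability values, which is consistent with the overlap area shown in FIG. 7; the exact formula is not reproduced in this passage, so this is an assumption, as are the function names and the requirement that both sub-models share the same target segments.

```python
def overlap_similarity(first_probs, second_probs):
    """Sum of per-segment minimum probability values (overlap area).

    Both sequences must describe the same target segments, one
    probability value per segment.
    """
    if len(first_probs) != len(second_probs):
        raise ValueError("sub-models must cover the same target segments")
    return sum(min(p, q) for p, q in zip(first_probs, second_probs))


def most_similar(candidate, current_submodels):
    """Return (index, similarity) of the current sub-model closest to candidate."""
    scores = [overlap_similarity(candidate, m) for m in current_submodels]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

  • with this definition, two identical sub-models whose probability values sum to 1 have similarity 1, and two sub-models with no common non-empty segment have similarity 0.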
  • the server merging the first sub-model corresponding to the i-th first object with the similar sub-model to obtain the merged sub-model includes S31031-S31034; each step is described below.
  • like every current sub-model, the similar sub-model includes multiple probability values corresponding to the multiple target segments, referred to here as the multiple probability values to be merged; they express the relationship between the multiple target segments of the similar sub-model and its probability values, wherein the multiple target segments correspond one-to-one with the multiple probability values to be merged.
  • the multiple maximum probability values correspond one-to-one with the multiple first target probability values, and the multiple maximum probability values correspond one-to-one with the multiple probability values to be merged.
  • FIG. 8 is an exemplary schematic diagram of obtaining the merged sub-model provided by an embodiment of the present application; as shown in FIG. 8, in a coordinate system whose abscissa is the logarithmic amount value and whose ordinate is the probability value, the merged sub-model 8-3 obtained from the first sub-model 8-1 corresponding to the i-th first object and the similar sub-model 8-2 is shown.
  • C is used to represent the merged sub-model 8-3.
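  • the merge can be sketched as taking, for each target segment, the larger of the two probability values, which matches the "maximum probability values" mentioned above. The exact merge formula is not reproduced in this passage, so the per-segment maximum is an assumption, as is the function name.

```python
def merge_submodels(first_probs, probs_to_merge):
    """Merge two sub-models by keeping the per-segment maximum probability.

    Both inputs must cover the same target segments. The result is an
    envelope of the two histograms, so its values may sum to more than 1.
    """
    if len(first_probs) != len(probs_to_merge):
        raise ValueError("sub-models must cover the same target segments")
    return [max(p, q) for p, q in zip(first_probs, probs_to_merge)]
```

  • because the merged sub-model is an envelope rather than a renormalized distribution, any later similarity computed against it can only stay the same or increase relative to either input.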
  • S304 may be implemented through S3041-S3044; that is, the server obtaining, from the second preset object model, the second target sub-model that corresponds to the second target object and has the highest similarity with the first target sub-model includes S3041-S3044, and each step is described below.
  • the server matches the second target object against each second object in the second preset object model, and the at least one second object sub-model corresponding to the matched second object is the at least one target second object sub-model.
  • S3042 Acquire at least one target similarity between the first target sub-model and the at least one target second object sub-model, respectively.
  • the server compares the first target sub-model with each target second object sub-model to obtain the target similarity between them, and thereby obtains at least one target similarity corresponding to the at least one target second object sub-model; wherein the at least one target second object sub-model corresponds one-to-one with the at least one target similarity.
  • the highest target similarity is the largest of the at least one target similarity corresponding to the at least one target second object sub-model.
  • From the at least one target second object sub-model, determine the target second object sub-model corresponding to the highest target similarity as the second target sub-model.
  • the server selects, from the at least one target second object sub-model, the target second object sub-model corresponding to the highest target similarity, and determines the selected target second object sub-model as the second target sub-model; wherein the second target sub-model has the highest similarity with the first target sub-model.
  • the server obtaining the target maximum data volume corresponding to the second target sub-model includes: when the highest target similarity is greater than the second preset similarity, the server determines the maximum data volume corresponding to the second target object in the second preset object model as the target maximum data volume corresponding to the second target sub-model; when the highest target similarity is less than or equal to the second preset similarity, the server determines the preset data volume as the target maximum data volume corresponding to the second target sub-model.
  • the preset data amount is, for example, 0, or any other value; in addition, the first preset similarity and the second preset similarity may be the same or different, which is not specifically limited in this embodiment of the present application.
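  • the selection of the second target sub-model and of the target maximum data volume can be sketched together. The overlap-based similarity, the function name, the default second preset similarity of 0.8 and the behavior for an empty candidate list are assumptions for the example; the preset data volume of 0 follows the example value given above.

```python
def select_target_maximum(first_target, candidates, second_object_max,
                          second_preset_similarity=0.8, preset_volume=0.0):
    """Pick the second target sub-model and the target maximum data volume.

    `candidates` are the target second object sub-models of the matched
    second object; `second_object_max` is that object's maximum data volume.
    Returns (index of second target sub-model, highest similarity, limit).
    """
    if not candidates:
        # No history for this second object: fall back to the preset volume.
        return None, 0.0, preset_volume
    scores = [sum(min(p, q) for p, q in zip(first_target, c)) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    highest = scores[best]
    if highest > second_preset_similarity:
        limit = second_object_max
    else:
        limit = preset_volume
    return best, highest, limit
```

  • with a preset data volume of 0, any positive target data volume is flagged on the second-object side when no sufficiently similar sub-model exists.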
  • FIG. 9 is a schematic diagram of an exemplary abnormal behavior detection process provided by the embodiment of the present application; as shown in FIG. 9, in the payment scenario, after the user submits the payment order 9-1 (behavior information to be detected), and before the payment is completed according to the payment order, the server pulls the payment order and obtains the order triple 9-2 from it: user 9-21 (second target object), merchant 9-22 (first target object) and amount value 9-23 (target data volume).
  • the server obtains the merchant histogram 9-311 (first target sub-model) corresponding to the merchant 9-22 from the merchant model 9-31 (first preset object model), and determines the normal transaction threshold t_u (abnormal data volume) of the merchant 9-22 according to the 99th percentile (preset model parameter) of the merchant histogram 9-311; the process of determining the normal transaction threshold t_u is shown in FIG. 4. If an amount value is greater than the normal transaction threshold t_u, that is, the corresponding payment order exceeds 99% of normal cases, the payment order corresponding to that amount value is determined to be an abnormal transaction on the merchant side.
  • when the amount value 9-23 is greater than the normal transaction threshold t_u, it indicates that the payment order 9-1 exceeds 99% of normal cases, and thus the result 9-41 (first detection result) that the payment order 9-1 is an abnormal transaction on the merchant side is determined.
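  • one way a percentile threshold such as t_u could be read off a histogram is to accumulate the probability values until the chosen quantile is reached. The procedure of FIG. 4 is not reproduced in this passage, so the approach below, the function name and the choice of the segment's upper edge as the threshold are all assumptions.

```python
def percentile_threshold(edges, probs, quantile=0.99):
    """Return the upper edge of the segment where the cumulative
    probability first reaches `quantile`.

    `edges` has one more element than `probs`; segment k spans
    edges[k] to edges[k + 1].
    """
    cumulative = 0.0
    for k, p in enumerate(probs):
        cumulative += p
        if cumulative >= quantile:
            return edges[k + 1]
    # Rounding may leave the total slightly below the quantile.
    return edges[-1]
```

  • if the histogram was built on log-transformed amounts, the threshold is on the log scale as well, and the amount value being checked has to be transformed the same way before comparison.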
  • the server obtains the merchant histogram group 9-321 (at least one target second object sub-model) corresponding to the user 9-21 from the user model 9-32 (second preset object model), and searches the merchant histogram group 9-321 for the merchant histogram 9-322 (second target sub-model) most similar to the merchant histogram 9-311.
  • the server then obtains the normal transaction threshold t_v (target maximum data volume); if an amount value is less than or equal to the normal transaction threshold t_v, the corresponding payment order is a normal transaction on the user side.
  • when the amount value 9-23 is greater than the normal transaction threshold t_v, it indicates that the amount value 9-23 has not appeared in the historical payment orders, and thus the result 9-42 (second detection result) that the payment order 9-1 is an abnormal transaction on the user side is determined.
  • the target detection result is then determined from the merchant-side result and the user-side result. In one decision matrix: when both the merchant side and the user side indicate that the payment order is an abnormal payment, the payment order is determined to be an abnormal payment; when the merchant side indicates that the payment order is an abnormal payment and the user side indicates that it is a normal payment, the payment order is determined to be a normal payment; when the merchant side indicates that the payment order is a normal payment and the user side indicates that it is an abnormal payment, the payment order is determined to be a suspected abnormal payment, and its abnormality is decided according to the actual situation, that is, the payment order may be considered normal or abnormal depending on the actual situation; when the merchant side and the user side both indicate that the payment order is a normal payment, the payment order is determined to be a normal payment.
  • in another decision matrix: when both the merchant side and the user side indicate that the payment order is an abnormal payment, the payment order is determined to be an abnormal payment; when the merchant side indicates that the payment order is an abnormal payment and the user side indicates that it is a normal payment, or the merchant side indicates a normal payment and the user side indicates an abnormal payment, the payment order is determined to be a suspected abnormal payment, and its abnormality is decided according to the actual situation, that is, the payment order may be considered normal or abnormal depending on the actual situation; when the merchant side and the user side both indicate that the payment order is a normal payment, the payment order is determined to be a normal payment.
  • when the server determines the target detection result according to the first detection result and the second detection result, other decision matrices may also be used, which are not listed one by one in the embodiments of the present application.
  • the above process of determining the abnormality of a payment order is performed in real time, in a real-time/online environment; it can also be applied to scenarios such as credit card anti-fraud and anti-counterfeiting.
  • FIG. 10 is an exemplary schematic diagram of model acquisition provided by an embodiment of the present application; as shown in FIG. 10, the server acquires the recent (current preset cycle) historical payment orders 10-1 (behavior information samples), wherein each payment order in the historical payment orders 10-1 includes a user (a second object), a merchant (a first object) and an amount (data volume).
  • the historical payment orders 10-1 are aggregated by merchant (the first preset object type) to obtain the recent transaction amounts 10-2 (data volume set) of all users at each merchant; after the transaction amounts 10-2 are transformed, the amount distribution histogram 10-31 (the first sub-model) is obtained by counting per segment. The histogram 11-1 shown in FIG. 11 is the amount distribution histogram 10-31 of FIG. 10; its abscissa is the logarithm of the amount and its ordinate is the probability. Combining the amount distribution histograms 10-31 of all merchants gives the merchant model 10-3.
  • the historical payment orders 10-1 are aggregated by user (the second preset object type) to obtain the merchants 10-41 (the first object set) that each user recently paid, together with the amounts 10-42 that the user recently paid at each of those merchants 10-41.
  • the maximum payment amount 10-421 (maximum data volume) is selected from each user's recent payment amounts 10-42 at the merchants and used as the historical maximum transaction amount.
  • the merchants 10-41 recently paid by each user are traversed, and for each merchant 10-411 (the i-th first object) the corresponding merchant amount distribution histogram 10-32 is obtained from the merchant model 10-3.
  • if the merchant 10-411 is the first merchant among the traversed merchants 10-41, the merchant amount distribution histogram 10-32 is inserted directly into the histogram group 10-51 (current sub-model set);
  • if the merchant 10-411 is not the first merchant among the traversed merchants 10-41, the merchant amount distribution histogram 10-32 is compared with each histogram in the histogram group 10-51 to find the histogram 10-511 (similar sub-model) with the highest similarity in the histogram group 10-51; wherein, the comparison process is shown in FIG.
  • when the similarity between the histogram 10-511 and the merchant amount distribution histogram 10-32 is greater than 0.8 (the first preset similarity), the merchant amount distribution histogram 10-32 is merged into the histogram 10-511; the merging process is shown in FIG. 8.
  • when the similarity between the histogram 10-511 and the merchant amount distribution histogram 10-32 is less than or equal to 0.8, the merchant amount distribution histogram 10-32 is inserted into the histogram group 10-51. When the traversal of the merchants 10-41 is complete (the histogram group 10-51 is then the at least one second object sub-model), the histogram group 10-51 of each user is combined with the maximum payment amount 10-421 to obtain the second sub-model, and thus the user model 10-5 (the second preset object model).
  • the above process of acquiring the first preset object model and the second preset object model may be performed in an offline environment.
  • the abnormal behavior detection method provided by the embodiment of the present application is unsupervised and needs no labeled data, which makes detection practical in the payment scenario of the instant messaging client. Moreover, acquiring the merchant model and the user model requires only the merchant, the user, and the amount, so abnormal behavior in the payment scenario of the instant messaging client can be detected accurately with only three feature dimensions. In addition, because acquiring the two models requires no data annotation, acquisition is efficient, so the acquired merchant model and user model stay up to date and can be applied to the real-time payment environment of the instant messaging client.
  • the software modules of the abnormal behavior detection apparatus 455 stored in the memory 450 may include:
  • the information acquisition module 4551 is configured to acquire behavior information to be detected, and the behavior information to be detected includes the first target object, the second target object and the target data volume;
  • the first detection module 4552 is configured to obtain the first target sub-model corresponding to the first target object from the first preset object model;
  • the first detection module 4552 is further configured to determine the abnormal data volume from the first target sub-model based on preset model parameters, and to determine the first detection result corresponding to the behavior information to be detected based on the comparison result between the target data volume and the abnormal data volume;
  • the second detection module 4553 is configured to obtain the second target sub-model corresponding to the second target object and having the highest similarity with the first target sub-model from the second preset object model;
  • the second detection module 4553 is further configured to obtain the target maximum data volume corresponding to the second target sub-model, and to determine the second detection result corresponding to the behavior information to be detected based on the comparison result between the target data volume and the target maximum data volume;
  • the result determination module 4554 is configured to combine the first detection result and the second detection result to determine the target detection result of the behavior information to be detected.
  • the abnormal behavior detection device 455 further includes a model acquisition module 4555, which is configured to: acquire behavior information samples; aggregate the behavior information samples according to the first preset object type to obtain the data volume set corresponding to each first object; construct, from the data volume set, a first sub-model corresponding to each first object; and determine the first sub-models constructed for the first objects as the first preset object model;
  • the model acquisition module 4555 is further configured to: aggregate the behavior information samples according to the second preset object type to obtain the first object set and the maximum data volume corresponding to each second object; traverse the first object set and build at least one second object sub-model based on the first sub-model corresponding to each traversed first object; combine the at least one second object sub-model and the maximum data volume into a second sub-model corresponding to each second object; and determine the second sub-models corresponding to the second objects as the second preset object model.
  • the model obtaining module 4555 is further configured to: obtain the data volume range spanned by the data volumes in the data volume set; segment the data volume range to obtain multiple target segments; count, from the data volume set, the target number of data volumes belonging to each of the target segments; determine the ratio of the target number to the number of elements in the data volume set as the probability value corresponding to each target segment; and determine the probability values of the multiple target segments as the first sub-model corresponding to each first object.
  • the model obtaining module 4555 is further configured to transform each data volume in the data volume set to obtain a transformed data volume set, and to determine the range spanned by the transformed data volumes in the transformed data volume set as the data volume range.
  • the model obtaining module 4555 is further configured to count, from the transformed data volume set, the target number of transformed data volumes belonging to each of the multiple target segments.
  • the model obtaining module 4555 is further configured to traverse the first object set, determine the first sub-model corresponding to the first traversed first object as the current sub-model, and construct the 1st current sub-model set including that current sub-model; and to perform the following processing by iterating over i: compare the first sub-model corresponding to the traversed i-th first object with each current sub-model in the (i-1)-th current sub-model set to obtain a similar sub-model, where 2 ≤ i ≤ I, i is an increasing positive integer variable, and I is the number of first objects in the first object set; when the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is greater than the first preset similarity, merge the two to obtain a merged sub-model, and replace the similar sub-model in the (i-1)-th current sub-model set with the merged sub-model to obtain the i-th current sub-model set; when that similarity is less than or equal to the first preset similarity, insert the first sub-model corresponding to the i-th first object into the (i-1)-th current sub-model set to obtain the i-th current sub-model set; and determine the current sub-model set obtained when the traversal is complete as the at least one second object sub-model.
  • the model obtaining module 4555 is further configured to: obtain, from the first sub-model corresponding to the traversed i-th first object, multiple first target probability values corresponding to the multiple target segments; obtain, from each current sub-model in the (i-1)-th current sub-model set, multiple second target probability values corresponding to the target segments; compare the first target probability values and the second target probability values segment by segment to obtain multiple minimum probability values, and determine the sum of those minimum probability values as the similarity between the first sub-model corresponding to the i-th first object and that current sub-model; and select the highest of the similarities determined for the (i-1)-th current sub-model set, and determine the current sub-model with that highest similarity as the similar sub-model.
  • the model obtaining module 4555 is further configured to: obtain, from the first sub-model corresponding to the i-th first object, multiple first target probability values corresponding to the multiple target segments; obtain, from the similar sub-model, multiple to-be-merged probability values corresponding to the target segments; compare the first target probability values and the to-be-merged probability values segment by segment to obtain multiple maximum probability values; and combine the target segments with the maximum probability values to obtain the merged sub-model.
  • the second detection module 4553 is further configured to: obtain at least one target second object sub-model corresponding to the second target object from the second preset object model; determine the target similarity between the first target sub-model and each of the target second object sub-models; obtain the highest target similarity among these target similarities; and determine the target second object sub-model with the highest target similarity as the second target sub-model.
  • the second detection module 4553 is further configured to: when the highest target similarity is greater than the second preset similarity, determine the maximum data volume corresponding to the second target object in the second preset object model as the target maximum data volume corresponding to the second target sub-model; and when the highest target similarity is less than or equal to the second preset similarity, determine a preset data volume as the target maximum data volume corresponding to the second target sub-model.
  • the first detection module 4552 is further configured to: when the target data volume is greater than the abnormal data volume, determine the first detection result to be that the behavior information to be detected is abnormal with respect to the first target object; and when the target data volume is less than or equal to the abnormal data volume, determine the first detection result to be that the behavior information to be detected is normal with respect to the first target object.
  • the second detection module 4553 is further configured to: when the target data volume is greater than the target maximum data volume, determine the second detection result to be that the behavior information to be detected is abnormal with respect to the second target object; and when the target data volume is less than or equal to the target maximum data volume, determine the second detection result to be that the behavior information to be detected is normal with respect to the second target object.
  • the result determination module 4554 is further configured to: when the first detection result is abnormal with respect to the first target object and the second detection result is abnormal with respect to the second target object, determine the target detection result to be that the behavior information to be detected is abnormal; when the first detection result is normal with respect to the first target object and the second detection result is abnormal with respect to the second target object, determine the target detection result to be that the behavior information to be detected is abnormal; when the first detection result is normal with respect to the first target object and the second detection result is normal with respect to the second target object, determine the target detection result to be that the behavior information to be detected is normal; and when the first detection result is abnormal with respect to the first target object and the second detection result is normal with respect to the second target object, determine the target detection result to be that the behavior information to be detected is normal.
  • Embodiments of the present application provide a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium.
  • the processor of the abnormal behavior detection device reads the computer instructions from the computer-readable storage medium and executes them, so that the device performs the abnormal behavior detection method described above in the embodiments of the present application.
  • the embodiments of the present application provide a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to perform the abnormal behavior detection method provided by the embodiments of the present application, for example, the abnormal behavior detection method shown in FIG. 3.
  • the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it may also be any device that includes one or any combination of the foregoing memories.
  • executable instructions may take the form of programs, software, software modules, scripts, or code, written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • executable instructions may, but need not, correspond to files in a file system. They may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (e.g., files that store one or more modules, subroutines, or code sections).
  • executable instructions may be deployed for execution on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
  • the target data volume is compared with the abnormal data volume and with the target maximum data volume, respectively. The abnormal data volume is an abnormality criterion for the first target object determined from the first preset object model, and the target maximum data volume is an abnormality criterion for the second target object determined from the second preset object model. Therefore, even with low-dimensional features, whether the target data volume falls within the preset interval can be judged from two dimensions, the first target object and the second target object, so that the target detection result of whether the behavior information to be detected is abnormal is obtained accurately; thus, the accuracy of abnormal behavior detection is high.
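The compare-then-merge-or-insert construction of the per-user histogram group described in the list above can be sketched as follows. This is a minimal Python sketch, not the applicant's implementation: the names `similarity`, `merge`, and `build_group` are hypothetical, and it assumes every histogram is a list of probabilities over the same target segments. The default threshold 0.8 is the example value the text gives for the first preset similarity.

```python
from typing import List

Histogram = List[float]  # assumption: one probability per shared target segment


def similarity(a: Histogram, b: Histogram) -> float:
    """Sum of the per-segment minimum probabilities of two histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def merge(a: Histogram, b: Histogram) -> Histogram:
    """Per-segment maximum probability, as described for the merged sub-model."""
    return [max(x, y) for x, y in zip(a, b)]


def build_group(histograms: List[Histogram], threshold: float = 0.8) -> List[Histogram]:
    """Traverse the histograms once, merging each into its most similar
    group member when the similarity exceeds the threshold, else inserting it."""
    group: List[Histogram] = []
    for h in histograms:
        if not group:
            group.append(list(h))  # the first traversed histogram seeds the group
            continue
        best = max(range(len(group)), key=lambda k: similarity(h, group[k]))
        if similarity(h, group[best]) > threshold:
            group[best] = merge(h, group[best])
        else:
            group.append(list(h))
    return group
```

Note that a histogram merged by per-segment maximum generally no longer sums to 1; the sketch keeps it that way because the text describes the merge as taking the maximum probability values.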

Abstract

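The present application provides an abnormal behavior detection method and apparatus, an electronic device, and a computer-readable storage medium. The method includes: obtaining, from a first preset object model, a first target sub-model corresponding to a first target object; determining an abnormal data volume from the first target sub-model based on a preset model parameter, and determining a first detection result corresponding to behavior information to be detected based on a comparison between a target data volume and the abnormal data volume; obtaining, from a second preset object model, a second target sub-model that corresponds to a second target object and has the highest similarity to the first target sub-model; obtaining a target maximum data volume corresponding to the second target sub-model, and determining a second detection result corresponding to the behavior information to be detected based on a comparison between the target data volume and the target maximum data volume; and combining the first detection result and the second detection result to determine a target detection result of the behavior information to be detected.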

Description

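Abnormal behavior detection method and apparatus, electronic device, and computer-readable storage medium
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 202010840924.8 filed on August 20, 2020, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to information processing technology in the field of computer applications, and in particular to an abnormal behavior detection method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the rapid development of computer application technology, network functions are applied ever more widely. In the use of these functions, however, malicious operations carried out by abnormal means are common, such as inflating traffic with fake activity or paying from a stolen account. Abnormal behavior detection is therefore increasingly important for improving network security.
Abnormal behavior detection is generally performed in an unsupervised manner. For example, historical behavior information is clustered into multiple clusters; once behavior information to be detected is obtained, whether it is abnormal is determined from its membership in those clusters. In this process, however, the behavior information to be detected has few feature dimensions, and a detection result obtained by clustering on low-dimensional features is more likely to contain errors, so the accuracy of abnormal behavior detection is low.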
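Summary
Embodiments of the present application provide an abnormal behavior detection method and apparatus, an electronic device, and a computer-readable storage medium, which can improve the accuracy of abnormal behavior detection.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides an abnormal behavior detection method, performed by an electronic device, including:
obtaining behavior information to be detected, the behavior information to be detected including a first target object, a second target object, and a target data volume;
obtaining, from a first preset object model, a first target sub-model corresponding to the first target object;
determining an abnormal data volume from the first target sub-model based on a preset model parameter, and determining a first detection result corresponding to the behavior information to be detected based on a comparison between the target data volume and the abnormal data volume;
obtaining, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model;
obtaining a target maximum data volume corresponding to the second target sub-model, and determining a second detection result corresponding to the behavior information to be detected based on a comparison between the target data volume and the target maximum data volume;
combining the first detection result and the second detection result to determine a target detection result of the behavior information to be detected.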
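An embodiment of the present application provides an abnormal behavior detection apparatus, including:
an information acquisition module, configured to obtain behavior information to be detected, the behavior information to be detected including a first target object, a second target object, and a target data volume;
a first detection module, configured to obtain, from a first preset object model, a first target sub-model corresponding to the first target object;
the first detection module being further configured to determine an abnormal data volume from the first target sub-model based on a preset model parameter, and to determine a first detection result corresponding to the behavior information to be detected based on a comparison between the target data volume and the abnormal data volume;
a second detection module, configured to obtain, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model;
the second detection module being further configured to obtain a target maximum data volume corresponding to the second target sub-model, and to determine a second detection result corresponding to the behavior information to be detected based on a comparison between the target data volume and the target maximum data volume;
a result determination module, configured to combine the first detection result and the second detection result to determine a target detection result of the behavior information to be detected.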
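An embodiment of the present application provides an electronic device for abnormal behavior detection, including:
a memory, configured to store executable instructions;
a processor, configured to implement the abnormal behavior detection method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the abnormal behavior detection method provided by the embodiments of the present application.
The embodiments of the present application have at least the following beneficial effects. When the behavior information to be detected has three feature dimensions (the first target object, the second target object, and the target data volume), the target detection result is determined by combining the results of comparing the target data volume with the abnormal data volume and with the target maximum data volume. The abnormal data volume is an abnormality criterion for the first target object determined from the first preset object model, and the target maximum data volume is an abnormality criterion for the second target object determined from the second preset object model. Therefore, even with low-dimensional features, whether the target data volume falls within the preset interval can be judged from two dimensions, the first target object and the second target object, so that the target detection result of whether the behavior information to be detected is abnormal is obtained accurately, which improves the accuracy of abnormal behavior detection.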
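Brief description of the drawings
FIG. 1 is an optional architecture diagram of an abnormal behavior detection system provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the server in FIG. 1 provided by an embodiment of the present application;
FIG. 3 is an optional flowchart of the abnormal behavior detection method provided by an embodiment of the present application;
FIG. 4 is an exemplary schematic diagram of determining the abnormal data volume provided by an embodiment of the present application;
FIG. 5 is another optional flowchart of the abnormal behavior detection method provided by an embodiment of the present application;
FIG. 6a is an exemplary schematic diagram of data volumes to be transformed provided by an embodiment of the present application;
FIG. 6b is an exemplary schematic diagram of data volume transformation provided by an embodiment of the present application;
FIG. 7 is an exemplary schematic diagram of obtaining a similarity provided by an embodiment of the present application;
FIG. 8 is an exemplary schematic diagram of obtaining a merged sub-model provided by an embodiment of the present application;
FIG. 9 is an exemplary flowchart of abnormal behavior detection provided by an embodiment of the present application;
FIG. 10 is an exemplary schematic diagram of model acquisition provided by an embodiment of the present application;
FIG. 11 is an exemplary schematic diagram of a model provided by an embodiment of the present application.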
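Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the application; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the application.
In the following description, "some embodiments" describes a subset of all possible embodiments. It is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they may be combined with one another where no conflict arises.
In the following description, the terms "first" and "second" merely distinguish similar objects and do not denote a particular ordering of the objects. Where permitted, the specific order or sequence of "first" and "second" may be interchanged, so that the embodiments described here can be implemented in an order other than that illustrated or described.
Unless otherwise defined, all technical and scientific terms used in the embodiments of the present application have the same meaning as commonly understood by a person skilled in the technical field to which the application belongs. The terms used in the embodiments are for the purpose of describing the embodiments only and are not intended to limit the application.
Before the embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments are explained; the following explanations apply to them.
1) Abnormal behavior detection: detecting whether the data corresponding to a user's operations conforms to a preset operation flow or to what actually took place; for example, detecting payments made through a stolen account, or detecting traffic inflated by cheating.
2) Offline environment: a platform that processes large volumes of data (for example, hundreds of millions of records) with data mining tools (for example, "hadoop" or "spark"); it usually has a large latency (for example, one day) and low timeliness.
3) Real-time/online environment: a platform for storing and computing data to be processed efficiently and in real time; its latency is usually at the millisecond level, its complexity is low, and its timeliness is high.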
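Generally, abnormal behavior detection is implemented in either an unsupervised or a supervised manner. Unsupervised detection applies an unsupervised algorithm to abnormal behavior detection. For example, if the behavior information in the application scenario follows a Gaussian mixture distribution (Mixture Gaussian Distribution), whether behavior information to be detected is abnormal can be determined by judging whether it follows the Gaussian distribution. As another example, historical behavior information is clustered into multiple clusters, and once behavior information to be detected is obtained, whether it is abnormal is determined from its membership in those clusters. As a further example, an outlier detection algorithm (for example, Isolation Forest) is used, and the behavior information corresponding to isolated points in the space is determined to be abnormal. However, when the behavior information to be detected has only three feature dimensions, namely a first target object, a second target object, and a target data volume (for example: user, merchant, and amount; user, product, and amount; user, article, and read count), the feature dimensionality is low, and a result obtained by unsupervised detection on low-dimensional features is more likely to contain errors, so the accuracy of abnormal behavior detection is low. Conversely, when features of more than three dimensions are obtained for unsupervised detection, the higher dimensionality and complexity lengthen the detection time, so detection is less timely.
Supervised detection labels samples, trains a network model on the sample features and labels, and then uses the network model to detect whether behavior information to be detected is abnormal. This requires labeled samples, and when the sample volume is large, for example hundreds of millions of records, labeling is hardly practicable. For example, when detecting whether payments are abnormal, the number of payments made each day reaches hundreds of millions, so manual labeling is impractical. A long labeling time also lengthens the training of the network model; if the behavior information in the application scenario changes quickly and detection is time-sensitive, the trained network model may no longer suit the current scenario by the time training is complete. Supervised detection is therefore hard to carry out and cannot be applied in highly time-sensitive application scenarios.
On this basis, embodiments of the present application provide an abnormal behavior detection method and apparatus, an electronic device, and a computer-readable storage medium that can detect abnormal behavior quickly and accurately and can be applied in highly time-sensitive application scenarios.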
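The following describes exemplary applications of the electronic device for abnormal behavior detection provided by the embodiments of the present application (hereinafter, the abnormal behavior detection device). The abnormal behavior detection device may be implemented as various types of terminals, such as a notebook computer, tablet computer, desktop computer, set-top box, smart TV, smart speaker, or mobile device (for example, a mobile phone, portable music player, personal digital assistant, dedicated messaging device, portable game device, in-vehicle device, smartphone, or smart watch), or as a server. An exemplary application in which the device is implemented as a server is described below.
Referring to FIG. 1, FIG. 1 is an optional architecture diagram of the abnormal behavior detection system provided by an embodiment of the present application. As shown in FIG. 1, to support an abnormal behavior detection application, in the abnormal behavior detection system 100 the terminals 200 (terminal 200-1 and terminal 200-2 are shown as examples) are connected to the server 400 (the abnormal behavior detection device) through the network 300; the network 300 may be a wide area network, a local area network, or a combination of the two. The abnormal behavior detection system 100 also includes a database 500.
The database 500 is configured to store the first preset object model and the second preset object model and to provide them to the server 400 for abnormal behavior detection.
The terminal 200-1 is configured to receive a user's payment operation through the control 200-111 (a payment button is shown as an example) on the graphical interface 200-11 and, in response to the payment operation, to send to the server 400 through the network 300 behavior information to be detected that includes a merchant (the first target object), a user (the second target object), and an amount (the target data volume). It is further configured to receive the target detection result sent by the server 400 through the network 300 and to display it on the graphical interface 200-12.
The terminal 200-2 is configured to receive a user's reading operation through the control 200-211 (a read button is shown as an example) on the graphical interface 200-21 and, in response to the reading operation, to send to the server 400 through the network 300 behavior information to be detected that includes an article (the first target object), a user (the second target object), and a read count (the target data volume). It is further configured to receive the target detection result sent by the server 400 through the network 300 and to display it on the graphical interface 200-22.
The server 400 is configured to: obtain the behavior information to be detected from the terminal 200 through the network 300, the behavior information to be detected including the first target object, the second target object, and the target data volume; obtain the first target sub-model corresponding to the first target object from the first preset object model provided by the database 500; determine the abnormal data volume from the first target sub-model based on the preset model parameter, compare the target data volume with the abnormal data volume, and determine the first detection result corresponding to the behavior information to be detected; obtain, from the second preset object model provided by the database 500, the second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model; obtain the target maximum data volume corresponding to the second target sub-model, compare the target data volume with the target maximum data volume, and determine the second detection result corresponding to the behavior information to be detected; and combine the first detection result and the second detection result to determine the target detection result of the behavior information to be detected. It is further configured to send the target detection result to the terminal 200 through the network 300.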
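In some embodiments, the server 400 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN, Content Delivery Network), and big data and artificial intelligence platforms. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of the present invention.
Referring to FIG. 2, FIG. 2 is a schematic structural diagram of the server in FIG. 1 provided by an embodiment of the present application. The server 400 shown in FIG. 2 includes at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430. The components of the server 400 are coupled together through a bus system 440. It is understood that the bus system 440 implements the connections and communication between these components. In addition to a data bus, the bus system 440 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 440 in FIG. 2.
The processor 410 may be an integrated circuit chip with signal processing capability, for example a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; the general-purpose processor may be a microprocessor or any conventional processor.
The user interface 430 includes one or more output devices 431 that enable the presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, and other input buttons and controls.
The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard disk drives, and optical disc drives. The memory 450 optionally includes one or more storage devices physically remote from the processor 410.
The memory 450 includes volatile memory or non-volatile memory, and may include both. The non-volatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory). The memory 450 described in the embodiments of the present application includes any suitable type of memory.
In some embodiments, the memory 450 can store data to support various operations. Examples of such data include programs, modules, and data structures, or subsets or supersets thereof, as illustrated below.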
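an operating system 451, including system programs for handling basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer, used to implement basic services and handle hardware-based tasks;
a network communication module 452, configured to reach other computing devices via one or more (wired or wireless) network interfaces 420; exemplary network interfaces 420 include Bluetooth, Wireless Fidelity (Wi-Fi), and Universal Serial Bus (USB);
a presentation module 453, configured to enable the presentation of information (for example, a user interface for operating peripheral devices and displaying content and information) via one or more output devices 431 (for example, display screens or speakers) associated with the user interface 430;
an input processing module 454, configured to detect one or more user inputs or interactions from one of the one or more input devices 432 and to translate the detected inputs or interactions.
In some embodiments, the abnormal behavior detection apparatus provided by the embodiments of the present application may be implemented in software. FIG. 2 shows the abnormal behavior detection apparatus 455 stored in the memory 450, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: the information acquisition module 4551, the first detection module 4552, the second detection module 4553, the result determination module 4554, and the model acquisition module 4555. These modules are logical, so they may be combined arbitrarily or split further according to the functions implemented. The function of each module is described below.
In other embodiments, the abnormal behavior detection apparatus provided by the embodiments of the present application may be implemented in hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to perform the abnormal behavior detection method provided by the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.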
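The abnormal behavior detection method provided by the embodiments of the present application is described below in connection with the exemplary application in which the abnormal behavior detection device is implemented as a server.
Referring to FIG. 3, FIG. 3 is an optional flowchart of the abnormal behavior detection method provided by an embodiment of the present application; the method is described with reference to the steps shown in FIG. 3.
S301. Obtain behavior information to be detected, the behavior information to be detected including a first target object, a second target object, and a target data volume.
In the embodiments of the present application, when a user performs an operation on a functional application at a terminal, such as making a payment, reading an article, or clicking an advertisement, the terminal generates operation data in response to the operation and sends it to the server. The server receives this operation data as the behavior information to be detected, which completes the acquisition of the behavior information to be detected.
It should be noted that the behavior information to be detected is the object of detection and includes the first target object, the second target object, and the target data volume. The first target object is the object operated on, such as an advertisement, merchant, product, or article; the second target object is the operating object, such as a user; and the target data volume is the data volume produced when the second target object operates on the first target object, such as an amount, a click count, or a read count.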
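S302. Obtain, from the first preset object model, the first target sub-model corresponding to the first target object.
In the embodiments of the present application, the first preset object model is stored on the server in advance, or the server can obtain it in advance. The first preset object model consists of the models corresponding to the individual first objects, each model being data volume distribution information. Because the first target object is one of the first objects, the server can obtain the model corresponding to the first target object from the first preset object model, which gives the first target sub-model.
In other words, the first preset object model is a correspondence between first objects and data volume distribution information. Because the first target object is one of the first objects, the server matches the first target object against the first objects in this correspondence and determines the data volume distribution information of the matched first object as the first target sub-model.
It should be noted that the first target sub-model is the data volume distribution information corresponding to the first target object, for example the histogram of amounts for merchant A, or the distribution of read counts for article B.
In addition, in the embodiments of the present application, when the server finds no model corresponding to the first target object in the first preset object model, it constructs a first target sub-model for the first target object and adds it to the first preset object model.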
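S303. Determine the abnormal data volume from the first target sub-model based on the preset model parameter, and determine the first detection result corresponding to the behavior information to be detected based on the comparison between the target data volume and the abnormal data volume.
In the embodiments of the present application, the preset model parameter is stored on the server, or the server can obtain it in advance. The preset model parameter is a preset quantile of the data volume distribution information and is used to determine the preset interval of data volumes for the first target object; this preset interval is the abnormality criterion for the first target object. The server therefore determines, in the first target sub-model, the target position corresponding to the preset model parameter; the data volume at that position is the abnormal data volume. The server then compares the target data volume with the abnormal data volume and, from which of the two is larger, determines whether the target data volume lies within the preset interval for the first target object (the range from zero to the abnormal data volume). This gives the first detection result corresponding to the behavior information to be detected.
It should be noted that the first detection result indicates whether the behavior information to be detected is abnormal with respect to the first target object. When the target data volume is greater than the abnormal data volume, the target data volume lies outside the preset interval for the first target object, so the server determines a first detection result stating that the behavior information to be detected is abnormal with respect to the first target object. When the target data volume is less than or equal to the abnormal data volume, the target data volume lies within the preset interval, so the server determines a first detection result stating that the behavior information to be detected is normal with respect to the first target object.
For example, referring to FIG. 4, FIG. 4 is an exemplary schematic diagram of determining the abnormal data volume provided by an embodiment of the present application. As shown in FIG. 4, in the histogram 4-1 (the first target sub-model) the horizontal axis is the amount and the vertical axis is the probability. When the preset model parameter is the 99th percentile, the probabilities of the amount segments in the histogram 4-1 are accumulated in ascending order of amount; the amount at the position 14-2 where the accumulated result first exceeds 99% (the target position; the probability that an amount is smaller than the amount at position 14-2 is greater than 99%) is the abnormal data volume.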
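The quantile lookup in S303 can be sketched as follows. This is a minimal Python sketch under stated assumptions: the first target sub-model is represented as a list of `(low, high)` segments in ascending order with one probability each, the function names are hypothetical, and the upper edge of the segment at which the cumulative probability first exceeds the quantile is taken as the abnormal data volume (the text does not say which point of the segment is used).

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # assumption: (low, high) bounds, ascending order


def abnormal_data_volume(segments: List[Segment], probs: List[float],
                         quantile: float = 0.99) -> float:
    """Accumulate segment probabilities in ascending order and return the
    upper edge of the first segment where the running sum exceeds `quantile`."""
    cumulative = 0.0
    for (_, high), p in zip(segments, probs):
        cumulative += p
        if cumulative > quantile:
            return high
    # The running sum never exceeded the quantile: fall back to the largest edge.
    return segments[-1][1]


def first_detection_is_abnormal(target_volume: float, segments: List[Segment],
                                probs: List[float], quantile: float = 0.99) -> bool:
    """Abnormal with respect to the first target object when the target data
    volume is strictly greater than the abnormal data volume."""
    return target_volume > abnormal_data_volume(segments, probs, quantile)
```

If the sub-model was built on transformed (for example, logarithmic) data volumes, the target data volume would need the same transformation before the comparison.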
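S304. Obtain, from the second preset object model, the second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model.
In the embodiments of the present application, the second preset object model is stored on the server in advance, or the server can obtain it in advance. For each second object, the second preset object model holds a model set made up of the data volume distribution information of the first objects associated with that second object. Because the second target object is one of the second objects, the server can obtain the model group corresponding to the second target object from the second preset object model and, within that model group, find the model with the highest similarity to the first target sub-model; this gives the second target sub-model.
In other words, in the second preset object model the server matches the second target object against each second object, obtains the data volume distribution information of the first objects associated with the matched second object (the model group), and selects from it the distribution most similar to the first target sub-model already obtained; that distribution is the second target sub-model. The distribution information for a first object in the first preset object model differs from that in the second preset object model: the latter is obtained by processing the former and is the second object's data volume distribution information with respect to the first object.
It should be noted that the second target sub-model is the second target object's data volume distribution information with respect to the first target object, for example user C's amount histogram for merchant A, or user D's read count distribution for article B.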
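S305. Obtain the target maximum data volume corresponding to the second target sub-model, and determine the second detection result corresponding to the behavior information to be detected based on the comparison between the target data volume and the target maximum data volume.
In the embodiments of the present application, the second preset object model includes not only the model group corresponding to each second object but also the maximum data volume corresponding to each second object. After obtaining the second target sub-model, the server determines the target maximum data volume based on the maximum data volume corresponding to the second target object in the second preset object model. That is, the target maximum data volume may be the maximum data volume corresponding to the second target object itself, or a value determined on the basis of that maximum data volume, and so on; this is not specifically limited in the embodiments of the present application.
Having obtained the target maximum data volume, the server has determined the preset interval of data volumes for the second target object; this preset interval is the abnormality criterion for the second target object. The server therefore compares the target data volume with the target maximum data volume and, from which of the two is larger, determines whether the target data volume lies within the preset interval for the second target object (the range from zero to the target maximum data volume). This gives the second detection result corresponding to the behavior information to be detected.
It should be noted that the second detection result indicates whether the behavior information to be detected is abnormal with respect to the second target object. When the target data volume is greater than the target maximum data volume, the target data volume lies outside the preset interval for the second target object, so the server determines a second detection result stating that the behavior information to be detected is abnormal with respect to the second target object. When the target data volume is less than or equal to the target maximum data volume, the target data volume lies within the preset interval, so the server determines a second detection result stating that the behavior information to be detected is normal with respect to the second target object.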
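Steps S304 and S305 can be sketched together as follows. This is a minimal Python sketch under stated assumptions, not the applicant's implementation: histograms are lists of probabilities over the same segments, the similarity is the sum of per-segment minimum probabilities (the measure the module summary describes for model construction, reused here as an assumption), and the fallback to a preset data volume when the best similarity does not exceed the second preset similarity follows the module summary for the second detection module. The function and parameter names, and any threshold or preset values passed in, are hypothetical.

```python
from typing import List

Histogram = List[float]  # assumption: probabilities over the same segments


def similarity(a: Histogram, b: Histogram) -> float:
    """Sum of per-segment minimum probabilities."""
    return sum(min(x, y) for x, y in zip(a, b))


def second_detection_is_abnormal(target_volume: float,
                                 first_target_hist: Histogram,
                                 user_hists: List[Histogram],
                                 user_max_volume: float,
                                 second_preset_similarity: float,
                                 preset_volume: float) -> bool:
    """Pick the user's sub-model most similar to the first target sub-model,
    choose the target maximum data volume, and compare."""
    # Highest target similarity; 0.0 when the user has no sub-models at all.
    best = max((similarity(first_target_hist, h) for h in user_hists), default=0.0)
    if best > second_preset_similarity:
        target_max = user_max_volume   # the user's own historical maximum
    else:
        target_max = preset_volume     # no sufficiently similar sub-model
    return target_volume > target_max
```

The fallback covers a user paying at a kind of merchant unlike any seen before, where the user's own maximum is not a meaningful bound.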
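S306. Combine the first detection result and the second detection result to determine the target detection result of the behavior information to be detected.
In the embodiments of the present application, after obtaining the first detection result and the second detection result, the server combines them to determine the target detection result of the behavior information to be detected. Here, the target detection result states whether the behavior information to be detected is abnormal.
It should be noted that combining the first detection result and the second detection result covers four cases. When the first detection result is abnormal with respect to the first target object and the second detection result is abnormal with respect to the second target object, the server determines a target detection result stating that the behavior information to be detected is abnormal. When the first detection result is normal with respect to the first target object and the second detection result is abnormal with respect to the second target object, the server determines a target detection result stating that the behavior information is abnormal; alternatively, it may determine that the behavior information is normal, or mark it as behavior information pending review, to be examined further manually or by intelligent detection processing. When the first detection result is normal with respect to the first target object and the second detection result is normal with respect to the second target object, the server determines a target detection result stating that the behavior information is normal. When the first detection result is abnormal with respect to the first target object and the second detection result is normal with respect to the second target object, the server determines a target detection result stating that the behavior information is normal; alternatively, it may determine that the behavior information is abnormal, or mark it as behavior information pending review, to be examined further manually or by intelligent detection processing.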
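The decision matrix in S306 can be sketched as a small function. This is an illustrative Python sketch: the `Verdict` enum, the `combine` function, and the two policy parameters are hypothetical names. The defaults for the two mixed cases follow the first option the text gives for each, and the parameters let either case be switched to one of the alternatives the text allows.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"
    REVIEW = "pending review"


def combine(first_abnormal: bool, second_abnormal: bool,
            first_only: Verdict = Verdict.NORMAL,
            second_only: Verdict = Verdict.ABNORMAL) -> Verdict:
    """Combine the two detection results into the target detection result.

    first_only:  verdict when only the first detection result is abnormal.
    second_only: verdict when only the second detection result is abnormal.
    """
    if first_abnormal and second_abnormal:
        return Verdict.ABNORMAL
    if not first_abnormal and not second_abnormal:
        return Verdict.NORMAL
    return first_only if first_abnormal else second_only
```

For example, `combine(True, False, first_only=Verdict.REVIEW)` routes a payment that is unusual for the merchant but usual for the user to further review instead of passing it directly.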
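In the embodiments of the present application, after obtaining the target detection result, the server may also determine target processing information from it. The target processing information is the way the behavior information to be detected is handled. For example, when the behavior information to be detected is a payment operation and the target detection result is abnormal, the target processing information is to intercept the payment operation. As another example, when the behavior information to be detected is an advertisement click and the target detection result is abnormal, the target processing information is to block the advertisement click.
It can be understood that when the behavior information to be detected has three feature dimensions (the first target object, the second target object, and the target data volume), the target detection result is determined by combining the results of comparing the target data volume with the abnormal data volume and with the target maximum data volume. The abnormal data volume is an abnormality criterion for the first target object determined from the first preset object model, and the target maximum data volume is an abnormality criterion for the second target object determined from the second preset object model. Therefore, even with low-dimensional features, whether the target data volume falls within the preset interval can be judged from two dimensions, the first target object and the second target object, so that the target detection result of whether the behavior information to be detected is abnormal is obtained accurately; thus, the accuracy of abnormal behavior detection is high.
Referring to FIG. 5, FIG. 5 is another optional flowchart of the abnormal behavior detection method provided by an embodiment of the present application. As shown in FIG. 5, in the embodiments of the present application, S307-S311 are performed before S302; that is, before the server obtains the first target sub-model corresponding to the first target object from the first preset object model, the abnormal behavior detection method further includes S307-S311, which are described below.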
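S307. Obtain behavior information samples.
In the embodiments of the present application, the server obtains the behavior information of the current period, which gives the behavior information samples; for example, the payment orders of the past week or the reading records of the past month. Here, the current period is the most recent preset period.
It should be noted that the behavior information samples are the set of behavior information records for the first objects, second objects, and data volumes within the current period, so each record in the behavior information samples includes a first object, a second object, and a data volume.
S308. Aggregate the behavior information samples according to the first preset object type to obtain the data volume set corresponding to each first object; construct, from the data volume set, the first sub-model corresponding to each first object; and determine the first sub-models constructed for the first objects as the first preset object model.
It should be noted that after obtaining the behavior information samples, the server aggregates them by different preset types to obtain the first preset object model and the second preset object model. The preset types include the first preset object type (for example, merchant or product) and the second preset object type (for example, user); the first preset object type is the object type to which the first objects belong, and the second preset object type is the object type to which the second objects belong. When the server aggregates the behavior information samples by the first preset object type and collects the data volumes corresponding to each first object, it obtains the data volume set corresponding to each first object.
Here, the object type of each first object is the first preset object type, and the data volume set is the set of data volumes of each first object across the second objects.
In the embodiments of the present application, after obtaining the data volume set, the server determines from it the data volume distribution information corresponding to each first object, which completes the construction of the first sub-model for that first object. Once the first sub-model of every first object has been constructed, the resulting first sub-models constitute the first preset object model. Here, "each first object" refers to any one of the first objects; the first preset object model is the set of the first sub-models of the first objects; and the first target sub-model is one of the first sub-models.
S309. Aggregate the behavior information samples according to the second preset object type to obtain the first object set and the maximum data volume corresponding to each second object.
In the embodiments of the present application, when the server aggregates the behavior information samples by the second preset object type and collects the first objects corresponding to each second object, it obtains the first object set corresponding to each second object. During this aggregation the server also collects the data volumes corresponding to each second object and selects the largest of them, which gives the maximum data volume corresponding to each second object.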
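The two aggregations in S308 and S309 can be sketched in one pass over the samples. This is a minimal Python sketch assuming each behavior information record is a `(first_object, second_object, data_volume)` tuple; the function name `aggregate` and the record layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, float]  # assumption: (first object, second object, data volume)


def aggregate(records: Iterable[Record]):
    """Group by first object (S308) and by second object (S309)."""
    volumes_by_first: Dict[str, List[float]] = defaultdict(list)  # data volume sets
    firsts_by_second: Dict[str, Set[str]] = defaultdict(set)      # first object sets
    max_by_second: Dict[str, float] = {}                          # maximum data volumes
    for first, second, volume in records:
        volumes_by_first[first].append(volume)
        firsts_by_second[second].add(first)
        max_by_second[second] = max(volume, max_by_second.get(second, volume))
    return dict(volumes_by_first), dict(firsts_by_second), max_by_second
```

With payment orders, `volumes_by_first` would hold every amount paid at each merchant, while `firsts_by_second` and `max_by_second` would hold the merchants each user paid and that user's largest payment.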
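S310. Traverse the first object set and, based on the first sub-model corresponding to each traversed first object, construct at least one second object sub-model.
In the embodiments of the present application, after obtaining the first object set, the server traverses the first objects in it. Each traversed first object is matched against the first objects in the first preset object model, and the model of the matched first object is the first sub-model corresponding to the traversed first object. The server uses the first sub-models of the traversed first objects to construct the at least one second object sub-model corresponding to each second object.
It should be noted that the at least one second object sub-model is the set of data volume distributions of the first objects associated with each second object, for example, for user C, the amount histogram of merchant A, the amount histogram of merchant E, and the amount histogram of merchant F. A traversed first object is any first object in the first object set.
S311. Combine the at least one second object sub-model and the maximum data volume into the second sub-model corresponding to each second object, and determine the second sub-models of the second objects as the second preset object model.
It should be noted that after obtaining the at least one second object sub-model corresponding to each second object, the server combines it with the maximum data volume; the combination is the second sub-model corresponding to that second object. Once the second sub-model of every second object has been obtained, the resulting second sub-models constitute the second preset object model. Here, "each second object" refers to any one of the second objects, and the second preset object model is the set of the second sub-models of the second objects.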
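In this embodiment of the present application, the server's construction in S308 of the first sub-model corresponding to each first object from the data volume set includes S3081-S3085, each of which is described below.
S3081: Obtain the data volume range corresponding to the data volumes in the data volume set.
In this embodiment, the server extracts the smallest and the largest data volume from the data volume set; the range spanned by these two values is the data volume range.
S3082: Segment the data volume range to obtain multiple target segments.
In this embodiment, the server segments the data volume range according to a preset segment size or a preset number of segments, which yields the multiple target segments.
S3083: From the data volume set, count the target number of data volumes belonging to each of the multiple target segments.
In this embodiment, after obtaining the multiple target segments and the data volume set, the server determines the target segment to which each data volume in the set belongs and counts the data volumes falling in each target segment; this count is the target number corresponding to that target segment.
S3084: Determine the ratio of the target number to the number of elements in the data volume set as the probability value corresponding to each target segment.
In this embodiment, the server counts the data volumes in the data volume set, which gives the number of elements of the set. It then takes the target number as the numerator and the number of elements as the denominator; the resulting ratio is the probability value of the target segment. Once this has been done for every target segment, multiple probability values corresponding one-to-one to the multiple target segments are obtained.
S3085: Determine the multiple probability values corresponding to the multiple target segments as the first sub-model corresponding to each first object.
It should be noted that the first sub-model is the multiple probability values, over the multiple target segments, associated with a first object.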
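Steps S3081-S3085 amount to building a normalized histogram. The following is a minimal sketch under stated assumptions: equal-width segments defined by a preset number of segments, the function name `build_first_submodel`, and the convention that the largest value falls into the last segment are all choices made for this example.

```python
def build_first_submodel(volumes, num_segments):
    """Return (segment edges, probability per segment) for one first object."""
    lo, hi = min(volumes), max(volumes)           # S3081: data volume range
    width = (hi - lo) / num_segments or 1.0       # S3082: equal-width segments
    counts = [0] * num_segments
    for v in volumes:                             # S3083: count per segment
        idx = min(int((v - lo) / width), num_segments - 1)
        counts[idx] += 1
    total = len(volumes)                          # S3084: count / set size
    probs = [c / total for c in counts]
    edges = [lo + k * width for k in range(num_segments + 1)]
    return edges, probs                           # S3085: the first sub-model

edges, probs = build_first_submodel([1.0, 2.0, 2.5, 4.0, 5.0], num_segments=4)
```

Because every data volume falls into exactly one segment, the probability values sum to 1. To compare sub-models of different first objects segment by segment, all of them would have to share the same segment edges, which this per-object sketch does not enforce.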
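In this embodiment, S3081 can be implemented through S30811 and S30812; that is, the server's obtaining of the data volume range corresponding to the data volumes in the data volume set includes S30811 and S30812, each of which is described below.
S30811: Transform each data volume in the data volume set to obtain a transformed data volume set.
It should be noted that the data volumes in the data volume set usually follow a log-normal distribution, which has a flat part (the long tail). The probabilities in this flat part are close to 0, so detection results for large data volumes are inaccurate and cannot support the subsequent abnormal behavior detection. Therefore, to improve the accuracy of abnormal behavior detection, the server transforms each data volume in the set to eliminate the flat part, so that the transformed data volumes approximately follow a normal distribution.
Here, the transformation may be a logarithmic transformation, a reduction of all data volumes by a preset factor, a multiplication of the data volumes by respective weights, and so on; this is not specifically limited in the embodiments of the present application. It should also be noted that, when the data volumes are transformed during model acquisition, the data volumes referred to in both model acquisition and model application are the transformed data volumes.
Referring to FIG. 6a, when the data volume is an amount, the amount of a small payment is sometimes small (for example, a few yuan) and the amount of a large payment is sometimes large (for example, several hundred thousand yuan). The distribution curve 6-1 of the data volume therefore has a flat part 6-11 whose probabilities are almost 0 and which cannot support subsequent abnormal payment detection; in FIG. 6a the horizontal axis is the amount and the vertical axis is the probability value. When the data volumes are transformed with the natural logarithm (ln), the distribution curve 6-1 in FIG. 6a becomes the distribution curve 6-2 in FIG. 6b, in which the horizontal axis is the transformed amount and the vertical axis is the probability value. Taking the logarithm of the amount allows the model to encode fluctuations of a few yuan for small payments and fluctuations of thousands or even tens of thousands of yuan for large payments; this trend of being sensitive to small amounts and insensitive to large amounts is consistent with the shape of the logarithmic function, for example: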
Figure PCTCN2021104999-appb-000001
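S30812: Determine the range corresponding to the transformed data volumes in the transformed data volume set as the data volume range.
It should be noted that, after obtaining the transformed data volume set, the server extracts the smallest and the largest transformed data volume from it; the range spanned by these two values is the data volume range.
Correspondingly, the server's counting in S3083 of the target number of data volumes belonging to each of the multiple target segments includes: the server counts, from the transformed data volume set, the target number of transformed data volumes belonging to each of the multiple target segments. That is, the server determines the target segment to which each transformed data volume belongs and counts the transformed data volumes in each target segment; this count is the target number corresponding to that target segment. In addition, the target data volume and the other data volumes are likewise the corresponding transformed values.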
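The transformation of S30811 and the range of S30812 can be sketched with the natural logarithm, which is one of the transformations the text mentions. The function names and the requirement that all data volumes be strictly positive are assumptions of this example.

```python
import math

def log_transform(volumes):
    """S30811: natural-log transform; assumes every data volume is > 0."""
    return [math.log(v) for v in volumes]

def transformed_range(volumes):
    """S30812: range spanned by the transformed data volumes."""
    transformed = log_transform(volumes)
    return min(transformed), max(transformed)

# Hypothetical amounts from a few yuan up to several hundred thousand yuan.
lo, hi = transformed_range([1.0, 5.0, 80.0, 300000.0])
```

On the transformed axis, equal-width segments correspond to equal ratios of the original amounts, so a difference of a few yuan is resolved among small payments while only differences of thousands of yuan are resolved among large ones.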
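In this embodiment, S310 can be implemented through S3101-S3105; that is, the server's traversal of the first object set and construction of at least one second-object sub-model based on the first sub-models of the traversed first objects includes S3101-S3105, each of which is described below.
S3101: Traverse the first object set, determine the first sub-model corresponding to the 1st traversed first object as a current sub-model, and construct the 1st current sub-model set containing this current sub-model.
It should be noted that, when the first object set is traversed and the traversed first object is the 1st one, the current sub-model set of the associated second object is empty; the server therefore takes the first sub-model of the 1st traversed first object as a current sub-model and uses it as the single element of the 1st current sub-model set.
S3102: Perform the following processing iteratively over i: compare the first sub-model corresponding to the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set to obtain a similar sub-model, where 2≤i≤I, i is an incrementing positive integer variable, and I is the number of first objects in the first object set.
It should be noted that, when the traversed first object is one traversed after the 1st, the server compares the first sub-model of the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set and, according to the comparison results, selects from that set the current sub-model most similar to the first sub-model of the i-th first object; this is the similar sub-model.
S3103: When the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is greater than a first preset similarity, merge the two to obtain a merged sub-model, and replace the similar sub-model in the (i-1)-th current sub-model set with the merged sub-model to obtain the i-th current sub-model set.
It should be noted that, after obtaining the similar sub-model, the server compares the similarity between the similar sub-model and the first sub-model of the i-th first object with the first preset similarity. A similarity greater than the first preset similarity indicates that the first object of the i-th first sub-model (for example, convenience store A) and the first object of the similar sub-model (for example, convenience store B) are alike in terms of data volume. The server then merges the first sub-model of the i-th first object with the similar sub-model to obtain the merged sub-model, and replaces the similar sub-model in the (i-1)-th current sub-model set with it; the set after the replacement is the i-th current sub-model set.
S3104: When the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is less than or equal to the first preset similarity, insert the first sub-model corresponding to the i-th first object into the (i-1)-th current sub-model set to obtain the i-th current sub-model set.
It should be noted that a similarity less than or equal to the first preset similarity indicates that the first object of the i-th first sub-model (for example, convenience store A) and the first object of the similar sub-model (for example, convenience store B) are not alike in terms of data volume. The server then inserts the first sub-model of the i-th first object into the (i-1)-th current sub-model set; the set after the insertion is the i-th current sub-model set.
S3105: Determine the I-th current sub-model set obtained by the iteration over i as the at least one second-object sub-model.
In this embodiment, while traversing the first object set, the server applies the process of S3102 to S3104 to each traversed first object to update the current sub-model set; when the traversal is complete, the updated current sub-model set is the constructed at least one second-object sub-model.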
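The traversal of S3101-S3105 can be sketched as an incremental grouping of histograms. This is an illustrative sketch only: the function names, the representation of a sub-model as a plain list of probabilities over shared target segments, and the sample histograms are assumptions; the similarity (sum of per-segment minima) and the merge (per-segment maximum) follow the descriptions of S31023 and S31033 in this document.

```python
def similarity(a, b):
    # Sum of per-segment minimum probabilities (S31023).
    return sum(min(x, y) for x, y in zip(a, b))

def merge(a, b):
    # Per-segment maximum probabilities (S31033).
    return [max(x, y) for x, y in zip(a, b)]

def build_second_object_submodels(first_objects, first_model, threshold):
    """Build the current sub-model set for one second object."""
    current = []
    for i, obj in enumerate(first_objects):
        hist = first_model[obj]
        if i == 0:                       # S3101: first traversed object
            current.append(list(hist))
            continue
        sims = [similarity(hist, c) for c in current]            # S3102
        best = max(range(len(current)), key=sims.__getitem__)
        if sims[best] > threshold:       # S3103: merge and replace
            current[best] = merge(hist, current[best])
        else:                            # S3104: insert as a new sub-model
            current.append(list(hist))
    return current                       # S3105

# Hypothetical first sub-models over three shared target segments.
first_model = {
    "store_a": [0.5, 0.5, 0.0],
    "store_b": [0.4, 0.6, 0.0],
    "jeweller_f": [0.0, 0.1, 0.9],
}
submodels = build_second_object_submodels(
    ["store_a", "store_b", "jeweller_f"], first_model, threshold=0.8
)
```

Here the two stores are merged into one sub-model and the jeweller is kept separately. Note that a merged sub-model built from per-segment maxima generally no longer sums to 1.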
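In this embodiment, S3102 further includes S31021-S31024; that is, the server's comparison of the first sub-model corresponding to the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set to obtain the similar sub-model includes S31021-S31024, each of which is described below.
S31021: From the first sub-model corresponding to the i-th traversed first object, obtain multiple first target probability values corresponding to the multiple target segments.
It should be noted that the first sub-model of each first object consists of multiple probability values corresponding to the multiple target segments; the first sub-model of the i-th traversed first object therefore also includes such probability values, referred to here as the multiple first target probability values. They describe the relationship between the target segments and the probability values in that first sub-model, and the multiple target segments correspond one-to-one to the multiple first target probability values.
S31022: From each current sub-model in the (i-1)-th current sub-model set, obtain multiple second target probability values corresponding to the multiple target segments.
It should be noted that each current sub-model in the (i-1)-th current sub-model set also includes multiple probability values corresponding to the multiple target segments, referred to here as the multiple second target probability values. They describe the relationship between the target segments and the probability values in each current sub-model, and the multiple target segments correspond one-to-one to the multiple second target probability values.
S31023: Compare the multiple first target probability values with the multiple second target probability values in one-to-one correspondence to obtain multiple minimum probability values, and determine the sum of the multiple minimum probability values as the similarity between the first sub-model corresponding to the i-th first object and each current sub-model.
It should be noted that, because the second target probability values correspond one-to-one to the first target probability values, the server determines the similarity between the first sub-model of the i-th traversed first object and each current sub-model by comparing the former's first target probability values with the latter's second target probability values. Doing so for every current sub-model yields at least one similarity corresponding to the at least one current sub-model in the (i-1)-th current sub-model set.
Here, the multiple minimum probability values correspond one-to-one to the multiple first target probability values and to the multiple second target probability values, and the at least one current sub-model in the (i-1)-th current sub-model set corresponds one-to-one to the at least one similarity.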
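Referring to FIG. 7, FIG. 7 is an exemplary schematic diagram of obtaining a similarity provided by an embodiment of the present application. As shown in FIG. 7, in a coordinate system whose horizontal axis is the logarithm of the amount and whose vertical axis is the probability value, the similarity between the first sub-model 7-1 corresponding to the i-th first object and a current sub-model 7-2 is the area of region 7-3. When the first target probability values of the first sub-model 7-1 are $a_j$ and the second target probability values of the current sub-model 7-2 are $b_j$, with $j = 1, \dots, n$, the similarity $S$ between the first sub-model 7-1 and the current sub-model 7-2 is given by formula (1):
$$S = \sum_{j=1}^{n} \min(a_j, b_j) \qquad (1)$$
where $n$ is the number of target segments.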
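Formula (1) is the overlap (intersection) of two histograms. A minimal sketch follows, assuming both sub-models are lists of probabilities over the same target segments; the function name and the length check are additions for this example.

```python
def histogram_similarity(a, b):
    """Similarity S of formula (1): sum over segments of min(a_j, b_j)."""
    if len(a) != len(b):
        raise ValueError("both sub-models must use the same target segments")
    return sum(min(x, y) for x, y in zip(a, b))

identical = histogram_similarity([0.25, 0.5, 0.25], [0.25, 0.5, 0.25])
disjoint = histogram_similarity([1.0, 0.0], [0.0, 1.0])
partial = histogram_similarity([0.5, 0.5, 0.0], [0.25, 0.25, 0.5])
```

For two sub-models that each sum to 1, the value lies between 0 (no overlap) and 1 (identical distributions), which makes a fixed threshold such as 0.8 meaningful.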
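S31024: From the at least one similarity determined for the (i-1)-th current sub-model set, select the highest similarity, and determine the current sub-model in the (i-1)-th current sub-model set that corresponds to the highest similarity as the similar sub-model.
In this embodiment, the server obtains the highest similarity among the similarities and takes the current sub-model in the (i-1)-th current sub-model set that corresponds to it as the similar sub-model.
In this embodiment, the server's merging in S3103 of the first sub-model corresponding to the i-th first object with the similar sub-model to obtain the merged sub-model includes S31031-S31034, each of which is described below.
S31031: From the first sub-model corresponding to the i-th first object, obtain multiple first target probability values corresponding to the multiple target segments.
It should be noted that S31031 is implemented in the same way as described for S31021.
S31032: From the similar sub-model, obtain multiple to-be-merged probability values corresponding to the multiple target segments.
It should be noted that the similar sub-model also includes multiple probability values corresponding to the multiple target segments, referred to here as the multiple to-be-merged probability values. They describe the relationship between the target segments and the probability values in the similar sub-model, and the multiple target segments correspond one-to-one to the multiple to-be-merged probability values.
S31033: Compare the multiple first target probability values with the multiple to-be-merged probability values in one-to-one correspondence to obtain multiple maximum probability values.
It should be noted that the multiple first target probability values correspond one-to-one to the multiple to-be-merged probability values, and the multiple maximum probability values correspond one-to-one to both.
S31034: Combine the multiple target segments with the multiple maximum probability values to obtain the merged sub-model.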
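Referring to FIG. 8, FIG. 8 is an exemplary schematic diagram of obtaining a merged sub-model provided by an embodiment of the present application. As shown in FIG. 8, in a coordinate system whose horizontal axis is the logarithm of the amount and whose vertical axis is the probability value, the merged sub-model 8-3 obtained from the first sub-model 8-1 corresponding to the i-th first object and the similar sub-model 8-2 is shown. When the first target probability values of the first sub-model 8-1 are $a_j$ and the to-be-merged probability values of the similar sub-model 8-2 are $c_j$, merging the first sub-model 8-1 with the similar sub-model 8-2 into the merged sub-model 8-3 can be expressed by formula (2):
$$C = \{\max(a_j, c_j)\}_{j=1}^{n} \qquad (2)$$
where $C$ denotes the merged sub-model 8-3 and $n$ is the number of target segments.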
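Formula (2) together with S31034 can be sketched as pairing each target segment with the larger of the two probability values. The representation of a segment as a (lower, upper) tuple and the function name are assumptions of this example.

```python
def merge_submodels(segments, a, c):
    """Merged sub-model C of formula (2), paired with its target segments."""
    if not (len(segments) == len(a) == len(c)):
        raise ValueError("segments and both sub-models must have equal length")
    return [(seg, max(x, y)) for seg, x, y in zip(segments, a, c)]

# Hypothetical target segments on the log-amount axis.
segments = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
merged = merge_submodels(segments, [0.5, 0.5, 0.0], [0.4, 0.6, 0.0])
```

The merged values form an upper envelope of the two histograms, so their sum can exceed 1; in this example it is 1.1.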
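In this embodiment, S304 can be implemented through S3041-S3044; that is, the server's obtaining, from the second preset object model, of the second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model includes S3041-S3044, each of which is described below.
S3041: From the second preset object model, obtain at least one target second-object sub-model corresponding to the second target object.
It should be noted that the second preset object model is obtained by combining the at least one second-object sub-model of each second object; the server therefore matches the second target object against each second object in the second preset object model, and the at least one second-object sub-model of the matched second object is the at least one target second-object sub-model.
S3042: Obtain at least one target similarity between the first target sub-model and each of the at least one target second-object sub-model.
In this embodiment, the server compares the first target sub-model with each target second-object sub-model to obtain the target similarity between them, which yields at least one target similarity corresponding one-to-one to the at least one target second-object sub-model.
It should be noted that the process by which the server obtains the target similarities is similar to the process of obtaining similarities described in S31021-S31023 and is not repeated here.
S3043: Obtain the highest target similarity from the at least one target similarity.
Here, the highest target similarity is the highest of the at least one target similarity corresponding to the at least one target second-object sub-model.
S3044: Among the at least one target second-object sub-model, determine the one corresponding to the highest target similarity as the second target sub-model.
It should be noted that the server selects, according to the highest target similarity, the corresponding target second-object sub-model from the at least one target second-object sub-model and determines it as the second target sub-model; the second target sub-model thus has the highest similarity to the first target sub-model.
In this embodiment, the server's obtaining in S305 of the target maximum data volume corresponding to the second target sub-model includes: when the highest target similarity is greater than a second preset similarity, the server determines the maximum data volume corresponding to the second target object in the second preset object model as the target maximum data volume corresponding to the second target sub-model; when the highest target similarity is less than or equal to the second preset similarity, the server determines a preset data volume as the target maximum data volume corresponding to the second target sub-model. Here, the preset data volume is, for example, 0 or any other value; in addition, the first preset similarity and the second preset similarity may be the same or different, which is not specifically limited in the embodiments of the present application.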
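S3041-S3044 and the threshold selection of S305 can be sketched as follows. The function names, the default values 0.8 and 0.0 (taken from the payment example in this document), and the sample histograms are assumptions of this illustration; the similarity is the sum of per-segment minima described in S31023.

```python
def target_max_volume(first_target, user_submodels, user_max,
                      threshold=0.8, preset_volume=0.0):
    """Pick the target maximum data volume for the second detection."""
    sims = [sum(min(x, y) for x, y in zip(first_target, m))
            for m in user_submodels]                 # S3042
    highest = max(sims) if sims else 0.0             # S3043 / S3044
    return user_max if highest > threshold else preset_volume  # S305

def second_result_abnormal(target_volume, target_max):
    # Abnormal with respect to the second target object when the volume
    # exceeds the target maximum data volume.
    return target_volume > target_max

user_submodels = [[0.5, 0.6, 0.0], [0.0, 0.1, 0.9]]
known = target_max_volume([0.5, 0.5, 0.0], user_submodels, user_max=300.0)
unknown = target_max_volume([0.3, 0.3, 0.4], user_submodels, user_max=300.0)
```

With a preset data volume of 0, any positive amount at a kind of merchant the user has no similar history with is flagged on the user side, which is why the final decision also takes the merchant-side result into account.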
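The following describes an exemplary application of the embodiments of the present application in a practical scenario, taking anomaly detection of payment orders in the payment scenario of an instant messaging client as an example.
Referring to FIG. 9, FIG. 9 is an exemplary schematic flowchart of abnormal behavior detection provided by an embodiment of the present application. As shown in FIG. 9, in the payment scenario, after a user submits a payment order 9-1 (the behavior information to be detected), the server first pulls the payment order before completing the payment and obtains from it the order triple 9-2: user 9-21 (the second target object), merchant 9-22 (the first target object) and amount 9-23 (the target data volume).
Next, the server obtains from the merchant model 9-31 (the first preset object model) the merchant histogram 9-311 (the first target sub-model) corresponding to merchant 9-22, and determines the normal transaction threshold $t_u$ (the abnormal data volume) of merchant 9-22 from the 99th percentile (the preset model parameter) of the merchant histogram 9-311; for the process of determining $t_u$, see FIG. 4. The threshold $t_u$ means that, if an amount is greater than $t_u$, the corresponding payment order lies beyond 99% of the normal cases and is determined to be an abnormal transaction on the merchant side. Here, because the amount 9-23 is greater than $t_u$, the payment order 9-1 lies beyond 99% of the normal cases, and the result 9-41 (the first detection result) that the payment order 9-1 is an abnormal transaction on the merchant side is determined.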
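One way to read a percentile threshold off a histogram is sketched below. The procedure of FIG. 4 is not reproduced in this section, so the rule used here (return the upper edge of the first segment at which the cumulative probability reaches the quantile) is an assumption of this example, as are the function name and the sample values.

```python
def abnormal_volume(edges, probs, quantile=0.99):
    """Upper edge of the first segment whose cumulative probability
    reaches `quantile`; `edges` has one more entry than `probs`."""
    cumulative = 0.0
    for upper, p in zip(edges[1:], probs):
        cumulative += p
        if cumulative >= quantile:
            return upper
    return edges[-1]  # fall back to the top of the range

# Hypothetical merchant histogram over four segments.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
probs = [0.5, 0.25, 0.25, 0.0]
t_u = abnormal_volume(edges, probs)               # 99th percentile
t_median_like = abnormal_volume(edges, probs, quantile=0.75)
```

If the sub-model was built on log-transformed amounts, the returned threshold is also on the log scale, so the target data volume has to be transformed in the same way before the comparison.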
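Then, the server obtains from the user model 9-32 (the second preset object model) the merchant histogram group 9-321 (the at least one target second-object sub-model) corresponding to user 9-21, and searches the group for the merchant histogram 9-322 (the second target sub-model) most similar to the merchant histogram 9-311. If the similarity between the merchant histogram 9-322 and the merchant histogram 9-311 is greater than 0.8 (the second preset similarity), the historical maximum transaction amount of user 9-21 in the user model 9-32 (the maximum data volume corresponding to the second target object) is taken as the normal transaction threshold $t_v$ (the target maximum data volume); otherwise, this indicates that user 9-21 has not made purchases at merchant 9-22, and $t_v$ is set to 0 (the preset data volume). The threshold $t_v$ means that, if an amount is less than or equal to $t_v$, the amount does not exceed the largest amount in the historical payment orders, and the corresponding payment order is determined to be a normal transaction on the user side. Here, because the amount 9-23 is greater than $t_v$, an amount this large has not occurred in the historical payment orders, and the result 9-42 (the second detection result) that the payment order 9-1 is an abnormal transaction on the user side is determined.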
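Finally, according to the decision matrix shown in Table 1, results 9-41 and 9-42 are combined to determine that the payment order 9-1 is abnormal (the target detection result) and may be a transaction made with a stolen account; the payment order 9-1 is therefore intercepted (9-5) to improve network security.
Table 1
Merchant side    User side    Target detection result
Abnormal         Abnormal     Abnormal payment
Abnormal         Normal       Normal payment
Normal           Abnormal     Suspected abnormal payment
Normal           Normal       Normal payment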
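From the decision matrix in Table 1 it can be seen that: when the merchant side indicates an abnormal payment and the user side also indicates an abnormal payment, the payment order is determined to be an abnormal payment; when the merchant side indicates an abnormal payment and the user side indicates a normal payment, the payment order is determined to be a normal payment; when the merchant side indicates a normal payment and the user side indicates an abnormal payment, the payment order is determined to be a suspected abnormal payment, in which case its abnormality is decided according to the actual situation, that is, the payment order may be treated as normal or as abnormal; and when the merchant side indicates a normal payment and the user side also indicates a normal payment, the payment order is determined to be a normal payment.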
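The decision matrix of Table 1, as described in the preceding paragraph, can be sketched as a lookup table. The enum and function names are hypothetical, and how a suspected abnormal payment is finally resolved is left open, as in the text.

```python
from enum import Enum

class Verdict(Enum):
    NORMAL = "normal payment"
    ABNORMAL = "abnormal payment"
    SUSPECTED = "suspected abnormal payment"

# Keys: (merchant side abnormal?, user side abnormal?), per Table 1.
TABLE_1 = {
    (True, True): Verdict.ABNORMAL,
    (True, False): Verdict.NORMAL,
    (False, True): Verdict.SUSPECTED,
    (False, False): Verdict.NORMAL,
}

def decide(merchant_abnormal, user_abnormal, matrix=TABLE_1):
    """Combine the first and second detection results."""
    return matrix[(bool(merchant_abnormal), bool(user_abnormal))]

verdict = decide(merchant_abnormal=True, user_abnormal=True)
```

Passing the matrix as a parameter makes it straightforward to swap in a different decision matrix without changing the detection code.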
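In addition, the embodiments of the present application provide another decision matrix, as shown in Table 2:
Table 2
Merchant side    User side    Target detection result
Abnormal         Abnormal     Abnormal payment
Abnormal         Normal       Suspected abnormal payment
Normal           Abnormal     Suspected abnormal payment
Normal           Normal       Normal payment
From the decision matrix in Table 2 it can be seen that: when the merchant side indicates an abnormal payment and the user side also indicates an abnormal payment, the payment order is determined to be an abnormal payment; when exactly one of the two sides indicates an abnormal payment (merchant side abnormal and user side normal, or merchant side normal and user side abnormal), the payment order is determined to be a suspected abnormal payment, in which case its abnormality is decided according to the actual situation, that is, the payment order may be treated as normal or as abnormal; and when the merchant side indicates a normal payment and the user side also indicates a normal payment, the payment order is determined to be a normal payment.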
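It should be noted that the server may also use other decision matrices when determining the target detection result from the first detection result and the second detection result; they are not listed one by one here.
In addition, the above process of determining the abnormality of a payment order runs in real time in a real-time/online environment; it can also be applied to scenarios such as credit card anti-fraud and countering underground (black-market) activity.
It can be understood that, in the payment scenario of an instant messaging client, performing anomaly detection on payment orders and, when a payment is determined to be abnormal, applying processing such as payment interception or identity verification ensures the security of payments in the instant messaging client.
The application of abnormal behavior detection in the payment scenario is described further below.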
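Referring to FIG. 10, FIG. 10 is an exemplary schematic diagram of obtaining the models provided by an embodiment of the present application. As shown in FIG. 10, the server obtains the recent (current preset period) historical payment orders 10-1 (the behavior information sample), where each payment order in the historical payment orders 10-1 includes a user (second object), a merchant (first object) and an amount (data volume).
First, the historical payment orders 10-1 are aggregated by merchant (the first preset object type) to obtain the recent transaction amounts 10-2 of all users at each merchant (the data volume set). After the transaction amounts 10-2 are transformed with the natural logarithm, they are counted by segment to obtain the amount distribution histogram 10-31 (the first sub-model); the histogram 11-1 shown in FIG. 11 is the amount distribution histogram 10-31 of FIG. 10, with the logarithm of the amount on the horizontal axis and the probability value on the vertical axis. Combining the amount distribution histograms 10-31 of all merchants gives the merchant model 10-3.
Next, the historical payment orders 10-1 are aggregated by user (the second preset object type) to obtain the merchants 10-41 at which each user has recently paid (the first object set) and the amounts 10-42 each user has recently paid at those merchants; the largest payment amount 10-421 (the maximum data volume) is selected from the amounts 10-42 as the historical maximum transaction amount.
Then, the merchants 10-41 at which each user has recently paid are traversed, and for each merchant 10-411 (the i-th first object) the corresponding merchant amount distribution histogram 10-32 is obtained from the merchant model 10-3. When the merchant 10-411 is the first merchant traversed among the merchants 10-41, the merchant amount distribution histogram 10-32 is inserted directly into the histogram group 10-51 (the current sub-model set). When the merchant 10-411 is not the first merchant traversed, the merchant amount distribution histogram 10-32 is compared with each histogram in the histogram group 10-51 to find the histogram 10-511 with the highest similarity (the similar sub-model); for the comparison process, see FIG. 7. If the similarity between the histogram 10-511 and the merchant amount distribution histogram 10-32 is greater than 0.8 (the first preset similarity), the merchant amount distribution histogram 10-32 is merged into the histogram 10-511 (for the merging process, see FIG. 8); if the similarity is less than or equal to 0.8, the merchant amount distribution histogram 10-32 is inserted into the histogram group 10-51. When the traversal of the merchants 10-41 is complete (the histogram group 10-51 then being the at least one second-object sub-model), the histogram group 10-51 of each user is combined with the largest payment amount 10-421 to obtain the second sub-model, which in turn gives the user model 10-5 (the second preset object model).
It should be noted that the above process of obtaining the first preset object model and the second preset object model can be performed in an offline environment.
It can be understood that the abnormal behavior detection method provided by the embodiments of the present application is unsupervised and requires no data labeling, which makes detection in the payment scenario of an instant messaging client more practicable. Moreover, obtaining the merchant model and the user model requires only merchants, users and amounts, so abnormal behavior in the payment scenario can be detected accurately even with only three features. In addition, because no data labeling is needed, the merchant model and the user model can be obtained efficiently and kept up to date, which makes them suitable for the real-time payment environment of the instant messaging client.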
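The following continues with an exemplary structure in which the abnormal behavior detection apparatus 455 provided by the embodiments of the present application is implemented as software modules. In some embodiments, as shown in FIG. 2, the software modules of the abnormal behavior detection apparatus 455 stored in the memory 450 may include:
an information acquisition module 4551, configured to obtain behavior information to be detected, the behavior information to be detected including a first target object, a second target object and a target data volume;
a first detection module 4552, configured to obtain, from a first preset object model, a first target sub-model corresponding to the first target object;
the first detection module 4552 being further configured to determine an abnormal data volume from the first target sub-model based on a preset model parameter, and to determine a first detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the abnormal data volume;
a second detection module 4553, configured to obtain, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model;
the second detection module 4553 being further configured to obtain a target maximum data volume corresponding to the second target sub-model, and to determine a second detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the target maximum data volume; and
a result determination module 4554, configured to determine a target detection result of the behavior information to be detected by combining the first detection result and the second detection result.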
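In this embodiment, the abnormal behavior detection apparatus 455 further includes a model acquisition module 4555, configured to: obtain a behavior information sample; aggregate the behavior information sample by a first preset object type to obtain a data volume set corresponding to each first object, construct from the data volume set a first sub-model corresponding to each first object, and determine the first sub-models constructed for the respective first objects as the first preset object model; aggregate the behavior information sample by a second preset object type to obtain a first object set and a maximum data volume corresponding to each second object; traverse the first object set and, based on the first sub-model corresponding to each traversed first object, construct at least one second-object sub-model; and combine the at least one second-object sub-model with the maximum data volume into a second sub-model corresponding to each second object, and determine the second sub-models combined for the respective second objects as the second preset object model.
In this embodiment, the model acquisition module 4555 is further configured to: obtain the data volume range corresponding to the data volumes in the data volume set; segment the data volume range to obtain multiple target segments; count, from the data volume set, the target number of data volumes belonging to each of the multiple target segments; determine the ratio of the target number to the number of elements in the data volume set as the probability value corresponding to each target segment; and determine the multiple probability values corresponding to the multiple target segments as the first sub-model corresponding to each first object.
In this embodiment, the model acquisition module 4555 is further configured to: transform each data volume in the data volume set to obtain a transformed data volume set; and determine the range corresponding to the transformed data volumes in the transformed data volume set as the data volume range.
In this embodiment, the model acquisition module 4555 is further configured to count, from the transformed data volume set, the target number of transformed data volumes belonging to each of the multiple target segments.
In this embodiment, the model acquisition module 4555 is further configured to: traverse the first object set, determine the first sub-model corresponding to the 1st traversed first object as a current sub-model, and construct the 1st current sub-model set containing the current sub-model; perform the following processing iteratively over i: compare the first sub-model corresponding to the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set to obtain a similar sub-model, where 2≤i≤I, i is an incrementing positive integer variable, and I is the number of first objects in the first object set; when the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is greater than a first preset similarity, merge the two to obtain a merged sub-model and replace the similar sub-model in the (i-1)-th current sub-model set with the merged sub-model to obtain the i-th current sub-model set; when the similarity is less than or equal to the first preset similarity, insert the first sub-model corresponding to the i-th first object into the (i-1)-th current sub-model set to obtain the i-th current sub-model set; and determine the I-th current sub-model set obtained by the iteration over i as the at least one second-object sub-model.
In this embodiment, the model acquisition module 4555 is further configured to: obtain, from the first sub-model corresponding to the i-th traversed first object, multiple first target probability values corresponding to the multiple target segments; obtain, from each current sub-model in the (i-1)-th current sub-model set, multiple second target probability values corresponding to the multiple target segments; compare the multiple first target probability values with the multiple second target probability values in one-to-one correspondence to obtain multiple minimum probability values, and determine the sum of the multiple minimum probability values as the similarity between the first sub-model corresponding to the i-th first object and each current sub-model; and select the highest similarity from the at least one similarity determined for the (i-1)-th current sub-model set, and determine the current sub-model in the (i-1)-th current sub-model set that corresponds to the highest similarity as the similar sub-model.
In this embodiment, the model acquisition module 4555 is further configured to: obtain, from the first sub-model corresponding to the i-th first object, multiple first target probability values corresponding to the multiple target segments; obtain, from the similar sub-model, multiple to-be-merged probability values corresponding to the multiple target segments; compare the multiple first target probability values with the multiple to-be-merged probability values in one-to-one correspondence to obtain multiple maximum probability values; and combine the multiple target segments with the multiple maximum probability values to obtain the merged sub-model.
In this embodiment, the second detection module 4553 is further configured to: obtain, from the second preset object model, at least one target second-object sub-model corresponding to the second target object; obtain at least one target similarity between the first target sub-model and each of the at least one target second-object sub-model; obtain the highest target similarity from the at least one target similarity; and determine, among the at least one target second-object sub-model, the one corresponding to the highest target similarity as the second target sub-model.
In this embodiment, the second detection module 4553 is further configured to: when the highest target similarity is greater than a second preset similarity, determine the maximum data volume corresponding to the second target object in the second preset object model as the target maximum data volume corresponding to the second target sub-model; and when the highest target similarity is less than or equal to the second preset similarity, determine a preset data volume as the target maximum data volume corresponding to the second target sub-model.
In this embodiment, the first detection module 4552 is further configured to: when the target data volume is greater than the abnormal data volume, determine that the behavior information to be detected is abnormal with respect to the first target object, the first detection result being that the behavior information to be detected is abnormal with respect to the first target object; and when the target data volume is less than or equal to the abnormal data volume, determine that the behavior information to be detected is normal with respect to the first target object, the first detection result being that the behavior information to be detected is normal with respect to the first target object.
In this embodiment, the second detection module 4553 is further configured to: when the target data volume is greater than the target maximum data volume, determine that the behavior information to be detected is abnormal with respect to the second target object, the second detection result being that the behavior information to be detected is abnormal with respect to the second target object; and when the target data volume is less than or equal to the target maximum data volume, determine that the behavior information to be detected is normal with respect to the second target object, the second detection result being that the behavior information to be detected is normal with respect to the second target object.
In this embodiment, the result determination module 4554 is further configured to: when the first detection result is abnormal with respect to the first target object and the second detection result is abnormal with respect to the second target object, determine a target detection result indicating that the behavior information to be detected is abnormal; when the first detection result is normal with respect to the first target object and the second detection result is abnormal with respect to the second target object, determine a target detection result indicating that the behavior information to be detected is abnormal; when the first detection result is normal with respect to the first target object and the second detection result is normal with respect to the second target object, determine a target detection result indicating that the behavior information to be detected is normal; and when the first detection result is abnormal with respect to the first target object and the second detection result is normal with respect to the second target object, determine a target detection result indicating that the behavior information to be detected is normal.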
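An embodiment of the present application provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of an abnormal behavior detection device reads the computer instructions from the computer-readable storage medium and executes them, causing the abnormal behavior detection device to perform the abnormal behavior detection method described above in the embodiments of the present application.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to perform the abnormal behavior detection method provided by the embodiments of the present application, for example, the abnormal behavior detection method shown in FIG. 3.
In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc or a CD-ROM; it may also be any of various devices that include one of the above memories or any combination of them.
In some embodiments, the executable instructions may take the form of a program, software, a software module, a script or code, may be written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a standalone program or as a module, component, subroutine or other unit suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to a file in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (for example, files storing one or more modules, subprograms or code sections).
As an example, the executable instructions may be deployed to execute on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
In summary, according to the embodiments of the present application, when the behavior information to be detected includes three features (the first target object, the second target object and the target data volume), the target detection result is determined by combining the comparisons of the target data volume with the abnormal data volume and with the target maximum data volume. The abnormal data volume is an abnormality criterion for the first target object determined from the first preset object model, and the target maximum data volume is an abnormality criterion for the second target object determined from the second preset object model. Therefore, even with low-dimensional features, judging from both the first-target-object side and the second-target-object side whether the target data volume lies within a preset interval yields an accurate target detection result, and the accuracy of abnormal behavior detection is high.
The above are merely embodiments of the present application and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and scope of the present application falls within its scope of protection.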

Claims (15)

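  1. An abnormal behavior detection method, performed by an electronic device, comprising:
    obtaining behavior information to be detected, the behavior information to be detected comprising a first target object, a second target object and a target data volume;
    obtaining, from a first preset object model, a first target sub-model corresponding to the first target object;
    determining an abnormal data volume from the first target sub-model based on a preset model parameter, and determining a first detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the abnormal data volume;
    obtaining, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model;
    obtaining a target maximum data volume corresponding to the second target sub-model, and determining a second detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the target maximum data volume; and
    determining a target detection result of the behavior information to be detected by combining the first detection result and the second detection result.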
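  2. The method according to claim 1, wherein, before the obtaining, from a first preset object model, a first target sub-model corresponding to the first target object, the method further comprises:
    obtaining a behavior information sample;
    aggregating the behavior information sample by a first preset object type to obtain a data volume set corresponding to each first object, constructing, from the data volume set, a first sub-model corresponding to each first object, and determining the first sub-models constructed for the respective first objects as the first preset object model;
    aggregating the behavior information sample by a second preset object type to obtain a first object set and a maximum data volume corresponding to each second object;
    traversing the first object set and, based on the first sub-model corresponding to each traversed first object, constructing at least one second-object sub-model; and
    combining the at least one second-object sub-model with the maximum data volume into a second sub-model corresponding to each second object, and determining the second sub-models combined for the respective second objects as the second preset object model.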
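  3. The method according to claim 2, wherein the constructing, from the data volume set, a first sub-model corresponding to each first object comprises:
    obtaining the data volume range corresponding to the data volumes in the data volume set;
    segmenting the data volume range to obtain multiple target segments;
    counting, from the data volume set, the target number of data volumes belonging to each of the multiple target segments;
    determining the ratio of the target number to the number of elements in the data volume set as the probability value corresponding to each target segment; and
    determining the multiple probability values corresponding to the multiple target segments as the first sub-model corresponding to each first object.
  4. The method according to claim 3, wherein the obtaining the data volume range corresponding to the data volumes in the data volume set comprises:
    transforming each data volume in the data volume set to obtain a transformed data volume set; and
    determining the range corresponding to the transformed data volumes in the transformed data volume set as the data volume range;
    and the counting, from the data volume set, the target number of data volumes belonging to each of the multiple target segments comprises:
    counting, from the transformed data volume set, the target number of transformed data volumes belonging to each of the multiple target segments.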
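  5. The method according to any one of claims 2 to 4, wherein the traversing the first object set and, based on the first sub-model corresponding to each traversed first object, constructing at least one second-object sub-model comprises:
    traversing the first object set, determining the first sub-model corresponding to the 1st traversed first object as a current sub-model, and constructing the 1st current sub-model set containing the current sub-model;
    performing the following processing iteratively over i:
    comparing the first sub-model corresponding to the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set to obtain a similar sub-model, where 2≤i≤I, i is an incrementing positive integer variable, and I is the number of first objects in the first object set;
    when the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is greater than a first preset similarity, merging the first sub-model corresponding to the i-th first object with the similar sub-model to obtain a merged sub-model, and replacing the similar sub-model in the (i-1)-th current sub-model set with the merged sub-model to obtain the i-th current sub-model set;
    when the similarity between the first sub-model corresponding to the i-th first object and the similar sub-model is less than or equal to the first preset similarity, inserting the first sub-model corresponding to the i-th first object into the (i-1)-th current sub-model set to obtain the i-th current sub-model set; and
    determining the I-th current sub-model set obtained by the iteration over i as the at least one second-object sub-model.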
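  6. The method according to claim 5, wherein the comparing the first sub-model corresponding to the i-th traversed first object with each current sub-model in the (i-1)-th current sub-model set to obtain a similar sub-model comprises:
    obtaining, from the first sub-model corresponding to the i-th traversed first object, multiple first target probability values corresponding to multiple target segments;
    obtaining, from each current sub-model in the (i-1)-th current sub-model set, multiple second target probability values corresponding to the multiple target segments;
    comparing the multiple first target probability values with the multiple second target probability values in one-to-one correspondence to obtain multiple minimum probability values, and determining the sum of the multiple minimum probability values as the similarity between the first sub-model corresponding to the i-th first object and each current sub-model; and
    selecting the highest similarity from the at least one similarity determined for the (i-1)-th current sub-model set, and determining the current sub-model in the (i-1)-th current sub-model set that corresponds to the highest similarity as the similar sub-model.
  7. The method according to claim 5, wherein the merging the first sub-model corresponding to the i-th first object with the similar sub-model to obtain a merged sub-model comprises:
    obtaining, from the first sub-model corresponding to the i-th first object, multiple first target probability values corresponding to multiple target segments;
    obtaining, from the similar sub-model, multiple to-be-merged probability values corresponding to the multiple target segments;
    comparing the multiple first target probability values with the multiple to-be-merged probability values in one-to-one correspondence to obtain multiple maximum probability values; and
    combining the multiple target segments with the multiple maximum probability values to obtain the merged sub-model.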
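  8. The method according to any one of claims 1 to 4, wherein the obtaining, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model comprises:
    obtaining, from the second preset object model, at least one target second-object sub-model corresponding to the second target object;
    obtaining at least one target similarity between the first target sub-model and each of the at least one target second-object sub-model;
    obtaining the highest target similarity from the at least one target similarity; and
    determining, among the at least one target second-object sub-model, the one corresponding to the highest target similarity as the second target sub-model.
  9. The method according to claim 8, wherein the obtaining a target maximum data volume corresponding to the second target sub-model comprises:
    when the highest target similarity is greater than a second preset similarity, determining the maximum data volume corresponding to the second target object in the second preset object model as the target maximum data volume corresponding to the second target sub-model; and
    when the highest target similarity is less than or equal to the second preset similarity, determining a preset data volume as the target maximum data volume corresponding to the second target sub-model.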
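  10. The method according to any one of claims 1 to 4, wherein the determining a first detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the abnormal data volume comprises:
    when the target data volume is greater than the abnormal data volume, determining that the behavior information to be detected is abnormal with respect to the first target object, the first detection result being that the behavior information to be detected is abnormal with respect to the first target object; and
    when the target data volume is less than or equal to the abnormal data volume, determining that the behavior information to be detected is normal with respect to the first target object, the first detection result being that the behavior information to be detected is normal with respect to the first target object.
  11. The method according to any one of claims 1 to 4, wherein the determining a second detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the target maximum data volume comprises:
    when the target data volume is greater than the target maximum data volume, determining that the behavior information to be detected is abnormal with respect to the second target object, the second detection result being that the behavior information to be detected is abnormal with respect to the second target object; and
    when the target data volume is less than or equal to the target maximum data volume, determining that the behavior information to be detected is normal with respect to the second target object, the second detection result being that the behavior information to be detected is normal with respect to the second target object.
  12. The method according to any one of claims 1 to 4, wherein the determining a target detection result of the behavior information to be detected by combining the first detection result and the second detection result comprises:
    when the first detection result is that the behavior information to be detected is abnormal with respect to the first target object and the second detection result is that it is abnormal with respect to the second target object, determining a target detection result indicating that the behavior information to be detected is abnormal;
    when the first detection result is that the behavior information to be detected is normal with respect to the first target object and the second detection result is that it is abnormal with respect to the second target object, determining a target detection result indicating that the behavior information to be detected is abnormal;
    when the first detection result is that the behavior information to be detected is normal with respect to the first target object and the second detection result is that it is normal with respect to the second target object, determining a target detection result indicating that the behavior information to be detected is normal; and
    when the first detection result is that the behavior information to be detected is abnormal with respect to the first target object and the second detection result is that it is normal with respect to the second target object, determining a target detection result indicating that the behavior information to be detected is normal.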
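  13. An abnormal behavior detection apparatus, comprising:
    an information acquisition module, configured to obtain behavior information to be detected, the behavior information to be detected comprising a first target object, a second target object and a target data volume;
    a first detection module, configured to obtain, from a first preset object model, a first target sub-model corresponding to the first target object;
    the first detection module being further configured to determine an abnormal data volume from the first target sub-model based on a preset model parameter, and to determine a first detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the abnormal data volume;
    a second detection module, configured to obtain, from a second preset object model, a second target sub-model that corresponds to the second target object and has the highest similarity to the first target sub-model;
    the second detection module being further configured to obtain a target maximum data volume corresponding to the second target sub-model, and to determine a second detection result corresponding to the behavior information to be detected based on a result of comparing the target data volume with the target maximum data volume; and
    a result determination module, configured to determine a target detection result of the behavior information to be detected by combining the first detection result and the second detection result.
  14. An abnormal behavior detection device, comprising:
    a memory, configured to store executable instructions; and
    a processor, configured to implement the abnormal behavior detection method according to any one of claims 1 to 12 when executing the executable instructions stored in the memory.
  15. A computer-readable storage medium storing executable instructions which, when executed by a processor, implement the abnormal behavior detection method according to any one of claims 1 to 12.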
PCT/CN2021/104999 2020-08-20 2021-07-07 异常行为检测方法、装置、电子设备及计算机可读存储介质 WO2022037299A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21857394.7A EP4120167A4 (en) 2020-08-20 2021-07-07 METHOD AND DEVICE FOR DETECTING ABNORMAL BEHAVIOR AND ELECTRONIC DEVICE AND COMPUTER READABLE STORAGE MEDIUM
JP2022554811A JP7430816B2 (ja) 2020-08-20 2021-07-07 異常行為検出方法、装置、電子機器及びコンピュータプログラム
US17/898,324 US20230004979A1 (en) 2020-08-20 2022-08-29 Abnormal behavior detection method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010840924.8 2020-08-20
CN202010840924.8A CN114078008A (zh) 2020-08-20 2020-08-20 异常行为检测方法、装置、设备及计算机可读存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/898,324 Continuation US20230004979A1 (en) 2020-08-20 2022-08-29 Abnormal behavior detection method and apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2022037299A1 true WO2022037299A1 (zh) 2022-02-24

Family

ID=80282949

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104999 WO2022037299A1 (zh) 2020-08-20 2021-07-07 异常行为检测方法、装置、电子设备及计算机可读存储介质

Country Status (5)

Country Link
US (1) US20230004979A1 (zh)
EP (1) EP4120167A4 (zh)
JP (1) JP7430816B2 (zh)
CN (1) CN114078008A (zh)
WO (1) WO2022037299A1 (zh)

Also Published As

Publication number Publication date
EP4120167A1 (en) 2023-01-18
EP4120167A4 (en) 2023-10-25
JP7430816B2 (ja) 2024-02-13
CN114078008A (zh) 2022-02-22
US20230004979A1 (en) 2023-01-05
JP2023517338A (ja) 2023-04-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21857394

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022554811

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2021857394

Country of ref document: EP

Effective date: 20221013

NENP Non-entry into the national phase

Ref country code: DE