CN112003834A - Abnormal behavior detection method and device - Google Patents


Publication number
CN112003834A
CN112003834A (application number CN202010752277.5A)
Authority
CN
China
Prior art keywords
vector representation
web page
event information
training
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010752277.5A
Other languages
Chinese (zh)
Other versions
CN112003834B (en)
Inventor
代维
郑霖
程文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruishu Information Technology Shanghai Co ltd
Original Assignee
Ruishu Information Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruishu Information Technology Shanghai Co ltd filed Critical Ruishu Information Technology Shanghai Co ltd
Priority to CN202010752277.5A priority Critical patent/CN112003834B/en
Publication of CN112003834A publication Critical patent/CN112003834A/en
Application granted granted Critical
Publication of CN112003834B publication Critical patent/CN112003834B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/02Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/535Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides an abnormal behavior detection method and device, wherein the method comprises the following steps: a threat awareness platform acquires user operation event information, collected by front-end code, that occurs on a web page; based on a pre-trained anomaly detection model, anomaly detection is performed on the user operation event information occurring on each web page to determine whether abnormal operation behavior exists on the web page; wherein the anomaly detection model is pre-trained based on a deep adversarial neural network. Through the method and the device, whether abnormal operation behavior exists on a web page can be identified, thereby improving the accuracy of anti-crawler technology.

Description

Abnormal behavior detection method and device
[ technical field ]
The present application relates to the field of computer security technologies, and in particular, to a method and an apparatus for detecting abnormal behavior.
[ background of the invention ]
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Crawlers are a means of obtaining website information in bulk by technical means. On the one hand, large numbers of crawlers can seriously consume server performance and bandwidth, affect normal user access, and in severe cases amount to a DDoS attack. On the other hand, a website's important data, information, and assets must not be exposed at will; if they can be easily stolen, serious losses result. Anti-crawler mechanisms have therefore emerged. However, as the attack-and-defense confrontation in online service security evolves, automated crawlers have gradually developed to simulate normal user operations in order to bypass anti-crawler mechanisms. It is therefore necessary to distinguish normal user behavior from abnormal behavior, so that abnormal behavior simulating a user can be detected.
[ summary of the invention ]
The invention provides an abnormal behavior detection method and device, which are used for identifying abnormal behavior that simulates a user, thereby improving the accuracy of anti-crawler technology.
The specific technical scheme is as follows:
in a first aspect, the present application provides a method for detecting abnormal behavior, including:
the threat awareness platform acquires user operation event information, collected by front-end code, that occurs on a web page;
based on an anomaly detection model obtained by pre-training, carrying out anomaly detection on user operation event information occurring on each web page to determine whether an abnormal operation behavior exists on the web page;
wherein the anomaly detection model is pre-trained based on a deep adversarial neural network.
According to a preferred embodiment of the present application, the front-end code comprises: a script JS code embedded in a web page, a code embedded in a mobile application, or a code embedded in a desktop client;
the operation event information includes: mouse keyboard event information, touch screen event information, or motion sensor event information.
According to a preferred embodiment of the present application, the performing, based on the abnormality detection model obtained through pre-training, abnormality detection on the user operation event information occurring on each web page includes:
acquiring vector representation of user operation event information occurring on the web page;
encoding the vector representation by using an encoder in the anomaly detection model to obtain a hidden vector;
and inputting the vector representation and the hidden vector into a discrimination network in the anomaly detection model, and using the discrimination network to determine whether the vector representation corresponds to normal operation behavior, thereby determining whether abnormal operation behavior exists on the web page.
According to a preferred embodiment of the present application, the anomaly detection model is obtained by pre-training in the following manner:
acquiring normal user operation event information occurring on a web page as training data;
obtaining a vector representation of the training data;
inputting the vector representation of the training data into an encoder for encoding to obtain a hidden vector;
a generation network in the deep adversarial neural network generates a reconstructed vector representation based on a random vector;
inputting the vector representation of the training data together with the hidden vector into a discrimination network in the deep adversarial neural network, and inputting the reconstructed vector representation together with the random vector into the same discrimination network, so as to respectively obtain a first probability that the vector representation of the training data belongs to normal operation behavior and a second probability that the reconstructed vector representation belongs to normal operation behavior;
the training targets are: minimizing the distance between the vector representation of the training data and the reconstructed vector representation, and maximizing the difference between the first probability and the second probability;
and after training is finished, the encoder and the discrimination network form the anomaly detection model.
According to a preferred embodiment of the present application, during training, the model parameters of the encoder and the generation network are optimized according to the distance between the vector representation of the training data and the reconstructed vector representation;
and the model parameters of the discrimination network are optimized according to the difference between the first probability and the second probability.
According to a preferred embodiment of the present application, the method further comprises:
counting the conditions of the web pages with abnormal operation behaviors corresponding to the same user identification, application identification or equipment identification;
and determining the abnormal user identification, application identification or equipment identification according to the counted conditions.
In a second aspect, the present application further provides an abnormal behavior detection apparatus, including:
the acquisition unit is used for acquiring user operation event information generated on a web page acquired by a front-end code;
the detection unit is used for performing anomaly detection on the user operation event information occurring on each web page based on a pre-trained anomaly detection model, so as to determine whether abnormal operation behavior exists on the web page; wherein the anomaly detection model is pre-trained based on a deep adversarial neural network.
According to a preferred embodiment of the present application, the front-end code comprises: a script JS code embedded in a web page, a code embedded in a mobile application, or a code embedded in a desktop client;
the operation event information includes: mouse keyboard event information, touch screen event information, or motion sensor event information.
According to a preferred embodiment of the present application, the detecting unit is specifically configured to:
acquiring vector representation of user operation event information occurring on the web page;
encoding the vector representation by using an encoder in the anomaly detection model to obtain a hidden vector;
and inputting the vector representation and the hidden vector into a discrimination network in the anomaly detection model, and using the discrimination network to determine whether the vector representation corresponds to normal operation behavior, thereby determining whether abnormal operation behavior exists on the web page.
According to a preferred embodiment of the present application, the apparatus further includes a training unit, configured to pre-train and obtain the anomaly detection model in the following manner:
acquiring normal user operation event information occurring on a web page as training data;
obtaining a vector representation of the training data;
inputting the vector representation of the training data into an encoder for encoding to obtain a hidden vector;
a generation network in the deep adversarial neural network generates a reconstructed vector representation based on a random vector;
inputting the vector representation of the training data together with the hidden vector into a discrimination network in the deep adversarial neural network, and inputting the reconstructed vector representation together with the random vector into the same discrimination network, so as to respectively obtain a first probability that the vector representation of the training data belongs to normal operation behavior and a second probability that the reconstructed vector representation belongs to normal operation behavior;
the training targets are: minimizing the distance between the vector representation of the training data and the reconstructed vector representation, and maximizing the difference between the first probability and the second probability;
and after training is finished, the encoder and the discrimination network form the anomaly detection model.
According to a preferred embodiment of the present application, during training, the training unit optimizes the model parameters of the encoder and the generation network according to the distance between the vector representation of the training data and the reconstructed vector representation, and optimizes the model parameters of the encoder and the discrimination network according to the difference between the first probability and the second probability.
According to a preferred embodiment of the present application, the apparatus further comprises:
the statistical unit is used for counting the conditions of the web pages with abnormal operation behaviors corresponding to the same user identifier, application identifier or equipment identifier; and determining the abnormal user identification, application identification or equipment identification according to the counted conditions.
In a third aspect, the present application provides an apparatus comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method as described above.
In a fourth aspect, the present application provides a storage medium containing computer-executable instructions for performing the method as described above when executed by a computer processor.
According to the technical scheme, user operation event information occurring on web pages and collected by front-end code is acquired, and anomaly detection is performed on the user operation events occurring on each web page based on a pre-trained anomaly detection model, so that whether abnormal operation behavior exists on a web page can be identified, thereby improving the accuracy of anti-crawler technology.
[ description of the drawings ]
FIG. 1 illustrates an exemplary system architecture to which an abnormal behavior detection method or apparatus of an embodiment of the present invention may be applied;
FIG. 2 is a flow chart of a method provided by an embodiment of the present application;
FIG. 3 is a block diagram of a deep countermeasure neural network used in training an anomaly detection model according to an embodiment of the present application;
FIG. 4 is a block diagram of an anomaly detection model according to an embodiment of the present application;
FIG. 5 is a block diagram of an apparatus according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of an exemplary computer system/server suitable for use in implementing embodiments of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 shows an exemplary system architecture to which an abnormal behavior detection method or apparatus according to an embodiment of the present invention may be applied.
As shown in fig. 1, the system architecture may include a terminal device 101, a network 102, and a server 103. Network 102 is the medium used to provide communication links between terminal devices 101 and server 103. Network 102 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal device 101 to interact with server 103 through network 102. Various applications, such as a voice interaction application, a web browser application, a communication-type application, etc., may be installed on the terminal device 101.
Terminal device 101 may be any terminal device including, but not limited to, a smartphone, a smart tablet, a laptop, a PC, an intelligent wearable device, and so on. The browsing and operation of the web page may be performed by a browser, a mobile application (referring to an application installed in the mobile device), and a desktop client (referring to a client installed in a PC or a notebook computer) in the terminal device 101. In the application, a code, called a front-end code, can be embedded in a web page, a mobile application or a desktop client, and is responsible for collecting user operation event information occurring on the web page and uploading the information to a threat perception platform. The threat awareness platform may be located and run in the server 103 described above. It may be implemented as a plurality of software or software modules (for example, for providing distributed services), or as a single software or software module, which is not specifically limited herein. The server 103 may be a single server or a server group including a plurality of servers.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 is a flowchart of a method provided by an embodiment of the present application, and as shown in fig. 2, the method may include the following steps:
in 201, the threat awareness platform obtains information of user operation events occurring on a web page collected by the front-end code.
This application identifies abnormal behavior occurring on web pages. Because web pages are highly interactive, JS code (which runs after the web page is loaded) can be embedded in the web page, or code can be embedded in mobile applications or desktop clients; this code collects the user operation event information occurring on the web page.
The collected operation events may include mouse and keyboard events occurring on a web page of a desktop client, touch screen events and motion sensor events occurring on a web page of a mobile application, and the like.
The front-end code can upload the collected user operation event information occurring on the web page to the threat awareness platform in a streaming or periodic mode so as to be stored by the threat awareness platform. The threat awareness platform can acquire user operation event information occurring on a specific web page, and acquire a user ID (identification), an application ID or a device ID from which the user operation event information originates through session information. The application ID may be an ID of the desktop client or an ID of the mobile application. If the user is in an anonymous state, a tourist state, or the like during browsing the web page, the user ID is not usually carried in the session information, but may carry an application ID or a device ID. If the user is in a login state during browsing the web page, the session information usually carries the user ID.
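As a minimal illustration of the fallback order described above (user ID when logged in, otherwise application or device ID), the following Python sketch resolves an identifier from session information; the key names are illustrative assumptions, not fields defined by the patent:

```python
def resolve_identifier(session):
    """Prefer the user ID (login state); otherwise fall back to an
    application ID or device ID (anonymous/guest state).
    Key names here are assumptions for illustration only."""
    for key in ("user_id", "app_id", "device_id"):
        value = session.get(key)
        if value:
            return key, value
    return None, None
```

For example, a logged-in session yields the user ID even when a device ID is also present, while a guest session falls through to the application or device ID.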
At 202, the threat awareness platform performs data cleansing and normalization processing on the collected user operation event information occurring on the web page.
The data cleaning of the user operation event information may include, but is not limited to, filtering out user operation event information with an incorrect data format or with missing data, filling in missing fields for some user operation event information, and the like.
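The cleaning step described here can be sketched as follows; the field names, and the choice of which fields are required versus optional, are illustrative assumptions:

```python
# Hypothetical required fields of one operation event record (assumption).
REQUIRED_FIELDS = {"op_type", "ip", "client_id", "device_type", "version", "timestamp"}

def clean_events(events):
    """Filter out records with a wrong format or missing feature data,
    and fill in optional missing fields for the rest."""
    cleaned = []
    for ev in events:
        if not isinstance(ev, dict):
            continue  # incorrect data format -> filtered out
        if not REQUIRED_FIELDS.issubset(ev):
            continue  # missing feature data -> filtered out
        ev = dict(ev)
        ev.setdefault("session_id", "unknown")  # complement optional info
        cleaned.append(ev)
    return cleaned
```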
In 203, the threat awareness platform performs anomaly detection on the user operation event information occurring on each web page based on the anomaly detection model obtained by pre-training, so as to determine whether an abnormal operation behavior exists on the web page.
The anomaly detection method of the present application is based on a deep neural network architecture rather than the traditional principal component analysis method. Specifically, the anomaly detection model adopted by the application is obtained by pre-training based on a deep adversarial neural network. For ease of understanding, the training process of the anomaly detection model is described in detail first.
As a preferred embodiment, the training data for training the anomaly detection model may include normal user operation event information occurring on the web page. That is, some user operation events generated on the web page by normal user behaviors may be collected in advance, and these user operation events may also include a mouse-keyboard event occurring on the web page of the desktop client, a touch screen event and a motion sensor event occurring on the web page of the mobile application, and the like. For example, normal mouse and keyboard events occurring on a web page of a desktop client may be collected in advance, thereby constituting model training data for anomaly detection. Alternatively, normal touch screen events and motion sensor events occurring on a web page of a mobile application may be collected in advance, thereby constituting model training data for anomaly detection.
Because the collected training data is usually a sequence of operations, each operation in the sequence may include information such as the operation type, the IP address from which the operation originates, the client identifier, the device type, the client version number, and a timestamp; this information serves as feature data characterizing the operation. Each piece of training data can be represented as feature data comprising a series of operations, with each operation represented as a vector x:
x = [f_1, f_2, ..., f_M]
where f_i denotes the vector representation of the i-th type of feature data, the feature data being mapped into the same vector space in one-hot form, and M is the number of feature types an operation contains. Since the amount of feature data of each type is not necessarily the same, a sufficiently large dimension can be predefined, and normalization is then applied via f_num so that the feature data share the same dimensionality.
The representations of the operations in a piece of training data are concatenated to obtain the vector representation v(x) of that training data. Since the number of operations in each piece of training data is not necessarily the same, normalization during concatenation can likewise ensure that all training data share the same dimensionality.
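The vectorization described above (one-hot feature vectors per operation, concatenation, and padding to a common dimensionality) can be sketched as follows; the vocabularies, the per-feature dimension DIM, and the sequence length MAX_OPS are illustrative assumptions:

```python
import numpy as np

DIM = 8       # predefined, sufficiently large per-feature dimension (assumed value)
MAX_OPS = 4   # fixed number of operations per sequence after padding (assumed value)

# Hypothetical vocabularies for two feature types of an operation.
VOCABS = {
    "op_type": ["click", "scroll", "keypress"],
    "device_type": ["pc", "phone"],
}

def encode_operation(op):
    """One-hot each feature type into the same DIM-sized space and concatenate."""
    parts = []
    for feat, vocab in VOCABS.items():
        v = np.zeros(DIM)
        v[vocab.index(op[feat])] = 1.0
        parts.append(v)
    return np.concatenate(parts)

def encode_sequence(ops):
    """Concatenate operation vectors, padding (or truncating) so every
    sequence v(x) shares one fixed dimensionality."""
    vecs = [encode_operation(op) for op in ops[:MAX_OPS]]
    while len(vecs) < MAX_OPS:
        vecs.append(np.zeros(len(VOCABS) * DIM))
    return np.concatenate(vecs)
```

Every sequence then maps to a vector of length `MAX_OPS * len(VOCABS) * DIM`, regardless of how many operations it originally contained.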
Fig. 3 is a block diagram of the deep adversarial neural network used in training the anomaly detection model according to an embodiment of the present application; as shown in fig. 3, it includes an encoder E, a generation network G, and a discrimination network D.
The vector representation v(x) of the training data is input into the encoder E for encoding, and v(x) is mapped to a hidden vector z = E(v(x)).
The generation network G in the deep adversarial neural network is responsible for generating a vector representation v̂(x) = G(z') from a random vector z'. The generation network G learns the distribution of normal users' operation features, i.e. v̂(x) is made as close as possible to v(x); v̂(x) is the reconstructed vector of v(x).
The pair (v(x), z) and the pair (v̂(x), z') are input into the discrimination network, which respectively outputs the probability D(v(x), z) that v(x) belongs to normal operation behavior and the probability D(v̂(x), z') that v̂(x) belongs to normal operation behavior.
The training targets of the deep adversarial network are: minimize the distance between v(x) and v̂(x), and maximize the difference between D(v(x), z) and D(v̂(x), z').
Specifically, during training, the generation network G and the discrimination network D may be trained alternately. The number of training steps for G and D in each round of alternation can be set flexibly. For example, G may be trained once and then D once; or G may be trained several times before D is trained again, and so on.
When training the generation network G, the model parameters of the encoder E and the generation network G are optimized according to the distance between v(x) and v̂(x). When training the discrimination network D, the model parameters of D are optimized according to the difference between D(v(x), z) and D(v̂(x), z'). Specifically, loss functions can be constructed and the training optimized by gradient descent.
When a training end condition is met (for example, the value of the loss function meets a preset requirement, or the number of training iterations meets a preset requirement), the encoder E and the discrimination network D obtained by training form the anomaly detection model. That is, in the present application, the generation network G functions only during training; after training is completed, the anomaly detection model does not include G.
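The quantities involved in one round of this training scheme can be sketched with toy linear stand-ins for E, G, and D; all shapes and weights below are illustrative assumptions, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 16, 4

# Toy linear stand-ins (assumptions) for the three networks.
W_e = rng.normal(size=(d_hid, d_in)) * 0.1   # E: v(x) -> hidden vector z
W_g = rng.normal(size=(d_in, d_hid)) * 0.1   # G: random vector z' -> reconstruction
W_d = rng.normal(size=(1, d_in + d_hid)) * 0.1  # D: (vector, code) -> probability

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def D(vec, code):
    """Discriminator score for a (vector representation, code) pair."""
    return float(sigmoid(W_d @ np.concatenate([vec, code]))[0])

v_x = rng.normal(size=d_in)        # vector representation of one training sample
z = W_e @ v_x                      # hidden vector z = E(v(x))
z_prime = rng.normal(size=d_hid)   # random vector fed to G
v_hat = W_g @ z_prime              # reconstruction v̂(x) = G(z')

recon_loss = float(np.linalg.norm(v_x - v_hat))  # minimized when updating E and G
p1 = D(v_x, z)                                   # first probability
p2 = D(v_hat, z_prime)                           # second probability
disc_gain = p1 - p2                              # maximized when updating D
```

In an actual implementation the alternating gradient updates on `recon_loss` and `disc_gain` would be handled by a deep-learning framework; this sketch only shows which quantities each phase optimizes.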
In step 203, when the anomaly detection model is used for anomaly detection, as shown in fig. 4, the user operation event information x occurring on the web page is first represented as a vector v(x) and then input into the encoder E for encoding to obtain the hidden vector z = E(v(x)). The process of obtaining v(x) is the same as described for model training and is not repeated here. The pair (v(x), z) is then input into the discrimination network D, which outputs D(v(x), z). This value reflects the probability that the user operation event information belongs to normal operation, and whether abnormal operation behavior exists on the web page can be determined from it. For example, if D(v(x), z) is lower than a certain threshold, the user operation event information is considered not to belong to normal operation behavior, and abnormal operation behavior exists on the web page.
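A minimal sketch of this detection step, assuming a trained encoder and discrimination network are available as callables; the toy stand-ins below exist only to exercise the flow and are not the patent's trained networks:

```python
import numpy as np

def detect(v_x, encoder, discriminator, threshold=0.5):
    """z = E(v(x)); score = D(v(x), z); abnormal when the score is
    below the threshold. The threshold value is an assumption."""
    z = encoder(v_x)
    score = discriminator(v_x, z)
    return score < threshold, score

# Toy stand-ins (assumptions) just to exercise the flow:
toy_encoder = lambda v: v[:2]
toy_discriminator = lambda v, z: float(1.0 / (1.0 + np.exp(-(v.sum() + z.sum()))))
```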
In 204, statistics are gathered on the web pages with abnormal operation behavior that correspond to the same user ID, application ID, or device ID, and the abnormal user, application, or device identifier is determined from these statistics.
In this application, the web pages with abnormal operation behavior corresponding to the same user ID, application ID, or device ID can be counted. For example, if the number of web pages with abnormal operation behavior is greater than or equal to a preset count threshold, the user, application, or device is considered abnormal. As another example, if the proportion of web pages with abnormal operation behavior is greater than or equal to a preset ratio threshold, the user, application, or device is considered abnormal.
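The two statistical criteria just described (a count threshold and a proportion threshold) might be combined as in the following sketch; the threshold values are illustrative assumptions:

```python
from collections import defaultdict

def flag_abnormal_ids(page_results, count_threshold=3, ratio_threshold=0.5):
    """page_results: list of (identifier, is_abnormal) pairs, one per
    detected web page; an identifier is flagged when either its abnormal
    page count or its abnormal page proportion crosses a threshold."""
    totals = defaultdict(int)
    abnormal = defaultdict(int)
    for ident, is_abnormal in page_results:
        totals[ident] += 1
        abnormal[ident] += int(is_abnormal)
    flagged = set()
    for ident in totals:
        if (abnormal[ident] >= count_threshold
                or abnormal[ident] / totals[ident] >= ratio_threshold):
            flagged.add(ident)
    return flagged
```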
The user, application, or device identifiers determined to be abnormal can be displayed to administrators through an interface provided by the threat awareness platform, or provided to them via system messages, SMS, e-mail, and so on. In addition, the web page information with abnormal operation behavior corresponding to the abnormal user, application, or device identifier, together with the corresponding abnormal operation behavior information, can be displayed at the same time. Administrators can then analyze this information and further close the corresponding account, or add specific users, applications, and devices to a blacklist.
The above is a detailed description of the method provided in the present application, and the following is a detailed description of the apparatus provided in the present application with reference to the embodiments.
Fig. 5 is a structural diagram of an apparatus provided in an embodiment of the present application, where the apparatus is disposed at a server side to implement the functions of the threat awareness platform. As shown in fig. 5, the apparatus may include: the acquisition unit 01 and the detection unit 02 may further include a training unit 03 and a statistic unit 04. Wherein the main functions of each unit include:
the obtaining unit 01 is responsible for obtaining user operation event information occurring on a web page collected by a front-end code.
Among other things, front-end code may include but is not limited to: a script JS code embedded in a web page, a code embedded in a mobile application, or a code embedded in a desktop client.
Operational event information may include, but is not limited to: mouse keyboard event information, touch screen event information, or motion sensor event information.
Furthermore, the obtaining unit 01 may also perform data cleansing on the user operation event information, which may include, but is not limited to, filtering out user operation event information with incorrect data format or missing data, performing complementary processing on some user operation event information, and the like.
The detection unit 02 is responsible for performing anomaly detection on the user operation event information occurring on each web page based on a pre-trained anomaly detection model, so as to determine whether abnormal operation behavior exists on the web page; the anomaly detection model is pre-trained based on a deep adversarial neural network.
Specifically, the detection unit 02 may acquire a vector representation of user operation event information occurring on a web page; encoding the vector representation by using an encoder in the anomaly detection model to obtain a hidden vector; and inputting the vector representation and the hidden vector into a discrimination network in the anomaly detection model, and determining whether the vector representation is a normal operation behavior by using the discrimination network to determine whether the web page has the abnormal operation behavior.
The training unit 03 is responsible for pre-training the anomaly detection model in the following manner: acquiring normal user operation event information occurring on a web page as training data; obtaining a vector representation of the training data; inputting the vector representation of the training data into an encoder to obtain a hidden vector; having a generation network in the deep adversarial neural network generate a reconstructed vector representation based on a random vector; and inputting the vector representation of the training data together with the hidden vector, and the reconstructed vector representation together with the random vector, into a discrimination network in the deep adversarial neural network, so as to respectively obtain a first probability that the vector representation of the training data belongs to normal operation behavior and a second probability that the reconstructed vector representation belongs to normal operation behavior.
The training target is: minimize the distance between the vector representation of the training data and the reconstructed vector representation, and maximize the difference between the first probability and the second probability.
After training is finished, the encoder and the discrimination network together form the anomaly detection model.
Specifically, during training, the generation network and the discrimination network may be trained alternately. The number of training iterations each receives per alternation round can be set flexibly: for example, the generation network may be trained once and then the discrimination network once; or the generation network may be trained several times before the discrimination network is trained once, and so on.
During training, the training unit 03 optimizes the model parameters of the encoder and the generation network according to the distance between the vector representation of the training data and the reconstructed vector representation, and optimizes the model parameters of the encoder and the discrimination network according to the difference between the first probability and the second probability.
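The forward pass and the two loss terms of this training scheme can be sketched as follows. This is a hedged sketch only: gradient updates and the alternation schedule are omitted, and all dimensions and weights are illustrative assumptions rather than the patent's actual model.

```python
# Forward pass and loss terms for the adversarial training described above:
# encoder E, generation network G, discrimination network D.
import numpy as np

rng = np.random.default_rng(1)
D_IN, D_HID = 8, 3
W_E = rng.normal(size=(D_HID, D_IN))        # encoder parameters
W_G = rng.normal(size=(D_IN, D_HID))        # generation-network parameters
w_D = rng.normal(size=D_IN + D_HID)         # discrimination-network parameters

def E(x):
    return np.tanh(W_E @ x)

def G(z):
    return np.tanh(W_G @ z)

def D(x, h):
    return 1.0 / (1.0 + np.exp(-w_D @ np.concatenate([x, h])))

x = rng.normal(size=D_IN)                   # vector representation of training data
z = rng.normal(size=D_HID)                  # random vector
x_rec = G(z)                                # reconstructed vector representation

p1 = D(x, E(x))                             # first probability: real pair
p2 = D(x_rec, z)                            # second probability: generated pair

recon_loss = np.linalg.norm(x - x_rec)      # distance to minimize (encoder, generator)
disc_gain = p1 - p2                         # difference to maximize (discriminator)
```

In a full implementation each alternation round would update `W_E`/`W_G` to reduce `recon_loss` and update the relevant parameters to increase `disc_gain`, matching the optimization split described in the preceding paragraph.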
The statistics unit 04 is responsible for collecting, per user identifier, application identifier, or device identifier, statistics on the web pages where abnormal operation behavior was detected, and for determining abnormal user identifiers, application identifiers, or device identifiers according to those statistics.
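The statistics step can be illustrated with a short stdlib sketch. The record layout and the threshold value are assumptions for the example; the patent does not specify how the counted statistics are turned into a decision.

```python
# Sketch of the statistics unit: count, per user identifier, how many
# web pages showed abnormal operation behavior, then flag identifiers
# whose count exceeds an (assumed) threshold.
from collections import Counter

detections = [                      # (user_id, page, is_abnormal) records
    ("u1", "/login", True),
    ("u1", "/pay", True),
    ("u1", "/home", True),
    ("u2", "/login", False),
    ("u2", "/pay", True),
]

abnormal_pages = Counter(uid for uid, _page, abnormal in detections if abnormal)
THRESHOLD = 2
abnormal_users = {uid for uid, n in abnormal_pages.items() if n > THRESHOLD}
```

The same grouping works unchanged for application identifiers or device identifiers; only the key extracted from each record differs.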
FIG. 6 illustrates a block diagram of an exemplary computer system/server suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 6 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system/server 012 is embodied as a general purpose computing device. The components of computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 that couples various system components including the system memory 028 and the processing unit 016.
Bus 018 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer system/server 012 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer system/server 012 and includes both volatile and nonvolatile media, removable and non-removable media.
System memory 028 can include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)030 and/or cache memory 032. The computer system/server 012 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 034 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be connected to bus 018 via one or more data media interfaces. Memory 028 can include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the present invention.
Program/utility 040 having a set (at least one) of program modules 042 can be stored, for example, in memory 028, such program modules 042 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof might include an implementation of a network environment. Program modules 042 generally perform the functions and/or methodologies of embodiments of the present invention as described herein.
The computer system/server 012 may also communicate with one or more external devices 014 (e.g., a keyboard, a pointing device, a display 024, etc.), with one or more devices that enable a user to interact with the computer system/server 012, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer system/server 012 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 022. Also, the computer system/server 012 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 020. As shown, the network adapter 020 communicates with the other modules of the computer system/server 012 via bus 018. It should be appreciated that although not shown in Fig. 6, other hardware and/or software modules may be used in conjunction with the computer system/server 012, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 016 executes programs stored in the system memory 028, thereby executing various functional applications and data processing, such as implementing the method flow provided by the embodiment of the present invention.
The computer program described above may be provided in a computer storage medium encoded with a computer program that, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above-described embodiments of the invention. For example, the method flows provided by the embodiments of the invention are executed by one or more processors described above.
With the development of time and technology, the meaning of "medium" has become increasingly broad, and the propagation path of a computer program is no longer limited to tangible media; it may also be downloaded directly from a network, and so on. Any combination of one or more computer-readable media may be employed. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (14)

1. A method for detecting abnormal behavior, the method comprising:
the threat perception platform acquires user operation event information generated on a web page collected by a front-end code;
based on an anomaly detection model obtained by pre-training, carrying out anomaly detection on user operation event information occurring on each web page to determine whether an abnormal operation behavior exists on the web page;
wherein the anomaly detection model is pre-trained based on a deep adversarial neural network.
2. The method of claim 1, wherein the front-end code comprises: JavaScript (JS) code embedded in a web page, code embedded in a mobile application, or code embedded in a desktop client;
the operation event information comprises: mouse and keyboard event information, touch screen event information, or motion sensor event information.
3. The method according to claim 1, wherein the performing anomaly detection on the user operation event information occurring on each web page based on the anomaly detection model obtained through pre-training comprises:
acquiring vector representation of user operation event information occurring on the web page;
encoding the vector representation by using an encoder in the anomaly detection model to obtain a hidden vector;
and inputting the vector representation and the hidden vector into a discrimination network in the anomaly detection model, and determining, by the discrimination network, whether the vector representation corresponds to normal operation behavior, so as to determine whether abnormal operation behavior exists on the web page.
4. The method of claim 1, wherein the anomaly detection model is pre-trained by:
acquiring normal user operation event information occurring on a web page as training data;
obtaining a vector representation of the training data;
inputting the vector representation of the training data into an encoder for encoding to obtain a hidden vector;
generating, by a generation network in the deep adversarial neural network, a reconstructed vector representation based on a random vector;
inputting the vector representation of the training data and the hidden vector into a discrimination network in the deep adversarial neural network, and inputting the reconstructed vector representation and the random vector into the discrimination network in the deep adversarial neural network, so as to respectively obtain a first probability that the vector representation of the training data belongs to normal operation behavior and a second probability that the reconstructed vector representation belongs to normal operation behavior;
the training target is: minimizing a distance between the vector representation of the training data and the reconstructed vector representation, and maximizing a difference between the first probability and the second probability;
and after the training is finished, the encoder and the discrimination network form the anomaly detection model.
5. The method according to claim 4, characterized in that in the training process, model parameters of the encoder and the generating network are optimized in dependence on a distance between a vector representation of the training data and the reconstructed vector representation;
and optimizing the model parameters of the discrimination network according to the difference between the first probability and the second probability.
6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
counting, for the same user identifier, application identifier, or device identifier, the web pages on which abnormal operation behavior exists;
and determining an abnormal user identifier, application identifier, or device identifier according to the counted statistics.
7. An abnormal behavior detection apparatus, characterized in that the apparatus comprises:
the acquisition unit is used for acquiring user operation event information generated on a web page acquired by a front-end code;
the detection unit is used for performing anomaly detection on the user operation event information occurring on each web page based on an anomaly detection model obtained by pre-training, so as to determine whether abnormal operation behavior exists on the web page; wherein the anomaly detection model is pre-trained based on a deep adversarial neural network.
8. The apparatus of claim 7, wherein the front-end code comprises: JavaScript (JS) code embedded in a web page, code embedded in a mobile application, or code embedded in a desktop client;
the operation event information comprises: mouse and keyboard event information, touch screen event information, or motion sensor event information.
9. The apparatus according to claim 7, wherein the detection unit is specifically configured to:
acquiring vector representation of user operation event information occurring on the web page;
encoding the vector representation by using an encoder in the anomaly detection model to obtain a hidden vector;
and inputting the vector representation and the hidden vector into a discrimination network in the anomaly detection model, and determining, by the discrimination network, whether the vector representation corresponds to normal operation behavior, so as to determine whether abnormal operation behavior exists on the web page.
10. The apparatus of claim 7, further comprising a training unit for pre-training the anomaly detection model by:
acquiring normal user operation event information occurring on a web page as training data;
obtaining a vector representation of the training data;
inputting the vector representation of the training data into an encoder for encoding to obtain a hidden vector;
generating, by a generation network in the deep adversarial neural network, a reconstructed vector representation based on a random vector;
inputting the vector representation of the training data and the hidden vector into a discrimination network in the deep adversarial neural network, and inputting the reconstructed vector representation and the random vector into the discrimination network in the deep adversarial neural network, so as to respectively obtain a first probability that the vector representation of the training data belongs to normal operation behavior and a second probability that the reconstructed vector representation belongs to normal operation behavior;
the training target is: minimizing a distance between the vector representation of the training data and the reconstructed vector representation, and maximizing a difference between the first probability and the second probability;
and after the training is finished, the encoder and the discrimination network form the anomaly detection model.
11. The apparatus according to claim 10, wherein during training the training unit optimizes model parameters of the encoder and the generation network according to a distance between the vector representation of the training data and the reconstructed vector representation; and optimizes the model parameters of the encoder and the discrimination network according to the difference between the first probability and the second probability.
12. The apparatus of any of claims 7 to 11, further comprising:
the statistics unit is used for counting, for the same user identifier, application identifier, or device identifier, the web pages on which abnormal operation behavior exists; and determining an abnormal user identifier, application identifier, or device identifier according to the counted statistics.
13. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A storage medium containing computer-executable instructions for performing the method of any one of claims 1-6 when executed by a computer processor.
CN202010752277.5A 2020-07-30 2020-07-30 Abnormal behavior detection method and device Active CN112003834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010752277.5A CN112003834B (en) 2020-07-30 2020-07-30 Abnormal behavior detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010752277.5A CN112003834B (en) 2020-07-30 2020-07-30 Abnormal behavior detection method and device

Publications (2)

Publication Number Publication Date
CN112003834A true CN112003834A (en) 2020-11-27
CN112003834B CN112003834B (en) 2022-09-23

Family

ID=73462491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010752277.5A Active CN112003834B (en) 2020-07-30 2020-07-30 Abnormal behavior detection method and device

Country Status (1)

Country Link
CN (1) CN112003834B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565271A (en) * 2020-12-07 2021-03-26 瑞数信息技术(上海)有限公司 Web attack detection method and device
CN112989348A (en) * 2021-04-15 2021-06-18 中国电子信息产业集团有限公司第六研究所 Attack detection method, model training method, device, server and storage medium
CN116756453A (en) * 2023-08-16 2023-09-15 浙江飞猪网络技术有限公司 Method, equipment and medium for user anomaly analysis and model training based on page

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635993A (en) * 2018-10-23 2019-04-16 平安科技(深圳)有限公司 Operation behavior monitoring method and device based on prediction model
CN110086776A (en) * 2019-03-22 2019-08-02 国网河南省电力公司经济技术研究院 Intelligent substation Network Intrusion Detection System and detection method based on deep learning
US20190251401A1 (en) * 2018-02-15 2019-08-15 Adobe Inc. Image composites using a generative adversarial neural network
US20190339685A1 (en) * 2016-05-09 2019-11-07 Strong Force Iot Portfolio 2016, Llc Methods and systems for data collection, learning, and streaming of machine signals for analytics and maintenance using the industrial internet of things
CN110958207A (en) * 2018-09-26 2020-04-03 瑞数信息技术(上海)有限公司 Attack detection method, device, equipment and computer storage medium
CN111064745A (en) * 2019-12-30 2020-04-24 厦门市美亚柏科信息股份有限公司 Self-adaptive back-climbing method and system based on abnormal behavior detection
CN111104616A (en) * 2018-10-26 2020-05-05 阿里巴巴集团控股有限公司 Webpage processing method and device
CN111223040A (en) * 2020-01-09 2020-06-02 北京市商汤科技开发有限公司 Network training method and device and image generation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GABRIEL S. SIMOES, JÔNATAS WEHRMANN, RODRIGO C. BARROS: "Attention-based Adversarial Training for Seamless Nudity Censorship", 2019 International Joint Conference on Neural Networks (IJCNN) *
GE Kaiqiang, CHEN Tieming: "A Survey of Human-Computer Interaction Security Attack and Defense", Telecommunications Science *

Also Published As

Publication number Publication date
CN112003834B (en) 2022-09-23

Similar Documents

Publication Publication Date Title
CN112003834B (en) Abnormal behavior detection method and device
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
CN106022349B (en) Method and system for device type determination
CN110222513B (en) Abnormality monitoring method and device for online activities and storage medium
US10489715B2 (en) Fingerprinting and matching log streams
CN111915086A (en) Abnormal user prediction method and equipment
CN115357470A (en) Information generation method and device, electronic equipment and computer readable medium
CN116015842A (en) Network attack detection method based on user access behaviors
CN117349102B (en) Digital twin operation and maintenance data quality inspection method, system and medium
US20220179764A1 (en) Multi-source data correlation extraction for anomaly detection
CN113971284B (en) JavaScript-based malicious webpage detection method, equipment and computer readable storage medium
CN113132393A (en) Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium
CN114500075B (en) User abnormal behavior detection method and device, electronic equipment and storage medium
CN115964701A (en) Application security detection method and device, storage medium and electronic equipment
CN115589339A (en) Network attack type identification method, device, equipment and storage medium
CN112003833A (en) Abnormal behavior detection method and device
CN115659351A (en) Information security analysis method, system and equipment based on big data office
Ahmed Khan et al. Generating realistic IoT‐based IDS dataset centred on fuzzy qualitative modelling for cyber‐physical systems
KR20230059607A (en) Method for Automating failure prediction of virtual machines and servers through log message analysis, apparatus and system thereof
CN110674839B (en) Abnormal user identification method and device, storage medium and electronic equipment
CN114338195A (en) Web traffic anomaly detection method and device based on improved isolated forest algorithm
CN115145623A (en) White box monitoring method, device, equipment and storage medium of software business system
CN112000559A (en) Abnormal equipment detection method and device
KR100961992B1 (en) Method and Apparatus of cyber criminal activity analysis using markov chain and Recording medium using it
CN112565271B (en) Web attack detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant