CN107992741A - A kind of model training method, the method and device for detecting URL - Google Patents

A model training method, and a method and device for detecting URLs

Publication number
CN107992741A
Authority
CN
China
Prior art keywords
parameter
url
feature vector
corresponding feature
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710998117.7A
Other languages
Chinese (zh)
Other versions
CN107992741B (en)
Inventor
张雅淋
李龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201710998117.7A priority Critical patent/CN107992741B/en
Priority to CN202011120753.8A priority patent/CN112182578A/en
Publication of CN107992741A publication Critical patent/CN107992741A/en
Priority to TW107129588A priority patent/TWI696090B/en
Priority to PCT/CN2018/105176 priority patent/WO2019080660A1/en
Application granted granted Critical
Publication of CN107992741B publication Critical patent/CN107992741B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/51 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems at application loading time, e.g. accepting, rejecting, starting or inhibiting executable software based on integrity or source reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2119 Authenticating web pages, e.g. with suspicious links

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification disclose a model training method and a method and device for detecting URLs. In the embodiments of this specification, a number of URLs are obtained, the parameters in each URL are determined, and the feature vector corresponding to each parameter is obtained; an isolation forest model is then built according to the feature vectors corresponding to the parameters.

Description

A model training method, and a method and device for detecting URLs
Technical field
This specification relates to the field of information technology, and in particular to a model training method and a method and device for detecting URLs.
Background
In the Internet era, network security is particularly important. Hackers often exploit network vulnerabilities to intrude on servers through uniform resource locators (Uniform Resource Locator, URL) and carry out illegal operations such as Structured Query Language (SQL) injection attacks and cross-site scripting attacks. Taking an SQL injection attack as an example, a hacker can add an illegal field to a parameter of a URL so that, when the server parses the received URL, it mistakes the illegal field for executable code and executes it, threatening the security of the data on the server.
In practical applications, the personnel responsible for network security usually set safety rules based on business experience (for example, a URL containing field XX cannot pass detection). The server checks whether each received URL complies with the safety rules and parses only the URLs that comply, thereby avoiding attack.
Given the prior art, a safer and more reliable method for detecting URLs is needed.
Summary of the invention
The embodiments of this specification provide a model training method and a method and device for detecting URLs, to solve the problem that existing methods for detecting URLs are not sufficiently secure.
To solve the above technical problem, the embodiments of this specification are implemented as follows:
A model training method provided by an embodiment of this specification includes:
Obtaining a number of uniform resource locators (URLs);
For each URL, extracting the parameters in the URL;
For each extracted parameter, determining the feature vector corresponding to the parameter;
Building an isolation forest (Isolation Forest) model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
A method for detecting a URL provided by an embodiment of this specification includes:
Obtaining a URL;
Extracting the parameters in the URL;
For each extracted parameter, determining the feature vector corresponding to the parameter;
Inputting the feature vector corresponding to each parameter into a pre-built isolation forest model to perform anomaly detection on the URL, the isolation forest model being built according to the above model training method.
A model training apparatus provided by an embodiment of this specification includes:
An acquisition module, which obtains a number of uniform resource locators (URLs);
An extraction module, which, for each URL, extracts the parameters in the URL;
A determining module, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
A processing module, which builds an isolation forest model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
An apparatus for detecting a URL provided by an embodiment of this specification includes:
An acquisition module, which obtains a URL;
An extraction module, which extracts the parameters in the URL;
A determining module, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
An anomaly detection module, which inputs the feature vector corresponding to each parameter into a pre-built isolation forest (Isolation Forest) model to perform anomaly detection on the URL, the isolation forest model being built according to the above model training method.
A model training device provided by an embodiment of this specification includes one or more processors and a memory, the memory storing a program that is configured to be executed by the one or more processors to perform the following steps:
Obtaining a number of uniform resource locators (URLs);
For each URL, extracting the parameters in the URL;
For each extracted parameter, determining the feature vector corresponding to the parameter;
Building an isolation forest model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
A device for detecting a URL provided by an embodiment of this specification includes one or more processors and a memory, the memory storing a program that is configured to be executed by the one or more processors to perform the following steps:
Obtaining a URL;
Extracting the parameters in the URL;
For each extracted parameter, determining the feature vector corresponding to the parameter;
Inputting the feature vector corresponding to each parameter into a pre-built isolation forest model to perform anomaly detection on the URL, the isolation forest model being built according to the above model training method.
As can be seen from the technical solutions provided by the embodiments of this specification, a number of URLs are obtained, the parameters in each URL are determined, the feature vector corresponding to each parameter is obtained, and an isolation forest model is then built according to those feature vectors. The isolation forest model can be used to detect whether a URL is abnormal. In general, an abnormal URL is often one sent by a hacker, and the server can refuse to parse abnormal URLs, thereby avoiding hacker attack.
Brief description of the drawings
To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in this specification; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a model training method provided by an embodiment of this specification;
Figs. 2a to 2c are schematic diagrams of the distribution of normal points and abnormal points provided by an embodiment of this specification;
Fig. 3 is a flowchart of a method for detecting a URL provided by an embodiment of this specification;
Fig. 4 is a schematic diagram of a model training apparatus provided by an embodiment of this specification;
Fig. 5 is a schematic diagram of an apparatus for detecting a URL provided by an embodiment of this specification;
Fig. 6 is a schematic diagram of a model training device provided by an embodiment of this specification;
Fig. 7 is a schematic diagram of a device for detecting a URL provided by an embodiment of this specification.
Detailed description
In existing methods for detecting URLs, the server checks URLs against manually formulated safety rules. However, on the one hand, the means by which hackers mount network attacks through URLs are ever-changing, and manually formulated safety rules can hardly cover every attack; on the other hand, manually formulated safety rules usually lag behind emerging attacks.
For this reason, in one or more embodiments of this specification, a number of URLs are obtained, the parameters in each URL are extracted, the feature vector corresponding to each parameter is determined, and an isolation forest (Isolation Forest) model is built according to those feature vectors. As is well known to those skilled in the art, an isolation forest model is an anomaly detection model. It can be used to detect whether a URL is abnormal; an abnormal URL is often one sent by a hacker, and the server can refuse to parse abnormal URLs, thereby avoiding hacker attack.
It should be noted that an isolation forest model can be built from the feature vectors of the parameters in a number of URLs because, in practice, the main way hackers attack a server through a URL is to add illegal fields to the parameters of the URL. That is, the feature vectors of parameters in normal URLs differ significantly from those of parameters in abnormal URLs: the features of parameters in abnormal URLs are rare and clearly distinct from those of parameters in normal URLs.
On this basis, the core idea of the technical solution described in this specification is to use the feature vectors of the parameters in a number of known URLs as data samples to build an isolation forest model. The isolation forest model can then judge whether a URL to be detected is abnormal according to the feature vectors of the parameters in that URL.
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of one or more embodiments of this specification. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of this specification without creative effort shall fall within the scope of protection of this specification.
The technical solutions provided by the embodiments of this specification are described in detail below with reference to the drawings.
Fig. 1 is a flowchart of the model training method provided by an embodiment of this specification, which includes the following steps:
S100: Obtain a number of URLs.
In the embodiments of this specification, the executing entity may be a server or another device with data processing capability. The description below takes a server as the executing entity.
As is well known, the parameters in a URL may include information entered by a user (who may be a hacker).
For example, "http://server/path/document?name1=value1&name2=value2" is the typical structure of a URL; the data after the "?" are the parameters. A URL can contain more than one parameter, different parameters are usually separated by "&", and each parameter has a parameter name and a parameter value. The parameter value is usually entered by the user. In this example, the URL contains two parameters: "name1=value1" means that the parameter named name1 has the parameter value value1, and "name2=value2" means that the parameter named name2 has the parameter value value2.
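The parameter structure described above can be sketched in Python with the standard library. This is a minimal illustration, not the patent's implementation; the function name `extract_params` is hypothetical, and the sketch assumes the query string begins after "?" as in standard URL syntax.

```python
from urllib.parse import urlsplit, parse_qsl

def extract_params(url):
    """Return the (name, value) pairs found in the query string of a URL."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps parameters whose value is empty.
    return parse_qsl(query, keep_blank_values=True)

print(extract_params("http://server/path/document?name1=value1&name2=value2"))
# prints [('name1', 'value1'), ('name2', 'value2')]
```

Note that `parse_qsl` also decodes percent-encoded characters and turns "+" into a space, so the values returned are the decoded parameter values.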
Hackers sometimes add abnormal, illegal fields to the parameters of a URL in order to attack the server. For example, suppose the normal URL sent when a legitimate user logs in to the server is as follows:
“http://server/path/document?name1=user1&name2=password1”, where the parameter value of the first parameter is the user name "user1" and the parameter value of the second parameter is the password "password1". The server parses the URL and, once the user name and password have been verified, the user is logged in to the server.
When a hacker wants to log in to the server as user "user1", the hacker can use an SQL injection attack and send the server the following abnormal URL:
“http://server/path/document?name1=user1&name2=" ' or 1=1”, where the parameter value of the first parameter is the user name "user1", but the parameter value of the second parameter is not the password for that user name; it is the illegal field “" ' or 1=1”. Owing to the characteristics of SQL syntax, the server cannot verify the user's password against this illegal field; instead, the server parses the illegal field as executable code and executes it, so that the hacker can log in to the account of user "user1" without the password and operate on that user's data.
In step S100, the URLs obtained by the server generally include some normal URLs and some abnormal URLs. Because abnormal URLs are relatively rare, they make up a low proportion of the URLs obtained.
S102: For each URL, extract the parameters in the URL.
In the embodiments of this specification, when extracting the parameters in a URL, the server may extract both the parameter names and the parameter values contained in the URL, or only the parameter values.
In addition, for each URL, the server may extract all of the parameters in the URL or only some of them.
In practical applications, some parameter names occur with low probability, and hackers seldom add illegal fields to the parameter values of such rarely occurring parameter names. The server may therefore skip extracting the parameter values of parameters that occur with low probability.
Specifically, for each URL, the server may determine, among the parameters contained in the URL, the parameters whose parameter names meet a specified condition, and extract the parameter value of each parameter so determined. The specified condition may be that the probability of occurrence of the parameter name is greater than a specified probability value. In this way, parameters that occur with low probability are filtered out, which reduces the amount of data the server has to process in subsequent steps.
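The filtering described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the probability of occurrence is estimated, so the sketch takes it to be the fraction of training URLs that contain the parameter name; `select_param_values` and `min_probability` are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

def select_param_values(urls, min_probability=0.5):
    """Keep only parameters whose name occurs in more than
    min_probability of the URLs (assumed estimate of the probability)."""
    parsed = [parse_qsl(urlsplit(u).query, keep_blank_values=True) for u in urls]
    # Count, for each parameter name, the number of URLs that contain it.
    name_counts = Counter(name for pairs in parsed for name in {n for n, _ in pairs})
    total = len(urls)
    frequent = {n for n, c in name_counts.items() if c / total > min_probability}
    return [[(n, v) for n, v in pairs if n in frequent] for pairs in parsed]

urls = ["http://s/p?a=1&b=2", "http://s/p?a=3", "http://s/p?a=4&c=5"]
print(select_param_values(urls))
# prints [[('a', '1')], [('a', '3')], [('a', '4')]]
```

Here "a" appears in all three URLs and is kept, while "b" and "c" each appear in one URL out of three and are filtered out.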
S104: For each extracted parameter, determine the feature vector corresponding to the parameter.
In the embodiments of this specification, for each extracted parameter, an N-dimensional feature vector corresponding to the parameter can be determined according to the parameter value of the parameter, where N is a natural number greater than 0.
The dimensions of the feature vector corresponding to a parameter may include at least one of the following, computed over the parameter value: the total number of characters, the total number of letters, the total number of digits, the total number of special symbols, the number of distinct characters, the number of distinct letters, the number of distinct digits, and the number of distinct special symbols.
Take the URL "http://server/path/document?name1=user1&name2=password1" as an example. The parameter value of parameter name1 is user1, which contains 5 characters in total, 4 letters, 1 digit, and 0 special symbols, with 5 distinct characters, 4 distinct letters, 1 distinct digit, and 0 distinct special symbols. The feature vector corresponding to parameter name1 can therefore be (5, 4, 1, 0, 5, 4, 1, 0).
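The eight dimensions listed above can be computed with a few lines of Python. This is a sketch, not the patent's implementation; it assumes that "letter" means an ASCII letter, "digit" means an ASCII digit, and "special symbol" means any other character, which the source does not define. The name `feature_vector` is hypothetical.

```python
import string

def feature_vector(value):
    """Return the 8 counts for a parameter value: totals of characters,
    letters, digits, special symbols, then the distinct counts of each."""
    letters = [c for c in value if c in string.ascii_letters]
    digits = [c for c in value if c in string.digits]
    # Assumption: everything that is not an ASCII letter or digit is "special".
    specials = [c for c in value
                if c not in string.ascii_letters and c not in string.digits]
    return (len(value), len(letters), len(digits), len(specials),
            len(set(value)), len(set(letters)), len(set(digits)), len(set(specials)))

print(feature_vector("user1"))
# prints (5, 4, 1, 0, 5, 4, 1, 0)
```

An injected value tends to contain far more special symbols than a typical user name or password, which is what makes these simple counts useful as features.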
Further, the value of each dimension of the feature vector may be normalized. Continuing the example above, the 8 feature values corresponding to parameter name1 can be normalized according to the formula $y = x / z$, where x denotes a feature value, z denotes the total number of characters in the value of parameter name1, and y denotes the value obtained after normalizing x. The normalized feature vector of parameter name1 is then (5/5, 4/5, 1/5, 0/5, 5/5, 4/5, 1/5, 0/5), that is, (1, 0.8, 0.2, 0, 1, 0.8, 0.2, 0).
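The normalization step divides every feature value by the total number of characters, which is the first component of the vector in the example. A minimal sketch follows; the handling of an empty parameter value (total of 0) is an assumption added here to avoid division by zero and is not discussed in the source.

```python
def normalize(vector):
    """Divide each feature value x by the character total z (vector[0])."""
    total_chars = vector[0]
    if total_chars == 0:
        # Assumption: an empty parameter value maps to the all-zero vector.
        return tuple(0.0 for _ in vector)
    return tuple(x / total_chars for x in vector)

print(normalize((5, 4, 1, 0, 5, 4, 1, 0)))
# prints (1.0, 0.8, 0.2, 0.0, 1.0, 0.8, 0.2, 0.0)
```

The seventh component is 1/5 = 0.2, since the value user1 contains one distinct digit.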
S106: Build an isolation forest model according to the feature vectors corresponding to the parameters.
In the embodiments of this specification, the isolation forest algorithm is used to build an isolation forest model from the feature vectors corresponding to the parameters, and the isolation forest model is used to detect whether a URL is abnormal. The feature vectors do not need to be labeled as normal or abnormal.
The idea behind the isolation forest algorithm is briefly introduced here. Referring to Fig. 2a, the 10 points shown include hollow points and solid points. The hollow points are more numerous (8) and more concentrated, while the solid points are fewer (2) and more scattered. The hollow points can be regarded as normal points and the solid points as abnormal points. That is, abnormal points are points that are few in number and lie apart from the rest. The following operations are then performed:
1st division: a line is drawn at random, dividing the points in Fig. 2a into part A and part B, which gives Fig. 2b.
2nd division: for part A, another line is drawn at random, dividing the points in part A into part C and part D; likewise, for part B, a line is drawn at random, dividing the points in part B into part E and part F, as in Fig. 2c.
A line is drawn at random in each newly divided part and the division continues until the plane shown in Fig. 2a is divided into 10 parts, each containing exactly 1 point; that is, each point is assigned to a part of its own (if a part contains only one point, that part is the point's own part). Clearly, the solid points are assigned to a part of their own more easily and sooner; as shown in Fig. 2b, the solid point in the upper right corner is assigned to its own part (part F). That is, the more easily a point is assigned to a part of its own, the more abnormal the point is.
Based on this idea, the isolation forest algorithm uses S classification trees (specifically, binary trees). For each binary tree, the points shown in Fig. 2a are placed in the root node; starting from the root node, the condition for each split is random (each split divides the points with a randomly drawn line). The earlier a point falls into a leaf node of the binary tree, the more likely it is to be abnormal.
Taking the isolation forest algorithm above as an example, building the isolation forest model from the feature vectors corresponding to the parameters in step S106 is outlined below.
The isolation forest includes S binary trees (iTrees). For each iTree, the process of training the iTree can be described as follows:
First step: randomly select M feature vectors from all the feature vectors and place them in the root node of the iTree;
Second step: among the N dimensions of the feature vectors, randomly pick one dimension (the specified dimension) and randomly pick a value in that dimension as the cut value; the cut value lies between the maximum and the minimum of the values of the M feature vectors in the specified dimension;
Third step: according to the cut value, divide the M feature vectors into two parts; the feature vectors whose value in the specified dimension is not less than the cut value form one part, and those whose value in the specified dimension is less than the cut value form the other part;
Fourth step: recursively perform the second and third steps until the iTree reaches a specified height or every leaf node of the iTree holds only one feature vector. The specified height can be set as needed and is generally $\log_2 M$.
One iTree can be trained with the four steps above.
It should be noted that when the next iTree is trained, the M feature vectors in the first step may be randomly selected from all the feature vectors, or from the feature vectors that have not yet been selected.
By repeating the four steps above, S trained iTrees are obtained, and together they form the isolation forest model.
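The four training steps can be sketched in Python as below. This is an illustrative sketch rather than the patent's implementation. The nested-dict tree layout and the names `build_itree`, `build_forest`, `num_trees` (S) and `sample_size` (M) are assumptions. The sketch also restricts the random choice of dimension to dimensions whose values differ, so that a split can separate the vectors; the source simply picks a dimension at random.

```python
import math
import random

def build_itree(vectors, height_limit, rng, depth=0):
    """Build one isolation tree (iTree) as nested dicts."""
    # Fourth step: stop at a single vector or at the height limit.
    if len(vectors) <= 1 or depth >= height_limit:
        return {"depth": depth, "size": len(vectors)}
    # Assumption: only dimensions with differing values are candidates.
    dims = [d for d in range(len(vectors[0]))
            if min(v[d] for v in vectors) < max(v[d] for v in vectors)]
    if not dims:
        return {"depth": depth, "size": len(vectors)}
    dim = rng.choice(dims)                      # second step: random dimension
    lo = min(v[dim] for v in vectors)
    hi = max(v[dim] for v in vectors)
    cut = rng.uniform(lo, hi)                   # random cut value in [lo, hi]
    # Third step: values below the cut go left, the rest go right.
    left = [v for v in vectors if v[dim] < cut]
    right = [v for v in vectors if v[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_itree(left, height_limit, rng, depth + 1),
            "right": build_itree(right, height_limit, rng, depth + 1)}

def build_forest(vectors, num_trees, sample_size, seed=0):
    """Train num_trees iTrees, each on a random sample of the vectors."""
    rng = random.Random(seed)
    m = min(sample_size, len(vectors))
    height_limit = math.ceil(math.log2(m)) if m > 1 else 0
    # First step: each tree gets its own random sample of m vectors.
    return [build_itree(rng.sample(vectors, m), height_limit, rng)
            for _ in range(num_trees)]
```

Each tree samples from all the feature vectors, which is the first of the two sampling options mentioned above. Leaves record the depth at which they were created, which is the quantity needed later to measure how early a vector is isolated.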
Fig. 3 is a flowchart of a method for detecting a URL provided by an embodiment of this specification, which includes the following steps:
S300: Obtain a URL.
S302: Extract the parameters in the URL.
S304: For each extracted parameter, determine the feature vector corresponding to the parameter.
S306: Input the feature vector corresponding to each parameter into a pre-built isolation forest model to perform anomaly detection on the URL.
The URL in Fig. 3 is the URL to be detected. For an explanation of steps S300 to S304, refer to steps S100 to S104; it is not repeated here.
In step S306, the feature vector corresponding to each parameter can be input into the isolation forest model to obtain the model output result corresponding to each parameter, and whether any of the parameters is abnormal is judged according to those model output results.
Further, for each parameter, the feature vector corresponding to the parameter can be input into the isolation forest model so that each classification tree in the model classifies the feature vector; the average height of the leaf nodes into which the feature vector falls across the classification trees is taken as the model output result corresponding to the parameter. Then, for each parameter, if the model output result is less than a specified threshold, the parameter is determined to be abnormal; if the model output result is not less than the specified threshold, the parameter is determined to be normal. If any parameter is determined to be abnormal, an abnormal parameter exists among the parameters; if every parameter is determined to be normal, no abnormal parameter exists among the parameters.
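The detection rule above (average leaf height per parameter, compared against a threshold) can be sketched as follows. The tree layout is an assumption made for this illustration: an internal node is a dict with keys "dim", "cut", "left" and "right", and a leaf is a dict with a "depth" key. The function names and the example threshold are hypothetical, and the sketch uses the plain average height as described in the text, without any further adjustment.

```python
def leaf_depth(tree, vector):
    """Walk one tree and return the depth of the leaf the vector falls into."""
    node = tree
    while "dim" in node:  # internal nodes carry a split dimension
        node = node["left"] if vector[node["dim"]] < node["cut"] else node["right"]
    return node["depth"]

def average_height(forest, vector):
    """Model output for one parameter: mean leaf depth over all trees."""
    return sum(leaf_depth(tree, vector) for tree in forest) / len(forest)

def url_is_abnormal(forest, param_vectors, threshold):
    """A URL is abnormal if any parameter's average height is below the threshold."""
    return any(average_height(forest, v) < threshold for v in param_vectors)

# A tiny hand-built tree, used only to illustrate the rule.
tree = {"dim": 0, "cut": 0.5,
        "left": {"dim": 1, "cut": 0.5,
                 "left": {"depth": 2}, "right": {"depth": 2}},
        "right": {"depth": 1}}
forest = [tree, tree]
print(url_is_abnormal(forest, [(0.1, 0.1)], threshold=1.5))              # prints False
print(url_is_abnormal(forest, [(0.1, 0.1), (0.9, 0.1)], threshold=1.5))  # prints True
```

In the second call, the vector (0.9, 0.1) reaches a leaf at depth 1 in every tree, so its average height of 1.0 is below the threshold and the whole URL is flagged.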
With the methods shown in Fig. 1 and Fig. 3, an isolation forest model is built from the feature vectors of the parameters in URLs, so that the server can check received URLs with the isolation forest model. If a received URL is determined to be abnormal, the server can refuse to parse it, thereby avoiding hacker attack and improving network security.
In addition, the embodiments of this specification can also reveal potential network attack methods. Specifically, the isolation forest model can determine whether a URL is abnormal. If the URL is abnormal, the parameter value of at least one of its parameters is abnormal; the abnormal parameter value can prompt staff to analyze the attack method used by the hacker, which helps staff improve the safety rules.
Based on the model training method shown in Fig. 1, an embodiment of this specification also provides a corresponding model training apparatus, as shown in Fig. 4, including:
An acquisition module 401, which obtains a number of uniform resource locators (URLs);
An extraction module 402, which, for each URL, extracts the parameters in the URL;
A determining module 403, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
A processing module 404, which builds an isolation forest model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
The extraction module, for each URL, determines, among the parameters contained in the URL, the parameters whose parameter names meet a specified condition, and extracts the parameter value of each parameter so determined.
The determining module, for each extracted parameter, determines the N-dimensional feature vector corresponding to the parameter according to the parameter value of the parameter, where N is a natural number greater than 0.
The dimensions of the N-dimensional feature vector specifically include at least one of the following, computed over the parameter value: the total number of characters, the total number of letters, the total number of digits, the total number of symbols, the number of distinct characters, the number of distinct letters, the number of distinct digits, and the number of distinct symbols.
Based on the method for detecting a URL shown in Fig. 3, an embodiment of this specification also provides a corresponding apparatus for detecting a URL, as shown in Fig. 5, including:
An acquisition module 501, which obtains a URL;
An extraction module 502, which extracts the parameters in the URL;
A determining module 503, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
An anomaly detection module, which inputs the feature vector corresponding to each parameter into a pre-built isolation forest (Isolation Forest) model to perform anomaly detection on the URL, the isolation forest model being built according to the above model training method.
The anomaly detection module inputs the feature vector corresponding to each parameter into the pre-built isolation forest (Isolation Forest) model to obtain the model output result corresponding to each parameter, and judges, according to those model output results, whether an abnormal parameter exists among the parameters; if so, the URL is determined to be abnormal; otherwise, the URL is determined to be normal.
The anomaly detection module, for each parameter, inputs the feature vector corresponding to the parameter into the pre-built isolation forest model so that each classification tree in the model classifies the feature vector, and takes the average height of the leaf nodes into which the feature vector falls across the classification trees as the model output result corresponding to the parameter; for each parameter, if the model output result is less than a specified threshold, the parameter is determined to be abnormal, and if the model output result is not less than the specified threshold, the parameter is determined to be normal.
Based on the model training method shown in Fig. 2, an embodiment of this specification also correspondingly provides a model training device which, as shown in Fig. 6, includes one or more processors and a memory; the memory stores a program that is configured to be executed by the one or more processors to perform the following steps:
obtaining a number of uniform resource locators (URLs);
for each URL, extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
building an isolation forest (Isolation Forest) model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
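The training steps above can be sketched in a few lines of Python. This is a simplified illustration, not the patented implementation: the eight feature dimensions follow the list given in claim 4, while the query-string parsing, the tree representation, the defaults (`n_trees`, `sample_size`, `max_depth`) and all function names are assumptions made for the example.

```python
import random
from urllib.parse import urlsplit, parse_qsl


def extract_params(url):
    """Return (name, value) pairs from the URL's query string."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def feature_vector(value):
    """8 dimensions: totals and distinct counts of characters, letters, digits, symbols."""
    letters = [c for c in value if c.isalpha()]
    digits = [c for c in value if c.isdigit()]
    symbols = [c for c in value if not (c.isalpha() or c.isdigit())]
    return [len(value), len(letters), len(digits), len(symbols),
            len(set(value)), len(set(letters)), len(set(digits)), len(set(symbols))]


def build_tree(points, rng, depth=0, max_depth=8):
    """Isolate points with random axis-aligned splits; a leaf is {"size": n}."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # Simplification: stop when the randomly chosen dimension cannot split.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, rng, depth + 1, max_depth),
            "right": build_tree(right, rng, depth + 1, max_depth)}


def build_forest(vectors, n_trees=10, sample_size=64, seed=0):
    """Build each tree from a random subsample of the feature vectors."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = rng.sample(vectors, min(sample_size, len(vectors)))
        forest.append(build_tree(sample, rng))
    return forest


# Hypothetical training URLs, used only to exercise the functions.
urls = [
    "http://example.com/item?id=1001&name=alice",
    "http://example.com/item?id=1002&name=bob",
    "http://example.com/search?q=red+shoes&page=2",
]
vectors = [feature_vector(value) for url in urls for _, value in extract_params(url)]
forest = build_forest(vectors, n_trees=5, sample_size=4)
print(len(vectors), len(forest))
```

The sketch extracts every query parameter; selecting only parameters whose names meet a specified condition, as claim 2 describes, would be an extra filter applied to the output of `extract_params`.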
Based on the URL detection method shown in Fig. 3, an embodiment of this specification also correspondingly provides a device for detecting a URL which, as shown in Fig. 7, includes one or more processors and a memory; the memory stores a program that is configured to be executed by the one or more processors to perform the following steps:
obtaining a URL;
extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
inputting the feature vector corresponding to each parameter into a pre-built isolation forest (Isolation Forest) model to perform abnormality detection on the URL; the isolation forest model is built according to the model training method described above.
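The four detection steps can be wired together as a small pipeline. The sketch below is an assumption-laden illustration: `detect_url`, `toy_featurize` and `toy_score` are hypothetical names, and the stand-in scoring function merely imitates a model output (it is not an isolation forest) so that the control flow can be shown in isolation.

```python
from urllib.parse import urlsplit, parse_qsl


def detect_url(url, featurize, score, threshold):
    """Return (is_abnormal, names of abnormal parameters) for one URL.

    featurize: maps a parameter value to a feature vector.
    score:     maps a feature vector to a model output (lower = more unusual).
    """
    abnormal_names = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if score(featurize(value)) < threshold:
            abnormal_names.append(name)
    return bool(abnormal_names), abnormal_names


def toy_featurize(value):
    """Stand-in feature vector: only the total number of characters."""
    return [len(value)]


def toy_score(vector):
    """Stand-in model output: long values are treated as isolated quickly."""
    return 8.0 if vector[0] <= 12 else 2.0


print(detect_url("http://example.com/s?q=shoes&page=2",
                 toy_featurize, toy_score, threshold=5.0))
print(detect_url("http://example.com/s?q=" + "A" * 40 + "&page=2",
                 toy_featurize, toy_score, threshold=5.0))
```

Passing the feature function and the scoring function as arguments keeps the pipeline independent of how the model was trained, which mirrors the separation between the training device and the detection device.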
The embodiments in this specification are described in a progressive manner. For the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the devices shown in Fig. 6 and Fig. 7 are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding description of the method embodiments.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many improvements to method flows today can be regarded as direct improvements to hardware circuits. Designers almost always obtain the corresponding hardware circuit by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs a digital system to "integrate" it onto a single PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained merely by slightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller may be implemented in any suitable manner. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, it is entirely possible to logic-program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the devices included in it for implementing various functions may also be regarded as structures within the hardware component. Alternatively, the devices for implementing various functions may even be regarded both as software modules implementing a method and as structures within the hardware component.
The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above device is described in terms of units divided by function. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
This specification may be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. This specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The foregoing is merely embodiments of this specification and is not intended to limit this specification. For those skilled in the art, this specification may have various modifications and variations. Any modification, equivalent substitution, improvement, and the like made within the spirit and principle of this specification shall fall within the scope of the claims of this specification.

Claims (16)

1. A model training method, comprising:
obtaining a number of uniform resource locators (URLs);
for each URL, extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
building an isolation forest (Isolation Forest) model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
2. The method according to claim 1, wherein extracting, for each URL, the parameters in the URL specifically comprises:
for each URL, determining, among the parameters contained in the URL, the parameters whose parameter names meet a specified condition;
for each determined parameter, extracting the parameter value of the parameter.
3. The method according to claim 2, wherein determining, for each extracted parameter, the feature vector corresponding to the parameter specifically comprises:
for each extracted parameter, determining an N-dimensional feature vector corresponding to the parameter according to the parameter value of the parameter, N being a natural number greater than 0.
4. The method according to claim 3, wherein the dimensions of the N-dimensional feature vector specifically comprise:
at least one of: the total number of characters contained in the parameter value of the parameter, the total number of letters, the total number of digits, the total number of symbols, the number of distinct characters, the number of distinct letters, the number of distinct digits, and the number of distinct symbols.
5. A method for detecting a URL, comprising:
obtaining a URL;
extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
inputting the feature vector corresponding to each parameter into a pre-built isolation forest model to perform abnormality detection on the URL, the isolation forest model being built according to the method of any one of claims 1 to 4.
6. The method according to claim 5, wherein inputting the feature vector corresponding to each parameter into the pre-built isolation forest model to perform abnormality detection on the URL specifically comprises:
inputting the feature vector corresponding to each parameter into the pre-built isolation forest model to obtain the model output result corresponding to each parameter;
judging, according to the model output result corresponding to each parameter, whether any of the parameters is abnormal;
if so, determining that the URL is abnormal;
otherwise, determining that the URL is normal.
7. The method according to claim 6, wherein inputting the feature vector corresponding to each parameter into the pre-built isolation forest model to obtain the model output result corresponding to each parameter specifically comprises:
for each parameter, inputting the feature vector corresponding to the parameter into the pre-built isolation forest model, so that each classification tree in the isolation forest model classifies the feature vector corresponding to the parameter, and determining the average height of the leaf nodes into which the feature vector corresponding to the parameter falls in the classification trees as the model output result corresponding to the parameter;
and wherein judging, according to the model output result corresponding to each parameter, whether any of the parameters is abnormal specifically comprises:
for each parameter, if the model output result corresponding to the parameter is less than a specified threshold, determining that the parameter is abnormal, and if the model output result corresponding to the parameter is not less than the specified threshold, determining that the parameter is normal.
8. A model training apparatus, comprising:
an acquisition module, which obtains a number of uniform resource locators (URLs);
an extraction module, which, for each URL, extracts the parameters in the URL;
a determining module, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
a processing module, which builds an isolation forest model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
9. The apparatus according to claim 8, wherein the extraction module, for each URL, determines, among the parameters contained in the URL, the parameters whose parameter names meet a specified condition, and, for each determined parameter, extracts the parameter value of the parameter.
10. The apparatus according to claim 9, wherein the determining module, for each extracted parameter, determines an N-dimensional feature vector corresponding to the parameter according to the parameter value of the parameter, N being a natural number greater than 0.
11. The apparatus according to claim 10, wherein the dimensions of the N-dimensional feature vector specifically comprise:
at least one of: the total number of characters contained in the parameter value of the parameter, the total number of letters, the total number of digits, the total number of symbols, the number of distinct characters, the number of distinct letters, the number of distinct digits, and the number of distinct symbols.
12. An apparatus for detecting a URL, comprising:
an acquisition module, which obtains a URL;
an extraction module, which extracts the parameters in the URL;
a determining module, which, for each extracted parameter, determines the feature vector corresponding to the parameter;
an abnormality detection module, which inputs the feature vector corresponding to each parameter into a pre-built isolation forest model to perform abnormality detection on the URL, the isolation forest model being built according to the method of any one of claims 1 to 4.
13. The apparatus according to claim 12, wherein the abnormality detection module inputs the feature vector corresponding to each parameter into the pre-built isolation forest model to obtain the model output result corresponding to each parameter; judges, according to the model output result corresponding to each parameter, whether any of the parameters is abnormal; if so, determines that the URL is abnormal; and otherwise determines that the URL is normal.
14. The apparatus according to claim 13, wherein the abnormality detection module, for each parameter, inputs the feature vector corresponding to the parameter into the pre-built isolation forest model, so that each classification tree in the isolation forest model classifies the feature vector corresponding to the parameter, and determines the average height of the leaf nodes into which the feature vector corresponding to the parameter falls in the classification trees as the model output result corresponding to the parameter; and, for each parameter, if the model output result corresponding to the parameter is less than a specified threshold, determines that the parameter is abnormal, and if the model output result corresponding to the parameter is not less than the specified threshold, determines that the parameter is normal.
15. A model training device, comprising one or more processors and a memory, the memory storing a program that is configured to be executed by the one or more processors to perform the following steps:
obtaining a number of uniform resource locators (URLs);
for each URL, extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
building an isolation forest model according to the feature vectors corresponding to the parameters, the isolation forest model being used to detect whether a URL is abnormal.
16. A device for detecting a URL, comprising one or more processors and a memory, the memory storing a program that is configured to be executed by the one or more processors to perform the following steps:
obtaining a URL;
extracting the parameters in the URL;
for each extracted parameter, determining the feature vector corresponding to the parameter;
inputting the feature vector corresponding to each parameter into a pre-built isolation forest model to perform abnormality detection on the URL, the isolation forest model being built according to the method of any one of claims 1 to 4.
CN201710998117.7A 2017-10-24 2017-10-24 Model training method, URL detection method and device Active CN107992741B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201710998117.7A CN107992741B (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device
CN202011120753.8A CN112182578A (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device
TW107129588A TWI696090B (en) 2017-10-24 2018-08-24 Model training method, method and device for detecting URL
PCT/CN2018/105176 WO2019080660A1 (en) 2017-10-24 2018-09-12 Model training method, method and device for testing url

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710998117.7A CN107992741B (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202011120753.8A Division CN112182578A (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device

Publications (2)

Publication Number Publication Date
CN107992741A (en) 2018-05-04
CN107992741B (en) 2020-08-28

Family

ID=62030610

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710998117.7A Active CN107992741B (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device
CN202011120753.8A Pending CN112182578A (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202011120753.8A Pending CN112182578A (en) 2017-10-24 2017-10-24 Model training method, URL detection method and device

Country Status (3)

Country Link
CN (2) CN107992741B (en)
TW (1) TWI696090B (en)
WO (1) WO2019080660A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108777873A (en) * 2018-06-04 2018-11-09 江南大学 The wireless sensor network abnormal deviation data examination method of forest is isolated based on weighted blend
CN108984376A (en) * 2018-05-31 2018-12-11 阿里巴巴集团控股有限公司 A kind of system anomaly detection method, device and equipment
WO2019080660A1 (en) * 2017-10-24 2019-05-02 阿里巴巴集团控股有限公司 Model training method, method and device for testing url
CN109815566A (en) * 2019-01-09 2019-05-28 同济大学 A kind of method for detecting abnormality of the go AI chess manual file of SGF format
WO2019128529A1 (en) * 2017-12-28 2019-07-04 阿里巴巴集团控股有限公司 Url attack detection method and apparatus, and electronic device
CN110032881A (en) * 2018-12-28 2019-07-19 阿里巴巴集团控股有限公司 A kind of data processing method, device, equipment and medium
CN110086749A (en) * 2018-01-25 2019-08-02 阿里巴巴集团控股有限公司 Data processing method and device
WO2019169982A1 (en) * 2018-03-06 2019-09-12 阿里巴巴集团控股有限公司 Url abnormality positioning method and device, and server and storage medium
CN110958222A (en) * 2019-10-31 2020-04-03 苏州浪潮智能科技有限公司 Server log anomaly detection method and system based on isolated forest algorithm

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399268B (en) * 2019-07-26 2023-09-26 创新先进技术有限公司 Abnormal data detection method, device and equipment
CN113065610B (en) * 2019-12-12 2022-05-17 支付宝(杭州)信息技术有限公司 Isolated forest model construction and prediction method and device based on federal learning
CN114095391B (en) * 2021-11-12 2024-01-12 上海斗象信息科技有限公司 Data detection method, baseline model construction method and electronic equipment
CN116776135B (en) * 2023-08-24 2023-12-19 之江实验室 Physical field data prediction method and device based on neural network model

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544210A (en) * 2013-09-02 2014-01-29 烟台中科网络技术研究所 System and method for identifying webpage types
CN104077396A (en) * 2014-07-01 2014-10-01 清华大学深圳研究生院 Method and device for detecting phishing website
CN105956472A (en) * 2016-05-12 2016-09-21 宝利九章(北京)数据技术有限公司 Method and system for identifying whether webpage includes malicious content or not
CN106131071A (en) * 2016-08-26 2016-11-16 北京奇虎科技有限公司 A kind of Web method for detecting abnormality and device
CN106846806A (en) * 2017-03-07 2017-06-13 北京工业大学 Urban highway traffic method for detecting abnormality based on Isolation Forest
CN106960040A (en) * 2017-03-27 2017-07-18 北京神州绿盟信息安全科技股份有限公司 A kind of URL classification determines method and device
US20170270299A1 (en) * 2016-03-17 2017-09-21 Electronics And Telecommunications Research Institute Apparatus and method for detecting malware code by generating and analyzing behavior pattern

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7082426B2 (en) * 1993-06-18 2006-07-25 Cnet Networks, Inc. Content aggregation method and apparatus for an on-line product catalog
WO2010011411A1 (en) * 2008-05-27 2010-01-28 The Trustees Of Columbia University In The City Of New York Systems, methods, and media for detecting network anomalies
US8521667B2 (en) * 2010-12-15 2013-08-27 Microsoft Corporation Detection and categorization of malicious URLs
US9178901B2 (en) * 2013-03-26 2015-11-03 Microsoft Technology Licensing, Llc Malicious uniform resource locator detection
US9106536B2 (en) * 2013-04-15 2015-08-11 International Business Machines Corporation Identification and classification of web traffic inside encrypted network tunnels
US9412024B2 (en) * 2013-09-13 2016-08-09 Interra Systems, Inc. Visual descriptors based video quality assessment using outlier model
JP6276417B2 (en) * 2013-11-04 2018-02-07 イルミオ, インコーポレイテッドIllumio,Inc. Automatic generation of label-based access control rules
CN105205394B (en) * 2014-06-12 2019-01-08 腾讯科技(深圳)有限公司 Data detection method and device for intrusion detection
CN104899508B (en) * 2015-06-17 2018-12-07 中国互联网络信息中心 A kind of multistage detection method for phishing site and system
US11200291B2 (en) * 2015-11-02 2021-12-14 International Business Machines Corporation Automated generation of web API descriptions from usage data
CN105554007B (en) * 2015-12-25 2019-01-04 北京奇虎科技有限公司 A kind of web method for detecting abnormality and device
CN107196953B (en) * 2017-06-14 2020-05-08 上海境领信息科技有限公司 Abnormal behavior detection method based on user behavior analysis
CN107992741B (en) * 2017-10-24 2020-08-28 阿里巴巴集团控股有限公司 Model training method, URL detection method and device

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019080660A1 (en) * 2017-10-24 2019-05-02 阿里巴巴集团控股有限公司 Model training method, method and device for testing url
WO2019128529A1 (en) * 2017-12-28 2019-07-04 阿里巴巴集团控股有限公司 Url attack detection method and apparatus, and electronic device
US10785241B2 (en) 2017-12-28 2020-09-22 Alibaba Group Holding Limited URL attack detection method and apparatus, and electronic device
CN110086749A (en) * 2018-01-25 2019-08-02 阿里巴巴集团控股有限公司 Data processing method and device
WO2019169982A1 (en) * 2018-03-06 2019-09-12 阿里巴巴集团控股有限公司 Url abnormality positioning method and device, and server and storage medium
US10819745B2 (en) 2018-03-06 2020-10-27 Advanced New Technologies Co., Ltd. URL abnormality positioning method and device, and server and storage medium
CN108984376A (en) * 2018-05-31 2018-12-11 阿里巴巴集团控股有限公司 A kind of system anomaly detection method, device and equipment
CN108777873A (en) * 2018-06-04 2018-11-09 江南大学 The wireless sensor network abnormal deviation data examination method of forest is isolated based on weighted blend
CN108777873B (en) * 2018-06-04 2021-03-02 江南大学 Wireless sensor network abnormal data detection method based on weighted mixed isolated forest
CN110032881A (en) * 2018-12-28 2019-07-19 阿里巴巴集团控股有限公司 A kind of data processing method, device, equipment and medium
CN110032881B (en) * 2018-12-28 2023-09-22 创新先进技术有限公司 Data processing method, device, equipment and medium
CN109815566A (en) * 2019-01-09 2019-05-28 同济大学 A kind of method for detecting abnormality of the go AI chess manual file of SGF format
CN110958222A (en) * 2019-10-31 2020-04-03 苏州浪潮智能科技有限公司 Server log anomaly detection method and system based on isolated forest algorithm

Also Published As

Publication number Publication date
CN112182578A (en) 2021-01-05
TWI696090B (en) 2020-06-11
TW201917618A (en) 2019-05-01
WO2019080660A1 (en) 2019-05-02
CN107992741B (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN107992741A (en) A kind of model training method, the method and device for detecting URL
CN108108373A (en) A kind of name-matches method and device
CN108229156A (en) URL attack detection methods, device and electronic equipment
CN109582833B (en) Abnormal text detection method and device
US10528766B2 (en) Techniques for masking electronic data
KR101874373B1 (en) A method and apparatus for detecting malicious scripts of obfuscated scripts
CN109344615A (en) A kind of method and device detecting malicious commands
CN111159697B (en) Key detection method and device and electronic equipment
CN109582954A (en) Method and apparatus for output information
CN107341143A (en) A kind of sentence continuity determination methods and device and electronic equipment
CN111859093A (en) Sensitive word processing method and device and readable storage medium
CN107402945A (en) Word stock generating method and device, short text detection method and device
CN109714356A (en) Abnormal domain name recognition method, device and electronic equipment
CN115438650B (en) Contract text error correction method, system, equipment and medium fusing multi-source characteristics
CN115314236A (en) System and method for detecting phishing domains in a Domain Name System (DNS) record set
CN111159354A (en) Sensitive information detection method, device, equipment and system
CN111324892B (en) Method, device and medium for generating software genes and script detection of script file
CN111221690A (en) Model determination method and device for integrated circuit design and terminal
CN113688240B (en) Threat element extraction method, threat element extraction device, threat element extraction equipment and storage medium
CN116089985A (en) Encryption storage method, device, equipment and medium for distributed log
CN110502902A (en) Vulnerability classification method, device and equipment
CN110705258A (en) Text entity identification method and device
CN115455416A (en) Malicious code detection method and device, electronic equipment and storage medium
CN111291561A (en) Text recognition method, device and system
CN108334775A (en) Jailbreak plug-in detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1253998

Country of ref document: HK

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20201019

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced New Technologies Co., Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advantageous New Technologies Co., Ltd.

Effective date of registration: 20201019

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advantageous New Technologies Co., Ltd.

Address before: P.O. Box 847, Fourth Floor, Capital Building, Grand Cayman, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.
