CN110086749A - Data processing method and device - Google Patents


Info

Publication number
CN110086749A
Authority
CN
China
Prior art keywords
network
data area
data
access
acquisition system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810075020.3A
Other languages
Chinese (zh)
Inventor
刘添龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201810075020.3A
Publication of CN110086749A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0236 Filtering by address, protocol, port number or service, e.g. IP-address or URL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

Embodiments of the present application provide a data processing method and device. The method includes: obtaining a data set characterizing a network access; identifying a data area in the data set that meets a preset condition; and determining, according to the data area, whether to prevent the network access corresponding to the data area. By breaking the whole into parts, the application first performs a preliminary screening of the data set characterizing the network access, and then uses the data areas with network attack risk obtained from the screening to identify the network access more precisely. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access cannot be identified when no exact match exists, removes the need to maintain a whitelist or blacklist manually, and improves the accuracy and reliability of network attack detection.

Description

Data processing method and device
Technical field
The present application relates to the field of Internet technology, and in particular to a data processing method and device.
Background
The development of network technology has brought great convenience to users' work and daily life. Through terminals such as computers or mobile phones, a user can perform network access to a remote server and obtain corresponding services from the network. In practice, however, network access may be maliciously exploited to launch network attacks on network devices in the network, causing losses to users performing network access or damage to the network devices. A data processing method is therefore needed to detect network access.
In the prior art, access addresses of network accesses can be collected in advance and stored in a whitelist or blacklist, where the whitelist contains safe access addresses and the blacklist contains access addresses of network accesses that may involve network attacks. The access address used for a network access is then compared with the access addresses stored in the blacklist or whitelist to determine whether the network access includes a network attack. However, the access addresses contained in the whitelist and blacklist are limited, and a network access can be detected only when its access address exactly matches an access address in the whitelist or blacklist. The reliability and accuracy of identification are therefore low, and false positives and false negatives are likely.
Summary of the invention
In view of the above problems, the present application is proposed to provide a data processing method and device that overcome the above problems or at least partially solve them.
The present application provides a data processing method, comprising:
Obtaining a data set characterizing a network access;
Identifying a data area in the data set that meets a preset condition;
Determining, according to the data area, whether to prevent the network access corresponding to the data area.
Optionally, obtaining the data set characterizing the network access includes:
Extracting the access address of the network access;
Determining the data set to which the access address is mapped.
Optionally, determining the data set to which the access address is mapped includes:
Determining the character vector corresponding to each character in the access address;
Combining the character vectors corresponding to the multiple characters into a data matrix corresponding to the access address, as the data set.
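As a rough illustration of the character-vector mapping described above, the sketch below uses one-hot encoding over a fixed alphabet. The source does not say how character vectors are produced, so the encoding itself and the names `ALPHABET`, `char_vector` and `address_to_matrix` are assumptions made for illustration only.

```python
# Hypothetical alphabet; the source does not specify how characters are vectorized.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789.:/?&=%'\"<>()-_ "


def char_vector(ch):
    """Return a one-hot vector for ch, with a final slot for unknown characters."""
    vec = [0] * (len(ALPHABET) + 1)
    idx = ALPHABET.find(ch.lower())
    vec[idx if idx >= 0 else len(ALPHABET)] = 1
    return vec


def address_to_matrix(address):
    """Stack one character vector per character into a matrix (list of rows)."""
    return [char_vector(ch) for ch in address]


matrix = address_to_matrix("b.php?id=1")
print(len(matrix), len(matrix[0]))  # rows = characters, columns = alphabet size + 1
```

Each row of the resulting matrix corresponds to one character position, so a contiguous block of rows corresponds to a contiguous fragment of the access address.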
Optionally, determining, according to the data area, whether to prevent the network access corresponding to the data area includes:
Performing semantic recognition on the data area using a second neural network model to obtain the target attack risk type corresponding to the data area;
Performing grammar recognition on the data area using a third neural network model corresponding to the target attack risk type, and determining, according to the grammar recognition result, whether to prevent the network access corresponding to the data area.
Optionally, before determining the character vector corresponding to each character in the access address, determining the data set to which the access address is mapped further includes:
Removing meaningless characters from the access address.
Optionally, obtaining the data set characterizing the network access includes:
Obtaining a data set characterizing multiple network accesses;
Identifying the data area in the data set that meets the preset condition includes:
Identifying the data areas in the data set that meet the preset condition and correspond to the multiple network accesses.
Optionally, identifying the data area in the data set that meets the preset condition includes:
Dividing the data set into multiple data areas according to the positions of the data within the data set;
Identifying, from the multiple data areas, the data areas that meet the preset condition.
Optionally, identifying the data area in the data set that meets the preset condition further includes:
Merging the data areas that meet the preset condition according to their position information.
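The merging step above can be sketched as an interval merge. This is a minimal sketch that assumes each flagged data area is recorded as a hypothetical `(start, end)` character range with an exclusive end; the source does not define how position information is represented or what merge rule is used.

```python
def merge_regions(regions):
    """Merge (start, end) regions (end exclusive) that overlap or touch."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


flagged = [(12, 16), (14, 18), (2, 5)]
print(merge_regions(flagged))  # [(2, 5), (12, 18)]
```

Merging keeps a suspicious fragment that spans several overlapping areas in one piece, so later recognition sees the whole fragment instead of partial windows.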
Optionally, determining, according to the data area, whether to prevent the network access corresponding to the data area includes:
Performing semantic recognition on the data area to obtain the target attack risk type corresponding to the data area;
Performing grammar recognition on the data area, and determining, according to the grammar recognition result, whether to prevent the network access corresponding to the data area.
Optionally, before determining, according to the data area, whether to prevent the network access corresponding to the data area, the method further includes:
Determining the address fragment to which the data area maps in the access address of the corresponding network access, as the address fragment with network attack risk;
Performing semantic recognition on the data area includes:
Performing semantic recognition on the address fragment with network attack risk;
Performing grammar recognition on the data area includes:
Performing grammar recognition on the address fragment with network attack risk, and determining whether the address fragment with network attack risk conforms to the grammar rules of the target attack risk type.
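The two-stage check above (semantic recognition to pick a target attack risk type, then grammar recognition against that type's rules) can be sketched as follows. The source performs both stages with neural network models; here keyword hints and regular expressions are swapped in as simple stand-ins, and `SEMANTIC_HINTS`, `GRAMMAR_RULES`, `semantic_type` and `should_block` are hypothetical names.

```python
import re

# Hypothetical stand-ins: keyword hints play the role of the semantic model,
# regular expressions play the role of the per-type grammar model.
SEMANTIC_HINTS = {
    "sql_injection": ("select", "union", " and ", " or "),
    "xss": ("<script", "onerror", "javascript:"),
}
GRAMMAR_RULES = {
    "sql_injection": re.compile(r"'\s*(and|or)\s+\d+\s*=\s*\d+", re.IGNORECASE),
    "xss": re.compile(r"<script\b[^>]*>", re.IGNORECASE),
}


def semantic_type(fragment):
    """Stage 1: guess the target attack risk type, or None if nothing is suspected."""
    lowered = fragment.lower()
    for attack_type, hints in SEMANTIC_HINTS.items():
        if any(hint in lowered for hint in hints):
            return attack_type
    return None


def should_block(fragment):
    """Stage 2: block only if the fragment also fits that type's grammar rule."""
    attack_type = semantic_type(fragment)
    if attack_type is None:
        return False
    return GRAMMAR_RULES[attack_type].search(fragment) is not None


print(should_block("id=1' and 1=1"))           # True
print(should_block("name=sandy and friends"))  # False
```

The second call shows why the grammar stage matters in this sketch: the fragment looks suspicious at the semantic stage but does not conform to the rule for that attack type, so it is not blocked.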
Optionally, determining, according to the data area, whether to prevent the network access corresponding to the data area includes:
If it is determined that the network access includes a network attack, blocking the network access.
Optionally, determining, according to the data area, whether to prevent the network access corresponding to the data area includes:
If it is determined that the network access does not include a network attack, allowing the network access.
The present application also provides a data processing device, comprising:
A data set obtaining module, configured to obtain a data set characterizing a network access;
A data area identification module, configured to identify a data area in the data set that meets a preset condition;
A network access identification module, configured to determine, according to the data area, whether to prevent the network access corresponding to the data area.
The present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements one or more of the foregoing methods when executing the computer program.
The present application also provides a computer-readable storage medium storing a computer program, where the computer program implements one or more of the foregoing methods when executed by a processor.
In the embodiments of the present application, a data set characterizing a network access can be obtained, and the data set describes information related to the network access. By identifying whether the data set includes a data area that meets a preset condition, the data areas in the data set that carry network attack risk are identified, and whether to prevent the network access corresponding to a data area is then determined according to that data area. That is, by breaking the whole into parts, the data set describing the related information is first given a preliminary screening, and the data areas with network attack risk obtained from the screening are then used to identify the network access more precisely. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access cannot be identified when no exact match exists, removes the need to maintain a whitelist or blacklist manually, and improves the accuracy and reliability of network attack detection.
The above is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present application. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a data processing method according to Embodiment 1 of the present application;
Fig. 2 shows a flowchart of a data processing method according to Embodiment 2 of the present application;
Fig. 3 shows a flowchart of a data processing method according to an embodiment of the present application;
Fig. 4 shows a structural block diagram of a data processing device according to Embodiment 3 of the present application;
Fig. 5 shows a structural block diagram of an exemplary system according to an embodiment of the present application.
Detailed description
Exemplary embodiments of the present application are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present application, it should be understood that the present application may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present application can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
To help those skilled in the art understand the embodiments of the present application in depth, the technical terms involved in the embodiments are defined first.
Network access, also known as internet access, refers to the process in which two or more network devices (such as computers) link to each other or transfer data to each other over the Internet.
The data set of a network access can take the form of a character sequence or a matrix, and can characterize at least one of the following items of information about the accessed network device: domain name, IP (Internet Protocol) address, MAC (Media Access Control) address, URI (Uniform Resource Identifier), URL (Uniform Resource Locator) or URN (Uniform Resource Name). A URI uniquely identifies a resource, and URLs and URNs are subsets of URIs: a URL identifies a resource by the path used to obtain it, while a URN identifies a resource by name, and the URN of a resource does not change when the storage location of the resource changes. Of course, in practical applications, the data set can also characterize other information related to the network access.
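To make the kinds of information listed above concrete, the sketch below splits a URL into the host, path and query parameters that a data set might characterize. It uses Python's standard `urllib.parse` module; the function name `describe_access` and the example URL are illustrative assumptions and do not come from the source.

```python
from urllib.parse import urlsplit, parse_qsl


def describe_access(url):
    """Split a URL into parts that a data set could characterize."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "path": parts.path,
        # keep_blank_values retains parameters such as "name=" with no value.
        "query": parse_qsl(parts.query, keep_blank_values=True),
    }


info = describe_access("http://example.com/b.php?name=&id=1")
print(info)
```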
A data set may include information such as the above domain name, IP address, MAC address, URI, URL or URN, or information obtained by converting the above information through reduction, mapping or similar transformations, so as to characterize one or more network accesses. For example, the data set may correspond to multiple URLs and thereby characterize multiple network accesses. A data area is the region in which part of the data in the data set is located. In the embodiments of the present application, if a data area meets the preset condition, the data area may include data with network attack risk, and accordingly the network access characterized by the data set may include a network attack.
Because a data set can characterize one or more network accesses, the data included in the data set can correspond to one or more network accesses. Since a data area includes part of the data in the data set, a data area can likewise correspond to multiple network accesses. That is, in the embodiments of the present application, one or more network accesses can be detected through a single data set.
Reduction is a problem-solving approach that converts a complex or unknown problem into one or more simple or known problems, for example converting a longer or more complex character sequence into a shorter or simpler one. Mapping converts data in one form into data in another form. Of course, in practical applications, the above information can also be converted in other ways.
A network attack can be used to steal data or damage network devices. Common network attacks include XSS (Cross-Site Scripting) attacks, CSRF (Cross-Site Request Forgery) attacks, SQL (Structured Query Language) injection attacks, DoS (Denial of Service) attacks and redirection attacks. Identifying network attacks is therefore important: by identifying whether a network access is a network attack, the likelihood of suffering a network attack can be eliminated or reduced, the security of network devices and the network can be improved, and users' data can be kept safe.
Network devices may include mobile phones, smart watches, VR (Virtual Reality) devices, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, in-vehicle computers, desktop computers, set-top boxes, smart TVs, wearable devices and the like. A network device may include any of the devices of Fig. 3 or Fig. 4 below and implement any of the approaches of Figs. 1-3, so as to detect network access.
A client may include at least one application program. The client can run on a positioning device to implement the data processing method provided by the embodiments of the present application.
A plug-in may be included in an application program running on a positioning device to implement the data processing method provided by the embodiments of the present application.
The embodiments of the present application can be applied to scenarios in which network access is detected. In the prior art, detection that relies solely on a whitelist or blacklist easily produces false positives or false negatives, so the accuracy and reliability of detection are low. To solve this problem, the embodiments of the present application provide a data processing method that obtains a data set characterizing a network access, including at least one of the aforementioned domain name, IP address, MAC address, URI, URL and URN, and identifies the data set to obtain the data areas in it that meet a preset condition. A data area that meets the preset condition may include data with network attack risk, and the network access characterized by the data set may include a network attack, so whether to prevent the network access corresponding to the data area can be further determined according to the data area. In other words, by breaking the whole into parts, the data set characterizing the network access is given a preliminary screening according to the preset condition, which yields the data areas that may include data with network attack risk, and whether to prevent the corresponding network access is then determined according to those data areas. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access cannot be identified when no exact match exists, removes the need to maintain a whitelist or blacklist manually, and improves the accuracy and reliability of network attack detection.
Of course, in practical applications, the data processing method provided by the embodiments of the present application is not limited to identifying whether a network access includes a network attack and then determining whether to prevent it. It can also identify network accesses for any other purpose, or be used to process data sets characterizing other information, for example identifying whether an article contains a text fragment with a certain specific meaning.
The embodiments of the present application can be implemented as a client or a plug-in. A network device can obtain the client or plug-in from a remote server and install it, and then carry out the data processing method provided by the embodiments of the present application through the client or plug-in. Of course, the embodiments of the present application can also be deployed on a remote server in the form of data processing software, and a positioning device can obtain the data processing service by accessing the remote server.
Embodiment 1
Referring to Fig. 1, a flowchart of a data processing method according to an embodiment of the present application is shown. The specific steps include:
Step 101: obtain a data set characterizing a network access.
A network access may include a network attack and bring losses to network devices, the network and users. Because the data set of a network access describes information related to that network access, the data set characterizing the network access can be obtained so that the network access can subsequently be identified from it, reducing the likelihood of suffering a network attack and improving the security of network devices and the network.
The network device can obtain the data set characterizing the network access from a network request. The network device can monitor the network requests it sends and/or receives, and collect a network request together with the data it carries as the data set characterizing the network access.
A network device such as a personal computer can act as a terminal that initiates network access, so such a device may monitor only the network requests it sends. A network device such as a server can act as the accessed object and may receive network accesses from multiple terminals, so such a device may monitor only the network requests it receives.
A network request is based on a network transmission protocol and may include a request based on HTTP (Hypertext Transfer Protocol), FTP (File Transfer Protocol) or ICMP (Internet Control Message Protocol). Of course, in practical applications, the network request may also include requests based on other network transmission protocols.
Step 102: identify the data areas in the data set that meet a preset condition.
As described above, a network access can be characterized by a data set, so whether the network access includes a network attack can be identified by analyzing whether the data set has features associated with network attack risk. A preset condition can therefore be set in advance according to the features of data in a data set that has network attack risk and/or the features of data that does not. The data set is then identified according to the preset condition to obtain the data areas that meet it, which include data that may have network attack risk. This makes it possible to subsequently identify, from the data in those areas, whether the network access includes a network attack and to determine whether to prevent it. The data set is thus given a preliminary screening by breaking the whole into parts, and the decision on whether to prevent the network access is made directly from the data areas obtained by the screening. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access cannot be identified when no exact match exists, so the network access can be identified more accurately. It also removes the need to maintain a whitelist or blacklist manually, and therefore improves the accuracy and reliability of network attack detection.
The preset condition is a feature possessed by data areas that include data with network attack risk. For example, the preset condition may be that the data area includes a character sequence made up of specific characters in a specific order, or an arrangement in which specific characters occupy specific positions. If a data area meets the preset condition, the data area includes data with network attack risk; otherwise, it does not.
The network device can identify the data in the data set directly to find the data areas that meet the preset condition. Alternatively, it can first divide the data set into multiple data areas and identify each data area to determine whether that data area meets the preset condition.
The network device can divide the characters of the character sequence included in the data set into multiple data areas according to the order of the characters in the sequence, and then identify each data area to determine whether it meets the preset condition.
The network device can divide a given number of adjacent characters in the character sequence into one data area, and at least one pair of consecutive data areas may share overlapping characters, fewer than the given number.
The given number may be determined in advance by the network device, for example from a submitted value that it receives.
Of course, in practical applications, the network device can divide the character sequence included in the data set according to multiple given numbers, obtaining multiple data areas with different numbers of characters. This allows finer-grained identification of the data set and improves the accuracy of identifying the data areas with network attack risk that the data set includes.
For example, if the data set includes b.php?name=&id=1' and 1=1, the divided data areas may include "id=1' and 1=1", "php?name=&id" and "ame=&id=1' and".
For example, suppose the character sequence included in the data set is asdfghjkl and the given numbers are 3 and 4. Dividing the character sequence according to the given number 3, the network device obtains the data areas asd, sdf, dfg, fgh, ghj, hjk and jkl; dividing it according to the given number 4, the network device obtains the data areas asdf, sdfg, dfgh, fghj, ghjk and hjkl.
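The asdfghjkl example above corresponds to a sliding window with a stride of one character. A minimal sketch follows; the function name `sliding_regions` is hypothetical, and the stride of one is inferred from the listed data areas, not stated in the source.

```python
def sliding_regions(sequence, sizes):
    """Return {size: [windows]} using a stride of one character."""
    return {
        n: [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
        for n in sizes
    }


regions = sliding_regions("asdfghjkl", (3, 4))
print(regions[3])  # ['asd', 'sdf', 'dfg', 'fgh', 'ghj', 'hjk', 'jkl']
print(regions[4])  # ['asdf', 'sdfg', 'dfgh', 'fghj', 'ghjk', 'hjkl']
```

With a stride of one, consecutive windows of size n overlap in n - 1 characters, which is consistent with the statement that overlapping characters number fewer than the given number.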
The network device can use a model such as a neural network model to identify whether the data set contains data areas that meet the preset condition. The neural network model can take as input the data set or the data areas divided from the data set, and output the data areas that meet the preset condition, data areas annotated with their matching degree against the preset condition, or both the data areas that meet the preset condition and those that do not.
The matching degree indicates the degree to which a data area meets the preset condition, and thus the probability that the data area includes data with network attack risk. The higher the matching degree, the higher the probability that the data area includes data with network attack risk.
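One way to use the matching degree is to keep only the data areas whose score reaches a threshold. The sketch below assumes the scores have already been produced by some model; the threshold value, the example scores and the name `select_regions` are illustrative assumptions and are not specified in the source.

```python
def select_regions(scored_regions, threshold=0.5):
    """Keep regions whose matching degree meets the threshold, highest first."""
    kept = [(region, score) for region, score in scored_regions if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical model output: (data area, matching degree) pairs.
scored = [("php?name=&id", 0.12), ("id=1' and 1=1", 0.93), ("ame=&id=1' and", 0.61)]
print(select_regions(scored))  # [("id=1' and 1=1", 0.93), ("ame=&id=1' and", 0.61)]
```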
It can be in advance using data acquisition system or the data area divided by data acquisition system as sample, wherein having network The data area of risk of attacks (meeting preset condition) or the data acquisition system including having the data area of network attack risk Labeled as positive sample, data area without network attack risk or include the data area without network attack risk Data acquisition system is labeled as negative sample, identifies in data acquisition system whether have to neural network model by the positive sample and negative sample The data area for meeting the preset condition is trained.Certainly, in practical applications, may not be all samples simply Labeled as positive sample and negative sample, but each sample is marked according to the matching degree with preset condition, and pass through label Sample afterwards is trained the neural network model.
Wherein, neural network model is widely interconnected by a large amount of, simple processing unit (referred to as neuron) And the complex networks system formed, it reflects many essential characteristics of human brain function, is a highly complex Nonlinear Dynamic Mechanics learning system.Neural network has large-scale parallel, distributed storage and processing, self-organizing, adaptive, self-study and extensive energy Power is particularly suitable for processing and needs to consider simultaneously many factors and condition, inaccurate and fuzzy information-processing problem, and with divide Class device is compared, be capable of providing it is more abundant output as a result, and the preset condition can by training obtain, user can not feel Know the preset condition, also there is no need to add rule and it is artificial participate in, therefore, the advantages of being based on above-mentioned neural network model, Identification process can be made to participate in or add rule without artificial, improve the efficiency and accuracy of identification.
Of course, in practical applications, whether a data set contains a data area meeting the preset condition can also be identified in other ways, for example by identifying data areas with YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector), by identifying data areas with a classifier, or by comparing each data area against a whitelist and/or a blacklist to determine whether the data area meets the preset condition.
Here, YOLO and SSD are each the name of an object detection model.
Step 103, determine, according to the data area, whether to prevent the network access corresponding to the data area.
Because the data set characterizing a network access may contain a data area meeting the preset condition, the network access may itself include a network attack. The corresponding network access can therefore be identified according to the data area to determine whether to prevent it.
Based on the advantages of the neural network model described above, and in order to improve the accuracy and efficiency of identification, the network device can use a neural network model to identify the network access according to the data area with a network attack risk, so as to determine whether to prevent the network access corresponding to the data area. The neural network model receives the data area as input and outputs a recognition result for the network access, namely preventing or allowing the network access. Of course, in practical applications the recognition result may include at least one of: whether to prevent or allow the network access, whether a network attack is included, the probability that a network attack is included, the network attack risk type, and the specific kind of network attack within that risk type.
Data areas in data sets characterizing network accesses can be used as samples: data areas from data sets that include a network attack are labeled as positive samples, and data areas from data sets that do not include a network attack are labeled as negative samples. The positive and negative samples are used to train the neural network to identify, from a data area with a network attack risk, whether the network access includes a network attack. Of course, in practical applications the samples need not simply be labeled as positive or negative; each sample may instead be labeled with at least one of whether a network attack is included, the probability that a network attack is included, the network attack risk type, and the specific kind of network attack within that risk type, and the neural network model trained on the samples so labeled.
As described above, a data area with a network attack risk in a data set can be identified by a model such as a neural network model. In the embodiments of the present application, a network attack can therefore be detected by two models: one identifies the data area meeting the preset condition in the data set, and the other determines, from the identified data area, whether to prevent the network access corresponding to that data area. Of course, in practical applications a single model can also both identify the data area meeting the preset condition and determine, from the identified data area, whether to prevent the corresponding network access. If the network access is identified by multiple models, each step of the identification can be precisely controlled through its own model, for example by calibrating the corresponding model according to the recognition result of each step, which can further improve the accuracy of network attack detection. If the network access is identified by a single model, fewer models need to be trained and fewer training samples need to be collected, which can further improve identification efficiency and reduce manual participation.
Of course, in practical applications, the data in the data area can also be identified in other ways to determine whether the network access corresponding to the data area includes a network attack, for example by at least one of a classifier, a blacklist and a whitelist. If a machine-learning-based classifier is used, a recognition result of 1 means that the data will cause an attack and the network access includes a network attack; a recognition result of 0 means that the data will not cause an attack and the network access does not include a network attack.
In addition, if it is determined that the data set characterizing the network access contains a data area meeting the preset condition, the network device can also determine, in the same way as described above for a data area, whether to prevent the network access characterized by the data set according to the whole data set. That is, all data in the data set characterizing the network access are identified, so that the network access is identified comprehensively, which can improve the reliability of detection.
In the embodiments of the present application, a data set characterizing a network access can be acquired; the data set describes information related to the network access. Identifying whether the data set contains a data area meeting the preset condition identifies the data areas in the data set that have a network attack risk, and whether to prevent the network access corresponding to a data area is then determined according to that data area. In other words, the data set is first screened by breaking it into parts, and the network access is then identified more precisely according to the data areas with a network attack risk obtained by the screening. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access is hard to identify when no exact match exists, requires no manual maintenance of a whitelist or blacklist, and improves the accuracy and reliability of network attack detection.
Embodiment two
Referring to Fig. 2, a flow chart of a data processing method according to an embodiment of the present application is shown. The specific steps include:
Step 201, extract the access address of the network access and determine the data set to which the access address is mapped.
The access address is a character sequence that includes various characters such as letters, digits and punctuation marks. The access address of the network access can be extracted and then converted in form through mapping to obtain the data set.
As described above, the access address of a network access may include a domain name, an IP address, a MAC address, a URL or the like, and can be extracted from a monitored network request.
The characters contained in the access address can be mapped through a predetermined mapping relation to obtain the data set.
The mapping relation is used to convert the characters contained in the access address, including mapping one character in the access address to one character, or to a combination of several characters, of the same type or of another type. The mapping relation can be determined in advance, for example by receiving submitted rules, and can take the form of a formula, a mathematical model or a list.
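A mapping relation in list form can be sketched as a lookup table. This is only an illustration: the table contents and the names `CHAR_MAP` and `map_address` are assumptions and are not defined in the application.

```python
from typing import Dict, List

# Hypothetical mapping table: one character of the access address maps to one
# character or to a combination of characters (the values are illustrative only).
CHAR_MAP: Dict[str, str] = {"/": "S", "?": "Q", ".": "DT"}


def map_address(address: str, table: Dict[str, str] = CHAR_MAP) -> List[str]:
    """Map each character through the table; unmapped characters map to themselves."""
    return [table.get(ch, ch) for ch in address]


mapped = map_address("a.b/c")
```

Keeping one output element per input character preserves the order of the characters, so positions in the mapped data set still correspond to positions in the access address.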
In the embodiments of the present application, optionally, so that the data matrix can subsequently be identified by machine vision and the network access thereby identified, the character vector corresponding to each character in the access address can be determined, and the character vectors of the characters combined into a data matrix corresponding to the access address, which serves as the data set.
Vectorizing the access address means expressing the character sequence mathematically. The character sequence can be split into individual characters; a character vector is a common way of expressing a character mathematically, that is, of representing one character as one vector.
Specifically, the characters in the access address can be vectorized with a VSM (Vector Space Model) or with the word-vector tool word2vec, converting the access address into a data matrix composed of character vectors. Identifying the data matrix then serves to identify the meaning expressed by the access address.
The character vector of each character can be initialized according to a set character-vector dimension. The dimension is the number of components of the vector and is the same for every character vector. Specifically, the value in each dimension of each character's vector can be generated randomly.
For example, with a character-vector dimension of 20 and an access address that includes the characters "h", "a" and "l": "h" is represented as the character vector (d1=0.001, d2=-0.077, d3=-0.907, d4=0.189, d5=-0.456, ..., d20=0.354), "a" is represented as the character vector (d1=0.335, d2=-0.125, d3=-0.110, d4=0.136, d5=-0.590, ..., d20=0.248), and "l" is represented as the character vector (d1=0.223, d2=-0.345, d3=-0.456, d4=0.567, d5=-0.567, ..., d20=-0.423).
As another example, if the access address includes the letter e, vectorizing the access address with the VSM model may yield the vector [0.1, 0.21, 0.13, 0.3] for the letter e.
Of course, in practical applications, the character vectors corresponding to the characters in the access address can also be determined in other ways. The character vectors obtained are arranged in the order of the corresponding characters in the access address to obtain the data matrix corresponding to the access address.
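The random initialization and row-by-row arrangement described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_data_matrix`, the value range [-1, 1] and the fixed seed are not taken from the application, and trained embeddings such as word2vec would replace the random values in practice.

```python
import random
from typing import Dict, List


def build_data_matrix(address: str, dim: int = 20, seed: int = 0) -> List[List[float]]:
    """Return one row (character vector) per character, in address order."""
    rng = random.Random(seed)
    vectors: Dict[str, List[float]] = {}
    matrix: List[List[float]] = []
    for ch in address:
        if ch not in vectors:
            # Randomly generate the value in each dimension for a new character.
            vectors[ch] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Every occurrence of the same character reuses the same vector.
        matrix.append(list(vectors[ch]))
    return matrix


matrix = build_data_matrix("hal", dim=20)
```

With this layout, row i of the matrix corresponds to character i of the access address, which is what allows a region of the matrix to be traced back to a fragment of the address later.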
In the embodiments of the present application, optionally, the access address may include many kinds of characters, some of which have only a formal function and no actual semantics, that is, meaningless characters, for example the "http://" or "=" contained in the access address. In order to reduce the amount of data to be processed and improve the efficiency of network attack detection, and to keep these meaningless characters from interfering with the detection and thereby improve its accuracy, the meaningless characters in the access address can be removed before the character vector corresponding to each character in the access address is determined.
A meaningless character is a character in the access address that has only a formal function and no actual semantics.
Submitted characters can be received in advance as samples of meaningless characters, and the access address filtered against these samples to delete the meaningless characters it contains.
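Filtering against a sample list can be sketched as below. The sample tuple and the name `strip_meaningless` are illustrative assumptions; the application itself only names the scheme prefix and "=" as examples, and "https://" is added here as a further assumed sample.

```python
from typing import Iterable

# Hypothetical samples of meaningless tokens submitted in advance.
MEANINGLESS_SAMPLES = ("http://", "https://", "=")


def strip_meaningless(address: str, samples: Iterable[str] = MEANINGLESS_SAMPLES) -> str:
    """Delete every occurrence of each sample token from the access address."""
    # Remove longer samples first so a short sample cannot break up a longer one.
    for token in sorted(samples, key=len, reverse=True):
        address = address.replace(token, "")
    return address


cleaned = strip_meaningless("http://example.com/a?id=1")
```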
In the embodiments of the present application, optionally, in order to detect network accesses in batches and improve the efficiency of detection, a data set characterizing multiple network accesses can be obtained.
Multiple data sets, each characterizing one network access, can be obtained in the manner described above and then merged to obtain a data set characterizing multiple network accesses.
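The application does not say how the data sets are merged. One possible sketch, assuming each data set is a matrix with one row per character, stacks the matrices vertically and records which rows belong to which network access; `merge_data_sets` and the span bookkeeping are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def merge_data_sets(matrices: List[Matrix]) -> Tuple[Matrix, List[Tuple[int, int]]]:
    """Stack per-access matrices; also return each access's row span."""
    merged: Matrix = []
    spans: List[Tuple[int, int]] = []
    for m in matrices:
        start = len(merged)
        merged.extend(m)
        spans.append((start, len(merged)))  # half-open row range [start, end)
    return merged, spans


merged, spans = merge_data_sets([[[1.0], [2.0]], [[3.0]]])
```

Recording the row spans is a design choice that lets a data area found in the merged matrix be attributed to the network access, or accesses, it came from.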
Step 202, identify the data area meeting the preset condition in the data set.
For the manner of identifying the data area meeting the preset condition in the data set, refer to the related description above, which is not repeated here.
In the embodiments of the present application, optionally, the data set may include a character sequence whose characters, taken in order, may carry specific semantics, for example forming a domain name or a URN. So that each character keeps the same semantics within its data area as it has in the data set, and so that the data set is identified at a finer granularity with less data to be identified subsequently, thereby improving the accuracy and efficiency both of dividing the data set into data areas and of identifying the data areas, the data set can be divided into multiple data areas according to the positions of the data in the data set, and the data areas meeting the preset condition identified from among the multiple data areas.
If the data set includes a character sequence, the position of data in the data set can be the order of a character in the character sequence. For the manner of dividing the data set into data areas, refer to the related description above, which is not repeated here.
If the data set includes the data matrix described above, the position of data in the data set can be the position coordinates of a character in the matrix. Accordingly, the area of the data matrix lying in at least one consecutive row and at least one consecutive column can be taken as a data area. Of course, in practical applications, in order to improve the accuracy of identifying the data set, the data matrix can also be divided into multiple data areas in other ways, for example by taking submatrices of the data matrix as data areas.
As described above, a data set can characterize multiple network accesses, so a data area obtained by dividing the data set may also correspond to one or more network accesses.
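One simple way to take consecutive rows as data areas is a sliding window over the rows of the matrix. The sketch below assumes each area spans all columns and is described by a half-open row range; the window height, the stride and the name `split_rows` are illustrative assumptions rather than anything fixed by the application.

```python
from typing import List, Tuple


def split_rows(num_rows: int, height: int, stride: int) -> List[Tuple[int, int]]:
    """Return half-open row ranges [start, end) covering all rows of the matrix."""
    if height <= 0 or stride <= 0:
        raise ValueError("height and stride must be positive")
    regions: List[Tuple[int, int]] = []
    start = 0
    while start < num_rows:
        end = min(start + height, num_rows)
        regions.append((start, end))
        if end == num_rows:
            break  # the last window reached the final row
        start += stride
    return regions


regions = split_rows(10, height=4, stride=2)
```

A stride smaller than the window height makes neighbouring areas overlap, which reduces the chance that a meaningful run of characters is cut in two by an area boundary.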
In the embodiments of the present application, optionally, dividing the data matrix may place characters that together express one complete meaning in different data areas, for example the letters of one word, or the letters of the several words that make up one instruction. Therefore, if several data areas all have a network attack risk, the characters in those data areas may together express the complete semantics of one network attack; two data areas may also contain overlapping characters. In order to improve the accuracy and reliability of network attack detection, the data areas meeting the preset condition can therefore be merged according to position information.
The position information may include the position coordinates, in the data matrix, of the characters in a data area.
When multiple data areas with a network attack risk are obtained, whether any two data areas contain characters at overlapping positions can be judged from the position coordinates in the data matrix of the characters in each data area; if so, the two data areas are merged according to the overlapping positions.
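Merging by overlapping positions can be sketched as interval merging. The sketch assumes each data area is described by a half-open row range (start, end) in the data matrix, so that two areas overlap when they share at least one row; the name `merge_overlapping` is hypothetical.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # half-open row range [start, end)


def merge_overlapping(regions: List[Region]) -> List[Region]:
    """Merge data areas that share at least one row position."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start < merged[-1][1]:
            # Overlaps the previous area: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


merged_regions = merge_overlapping([(6, 10), (0, 4), (2, 6)])
```

Areas that merely touch, such as (0, 6) and (6, 10), share no character and are left separate under this assumption.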
In the embodiments of the present application, optionally, as described above, a neural network model offers massive parallelism, distributed storage and processing, self-organization, adaptivity and self-learning, can provide richer output results, and requires neither added rules nor manual participation. In order to improve the accuracy and efficiency of identification, the data area meeting the preset condition in the data set can therefore be identified by a neural network model (denoted the first neural network model).
The first neural network model receives the submitted data matrix as the data set and outputs the data area with the highest matching degree with the preset condition, or the data areas whose matching degree is higher than a first threshold. If more than one data area is to be output, whether the data areas can be merged can also be judged, those that can be merged are merged, and the merged data areas are output. Of course, in practical applications, after the data matrix is received as the data set, it can also be divided into multiple data areas, each data area identified, and the matching degree between each data area and the preset condition determined.
Multiple data matrices serving as data sets characterizing network accesses can be obtained in advance. Data matrices containing a data area with a network attack risk are used as positive samples and data matrices containing no data area with a network attack risk as negative samples, and the first neural network model is trained with these positive and negative samples to identify data matrices.
The first threshold can be determined in advance, for example from a submitted value. For example, the first threshold may be 50%.
In the embodiments of the present application, optionally, so that technical personnel can correct the first neural network model according to the recognition result, thereby improving the accuracy and reliability of identifying data areas with a network attack risk, the data areas determined to meet the preset condition can be highlighted so that the recognition result is shown intuitively.
Of course, in practical applications, the data areas meeting the preset condition can also be emphasized in other ways, for example by underlining or bolding the data of the data area.
In addition, first extracting the data area with a network attack risk from the data set and then identifying the network access according to that data area, in the manner described below, makes it possible to locate precisely where an attack lies in the data set and to identify only the data at that position. This avoids processing the normal data in the data set and improves the efficiency and accuracy of network attack detection.
In the embodiments of the present application, optionally, as described above, a data set can characterize multiple network accesses. To ensure that multiple network accesses are detected and to improve the reliability of network attack detection, the data areas in the data set that meet the preset condition and correspond to the multiple network accesses can be identified.
Because the data set can characterize multiple network accesses, the data areas obtained by identifying the data set in the manner described above may correspond to multiple network accesses.
Step 203, identify, according to the data area, whether the corresponding network access includes a network attack; if so, perform step 204, otherwise perform step 205.
For the manner of identifying, according to the data area with a network attack risk, whether the network access includes a network attack, refer to the related description above, which is not repeated here.
In the embodiments of the present application, optionally, the access address of a network access includes a character sequence that may indicate the object accessed or obtained and the purpose of the access, that is, it carries different semantics. When a network access includes a network attack, its access address may indicate the attack object or the attack purpose of the network attack. Therefore, in order to detect multiple types of network attack precisely by breaking the data set into parts, to improve the accuracy and reliability of network attack detection, and to make it easier to take a corresponding protective measure for each kind of network attack, semantic recognition can be performed on a data area determined to meet the preset condition to obtain the target attack risk type corresponding to the data area.
In the embodiments of the present application, optionally, as described above, a neural network model offers massive parallelism, distributed storage and processing, self-organization, adaptivity and self-learning, can provide richer output results, and requires neither added rules nor manual participation. In order to improve the accuracy and efficiency of identification, a neural network model (denoted the second neural network model) can therefore be used to perform semantic recognition on the data area and obtain the target attack risk type corresponding to the data area.
Performing semantic recognition on a data area with the second neural network model may include determining the probability that the data area belongs to each of multiple attack risk types, and then either determining the attack risk type with the highest probability as the target attack risk type corresponding to the data area, or determining each attack risk type whose probability is greater than a second threshold as a target attack risk type corresponding to the data area.
Multiple data areas corresponding to different attack risk types can be collected in advance as samples to train the second neural network model to identify data areas.
The second threshold can be determined in advance, for example from a value submitted by a user.
Of course, in practical applications, semantic recognition can also be performed on the data area with a classifier to determine the corresponding target attack risk type.
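The two selection rules, highest probability or probability above the second threshold, can be sketched as follows. The risk-type names in the usage line and the function name `pick_risk_types` are illustrative assumptions; the application does not enumerate attack risk types, and the probabilities would come from the second neural network model.

```python
from typing import Dict, List, Optional


def pick_risk_types(probs: Dict[str, float],
                    second_threshold: Optional[float] = None) -> List[str]:
    """Select target attack risk types from per-type probabilities."""
    if not probs:
        return []
    if second_threshold is None:
        # Rule 1: the single type with the highest probability.
        return [max(probs, key=lambda t: probs[t])]
    # Rule 2: every type whose probability exceeds the second threshold.
    return [t for t, p in probs.items() if p > second_threshold]


# Hypothetical model output for one data area.
probs = {"sql_injection": 0.7, "xss": 0.6, "path_traversal": 0.1}
top = pick_risk_types(probs)
above = pick_risk_types(probs, second_threshold=0.5)
```

The threshold rule can return several types, or none, for one data area, whereas the highest-probability rule always returns exactly one when any probabilities are given.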
In the embodiments of the present application, optionally, the address fragment to which the data area maps in the access address of the network access can be determined and taken as the address fragment with a network attack risk; accordingly, semantic recognition can be performed on the address fragment with a network attack risk.
An address fragment consists of some consecutive characters of the access address. As described above, the characters contained in the access address are mapped through a predetermined mapping relation to obtain the converted data set; the characters contained in a data area can therefore be restored according to that mapping relation to obtain the corresponding address fragment of the access address.
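Restoring an address fragment amounts to applying the mapping relation in reverse. The sketch below assumes the mapping is a table from each character to a distinct vector (stored as a tuple so it can be used as a dictionary key) and that a data area is a run of consecutive rows; `restore_fragment` and the example vectors are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def restore_fragment(rows: List[Vector], char_vectors: Dict[str, Vector]) -> str:
    """Invert the character-to-vector mapping for the rows of one data area."""
    # Assumes every character has a distinct vector, so the inverse is well defined.
    inverse = {vec: ch for ch, vec in char_vectors.items()}
    return "".join(inverse[row] for row in rows)


char_vectors = {"a": (0.1, 0.2), "d": (0.3, 0.4), "m": (0.5, 0.6)}
fragment = restore_fragment([(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)], char_vectors)
```

If the original access address is still available, slicing it by the row range of the data area gives the same fragment without needing the inverse table.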
The address fragment obtained by the mapping can be compared with a predetermined blacklist and/or whitelist to determine whether the address fragment includes a network attack; if it does, the corresponding network access is determined to include a network attack. Alternatively, whether the address fragment includes a network attack can be judged with a classifier or a neural network model.
Performing semantic recognition on an address fragment with the second neural network model may include determining the probability that the address fragment belongs to each of multiple attack risk types, and then either determining the attack risk type with the highest probability as the target attack risk type corresponding to the address fragment, or determining each attack risk type whose probability is greater than the second threshold as a target attack risk type corresponding to the address fragment.
Multiple address fragments corresponding to different attack risk types can be collected in advance as samples to train the second neural network model to identify address fragments.
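A blacklist and whitelist comparison of an address fragment can be sketched as below. The matching policy (exact match for the whitelist, substring match for the blacklist), the list entries and the name `check_fragment` are all assumptions made for illustration; the application does not specify how the comparison is carried out.

```python
from typing import Optional, Set


def check_fragment(fragment: str, blacklist: Set[str],
                   whitelist: Set[str]) -> Optional[bool]:
    """True: includes an attack; False: known safe; None: undecided by the lists."""
    if fragment in whitelist:
        return False
    if any(bad in fragment for bad in blacklist):
        return True
    # Neither list decides; a classifier or neural network model could be used next.
    return None


# Hypothetical list contents.
result = check_fragment("a/../etc", blacklist={"../"}, whitelist={"index.html"})
```

Returning an explicit undecided value keeps the lists from silently allowing fragments they know nothing about.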
In the embodiments of the present application, optionally, the access address of a network access includes a character sequence that may also indicate the manner in which an object is accessed or obtained, that is, it follows different grammars. When a network access includes a network attack, its access address may indicate the attack manner of the network attack. Therefore, in order to detect multiple types of network attack precisely by breaking the data set into parts, to improve the accuracy and reliability of network attack detection, and to make it easier to take a corresponding protective measure for each kind of network attack, grammar recognition can be performed on a data area determined to meet the preset condition, and whether to prevent the network access corresponding to the data area determined according to the grammar recognition result.
In the embodiments of the present application, optionally, as described above, a neural network model offers massive parallelism, distributed storage and processing, self-organization, adaptivity and self-learning, can provide richer output results, and requires neither added rules nor manual participation. In order to improve the accuracy and efficiency of identification, the neural network model corresponding to the target attack risk type (denoted the third neural network model) can therefore be used to perform grammar recognition on the data area, and whether to prevent the network access corresponding to the data area determined according to the grammar recognition result.
When grammar recognition is performed on a data area with the third neural network model, an operation result of 1 means that the data area includes a network attack of the target attack risk type, and hence that the network access includes a network attack; an operation result of 0 means that the data area does not include a network attack of the target attack risk type. Of course, in practical applications, the third neural network model can also compute the probability that the data area includes the target attack risk type; the probability can be shown to a user, who determines whether the data area includes the target attack risk type, or whether the data area includes the target attack risk type can be determined from the probability in other ways.
Multiple data areas can be obtained in advance, with data areas that include the target attack risk type used as positive samples and data areas that do not include the target attack risk type used as negative samples; the positive and negative samples are used to train the third neural network model to identify whether a data area includes a network attack of the target attack risk type.
Of course, in practical applications, grammar recognition can also be performed on the data area with a classifier to determine whether the data area conforms to the grammar rules of the target attack risk type.
In the embodiments of the present application, optionally, the address fragment to which the data area maps in the access address of the network access can be determined and taken as the address fragment with a network attack risk; accordingly, grammar recognition is performed on the address fragment with a network attack risk to determine whether it conforms to the grammar rules of the target attack risk type.
When grammar recognition is performed on an address fragment with the third neural network model, an operation result of 1 means that the address fragment includes a network attack of the target attack risk type, and hence that the network access includes a network attack; an operation result of 0 means that the address fragment does not include a network attack of the target attack risk type. Of course, in practical applications, the third neural network model can also compute the probability that the address fragment includes the target attack risk type; the probability can be shown to a user, who determines whether the address fragment includes the target attack risk type, or whether the address fragment includes the target attack risk type can be determined from the probability in other ways.
Multiple address fragments can be obtained in advance, with address fragments that include the target attack risk type used as positive samples and address fragments that do not include the target attack risk type used as negative samples; the positive and negative samples are used to train the third neural network model to identify whether an address fragment includes a network attack of the target attack risk type.
In addition, as described above, the embodiments of the present application can identify a network access with three neural network models to judge whether the network access includes a network attack. The first neural network model identifies, in the data set characterizing the network access, the data area that meets the preset condition, that is, the data area with a network attack risk. The second neural network model identifies the target attack risk type corresponding to the data in the data area. The third neural network model identifies whether the data in the data area conforms to the grammar rules of the target attack risk type. Of course, in practical applications, in order to improve the accuracy or efficiency of identification, the network access can also be identified with more or fewer neural network models; for example, a fourth neural network model can perform both semantic and grammar recognition on the data in the data area, thereby realizing the functions of the second and third neural network models at the same time.
Step 204, block the network access.
Because the network access includes a network attack, it may damage the network or steal user data. In order to safeguard the network and the user's property, the network access can therefore be blocked.
The network access can be blocked from continuing by disconnecting the network connection, by intercepting the network access with a firewall, or by similar means.
Step 205, allow the network access.
Because the network access does not include a network attack, it is safe. In order to ensure that the services in the network can be processed normally and to improve the reliability of the network, the network access can therefore be allowed.
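Steps 202 to 205 can be summarized in a short control-flow sketch. The three models are passed in as plain callables because the application does not fix their architecture; `handle_access`, the callable signatures and the row-range representation of data areas are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Region = Tuple[int, int]           # half-open row range of a data area
Rows = Sequence[Sequence[float]]   # rows of the data matrix


def handle_access(
    matrix: Rows,
    find_regions: Callable[[Rows], List[Region]],      # stands in for the first model
    classify_type: Callable[[Rows], str],              # stands in for the second model
    grammar_checks: Dict[str, Callable[[Rows], int]],  # one third model per risk type
) -> str:
    """Return "block" (step 204) or "allow" (step 205) for one network access."""
    for start, end in find_regions(matrix):        # step 202: risky data areas
        rows = matrix[start:end]
        risk_type = classify_type(rows)            # semantic recognition
        if grammar_checks[risk_type](rows) == 1:   # grammar recognition, 1 = attack
            return "block"
    return "allow"
```

Stopping at the first data area confirmed as an attack reflects the description above, in which a single data area that includes a network attack is enough for the network access to be blocked.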
In the embodiments of the present application, first, a data set characterizing a network access can be acquired; the data set describes information related to the network access. Identifying whether the data set contains a data area meeting the preset condition identifies the data areas in the data set that have a network attack risk, and whether to prevent the network access corresponding to a data area is then determined according to that data area. In other words, the data set is first screened by breaking it into parts, and the network access is then identified more precisely according to the data areas with a network attack risk obtained by the screening. Compared with identifying a network access solely by exactly matching its access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access is hard to identify when no exact match exists, requires no manual maintenance of a whitelist or blacklist, and improves the accuracy and reliability of network attack detection.
Secondly, the access address of network access can be extracted, and determine access address mapped data acquisition system, therefore energy It is enough that many and diverse or difficult to deal with character string for including in access address is converted into single or easy-to-handle character sequence Column or matrix convenient for identifying to data acquisition system improve the accuracy and reliability detected to network attack.
In addition, the character string contained in the access address can be vectorized, so that the character string is converted into a data matrix that serves as the data set. The data matrix can then be identified by machine vision techniques, thereby achieving the purpose of identifying the network access.
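A minimal sketch of this vectorization step, assuming a fixed character set and placeholder vectors. The character set `ALPHABET`, the helper names and the vector values are assumptions made for illustration. The application does not say how the character vectors are obtained, and a trained embedding would normally take the place of the placeholder values.

```python
import string
from typing import Dict, List

# Assumed character set: letters, digits and a few symbols (hypothetical choice).
ALPHABET = string.ascii_letters + string.digits + "+-=,./;'?!#%^*()&"

def build_dictionary(n_dims: int) -> Dict[str, List[float]]:
    """Build a dictionary with one N-dimensional vector per character (an M x N table).

    The values are deterministic placeholders, not learned embeddings.
    """
    return {
        ch: [((index + 1) * (dim + 1) % 97) / 97.0 for dim in range(n_dims)]
        for index, ch in enumerate(ALPHABET)
    }

def vectorize(sequence: str, dictionary: Dict[str, List[float]], n_dims: int) -> List[List[float]]:
    """Map each character to its vector; characters outside the dictionary map to zeros."""
    unknown = [0.0] * n_dims
    return [dictionary.get(ch, unknown) for ch in sequence]

D = build_dictionary(n_dims=8)
mat1 = vectorize("b.php?name=&id=1' and 1=1", D, n_dims=8)
print(len(mat1), len(mat1[0]))  # one row per character, eight columns per row
```

Because each row corresponds to exactly one character, a span of rows in the matrix can later be mapped back to a substring of the access address.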
In addition, the data set can be divided into multiple data areas according to the position of the data in the data set, and each data area can then be identified. This ensures that the characters in each data area express the same semantics as they do in the data set, while subdividing the data set reduces the amount of data that must subsequently be identified. The accuracy and efficiency of identifying each divided data area are thereby improved, which in turn improves the accuracy and efficiency of identifying whether the data set includes a data area with network attack risk.
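One possible way to divide by position is a sliding window over the rows of the data matrix. The sketch below returns half-open row spans. The function name and the `window` and `stride` parameters are hypothetical, since the application does not specify how the positions are chosen.

```python
from typing import List, Tuple

def split_into_regions(length: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) spans covering positions 0..length with a sliding window."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if length <= 0:
        return []
    if length <= window:
        return [(0, length)]  # the whole data set fits in one region
    spans = [(start, start + window) for start in range(0, length - window + 1, stride)]
    if spans[-1][1] < length:
        # Add a final window aligned to the end so the tail is not skipped.
        spans.append((length - window, length))
    return spans

print(split_into_regions(25, window=10, stride=5))
```

Overlapping windows are used here so that an attack fragment that straddles a window boundary still falls entirely inside at least one region, provided it is not longer than the overlap allows.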
In addition, the data set can be identified by neural network models to obtain the data area with network attack risk, and the network access can be identified according to that data area. This provides richer identification results, and the identification process requires neither manual participation nor added rules, which improves the accuracy and efficiency of network attack detection.
It should be understood that not every method step in the above embodiments is essential. In specific situations, one or more of the steps may be omitted, as long as the technical purpose of detecting network attacks can be achieved. The present invention does not limit the number or order of the steps in the embodiments, and the protection scope of the present invention shall be defined by the claims.
To help those skilled in the art better understand the present application, a data processing method according to an embodiment of the present application is described below by way of specific examples, which include the following steps:
Example one: In step S1, the original HTTP request is preprocessed, which includes removing invalid characters and reducing the original character string to obtain a new character sequence seq1. For example, in http://www.a.com/b.php?name=Zhang San&id=123' and 1=1, the domain name (www.a.com) cannot carry attack content, the Chinese characters (the name value, rendered here as Zhang San) cannot constitute an attack, and the integer sequence (123) is normalized to 1, so the resulting new sequence seq1 is b.php?name=&id=1' and 1=1. In step S2, the character sequence seq1 is vectorized to generate a data matrix mat1. Since only English letters (A-Z, a-z), digits (0-9) and some special symbols (such as +, -, =, ,, /, ;, ', ?, !, #, %, ^, *, (, ), etc.) can be used to construct an attack, a dictionary D containing only these characters can be constructed, and the number of dictionary entries is the number M of these characters. Assuming that each character is mapped to an N-dimensional vector, the dictionary D is an M*N two-dimensional matrix, as shown in Table 1 below. Therefore, for a reduced request (such as the aforementioned b.php?name=&id=1' and 1=1), each of its characters can be mapped to the corresponding character vector in the dictionary, yielding the two-dimensional vector matrix mat1 corresponding to the request;
Table 1
In step S3, the data matrix mat1 is identified by a region recognition model, and at least one suspicious attack region is extracted; the corresponding original sequence segments are r1, r2, ..., rN. The region recognition model may be a neural-network-based model. It can be used to extract candidate regions from the data matrix and to apply overlap filtering and confidence screening to the candidate regions to obtain the suspicious attack regions. For example, for b.php?name=&id=1' and 1=1, the N extracted segments (1≤N≤5) may include "id=1' and 1=1", "php?name=&id" and "ame=&id=1' and". In step S4, each extracted suspicious attack region is evaluated by a deep neural network to judge whether the original sequence segment corresponding to the region constitutes an attack.
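The overlap filtering and confidence screening mentioned in step S3 could be realized in several ways. The sketch below uses a greedy, one-dimensional form of non-maximum suppression, which is a substituted technique chosen for illustration and is not stated in the application. The `Candidate` type, the thresholds and the confidence values are all hypothetical.

```python
from typing import List, NamedTuple

class Candidate(NamedTuple):
    start: int         # first row of the region in the data matrix
    end: int           # one past the last row
    confidence: float  # score assumed to come from the region recognition model

def overlap_ratio(a: Candidate, b: Candidate) -> float:
    """Intersection over union of two half-open spans."""
    inter = max(0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0

def select_suspicious_regions(
    candidates: List[Candidate],
    min_confidence: float = 0.5,
    max_overlap: float = 0.5,
) -> List[Candidate]:
    """Drop low-confidence candidates, then suppress ones overlapping a better candidate."""
    confident = [c for c in candidates if c.confidence >= min_confidence]
    kept: List[Candidate] = []
    for cand in sorted(confident, key=lambda c: c.confidence, reverse=True):
        if all(overlap_ratio(cand, k) <= max_overlap for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c.start)

candidates = [
    Candidate(12, 25, 0.9),  # rows for "id=1' and 1=1"
    Candidate(2, 14, 0.6),   # rows for "php?name=&id"
    Candidate(7, 21, 0.7),   # rows for "ame=&id=1' and"
    Candidate(0, 5, 0.2),    # rows for "b.php"
]
print(select_suspicious_regions(candidates, max_overlap=0.4))
```

Lowering `max_overlap` yields fewer, more distinct regions for step S4 to evaluate, at the cost of possibly discarding a region that frames the attack fragment better.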
Example two: for a flowchart of a data processing method, see Fig. 3. First, an HTTP request is obtained, which includes the access address http://www.a.com/b.php?id=1&name=tom' and 1=1; the obtained access address is vectorized to obtain a vector matrix; region perception is performed on the vector matrix to obtain at least one suspicious region; the suspicious regions are classified by a classifier; and the access address is blocked or allowed according to the classification result.
Embodiment three
Referring to Fig. 4, a structural block diagram of a data processing apparatus according to an embodiment of the present application is shown. The apparatus includes:
A data set acquisition module 401, configured to obtain a data set characterizing a network access;
A data area identification module 402, configured to identify a data area in the data set that meets a preset condition;
A network access identification module 403, configured to determine, according to the data area, whether to block the network access corresponding to the data area.
Optionally, the data set acquisition module includes:
An access address extraction submodule, configured to extract the access address of the network access;
A data set determination submodule, configured to determine the data set to which the access address maps.
Optionally, the data set determination submodule is further configured to:
Determine the character vector corresponding to each character in the access address;
Combine the character vectors corresponding to the multiple characters into a data matrix corresponding to the access address, the data matrix serving as the data set.
Optionally, the network access identification module includes:
A first semantic identification submodule, configured to perform semantic identification on the data area using a second neural network model to obtain the target attack risk type corresponding to the data area;
A first grammar identification submodule, configured to perform grammar identification on the data area using a third neural network model corresponding to the target attack risk type, and to determine, according to the grammar identification result, whether to block the network access corresponding to the data area.
Optionally, the data set determination submodule is further configured to:
Remove meaningless characters from the access address.
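One possible reading of removing meaningless characters, following the reduction described in Example one: drop the scheme and domain, drop non-ASCII characters, and normalize each run of digits to 1. The function name `reduce_request` and the regular expressions are assumptions for illustration and are not part of the application.

```python
import re

def reduce_request(url: str) -> str:
    """Reduce an access address to the part that can carry attack content."""
    # Drop the scheme and host, which Example one treats as unable to carry an attack.
    reduced = re.sub(r"^[A-Za-z][A-Za-z0-9+.\-]*://[^/?#]*/?", "", url)
    # Drop non-ASCII characters, such as Chinese text in a parameter value.
    reduced = re.sub(r"[^\x00-\x7f]", "", reduced)
    # Normalize every run of digits to a single "1".
    reduced = re.sub(r"\d+", "1", reduced)
    return reduced

print(reduce_request("http://www.a.com/b.php?id=123' and 1=1"))  # b.php?id=1' and 1=1
```

Normalizing every digit run is a simplification: it also rewrites digits that appear in path or parameter names, which a production implementation might want to leave alone.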
Optionally, the data set acquisition module includes:
A data set acquisition submodule, configured to obtain a data set characterizing multiple network accesses;
The data area identification module includes:
A first data area identification submodule, configured to identify, in the data set, a data area that meets the preset condition and corresponds to the multiple network accesses.
Optionally, the data area identification module includes:
A data area division submodule, configured to divide the data set into multiple data areas according to the position of the data in the data set;
A second data area identification submodule, configured to identify, from the multiple data areas, a data area that meets the preset condition.
Optionally, the data area identification module further includes:
A data area merging module, configured to merge the data areas that meet the preset condition according to their position information.
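Merging by position information can be sketched as a standard interval merge, assuming each data area is represented as a half-open (start, end) span of rows in the data set. That representation and the function name are assumptions, and the application does not state the merging criterion; here, areas that overlap or touch are joined.

```python
from typing import List, Tuple

def merge_regions(spans: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or adjacent (start, end) spans into maximal spans."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # The span touches or overlaps the previous one, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

print(merge_regions([(12, 25), (0, 5), (3, 9), (9, 11)]))  # [(0, 11), (12, 25)]
```

Merging before the semantic and grammar identification steps avoids evaluating the same attack fragment several times when neighboring areas were flagged separately.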
Optionally, the network access identification module includes:
A second semantic identification submodule, configured to perform semantic identification on the data area to obtain the target attack risk type corresponding to the data area;
A second grammar identification submodule, configured to perform grammar identification on the data area, and to determine, according to the grammar identification result, whether to block the network access corresponding to the data area.
Optionally, the apparatus further includes:
An address fragment determination module, configured to determine the address fragment to which the data area maps in the access address corresponding to the network access, as an address fragment with network attack risk;
The second semantic identification submodule is further configured to:
Perform semantic identification on the address fragment with network attack risk;
The second grammar identification submodule is further configured to:
Perform grammar identification on the address fragment with network attack risk, and determine whether the address fragment with network attack risk conforms to the grammar rules of the target attack risk type.
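When the data set is a matrix with one row per character of the access address, mapping a data area back to its address fragment reduces to slicing the address with the same row span. The sketch below assumes that one-row-per-character layout and half-open spans; the function name is hypothetical.

```python
from typing import List, Tuple

def fragments_for_regions(address: str, spans: List[Tuple[int, int]]) -> List[str]:
    """Return the address fragment covered by each (start, end) row span."""
    fragments = []
    for start, end in spans:
        if not 0 <= start < end <= len(address):
            raise ValueError(f"span {(start, end)} is outside the address")
        fragments.append(address[start:end])
    return fragments

address = "b.php?name=&id=1' and 1=1"
print(fragments_for_regions(address, [(12, 25), (2, 14)]))
```

For the reduced address from Example one, the spans (12, 25) and (2, 14) correspond to the fragments "id=1' and 1=1" and "php?name=&id", which are then passed to semantic and grammar identification.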
Optionally, the apparatus further includes:
A network access blocking module, configured to block the network access if it is determined that the network access includes a network attack.
Optionally, the apparatus further includes:
A network access allowing module, configured to allow the network access if it is determined that the network access does not include a network attack.
As the apparatus embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.
In the embodiments of the present application, a data set characterizing a network access can be obtained, and the data set can describe information related to the network access. By identifying whether the data set includes a data area that meets a preset condition, a data area with network attack risk can be identified in the data set, and whether to block the network access corresponding to the data area is then determined according to that data area. That is, the data set describing the related information is first screened preliminarily by breaking the whole into parts, and the network access is then identified more precisely according to the data area with network attack risk obtained from the screening. Compared with identifying a network access solely by exactly matching the access address against the access addresses in a whitelist or blacklist, this avoids the problem that a network access is difficult to identify when no exact match exists. It also does not require manual maintenance of a whitelist or blacklist, which improves the accuracy and reliability of network attack detection.
The embodiments of the present application may be implemented as a system that uses any suitable hardware, firmware, software, or any combination thereof in a desired configuration. Fig. 5 schematically shows an exemplary system (or apparatus) 500 that can be used to implement the embodiments described in the present application.
For one embodiment, Fig. 5 shows the exemplary system 500, which has one or more processors 502, a system control module (chipset) 504 coupled to at least one of the processor(s) 502, a system memory 506 coupled to the system control module 504, a non-volatile memory (NVM)/storage device 508 coupled to the system control module 504, one or more input/output devices 510 coupled to the system control module 504, and a network interface 512 coupled to the system control module 504.
Processor 502 may include one or more single-core or multi-core processors, and may include any combination of general-purpose processors and special-purpose processors (such as graphics processors, application processors and baseband processors). In some embodiments, system 500 can serve as the network device described in the embodiments of the present application.
In some embodiments, system 500 may include one or more computer-readable media having instructions (for example, system memory 506 or NVM/storage device 508), and one or more processors 502 that are coupled with the one or more computer-readable media and configured to execute the instructions so as to implement modules that perform the actions described in the present application.
For one embodiment, system control module 504 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 502 and/or to any suitable device or component in communication with system control module 504.
System control module 504 may include a memory controller module to provide an interface to system memory 506. The memory controller module may be a hardware module, a software module and/or a firmware module.
System memory 506 may be used, for example, to load and store data and/or instructions for system 500. For one embodiment, system memory 506 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, system memory 506 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, system control module 504 may include one or more input/output controllers to provide an interface to NVM/storage device 508 and to the input/output device(s) 510.
For example, NVM/storage device 508 may be used to store data and/or instructions. NVM/storage device 508 may include any suitable non-volatile memory (for example, flash memory) and/or any suitable non-volatile storage device(s) (for example, one or more hard disk drives (HDD), one or more compact disc (CD) drives and/or one or more digital versatile disc (DVD) drives).
NVM/storage device 508 may include a storage resource that is physically part of the device on which system 500 is installed, or it may be accessible by that device without being part of it. For example, NVM/storage device 508 may be accessed over a network via the input/output device(s) 510.
The input/output device(s) 510 may provide an interface for system 500 to communicate with any other suitable device, and may include communication components, audio components, sensor components and the like. Network interface 512 may provide an interface for system 500 to communicate over one or more networks. System 500 may communicate wirelessly with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols, for example by accessing a wireless network based on a communication standard such as WiFi, 2G or 3G, or a combination thereof.
For one embodiment, at least one of the processor(s) 502 may be packaged together with logic of one or more controllers (for example, the memory controller module) of system control module 504. For one embodiment, at least one of the processor(s) 502 may be packaged together with logic of one or more controllers of system control module 504 to form a system in package (SiP). For one embodiment, at least one of the processor(s) 502 may be integrated on the same die with logic of one or more controllers of system control module 504. For one embodiment, at least one of the processor(s) 502 may be integrated on the same die with logic of one or more controllers of system control module 504 to form a system on chip (SoC).
In various embodiments, system 500 may be, but is not limited to, a workstation, a desktop computing device or a mobile computing device (for example, a laptop computing device, a handheld computing device, a tablet computer or a netbook). In various embodiments, system 500 may have more or fewer components and/or a different architecture. For example, in some embodiments, system 500 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an application-specific integrated circuit (ASIC) and a speaker.
If the display includes a touch panel, the display screen may be implemented as a touch screen display to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
An embodiment of the present application further provides a non-volatile readable storage medium storing one or more modules (programs) which, when applied to a terminal device, cause the terminal device to execute instructions for the method steps in the embodiments of the present application.
In one example, an apparatus is provided, including: one or more processors; and one or more machine-readable media having instructions stored thereon which, when executed by the one or more processors, cause the apparatus to perform the method performed by the network device in the embodiments of the present application.
In one example, one or more machine-readable media are also provided, having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method performed by the network device in the embodiments of the present application.
The embodiments of the present application disclose a data processing method and apparatus.
Example 1 is a data processing method, including:
Obtaining a data set characterizing a network access;
Identifying a data area in the data set that meets a preset condition;
Determining, according to the data area, whether to block the network access corresponding to the data area.
Example 2 may include the method of Example 1, wherein obtaining the data set characterizing the network access includes:
Extracting the access address of the network access;
Determining the data set to which the access address maps.
Example 3 may include the method of Example 2, wherein determining the data set to which the access address maps includes:
Determining the character vector corresponding to each character in the access address;
Combining the character vectors corresponding to the multiple characters into a data matrix corresponding to the access address, the data matrix serving as the data set.
Example 4 may include the method of Example 3, wherein determining, according to the data area, whether to block the network access corresponding to the data area includes:
Performing semantic identification on the data area using a second neural network model to obtain the target attack risk type corresponding to the data area;
Performing grammar identification on the data area using a third neural network model corresponding to the target attack risk type, and determining, according to the grammar identification result, whether to block the network access corresponding to the data area.
Example 5 may include the method of Example 3, wherein, before determining the character vector corresponding to each character in the access address, determining the data set to which the access address maps further includes:
Removing meaningless characters from the access address.
Example 6 may include the method of Example 1, wherein obtaining the data set characterizing the network access includes:
Obtaining a data set characterizing multiple network accesses;
Identifying the data area in the data set that meets the preset condition includes:
Identifying, in the data set, a data area that meets the preset condition and corresponds to the multiple network accesses.
Example 7 may include the method of Example 1, wherein identifying the data area in the data set that meets the preset condition includes:
Dividing the data set into multiple data areas according to the position of the data in the data set;
Identifying, from the multiple data areas, a data area that meets the preset condition.
Example 8 may include the method of Example 7, wherein identifying the data area in the data set that meets the preset condition further includes:
Merging the data areas that meet the preset condition according to their position information.
Example 9 may include the method of Example 1, wherein determining, according to the data area, whether to block the network access corresponding to the data area includes:
Performing semantic identification on the data area to obtain the target attack risk type corresponding to the data area;
Performing grammar identification on the data area, and determining, according to the grammar identification result, whether to block the network access corresponding to the data area.
Example 10 may include the method of Example 9, wherein, before determining, according to the data area, whether to block the network access corresponding to the data area, the method further includes:
Determining the address fragment to which the data area maps in the access address corresponding to the network access, as an address fragment with network attack risk;
Performing semantic identification on the data area includes:
Performing semantic identification on the address fragment with network attack risk;
Performing grammar identification on the data area includes:
Performing grammar identification on the address fragment with network attack risk, and determining whether the address fragment with network attack risk conforms to the grammar rules of the target attack risk type.
Example 11 may include the method of Example 1, wherein determining, according to the data area, whether to block the network access corresponding to the data area includes:
Blocking the network access if it is determined that the network access includes a network attack.
Example 12 may include the method of Example 1, wherein determining, according to the data area, whether to block the network access corresponding to the data area includes:
Allowing the network access if it is determined that the network access does not include a network attack.
Example 13 is a data processing apparatus, including:
A data set acquisition module, configured to obtain a data set characterizing a network access;
A data area identification module, configured to identify a data area in the data set that meets a preset condition;
A network access identification module, configured to determine, according to the data area, whether to block the network access corresponding to the data area.
Example 14 is an apparatus, including: one or more processors; and one or more machine-readable media having instructions stored thereon which, when executed by the one or more processors, cause the apparatus to perform the method of one or more of Examples 1 to 12.
Example 15 is one or more machine-readable media having instructions stored thereon which, when executed by one or more processors, cause an apparatus to perform the method of one or more of Examples 1 to 12.
Although certain embodiments have been shown and described for purposes of illustration, a wide variety of alternative and/or equivalent embodiments or implementations calculated to achieve the same purposes may be substituted for the embodiments shown and described without departing from the scope of the present application. The present application is intended to cover any adaptations or variations of the embodiments discussed herein. Therefore, it is manifestly intended that the embodiments described herein be limited only by the claims and their equivalents.

Claims (15)

1. A data processing method, characterized by comprising:
Obtaining a data set characterizing a network access;
Identifying a data area in the data set that meets a preset condition;
Determining, according to the data area, whether to block the network access corresponding to the data area.
2. The method according to claim 1, characterized in that obtaining the data set characterizing the network access comprises:
Extracting the access address of the network access;
Determining the data set to which the access address maps.
3. The method according to claim 2, characterized in that determining the data set to which the access address maps comprises:
Determining the character vector corresponding to each character in the access address;
Combining the character vectors corresponding to the multiple characters into a data matrix corresponding to the access address, the data matrix serving as the data set.
4. The method according to claim 3, characterized in that determining, according to the data area, whether to block the network access corresponding to the data area comprises:
Performing semantic identification on the data area using a second neural network model to obtain the target attack risk type corresponding to the data area;
Performing grammar identification on the data area using a third neural network model corresponding to the target attack risk type, and determining, according to the grammar identification result, whether to block the network access corresponding to the data area.
5. The method according to claim 3, characterized in that, before determining the character vector corresponding to each character in the access address, determining the data set to which the access address maps further comprises:
Removing meaningless characters from the access address.
6. The method according to claim 1, characterized in that obtaining the data set characterizing the network access comprises:
Obtaining a data set characterizing multiple network accesses;
Identifying the data area in the data set that meets the preset condition comprises:
Identifying, in the data set, a data area that meets the preset condition and corresponds to the multiple network accesses.
7. The method according to claim 1, characterized in that identifying the data area in the data set that meets the preset condition comprises:
Dividing the data set into multiple data areas according to the position of the data in the data set;
Identifying, from the multiple data areas, a data area that meets the preset condition.
8. The method according to claim 7, characterized in that identifying the data area in the data set that meets the preset condition further comprises:
Merging the data areas that meet the preset condition according to their position information.
9. The method according to claim 1, characterized in that determining, according to the data area, whether to block the network access corresponding to the data area comprises:
Performing semantic identification on the data area to obtain the target attack risk type corresponding to the data area;
Performing grammar identification on the data area, and determining, according to the grammar identification result, whether to block the network access corresponding to the data area.
10. The method according to claim 9, characterized in that, before determining, according to the data area, whether to block the network access corresponding to the data area, the method further comprises:
Determining the address fragment to which the data area maps in the access address corresponding to the network access, as an address fragment with network attack risk;
Performing semantic identification on the data area comprises:
Performing semantic identification on the address fragment with network attack risk;
Performing grammar identification on the data area comprises:
Performing grammar identification on the address fragment with network attack risk, and determining whether the address fragment with network attack risk conforms to the grammar rules of the target attack risk type.
11. The method according to claim 1, characterized in that determining, according to the data area, whether to block the network access corresponding to the data area comprises:
Blocking the network access if it is determined that the network access includes a network attack.
12. The method according to claim 1, characterized in that determining, according to the data area, whether to block the network access corresponding to the data area comprises:
Allowing the network access if it is determined that the network access does not include a network attack.
13. A data processing apparatus, characterized by comprising:
A data set acquisition module, configured to obtain a data set characterizing a network access;
A data area identification module, configured to identify a data area in the data set that meets a preset condition;
A network access identification module, configured to determine, according to the data area, whether to block the network access corresponding to the data area.
14. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method according to one or more of claims 1 to 12.
15. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the method according to one or more of claims 1 to 12.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810075020.3A CN110086749A (en) 2018-01-25 2018-01-25 Data processing method and device

Publications (1)

Publication Number Publication Date
CN110086749A true CN110086749A (en) 2019-08-02


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110768969A (en) * 2019-10-14 2020-02-07 深圳Tcl数字技术有限公司 Test method and device based on network data monitoring and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699687A (en) * 2014-01-03 2014-04-02 复旦大学 Network entity crawling method based on enumeration
US20150106123A1 (en) * 2013-10-15 2015-04-16 Parkland Center For Clinical Innovation Intelligent continuity of care information system and method
CN104994091A (en) * 2015-06-30 2015-10-21 东软集团股份有限公司 Method and device for detecting abnormal flow, and method and device for defending against Web attack
US20170163663A1 (en) * 2015-12-02 2017-06-08 Salesforce.Com, Inc. False positive detection reduction system for network-based attacks
CN107294993A (en) * 2017-07-05 2017-10-24 重庆邮电大学 Web abnormal traffic monitoring method based on ensemble learning
CN107483458A (en) * 2017-08-29 2017-12-15 杭州迪普科技股份有限公司 Network attack identification method and device, and computer-readable storage medium
CN107992741A (en) * 2017-10-24 2018-05-04 阿里巴巴集团控股有限公司 Model training method, and method and device for detecting URLs

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110768969A (en) * 2019-10-14 2020-02-07 深圳Tcl数字技术有限公司 Test method and device based on network data monitoring and readable storage medium
CN110768969B (en) * 2019-10-14 2023-10-17 深圳Tcl数字技术有限公司 Test method and device based on network data monitoring and readable storage medium

Similar Documents

Publication Publication Date Title
Naeem et al. Malware detection in industrial internet of things based on hybrid image visualization and deep learning model
US11171977B2 (en) Unsupervised spoofing detection from traffic data in mobile networks
CN106713371B (en) Fast Flux botnet detection method based on DNS abnormal mining
CN105471823B (en) A kind of sensitive information processing method, device, server and safe decision-making system
Chen et al. Detecting visually similar web pages: Application to phishing detection
Wu et al. Shoulder-surfing-proof graphical password authentication scheme
Zhao et al. A review of computer vision methods in network security
CN109862003B (en) Method, device, system and storage medium for generating local threat intelligence library
CN106209886B (en) Web interface data encryption is endorsed method, apparatus and server
CN109005145A (en) A kind of malice URL detection system and its method extracted based on automated characterization
CN112163638A (en) Defense method, device, equipment and medium for image classification model backdoor attack
CN107204956B (en) Website identification method and device
US20170063892A1 (en) Robust representation of network traffic for detecting malware variations
CN108090351A (en) For handling the method and apparatus of request message
EP2779520A1 (en) A process for obtaining candidate data from a remote storage server for comparison to a data to be identified
CN111371778A (en) Attack group identification method, device, computing equipment and medium
CN114422271B (en) Data processing method, device, equipment and readable storage medium
EP3676757A1 (en) Systems and methods for device recognition
CN114422211B (en) HTTP malicious traffic detection method and device based on graph attention network
CN110086749A (en) Data processing method and device
CN115314239A (en) Analysis method and related equipment for hidden malicious behaviors based on multi-model fusion
CN114422207A (en) Multi-mode-based C & C communication flow detection method and device
Wan et al. DevTag: A benchmark for fingerprinting IoT devices
JP7140268B2 (en) WARNING DEVICE, CONTROL METHOD AND PROGRAM
CN110417744B (en) Security determination method and device for network access

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40010988; Country of ref document: HK)

RJ01 Rejection of invention patent application after publication (Application publication date: 20190802)