CN109726347A - Network request automatic classification method and relevant device - Google Patents
- Publication number: CN109726347A (CN201811639957.5A)
- Authority: CN (China)
- Prior art keywords: network request, classification, request, network, classifier
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Information Transfer Between Computers (AREA)
Abstract
An embodiment of the present application provides a network request automatic classification method, comprising: receiving a network request; identifying the category of the network request according to the network request itself; if the category cannot be identified according to the network request itself, inputting key information of the webpage corresponding to the network request into a first classifier to identify the category of the network request; and if the category still cannot be identified after the key information is input into the first classifier, inputting the content of the webpage corresponding to the network request into a second classifier to identify the category of the network request. The application addresses, as far as possible, the prior-art problem that the classification speed and classification accuracy of network requests cannot both be guaranteed. Embodiments of the present application also provide a network request automatic classification apparatus, an electronic device and a computer-readable storage medium.
Description
Technical field
This application relates to the field of network security, and in particular to a network request automatic classification method and a related device.
Background art
At present, many enterprises and institutions control the network requests issued from employees' work computers, either allowing requests of certain categories or forbidding requests of certain categories. How to identify the category of a network request quickly and accurately has therefore become a key problem.
Summary of the invention
In the related prior art, one way of classifying network requests is to build a mapping table between network requests and their categories and keep the table in memory. A network request to be classified is matched against the mapping table, and if a matching entry exists, its category is returned directly. Limited by factors such as memory, the mapping table can hold only a limited number of network requests, so the categories of some network requests cannot be identified with it.
Another way is to train, offline, a machine-learning classifier based on the content of the webpage corresponding to a network request, and then deploy the classifier online. Although the classifier-based approach avoids the mapping table's failure to identify some network requests, a classifier based entirely on machine learning can hardly achieve very high classification speed and very high classification accuracy at the same time.
A third way combines the two approaches above. A network request to be classified is first matched against the mapping table. If a matching entry exists, its category is returned directly. Otherwise the network request is sent to the server to obtain the corresponding webpage, the webpage is preprocessed, and the resulting webpage content is input into the classifier to obtain a classification result. This approach still struggles to guarantee both classification speed and classification accuracy. Moreover, a network request that is not in the mapping table must first go through the matching, and only after the matching completes and confirms that there is no matching entry is the request sent to the server to fetch the webpage, which greatly increases the response time of the system.
In view of this, the application provides a network request automatic classification method and a related device, to solve as far as possible the prior-art problem that the classification speed and classification accuracy of network requests cannot both be guaranteed.
Specifically, the application is implemented through the following technical solutions:
A network request automatic classification method, comprising:
receiving a network request;
identifying the category of the network request according to the network request itself;
if the category of the network request cannot be identified according to the network request itself, inputting key information of the webpage corresponding to the network request into a first classifier to identify the category of the network request;
if the category of the network request cannot be identified after the key information of the webpage corresponding to the network request is input into the first classifier, inputting the content of the webpage corresponding to the network request into a second classifier to identify the category of the network request.
A network request automatic classification apparatus, comprising:
a receiving unit, configured to receive a network request;
a first judgement unit, configured to identify the category of the network request according to the network request itself;
a second judgement unit, configured to, if the category of the network request cannot be identified according to the network request itself, input key information of the webpage corresponding to the network request into a first classifier to identify the category of the network request;
a third judgement unit, configured to, if the category of the network request cannot be identified after the key information of the webpage corresponding to the network request is input into the first classifier, input the content of the webpage corresponding to the network request into a second classifier to identify the category of the network request.
An electronic device, comprising:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the aforementioned network request automatic classification method.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the aforementioned network request automatic classification method.
As can be seen from the above technical solution provided by the application, the layered approach better resolves the conflict among speed, accuracy and comprehensiveness in classifying network requests. For frequently accessed network requests and network requests to mainstream websites, the first-layer arbiter can quickly determine the category; other network requests are discriminated by the second-layer or third-layer arbiter according to the webpage corresponding to the network request. In this way a good balance is achieved among classification speed, accuracy and comprehensiveness.
In addition, the asynchronous association between the first-layer arbiter and the second-layer arbiter effectively reduces the response time of the discrimination system. Because the second-layer and third-layer arbiters need the webpage corresponding to the network request, the request must be sent to the server and the requested webpage awaited. Initiating the request only after the first-layer arbiter has finished would make the response time of the system too long. In the present invention, the network request is first sent to the remote server and discrimination in the first-layer arbiter then starts, so that by the time the requested webpage returns, the first-layer arbiter has already finished. This reduces the waiting time and hence the response time of the system.
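The response-time argument can be written as a simple timing model. This is only an illustrative sketch: the symbols t_1 (time for the first-layer arbiter to finish) and t_f (time for the server to return the webpage) are introduced here and do not appear in the application, and the model covers a request whose webpage is actually needed.

```latex
\begin{aligned}
T_{\text{serial}}   &= t_1 + t_f            && \text{(request sent only after the first layer finishes)} \\
T_{\text{parallel}} &= \max(t_1,\, t_f)     && \text{(request sent first, first layer runs meanwhile)} \\
T_{\text{serial}} - T_{\text{parallel}} &= \min(t_1,\, t_f) \;\ge\; 0
\end{aligned}
```

Since the first layer only performs in-memory matching, t_1 is normally much smaller than t_f, so under this model the parallel arrangement costs about t_f and the saving equals the whole first-layer time t_1.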
Brief description of the drawings
Fig. 1 is a schematic diagram of the network architecture for processing network requests shown in the application;
Fig. 2 is a framework diagram of the automatic classification of network requests shown in the application;
Fig. 3 is a flow chart of a network request automatic classification method shown in the application;
Fig. 4 is a structural block diagram of a network request automatic classification apparatus shown in the application;
Fig. 5 is a structural block diagram of another network request automatic classification apparatus shown in the application;
Fig. 6 is a structural block diagram of the first judgement unit shown in the application;
Fig. 7 is a structural block diagram of an electronic device shown in the application;
Fig. 8 is a schematic structural diagram of a computer system suitable for implementing the network request automatic classification method shown in the application.
Specific embodiments
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the application. Rather, they are merely examples of devices and methods consistent with some aspects of the application as detailed in the appended claims.
The terms used in this application are for the purpose of describing particular embodiments only and are not intended to limit the application. The singular forms "a", "said" and "the" used in this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third and so on may be used in this application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be called second information, and similarly second information may also be called first information. Depending on the context, the word "if" as used here may be interpreted as "when", "upon" or "in response to determining".
Refer to Fig. 1, a schematic diagram of the network architecture for processing network requests shown in the application. The network architecture includes a user host 10, a network device 20 and a server 30. When a user wants to browse a webpage, the user host 10 sends a network request. The network device 20 determines the category of the network request, processes the request according to the identified category, and relays it to the server 30 after processing. For example, the network device filters according to the category of the network request: forbidden network requests are filtered out, and permitted network requests are retained and forwarded to the server 30. The method provided by the application can be applied in the network device 20. When the network device 20 receives a network request sent by the user host 10 (arrow 1 in Fig. 1), it identifies the category of the network request and filters the request according to the identified category. For example, suppose a company does not allow online chat. If the identified category is chat, the network device 20 forbids the network request (arrow 2 in Fig. 1). If the identified category is not chat, the network device 20 lets the network request through and forwards it to the server 30 (arrow 3 in Fig. 1); the server 30 responds to the network request and returns the response to the network device 20 (arrow 4 in Fig. 1), and the network device 20 in turn returns the response to the user host 10 (arrow 5 in Fig. 1).
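The filtering behaviour of the network device can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `FORBIDDEN_CATEGORIES`, `handle_request`, `classify` and `forward` are hypothetical, and the classifier and the forwarding step are passed in as plain functions.

```python
from typing import Callable, Optional

# Hypothetical policy: categories the organisation forbids (the text's example is chat).
FORBIDDEN_CATEGORIES = {"chat"}


def handle_request(url: str,
                   classify: Callable[[str], Optional[str]],
                   forward: Callable[[str], str]) -> Optional[str]:
    """Classify a request, then either block it or relay it to the server."""
    category = classify(url)
    if category in FORBIDDEN_CATEGORIES:
        return None        # arrow 2: the request is forbidden
    return forward(url)    # arrows 3 to 5: relay and hand the response back
```

In this sketch a request whose category cannot be determined (`None`) is let through; a stricter policy could block it instead.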
When the network device 20 identifies the category of a network request, the related prior art, whether it uses a classifier alone or combines a classifier with a mapping table, struggles to guarantee both classification speed and classification accuracy.
To solve this problem, an embodiment of the application provides a scheme for automatically identifying network requests. A complete request generally consists of two parts: initiating the request and receiving the response. Based on this feature, the application solves the request classification problem with a layered approach. In the framework shown in Fig. 2, the first-layer arbiter identifies the category from the network request itself (that is, the mapping-table mode), while the second-layer and third-layer arbiters identify the category from the response to the network request (that is, the classifier mode). The second-layer arbiter mainly uses the key information in the response (for example, the title or abstract usually carried in the response to an HTTP request), and the third-layer arbiter mainly uses the content of the response (excluding the title or abstract). If the response to a network request has no title or abstract, the second-layer arbiter can be skipped and the third-layer arbiter used directly, as shown by the dashed line in the figure. Because the key information in a response is usually a general description of the webpage content, it reflects the topic and category of the page well, which improves recognition accuracy. Key information is also usually short, which greatly reduces the computation during recognition and speeds it up.
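The layered decision described above can be sketched as a simple cascade. The sketch below is an illustration only: `classify_layered` and its parameters are hypothetical names, each layer is modelled as a function that returns a category or `None`, and the asynchronous fetching of the webpage is left out.

```python
from typing import Callable, Optional

# A layer maps its input (URL, key information or content) to a category or None.
Layer = Callable[[str], Optional[str]]


def classify_layered(url: str,
                     title: Optional[str],
                     content: str,
                     layer1: Layer,
                     layer2: Layer,
                     layer3: Layer) -> Optional[str]:
    """Try the layers from fastest to slowest and stop at the first answer."""
    category = layer1(url)             # first layer: the request itself
    if category is not None:
        return category
    if title:                          # skip the second layer if there is no title/abstract
        category = layer2(title)       # second layer: key information of the response
        if category is not None:
            return category
    return layer3(content)             # third layer: content of the response
```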
It should be noted that the classification scheme in this application is generally applicable to any request that follows the "request-response" pattern. For ease of description, the following embodiments use only "classification of URL requests" as an example.
The symbols used in this embodiment are first explained one by one. Suppose a is a URL request. URL(a) denotes the value (a character string) of a, CLS(a) denotes the category of a, SUBDOMAIN(a) denotes the subdomain name of a, PAGE(a) denotes the webpage corresponding to a, TITLE(a) denotes the key information of PAGE(a) after preprocessing, such as the title or abstract, and CONTENT(a) denotes the content of PAGE(a) after preprocessing.
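Two of these symbols can be approximated with the Python standard library. The sketch below is an assumption-laden illustration: the application does not say how SUBDOMAIN(a) or TITLE(a) is computed, so `subdomain` simply takes the leftmost host label when the host has at least three labels, and `title` only extracts the text of the HTML `<title>` element without any classifier-specific preprocessing.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def subdomain(url: str) -> str:
    """SUBDOMAIN(a), simplified: leftmost host label if the host has 3+ labels."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return labels[0] if len(labels) >= 3 else ""


class _TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def title(page_html: str) -> str:
    """TITLE(a), simplified: the page title with whitespace normalised."""
    parser = _TitleParser()
    parser.feed(page_html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

For the URL used as an example in the text, `subdomain("http://sports.qq.com/nba/")` gives `"sports"`. Hosts under multi-part public suffixes would need a proper suffix list, which this sketch does not attempt.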
The three layers of arbiters in the framework shown in Fig. 2 are described in turn below.
One, first-layer arbiter
The first-layer arbiter is a high-speed arbiter containing two judgement units: the first judgement unit discriminates with a URL classification library, and the second judgement unit discriminates with a URL feature database. The two judgement units discriminate in parallel, and the first judgement unit has higher priority than the second judgement unit.
The URL classification library is a class library generated by precisely classifying URL requests with very high access volume (for example, the top 10000). Every URL request in this class library corresponds to one category. For a URL(a) to be classified, if URL(a) equals URL(x) for some URL request x in the library, then CLS(a) = CLS(x).
The URL feature database is a library for classifying URL requests that is generated from category information carried by the URL requests themselves. For ease of management and maintenance, mainstream websites usually include some category information in the subdomain name or path of a URL request. This category information is usually very accurate and can be used to classify URL requests. A piece of category information may itself be a category word, or a synonym of a category word (including its Chinese pinyin and abbreviations). For example, in http://sports.qq.com/nba/ the subdomain name 'sports' is a category word (assuming a category sports exists). All such words are called derivatives of a category here. All derivatives of a category c can be written as the set {DRV(c)_1, DRV(c)_2, ..., DRV(c)_n}. The derivative sets of all categories together form a URL feature database. A URL(a) to be classified is matched against the feature database with SUBDOMAIN(a); if DRV(c)_i is matched, then CLS(a) = c.
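The feature-database lookup can be sketched as a dictionary built from per-category derivative sets. Everything named below is hypothetical: the derivative words in `DERIVATIVES` are made-up examples, and matching path segments as well as subdomain labels is an assumption based on the remark that category information may sit in the subdomain name or the path.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical derivative sets DRV(c): category words, synonyms, pinyin, abbreviations.
DERIVATIVES = {
    "sports": {"sports", "sport", "tiyu", "ty"},
    "news": {"news", "xinwen", "xw"},
}

# Invert the sets into one lookup table: derivative word -> category.
FEATURE_LIBRARY = {word: cat for cat, words in DERIVATIVES.items() for word in words}


def classify_by_features(url: str) -> Optional[str]:
    """Return the category of the first URL token that is a known derivative."""
    parts = urlsplit(url)
    host_labels = (parts.hostname or "").split(".")[:-2]   # labels left of the last two
    path_segments = [seg for seg in parts.path.split("/") if seg]
    for token in host_labels + path_segments:
        category = FEATURE_LIBRARY.get(token.lower())
        if category is not None:
            return category
    return None
```

With these sample sets, `http://sports.qq.com/nba/` is classified as `sports` through its subdomain label, while a URL with no known derivative yields `None` and falls through to the next layer.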
The second judgement unit supplements the first judgement unit. Limited by factors such as memory, the URL classification library contains relatively few URL requests, so the first judgement unit cannot determine the category of some URL requests; the second judgement unit largely makes up for this.
Both judgement units essentially perform string matching, so discrimination can be completed quickly with a multi-pattern string matching algorithm or a Bloom filter algorithm. For a URL(a): if the first judgement unit determines its category to be c1, then CLS(a) = c1; if the first judgement unit fails to determine its category and the second judgement unit determines it to be c2, then CLS(a) = c2; if neither judgement unit determines the category of a, the second-layer arbiter is used to discriminate.
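The priority rule between the two judgement units can be sketched as below. The library contents, the category names and the function names are all hypothetical, plain dictionaries stand in for the multi-pattern matcher or Bloom filter mentioned in the text, and the two units are called one after the other rather than in parallel.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical library contents, for illustration only.
URL_CLASS_LIBRARY = {"http://sports.qq.com/nba/": "basketball"}
URL_FEATURE_LIBRARY = {"sports": "sports", "news": "news"}


def first_unit(url: str) -> Optional[str]:
    """Exact match against the URL classification library."""
    return URL_CLASS_LIBRARY.get(url)


def second_unit(url: str) -> Optional[str]:
    """Match the leftmost host label against the URL feature database."""
    host = urlsplit(url).hostname or ""
    return URL_FEATURE_LIBRARY.get(host.split(".")[0])


def first_layer(url: str) -> Optional[str]:
    """Combine both units; the first unit wins whenever it has an answer."""
    c1 = first_unit(url)    # the text runs the two units in parallel;
    c2 = second_unit(url)   # they are simply called in sequence here
    if c1 is not None:
        return c1
    return c2               # may be None: hand over to the second-layer arbiter
```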
Two, second-layer arbiter
The second-layer arbiter classifies a URL request according to the key information (for example, the title and/or abstract) of the webpage corresponding to the URL request. This layer uses a machine-learning-based classifier. The key information of the webpage is classified separately because the content of some webpages is scattered and hard for a machine-learning-based classifier to classify effectively, whereas the key information of a webpage (such as its title) is usually a general description of the page content and reflects the topic and category of the page well. Key information is also usually short, which greatly reduces the computation during discrimination.
The classifier used in this layer must have a very low misclassification rate, so that the categories it determines are highly accurate. This can be achieved by setting a group of discrimination thresholds (for example, one threshold per category): c can be the category of a URL request only when the probability with which the classifier assigns the request to category c is greater than or equal to the discrimination threshold of category c. After the second-layer arbiter, the categories of some URL requests still cannot be determined; these URL requests need to be discriminated by the third-layer arbiter.
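The per-category threshold rule can be sketched as follows. The threshold values and the name `decide` are hypothetical, the classifier's output is assumed to be a mapping from category to probability, and picking the most probable category among those that pass their thresholds is an assumption, since the text only states the threshold condition.

```python
from typing import Dict, Optional

# Hypothetical discrimination thresholds, one per category.
THRESHOLDS = {"sports": 0.90, "news": 0.95}


def decide(probabilities: Dict[str, float],
           thresholds: Dict[str, float] = THRESHOLDS) -> Optional[str]:
    """Accept a category only if its probability reaches that category's threshold."""
    candidates = {
        cat: p for cat, p in probabilities.items()
        if p >= thresholds.get(cat, float("inf"))   # unknown categories never pass
    }
    if not candidates:
        return None                                  # defer to the third-layer arbiter
    return max(candidates, key=candidates.get)
```

Raising a threshold lowers the misclassification rate for that category at the cost of sending more requests to the slower third layer.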
Three, third-layer arbiter
The third-layer arbiter classifies a URL request according to the content of the webpage corresponding to the URL request. This layer also uses a machine-learning-based classifier. Unlike the second-layer arbiter, it discriminates on the content of the webpage and uses a complete classifier, that is, a classifier that can determine a category for any valid input. Although this layer can determine the category of almost all URL requests, it is comparatively the slowest and can easily become the performance bottleneck of the system, so it cannot serve as the main classifier.
The first-layer and second-layer arbiters in this application are designed to quickly determine the categories of most frequently accessed URL requests. Only a small share of URL requests are discriminated by the third-layer arbiter, so the third-layer arbiter does not greatly affect the overall performance of the discrimination system.
Refer to Fig. 3, a flow chart of a network request automatic classification method shown in the application. The method can be applied, for example, to the network device 20 shown in Fig. 1. Suppose a user initiates a request with URL request x; the category of x is identified as follows:
Step 300: the user initiates a URL request;
Step 301A: URL(x) is copied and sent to the first-layer arbiter, which discriminates the category of x;
Step 302: whether the first-layer arbiter can determine the category of x; if it determines the category to be c, go to step 303; otherwise, go to step 306;
Step 303: <x, c> is sent to the marking unit, and the discrimination process ends;
Step 301B: while URL(x) is sent to the first-layer arbiter, the original request x is still sent to the corresponding server, which returns the page PAGE(x) corresponding to x; go to step 304;
Step 304: whether <x, c> exists in the marking unit; if it exists, go to step 311; otherwise, go to step 305;
Step 305: <x, PAGE(x)> is put into the to-be-discriminated queue Q1;
Since the first-layer arbiter only needs to perform matching in local memory, it can be ensured that the first-layer arbiter has finished discriminating before PAGE(x) returns (if the category of x can be determined, x and its category have also already been sent to the marking unit).
Step 306: the second-layer arbiter monitors queue Q1; if there is data in Q1, it takes the data out in order and discriminates it.
Suppose the data taken out is <x, PAGE(x)>. PAGE(x) is preprocessed: the title is extracted and processed according to the input requirements of this layer's classifier to obtain TITLE(x). TITLE(x) is input into the classifier.
Step 307: whether the second-layer arbiter can determine the category of TITLE(x) to be c; if so, go to steps 303 and 311 respectively; otherwise, go to step 308;
Step 308: <x, PAGE(x)> is put into the to-be-discriminated queue Q2, and the flow proceeds to step 309;
Step 309: the third-layer arbiter monitors queue Q2; if there is data in Q2, it takes the data out in order and discriminates it.
Suppose the data taken out is <x, PAGE(x)>. PAGE(x) is preprocessed: the content of PAGE(x) is extracted and processed according to the input requirements of this layer's classifier to obtain CONTENT(x).
Step 310: whether the third-layer arbiter can determine the category of CONTENT(x) to be c; if so, go to steps 303 and 311 respectively.
Step 311: <x, c, PAGE(x)> is sent to the user host for subsequent processing, and the discrimination process ends.
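The flow above can be sketched with a thread pool so that the page fetch (step 301B) and the first-layer discrimination (step 301A) overlap. This is a simplified illustration, not the application's implementation: `classify_request` and its parameters are hypothetical, a dictionary stands in for the marking unit, the queues Q1 and Q2 are replaced by direct calls, and `layer2` and `layer3` are assumed to do their own preprocessing of the page.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Optional, Tuple

Classifier = Callable[[str], Optional[str]]


def classify_request(url: str,
                     fetch_page: Callable[[str], str],
                     layer1: Classifier,
                     layer2: Classifier,
                     layer3: Classifier,
                     marks: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Return (category, page) for one URL request."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        page_future = pool.submit(fetch_page, url)      # step 301B: request goes out first
        category = pool.submit(layer1, url).result()    # steps 301A and 302
        if category is not None:
            marks[url] = category                       # step 303: record <x, c>
        page = page_future.result()                     # PAGE(x) returns
    if url in marks:                                    # step 304
        return marks[url], page                         # step 311
    category = layer2(page)                             # steps 305 to 307 (queue Q1)
    if category is None:
        category = layer3(page)                         # steps 308 to 310 (queue Q2)
    if category is not None:
        marks[url] = category                           # step 303
    return category, page                               # step 311
```

A production version would keep the queues so that the slower layers can run on their own workers; the direct calls here only show the order of the decisions.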
As can be seen from the above technical solution provided by the application, the layered approach better resolves the conflict among speed, accuracy and comprehensiveness in classifying network requests. For frequently accessed network requests and network requests to mainstream websites, the first-layer arbiter can quickly determine the category; other network requests are discriminated by the second-layer or third-layer arbiter according to the webpage corresponding to the network request. In this way a good balance is achieved among classification speed, accuracy and comprehensiveness.
In addition, the asynchronous association between the first-layer arbiter and the second-layer arbiter effectively reduces the response time of the discrimination system. Because the second-layer and third-layer arbiters need the webpage corresponding to the network request, the request must be sent to the server and the requested webpage awaited. Initiating the request only after the first-layer arbiter has finished would make the response time of the system too long. In the present invention, the network request is first sent to the remote server and discrimination in the first-layer arbiter then starts, so that by the time the requested webpage returns, the first-layer arbiter has already finished. This reduces the waiting time and hence the response time of the system.
Refer to Fig. 4, a structural block diagram of a network request automatic classification apparatus shown in the application, applied to the network device shown in Fig. 1. The apparatus includes a receiving unit 410, a first judgement unit 420, a second judgement unit 430 and a third judgement unit 440.
The receiving unit 410 is configured to receive a network request;
the first judgement unit 420 is configured to identify the category of the network request according to the network request itself;
the second judgement unit 430 is configured to, if the category of the network request cannot be identified according to the network request itself, input key information of the webpage corresponding to the network request into a first classifier to identify the category of the network request;
the third judgement unit 440 is configured to, if the category of the network request cannot be identified after the key information of the webpage corresponding to the network request is input into the first classifier, input the content of the webpage corresponding to the network request into a second classifier to identify the category of the network request.
In one embodiment, the apparatus may further include an acquiring unit, configured to, while the first judgement unit identifies the category of the network request according to the network request itself, obtain the webpage corresponding to the network request from the server and preprocess the webpage to obtain the key information and content of the webpage corresponding to the network request.
Specifically, the key information of the webpage includes at least one of the abstract and the title of the webpage.
As shown in Fig. 5, in another embodiment, the apparatus may further include an assert unit 450, configured to, after the second judgement unit inputs the key information of the webpage corresponding to the network request into the first classifier and a category is identified, judge whether the probability with which the first classifier identifies the network request as that category is greater than or equal to the preset discrimination threshold of that category; if so, assert that the category of the network request has been identified after the key information of the webpage corresponding to the network request is input into the first classifier; if not, assert that the category of the network request cannot be identified after the key information of the webpage corresponding to the network request is input into the first classifier.
As shown in Fig. 6, in one embodiment, the first judgement unit 420 includes:
a first mapping subunit 4211, configured to look up, in a class library, the category mapped to the network request, where mapping relations between network requests and categories are preset in the class library;
a second mapping subunit 4212, configured to, if no category mapped to the network request is found in the class library, look up in a feature database the category mapped to the category information carried by the network request, where mapping relations between the category information carried by network requests and categories are preset in the feature database.
In another embodiment, the apparatus may further include a processing unit, configured to process the network request according to the identified category.
As can be seen from the above technical solution provided by the application, the layered approach better resolves the conflict among speed, accuracy and comprehensiveness in classifying network requests. For frequently accessed network requests and network requests to mainstream websites, the first-layer arbiter can quickly determine the category; other network requests are discriminated by the second-layer or third-layer arbiter according to the webpage corresponding to the network request. In this way a good balance is achieved among classification speed, accuracy and comprehensiveness.
In addition, the asynchronous association between the first-layer arbiter and the second-layer arbiter effectively reduces the response time of the discrimination system. Because the second-layer and third-layer arbiters need the webpage corresponding to the network request, the request must be sent to the server and the requested webpage awaited. Initiating the request only after the first-layer arbiter has finished would make the response time of the system too long. In the present invention, the network request is first sent to the remote server and discrimination in the first-layer arbiter then starts, so that by the time the requested webpage returns, the first-layer arbiter has already finished. This reduces the waiting time and hence the response time of the system.
As the apparatus embodiments basically correspond to the method embodiments, refer to the description of the method embodiments for related details. The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this application. A person of ordinary skill in the art can understand and implement it without creative effort.
Referring to Fig. 7, Fig. 7 is a structural block diagram of an electronic device according to the present application. As shown in Fig. 7, the electronic device 700 includes a processor 701 and a memory 702, wherein
the memory 702 is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor 701 to implement all or some of the steps of the foregoing method.
Fig. 8 is a schematic structural diagram of a computer system suitable for implementing the network request automatic classification method according to the present application.
As shown in Fig. 8, the computer system 800 includes a central processing unit (CPU) 801, which can execute the various processes of the embodiments shown in Figs. 2-3 above according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage section 808 into a random access memory (RAM) 803. The RAM 803 also stores the various programs and data required for the operation of the system 800. The CPU 801, the ROM 802, and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), or the like, and a speaker or the like; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a LAN card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage section 808 as needed.
In particular, according to embodiments of the present application, the method described above with reference to Figs. 2-3 may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program tangibly embodied on a readable medium, the computer program containing program code for executing the aforementioned network request automatic classification method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 809, and/or installed from the removable medium 811.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units or modules may also be provided in a processor, and in some cases the names of these units or modules do not constitute a limitation on the units or modules themselves.
In another aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium may be the one included in the device described in the above embodiments, or it may be a computer-readable storage medium that exists separately and is not assembled into a device. The computer-readable storage medium stores one or more programs, and the programs are used by one or more processors to execute the methods described in the present application.
The foregoing are merely preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the scope of protection of the present application.
Claims (14)
1. A network request automatic classification method, characterized by comprising:
receiving a network request;
identifying the category of the network request according to the network request itself;
if the category of the network request cannot be identified according to the network request itself, inputting key information of the web page corresponding to the network request into a first classifier to identify the category of the network request;
if the category of the network request cannot be identified after the key information of the web page corresponding to the network request is input into the first classifier, inputting the content of the web page corresponding to the network request into a second classifier to identify the category of the network request.
2. The method according to claim 1, characterized by further comprising:
while identifying the category of the network request according to the network request itself, obtaining the web page corresponding to the network request from a server and preprocessing the web page to obtain the key information and the content of the web page corresponding to the network request.
3. The method according to claim 1 or 2, characterized in that the key information of the web page includes at least one of an abstract and a title of the web page.
4. The method according to claim 1 or 2, characterized by further comprising:
after the key information of the web page corresponding to the network request is input into the first classifier and a category of the network request is identified, judging whether the probability with which the first classifier identifies the network request as that category is greater than or equal to a preset discrimination threshold of that category; if so, determining that the category of the network request has been identified after the key information of the web page corresponding to the network request is input into the first classifier; if not, determining that the category of the network request cannot be identified after the key information of the web page corresponding to the network request is input into the first classifier.
5. The method according to claim 1, characterized in that identifying the category of the network request according to the network request itself comprises:
searching a class library for a category mapped to the network request, wherein mapping relations between network requests and categories are preset in the class library;
if no category mapped to the network request is found in the class library, searching a feature database for a category mapped to the classification information carried by the network request, wherein mapping relations between classification information carried by network requests and categories are preset in the feature database.
6. The method according to claim 1, characterized by further comprising:
processing the network request according to the identified category.
7. A network request automatic classification device, characterized by comprising:
a receiving unit, configured to receive a network request;
a first discrimination unit, configured to identify the category of the network request according to the network request itself;
a second discrimination unit, configured to, if the category of the network request cannot be identified according to the network request itself, input key information of the web page corresponding to the network request into a first classifier to identify the category of the network request;
a third discrimination unit, configured to, if the category of the network request cannot be identified after the key information of the web page corresponding to the network request is input into the first classifier, input the content of the web page corresponding to the network request into a second classifier to identify the category of the network request.
8. The device according to claim 7, characterized by further comprising:
an acquiring unit, configured to, while the first discrimination unit identifies the category of the network request according to the network request itself, obtain the web page corresponding to the network request from a server and preprocess the web page to obtain the key information and the content of the web page corresponding to the network request.
9. The device according to claim 7 or 8, characterized in that the key information of the web page includes at least one of an abstract and a title of the web page.
10. The device according to claim 7 or 8, characterized by further comprising:
a determination unit, configured to, after the second discrimination unit inputs the key information of the web page corresponding to the network request into the first classifier and a category of the network request is identified, judge whether the probability with which the first classifier identifies the network request as that category is greater than or equal to a preset discrimination threshold of that category; if so, determine that the category of the network request has been identified after the key information of the web page corresponding to the network request is input into the first classifier; if not, determine that the category of the network request cannot be identified after the key information of the web page corresponding to the network request is input into the first classifier.
11. The device according to claim 7, characterized in that the first discrimination unit comprises:
a first mapping subunit, configured to search a class library for a category mapped to the network request, wherein mapping relations between network requests and categories are preset in the class library;
a second mapping subunit, configured to, if no category mapped to the network request is found in the class library, search a feature database for a category mapped to the classification information carried by the network request, wherein mapping relations between classification information carried by network requests and categories are preset in the feature database.
12. The device according to claim 7, characterized by further comprising:
a processing unit, configured to process the network request according to the identified category.
13. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-6.
14. A computer-readable storage medium, characterized in that a computer program is stored thereon, and the program, when executed by a processor, implements the method according to any one of claims 1-6.
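As a rough illustration of the two-stage lookup recited in claims 5 and 11, the sketch below first consults a class library and then a feature database. The claims do not specify the form of the keys or of the "classification information carried by the network request"; keying the class library by host name and treating the carried information as keywords appearing in the URL are assumptions made purely for this example, as is the function name `first_layer_lookup`.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def first_layer_lookup(
    url: str,
    class_library: Dict[str, str],    # assumed: host name -> category
    feature_database: Dict[str, str],  # assumed: URL keyword -> category
) -> Optional[str]:
    # Stage 1: look for a category mapped directly to the request.
    host = urlsplit(url).hostname or ""
    category = class_library.get(host)
    if category is not None:
        return category

    # Stage 2: look for a category mapped to information carried by the
    # request (here: the first known keyword found in the URL).
    lowered = url.lower()
    for feature, feature_category in feature_database.items():
        if feature in lowered:
            return feature_category

    # Neither library can decide; later layers have to take over.
    return None
```

A return value of `None` corresponds to the case in which the category cannot be identified from the request itself.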
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811639957.5A CN109726347A (en) | 2018-12-29 | 2018-12-29 | Network request automatic classification method and relevant device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811639957.5A CN109726347A (en) | 2018-12-29 | 2018-12-29 | Network request automatic classification method and relevant device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109726347A (en) | 2019-05-07 |
Family
ID=66297978
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811639957.5A Pending CN109726347A (en) | 2018-12-29 | 2018-12-29 | Network request automatic classification method and relevant device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109726347A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101872347A (en) * | 2009-04-22 | 2010-10-27 | 富士通株式会社 | Method and device for judging type of webpage |
CN102624703A (en) * | 2011-12-31 | 2012-08-01 | 成都市华为赛门铁克科技有限公司 | Method and device for filtering uniform resource locators (URLs) |
CN103544210A (en) * | 2013-09-02 | 2014-01-29 | 烟台中科网络技术研究所 | System and method for identifying webpage types |
CN105512143A (en) * | 2014-09-26 | 2016-04-20 | 中兴通讯股份有限公司 | Method and device for web page classification |
CN105591997A (en) * | 2014-10-20 | 2016-05-18 | 杭州迪普科技有限公司 | URL (uniform resource locator) classification and filtering method and device |
CN108134784A (en) * | 2017-12-19 | 2018-06-08 | 东软集团股份有限公司 | web page classification method and device, storage medium and electronic equipment |
EP3355240A1 (en) * | 2017-01-31 | 2018-08-01 | Wipro Limited | A method and a system for generating a multi-level classifier for image processing |
Non-Patent Citations (2)
Title |
---|
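Hu Xuegang et al.: "Research on relevant features for automatic recognition of news web pages", Journal of Guangxi Normal University: Natural Science Edition * |
Xu Shiming et al.: "An efficient SVM Chinese web page classifier based on pre-classification", Computer Engineering and Applications * |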
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106156365B (en) | A kind of generation method and device of knowledge mapping | |
US9323839B2 (en) | Classification rule generation device, classification rule generation method, classification rule generation program, and recording medium | |
US10565444B2 (en) | Using visual features to identify document sections | |
AU2005201758B2 (en) | Method of learning associations between documents and data sets | |
US10637826B1 (en) | Policy compliance verification using semantic distance and nearest neighbor search of labeled content | |
CN110929125B (en) | Search recall method, device, equipment and storage medium thereof | |
CN106960030A (en) | Pushed information method and device based on artificial intelligence | |
CN111753171B (en) | Malicious website identification method and device | |
CN106708940A (en) | Method and device used for processing pictures | |
CN108446295A (en) | Information retrieval method, device, computer equipment and storage medium | |
CN103678460B (en) | For identifying the method and system for the non-text elements for being suitable to be communicated in multi-language environment | |
CN110032622B (en) | Keyword determination method, keyword determination device, keyword determination equipment and computer readable storage medium | |
US10963686B2 (en) | Semantic normalization in document digitization | |
US11321531B2 (en) | Systems and methods of updating computer modeled processes based on real time external data | |
CN110321428A (en) | Structural maintenance mapper | |
CN111309288B (en) | Analysis method and device of software requirement specification file suitable for banking business | |
RU105758U1 (en) | ANALYSIS AND FILTRATION SYSTEM FOR INTERNET TRAFFIC BASED ON THE CLASSIFICATION METHODS OF MULTI-DIMENSIONAL DOCUMENTS | |
CN109726347A (en) | Network request automatic classification method and relevant device | |
EP4167122A1 (en) | Extracting key value pairs using positional coordinates | |
CN113888760B (en) | Method, device, equipment and medium for monitoring violation information based on software application | |
US11503055B2 (en) | Identifying siem event types | |
CN114492446A (en) | Legal document processing method and device, electronic equipment and storage medium | |
WO2023003488A1 (en) | Checking of a document for compliance with personal data requirements | |
US11372925B2 (en) | Information processing apparatus and non-transitory computer readable medium storing information processing program | |
CN107656909A (en) | A kind of Documents Similarity decision method and device based on document composite character |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190507 |