CN107886344A - Method and device for recognizing fraudulent advertisement pages based on a convolutional neural network - Google Patents

Publication number
CN107886344A
Authority
CN
China
Prior art keywords
picture
neural networks
convolutional neural
training
page
Prior art date
Application number
CN201610875790.7A
Other languages
Chinese (zh)
Inventor
黃獻德
Original Assignee
北京金山安全软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京金山安全软件有限公司
Priority to CN201610875790.7A
Publication of CN107886344A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62: Methods or arrangements for recognition using electronic means
    • G06K9/6217: Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6256: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging, boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62: Methods or arrangements for recognition using electronic means
    • G06K9/6267: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06Q: DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce, e.g. shopping or e-commerce
    • G06Q30/02: Marketing, e.g. market research and analysis, surveying, promotions, advertising, buyer profiling, customer management or rewards; Price estimation or determination
    • G06Q30/0241: Advertisement
    • G06Q30/0248: Avoiding fraud

Abstract

The invention discloses a method, device and server for recognizing fraudulent advertisement pages based on a convolutional neural network. The method includes: collecting page pictures to build a training set and preprocessing all pictures in the training set; building a convolutional neural network and training it with the pictures in the training set; then obtaining a picture of the page to be detected, preprocessing it, and inputting it into the trained convolutional neural network for detection. As a result, it is no longer necessary to spend resources extracting signatures from URLs, as was done in the past, or to extract features manually for image recognition, and the problem of fraudulent advertisements is addressed effectively.

Description

Method and device for recognizing fraudulent advertisement pages based on a convolutional neural network

Technical field

The present invention relates to image recognition methods in the field of computer vision, and more particularly to a method, device and server for recognizing fraudulent advertisement pages based on a convolutional neural network.

Background

Image recognition (pattern recognition) is a technology that uses computational methods to interpret images automatically. In a computer, an image is stored and processed as arrays. In the three-primary-colour (RGB) model, for example, the image is first separated into layers by colour, and each layer then records the intensity value of that colour at each pixel position. A common representative technique is optical character recognition (OCR): text images are processed, the main features are extracted and recorded as a feature model, and after comparison with the actual input image the text is converted into a character string on the basis of logic and probability, so that the system can process it further. Another example is licence plate recognition: images of vehicles are fed into the system, the plate characters are recognized by feature matching, and the result can be used for existing applications such as toll charging, security control and tracking of suspect vehicles.

In today's era of rapidly changing information, it is very common for everyone to own a smartphone, with Android phones being the most common. When browsing the web on a phone, users are often told that the device is infected or that an application needs updating, and are pushed, in their confusion, into downloading and installing an app they may not need at all. This is so-called deceptive advertising (as shown in Fig. 1): while the user is browsing a page, tricks are used to make the user believe that the device has been infected by a virus and to lure the user to a download page to "install" some app. The presentation of fraudulent advertisements changes constantly and varies with country, time zone and language, which makes them hard to guard against.

Most current related techniques directly collect the target URLs to be blocked and turn them into signatures (that is, a blacklist/whitelist mechanism) to filter the advertisement pages of traditional phishing websites. The biggest difference with fraudulent advertisements is that different advertisement content pops up depending on location, time zone, browser language and so on, in order to lure the user into clicking and installing the application being promoted. This approach is clearly unsuited to countering fraudulent advertisements, whose life cycles are short and which change frequently. There are also practices such as building signatures from the source code of the web page content, but these are applicable only to a few scenarios such as phishing websites and are clearly insufficient against fast-changing fraudulent advertisement pages.

Summary of the invention

The present invention aims to solve at least one of the above technical problems, at least to some extent.

To this end, a first object of the present invention is to propose a method for recognizing fraudulent advertisement pages based on a convolutional neural network. The method mainly uses convolutional neural network technology from deep learning: screenshots of web pages on which fraudulent advertisements pop up form a training set, and small regions of each image are filtered layer by layer, as shown in Fig. 6. By learning from a large number of training samples, the convolutional neural network can automatically extract the features of fraudulent advertisement page pictures and then automatically recognize fraudulent advertisements in unknown pages.

A second object of the present invention is to propose a device for recognizing fraudulent advertisement pages based on a convolutional neural network.

A third object of the present invention is to propose a server.

To achieve the above objects, the method for recognizing fraudulent advertisement pages based on a convolutional neural network according to an embodiment of the first aspect of the present invention includes:

Collecting page pictures to build a training set, the training set containing at least fraudulent advertisement page pictures and normal pictures;

Scaling all pictures in the training set to obtain colour pictures of a predetermined size and calculating the colour values of each picture, wherein each picture in the training set carries label information used to mark the category of the picture;

Building a convolutional neural network that includes one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers and one output layer, and training the convolutional neural network with the pictures in the training set so that the output-layer feature value of the convolutional neural network is identical to the label information of each input picture, wherein the size of the input layer is identical to the predetermined size of the pictures in the training set;

Obtaining a picture of the page to be detected, processing the picture, inputting it into the trained convolutional neural network, and judging from the output-layer feature value of the convolutional neural network whether the page to be detected contains a fraudulent advertisement.

In one possible implementation of the first aspect, collecting page pictures to build the training set includes:

Obtaining fraudulent advertisement pages, opening each advertisement page in a simulated manner on a virtual private server, taking a screenshot, and storing it in the training set.

In another possible implementation of the first aspect, the recognition method further includes:

If it is determined from the output-layer feature value of the convolutional neural network that the picture of the page to be detected contains a fraudulent advertisement, saving the detection result.

In another possible implementation of the first aspect, the convolutional neural network specifically includes one input layer, four convolutional layers and four pooling layers, two fully connected layers and one output layer. The input layer precedes the convolutional layers, each convolutional layer is followed by a pooling layer, and the two fully connected layers lie between the last pooling layer and the output layer.

In another possible implementation of the first aspect, the convolutional layer sizes of the convolutional neural network are between 10x10 and 255x255, and the pooling layer sizes are between 5x5 and 128x128.

In another possible implementation of the first aspect, the four convolutional layer sizes of the convolutional neural network are 255x255, 96x96, 28x28 and 10x10 respectively, and the four pooling layer sizes are 128x128, 48x48, 14x14 and 5x5 respectively.

In another possible implementation of the first aspect, the number of nodes in the fully connected layers of the convolutional neural network is between 10 and 100.

In another possible implementation of the first aspect, the output layer of the convolutional neural network is a softmax classifier, and the number of nodes in the output layer equals the number of label categories of the pictures in the training set.

The device for recognizing fraudulent advertisement pages based on a convolutional neural network according to an embodiment of the second aspect of the present invention includes a training module, a convolutional neural network model and an interface module, wherein:

The training module is configured to collect page pictures to build a training set, the training set containing at least fraudulent advertisement page pictures and normal pictures; all pictures in the training set are scaled to obtain colour pictures of a predetermined size and the colour values of each picture are calculated, wherein each picture in the training set carries label information used to mark the category of the picture;

The convolutional neural network model is configured to build a convolutional neural network that includes one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers and one output layer, and to train the convolutional neural network with the pictures in the training set so that the output-layer feature value of the convolutional neural network is identical to the label information of each input picture, wherein the size of the input layer is identical to the predetermined size of the pictures in the training set;

The interface module is configured to obtain a picture of the page to be detected, process the picture, input it into the trained convolutional neural network, and judge from the output-layer feature value of the convolutional neural network whether the page to be detected contains a fraudulent advertisement.

In one possible implementation of the second aspect, the training module further includes:

A fraud picture submodule, configured to obtain fraudulent advertisement pages, open each advertisement page in a simulated manner on a virtual private server, take a screenshot, and store it in the training set.

In another possible implementation of the second aspect, the device further includes:

A storage module, configured to save the detection result when it is determined from the output-layer feature value of the convolutional neural network that the picture of the page to be detected contains a fraudulent advertisement.

In another possible implementation of the second aspect, the convolutional neural network specifically includes one input layer, four convolutional layers and four pooling layers, two fully connected layers and one output layer. The input layer precedes the convolutional layers, each convolutional layer is followed by a pooling layer, and the two fully connected layers lie between the last pooling layer and the output layer.

In another possible implementation of the second aspect, the convolutional layer sizes of the convolutional neural network are between 10x10 and 255x255, and the pooling layer sizes are between 5x5 and 128x128.

In another possible implementation of the second aspect, the four convolutional layer sizes of the convolutional neural network are 255x255, 96x96, 28x28 and 10x10 respectively, and the four pooling layer sizes are 128x128, 48x48, 14x14 and 5x5 respectively.

In another possible implementation of the second aspect, the number of nodes in the fully connected layers of the convolutional neural network is between 10 and 100.

In another possible implementation of the second aspect, the output layer of the convolutional neural network is a softmax classifier, and the number of nodes in the output layer equals the number of label categories of the pictures in the training set.

The server according to an embodiment of the third aspect of the present invention includes a memory, a processor and a communication interface. The memory is used to store executable program code; the processor reads the executable program code stored in the memory and runs the corresponding program in order to perform any of the foregoing methods for recognizing fraudulent advertisements based on a convolutional neural network.

The method, device and server for recognizing fraudulent advertisement pages based on a convolutional neural network according to the embodiments of the present invention collect page pictures from the fraudulent advertisement pages that users actually see and combine them with normal pictures to build a training set. The pictures in the training set are processed and input into the constructed convolutional neural network, and the model is trained repeatedly using the convolutional neural network algorithm. Once the convolutional neural network can correctly recognize all fraudulent advertisements in the training set, it can be used to identify whether unknown page pictures contain fraudulent advertisements. This removes much of the manpower required by traditional signature-based or image recognition approaches, and is especially effective against fraudulent advertisement content that varies quickly with settings such as time zone and language. If the trained convolutional neural network model is deployed on a cloud host, a query interface can also be provided so that users can determine whether an unknown page they encounter contains a fraudulent advertisement.

Additional aspects and advantages of the present invention will be set forth in part in the following description, will in part become apparent from that description, or will be learned through practice of the present invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is an example of a fraudulent advertisement page according to the present invention;

Fig. 2 is a flow chart of the method for recognizing fraudulent advertisement pages based on a convolutional neural network according to one embodiment of the present invention;

Fig. 3 is a structural diagram of the device for recognizing fraudulent advertisement pages based on a convolutional neural network according to one embodiment of the present invention;

Fig. 4 is a schematic diagram of the system for recognizing fraudulent advertisement pages based on a convolutional neural network according to one embodiment of the present invention;

Fig. 5 is a structural diagram of one embodiment of the server of the present invention;

Fig. 6 is a schematic diagram of the convolutional neural network built by the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.

A convolutional neural network (CNN) is a type of neural network. It can be divided into an input layer (representing the input variables), an output layer (representing the variables to be predicted), and intermediate convolutional layers, which increase the complexity of the neurons so that the network can model more complex functions.

A convolutional neural network uses spatial relationships to share weights, which reduces the number of parameters to be learned. Only a small region of the picture at a time serves as input to the lowest layer of the hierarchy, and the information is then passed on through the successive layers; each layer uses a small convolution kernel to extract the most salient features of the observed data. The large reduction in parameters lowers the amount of computation and helps produce results quickly. Deep learning combines large numbers of training samples, computing power and flexible neural network design to obtain effective image recognition features quickly.
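To make the effect of weight sharing concrete, the following minimal sketch compares the number of learnable parameters of a fully connected layer with that of a convolutional layer. The 3x3 kernel and the 32 units/filters are hypothetical numbers chosen only for illustration; the source does not specify kernel dimensions or filter counts.

```python
def dense_params(in_h, in_w, in_c, units):
    """Weights plus biases when every input value connects to every unit."""
    return in_h * in_w * in_c * units + units


def conv_params(kernel_h, kernel_w, in_c, filters):
    """Weights plus biases when each filter's small kernel is shared across all positions."""
    return kernel_h * kernel_w * in_c * filters + filters


# Assumed numbers: a 299x299 RGB input, 32 dense units versus 32 filters of 3x3.
dense = dense_params(299, 299, 3, 32)
conv = conv_params(3, 3, 3, 32)
print("fully connected:", dense)
print("convolutional:  ", conv)
```

Under these assumptions the convolutional layer needs several orders of magnitude fewer parameters, because its kernel weights do not depend on the size of the input picture.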

The method, device and server for recognizing fraudulent advertisement pages based on a convolutional neural network according to the embodiments of the present invention are described below with reference to the accompanying drawings.

Fig. 2 is a flow chart of the method for recognizing fraudulent advertisement pages based on a convolutional neural network according to one embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:

S100: collect page pictures to build a training set.

The training set contains at least fraudulent advertisement page pictures and normal pictures. Each picture carries label information used to mark the category of the picture; for example, 0 indicates that the picture is a fraudulent advertisement and 1 indicates that the picture is a normal picture.

S102: scale all pictures in the training set to obtain colour pictures of a predetermined size, and calculate the colour values of each picture in the training set.

After this processing, the pictures of the predetermined size in the training set can serve as input to the subsequent convolutional neural network, which is trained to learn the picture features of fraudulent advertisements automatically.

In an optional embodiment, fraudulent advertisement pages that appear on user equipment can be collected: when a user browsing the web on their own terminal device encounters a fraudulent advertisement page, the page, for example its URL, is reported and uploaded.

In an optional embodiment, for a fraudulent advertisement page reported by a user, a virtual private server can open the URL in a simulated manner to reproduce the page and take a screenshot, yielding a fraudulent advertisement page picture. Once a certain number of fraudulent advertisement page pictures have been obtained, the training set is built. For a convolutional neural network, the more samples the training set contains, the more accurate the resulting output-layer feature values. To balance this against efficiency, one embodiment of the present invention uses 100,000 fraudulent advertisement page pictures and 100,000 normal pictures as the training set.

When the training set is built, the pictures in it are scaled to obtain colour pictures of a predetermined size, and the colour values of each picture are calculated. Scaling a picture in the training set does not affect the image features the picture contains, so the predetermined size can be adjusted according to the output performance of the convolutional neural network during training.

In an embodiment of the present invention, the predetermined size to which pictures are scaled is between 100x100 and 300x300, preferably 299x299.
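The scaling and colour-value step can be sketched as follows. This is a minimal illustration, not the patent's implementation: nearest-neighbour sampling and division by 255 are assumptions, since the source says only that pictures are scaled to a predetermined size and that their colour values are calculated. The function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Scale an image (rows of (r, g, b) tuples) to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def color_values(image):
    """Convert 0-255 channel values to floats in [0, 1] (an assumed normalisation)."""
    return [[tuple(c / 255.0 for c in pixel) for pixel in row] for row in image]


# A tiny 2x2 stand-in for a page screenshot.
screenshot = [[(255, 0, 0), (0, 255, 0)],
              [(0, 0, 255), (255, 255, 255)]]

# The text prefers 299x299; 4x4 keeps the demonstration small.
scaled = resize_nearest(screenshot, 4, 4)
values = color_values(scaled)
print(len(values), len(values[0]), values[0][0])
```

In practice an image library would normally handle decoding and resampling; the pure-Python version is used here only so that the sketch has no dependencies.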

S104: build a convolutional neural network that includes one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers and one output layer, and train it with the pictures in the training set using a gradient descent algorithm so that the output-layer feature value of the convolutional neural network is identical to the label information of each input picture. The size of the input layer is identical to the predetermined size of the pictures in the training set.
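The idea of training by gradient descent until the output matches the labels can be illustrated on a much simpler model. The sketch below trains a single linear layer with a softmax output on toy two-feature data; it is a simplified stand-in for the convolutional network, and the learning rate, epoch count, features and function names are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, labels, n_classes, lr=0.5, epochs=200):
    """Batch gradient descent on cross-entropy loss for one linear layer."""
    n_features = len(samples[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    n = len(samples)
    for _ in range(epochs):
        grad_w = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(samples, labels):
            scores = [sum(w * v for w, v in zip(weights[k], x)) + biases[k]
                      for k in range(n_classes)]
            probs = softmax(scores)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)  # d(loss)/d(score_k)
                grad_b[k] += err
                for j in range(n_features):
                    grad_w[k][j] += err * x[j]
        # Step against the average gradient.
        for k in range(n_classes):
            biases[k] -= lr * grad_b[k] / n
            for j in range(n_features):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the class with the highest score."""
    scores = [sum(w * v for w, v in zip(wk, x)) + b for wk, b in zip(weights, biases)]
    return scores.index(max(scores))


# Toy data: label 0 = fraudulent advertisement, 1 = normal picture.
samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [0, 0, 1, 1]
weights, biases = train(samples, labels, n_classes=2)
predictions = [predict(weights, biases, x) for x in samples]
print(predictions)
```

A real convolutional network applies the same principle, with the gradients propagated back through the convolutional, pooling and fully connected layers.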

In an optional embodiment, the convolutional neural network specifically includes one input layer, four convolutional layers and four pooling layers, two fully connected layers and one output layer. The input layer precedes the convolutional layers, each convolutional layer is followed by a pooling layer, and the two fully connected layers lie between the last pooling layer and the output layer.

In an optional embodiment, the convolutional layer sizes of the convolutional neural network are between 10x10 and 255x255, the pooling layer sizes are between 5x5 and 128x128, and the number of nodes in the fully connected layers is between 10 and 100.

The activation function of the convolutional layers is ReLU, and the activation function of the final output layer is softmax.
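For reference, the two activation functions named above can be written in a few lines. This is a generic sketch of ReLU and softmax operating on plain Python lists, not code taken from the patent.

```python
import math


def relu(values):
    """ReLU keeps positive values and replaces negative ones with zero."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Softmax maps raw scores to probabilities that sum to 1."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


activated = relu([-2.0, 0.5, 3.0])
probs = softmax([2.0, 0.0])
print(activated, probs)
```

ReLU is applied element-wise inside the network, while softmax is applied once at the output so that the values can be read as class probabilities.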

In a preferred embodiment, the input layer size of the convolutional neural network is 299x299; the four convolutional layer sizes are 255x255, 96x96, 28x28 and 10x10 respectively, and the four pooling layer sizes are 128x128, 48x48, 14x14 and 5x5 respectively. The activation function of each convolutional layer is ReLU. The size of the first fully connected layer is 100 and the size of the second fully connected layer is 10. The final output layer is a softmax classifier, and the number of nodes in the output layer equals the number of label categories of the pictures in the training set.
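The layer order of the preferred embodiment can be written down as a simple plan. The sizes below are copied from the text; the tuple layout and the names `build_layer_plan`, `conv_relu` and `softmax_output` are illustrative assumptions, and the sketch does not interpret what each "size" means in terms of kernels or feature maps, which the source leaves unstated.

```python
# Sizes taken from the preferred embodiment; the data layout is an assumption.
INPUT_SIZE = (299, 299)
CONV_SIZES = [(255, 255), (96, 96), (28, 28), (10, 10)]
POOL_SIZES = [(128, 128), (48, 48), (14, 14), (5, 5)]
DENSE_SIZES = [100, 10]


def build_layer_plan(num_classes):
    """Return the layer order: input, (conv + ReLU, pool) x 4, two dense layers, softmax output."""
    plan = [("input", INPUT_SIZE)]
    for conv, pool in zip(CONV_SIZES, POOL_SIZES):
        plan.append(("conv_relu", conv))
        plan.append(("pool", pool))
    for units in DENSE_SIZES:
        plan.append(("dense", units))
    plan.append(("softmax_output", num_classes))
    return plan


plan = build_layer_plan(num_classes=2)
for name, size in plan:
    print(name, size)
```

Keeping the plan as data makes it easy to change the number of output nodes when the training set contains more label categories.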

In one embodiment of the present invention, the training set contains two categories, fraudulent advertisements (label information 0) and normal pictures (label information 1), so the output layer has 2 nodes and the softmax classifier produces two output values. After the picture values in the training set are input into the convolutional neural network, training is repeated until the network correctly recognizes the fraudulent pictures and normal pictures in the training set, that is, until the output-layer feature value of the convolutional neural network is identical to the label information of each input picture: the output value is 0 when the input is a fraudulent advertisement and 1 when the input is a normal picture. At this point the convolutional neural network has learned the feature values of fraudulent advertisements and can be used to identify whether other, unknown pictures contain a fraudulent advertisement.
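One common way to turn the two softmax output values into the single label value described above is to take the index of the larger one. The sketch below assumes that convention; the source states only that the output is 0 for a fraudulent advertisement and 1 for a normal picture, and the name `classify` is hypothetical.

```python
LABELS = {0: "fraudulent advertisement", 1: "normal picture"}


def classify(output_values):
    """Return the label index with the largest output value, together with its name."""
    if len(output_values) != len(LABELS):
        raise ValueError("expected one output value per label category")
    index = max(range(len(output_values)), key=output_values.__getitem__)
    return index, LABELS[index]


result = classify([0.93, 0.07])
print(result)
```

Adding a further category, for example a third label, only requires another entry in `LABELS` and a matching third output node.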

Optionally, fraudulent pictures and normal pictures that are not identical to the pictures in the training set can be selected to form a test set and input into the convolutional neural network, with the aim that the output-layer feature value of the network matches the label information of each picture in the test set. The parameters of the convolutional neural network can also be adjusted while it is being tested.
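A held-out test set and an accuracy check can be sketched as follows. The every-fifth-item split, the file names and the helper names are assumptions for illustration; the source requires only that the test pictures differ from the training pictures.

```python
def split_disjoint(items, test_every=5):
    """Hold out every `test_every`-th item so the training and test sets share no picture."""
    train = [item for i, item in enumerate(items) if i % test_every != 0]
    test = [item for i, item in enumerate(items) if i % test_every == 0]
    return train, test


def accuracy(predicted, expected):
    """Fraction of pictures whose predicted label equals the label information."""
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


# Hypothetical (picture_id, label) pairs; 0 = fraudulent advertisement, 1 = normal picture.
dataset = [(f"img_{i:03d}.png", i % 2) for i in range(10)]
train_set, test_set = split_disjoint(dataset)
score = accuracy([0, 1, 1, 1], [0, 1, 0, 1])
print(len(train_set), len(test_set), score)
```

A test accuracy well below the training accuracy would suggest that the network parameters need adjusting, as the text notes.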

It should be pointed out that, as an optional embodiment, the training set can also contain pictures of other categories. For example, it can also contain 100,000 pornographic pictures, whose label information can be set to 2. When the convolutional neural network is trained with a training set containing three categories of pictures, the output layer has 3 nodes and the softmax classifier of the output layer produces three output values, which indicate whether the input picture is a normal picture, a fraudulent advertisement or a pornographic picture. After repeated training, the convolutional neural network can correctly recognize whether a picture in the training set is a normal picture, a fraudulent advertisement or a pornographic picture: an output value of 0 indicates a fraudulent advertisement, 1 indicates a normal picture, and 2 indicates a pornographic picture.

It should be noted that the above numerical values serve only to illustrate the technique of the present invention and are not intended to limit the parameters of the convolutional neural network.

S106: obtain a picture of the page to be detected, process the picture, input it into the trained convolutional neural network, and judge from the feature value produced by the output layer of the convolutional neural network whether the page to be detected contains a fraudulent advertisement.

In an optional embodiment, the trained convolutional neural network can be deployed on a cloud server, and an open query interface helps users determine whether an actual page contains a fraudulent advertisement.

When a picture of the page to be detected is received from a user, the picture is scaled to obtain a colour picture of the predetermined size, and its colour values are input into the trained convolutional neural network. Whether the page to be detected contains a fraudulent advertisement is judged from the output-layer feature value of the convolutional neural network. For example, as described above, the picture to be detected is judged to contain a fraudulent advertisement when the output-layer feature value is 0.

In practice, the user can send either the page URL or the page picture to the cloud server for judgment. If the user has sent a URL, the cloud server can open the page and capture the picture to be detected; otherwise it uses the page picture sent by the user directly as the picture to be detected. The picture to be detected is processed and input into the trained convolutional neural network, whether the page to be detected contains a fraudulent advertisement is judged from the feature value produced by the output layer, and the result is returned to the user.
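The two query paths, URL or picture, can be sketched as a small dispatch function. Everything here is hypothetical: the request layout, the function names and the stand-in screenshot and detection callables are assumptions that let the sketch run without a browser or a trained network.

```python
def handle_query(request, take_screenshot, detect):
    """Choose the picture to examine from a request holding either a page URL or a page picture."""
    if "picture" in request:
        picture = request["picture"]                 # use the user's picture directly
    elif "url" in request:
        picture = take_screenshot(request["url"])    # open the page and capture it
    else:
        raise ValueError("request needs a 'url' or a 'picture'")
    label = detect(picture)                          # 0 = fraudulent advertisement, 1 = normal
    return {"is_fraud": label == 0}


# Stand-ins for the screenshot service and the trained network.
def fake_screenshot(url):
    return f"screenshot-of:{url}"


def fake_detect(picture):
    return 0 if "scam" in picture else 1


reply = handle_query({"url": "http://scam.example/page"}, fake_screenshot, fake_detect)
print(reply)
```

Passing the screenshot and detection steps in as arguments keeps the dispatch logic testable on its own; a deployment would supply real implementations.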

In an optional embodiment, if the trained convolutional neural network determines that the page to be detected contains a fraudulent advertisement, the detection result is saved. For example, the picture of the page can be stored in the fraudulent advertisement picture set kept on the cloud server, which helps build a larger training sample set for continually retraining and optimizing the convolutional neural network so that its judgments become more accurate. The URL of the page can also be stored in a blacklist of fraudulent advertisement pages on the cloud server, so that fraudulent advertisement pages encountered by users can be detected directly from the page address.
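Saving a detection result in both places can be sketched with a small in-memory store. The class name `DetectionStore` and its methods are hypothetical, and a real deployment would presumably use persistent storage on the cloud server rather than Python objects.

```python
class DetectionStore:
    """Keeps confirmed fraud pictures for retraining and their URLs for direct lookup."""

    def __init__(self):
        self.fraud_pictures = []    # grows the fraudulent advertisement picture set
        self.url_blacklist = set()  # allows detection by page address alone

    def record_fraud(self, picture, url=None):
        """Save a picture judged fraudulent and, if known, blacklist its URL."""
        self.fraud_pictures.append(picture)
        if url is not None:
            self.url_blacklist.add(url)

    def is_blacklisted(self, url):
        """Return True if the URL was previously recorded as a fraud page."""
        return url in self.url_blacklist


store = DetectionStore()
store.record_fraud("page-picture-bytes", url="http://scam.example/page")
print(store.is_blacklisted("http://scam.example/page"))
```

Checking the blacklist before running the network avoids repeating the more expensive picture-based detection for pages that are already known.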

Fraudulent advertisements change quickly and have short life cycles. A convolutional neural network can recognize them automatically and effectively, so it is no longer necessary to extract signatures from URLs or to extract image features manually, which reduces the large amount of manual effort otherwise required.

The invention also provides a kind of fraud advertisement page identification device based on convolutional neural networks, Fig. 3 is according to this The structural representation of the fraud advertisement page identification device of invention one embodiment, as shown in figure 3, the device includes:

Training module 10, training set is made for collecting page pictures, fraud advertising page is comprised at least in the training set Face picture and normal picture;Wherein, all pictures in the training set are zoomed in and out with the cromogram for obtaining predefined size, meter The colour of the picture is calculated, wherein each picture in the training set carries label information, the label information is used to mark The classification of the picture;

In an alternate embodiment of the invention, the training module also includes:Picture submodule is cheated, advertising page is cheated for obtaining Face, the advertisement page progress sectional drawing is opened by virtual exclusive main frame simulation and is stored in the training set.

Convolutional neural networks model 20, for building convolutional neural networks, the convolution neural network is defeated including one Enter layer, multiple convolutional layers and multiple pond layers, multiple full articulamentums and an output layer, instructed using the picture in the training set Practice the convolutional neural networks so that the label of the convolutional neural networks output layer characteristic value and each picture inputted is believed Manner of breathing is same;The size of wherein described input layer is identical with the predefined size of picture in the training set;

Interface module 30, configured to obtain a picture of the page to be detected, process the picture, input it into the trained convolutional neural network, and judge from the output-layer feature value of the convolutional neural network whether the page to be detected contains a fraud advertisement.

In an optional embodiment, the device further includes a storage module, configured to save the detection result when it is determined, from the output-layer feature value of the convolutional neural network, that the picture of the page to be detected contains a fraud advertisement.
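The layer structure handled by the convolutional neural network module can be sanity-checked with simple size arithmetic. The Python sketch below is illustrative only. It assumes 2x2 pooling with stride 2 that rounds up at the border, and stride-1 convolutions without padding; none of these parameters is stated in this specification. They are chosen because they reproduce the convolutional and pooling layer sizes recited in claim 6. All function names are hypothetical.

```python
import math

def pool_output_size(size, window=2, stride=2):
    """Side length after pooling, rounding up at the border (assumed)."""
    return math.ceil((size - window) / stride) + 1

def conv_output_size(size, kernel, stride=1, padding=0):
    """Side length after a convolution with a square kernel."""
    return (size + 2 * padding - kernel) // stride + 1

def implied_kernel(in_size, out_size):
    """Kernel side that maps in_size to out_size at stride 1, no padding."""
    return in_size - out_size + 1

# Convolutional layer sizes recited in claim 6.
conv_sizes = [255, 96, 28, 10]

# Pooling each convolutional output under the assumed 2x2 / stride-2 rule.
pool_sizes = [pool_output_size(s) for s in conv_sizes]
print("pooling sizes:", pool_sizes)

# Kernel sides implied between one pooling layer and the next convolution.
kernels = [implied_kernel(p, c) for p, c in zip(pool_sizes, conv_sizes[1:])]
print("implied kernel sides:", kernels)
```

Under these assumptions the pooled sizes agree with the pooling layer sizes of claim 6. The implied kernel sides are a consequence of the assumptions, not values taken from the patent.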

It should be noted that the foregoing explanation of the embodiments of the fraud advertisement page recognition method based on a convolutional neural network also applies to the embodiments of the fraud advertisement page identification device based on a convolutional neural network of the present invention. Details not disclosed in the device embodiments are not repeated here.

Fig. 4 is a schematic diagram of the fraud advertisement recognition system of an embodiment of the present invention. As shown in Fig. 4, fraud advertisement pages encountered on user equipment are collected: when a user encounters a fraud advertisement page while browsing web pages on his or her own terminal device, the page is reported and uploaded, for example as the page URL.

For a fraud advertisement page URL reported from a user's terminal device, a virtual private server (VPS) can open the URL in simulation to reproduce the page, and a screenshot is taken to obtain a fraud advertisement page picture. After a certain number of fraud advertisement page pictures have been obtained, the training set can be built. For a convolutional neural network, the more samples the training set contains, the more accurate the resulting output-layer feature values.

After the fraud advertisement page pictures have been collected to build the training set, and before they are fed into the convolutional neural network for training, all pictures in the training set are scaled to color images of a predefined size and the color values of each picture are computed, wherein each picture in the training set carries label information, the label information being used to mark whether the picture contains a fraud advertisement.
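The preprocessing step above can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the nearest-neighbour scaling method, the normalisation of channel values to the range 0.0 to 1.0, the 4x4 target size, and all function and label names are assumptions introduced here. The specification only states that pictures are scaled to a predefined size, that their color values are computed, and that each picture carries a label.

```python
def resize_nearest(pixels, new_width, new_height):
    """Nearest-neighbour resize of a row-major grid of (r, g, b) tuples."""
    old_height = len(pixels)
    old_width = len(pixels[0])
    return [
        [pixels[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]

def to_color_values(pixels):
    """Scale 0-255 channel values to 0.0-1.0 (assumed normalisation)."""
    return [[tuple(c / 255.0 for c in px) for px in row] for row in pixels]

def make_sample(pixels, label, size=(4, 4)):
    """Build one training sample: fixed-size color values plus its label."""
    resized = resize_nearest(pixels, size[0], size[1])
    return {"colors": to_color_values(resized), "label": label}

# A 2x2 toy "screenshot": red, green / blue, white.
toy = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]
sample = make_sample(toy, label="fraud_ad")  # hypothetical label name
```

Every sample produced this way has the same dimensions, which is what allows the input layer size to equal the predefined picture size.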

A convolutional neural network is built, the convolutional neural network including one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers, and one output layer. Using the preprocessed pictures in the training set, the convolutional neural network is trained with a gradient descent algorithm until its output-layer feature value matches the label information of each input picture; wherein the size of the input layer is the same as the predefined size of the pictures in the training set.
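To make the gradient descent training step concrete, the sketch below trains only a single softmax layer on toy feature vectors. It is a deliberately reduced stand-in for the full network: the convolutional and pooling layers are omitted, and the learning rate, epoch count, toy data, and function names are all assumptions made for illustration. It shows the core idea of adjusting weights until the output matches each sample's label.

```python
import math

def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def scores_for(weights, biases, features):
    return [sum(w * x for w, x in zip(wk, features)) + b
            for wk, b in zip(weights, biases)]

def train(samples, num_classes, num_features, lr=0.5, epochs=200):
    """Batch gradient descent on cross-entropy loss for one softmax layer."""
    weights = [[0.0] * num_features for _ in range(num_classes)]
    biases = [0.0] * num_classes
    n = len(samples)
    for _ in range(epochs):
        grad_w = [[0.0] * num_features for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for features, label in samples:
            probs = softmax(scores_for(weights, biases, features))
            for k in range(num_classes):
                # Gradient of the loss w.r.t. score k: p_k - 1[k == label]
                error = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += error
                for j in range(num_features):
                    grad_w[k][j] += error * features[j]
        for k in range(num_classes):
            biases[k] -= lr * grad_b[k] / n
            for j in range(num_features):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, biases

def predict(weights, biases, features):
    scores = scores_for(weights, biases, features)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data: label 0 = normal, label 1 = fraud advertisement (hypothetical).
data = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.8, 0.9], 1)]
weights, biases = train(data, num_classes=2, num_features=2)
predictions = [predict(weights, biases, x) for x, _ in data]
```

In a real network the same update rule is applied to every layer's parameters through backpropagation, typically using a deep learning framework rather than hand-written loops.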

After the convolutional neural network has been trained, the convolutional neural network model is deployed on a cloud server and used to detect unknown advertisement pages encountered on user equipment.
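Once deployed, the model's raw output-layer values have to be turned into a decision. The sketch below assumes the softmax output layer recited in claim 8 and a two-category label set. The label names and their order are hypothetical, and the example output values are made up for illustration.

```python
import math

def softmax(values):
    """Convert raw output-layer values into probabilities."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

LABELS = ["normal", "fraud_ad"]  # hypothetical label order

def classify(output_values, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(output_values)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Made-up output-layer values for one page picture.
label, confidence = classify([0.3, 2.1])
```

The number of entries in `LABELS` corresponds to the number of output-layer nodes, which claim 8 ties to the number of label categories in the training set.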

When an unknown advertisement page is found on user equipment, the relevant information can be sent back to the cloud server for identification.

When a page to be detected is received from user equipment and no result can be found in the blacklist of fraud advertisement page addresses, a picture of the page to be detected can be obtained, processed, and input into the trained convolutional neural network, and whether the page to be detected contains a fraud advertisement is judged from the feature value produced by the output layer of the convolutional neural network.
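The two-stage check described above, blacklist lookup first and the trained network only as a fallback, can be sketched as follows. The URL normalisation rule, the 0.5 threshold, the example URLs, and the placeholder classifier are all assumptions made for illustration; the specification does not define how blacklist entries are matched or how the output value is thresholded.

```python
from urllib.parse import urlsplit

def normalize_url(url):
    """Reduce a URL to host + path so trivial variations still match.

    Lowercasing the whole URL is a simplification; real paths can be
    case-sensitive.
    """
    parts = urlsplit(url.strip().lower())
    return parts.netloc + parts.path.rstrip("/")

def detect(url, screenshot, blacklist, classifier, threshold=0.5):
    """Return (is_fraud, source) for a page reported by a user device."""
    if normalize_url(url) in blacklist:
        return True, "blacklist"
    # No blacklist hit: fall back to the trained network.
    fraud_probability = classifier(screenshot)
    return fraud_probability >= threshold, "cnn"

blacklist = {normalize_url("http://bad.example/win-prize/")}

def fake_classifier(screenshot):
    # Placeholder for the trained network; returns a fixed probability.
    return 0.9 if screenshot == "suspicious" else 0.1

known = detect("HTTP://bad.example/win-prize", None, blacklist, fake_classifier)
unknown = detect("http://new.example/page", "suspicious", blacklist, fake_classifier)
```

Checking the blacklist first avoids running the network on pages that are already known, so the more expensive image classification is reserved for previously unseen pages.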

The fraud advertisement recognition system of the present invention, based on a convolutional neural network, can detect in a timely manner whether a page encountered on user equipment contains a fraud advertisement. It is highly effective against the rapid change and short life cycle of fraud advertisements and greatly reduces labor and time costs.

Fig. 5 is a schematic structural diagram of a server provided by an embodiment of the present invention. The server uses a general-purpose computer system structure; the program code for executing the solution of the present invention is stored in the memory, and its execution is controlled by the processor. The server includes a processor 501, a memory 502, and a communication interface 503.

The processor 501 can be a general-purpose central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the solution of the present invention.

The one or more memories 502 included in the computer system can be a non-volatile computer-readable storage medium, such as a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, or a magnetic disk storage. These memories can be connected to the processor through a bus. The memory stores the program code for executing the solution of the present invention, for example a program that executes the method of the embodiment shown in Fig. 2, and execution of that program code is controlled by the processor.

The communication interface 503 can use any transceiver-type device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).

It should be noted that the embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another; each embodiment focuses on its differences from the others. The device embodiments in particular are described relatively briefly because they are substantially similar to the method embodiments; for the execution of the specific functions of each module, refer to the corresponding part of the method embodiments. The device embodiments described above are merely illustrative. Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically and explicitly limited otherwise. In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, a person skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples.

Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention. A person of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims (10)

1. A fraud advertisement page recognition method based on a convolutional neural network, characterized by comprising the following steps:
collecting page pictures to build a training set, the training set including at least fraud advertisement page pictures and normal pictures;
scaling all pictures in the training set to color images of a predefined size and computing the color values of each picture, wherein each picture in the training set carries label information, the label information being used to mark the category of the picture;
building a convolutional neural network, the convolutional neural network including one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers, and one output layer, and training the convolutional neural network with the pictures in the training set so that the output-layer feature value of the convolutional neural network matches the label information of each input picture; wherein the size of the input layer is the same as the predefined size of the pictures in the training set;
obtaining a picture of the page to be detected, processing the picture, inputting it into the trained convolutional neural network, and judging from the output-layer feature value of the convolutional neural network whether the page to be detected contains a fraud advertisement.
2. The method of claim 1, characterized in that collecting page pictures to build a training set includes:
obtaining a fraud advertisement page, opening the advertisement page in simulation on a virtual private server, taking a screenshot, and storing the screenshot in the training set.
3. The method of claim 1, characterized in that the method further includes:
if it is determined, from the output-layer feature value of the convolutional neural network, that the picture of the page to be detected contains a fraud advertisement, saving the detection result.
4. The method of claim 1, characterized in that the convolutional neural network specifically includes an input layer, four convolutional layers and four pooling layers, two fully connected layers, and one output layer, wherein the input layer precedes the convolutional layers, each convolutional layer is followed by one pooling layer, and the two fully connected layers are located between the last pooling layer and the output layer.
5. The method of claim 4, characterized in that the convolutional layer sizes of the convolutional neural network are between 10x10 and 255x255, and the pooling layer sizes are between 5x5 and 128x128.
6. The method of claim 4, characterized in that the four convolutional layers of the convolutional neural network have sizes of 255x255, 96x96, 28x28, and 10x10 respectively, and the four pooling layers have sizes of 128x128, 48x48, 14x14, and 5x5 respectively.
7. The method of claim 4, characterized in that the number of nodes in the fully connected layers of the convolutional neural network is between 10 and 100.
8. The method of claim 4, characterized in that the output layer of the convolutional neural network is a softmax classifier, and the number of nodes in the output layer equals the number of label categories of the pictures in the training set.
9. A fraud advertisement page identification device based on a convolutional neural network, characterized by comprising:
a training module, configured to collect page pictures to build a training set, the training set including at least fraud advertisement page pictures and normal pictures; wherein all pictures in the training set are scaled to color images of a predefined size and the color values of each picture are computed, and wherein each picture in the training set carries label information, the label information being used to mark the category of the picture;
a convolutional neural network module, configured to build a convolutional neural network, the convolutional neural network including one input layer, multiple convolutional layers and multiple pooling layers, multiple fully connected layers, and one output layer, and to train the convolutional neural network with the pictures in the training set so that the output-layer feature value of the convolutional neural network matches the label information of each input picture; wherein the size of the input layer is the same as the predefined size of the pictures in the training set;
an interface module, configured to obtain a picture of the page to be detected, process the picture, input it into the trained convolutional neural network, and judge from the output-layer feature value of the convolutional neural network whether the page to be detected contains a fraud advertisement.
10. A server, characterized by comprising a memory, a processor, and a communication interface, wherein the memory is configured to store executable program code, and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the fraud advertisement recognition method based on a convolutional neural network of any one of the preceding claims 1-8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610875790.7A CN107886344A (en) 2016-09-30 2016-09-30 The recognition methods of fraud advertisement page and device based on convolutional neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610875790.7A CN107886344A (en) 2016-09-30 2016-09-30 The recognition methods of fraud advertisement page and device based on convolutional neural networks

Publications (1)

Publication Number Publication Date
CN107886344A true CN107886344A (en) 2018-04-06

Family

ID=61769791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610875790.7A CN107886344A (en) 2016-09-30 2016-09-30 The recognition methods of fraud advertisement page and device based on convolutional neural networks

Country Status (1)

Country Link
CN (1) CN107886344A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination