CN106682677A - Advertising identification rule induction method, device and equipment - Google Patents
Advertising identification rule induction method, device and equipment
- Publication number
- CN106682677A (application CN201510768446.3A)
- Authority
- CN
- China
- Prior art keywords
- advertisement
- elements
- recognition rule
- test set
- list
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
The invention discloses an advertisement recognition rule induction method, device and equipment. In the method, a training set is generated based on a first web address list; each element in the training set is labeled as an ad element or a non-ad element according to the results of manual identification and/or identification by advertisement recognition software; a machine learning algorithm is applied to the advertisement recognition features of each element in the training set, together with the label indicating whether the element is an ad element, to obtain an advertisement recognition model; a test set is generated based on a second web address list; the ad elements in the test set are identified with the advertisement recognition model, based on the advertisement recognition features of each element in the test set; and induction is performed on the uniform resource locators (URLs) of the ad elements in the test set to obtain an advertisement recognition rule. The new advertisement recognition rule can then be used to identify ad elements in a page, either on its own or in combination with manual labeling rules and/or the rules of advertisement recognition software.
Description
Technical field
The present invention relates to the field of Internet technology and, in particular, to an advertisement recognition rule induction method, device and equipment.
Background art
With the spread and development of the Internet, more and more users have become accustomed to browsing web pages and obtaining information on terminal devices such as mobile phones and tablet computers. While users enjoy this convenience, however, web advertisements have also multiplied: banner advertisements, button advertisements, pop-up window advertisements, floating page advertisements, interstitial advertisements and so on. For users who browse web pages on mobile terminals such as mobile phones, where screen space is limited, these advertisements not only interfere with obtaining information but also consume network traffic. How to filter advertisements in web pages effectively is therefore a problem the industry is working to solve.
The advertisement filtering methods in wide use today rely mainly on advertisement filtering software such as AdBlock and Jingwang Dashi. Such software can filter banner, pop-up, video and other forms of advertisement in web pages, and can meet users' filtering needs to some extent.
However, the filtering rules of advertisement filtering software must be updated frequently to keep meeting users' needs. Considerable manpower is therefore required to maintain and update the software regularly so that it continues to filter advertisements accurately.
Summary of the invention
One technical problem to be solved by the present invention is to provide an advertisement recognition rule induction method, device and equipment capable of compiling advertisement recognition rules automatically.
According to one aspect of the present invention, an advertisement recognition rule induction method is disclosed, comprising: generating a training set based on a first web address list, the training set including at least some of the elements in the web page corresponding to each web address in the first web address list, together with their advertisement recognition features; labeling each element in the training set as an ad element or a non-ad element according to the results of manual identification and/or identification by advertisement recognition software; obtaining an advertisement recognition model by a machine learning algorithm, based on the advertisement recognition features of each element in the training set and the label indicating whether it is an ad element; generating a test set based on a second web address list, the test set including at least some of the elements in the web page corresponding to each web address in the second web address list, together with their advertisement recognition features; identifying the ad elements in the test set with the advertisement recognition model, based on the advertisement recognition features of each element in the test set; and performing induction on the uniform resource locators (URLs) of the ad elements in the test set to obtain an advertisement recognition rule.
Thus, each element in the training set can first be labeled as an ad element or a non-ad element by manual labeling, by advertisement recognition software, or by a combination of the two. An advertisement recognition model can then be built by machine learning from these ad and non-ad elements and their corresponding advertisement recognition features. The model is then applied to the test set to identify the ad elements in it, and induction on the URLs of the identified ad elements yields a new advertisement recognition rule. The new rule can be used on its own to identify ad elements in a page, or combined with manually written rules and/or the rules of advertisement recognition software, so that ad elements in web pages are identified accurately.
Preferably, the method may further comprise: presenting the elements in the test set that match the advertisement recognition rule; and screening the advertisement recognition rule according to a human judgment of the presented elements.
Thus, after the ad elements in the test set have been identified with the advertisement recognition model, a manual screening step can be added to filter out incorrectly labeled ad elements, so that the induced advertisement recognition rule is more accurate.
Preferably, the method may further comprise iteratively performing the following steps: re-identifying the elements in the training set according to the advertisement recognition rule, so as to relabel them as ad elements or non-ad elements; obtaining an advertisement recognition model by a machine learning algorithm, based on the advertisement recognition features of each element in the training set and the new label indicating whether it is an ad element; generating a test set based on the second web address list, the test set including at least some of the elements in the web page corresponding to each web address in the second web address list, together with their advertisement recognition features; identifying the ad elements in the test set with the advertisement recognition model, based on the advertisement recognition features of each element in the test set; performing induction on the URLs of the ad elements in the test set to obtain an advertisement recognition rule; presenting the elements in the test set that match the advertisement recognition rule; and screening the advertisement recognition rule according to a human judgment of the presented elements.
Thus, once an advertisement recognition rule has been obtained, the elements in the training set can be relabeled according to it and the advertisement recognition model rebuilt from the new labels. The rebuilt model is used to relabel the elements in the test set, and a new advertisement recognition rule is induced from the new labels. The newly obtained rule is screened manually to reject clearly unsuitable rules, after which the above steps can be repeated. The union of the advertisement recognition rules obtained over multiple iterations can be taken as the final advertisement recognition rule. A rule obtained in this way can filter out most ad elements in a page with few misjudgments, giving a marked filtering effect.
Preferably, in the above method, the elements in the training set may include all ad elements identified by advertisement recognition software in the web pages corresponding to the web addresses in the first web address list, and at least some of the non-ad elements; the elements in the test set may include all ad elements identified by advertisement recognition software in the web pages corresponding to the web addresses in the second web address list, and at least some of the non-ad elements.
Thus, the elements in the training set and the test set include all ad elements identified by advertisement recognition software in the web pages corresponding to the web addresses in the first and second web address lists. As a result, the accuracy of the advertisement recognition model built from the labels in the training set can be improved. Moreover, when the model is used to identify ad elements in the test set and induction is performed on the URLs of the identified ad elements, the test set contains more ad elements, so the induced advertisement recognition rules are more comprehensive and accurate. In addition, the advertisement recognition model can be trained on both positive and negative samples; that is, the training set and the test set can each contain both ad elements and non-ad elements. A more accurate advertisement recognition model can thus be obtained, which makes the model more practical.
Preferably, in the above advertisement recognition rule induction method, the advertisement recognition features may include: whether the source code contains a specific character string or combination of strings, the number of times the element appears on other-domain websites, whether the element is bar-shaped, the positioning attribute in the CSS, the picture format, and the number of frames of an animated picture.
According to another aspect of the present invention, an advertisement recognition rule induction device is also disclosed, comprising: a training set generation module for generating a training set based on a first web address list, the training set including at least some of the elements in the web page corresponding to each web address in the first web address list, together with their advertisement recognition features; an element labeling module for labeling each element in the training set as an ad element or a non-ad element according to the results of manual identification and/or identification by advertisement recognition software; an advertisement recognition model generation module for obtaining an advertisement recognition model by a machine learning algorithm, based on the advertisement recognition features of each element in the training set and the label indicating whether it is an ad element; a test set generation module for generating a test set based on a second web address list, the test set including at least some of the elements in the web page corresponding to each web address in the second web address list, together with their advertisement recognition features; an element recognition module for identifying the ad elements in the test set with the advertisement recognition model, based on the advertisement recognition features of each element in the test set; and an induction module for performing induction on the URLs of the ad elements in the test set to obtain an advertisement recognition rule.
Preferably, the device may further comprise: an element presentation module for presenting the elements in the test set that match the advertisement recognition rule; and an advertisement recognition rule screening module for screening the advertisement recognition rule according to a human judgment of the presented elements.
Preferably, in the above advertisement recognition rule induction device, the elements in the training set generated by the training set generation module include all ad elements identified by advertisement recognition software in the web pages corresponding to the web addresses in the first web address list, and at least some of the non-ad elements; the elements in the test set generated by the test set generation module include all ad elements identified by advertisement recognition software in the web pages corresponding to the web addresses in the second web address list, and at least some of the non-ad elements.
According to yet another aspect of the present invention, advertisement recognition rule induction equipment is also disclosed, comprising an input device, a network module, a memory, a display and a processor, wherein: the input device receives a first web address list and a second web address list entered by a user; the network module accesses the web pages corresponding to the web addresses in the first and second web address lists; the processor generates a training set from the web page data that the network module obtains for each web address in the first web address list, and stores the training set in the memory, the training set including at least some of the elements in the web page corresponding to each web address in the first web address list, together with their advertisement recognition features; the processor labels each element in the training set as an ad element or a non-ad element according to the results of manual identification and/or identification by advertisement recognition software, and stores the labels in the memory accordingly; the processor obtains an advertisement recognition model by a machine learning algorithm, based on the advertisement recognition features of each element in the training set and the label indicating whether it is an ad element; the processor generates a test set from the web page data that the network module obtains for each web address in the second web address list, and stores the test set in the memory, the test set including at least some of the elements in the web page corresponding to each web address in the second web address list, together with their advertisement recognition features; the processor identifies the ad elements in the test set with the advertisement recognition model, based on the advertisement recognition features of each element in the test set; and the processor performs induction on the URLs of the ad elements in the test set to obtain an advertisement recognition rule, and stores the advertisement recognition rule in the memory.
Preferably, in the above advertisement recognition rule induction equipment, the elements in the test set that match the advertisement recognition rule are presented on the display, and the processor screens the advertisement recognition rule according to the judgment that the user enters through the input device.
In summary, the advertisement recognition rule induction method, device and equipment disclosed by the present invention can label the elements in the training set as ad elements or non-ad elements according to the advertisement recognition rules of advertisement recognition software and/or manual rules, generate an advertisement recognition model over the advertisement recognition features from the labels of the elements in the training set, and then use that model to label each element in the test set as an ad element or a non-ad element. Finally, induction on the URLs of the ad elements in the test set yields a new advertisement recognition rule, which can supplement existing manual filtering rules or software filtering rules so that ad elements in web pages are recognized better.
The advertisement recognition rule finally obtained thus combines the advertisement recognition features with the existing advertisement recognition rules. Using the new rule, advertisements in a page can be filtered better, the traffic consumed in loading the page is reduced, and the user's browsing experience is improved.
Description of the drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments of the disclosure with reference to the accompanying drawings, in which the same reference numerals generally denote the same components.
Fig. 1 shows a schematic flowchart of the advertisement recognition rule induction method of the present invention.
Fig. 2 shows a schematic flowchart of an advertisement recognition rule induction method according to another embodiment of the present invention.
Fig. 3 shows a schematic flowchart of an advertisement recognition rule induction method according to yet another embodiment of the present invention.
Fig. 4 shows a schematic flowchart of a specific embodiment of the advertisement recognition rule induction method of the present invention.
Fig. 5 shows a schematic structural diagram of a device for inducing in-page advertisement recognition rules according to one embodiment of the present invention.
Fig. 6 shows a schematic structural block diagram of an advertisement recognition rule induction device according to another embodiment of the present invention.
Fig. 7 shows a schematic block diagram of the advertisement recognition rule induction equipment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited to the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be thorough and complete, and will fully convey its scope to those skilled in the art.
Fig. 1 shows a schematic flowchart of the advertisement recognition rule induction method of the present invention. The execution order shown in Fig. 1 is chosen only to describe the invention more clearly. It should be understood that, for the purposes of the invention, step S140 may be interchanged with steps S110, S120 and S130: step S140 may be performed first, followed by steps S110, S120 and S130, or they may be performed at the same time. The order of execution has no effect on the invention.
In step S110, a training set is generated based on the first web address list. The training set includes at least some of the elements in the web page corresponding to each web address in the first web address list, together with their advertisement recognition features.
The first web address list may consist of multiple web addresses chosen at random, or of relatively typical web addresses, for example those of the pages ranking highest by page view (PV) count. The web page corresponding to each web address in the first web address list contains multiple elements. For each such page, some of its elements and their corresponding advertisement recognition features can be selected to form the training set.
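As a rough illustration of how elements might be collected from one page in step S110, the sketch below parses HTML text with Python's standard `html.parser` module. The choice of tags and the names `ElementCollector` and `collect_elements` are assumptions made for this example and are not part of the disclosure; fetching the page itself is left out.

```python
from html.parser import HTMLParser

# Tags treated as candidate elements. This selection is an assumption.
CANDIDATE_TAGS = {"img", "iframe", "script", "embed"}


class ElementCollector(HTMLParser):
    """Collects resource-loading tags together with their attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in CANDIDATE_TAGS and attr_map.get("src"):
            self.elements.append(
                {"tag": tag, "url": attr_map["src"], "attrs": attr_map}
            )


def collect_elements(html):
    """Return the candidate elements found in one page's HTML text."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


page = '<div><img src="http://abc.com/ad/1.gif" style="position:fixed"><p>news</p></div>'
elements = collect_elements(page)
```

Each collected entry keeps the element's URL and raw attributes, which is the information the later feature extraction and URL induction steps would need.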
Advertisement recognition features are features that ad elements commonly have. For example, an element in a page may be considered an ad element when its source code contains certain specific strings or combinations of strings: if the source code of the element contains "ad" or "ads", the element may be considered an ad element, and "ad" or "ads" is then an advertisement recognition feature. As another example, an element that appears many times on other-domain websites may be considered an ad element, so the number of times an element appears on other-domain websites can also serve as an advertisement recognition feature. As a further example, advertisements in web pages are often fixed at a certain position in the page, or move as the user scrolls the page. The positioning attribute of an element in the CSS can therefore serve as an advertisement recognition feature; specifically, an element whose positioning attribute is absolute or fixed may be regarded as an ad element. In addition, whether the element is bar-shaped, the picture format, the number of frames of an animated picture and so on can all serve as advertisement recognition features, and are not described further here.
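The features listed above could be turned into numeric values along the following lines. This is only a sketch: the `Element` fields, the token pattern for "ad"/"ads", the 3:1 width-to-height threshold for "bar-shaped" and the use of GIF as the picture-format feature are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    source: str               # source code (markup) of the element
    url: str                  # URL of the resource the element loads
    width: int                # rendered width in pixels
    height: int               # rendered height in pixels
    css_position: str         # value of the CSS `position` property
    foreign_site_count: int   # times the element was seen on other-domain sites
    frame_count: int = 1      # frames in the picture (1 for a still image)


# "ad" or "ads" as a standalone token, so "/ad/" matches but "header" does not.
AD_TOKEN = re.compile(r"(?<![a-z0-9])ads?(?![a-z0-9])", re.IGNORECASE)


def extract_features(el):
    """Map one element to a dictionary of numeric recognition features."""
    return {
        "has_ad_token": int(bool(AD_TOKEN.search(el.source))),
        "foreign_site_count": el.foreign_site_count,
        # Bar-shaped: much wider than tall. The 3:1 ratio is an assumed threshold.
        "is_bar_shaped": int(el.height > 0 and el.width >= 3 * el.height),
        "is_positioned": int(el.css_position in ("absolute", "fixed")),
        "is_gif": int(el.url.lower().endswith(".gif")),
        "frame_count": el.frame_count,
    }


banner = Element(
    source='<img src="http://abc.com/ad/1.gif">',
    url="http://abc.com/ad/1.gif",
    width=728, height=90, css_position="fixed",
    foreign_site_count=12, frame_count=8,
)
features = extract_features(banner)
```

Matching "ad" only as a standalone token is a design choice for the example; a plain substring test would also fire on words such as "header" or "download".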
In step S120, each element in the training set is labeled as an ad element or a non-ad element according to the results of manual identification and/or identification by advertisement recognition software.
That is, existing advertisement recognition rules, whether manual rules and/or those of advertisement recognition software, can be used to label the elements in the training set, marking each element as an ad element or a non-ad element. Where the advertisement recognition rules of the software are sufficiently accurate, manual labeling of the elements in the training set can be omitted.
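A minimal sketch of the labeling in step S120 follows. The function `software_is_ad` stands in for whatever verdict an advertisement recognition software would give and is hypothetical; letting a manual label override the software verdict is also an assumption, since the text only says the two sources may be used separately or together.

```python
def label_elements(urls, software_is_ad, manual_labels=None):
    """Label each element URL as ad (True) or non-ad (False).

    software_is_ad: callable standing in for an ad recognition software verdict.
    manual_labels:  optional {url: bool} of human decisions, which take priority.
    """
    manual_labels = manual_labels or {}
    labels = {}
    for url in urls:
        if url in manual_labels:
            labels[url] = manual_labels[url]
        else:
            labels[url] = bool(software_is_ad(url))
    return labels


urls = [
    "http://abc.com/ad/1.gif",
    "http://abc.com/logo.png",
    "http://abc.com/promo/summer.gif",
]
labels = label_elements(
    urls,
    software_is_ad=lambda url: "/ad/" in url,                  # toy software rule
    manual_labels={"http://abc.com/promo/summer.gif": True},   # human correction
)
```

Passing `manual_labels=None` corresponds to the case described above in which the software rules are accurate enough for manual labeling to be skipped.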
In step S130, an advertisement recognition model is obtained by a machine learning algorithm, based on the advertisement recognition features of each element in the training set and the label indicating whether it is an ad element.
The training set contains, for each of a number of elements, its URL, its corresponding advertisement recognition features and a label indicating whether it is an advertisement. An advertisement recognition model can therefore be obtained from the training set by a machine learning algorithm. The advertisement recognition model captures the correspondence between advertisement recognition features and ad elements, so it can be used to judge whether an element is an ad element.
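The text does not name a particular machine learning algorithm. Purely as a stand-in, the sketch below trains a perceptron on binary feature vectors in plain Python; the feature order (`has_ad_token`, `is_positioned`, `is_bar_shaped`) and the toy training data are assumptions for illustration.

```python
def train_perceptron(samples, labels, epochs=50):
    """Fit a linear classifier; samples are feature lists, labels are 0 or 1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            delta = y - (1 if score > 0 else 0)
            if delta:
                errors += 1
                weights = [w + delta * v for w, v in zip(weights, x)]
                bias += delta
        if errors == 0:   # every training sample is classified correctly
            break
    return weights, bias


def is_ad(model, x):
    """Apply the trained model to one feature vector."""
    weights, bias = model
    return sum(w * v for w, v in zip(weights, x)) + bias > 0


# Columns: has_ad_token, is_positioned, is_bar_shaped (toy, assumed data).
train_x = [[1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 0], [0, 1, 0], [1, 0, 0]]
train_y = [1, 1, 1, 0, 0, 0]
model = train_perceptron(train_x, train_y)
```

Any classifier that maps recognition features to an ad/non-ad decision could take the perceptron's place here; it was chosen only because it fits in a few lines without external libraries.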
In step S140, a test set is generated based on the second web address list. The test set includes at least some of the elements in the web page corresponding to each web address in the second web address list, together with their advertisement recognition features.
The second web address list may likewise consist of multiple web addresses chosen at random, or of relatively typical web addresses, for example those of the pages ranking highest by page view (PV) count. The web page corresponding to each web address in the second web address list contains multiple elements. For each such page, some of its elements and their corresponding advertisement recognition features can be selected to form the test set.
In step S150, the ad elements in the test set are identified with the advertisement recognition model, based on the advertisement recognition features of each element in the test set.
The advertisement recognition model obtained in step S130 is used to label each element in the test set as an ad element or a non-ad element.
In step S160, induction is performed on the uniform resource locators (URLs) of the ad elements in the test set to obtain an advertisement recognition rule.
Once the elements in the test set have been labeled with the advertisement recognition model, induction can be performed on the URLs of the elements labeled as ad elements to obtain an advertisement recognition rule. The URLs of the elements labeled as ad elements in the test set can be generalized in various ways. For example, when the URLs of several elements are http://abc.com/ad/1.gif, http://abc.com/ad/2.gif and so on, they can be generalized into the advertisement recognition rule http://abc.com/ad/*.gif. As another example, when the URL of one of the elements labeled as ad elements is http://example.com/ads/banner123.gif, it can be generalized into the advertisement recognition rule http://example.com/ads/banner*.gif. Depending on the concrete form of the URLs labeled as ad elements, other ways of induction are of course possible and are not described further here.
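One possible way to carry out the induction in step S160 is sketched below. It reproduces the two examples given above by replacing the trailing digits of the file name with a `*` wildcard. The functions `generalize` and `induce_rules` and the `min_support` parameter are assumptions for illustration; query strings are ignored, and other induction strategies are equally valid.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit


def generalize(url):
    """Replace the variable part of the file name with a `*` wildcard."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    stem, dot, ext = filename.rpartition(".")
    if not dot:                          # no extension: the whole name is the stem
        stem, ext = filename, ""
    prefix = re.sub(r"\d+$", "", stem)   # drop trailing digits: banner123 -> banner
    return f"{parts.scheme}://{parts.netloc}{directory}/{prefix}*{dot}{ext}"


def induce_rules(ad_urls, min_support=1):
    """Group ad URLs by their generalized form and keep well-supported rules."""
    support = defaultdict(int)
    for url in ad_urls:
        support[generalize(url)] += 1
    return sorted(rule for rule, count in support.items() if count >= min_support)


ad_urls = [
    "http://abc.com/ad/1.gif",
    "http://abc.com/ad/2.gif",
    "http://example.com/ads/banner123.gif",
]
rules = induce_rules(ad_urls)
```

Raising `min_support` would keep only rules backed by several ad URLs, at the cost of dropping single-URL generalizations such as the `banner*.gif` example.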
After induction on the URLs of the elements labeled as ad elements in the test set, the induced advertisement recognition rule can be united with the advertisement recognition rules that were used at the outset for identification by hand and/or by advertisement recognition software. That is, the induced rule together with those initial rules serves as the new advertisement recognition rule.
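The union described above, and the use of the combined rules to test a URL, might look like the following sketch. The wildcard syntax (`*` matching any run of characters) and the pre-existing rule shown are assumptions; real filter lists use richer syntaxes that this example does not attempt to model.

```python
import re


def rule_to_regex(rule):
    """Compile a wildcard rule, where `*` matches any run of characters."""
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(pattern)


def matches_any(url, rules):
    """True if the URL matches at least one rule in the set."""
    return any(rule_to_regex(rule).fullmatch(url) for rule in rules)


existing_rules = {"http://ads.example.net/*"}     # hypothetical pre-existing rule
induced_rules = {"http://abc.com/ad/*.gif"}       # rule obtained by induction
combined_rules = existing_rules | induced_rules   # union of both rule sets
```

Escaping every literal piece with `re.escape` keeps characters such as `.` and `?` in a URL from being read as regular-expression syntax.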
In summary, the advertisement recognition rule induction method of the present invention can label the elements in the training set as ad elements or non-ad elements according to the advertisement recognition rules of advertisement recognition software and/or manual rules, generate an advertisement recognition model over the advertisement recognition features from the labels of the elements in the training set, and then use that model to label each element in the test set as an ad element or a non-ad element. Finally, induction on the URLs of the ad elements in the test set yields a new advertisement recognition rule, which can supplement existing manual filtering rules or software filtering rules so that ad elements in web pages are recognized better.
The advertisement recognition rule finally obtained thus combines the advertisement recognition features with the existing advertisement recognition rules. Using the new rule, advertisements in a page can be filtered better, the traffic consumed in loading the page is reduced, and the user's browsing experience is improved.
Fig. 2 shows a schematic flowchart of an advertisement recognition rule induction method according to another embodiment of the present invention. As shown in Fig. 2, the method of this embodiment includes steps S170 and S180 in addition to the steps shown in Fig. 1.
In step S170, the elements in the test set that match the advertisement recognition rule are presented.
In step S180, the advertisement recognition rule is screened according to a human judgment of the presented elements.
Thus, after the ad elements in the test set have been identified with the advertisement recognition model, a manual screening step can be added to filter out incorrectly labeled ad elements, so that the induced advertisement recognition rule is more accurate. Alternatively, the induced advertisement recognition rules themselves can be screened manually to exclude filtering rules that were clearly induced unreasonably.
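Steps S170 and S180 could be organized as in the sketch below, where the human reviewer is represented by a callback. The names `screen_rules` and `reviewer`, the wildcard matching and the toy review policy are assumptions for illustration only.

```python
import re


def rule_matches(rule, url):
    """True if `url` matches `rule`, where `*` stands for any characters."""
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.fullmatch(pattern, url) is not None


def screen_rules(rules, test_urls, reviewer_accepts):
    """Keep only the rules whose matched elements the reviewer approves."""
    kept = []
    for rule in rules:
        # S170: gather the test set elements that match this rule.
        matched = [url for url in test_urls if rule_matches(rule, url)]
        # S180: keep the rule only if the human judgment is positive.
        if matched and reviewer_accepts(rule, matched):
            kept.append(rule)
    return kept


test_urls = [
    "http://abc.com/ad/1.gif",
    "http://abc.com/ad/2.gif",
    "http://abc.com/logo.png",
]
rules = ["http://abc.com/ad/*.gif", "http://abc.com/*"]


# Stand-in for the human reviewer: rejects any rule that catches the site logo.
def reviewer(rule, matched):
    return "http://abc.com/logo.png" not in matched


kept_rules = screen_rules(rules, test_urls, reviewer)
```

In practice the callback would display the matched elements to a person and return that person's verdict; here the overly broad rule is rejected because it also matches a non-ad element.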
Fig. 3 shows a schematic flowchart of an advertisement recognition rule induction method according to yet another embodiment of the present invention.
As shown in Fig. 3, the method of this embodiment includes all of steps S110 to S180 shown in Fig. 2. The difference is that, after steps S110 to S180 have been performed in turn, steps S190 and S130 to S180 are also performed iteratively.
In step S190, the elements in the training set are re-identified according to the advertisement recognition rule, so as to relabel them as ad elements or non-ad elements.
The training set may be the one obtained in step S110, or a training set may be obtained anew; for the process of obtaining a training set anew, see the description of step S110 for Fig. 1.
The elements in the training set are re-identified according to the advertisement recognition rule. The rule used here may be the advertisement recognition rule obtained in step S160, or the combination of the rule obtained in step S160 with the manual and/or software filtering rules used in step S120.
After step S190, steps S130 to S180 are performed in turn; for details of steps S130 to S180, see the descriptions of Fig. 1 and Fig. 2, which are not repeated here. It should be understood that step S140 may be performed before step S130, or steps S130 and S140 may be performed at the same time.
Steps S190 and S130 to S180 are then performed iteratively. The number of repetitions can be set according to the circumstances.
In summary, once an advertisement recognition rule has been obtained, the elements in the training set can be relabeled according to it and the advertisement recognition model rebuilt from the new labels. The rebuilt model is used to relabel the elements in the test set, and a new advertisement recognition rule is induced from the new labels. The newly obtained rule is screened manually to reject clearly unsuitable rules, after which the above steps can be repeated. The union of the advertisement recognition rules obtained over multiple iterations can be taken as the final advertisement recognition rule, so that the final rule can filter out most ad elements in a page, giving a marked filtering effect.
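The iteration can be pictured as a loop over steps S190 and S130 to S180 that accumulates the union of the rules found in each round. In the sketch below every stage is passed in as a function; the toy stand-ins exist only so the loop can run and do not represent real components. Relabeling with the accumulated rule set, rather than only the latest rule, is one of the two options the text allows and is an assumption here.

```python
def iterate_rules(initial_rules, train_urls, test_urls,
                  relabel, fit, predict_ads, induce, screen, rounds=3):
    """Repeat S190 and S130-S180, returning the union of all rules found."""
    all_rules = set(initial_rules)
    for _ in range(rounds):
        labels = relabel(train_urls, all_rules)    # S190: relabel training set
        model = fit(train_urls, labels)            # S130: rebuild the model
        ad_urls = predict_ads(model, test_urls)    # S140-S150: label test set
        new_rules = screen(induce(ad_urls))        # S160-S180: induce, screen
        all_rules |= set(new_rules)                # union across iterations
    return all_rules


# Toy stand-ins so the loop can run; each is a placeholder, not a real component.
def directory(url):
    return url.rsplit("/", 1)[0]

def relabel(urls, rules):
    return {u: any(u.startswith(r.rstrip("*")) for r in rules) for u in urls}

def fit(urls, labels):
    return {directory(u) for u in urls if labels[u]}   # "model": ad directories

def predict_ads(model, urls):
    return [u for u in urls if directory(u) in model]

def induce(ad_urls):
    return {directory(u) + "/*" for u in ad_urls}

def screen(rules):
    return rules                                       # reviewer accepts all

final_rules = iterate_rules(
    {"http://abc.com/ad/1.gif"},
    ["http://abc.com/ad/1.gif", "http://abc.com/img/logo.png"],
    ["http://abc.com/ad/2.gif", "http://abc.com/img/photo.png"],
    relabel, fit, predict_ads, induce, screen,
)
```

The fixed `rounds` count mirrors the statement that the number of repetitions can be set as the situation requires.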
After multiple iterations, the accuracy rate and recall rate of the advertisement identification
model can be calculated by comparing the results identified by the model with the results
identified by the advertisement identification software. Whether the iteration needs to continue
can then be decided according to the calculated accuracy rate and recall rate.
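By way of non-limiting illustration, this check can be sketched in Python as follows. The sketch
assumes that "accuracy rate" is read as precision, treats the software's labels as the reference,
and uses hypothetical threshold values of 0.9; none of these choices is stated in the description.

```python
from typing import Sequence, Tuple


def precision_recall(
    model_is_ad: Sequence[bool], software_is_ad: Sequence[bool]
) -> Tuple[float, float]:
    """Score the model's labels against the ad-identification software's labels."""
    pairs = list(zip(model_is_ad, software_is_ad))
    true_pos = sum(1 for m, s in pairs if m and s)
    false_pos = sum(1 for m, s in pairs if m and not s)
    false_neg = sum(1 for m, s in pairs if not m and s)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


def should_continue(
    precision: float, recall: float, min_precision: float = 0.9, min_recall: float = 0.9
) -> bool:
    """Keep iterating while either score is below its (assumed) target."""
    return precision < min_precision or recall < min_recall


model_labels = [True, True, False, True, False]
software_labels = [True, False, False, True, True]
precision, recall = precision_recall(model_labels, software_labels)
keep_going = should_continue(precision, recall)
```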
In addition, preferably, the elements in the training set may include all ad elements identified
by the advertisement identification software in the web pages corresponding to the web addresses in
the first list of websites, together with at least some non-ad elements. Correspondingly, the
elements in the test set may include all ad elements identified by the advertisement identification
software in the web pages corresponding to the web addresses in the second list of websites,
together with at least some non-ad elements.
Because the training set and the test set then contain all ad elements that the advertisement
identification software identifies in the web pages of the first and second lists of websites, the
accuracy of the advertisement identification model built from the labeling results in the training
set can be improved. Moreover, when the model is used to identify ad elements in the test set and
the URLs of the identified ad elements are induced into rules, the test set contains more ad
elements, so the induced advertisement recognition rule can be more comprehensive and accurate.
Further, because the training set and the test set contain both ad elements and non-ad elements,
the model can be trained on both positive and negative samples from the training set. This yields
a more accurate advertisement identification model and improves its practicality.
Fig. 4 is a schematic flowchart of a specific embodiment of the advertisement recognition rule
induction method of the present invention. Some details of this embodiment are essentially the same
as in Fig. 1 and Fig. 2; for the common parts, see Fig. 1, Fig. 2 and the corresponding text, which
are not described in detail again here.
As shown in Fig. 4, the user first inputs, at the client, a list of websites containing multiple
web addresses. The web addresses in the list may be those of web pages the user browses frequently,
or those of the web pages ranked highest by page views (PV).
After the user has input the list of websites, the server can crawl the list (i.e. the URL list).
For each web page in the list, a portion of its elements can be selected at random and, together
with their advertisement identification features, used as the sample set. The advertisement
identification features can be preset. Specifically, the advertisements found in web pages can be
analyzed manually and their characteristics summarized as advertisement identification features.
For example, the advertisement identification features can be summarized in the following form:
Feature | JavaScript | Iframe | Picture | Flash |
Container id/class contains "ad" | √ | √ | √ | |
Occurs many times on external-domain websites | √ | √ | ||
Bar-shaped | √ | √ | ||
Absolute/fixed positioning | √ | √ | √ | |
Picture format | √ | |||
Number of GIF animation frames | √ |
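By way of non-limiting illustration, the features in the table above could be computed for a single
element roughly as follows. The element is assumed to be available as a plain dictionary; the key
names (`id`, `class`, `width`, `height`, `position`, `src`, `external_site_count`, `gif_frames`)
and the aspect-ratio threshold `bar_ratio` are hypothetical and are not specified in the
description.

```python
import posixpath
from urllib.parse import urlparse


def extract_features(element: dict, bar_ratio: float = 4.0) -> dict:
    """Map a (hypothetical) element description to the feature values in the table."""
    container = f"{element.get('id', '')} {element.get('class', '')}".lower()
    width = element.get("width", 0)
    height = element.get("height", 0)
    long_side, short_side = max(width, height), min(width, height)
    path = urlparse(element.get("src", "")).path
    return {
        # Naive substring test; it also matches words such as "header".
        "container_has_ad": int("ad" in container),
        "external_site_count": element.get("external_site_count", 0),
        # "Bar-shaped" is read here as a long, thin bounding box (assumption).
        "is_bar_shaped": int(short_side > 0 and long_side / short_side >= bar_ratio),
        "is_positioned": int(element.get("position") in ("absolute", "fixed")),
        "picture_format": posixpath.splitext(path)[1].lstrip(".").lower(),
        "gif_frames": element.get("gif_frames", 0),
    }


banner = {
    "id": "top-ad-banner",
    "class": "",
    "width": 728,
    "height": 90,
    "position": "fixed",
    "src": "http://ads.example.com/img/banner.gif?x=1",
    "external_site_count": 12,
    "gif_frames": 8,
}
features = extract_features(banner)
```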
The resulting sample set can be divided into two parts, one used as the training set and the other
as the test set. For the training set, existing advertisement identification software (such as
Adblock Plus) can be used to label each element in the training set as an ad element or a non-ad
element. Some of the elements in the labeled training set can then be re-labeled manually, so that
the label information in the training set is more accurate.
The labeled training set contains, for each of a number of elements, its URL, the corresponding
advertisement feature values, and a label indicating whether it is an advertisement. It can
therefore be used to train the advertisement identification model.
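By way of non-limiting illustration, the split and the two-stage labeling can be sketched in Python
as follows. The `Sample` record mirrors the three fields named above; `software_says_ad` is a
hypothetical callable standing in for the advertisement identification software, and
`manual_labels` is a hypothetical mapping from URL to a manually corrected label.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class Sample:
    url: str
    features: Dict[str, float] = field(default_factory=dict)
    is_ad: Optional[bool] = None  # None until the sample has been labeled


def split_samples(
    samples: Sequence[Sample], train_fraction: float = 0.5, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle reproducibly, then cut the sample set into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def label_training_set(
    training_set: List[Sample],
    software_says_ad: Callable[[str], bool],
    manual_labels: Optional[Dict[str, bool]] = None,
) -> None:
    """Label with the software first, then let manual labels override it."""
    manual_labels = manual_labels or {}
    for sample in training_set:
        sample.is_ad = software_says_ad(sample.url)
        if sample.url in manual_labels:
            sample.is_ad = manual_labels[sample.url]


def toy_software(url: str) -> bool:
    # Toy stand-in for the software's verdict (assumption).
    return "//ads." in url


samples = [Sample(f"http://{'ads' if i % 2 else 'www'}.example.com/{i}") for i in range(10)]
training_set, test_set = split_samples(samples)
corrected_url = training_set[0].url
label_training_set(
    training_set, toy_software, manual_labels={corrected_url: not toy_software(corrected_url)}
)
```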
Because scikit-learn, a Python-based library, performs well on classification algorithms, this
embodiment can select the logistic regression or decision tree algorithm models of scikit-learn as
the basis of the advertisement identification model, and train the model with the training set to
obtain the advertisement identification model.
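By way of non-limiting illustration, training on the labeled feature vectors can be sketched as
follows. To keep the sketch free of third-party dependencies it uses a small logistic regression
fitted by batch gradient descent in pure Python as a stand-in; it is not scikit-learn's
implementation. The three binary features, the learning rate and the epoch count are assumptions
chosen for the toy data.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(
    X: Sequence[Sequence[float]], y: Sequence[int], lr: float = 0.5, epochs: int = 500
) -> Model:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model: Model, x: Sequence[float]) -> int:
    """Return 1 (ad element) or 0 (non-ad element)."""
    w, b = model
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


# Toy training set: [container_has_ad, is_bar_shaped, is_positioned] -> is_ad
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]
model = train_logistic_regression(X, y)
predictions = [predict(model, x) for x in X]
```

With scikit-learn installed, the same step would typically use `LogisticRegression` from
`sklearn.linear_model` or `DecisionTreeClassifier` from `sklearn.tree`, calling `fit(X, y)` and
then `predict`.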
The advertisement identification model is used to label the test set, marking each element in the
test set as an ad element or a non-ad element.
The URL list of the elements labeled as ad elements in the test set is then obtained, and this URL
list is induced to obtain new advertisement recognition rules. Through manual verification, rules
that also match non-advertisement content can be deleted from the new rules. The verified new
filtering rules are then added to the advertisement recognition rule set, the samples in the
training set and the test set are re-labeled according to the filtering rules in the rule set, and
the above flow is repeated iteratively to obtain improved, accurate advertisement recognition
rules.
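By way of non-limiting illustration, one simple way of inducing rules from the URL list is to group
the ad URLs by host and emit a domain-anchored filter of the form `||host^` (the syntax used by
Adblock Plus filter lists) for every host that occurs often enough. This grouping strategy and the
`min_count` threshold are assumptions made for the sketch; they are not the induction procedure
defined for step S160.

```python
from collections import Counter
from typing import Iterable, List
from urllib.parse import urlparse


def induce_rules(ad_urls: Iterable[str], min_count: int = 2) -> List[str]:
    """Emit one '||host^' rule per host seen at least `min_count` times."""
    hosts = Counter()
    for url in ad_urls:
        host = urlparse(url).hostname  # lower-cased host, or None if absent
        if host:
            hosts[host] += 1
    return sorted(f"||{host}^" for host, count in hosts.items() if count >= min_count)


ad_urls = [
    "http://ads.example.com/a.js",
    "http://ads.example.com/b.gif",
    "https://track.example.net/p?x=1",
    "http://ADS.example.com/c",
]
new_rules = induce_rules(ad_urls)
```

Hosts that occur only once are left out here so that a single mislabeled element does not produce
a rule; candidates would still go through the manual verification described above.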
In addition, after the new filtering rules have been obtained and added, the list of websites can
be loaded again, and the effectiveness and misjudgment rate of advertisement filtering can be
checked with the identification model or by manually spot-checking websites.
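By way of non-limiting illustration, such a check can be sketched in Python as follows. The sketch
supports only simplified rules of the form `||domain^`, reads "effectiveness" as the share of ad
elements that are blocked and "misjudgment rate" as the share of non-ad elements that are blocked,
and takes reference labels as given; all of these are assumptions.

```python
from typing import Iterable, Sequence, Tuple
from urllib.parse import urlparse


def rule_matches(rule: str, url: str) -> bool:
    """Match a simplified '||domain^' rule against the host of `url`."""
    if not (rule.startswith("||") and rule.endswith("^")):
        raise ValueError(f"unsupported rule form: {rule!r}")
    domain = rule[2:-1]
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)


def filter_stats(
    rules: Sequence[str], labeled_urls: Iterable[Tuple[str, bool]]
) -> Tuple[float, float]:
    """Return (share of ads blocked, share of non-ads wrongly blocked)."""
    results = [
        (any(rule_matches(rule, url) for rule in rules), is_ad)
        for url, is_ad in labeled_urls
    ]
    ads = sum(1 for _, is_ad in results if is_ad)
    non_ads = len(results) - ads
    ads_blocked = sum(1 for hit, is_ad in results if hit and is_ad)
    non_ads_blocked = sum(1 for hit, is_ad in results if hit and not is_ad)
    effectiveness = ads_blocked / ads if ads else 0.0
    misjudgment_rate = non_ads_blocked / non_ads if non_ads else 0.0
    return effectiveness, misjudgment_rate


labeled = [
    ("http://ads.example.com/a.js", True),
    ("http://cdn.ads.example.com/x.js", True),
    ("http://example.com/index.html", False),
    ("http://ads.example.com/logo.png", False),
    ("http://other.net/ad.gif", True),
]
effectiveness, misjudgment_rate = filter_stats(["||ads.example.com^"], labeled)
```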
The method for inducing advertisement recognition rules in a page according to the present
invention has been described above with reference to Figs. 1 to 4. The device and equipment for
inducing advertisement recognition rules in a page according to the invention are described below
with reference to Figs. 5 to 7. The functions of many units of the device and equipment described
below are the same as those of the corresponding steps described above with reference to Figs. 1
to 4. To avoid repetition, the following focuses on the unit structure that the device and
equipment may have; some details are not repeated, and reference may be made to the corresponding
descriptions above.
Fig. 5 is a schematic structural diagram of one embodiment of the advertisement recognition rule
induction device of the present invention. As shown in Fig. 5, the device includes a training set
generation module 110, an element labeling module 120, an advertisement identification model
generation module 130, a test set generation module 140, an element recognition module 150 and an
induction module 160.
The training set generation module 110 is used to generate a training set based on the first list
of websites. The training set includes at least some of the elements in the web pages corresponding
to the web addresses in the first list of websites, together with their advertisement
identification features.
For how the first list of websites is obtained and for the concept of advertisement identification
features, see the description of step S110 in Fig. 1.
The element labeling module 120 is used to label each element in the training set as an ad element
or a non-ad element according to the results of manual identification and/or identification by
advertisement identification software.
That is, the element labeling module 120 can label the elements in the training set using existing
advertisement recognition rules, such as manual rules and/or those of advertisement identification
software, marking each element in the training set as an ad element or a non-ad element. Where the
advertisement recognition rules of the advertisement identification software are sufficiently
accurate, the elements in the training set can be labeled without manual labeling.
The advertisement identification model generation module 130 is used to obtain an advertisement
identification model by a machine learning algorithm, based on the advertisement identification
features of each element in the training set and the labeling result indicating whether it is an
ad element.
The advertisement identification model generated by the advertisement identification model
generation module 130 represents the correspondence between advertisement identification features
and ad elements, so whether an element is an ad element can be determined based on the model.
The test set generation module 140 is used to generate a test set based on the second list of
websites. The test set includes at least some of the elements in the web pages corresponding to
the web addresses in the second list of websites, together with their advertisement identification
features.
For how the second list of websites is obtained and for the concept of advertisement
identification features, see the description of step S140 in Fig. 1.
The element recognition module 150 is used to identify the ad elements in the test set using the
advertisement identification model, based on the advertisement identification features of each
element in the test set.
That is, the advertisement identification model generated by the advertisement identification
model generation module 130 is used to label each element in the test set as an ad element or a
non-ad element.
The induction module 160 is used to induce the URLs of the ad elements in the test set to obtain
an advertisement recognition rule.
For the specific induction approach of the induction module 160, see the description of step S160
in Fig. 1.
In summary, the advertisement recognition rule induction device of the present invention can label
the elements in the training set as ad elements or non-ad elements according to the advertisement
recognition rules of advertisement identification software and/or manual rules, and generate an
advertisement identification model over the advertisement identification features from the
labeling results of the elements in the training set. The resulting model is then used to label
each element in the test set as an ad element or a non-ad element. Finally, the URLs of the ad
elements in the test set are induced to obtain a new advertisement recognition rule. The new rule
can supplement existing manual filtering rules or software filtering rules, so that the ad
elements in web pages are recognized better.
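By way of non-limiting illustration, the cooperation of the modules summarized above can be
sketched in Python as follows. The class name `RuleInductionDevice`, its field names and the toy
callables are hypothetical; each field is a pluggable function standing in for one of modules 110
to 160, and the toy implementations exist only to make the sketch executable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Element = Dict[str, object]
AdModel = Callable[[Element], bool]


@dataclass
class RuleInductionDevice:
    build_set: Callable[[Sequence[str]], List[Element]]    # modules 110 and 140
    label: Callable[[Element], bool]                        # module 120
    train: Callable[[List[Element], List[bool]], AdModel]   # module 130
    induce: Callable[[List[str]], List[str]]                # module 160

    def run(self, first_list: Sequence[str], second_list: Sequence[str]) -> List[str]:
        training_set = self.build_set(first_list)
        labels = [self.label(element) for element in training_set]
        model = self.train(training_set, labels)
        test_set = self.build_set(second_list)
        # Module 150: apply the model to the test set and keep the ad elements' URLs.
        ad_urls = [str(e["url"]) for e in test_set if model(e)]
        return self.induce(ad_urls)


def toy_train(elements: List[Element], labels: List[bool]) -> AdModel:
    # Toy "model": remember which values of the single feature were labeled as ads.
    ad_values = {e["has_ad"] for e, is_ad in zip(elements, labels) if is_ad}
    return lambda element: element["has_ad"] in ad_values


device = RuleInductionDevice(
    build_set=lambda urls: [{"url": u, "has_ad": "ad" in u} for u in urls],
    label=lambda element: str(element["url"]).startswith("http://ads."),
    train=toy_train,
    induce=lambda urls: sorted(set(urls)),
)
rules = device.run(
    first_list=["http://ads.example.com/a", "http://example.com/index"],
    second_list=["http://ads.other.net/b", "http://news.site/story"],
)
```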
In addition, preferably, the elements in the training set generated by the training set generation
module 110 may include all ad elements identified by the advertisement identification software in
the web pages corresponding to the web addresses in the first list of websites, together with at
least some non-ad elements. The elements in the test set generated by the test set generation
module 140 may include all ad elements identified by the advertisement identification software in
the web pages corresponding to the web addresses in the second list of websites, together with at
least some non-ad elements.
Because the training set and the test set then contain all ad elements that the advertisement
identification software identifies in the web pages of the first and second lists of websites, the
accuracy of the advertisement identification model built from the labeling results in the training
set can be improved. Moreover, when the model is used to identify ad elements in the test set and
the URLs of the identified ad elements are induced into rules, the test set contains more ad
elements, so the induced advertisement recognition rule can be more comprehensive and accurate.
Further, because the training set and the test set also contain some elements labeled as non-ad
elements, the advertisement identification model can be trained on both positive and negative
samples from the training set, so that the trained model is more accurate and more practical.
Fig. 6 is a schematic structural block diagram of an advertisement recognition rule induction
device according to another embodiment of the present invention.
As shown in Fig. 6, in a preferred embodiment, in addition to all the structures shown in Fig. 5,
the device optionally also includes an element presentation module 170 and an advertisement
recognition rule screening module 180. Only the structures not shown in Fig. 5 are introduced
here; for the structures that are the same as in Fig. 5, see the explanation of Fig. 5 above,
which is not repeated here.
The element presentation module 170 is used to present the elements in the test set that match
the advertisement recognition rule. The advertisement recognition rule screening module 180 is
used to screen the advertisement recognition rule according to manual judgment of the presented
elements.
Thus, after the ad elements in the test set have been identified using the advertisement
identification model, wrongly labeled ad elements can also be filtered out by the advertisement
recognition rule screening module 180, so that the induced advertisement recognition rule can be
more accurate. In addition, the advertisement recognition rule screening module 180 can also
screen the induced advertisement recognition rules to exclude filtering rules that are obviously
unreasonable.
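By way of non-limiting illustration, screening by manual judgment of the presented elements can be
sketched in Python as follows. The callables `matches` and `judge` and the `min_approval`
threshold are hypothetical; `judge` stands in for a reviewer deciding whether a presented element
really is an advertisement, and the approval threshold is an assumption that does not appear in
the description.

```python
from typing import Callable, List, Sequence


def screen_rules(
    rules: Sequence[str],
    test_urls: Sequence[str],
    matches: Callable[[str, str], bool],
    judge: Callable[[str], bool],
    min_approval: float = 0.8,
) -> List[str]:
    """Keep a rule only if reviewers confirm enough of the elements it matches."""
    kept = []
    for rule in rules:
        presented = [url for url in test_urls if matches(rule, url)]  # module 170
        if not presented:
            continue  # nothing to present, so nothing supports the rule
        approved = sum(1 for url in presented if judge(url))          # module 180
        if approved / len(presented) >= min_approval:
            kept.append(rule)
    return kept


test_urls = [
    "http://ads.example.com/1.gif",
    "http://ads.example.com/2.js",
    "http://static.example.com/logo.png",
    "http://static.example.com/promo.gif",
]
reviewer_says_ad = {test_urls[0], test_urls[1], test_urls[3]}
screened = screen_rules(
    rules=["ads.example.com", "static.example.com"],
    test_urls=test_urls,
    matches=lambda rule, url: rule in url,  # toy substring matcher (assumption)
    judge=lambda url: url in reviewer_says_ad,
)
```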
In addition, after the advertisement recognition rule screening module 180 has screened the
obtained advertisement recognition rules, the element labeling module 120, the advertisement
identification model generation module 130, the element recognition module 150 and the induction
module 160 can repeat the relevant steps according to the rules that passed screening.
Specifically, after the advertisement recognition rule screening module 180 has screened the
obtained advertisement recognition rules, the element labeling module 120 can re-label the
training set generated by the training set generation module 110 according to the obtained rules.
The advertisement identification model generation module 130 can regenerate the advertisement
identification model from the labeling results of the element labeling module 120 on the training
set. The element recognition module 150 can re-identify and re-label the elements in the test set
according to the regenerated model. The induction module 160 induces advertisement recognition
rules again from the new label information of the elements in the test set. The element
presentation module 170 then presents the elements matching the new advertisement recognition
rules to the advertisement recognition rule screening module 180, so that the new rules can be
screened manually. The above process can then be repeated, and the number of repetitions can be
set as the case requires.
The advertisement recognition rules obtained over multiple iterations can be merged by taking
their union to form the final advertisement recognition rule, so that the final rule can filter
out most of the ad elements in a page with a significant filtering effect.
Fig. 7 is a schematic block diagram of the advertisement recognition rule induction equipment of
the present invention. As shown in Fig. 7, the equipment includes an input unit 3, a network
module 4, a memory 2, a display 5 and a processor 1.
The input unit 3 receives the first list of websites and the second list of websites input by the
user, and the processor 1 can obtain the two lists through the input unit 3. The network module 4
is used to access the web pages corresponding to the web addresses in the first list of websites
and the second list of websites. The processor 1 generates a training set based on the web page
data that the network module 4 obtains from each web address in the first list of websites, and
stores the training set on the memory 2. The training set includes at least some of the elements
in the web pages corresponding to the web addresses in the first list of websites, together with
their advertisement identification features.
The processor 1 labels each element in the training set as an ad element or a non-ad element
according to the results of manual identification and/or identification by advertisement
identification software, and stores the labeling results correspondingly on the memory 2. The
processor 1 obtains an advertisement identification model by a machine learning algorithm, based
on the advertisement identification features of each element in the training set and the labeling
result indicating whether it is an ad element. The processor 1 generates a test set based on the
web page data that the network module 4 obtains from each web address in the second list of
websites, and stores the test set on the memory 2. The test set includes at least some of the
elements in the web pages corresponding to the web addresses in the second list of websites,
together with their advertisement identification features. Based on the advertisement
identification features of each element in the test set, the processor 1 identifies the ad
elements in the test set using the advertisement identification model. The processor 1 induces the
URLs of the ad elements in the test set to obtain an advertisement recognition rule, and stores
the rule on the memory 2. The elements in the test set that match the advertisement recognition
rule are presented on the display 5, and the processor 1 screens the advertisement recognition
rule according to the judgment results that the user inputs through the input unit 3.
In summary, with the advertisement recognition rule induction equipment of the present invention,
once the user has input a list of websites at the client, the server can process the list to
obtain advertisement recognition rules. These rules can be imported into advertisement filtering
software as a supplement to the advertisement recognition rules of other existing advertisement
filtering software, or stored on the client in the form of an executable program that performs
the operation of identifying ad elements in web pages. After receiving the advertisement
recognition rules sent by the server, the user can also screen them by hand to exclude obviously
wrong rules and further improve the accuracy of the filtering rules.
The advertisement recognition rule induction method, device and equipment according to the
invention have been described in detail above with reference to the accompanying drawings.
In addition, the method according to the invention can also be implemented as a computer program
that includes computer program code instructions for performing the steps defined in the above
method of the invention. Alternatively, the method according to the invention can be implemented
as a computer program product that includes a computer-readable medium on which is stored a
computer program for performing the functions defined in the above method of the invention. Those
skilled in the art will also appreciate that the various illustrative logical blocks, modules,
circuits and algorithm steps described in connection with the disclosure herein may be implemented
as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the accompanying drawings show the architecture, functions
and operations of possible implementations of systems and methods according to multiple
embodiments of the invention. In this regard, each block in a flowchart or block diagram may
represent a module, a program segment or a portion of code, which contains one or more executable
instructions for implementing the specified logical function. It should also be noted that, in
some alternative implementations, the functions marked in the blocks may occur in an order
different from that marked in the drawings. For example, two consecutive blocks may in fact be
executed substantially in parallel, and they may sometimes be executed in the reverse order,
depending on the functions involved. It should also be noted that each block in the block diagrams
and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated
hardware-based system that performs the specified functions or operations, or by a combination of
dedicated hardware and computer instructions.
Various embodiments of the present invention have been described above. The foregoing description
is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications
and variations will be apparent to those of ordinary skill in the art without departing from the
scope and spirit of the described embodiments. The terms used herein were chosen to best explain
the principles of the embodiments, their practical application or their improvement over
technologies in the market, or to enable other persons of ordinary skill in the art to understand
the embodiments disclosed herein.
Claims (10)
1. An advertisement recognition rule induction method, comprising:
generating a training set based on a first list of websites, the training set including at least some of the elements in the web pages corresponding to the web addresses in the first list of websites, together with their advertisement identification features;
labeling each element in the training set as an ad element or a non-ad element according to results of manual identification and/or identification by advertisement identification software;
obtaining an advertisement identification model by a machine learning algorithm, based on the advertisement identification features of each element in the training set and the labeling result indicating whether it is an ad element;
generating a test set based on a second list of websites, the test set including at least some of the elements in the web pages corresponding to the web addresses in the second list of websites, together with their advertisement identification features;
identifying the ad elements in the test set using the advertisement identification model, based on the advertisement identification features of each element in the test set;
inducing the URLs of the ad elements in the test set to obtain an advertisement recognition rule.
2. The advertisement recognition rule induction method according to claim 1, further comprising:
presenting the elements in the test set that match the advertisement recognition rule;
screening the advertisement recognition rule according to manual judgment of the presented elements.
3. The advertisement recognition rule induction method according to claim 2, further comprising iteratively performing the following steps:
re-identifying the elements in the training set according to the advertisement recognition rule, so as to re-label the elements in the training set as ad elements or non-ad elements;
obtaining an advertisement identification model by a machine learning algorithm, based on the advertisement identification features of each element in the training set and the new labeling result indicating whether it is an ad element;
generating a test set based on the second list of websites, the test set including at least some of the elements in the web pages corresponding to the web addresses in the second list of websites, together with their advertisement identification features;
identifying the ad elements in the test set using the advertisement identification model, based on the advertisement identification features of each element in the test set;
inducing the URLs of the ad elements in the test set to obtain an advertisement recognition rule;
presenting the elements in the test set that match the advertisement recognition rule;
screening the advertisement recognition rule according to manual judgment of the presented elements.
4. The advertisement recognition rule induction method according to claim 3, wherein,
the elements in the training set include all ad elements identified by advertisement identification software in the web pages corresponding to the web addresses in the first list of websites, and at least some non-ad elements;
the elements in the test set include all ad elements identified by advertisement identification software in the web pages corresponding to the web addresses in the second list of websites, and at least some non-ad elements.
5. The advertisement recognition rule induction method according to any one of claims 1 to 4, wherein,
the advertisement identification features include a combination of: whether the source code contains a specific character string, the number of occurrences on external-domain websites, whether the element is bar-shaped, the positioning property in CSS, the picture format, and the number of frames of an animated picture.
6. An advertisement recognition rule induction device, comprising:
a training set generation module, for generating a training set based on a first list of websites, the training set including at least some of the elements in the web pages corresponding to the web addresses in the first list of websites, together with their advertisement identification features;
an element labeling module, for labeling each element in the training set as an ad element or a non-ad element according to results of manual identification and/or identification by advertisement identification software;
an advertisement identification model generation module, for obtaining an advertisement identification model by a machine learning algorithm, based on the advertisement identification features of each element in the training set and the labeling result indicating whether it is an ad element;
a test set generation module, for generating a test set based on a second list of websites, the test set including at least some of the elements in the web pages corresponding to the web addresses in the second list of websites, together with their advertisement identification features;
an element recognition module, for identifying the ad elements in the test set using the advertisement identification model, based on the advertisement identification features of each element in the test set;
an induction module, for inducing the URLs of the ad elements in the test set to obtain an advertisement recognition rule.
7. The advertisement recognition rule induction device according to claim 6, further comprising:
an element presentation module, for presenting the elements in the test set that match the advertisement recognition rule;
an advertisement recognition rule screening module, for screening the advertisement recognition rule according to manual judgment of the presented elements.
8. The advertisement recognition rule induction device according to claim 6 or 7, wherein,
the elements in the training set generated by the training set generation module include all ad elements identified by advertisement identification software in the web pages corresponding to the web addresses in the first list of websites, and at least some non-ad elements;
the elements in the test set generated by the test set generation module include all ad elements identified by advertisement identification software in the web pages corresponding to the web addresses in the second list of websites, and at least some non-ad elements.
9. Advertisement recognition rule induction equipment, comprising an input unit, a network module, a memory, a display and a processor, wherein,
the input unit receives a first list of websites and a second list of websites input by a user;
the network module is used to access the web pages corresponding to the web addresses in the first list of websites and the second list of websites;
the processor generates a training set based on the web page data that the network module obtains from each web address in the first list of websites, and stores the training set on the memory, the training set including at least some of the elements in the web pages corresponding to the web addresses in the first list of websites, together with their advertisement identification features;
the processor labels each element in the training set as an ad element or a non-ad element according to results of manual identification and/or identification by advertisement identification software, and stores the labeling results correspondingly on the memory;
the processor obtains an advertisement identification model by a machine learning algorithm, based on the advertisement identification features of each element in the training set and the labeling result indicating whether it is an ad element;
the processor generates a test set based on the web page data that the network module obtains from each web address in the second list of websites, and stores the test set on the memory, the test set including at least some of the elements in the web pages corresponding to the web addresses in the second list of websites, together with their advertisement identification features;
the processor identifies the ad elements in the test set using the advertisement identification model, based on the advertisement identification features of each element in the test set;
the processor induces the URLs of the ad elements in the test set to obtain an advertisement recognition rule, and stores the advertisement recognition rule on the memory.
10. The advertisement recognition rule induction equipment according to claim 9, wherein,
the elements in the test set that match the advertisement recognition rule are presented on the display,
the processor screens the advertisement recognition rule according to the judgment results that the user inputs through the input unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510768446.3A CN106682677A (en) | 2015-11-11 | 2015-11-11 | Advertising identification rule induction method, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682677A true CN106682677A (en) | 2017-05-17 |
Family
ID=58865347
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510768446.3A Pending CN106682677A (en) | 2015-11-11 | 2015-11-11 | Advertising identification rule induction method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682677A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108733764A (en) * | 2018-04-16 | 2018-11-02 | 优视科技有限公司 | Advertisement filter rule generating method based on machine learning and advertisement filtering system |
CN110110982A (en) * | 2019-04-26 | 2019-08-09 | 特赞(上海)信息科技有限公司 | The checking method and device of intention material |
CN110704615A (en) * | 2019-09-04 | 2020-01-17 | 北京航空航天大学 | Internet financial non-dominant advertisement identification method and device |
CN111914199A (en) * | 2019-05-10 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Page element filtering method, device, equipment and storage medium |
CN112075068A (en) * | 2018-05-03 | 2020-12-11 | 三星电子株式会社 | Electronic device and operation method thereof |
CN112988811A (en) * | 2021-03-09 | 2021-06-18 | 重庆可兰达科技有限公司 | Method, system, terminal and medium for detecting APP advertisement content compliance |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101276417A (en) * | 2008-04-17 | 2008-10-01 | 上海交通大学 | Method for filtering internet cartoon medium rubbish information based on content |
CN101526946A (en) * | 2008-03-07 | 2009-09-09 | 鸿富锦精密工业(深圳)有限公司 | Search system, web page browser, web page filter system and web page filter method thereof |
CN101593200A (en) * | 2009-06-19 | 2009-12-02 | 淮海工学院 | Chinese Web page classification method based on the keyword frequency analysis |
CN104239422A (en) * | 2014-08-21 | 2014-12-24 | 小米科技有限责任公司 | Advertisement identification method, advertisement identification device and electronic equipment |
US20160306893A1 (en) * | 2013-12-02 | 2016-10-20 | Beijing Qihoo Technology Company Limited | Url purification method and url purification apparatus |
- 2015-11-11 CN CN201510768446.3A patent/CN106682677A/en active Pending
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108733764A (en) * | 2018-04-16 | 2018-11-02 | 优视科技有限公司 | Advertisement filter rule generating method based on machine learning and advertisement filtering system |
CN108733764B (en) * | 2018-04-16 | 2021-09-10 | 阿里巴巴(中国)有限公司 | Advertisement filtering rule generation method based on machine learning and advertisement filtering system |
CN112075068A (en) * | 2018-05-03 | 2020-12-11 | 三星电子株式会社 | Electronic device and operation method thereof |
US11893063B2 (en) | 2018-05-03 | 2024-02-06 | Samsung Electronics Co., Ltd. | Electronic device and operation method thereof |
CN110110982A (en) * | 2019-04-26 | 2019-08-09 | 特赞(上海)信息科技有限公司 | The checking method and device of intention material |
CN111914199A (en) * | 2019-05-10 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Page element filtering method, device, equipment and storage medium |
CN111914199B (en) * | 2019-05-10 | 2024-04-12 | 腾讯科技(深圳)有限公司 | Page element filtering method, device, equipment and storage medium |
CN110704615A (en) * | 2019-09-04 | 2020-01-17 | 北京航空航天大学 | Internet financial non-dominant advertisement identification method and device |
CN112988811A (en) * | 2021-03-09 | 2021-06-18 | 重庆可兰达科技有限公司 | Method, system, terminal and medium for detecting APP advertisement content compliance |
CN112988811B (en) * | 2021-03-09 | 2023-06-06 | 重庆可兰达科技有限公司 | Method, system, terminal and medium for detecting APP advertisement content compliance |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106682677A (en) | Advertising identification rule induction method, device and equipment | |
CN108733764B (en) | Advertisement filtering rule generation method based on machine learning and advertisement filtering system | |
CN106649610A (en) | Image labeling method and apparatus | |
CN105306495B (en) | user identification method and device | |
CN106503172A (en) | The method and apparatus that learning path recommended by knowledge based collection of illustrative plates | |
CN107608874A (en) | Method of testing and device | |
CN102419777B (en) | System and method for filtering internet image advertisements | |
CN107341805A (en) | Background segment and network model training, image processing method and device before image | |
CN102650999B (en) | A kind of method and system of extracting object attribute value information from webpage | |
CN109299258A (en) | A kind of public sentiment event detecting method, device and equipment | |
CN108229523A (en) | Image detection, neural network training method, device and electronic equipment | |
CN107731229A (en) | Method and apparatus for identifying voice | |
CN107908959A (en) | Site information detection method, device, electronic equipment and storage medium | |
CN103514279B (en) | A kind of Sentence-level sensibility classification method and device | |
CN104765746A (en) | Data processing method and device for mobile communication terminal browser | |
CN107643929A (en) | Information shows the methods of exhibiting and device at interface | |
CN107153716A (en) | Webpage content extracting method and device | |
CN103491116A (en) | Method and device for processing text-related structural data | |
CN108763313A (en) | On-line training method, server and the storage medium of model | |
CN107590236A (en) | A kind of big data acquisition method and system towards enterprise in charge of construction | |
CN109977762A (en) | A kind of text positioning method and device, text recognition method and device | |
CN105956002A (en) | Webpage classification method and device based on URL analysis | |
CN107291774A (en) | Error sample recognition methods and device | |
CN108197337A (en) | A kind of file classification method and device | |
CN107729931A (en) | Picture methods of marking and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20170517 |