CN103150668A - Internet whole network advertising identification method based on content identification - Google Patents
Internet whole network advertising identification method based on content identification
- Publication number
- CN103150668A
- Authority
- CN
- China
- Prior art keywords
- advertisement
- page
- advertisement position
- information
- main information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a content-identification-based method for identifying advertisement placements across the whole Internet. The method comprises the following steps: establishing and maintaining an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website and the advertiser information base stores advertiser information; an advertisement position tracking server periodically obtains the advertisement position information from the advertisement position knowledge base, analyzes it, and judges whether the advertisement placed in each advertisement position has changed; when it has changed, the tracking server downloads and stores the designated advertisement page that the changed advertisement position points to, and transmits that page to an advertisement data analysis server; the advertisement data analysis server analyzes the designated advertisement page against the advertiser information stored in the advertiser information base, so as to identify the advertiser information to which the designated advertisement page belongs. The method can dynamically and precisely track changes in advertisements across the whole Internet, and provides valuable basic data for analyzing and mining the Internet advertising business.
Description
Technical field
The invention belongs to the technical field of Internet advertisement placement, and specifically relates to a content-identification-based method for identifying advertisement placements across the whole Internet.
Background art
With the rapid development of the Internet, online advertising has gradually become the main source of website revenue. Online advertising is now called the fifth medium and has advantages that traditional media such as newspapers, magazines, television and radio cannot match: wide reach, a well-targeted audience, low cost, and a measurable audience size, among others. Online advertising is therefore increasingly favored by businesses.
At present there are a great many websites that carry advertisements, such as Sohu, Sina and Tencent, and each of these websites in turn carries a great many advertisements. The Internet therefore holds a huge and complex volume of advertising information, and advertisers frequently update the advertisements they place. The prior art lacks an effective way to dynamically track and count the advertising information on the Internet, and so cannot provide basic data for meaningful analysis and mining of the Internet advertising business.
Summary of the invention
To address the defects of the prior art, the invention provides a content-identification-based method for identifying advertisement placements across the whole Internet. The method can dynamically track changes in advertisements across the whole Internet, and thereby provides important basic data for analyzing and mining the Internet advertising business.
The technical solution adopted by the invention is as follows:
The invention provides a content-identification-based method for identifying advertisement placements across the whole Internet, comprising the following steps:
S1: establish and maintain an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website, and the advertiser information base stores advertiser information;
S2: an advertisement position tracking server periodically obtains from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzes each item of advertisement position information, and judges whether the advertisement placed in each advertisement position has changed; if it has not changed, the server goes on to analyze the next item of advertisement position information; if it has changed, S3 is performed;
S3: the advertisement position tracking server downloads and saves the designated advertisement page that the changed advertisement position points to, and transmits the designated advertisement page to an advertisement data analysis server;
S4: the advertisement data analysis server analyzes the designated advertisement page on the basis of the advertiser information stored in the advertiser information base, identifies, in combination with a machine learning algorithm, the advertiser information to which the designated advertisement page belongs, and stores the correspondence between the designated advertisement page and that advertiser information.
Preferably, in S1, the advertisement position information comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position within the advertisement position page, the display location of the advertisement position within the advertisement position page, and the rate card information of the advertisement position.
Preferably, the URL of the advertisement position page, the code feature of the advertisement position within that page, and the display location of the advertisement position within that page are obtained as follows: a web crawler automatically collects them from the pages of each website. The rate card information of the advertisement position is obtained manually offline.
Preferably, in S2, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: download the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position within the advertisement position page, extract from the current advertisement position page downloaded in S21 the current advertisement link, which points to an advertisement page;
S23: judge whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if they are identical, conclude that the advertisement placed in the advertisement position has not changed; if they are not identical, conclude that the advertisement placed in the advertisement position has changed.
Preferably, the advertiser information comprises one or more of: the advertiser's name, the brands the advertiser owns, the product lines the advertiser owns, and advertisement corpus information.
Preferably, the advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
The beneficial effects of the invention are as follows:
The invention provides a content-identification-based method for identifying advertisement placements across the whole Internet. The method can dynamically track changes in advertisements across the whole Internet. In particular, it can detect advertisement position information that has changed, recover the advertiser information corresponding to the changed advertisement position information, and store the correspondence between the two, thereby providing important basic data for analyzing and mining the Internet advertising business.
Description of drawings
Fig. 1 is a schematic flowchart of the content-identification-based method for identifying advertisement placements across the whole Internet provided by the invention.
Embodiment
The invention is described in detail below with reference to the accompanying drawing:
As shown in Fig. 1, the invention provides a content-identification-based method for identifying advertisement placements across the whole Internet, comprising the following steps:
S1: establish and maintain an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website, and the advertiser information base stores advertiser information;
Specifically, the advertisement position knowledge base needs to be updated dynamically. The advertisement position information it stores comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position within the advertisement position page, the display location of the advertisement position within the advertisement position page, and the rate card information of the advertisement position. To make the meaning of each item easier to understand, consider an example: advertisement website A has 50 advertisement positions on the advertisement position page whose URL is http://www.A.com; an advertisement picture is placed in each advertisement position, and clicking a given advertisement picture links to the corresponding advertisement page. Here, http://www.A.com is the URL of the advertisement position page on which the advertisement positions are located, and the display location of an advertisement position within the advertisement position page is the position on that page of the advertisement picture placed in that advertisement position, for example the upper left corner or the lower right corner.
It should be noted that, to understand the invention, the terms "advertisement position page" and "advertisement page" must be distinguished. The advertisement position page is a page that displays multiple advertisement positions, such as the page showing 50 advertisement positions in the example above. The advertisement page is the page that is reached when a given advertisement position is clicked. For example, an advertiser that sells automobiles places an advertisement picture showing an automobile in one of the advertisement positions of advertisement website A; clicking this picture links to the advertiser's web page for selling the automobile, and that web page is the advertisement page.
Here, the URL of the advertisement position page, the code feature of the advertisement position within that page, and the display location of the advertisement position within that page are obtained as follows: a web crawler automatically collects them from the pages of each website. The rate card information of the advertisement position is obtained manually offline.
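As an illustration only, one record of the advertisement position knowledge base could be modelled as below. The patent does not prescribe a storage schema, so the class names, field names, and the choice of (page URL, code feature) as the record key are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdPositionInfo:
    """One advertisement position record (all field names are hypothetical)."""
    page_url: str                            # URL of the advertisement position page
    code_feature: str                        # marker locating the position in the page source
    display_location: Optional[str] = None   # e.g. "top-left", "bottom-right"
    rate_card: Optional[str] = None          # obtained manually offline, so it may be absent
    last_ad_link: Optional[str] = None       # advertisement link seen at the previous check


class AdPositionKnowledgeBase:
    """In-memory stand-in for the knowledge base, keyed by (page URL, code feature)."""

    def __init__(self):
        self._records = {}

    def upsert(self, info):
        # Re-crawling the same position overwrites the older record.
        self._records[(info.page_url, info.code_feature)] = info

    def all(self):
        return list(self._records.values())
```

Keeping `rate_card` optional reflects that the crawler fills in the first three fields automatically while the rate card arrives later through a manual channel.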
The advertiser information base needs to be updated dynamically in real time. The advertiser information it stores comprises one or more of: the advertiser's name, the brands the advertiser owns, the product lines the advertiser owns, and advertisement corpus information. For example, if company A places an advertisement for the X-brand sports shoes it produces, then "company A" is the advertiser's name and "X brand" is a brand that company A owns.
Here, the advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
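The filtering and screening step is not detailed in the text. A minimal sketch, assuming the crawler emits one dictionary per observation with hypothetical keys `name`, `brand`, `product_line` and `ad_text`, could drop records without an advertiser name and merge the rest per advertiser:

```python
from dataclasses import dataclass, field


@dataclass
class AdvertiserInfo:
    """Advertiser record as described in the text (field names are assumptions)."""
    name: str
    brands: set = field(default_factory=set)
    product_lines: set = field(default_factory=set)
    ad_corpus: list = field(default_factory=list)


def filter_raw_records(raw_records):
    """Drop unusable crawl records and merge the remainder by advertiser name."""
    merged = {}
    for rec in raw_records:
        name = (rec.get("name") or "").strip()
        if not name:
            continue  # screening rule assumed here: a record needs an advertiser name
        info = merged.setdefault(name, AdvertiserInfo(name=name))
        brand = (rec.get("brand") or "").strip()
        if brand:
            info.brands.add(brand)
        line = (rec.get("product_line") or "").strip()
        if line:
            info.product_lines.add(line)
        text = (rec.get("ad_text") or "").strip()
        if text and text not in info.ad_corpus:
            info.ad_corpus.append(text)
    return merged
```

A real screening stage would likely need stronger rules (name normalisation, spam removal); the single rule used here is only a placeholder.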
S2: an advertisement position tracking server periodically obtains from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzes each item of advertisement position information, and judges whether the advertisement placed in each advertisement position has changed; if it has not changed, the server goes on to analyze the next item of advertisement position information; if it has changed, S3 is performed;
In this step, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: download the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position within the advertisement position page, extract from the current advertisement position page downloaded in S21 the current advertisement link, which points to an advertisement page;
S23: judge whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if they are identical, conclude that the advertisement placed in the advertisement position has not changed; if they are not identical, conclude that the advertisement placed in the advertisement position has changed.
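Steps S22 and S23 can be sketched as follows. The text does not say what form the "code feature" takes; this sketch assumes it is the `id` attribute of the HTML element that wraps the advertisement, and it takes the already downloaded page (S21) as a string. The function and class names are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _AdLinkExtractor(HTMLParser):
    """Finds the first <a href> inside the element whose id equals slot_id."""

    def __init__(self, slot_id):
        super().__init__()
        self.slot_id = slot_id
        self.depth = 0      # nesting depth inside the slot element; 0 means outside
        self.link = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth > 0:
            self.depth += 1
        elif attrs.get("id") == self.slot_id:
            self.depth = 1
        if self.depth > 0 and tag == "a" and self.link is None:
            self.link = attrs.get("href")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1


def extract_ad_link(page_html, slot_id):
    """S22: extract the current advertisement link for one advertisement position."""
    parser = _AdLinkExtractor(slot_id)
    parser.feed(page_html)
    return parser.link


def ad_has_changed(page_html, slot_id, last_link):
    """S23: compare the current link with the one extracted last time."""
    current = extract_ad_link(page_html, slot_id)
    return current != last_link, current
```

Comparing links is cheap but coarse: an advertiser that swaps the creative while keeping the same landing URL would not be detected by this comparison alone.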
S3: the advertisement position tracking server downloads and saves the designated advertisement page that the changed advertisement position points to, and transmits the designated advertisement page to an advertisement data analysis server;
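One pass of the tracking server over S2 and S3 might be organised as in the sketch below. Network access, link extraction, storage and the hand-off to the analysis server are passed in as parameters, because the text does not specify how any of them is implemented; `run_tracking_pass` and the dictionary keys are hypothetical names.

```python
def run_tracking_pass(positions, fetch, extract_link, store, send_to_analysis):
    """Check every advertisement position once and forward the pages that changed.

    positions        -- list of dicts with 'page_url', 'code_feature', 'last_ad_link'
    fetch            -- callable(url) returning page content (assumed downloader)
    extract_link     -- callable(page, code_feature) returning the advertisement link
    store            -- dict used to save downloaded advertisement pages
    send_to_analysis -- callable(link, page) standing in for the transfer in S3
    """
    changed = []
    for pos in positions:
        page = fetch(pos["page_url"])                     # S21
        link = extract_link(page, pos["code_feature"])    # S22
        # S23: unchanged (or position not found) -> move on to the next position.
        if link is None or link == pos.get("last_ad_link"):
            continue
        ad_page = fetch(link)                             # S3: download the advertisement page
        store[link] = ad_page                             # S3: save it
        send_to_analysis(link, ad_page)                   # S3: hand over for S4
        pos["last_ad_link"] = link                        # remember for the next pass
        changed.append(link)
    return changed
```

Running this function on a timer would give the periodic behaviour described in S2; the interval is left to the caller.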
S4: the advertisement data analysis server analyzes the designated advertisement page on the basis of the advertiser information stored in the advertiser information base, identifies, in combination with a machine learning algorithm, the advertiser information to which the designated advertisement page belongs, and stores the correspondence between the designated advertisement page and that advertiser information.
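The text does not name the machine learning algorithm used in S4. As a deliberately simple stand-in, and not the patented method, the sketch below scores each advertiser by how often its name, brands and product lines occur in the advertisement page text and returns the best match. The function name and the shape of `advertiser_base` are assumptions.

```python
def identify_advertiser(page_text, advertiser_base):
    """Return the advertiser whose known terms occur most often in the page text.

    advertiser_base maps an advertiser name to a dict with optional
    'brands' and 'product_lines' lists (hypothetical layout).
    Returns None when no term of any advertiser appears in the text.
    """
    text = page_text.lower()
    best_name, best_score = None, 0
    for name, info in advertiser_base.items():
        terms = [name] + list(info.get("brands", [])) + list(info.get("product_lines", []))
        # Plain substring counting; a trained classifier would replace this scoring.
        score = sum(text.count(term.lower()) for term in terms if term)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

The returned name can then be stored together with the advertisement page URL to record the page-to-advertiser correspondence that S4 requires.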
In summary, the content-identification-based method for identifying advertisement placements across the whole Internet provided by the invention can dynamically track changes in advertisements across the whole Internet. In particular, it can detect advertisement position information that has changed, recover the advertiser information corresponding to the changed advertisement position information, and store the correspondence between the two, thereby providing important basic data for analyzing and mining the Internet advertising business.
The above is only a preferred embodiment of the invention. It should be pointed out that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.
Claims (6)
1. A content-identification-based method for identifying advertisement placements across the whole Internet, characterized in that it comprises the following steps:
S1: establish and maintain an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website, and the advertiser information base stores advertiser information;
S2: an advertisement position tracking server periodically obtains from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzes each item of advertisement position information, and judges whether the advertisement placed in each advertisement position has changed; if it has not changed, the server goes on to analyze the next item of advertisement position information; if it has changed, S3 is performed;
S3: the advertisement position tracking server downloads and saves the designated advertisement page that the changed advertisement position points to, and transmits the designated advertisement page to an advertisement data analysis server;
S4: the advertisement data analysis server analyzes the designated advertisement page on the basis of the advertiser information stored in the advertiser information base, identifies, in combination with a machine learning algorithm, the advertiser information to which the designated advertisement page belongs, and stores the correspondence between the designated advertisement page and that advertiser information.
2. The content-identification-based method for identifying advertisement placements across the whole Internet according to claim 1, characterized in that, in S1, the advertisement position information comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position within the advertisement position page, the display location of the advertisement position within the advertisement position page, and the rate card information of the advertisement position.
3. The content-identification-based method for identifying advertisement placements across the whole Internet according to claim 2, characterized in that the URL of the advertisement position page, the code feature of the advertisement position within that page, and the display location of the advertisement position within that page are obtained as follows: a web crawler automatically collects them from the pages of each website; and the rate card information of the advertisement position is obtained manually offline.
4. The content-identification-based method for identifying advertisement placements across the whole Internet according to claim 2, characterized in that, in S2, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: download the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position within the advertisement position page, extract from the current advertisement position page downloaded in S21 the current advertisement link, which points to an advertisement page;
S23: judge whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if they are identical, conclude that the advertisement placed in the advertisement position has not changed; if they are not identical, conclude that the advertisement placed in the advertisement position has changed.
5. The content-identification-based method for identifying advertisement placements across the whole Internet according to claim 1, characterized in that the advertiser information comprises one or more of: the advertiser's name, the brands the advertiser owns, the product lines the advertiser owns, and advertisement corpus information.
6. The content-identification-based method for identifying advertisement placements across the whole Internet according to claim 5, characterized in that the advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
Priority Applications (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201310088282.0A | CN103150668B (en) | 2013-03-19 | 2013-03-19 | Internet whole network advertising identification method based on content identification |
Applications Claiming Priority (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201310088282.0A | CN103150668B (en) | 2013-03-19 | 2013-03-19 | Internet whole network advertising identification method based on content identification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103150668A (en) | 2013-06-12 |
CN103150668B (en) | 2016-02-24 |
Family
ID=48548724
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201310088282.0A | Expired - Fee Related | CN103150668B (en) | 2013-03-19 | 2013-03-19 | Internet whole network advertising identification method based on content identification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103150668B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101158963A (en) * | 2007-10-31 | 2008-04-09 | 中兴通讯股份有限公司 | Information acquisition processing and retrieval system |
CN202058145U (en) * | 2011-01-31 | 2011-11-30 | 北京开心人信息技术有限公司 | Update system of internet advertisements |
CN102819580A (en) * | 2012-07-25 | 2012-12-12 | 广州翼锋信息科技有限公司 | Monitoring method and system of advertisements of internet third-part media website |
CN102880607A (en) * | 2011-07-15 | 2013-01-16 | 舆情(香港)有限公司 | Dynamic network content grabbing method and dynamic network content crawler system |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104717210A (en) * | 2015-02-13 | 2015-06-17 | 北京集奥聚合科技有限公司 | Method and system for connecting optional media to DSP |
CN104717210B (en) * | 2015-02-13 | 2018-06-26 | 北京集奥聚合科技有限公司 | A kind of method and system by free media access DSP |
CN112231618A (en) * | 2020-10-14 | 2021-01-15 | 北京思特奇信息技术股份有限公司 | Multi-contact non-invasive page information recommendation bit configuration method and related device |
Also Published As
Publication number | Publication date |
---|---|
CN103150668B (en) | 2016-02-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102646248B (en) | A kind of advertisement delivery method and system | |
CN102184230A (en) | Method and device for displaying search results | |
CN104869173A (en) | Information push method and device for outdoor large-scale advertising board | |
CN103377287A (en) | Method and device for putting in item information | |
CN106846061A (en) | Potential user's method for digging and device | |
CN101689263A (en) | Determine advertising results | |
CN103491146A (en) | Method, device and system for releasing network information | |
EP2557707A3 (en) | Broadcast signal receiver, method for providing information related to a broadcast signal, and server for sending identification information about a broadcast programme | |
CN101520782A (en) | Method and system for directionally releasing special-subject information relevant to online images | |
CN104170392A (en) | Method, device, system and terminal of inplanting advertisements in files | |
CN103279565A (en) | Advertisement placement tracking method and system | |
CN105160545A (en) | Delivered information pattern determination method and device | |
CN106339891A (en) | Intelligent analysis method and system based on large data acquisition | |
CN104732416A (en) | Data processing method and device | |
CN104217353A (en) | Webpage advertisement directional pushing system | |
CN101183396A (en) | Advertisement display process, system and device | |
CN104239526A (en) | POI (Point of Interest) labeling method and device for electronic map | |
CN108648009A (en) | A kind of advertisement sending method and device | |
EP2577590A1 (en) | Online advertising system and a method of operating the same | |
CN101882135A (en) | Data processing method and device | |
CN105139233A (en) | Advertisement putting method, device, and system | |
CN106202371A (en) | The processing method of media file, device and advertisement analysis method | |
CN103150668B (en) | The Internet whole network advertisement putting recognition methods of content-based identification | |
Collins et al. | A co-evolving cultural cluster in the periphery: Film and TV production in Galway, Ireland | |
CN103337026A (en) | Advertising systems and methods using embedded map |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20220421. Address after: 100000 room 116, building 3, Shuangqiao (Shuangqiao dairy factory), Chaoyang District, Beijing. Patentee after: Beijing Xiaoxiang innovation Artificial Intelligence Technology Co.,Ltd. Address before: 901, floor 9, building 5, yard 1, Shangdi East Road, Haidian District, Beijing 100086. Patentee before: BEIJING GEO POLYMERIZATION TECHNOLOGY Co.,Ltd. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160224 |