CN103150668A - Internet whole network advertising identification method based on content identification - Google Patents

Internet whole network advertising identification method based on content identification

Info

Publication number
CN103150668A
Authority
CN
China
Prior art keywords
advertisement
page
advertisement position
information
main information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013100882820A
Other languages
Chinese (zh)
Other versions
CN103150668B (en)
Inventor
段培力
刘国清
郑重
丁立星
于锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaoxiang Innovation Artificial Intelligence Technology Co ltd
Original Assignee
BEIJING GEO POLYMERIZATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING GEO POLYMERIZATION TECHNOLOGY Co Ltd
Priority to CN201310088282.0A
Publication of CN103150668A
Application granted
Publication of CN103150668B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides an Internet whole-network advertisement placement identification method based on content identification. The method comprises the following steps: establishing and maintaining an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of various websites and the advertiser information base stores advertiser information; periodically obtaining, by an advertisement position tracking server, the advertisement position information from the advertisement position knowledge base, analyzing it, and judging whether the advertisement placed in each advertisement position has changed; when a change is detected, downloading and storing, by the advertisement position tracking server, the designated advertisement page pointed to by the changed advertisement position, and transmitting that page to an advertisement data analysis server; and analyzing, by the advertisement data analysis server, the designated advertisement page based on the advertiser information stored in the advertiser information base, so as to identify the advertiser information to which the designated advertisement page belongs. The method can dynamically and precisely track changes in advertisements across the whole Internet and provides basic data for analyzing and mining the Internet advertising business.

Description

Internet whole-network advertisement placement identification method based on content identification
Technical field
The invention belongs to the technical field of Internet advertisement placement, and specifically relates to an Internet whole-network advertisement placement identification method based on content identification.
Background art
With the rapid development of the Internet, online advertising has gradually become the main source of website revenue. Now known as the "fifth medium", it has advantages that traditional media such as newspapers, magazines, television, and radio cannot match: wide reach, a well-targeted audience, low cost, and a measurable audience size. As a result, online advertising is increasingly favored by merchants.
At present, there are numerous websites that carry advertisements, such as Sohu, Sina, and Tencent, and each such website in turn carries a large number of advertisements. The Internet therefore contains a huge and complex volume of advertising information, and advertisers frequently update the advertisements they place. The prior art lacks an effective way to dynamically track and count the advertising information on the Internet and thereby provide basic data for meaningful analysis and mining of the Internet advertising business.
Summary of the invention
To address the defects of the prior art, the invention provides an Internet whole-network advertisement placement identification method based on content identification, which can dynamically track changes in advertisements across the whole Internet and thereby provide important basic data for analyzing and mining the Internet advertising business.
The technical solution adopted by the invention is as follows:
The invention provides an Internet whole-network advertisement placement identification method based on content identification, comprising the following steps:
S1: establish and maintain an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website and the advertiser information base stores advertiser information;
S2: an advertisement position tracking server periodically obtains from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzes each item of advertisement position information, and judges whether the advertisement placed in each advertisement position has changed; if it has not changed, the server goes on to analyze the next item of advertisement position information; if it has changed, S3 is carried out;
S3: the advertisement position tracking server downloads and saves the designated advertisement page pointed to by the changed designated advertisement position, and transmits the designated advertisement page to an advertisement data analysis server;
S4: the advertisement data analysis server analyzes the designated advertisement page based on the advertiser information stored in the advertiser information base, identifies, in combination with a machine learning algorithm, the designated advertiser information to which the designated advertisement page belongs, and stores the correspondence between the designated advertisement page and the designated advertiser information.
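As a rough illustration of how steps S2 to S4 hand work to one another, the following Python sketch wires the three stages together. It is only a skeleton under stated assumptions: the class name `TrackingPipeline`, the three injected callables, and the dictionary used to hold the stored correspondence are all hypothetical and do not appear in the patent, which does not prescribe any particular program structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


@dataclass
class TrackingPipeline:
    """Hypothetical skeleton of steps S2-S4; every collaborator is injected."""

    # S2: position id -> new advertisement link, or None if nothing changed
    changed_link: Callable[[str], Optional[str]]
    # S3: advertisement link -> content of the designated advertisement page
    download_page: Callable[[str], str]
    # S4: page content -> identified advertiser, or None if unrecognized
    identify_advertiser: Callable[[str], Optional[str]]
    # Stored correspondence: advertisement page link -> advertiser
    correspondences: Dict[str, Optional[str]] = field(default_factory=dict)

    def run_once(self, position_ids: Iterable[str]) -> None:
        """One periodic pass over all known advertisement positions."""
        for position_id in position_ids:
            link = self.changed_link(position_id)
            if link is None:
                continue  # S2: unchanged, move on to the next position
            page = self.download_page(link)  # S3: fetch the changed page
            self.correspondences[link] = self.identify_advertiser(page)  # S4
```

In this sketch a scheduler (not shown) would call `run_once` at a fixed interval to model the "periodically" in S2; keeping the stages as injected callables mirrors the split between the tracking server and the analysis server.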
Preferably, in S1, the advertisement position information comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position in the advertisement position page, the display position information of the advertisement position in the advertisement position page, and the rate card information of the advertisement position.
Preferably, the URL of the advertisement position page, the code feature of the advertisement position in that page, and the display position information of the advertisement position in that page are obtained as follows: a web crawler automatically collects them from the pages of each website. The rate card information of the advertisement position is obtained manually offline.
Preferably, in S2, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: download the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position in the advertisement position page, extract, from the current advertisement position page downloaded in S21, the current advertisement link pointing to the advertisement page;
S23: judge whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if identical, conclude that the advertisement placed in the advertisement position has not changed; if not identical, conclude that it has changed.
Preferably, the advertiser information comprises one or more of: advertiser name information, brand information owned by the advertiser, product line information owned by the advertiser, and advertisement corpus information.
Preferably, the advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
The beneficial effects of the invention are as follows:
The Internet whole-network advertisement placement identification method based on content identification provided by the invention can dynamically track changes in advertisements across the whole Internet. In particular, it can detect advertisement position information that has changed, recover the advertiser information corresponding to the changed advertisement position information, and store the correspondence between the two, thereby providing important basic data for analyzing and mining the Internet advertising business.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the Internet whole-network advertisement placement identification method based on content identification provided by the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the accompanying drawing:
As shown in Fig. 1, the invention provides an Internet whole-network advertisement placement identification method based on content identification, comprising the following steps:
S1: establish and maintain an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website and the advertiser information base stores advertiser information;
Specifically, the advertisement position knowledge base needs to be updated dynamically. The advertisement position information it stores comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position in the advertisement position page, the display position information of the advertisement position in the advertisement position page, and the rate card information of the advertisement position. To make the meaning of each item clearer, consider the following example: advertisement website A provides 50 advertisement positions on the advertisement position page whose URL is http://www.A.com, and an advertisement picture has been placed in each advertisement position; when an advertisement picture is clicked, the user is taken to the corresponding advertisement page. Here, http://www.A.com is the URL of the advertisement position page on which the advertisement positions are located, and the display position information of an advertisement position is the position, within the advertisement position page, of the advertisement picture placed in that advertisement position, for example the upper left corner or the lower right corner.
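One way to picture a single entry of the advertisement position knowledge base is as a small record with the four kinds of information listed above. The sketch below is an assumption for illustration only: the class name `AdPositionInfo`, the field names, and the choice to express the code feature as a regular expression are hypothetical, since the patent does not say how the information is encoded. The page URL is made mandatory here simply because step S21 needs it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdPositionInfo:
    """Hypothetical record for one advertisement position."""

    page_url: str  # URL of the advertisement position page
    # How to locate the position in the page code (assumed here: a regex
    # whose first capture group is the advertisement link)
    code_feature: Optional[str] = None
    display_position: Optional[str] = None  # e.g. "upper left corner"
    rate_card: Optional[str] = None  # collected manually offline


# Example entry for the website A described in the text
example = AdPositionInfo(
    page_url="http://www.A.com",
    code_feature=r'<div id="ad-top-left">.*?href="([^"]+)"',
    display_position="upper left corner",
)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch: an updated position would be stored as a new record rather than edited in place.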
It should be noted that, to understand the invention, the terms "advertisement position page" and "advertisement page" must be distinguished. An advertisement position page is a page that displays multiple advertisement positions, such as the page displaying 50 advertisement positions in the example above. An advertisement page is the page reached when an advertisement position is clicked. For example, an advertiser that sells automobiles places an advertisement picture showing an automobile in one of the advertisement positions of advertisement website A; when this picture is clicked, the user is taken to the advertiser's web page about the automobiles it sells, and that web page is the advertisement page.
The URL of the advertisement position page, the code feature of the advertisement position in that page, and the display position information of the advertisement position in that page are obtained as follows: a web crawler automatically collects them from the pages of each website. The rate card information of the advertisement position is obtained manually offline.
The advertiser information base needs to be updated dynamically in real time. The advertiser information it stores comprises one or more of: advertiser name information, brand information owned by the advertiser, product line information owned by the advertiser, and advertisement corpus information. For example, if Company A places an advertisement for the X-brand sports shoes it produces, then "Company A" is the advertiser name information and "X brand" is the brand information owned by Company A.
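An entry of the advertiser information base (the "advertisement main information" store) can be pictured in the same way. The following sketch is illustrative only: the class name `AdvertiserInfo`, its field names, and the `terms` helper are hypothetical, and the patent does not specify a storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdvertiserInfo:
    """Hypothetical record for one advertiser."""

    name: str  # advertiser name information
    brands: List[str] = field(default_factory=list)  # brands owned
    product_lines: List[str] = field(default_factory=list)  # product lines
    corpus: List[str] = field(default_factory=list)  # advertisement phrases

    def terms(self) -> List[str]:
        """All strings that could later be matched against a page."""
        return [self.name, *self.brands, *self.product_lines, *self.corpus]


# The Company A / X brand example from the text
company_a = AdvertiserInfo(
    name="Company A",
    brands=["X brand"],
    product_lines=["sports shoes"],
)
```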
The advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
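The text does not say what the filtering and screening consists of. Purely as an assumed example, the sketch below normalizes whitespace, drops candidates that are too short, and removes case-insensitive duplicates; the function name `screen_candidates` and the `min_length` parameter are hypothetical, and a real system might apply quite different rules.

```python
from typing import Iterable, List


def screen_candidates(raw_names: Iterable[str], min_length: int = 2) -> List[str]:
    """Assumed screening rule: normalize, drop short entries and duplicates."""
    seen = set()
    kept: List[str] = []
    for raw in raw_names:
        name = " ".join(raw.split())  # trim and collapse inner whitespace
        key = name.casefold()  # case-insensitive duplicate key
        if len(name) < min_length or key in seen:
            continue
        seen.add(key)
        kept.append(name)  # keep the first spelling encountered
    return kept
```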
S2: an advertisement position tracking server periodically obtains from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzes each item of advertisement position information, and judges whether the advertisement placed in each advertisement position has changed; if it has not changed, the server goes on to analyze the next item of advertisement position information; if it has changed, S3 is carried out;
In this step, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: download the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position in the advertisement position page, extract, from the current advertisement position page downloaded in S21, the current advertisement link pointing to the advertisement page;
S23: judge whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if identical, conclude that the advertisement placed in the advertisement position has not changed; if not identical, conclude that it has changed.
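Steps S21 to S23 can be sketched in a few lines of Python using only the standard library. Several points are assumptions rather than anything the patent states: the code feature is modeled as a regular expression whose first capture group is the advertisement link, the previous links are kept in an in-memory dictionary, a first observation counts as a change, and a failed extraction is treated as "no change". The function names are hypothetical.

```python
import re
from typing import Dict, Optional
from urllib.request import urlopen


def download_position_page(page_url: str, timeout: float = 10.0) -> str:
    """S21: download the current advertisement position page."""
    with urlopen(page_url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_ad_link(page_html: str, code_feature: str) -> Optional[str]:
    """S22: extract the link to the advertisement page.

    Assumes the code feature is a regex with one capture group.
    """
    match = re.search(code_feature, page_html, flags=re.DOTALL)
    return match.group(1) if match else None


def has_changed(
    position_id: str, current_link: Optional[str], last_links: Dict[str, str]
) -> bool:
    """S23: compare with the link extracted last time for the same position."""
    if current_link is None:
        return False  # assumption: a failed extraction is not a change
    changed = last_links.get(position_id) != current_link
    last_links[position_id] = current_link  # remember for the next pass
    return changed
```

A pass over one position would call the three functions in order, for example `has_changed(pid, extract_ad_link(download_position_page(url), feature), last_links)`. Comparing links rather than page content keeps the check cheap, at the cost of missing a changed advertisement that reuses the same link.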
S3: the advertisement position tracking server downloads and saves the designated advertisement page pointed to by the changed designated advertisement position, and transmits the designated advertisement page to an advertisement data analysis server;
S4: the advertisement data analysis server analyzes the designated advertisement page based on the advertiser information stored in the advertiser information base, identifies, in combination with a machine learning algorithm, the designated advertiser information to which the designated advertisement page belongs, and stores the correspondence between the designated advertisement page and the designated advertiser information.
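The patent does not name the machine learning algorithm used in S4. The sketch below is therefore not that algorithm: it is a deliberately simple stand-in that counts how often each advertiser's known terms occur in the page text and picks the highest scorer, which only illustrates the idea of matching a page against the advertiser information base. The function name `identify_advertiser` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


def identify_advertiser(
    page_text: str, advertiser_terms: Dict[str, List[str]]
) -> Optional[str]:
    """Pick the advertiser whose known terms occur most often in the page.

    A term-counting stand-in, not the unspecified machine learning step.
    """
    text = page_text.casefold()
    best_name: Optional[str] = None
    best_score = 0
    for name, terms in advertiser_terms.items():
        # Count case-insensitive occurrences of every non-empty term
        score = sum(text.count(term.casefold()) for term in terms if term)
        if score > best_score:
            best_name, best_score = name, score
    return best_name  # None when no term occurs at all
```

For instance, with a base of `{"Company A": ["Company A", "X brand", "sports shoes"]}`, a page mentioning "X brand sports shoes" would be attributed to Company A. A trained text classifier could replace the scoring loop without changing the function's interface.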
In summary, the Internet whole-network advertisement placement identification method based on content identification provided by the invention can dynamically track changes in advertisements across the whole Internet. In particular, it can detect advertisement position information that has changed, recover the advertiser information corresponding to the changed advertisement position information, and store the correspondence between the two, thereby providing important basic data for analyzing and mining the Internet advertising business.
The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the invention.

Claims (6)

1. An Internet whole-network advertisement placement identification method based on content identification, characterized by comprising the following steps:
S1: establishing and maintaining an advertisement position knowledge base and an advertiser information base, wherein the advertisement position knowledge base stores the advertisement position information of each website and the advertiser information base stores advertiser information;
S2: periodically obtaining, by an advertisement position tracking server, from the advertisement position knowledge base the advertisement position information corresponding to each advertisement position, analyzing each item of advertisement position information, and judging whether the advertisement placed in each advertisement position has changed; if it has not changed, going on to analyze the next item of advertisement position information; if it has changed, carrying out S3;
S3: downloading and saving, by the advertisement position tracking server, the designated advertisement page pointed to by the changed designated advertisement position, and transmitting the designated advertisement page to an advertisement data analysis server;
S4: analyzing, by the advertisement data analysis server, the designated advertisement page based on the advertiser information stored in the advertiser information base, identifying, in combination with a machine learning algorithm, the designated advertiser information to which the designated advertisement page belongs, and storing the correspondence between the designated advertisement page and the designated advertiser information.
2. The Internet whole-network advertisement placement identification method based on content identification according to claim 1, characterized in that, in S1, the advertisement position information comprises one or more of: the URL of the advertisement position page on which the advertisement position is located, the code feature of the advertisement position in the advertisement position page, the display position information of the advertisement position in the advertisement position page, and the rate card information of the advertisement position.
3. The Internet whole-network advertisement placement identification method based on content identification according to claim 2, characterized in that the URL of the advertisement position page, the code feature of the advertisement position in that page, and the display position information of the advertisement position in that page are obtained as follows: a web crawler automatically collects them from the pages of each website; and the rate card information of the advertisement position is obtained manually offline.
4. The Internet whole-network advertisement placement identification method based on content identification according to claim 2, characterized in that, in S2, analyzing each item of advertisement position information and judging whether the advertisement placed in each advertisement position has changed specifically comprises the following steps:
S21: downloading the current advertisement position page according to the URL of the advertisement position page on which the advertisement position is located;
S22: according to the code feature of the advertisement position in the advertisement position page, extracting, from the current advertisement position page downloaded in S21, the current advertisement link pointing to the advertisement page;
S23: judging whether the current advertisement link extracted this time is identical to the advertisement link extracted last time for the same advertisement position; if identical, concluding that the advertisement placed in the advertisement position has not changed; if not identical, concluding that it has changed.
5. The Internet whole-network advertisement placement identification method based on content identification according to claim 1, characterized in that the advertiser information comprises one or more of: advertiser name information, brand information owned by the advertiser, product line information owned by the advertiser, and advertisement corpus information.
6. The Internet whole-network advertisement placement identification method based on content identification according to claim 5, characterized in that the advertiser information is obtained as follows: a web crawler automatically collects initial advertiser information from the pages of each website; the initial advertiser information is then filtered and screened to obtain the advertiser information stored in the advertiser information base.
CN201310088282.0A | 2013-03-19 | 2013-03-19 | Internet whole-network advertisement placement identification method based on content identification | Expired - Fee Related | CN103150668B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201310088282.0A CN103150668B (en) | 2013-03-19 | 2013-03-19 | Internet whole-network advertisement placement identification method based on content identification

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201310088282.0A CN103150668B (en) | 2013-03-19 | 2013-03-19 | Internet whole-network advertisement placement identification method based on content identification

Publications (2)

Publication Number | Publication Date
CN103150668A (en) | 2013-06-12
CN103150668B (en) | 2016-02-24

Family

ID=48548724

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201310088282.0A (Expired - Fee Related, CN103150668B (en)) | Internet whole-network advertisement placement identification method based on content identification | 2013-03-19 | 2013-03-19

Country Status (1)

Country | Link
CN (1) | CN103150668B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104717210A (en) * | 2015-02-13 | 2015-06-17 | 北京集奥聚合科技有限公司 | Method and system for connecting optional media to DSP
CN112231618A (en) * | 2020-10-14 | 2021-01-15 | 北京思特奇信息技术股份有限公司 | Multi-contact non-invasive page information recommendation bit configuration method and related device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101158963A (en) * | 2007-10-31 | 2008-04-09 | 中兴通讯股份有限公司 | Information acquisition processing and retrieval system
CN202058145U (en) * | 2011-01-31 | 2011-11-30 | 北京开心人信息技术有限公司 | Update system of internet advertisements
CN102819580A (en) * | 2012-07-25 | 2012-12-12 | 广州翼锋信息科技有限公司 | Monitoring method and system of advertisements of internet third-part media website
CN102880607A (en) * | 2011-07-15 | 2013-01-16 | 舆情(香港)有限公司 | Dynamic network content grabbing method and dynamic network content crawler system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101158963A (en) * | 2007-10-31 | 2008-04-09 | 中兴通讯股份有限公司 | Information acquisition processing and retrieval system
CN202058145U (en) * | 2011-01-31 | 2011-11-30 | 北京开心人信息技术有限公司 | Update system of internet advertisements
CN102880607A (en) * | 2011-07-15 | 2013-01-16 | 舆情(香港)有限公司 | Dynamic network content grabbing method and dynamic network content crawler system
CN102819580A (en) * | 2012-07-25 | 2012-12-12 | 广州翼锋信息科技有限公司 | Monitoring method and system of advertisements of internet third-part media website

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104717210A (en) * | 2015-02-13 | 2015-06-17 | 北京集奥聚合科技有限公司 | Method and system for connecting optional media to DSP
CN104717210B (en) * | 2015-02-13 | 2018-06-26 | 北京集奥聚合科技有限公司 | A kind of method and system by free media access DSP
CN112231618A (en) * | 2020-10-14 | 2021-01-15 | 北京思特奇信息技术股份有限公司 | Multi-contact non-invasive page information recommendation bit configuration method and related device

Also Published As

Publication number | Publication date
CN103150668B (en) | 2016-02-24

Similar Documents

Publication Publication Date Title
CN102646248B (en) A kind of advertisement delivery method and system
CN102184230A (en) Method and device for displaying search results
CN104869173A (en) Information push method and device for outdoor large-scale advertising board
CN103377287A (en) Method and device for putting in item information
CN106846061A (en) Potential user's method for digging and device
CN101689263A (en) Determine advertising results
CN103491146A (en) Method, device and system for releasing network information
EP2557707A3 (en) Broadcast signal receiver, method for providing information related to a broadcast signal, and server for sending identification information about a broadcast programme
CN101520782A (en) Method and system for directionally releasing special-subject information relevant to online images
CN104170392A (en) Method, device, system and terminal of inplanting advertisements in files
CN103279565A (en) Advertisement placement tracking method and system
CN105160545A (en) Delivered information pattern determination method and device
CN106339891A (en) Intelligent analysis method and system based on large data acquisition
CN104732416A (en) Data processing method and device
CN104217353A (en) Webpage advertisement directional pushing system
CN101183396A (en) Advertisement display process, system and device
CN104239526A (en) POI (Point of Interest) labeling method and device for electronic map
CN108648009A (en) A kind of advertisement sending method and device
EP2577590A1 (en) Online advertising system and a method of operating the same
CN101882135A (en) Data processing method and device
CN105139233A (en) Advertisement putting method, device, and system
CN106202371A (en) The processing method of media file, device and advertisement analysis method
CN103150668B (en) The Internet whole network advertisement putting recognition methods of content-based identification
Collins et al. A co-evolving cultural cluster in the periphery: Film and TV production in Galway, Ireland
CN103337026A (en) Advertising systems and methods using embedded map

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220421

Address after: 100000 room 116, building 3, Shuangqiao (Shuangqiao dairy factory), Chaoyang District, Beijing

Patentee after: Beijing Xiaoxiang innovation Artificial Intelligence Technology Co.,Ltd.

Address before: 901, floor 9, building 5, yard 1, Shangdi East Road, Haidian District, Beijing 100086

Patentee before: BEIJING GEO POLYMERIZATION TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160224

CF01 Termination of patent right due to non-payment of annual fee