CN106303757A - Network audio/video address resolution method based on visual features and stream reassembly - Google Patents

A network audio/video address resolution method based on visual features and stream reassembly

Info

Publication number
CN106303757A
Authority
CN
China
Prior art keywords
video
webpage
program
audio
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510349166.9A
Other languages
Chinese (zh)
Other versions
CN106303757B (en
Inventor
徐杰
叶建伟
包秀国
张永铮
云晓春
庹宇鹏
常鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201510349166.9A priority Critical patent/CN106303757B/en
Publication of CN106303757A publication Critical patent/CN106303757A/en
Application granted granted Critical
Publication of CN106303757B publication Critical patent/CN106303757B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4782 Web browsing, e.g. WebTV
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8543 Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]

Abstract

The invention discloses a network audio/video address resolution method based on visual features and stream reassembly. The method comprises the following steps: 1) identifying the video playback element of the current playback page from the page's behavior and content; 2) injecting a control instruction into the video playback element, or executing a script against it, to start the player in the current page, then going to step 5), or to step 3) if playback fails; 3) calculating the coordinate offset at which the video playback element must be clicked and driving the mouse to click at that offset, then going to step 5), or to step 4) if playback fails; 4) clicking the player in the current page according to preset click rules, then going to step 5); 5) capturing the network data exchanged between the player and the audio/video server and extracting the network audio/video address of the current page. The invention enables automatic batch address resolution and greatly improves resolution efficiency.

Description

A network audio/video address resolution method based on visual features and stream reassembly
Technical field
The present invention relates to the field of automatic collection of Internet audio/video data, and in particular to a network audio/video address resolution method based on visual features and stream reassembly.
Background
With the development of Internet technology, online playback has become the main means of distributing network audio and video, and most of it takes place in web browsers. Although online playback makes it very convenient to watch audio/video programs, in many scenarios users still want to download programs to local storage so that they can be played offline without a network connection. Automated systems, such as anti-piracy and network audio/video data analysis systems, also need to download online audio and video in bulk. Downloading a program requires first obtaining its download address. However, techniques such as player web plug-ins, JavaScript, and dynamic addresses prevent users from obtaining the download address directly from the page content, so new techniques are needed to resolve it.
At present, audio/video programs played online in web pages are acquired in two main ways: taking the audio/video data directly from the browser cache, or first obtaining the program address and then downloading from that address. The former requires that the website allow local caching and that the whole program be played through; because it applies to few sites and is time-consuming, it is generally used only for small-scale personal downloads. The latter downloads the program once its address has been obtained, and is the approach taken by most mainstream audio/video collection tools. Tools of this kind (such as the Xunlei (Thunder) sniffer and Shuoshu) can download audio/video programs to some extent, but all have limitations. For example, the Xunlei sniffer can only sniff FLV-format programs in pages opened in the IE browser, and Shuoshu can only provide program addresses for a small number of well-known websites. Such tools generally support only specific or well-known websites, and they are cumbersome to download and install.
Currently, audio/video programs are delivered in roughly three ways: over standard transport protocols such as HTTP, RTMP, and FTP; over segmented transport such as F4M; and over proprietary transport protocols. Programs in web pages are generally played by a Flash player, an HTML5 player, or a proprietary player or plug-in. Some pages play the program automatically once the page has loaded; others play it only after the player or some other page element has been clicked.
Given the current state of online audio/video playback, the key problem addressed by the present invention is to resolve, automatically and in batches, the addresses of audio/video programs in various formats on most websites without manual intervention by the user, in support of subsequent batch downloading of those programs.
Summary of the invention
In view of the technical problems in the prior art, the object of the present invention is to provide an audio/video program address resolution method based on visual features and stream reassembly. Depending on the type of online playback page, the method combines program address resolution based on web page content rules, simulated clicking based on click rules, simulated clicking based on visual features, network traffic capture and analysis, and elimination of interfering audio/video data to resolve the program address, which substantially raises the success rate of address resolution. In addition, by using automatic page loading, Chrome browser extensions, automatic extraction of click rules, and automatic control of batch address resolution, it achieves automatic batch address resolution with no manual intervention or only a small amount of it, greatly improving resolution efficiency. The addresses that can be resolved include HTTP, HTTPS, RTMP, and RTMPT addresses and F4M description files.
The present invention mainly comprises the following aspects:
1. Audio/video program address resolution based on web page content rules
Some online playback pages contain the program address directly, or allow it to be assembled from information at different positions in the page. For such pages, the page content is downloaded and parsed with page content resolution rules to obtain the program address. The rules are divided into general rules and site-specific rules. General rules are written from the patterns in which program addresses occur in most pages and apply to most pages. Site-specific rules are written for the pages of a particular website, and each such rule applies only to the playback pages of that one site.
2. Audio/video program address resolution based on simulated web page clicks and network data analysis
On most audio/video playback pages the program address is generated dynamically by the player plug-in at playback time and cannot be obtained by parsing the page content. For such pages, simulated clicking makes the page play the program automatically; the network data stream produced by the browser during playback is then captured with data stream analysis, and the program address is extracted from that stream. The main techniques are as follows:
(1) Web page analysis and control based on a Chrome browser extension
Using the extension development interface of the Chrome browser, a simulated-click extension for Chrome is developed. The extension registers listeners for events including chrome.downloads.onCreated, chrome.downloads.onChanged, chrome.webRequest.onBeforeRequest, chrome.webRequest.onBeforeSendHeaders, chrome.webRequest.onHeadersReceived, chrome.webRequest.onResponseStarted, chrome.webNavigation.onBeforeNavigate, chrome.webNavigation.onCommitted, chrome.webNavigation.onCreatedNavigationTarget, and chrome.runtime.onMessage in order to analyze page content and behavior. It parses the DOM of the page and analyzes the visual layout and types of the page elements to identify the video playback element, and then controls video playback by injecting control instructions into that element or executing scripts against it, thereby implementing the simulated click.
(2) Simulated clicking by script injection
The Chrome simulated-click extension automatically injects control instructions into, or executes scripts against, the page element that hosts the player, imitating a user's click so that the player starts playing the program. Specifically, for an HTML5 player in an HTML5 page, playback is started by injecting a JavaScript script that calls the play() function. For a Flash player, the automatic-play parameters are first modified or added by injected control instructions, covering every automatic-play parameter form that Flash players support, so that the player plays the program as soon as it loads. If automatic play fails, the functions in a pre-built library of Flash player play functions (obtained by decompiling and analyzing a large number of SWF files) are called one by one through injected JavaScript; as soon as one call succeeds, playback has been started.
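The automatic-play parameter step for Flash players can be illustrated with a minimal Python sketch. It assumes the player receives its parameters as a query-string style flashvars value; the parameter spellings in AUTOPLAY_FORMS and the function name force_autoplay are hypothetical, since the source does not list the forms it covers, and the injection into the page itself is not shown.

```python
from urllib.parse import parse_qsl, urlencode

# Assumed spellings of the automatic-play switch. Real players differ,
# which is why every known form is written in one pass.
AUTOPLAY_FORMS = {"autoplay": "true", "autostart": "true", "auto_play": "1"}


def force_autoplay(flashvars: str) -> str:
    """Return flashvars with every assumed automatic-play form switched on.

    Existing parameters keep their position; forms that are missing are
    appended, and forms that are present are overwritten.
    """
    params = dict(parse_qsl(flashvars, keep_blank_values=True))
    params.update(AUTOPLAY_FORMS)
    return urlencode(params)


print(force_autoplay("file=clip.flv&autoplay=false"))
```

Writing all forms at once avoids having to know in advance which spelling a given player honours; unknown parameters are normally ignored by the player.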
(3) Simulated clicking based on visual features
The Chrome simulated-click extension analyzes the visual layout and types of the elements of the online playback page to identify the audio/video playback element, calculates the coordinate offset at which that element must be clicked, and invokes the xdotool tool through a system call to move the mouse and click the player.
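The coordinate calculation can be sketched in Python as follows. The sketch assumes the click target is the centre of the element's bounding box and that the screen position of the page's top-left corner (viewport_origin) is known; both are assumptions, as are the names Rect, click_point, and xdotool_command. The command is only built and printed here; passing the list to subprocess.run would perform the click.

```python
import shlex
from dataclasses import dataclass


@dataclass
class Rect:
    """Bounding box of a page element in viewport coordinates (pixels)."""
    left: float
    top: float
    width: float
    height: float


def click_point(element: Rect, viewport_origin: tuple) -> tuple:
    """Return the screen coordinates of the element's centre.

    viewport_origin is the screen position of the page's top-left corner,
    i.e. the window position plus the height of the browser toolbars
    (an assumed input; the source does not say how it is measured).
    """
    x = viewport_origin[0] + element.left + element.width / 2
    y = viewport_origin[1] + element.top + element.height / 2
    return round(x), round(y)


def xdotool_command(x: int, y: int) -> list:
    """Build an xdotool invocation: move the pointer, then left-click."""
    return ["xdotool", "mousemove", str(x), str(y), "click", "1"]


player = Rect(left=100, top=200, width=640, height=360)
x, y = click_point(player, viewport_origin=(0, 80))
command = xdotool_command(x, y)
print(shlex.join(command))  # prints: xdotool mousemove 420 460 click 1
```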
(4) Simulated clicking based on click rules
Online playback pages that cannot be started by script-injection or visual-feature simulated clicking have usually disguised or hidden the playback element, or contain interfering elements that prevent the mouse from clicking the playback element directly. For such pages, a sample page is first clicked with manual assistance. The Chrome assisted-click and rule-generation extension records the order, positions, and manner in which the user clicks page elements and generates a click rule for that kind of page. The rule then guides the Chrome simulated-click extension in clicking similar pages, so that playback is started.
(5) Network data stream capture and analysis
After the Chrome simulated-click extension has started playback, the player sends a content request to the audio/video server, and the request specifies the address of the program. A packet capture and analysis program built on the libpcap and libnids libraries captures all network traffic of the host running the browser, analyzes the HTTP, RTMP, and RTMPT traffic in it, and extracts every audio/video program address that appears as a candidate program address for the online playback page. The real program address is subsequently selected by eliminating interfering audio/video programs.
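As an illustration of the HTTP part of this analysis, the following Python sketch takes one already reassembled HTTP request and returns the media URL it names. It does not use libpcap or libnids; the capture and TCP reassembly are assumed to have happened already. The suffix list in MEDIA_SUFFIXES and the function name candidate_address are assumptions, since the source names only FLV and F4M explicitly.

```python
import re

# Assumed media suffixes; the source explicitly mentions only FLV and F4M.
MEDIA_SUFFIXES = (".flv", ".f4m", ".f4v", ".mp4", ".mp3")

REQUEST_LINE = re.compile(rb"^(GET|POST) (\S+) HTTP/1\.[01]\r\n")
HOST_HEADER = re.compile(rb"\r\nHost: *([^\r\n]+)", re.IGNORECASE)


def candidate_address(request: bytes):
    """Return the media URL requested by a reassembled HTTP request, or None."""
    line = REQUEST_LINE.match(request)
    host = HOST_HEADER.search(request)
    if not line or not host:
        return None
    target = line.group(2).decode("latin-1")
    # Ignore the query string when testing the file suffix.
    path = target.split("?", 1)[0].lower()
    if not path.endswith(MEDIA_SUFFIXES):
        return None
    if target.startswith("http://"):
        return target  # absolute-form target, as sent through a proxy
    return "http://" + host.group(1).decode("latin-1").strip() + target


request = (b"GET /v/clip.flv?start=0 HTTP/1.1\r\n"
           b"Host: media.example.com\r\n"
           b"User-Agent: demo\r\n\r\n")
print(candidate_address(request))
```

Every address found this way is only a candidate; advertisements and related clips are requested over the same channel and still have to be filtered out.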
3. Manually assisted clicking and click rule generation
For pages that cannot be played by script-injection or visual-feature simulated clicking, a Chrome assisted-click and rule-generation extension is developed. It records the page elements clicked manually on a sample page, together with the click order and click positions, and automatically generates a click rule that the simulated-click extension uses to click similar pages automatically.
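One possible shape for such a click rule is sketched below in Python. The field names, the use of a CSS selector to identify an element, the URL pattern that selects similar pages, and the JSON storage format are all assumptions for illustration; the source states only that the clicked elements, the click order, and the click positions are recorded.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ClickStep:
    order: int       # position in the recorded click sequence, starting at 1
    selector: str    # how the clicked element is located (assumed: CSS selector)
    offset: tuple    # click position inside the element, in pixels
    method: str      # how the click is delivered (assumed values: "mouse", "script")


@dataclass
class ClickRule:
    url_pattern: str                      # regex selecting the pages the rule covers
    steps: list = field(default_factory=list)

    def record(self, selector, offset, method="mouse"):
        """Append one manual click; the order is assigned automatically."""
        self.steps.append(ClickStep(len(self.steps) + 1, selector, tuple(offset), method))

    def to_json(self):
        """Serialise the rule so that it can be stored in a rule base."""
        return json.dumps(asdict(self), sort_keys=True)


rule = ClickRule(r"^http://video\.example\.com/play/")
rule.record("#overlay-close", (5, 5))   # first dismiss an element covering the player
rule.record("#player", (320, 180))      # then click the player itself
stored = json.loads(rule.to_json())
print(stored["steps"])
```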
4. Elimination of interfering audio/video programs
Besides the main program, an online playback page often contains many related audio/video programs, as well as interfering programs such as advertisements played before the main program. The present invention weights the probability that each program is an interfering one using information such as the position of the playback window in the page, the playback order, and the playing duration, and combines this with regular-expression matching against known interfering program addresses to distinguish the main program from interfering programs in the page, improving the accuracy of address resolution.
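A minimal Python sketch of this weighting is given below. The weights, the 60-second duration cut-off, the 0.5 threshold, and the address patterns are invented for illustration; the source names the features (window position, playback order, playing duration, and regular-expression matching of known interfering addresses) but gives no values.

```python
import re
from dataclasses import dataclass

# Assumed patterns and weights, for illustration only.
KNOWN_INTERFERENCE = [re.compile(r"//ads?\."), re.compile(r"/advert/")]
WEIGHTS = {"position": 0.3, "order": 0.3, "duration": 0.4}


@dataclass
class Candidate:
    url: str
    in_main_window: bool   # played in the main player area of the page
    play_order: int        # 0 for the first stream requested
    duration_s: float      # playing duration in seconds


def interference_probability(c: Candidate) -> float:
    """Score in [0, 1]; higher means more likely an interfering program."""
    if any(p.search(c.url) for p in KNOWN_INTERFERENCE):
        return 1.0                       # address matches a known interfering pattern
    score = 0.0
    if not c.in_main_window:
        score += WEIGHTS["position"]     # related clips sit outside the main player
    if c.play_order == 0:
        score += WEIGHTS["order"]        # pre-roll advertisements play first
    if c.duration_s < 60:
        score += WEIGHTS["duration"]     # short clips are more likely advertisements
    return score


def main_programs(candidates, threshold=0.5):
    """Keep the addresses whose interference score stays below the threshold."""
    return [c.url for c in candidates if interference_probability(c) < threshold]


candidates = [
    Candidate("http://ads.example.com/pre.flv", True, 0, 15),
    Candidate("http://media.example.com/show.flv", True, 1, 1800),
    Candidate("http://media.example.com/related.flv", False, 2, 30),
]
print(main_programs(candidates))
```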
5. Automatic resolution control based on a resolution state automaton
The techniques above are implemented in different programs and must be applied in combination, depending on the page type, to resolve a program address correctly. The present invention proposes automatic resolution control based on a resolution state automaton, which applies the techniques above according to the current resolution state and thereby achieves high-speed automatic resolution of program addresses for batches of online playback pages.
Compared with the prior art, the positive effects of the present invention are as follows:
The invention discloses an audio/video program address resolution method applicable to most audio/video playback pages at home and abroad. The method can effectively analyze and extract program addresses from most playback pages that use general-purpose players such as Flash. Compared with other program address extraction methods, it has the following advantages: (1) other methods apply only to specific websites, whereas this method applies to most playback websites that use general-purpose players such as Flash; (2) for the vast majority of playback pages the program address is extracted automatically without manual intervention, and for the few remaining websites, manually assisted clicking on a limited number of sample pages enables automatic extraction for similar pages of the same site; (3) state-based resolution control allows batches of playback pages to be resolved concurrently, greatly improving resolution speed and efficiency; (4) interfering program addresses are eliminated from the resolved addresses, distinguishing main programs from interfering ones and greatly improving the accuracy of address resolution.
Brief description of the drawings
Fig. 1 is a schematic diagram of the software architecture of the present invention;
Fig. 2 is the state transition diagram of the resolution state automaton;
Fig. 3 is the flow chart of address resolution based on web page content rules;
Fig. 4 is the flow chart of simulated clicking;
Fig. 5 is the flow chart of the network data analysis program.
Detailed description
The present invention is described in detail below with reference to specific embodiments.
1. Software architecture of the resolution method
The software architecture of the present invention is shown in Fig. 1 and consists of six parts: five software programs and a rule base.
The "state-based automatic resolution control" program (hereinafter the control program) is the master controller of the whole method. Using its built-in resolution state automaton, it controls the address resolution process of each online playback page, sending the page to the other programs for resolution and processing the results according to the current resolution state. It also excludes the addresses of interfering programs, using the interfering audio/video rules and the computed interference probability, and selects the main program address.
The "address resolution based on web page content rules" program receives the URL of a playback page to be resolved from the control program, downloads the page content, resolves the program address according to the web page content rules, and returns the result to the control program.
The "simulated web page click" program receives from the control program the URL of a playback page to be resolved and a click type. It calls the Chrome automatic loading module to open the page in the Chrome browser, where the simulated-click extension performs the simulated click corresponding to the click type and plays the program.
The "network data analysis" program is deployed on the same host as the "simulated web page click" program. It captures all network traffic produced while Chrome opens the playback page and plays the program, extracts program addresses from it, determines from the page call relationships which online playback page URL each address belongs to, and returns the analysis result to the control program.
The "manually assisted click" program receives from the control program the URLs of playback pages that the system could not resolve. It calls the Chrome automatic loading module to open the page in the Chrome browser and hands it to a person to click and play the program. During the manual clicking, the assisted-click and rule-generation extension in Chrome records the clicked elements, the click order, the click positions, and the method executed on each element, automatically generates a click rule, and stores it in the rule base.
The "rule base" is a relational database that stores three kinds of rules: web page content rules, click rules, and interfering audio/video rules. Web page content rules guide the "address resolution based on web page content rules" program in resolving addresses; click rules guide the "simulated web page click" program in clicking pages; interfering audio/video rules guide the control program in excluding interfering program addresses.
2. State-based automatic control of audio/video program address resolution
The control program controls the resolution of each online playback page in the task list through its built-in resolution state automaton, and this state-based control allows multiple pages to be resolved concurrently. The state transition diagram of the automaton that controls the resolution of a single playback page is shown in Fig. 2. The control process is as follows:
1) When the URL of an online playback page to be resolved is added to the task list, its state is set to "initial".
2) If the task state is "initial", the control program sends the page URL of the task to the content-rule address resolution program and changes the task state to "content-rule resolution in progress".
3) If the task state is "content-rule resolution in progress", the control program waits for the content-rule address resolution program to return its result. If the result contains a program address, the task state is changed to "resolution succeeded", resolution ends, and the returned address is the program address of the task; otherwise the task state is changed to "content-rule resolution complete".
4) If the task state is "content-rule resolution complete", the control program sends the playback page URL to the simulated web page click program (with the click type set to rule-based simulated click) and changes the task state to "rule-based simulated click".
5) If the task state is "rule-based simulated click", the control program waits for the simulated click program to return click information and for the network data analysis program to return its analysis result. If the simulated click program reports that no click rule matches, the task state is changed to "rule-based simulated click complete". Otherwise, if the network data analysis program returns nothing within 90 seconds, or no valid program address remains after interfering programs have been excluded from the returned information, the task state is changed to "resolution failed", resolution ends, and no program address has been found; otherwise the task state is changed to "resolution succeeded", resolution ends, and the valid address found is the program address of the task.
6) If the task state is "rule-based simulated click complete", the control program sends the playback page URL to the simulated web page click program (with the click type set to script-injection simulated click) and changes the task state to "script-injection simulated click".
7) If the task state is "script-injection simulated click", the control program waits for the simulated click program to return click information and for the network data analysis program to return its analysis result. If neither program returns information within 90 seconds, or no valid program address remains after interfering programs have been excluded from the returned information, the task state is changed to "script-injection simulated click complete"; otherwise the task state is changed to "resolution succeeded", resolution ends, and the valid address found is the program address of the task.
8) If the task state is "script-injection simulated click complete", the control program sends the playback page URL to the simulated web page click program (with the click type set to visual-feature simulated click) and changes the task state to "visual-feature simulated click".
9) If the task state is "visual-feature simulated click", the control program waits for the simulated click program to return click information and for the network data analysis program to return its analysis result. If neither program returns information within 90 seconds, or no valid program address remains after interfering programs have been excluded from the returned information, the task state is changed to "visual-feature simulated click complete"; otherwise the task state is changed to "resolution succeeded", resolution ends, and the valid address found is the program address of the task.
10) If the task state is "visual-feature simulated click complete", the control program sends the playback page URL to the manually assisted click program and changes the task state to "manually assisted click".
11) If the task state is "manually assisted click", the control program waits for the manually assisted click program to return the generated rule. If rule generation fails, the task state is changed to "resolution failed", resolution ends, and no program address has been resolved for the task; otherwise the click rule is stored in the rule base and the task state is changed to "content-rule resolution complete", which returns the task to step 4) so that the new rule is applied.
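The eleven steps above amount to a transition table, which can be sketched in Python as follows. The states follow the steps; their English names and the event labels ("dispatched", "address_found", "no_address", "no_matching_rule", "rule_generated", "rule_failed") are hypothetical, and the 90-second timeout is folded into the "no_address" event rather than modelled separately.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    CONTENT_PARSING = "content-rule resolution in progress"
    CONTENT_DONE = "content-rule resolution complete"
    RULE_CLICK = "rule-based simulated click"
    RULE_CLICK_DONE = "rule-based simulated click complete"
    SCRIPT_CLICK = "script-injection simulated click"
    SCRIPT_CLICK_DONE = "script-injection simulated click complete"
    VISUAL_CLICK = "visual-feature simulated click"
    VISUAL_CLICK_DONE = "visual-feature simulated click complete"
    MANUAL_CLICK = "manually assisted click"
    SUCCEEDED = "resolution succeeded"
    FAILED = "resolution failed"


# (current state, event) -> next state, one entry per branch of steps 2) to 11).
TRANSITIONS = {
    (State.INITIAL, "dispatched"): State.CONTENT_PARSING,
    (State.CONTENT_PARSING, "address_found"): State.SUCCEEDED,
    (State.CONTENT_PARSING, "no_address"): State.CONTENT_DONE,
    (State.CONTENT_DONE, "dispatched"): State.RULE_CLICK,
    (State.RULE_CLICK, "no_matching_rule"): State.RULE_CLICK_DONE,
    (State.RULE_CLICK, "address_found"): State.SUCCEEDED,
    (State.RULE_CLICK, "no_address"): State.FAILED,
    (State.RULE_CLICK_DONE, "dispatched"): State.SCRIPT_CLICK,
    (State.SCRIPT_CLICK, "address_found"): State.SUCCEEDED,
    (State.SCRIPT_CLICK, "no_address"): State.SCRIPT_CLICK_DONE,
    (State.SCRIPT_CLICK_DONE, "dispatched"): State.VISUAL_CLICK,
    (State.VISUAL_CLICK, "address_found"): State.SUCCEEDED,
    (State.VISUAL_CLICK, "no_address"): State.VISUAL_CLICK_DONE,
    (State.VISUAL_CLICK_DONE, "dispatched"): State.MANUAL_CLICK,
    (State.MANUAL_CLICK, "rule_generated"): State.CONTENT_DONE,
    (State.MANUAL_CLICK, "rule_failed"): State.FAILED,
}


def advance(state, event):
    """Return the next state, rejecting events the current state does not accept."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.value!r}") from None


def run(events, state=State.INITIAL):
    """Feed a sequence of events to one task and return its final state."""
    for event in events:
        state = advance(state, event)
    return state


# A page with no usable content rule or click rule, resolved by script injection.
final = run(["dispatched", "no_address", "dispatched", "no_matching_rule",
             "dispatched", "address_found"])
print(final.value)  # prints: resolution succeeded
```

Because each task carries only its current state, a controller can hold many tasks in a list and advance whichever one has a pending event, which is what makes concurrent resolution of many pages straightforward.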
3, Audio/video program address resolution based on web page content rules
The "address resolution based on web page content rules" program receives the playback page URL to be resolved from the control program, downloads the page content, resolves the audio/video program address according to the web page content rules, and returns the result to the control program. Web page content rules fall into two classes: general rules and site-specific rules. A general rule consists of a regular expression that the web page content must satisfy, a content extraction description, and a program address generation expression. A site-specific rule consists of a regular expression that the web page URL must satisfy, an extraction description, and a program address generation expression. The working process of the program is shown in Figure 3 and described in detail below:
1) Program initialization. Open an asynchronous UDP listening port and wait for the control program to issue parsing tasks (URLs of web pages to be resolved).
2) Read the web page content rules from the rule base.
3) If new web page content rules have appeared in the rule base, read them.
4) Read a parsing task URL from the UDP listening port. If one is available, perform 5); otherwise wait 1 second and jump to 3).
5) Download the content of the web page at the URL using the libcurl library. If the download succeeds, perform 6); otherwise send an error to the control program indicating that the web page at this URL cannot be opened, and jump to 3).
6) Search the web page content rules for a site-specific rule that the task URL satisfies. If one is found, generate the program address from the page content according to the rule's extraction description and program address generation expression, send it to the control program, and jump to 3); otherwise perform 7).
7) Match the page content against the general rules. If a matching rule is found, generate the program address from the page content according to the rule's extraction description and program address generation expression, and send it to the control program; otherwise send a parsing failure to the control program. Jump to 3).
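As a minimal sketch of steps 6) and 7), the following Python models a rule as three strings. The class name ContentRule, its field names, the use of a regular expression as the "extraction description", the use of a format string as the "program address generation expression", and the example.com addresses are all assumptions made for illustration; the source does not specify how these parts are encoded. The sketch also assumes that a site-specific rule whose extraction finds nothing falls through to the general rules.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContentRule:
    match: str     # regex the page URL (site rule) or page content (general rule) must satisfy
    extract: str   # extraction description, modelled here as a regex with capture groups
    template: str  # address generation expression, modelled here as a format string


def apply_rule(rule: ContentRule, content: str) -> Optional[str]:
    """Extract fields from the page content and build a program address."""
    found = re.search(rule.extract, content)
    return rule.template.format(*found.groups()) if found else None


def resolve(url: str, content: str,
            site_rules: List[ContentRule],
            general_rules: List[ContentRule]) -> Optional[str]:
    # Step 6: site-specific rules are selected by the page URL.
    for rule in site_rules:
        if re.search(rule.match, url):
            address = apply_rule(rule, content)
            if address:
                return address
    # Step 7: general rules are selected by the page content.
    for rule in general_rules:
        if re.search(rule.match, content):
            address = apply_rule(rule, content)
            if address:
                return address
    return None  # the caller reports a parsing failure


# Hypothetical rules, for illustration only.
SITE_RULES = [ContentRule(r"^https?://video\.example\.com/",
                          r'data-vid="(\w+)"',
                          "http://cdn.example.com/{0}.mp4")]
GENERAL_RULES = [ContentRule(r"<video",
                             r'<video[^>]*\ssrc="([^"]+)"',
                             "{0}")]
```

Calling resolve() with a page from video.example.com uses the site rule; any other page falls back to the general rule, and a None result corresponds to the "parsing failure" message of step 7).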
4, Simulated web page clicking
The program addresses of most audio/video playback pages are generated dynamically by the player plug-in at playback time and cannot be obtained by parsing the page content. For such pages, the simulated click program makes the page play its audio/video program automatically; network data analysis then captures the network data stream produced by the browser during playback and extracts the audio/video program address from that stream.
Using the extension development interface of the Chrome browser, the present invention develops a simulated click extension for Chrome. The extension automatically performs the simulated click after the browser opens the playback page. As shown in Figure 4, the working process of the simulated click program is as follows:
1) Open the Chrome browser from the command line and load the simulated click extension.
2) Receive a click task (online playback page URL and click type) from the control program.
3) If no click task has been received, wait 1 second and jump to 2); otherwise perform 4).
4) Append the click type parameter to the end of the URL and open it in a tab of the Chrome browser from the command line.
5) The simulated click extension listens through chrome.webRequest.onBeforeRequest.addListener(function (info) { }); before the page loads, it extracts the click type parameter from the end of the URL and loads the original URL.
6) If the page at the URL fails to open, send a page-open failure message to the control program and jump to 2); otherwise perform 7).
7) Through chrome.webRequest.onBeforeRequest.addListener(function (info) { }), filter out interfering resources with known features according to the interfering audio/video rules.
8) The simulated click extension performs the simulated click on the opened page according to the click type.
9) Send information about the clicked web page element to the control program.
10) Close the page and jump to 2).
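Steps 4) and 5) pass the click type to the extension by tagging the URL and stripping the tag again before the page loads. The Python sketch below illustrates that round trip only; the parameter name _click_type and the function names are hypothetical, since the source does not say how the parameter is encoded, and the real extension would do the stripping in JavaScript inside the onBeforeRequest listener.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CLICK_PARAM = "_click_type"  # hypothetical parameter name, not from the source


def tag_url(url: str, click_type: str) -> str:
    """Step 4: append the click type as the last query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((CLICK_PARAM, click_type))
    return urlunsplit(parts._replace(query=urlencode(query)))


def untag_url(tagged: str):
    """Step 5: recover the original URL and the click type from a tagged URL."""
    parts = urlsplit(tagged)
    query = parse_qsl(parts.query, keep_blank_values=True)
    click_type = None
    if query and query[-1][0] == CLICK_PARAM:
        click_type = query.pop()[1]
    return urlunsplit(parts._replace(query=urlencode(query))), click_type
```

Because the query string is re-encoded, an unusual original query may not survive byte for byte; a production version would more likely cut the known suffix off the raw string.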
The present invention implements three simulated click techniques in the Chrome simulated click extension: simulated clicking by script injection, simulated clicking based on visual features, and simulated clicking based on click rules. They are implemented as follows:
(1) Simulated clicking by script injection
When the simulated click extension finds that the click mode of the current page is script-injection simulated click, it communicates with the script injected into the tab containing the page through chrome.tabs.sendMessage(tabId, { }, function (response) { }) and tells the injected script to perform the simulated click. The injected script obtains the player control and the play (download) hyperlink in the page through DOM elements. For a play (download) hyperlink, the click is completed simply by calling the click() function from a JavaScript script. For an HTML5 player in an HTML5 page, playback is started by calling the play() function from an injected JavaScript script. For a Flash player, the injected control instructions first modify or add autoplay parameters according to a pre-built autoplay parameter library, so that the player plays the audio/video program directly after loading. If autoplay fails, the functions in a pre-built Flash player play function library are called one by one through injected JavaScript; if a call succeeds, click-to-play is achieved.
(2) Simulated clicking based on visual features
When the simulated click extension finds that the click mode of the current page is visual feature simulated click, it communicates with the script injected into the tab containing the page through chrome.tabs.sendMessage(tabId, { }, function (response) { }) and tells the injected script to perform the simulated click. The injected script analyzes the visual distribution and element types of the web page elements through the DOM, identifies the audio/video playback element, and calculates the offset coordinates of the element to be clicked. It then sends a request via ajax to the click web service, which calls the xdotool tool to move the mouse onto the element according to the offset coordinates and click it, achieving mouse movement and click-to-play on the player. During clicking, chrome.windows.update(windowId, { }, function (win) { }) and chrome.tabs.update(tabId, { }, function (tab) { }) are used to switch Chrome windows and tabs, and jQuery Deferred objects turn the asynchronous calls into synchronous ones, so that switching and clicking proceed in step.
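The coordinate calculation and the xdotool call can be illustrated with a short Python sketch. The element dictionaries, the PLAYER_TAGS set, the "largest player-like element wins" heuristic, and the viewport origin argument are assumptions for illustration; the source only says that the visual distribution and element types are analyzed. The command uses xdotool's mousemove and click subcommands.

```python
PLAYER_TAGS = {"video", "audio", "object", "embed"}  # assumed player-like tags


def pick_player(elements):
    """Choose the largest visible player-like element (an assumed heuristic)."""
    candidates = [e for e in elements
                  if e["tag"] in PLAYER_TAGS and e["width"] > 0 and e["height"] > 0]
    return max(candidates, key=lambda e: e["width"] * e["height"], default=None)


def click_point(element, viewport_origin):
    """Screen coordinates of the element centre.

    viewport_origin is the screen position of the page's top-left corner,
    i.e. the offset added by the browser window frame and toolbars.
    """
    origin_x, origin_y = viewport_origin
    return (origin_x + element["x"] + element["width"] // 2,
            origin_y + element["y"] + element["height"] // 2)


def xdotool_click(point):
    """Build the xdotool command that moves the mouse and left-clicks."""
    x, y = point
    return ["xdotool", "mousemove", str(x), str(y), "click", "1"]
```

The click web service would pass the returned list to something like subprocess.run; the sketch stops at building the command so that it has no side effects.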
(3) Simulated clicking based on click rules
Online playback pages that cannot be played by either script-injection or visual-feature simulated clicking mostly have a playback element that is disguised or hidden, or an interfering element that blocks the mouse from clicking the playback element directly. For such pages, sample pages are first clicked with human assistance; the Chrome auxiliary click and rule generation extension developed for this purpose records the order, position, and click mode of the web page elements the user clicks, and generates a click rule for that class of pages.
When the simulated click extension finds that the click mode of the current page is click-rule simulated click, it looks up a matching click rule by web page URL. If a click rule is found, then according to the click elements, click order, and click positions in the rule, it sends requests via ajax to the click web service, which calls the xdotool tool to move the mouse onto each element in turn and click it, achieving mouse movement and click-to-play on the player. During clicking, chrome.windows.update(windowId, { }, function (win) { }) and chrome.tabs.update(tabId, { }, function (tab) { }) are used to switch Chrome windows and tabs, and jQuery Deferred objects turn the asynchronous calls into synchronous ones, so that switching and clicking proceed in step.
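A rule lookup and replay order of this kind might look as follows in Python. The rule layout (url_pattern, clicks, order, signature, position, interval), the matching by regular expression, and the example.com data are hypothetical; the source states only that a rule is found by web page URL and carries the click elements, click order, and click positions.

```python
import re


def find_click_rule(url, rules):
    """Return the first rule whose URL pattern matches the page URL, else None."""
    for rule in rules:
        if re.search(rule["url_pattern"], url):
            return rule
    return None


def replay_plan(rule):
    """List the recorded clicks in the order in which they must be replayed."""
    ordered = sorted(rule["clicks"], key=lambda click: click["order"])
    return [(click["signature"], click["position"], click["interval"])
            for click in ordered]


# Hypothetical rule base entry, for illustration only.
RULES = [{
    "url_pattern": r"^https?://tv\.example\.com/watch/",
    "clicks": [
        {"order": 2, "signature": {"id": "play"}, "position": (12, 8), "interval": 1.5},
        {"order": 1, "signature": {"id": "close-overlay"}, "position": (4, 4), "interval": 0.0},
    ],
}]
```

Each entry of the plan would then be resolved to a page element and turned into one mouse move and click, in sequence.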
5, Network data analysis
When playback is triggered by the Chrome simulated click extension, the player sends a program content request to the audio/video server, and the request specifies the address of the audio/video program. These addresses can be obtained by capturing and analyzing the network traffic produced by the Chrome browser. The network data analysis program is a packet capture and analysis program built on the libpcap and libnids libraries. It captures all network traffic of the host running the playback browser, analyzes the http, rtmp, and rtmpt traffic, and extracts the audio/video program addresses together with the URL of the playback page each program belongs to. After interfering audio/video is excluded, the addresses of the valid audio/video programs in the specified playback page are obtained.
The program captures TCP traffic with libpcap, reassembles the captured TCP traffic with libnids, and performs application protocol analysis on the restored application-layer traffic. It extracts audio/video program addresses from http, rtmp, and rtmpt traffic or from f4m information files, associates each with the playback page URL of its program, and sends the program address and playback page URL to the control program. As shown in Figure 5, the network data analysis program proceeds as follows:
1) Initialize libnids and register the tcp processing function.
2) Start libnids and carry out tcp reassembly.
3) libnids restores the network data and calls back the tcp processing function.
4) If the connection state is "connection established", perform 5); if it is "connection data", perform 6); for any other state, jump to 7).
5) Initialize the connection data storage and jump to 3).
6) Store the connection data. If the total amount of data stored for the connection exceeds 10K, jump to 7); otherwise jump to 3).
7) Determine the connection type. If it is an http connection, jump to 9); otherwise perform 8).
8) If it is an rtmp connection, perform 10); otherwise release the connection data and jump to 3).
9) If it is an rtmpt connection, perform 10); otherwise perform 11).
10) Extract the program address and other fields from the rtmp request according to the rtmp protocol. Associate the playback page URL using the pageURL information, and send the program address and playback page URL to the control program. Release the connection data and jump to 3).
11) Determine the type of data transmitted on the connection. If it is audio/video program data or an f4m information file, extract the program address or file address, associate the playback page URL using the Reference information, and send both to the control program; otherwise, record the association between the current connection and the connection named by its Reference information. Release the connection data and jump to 3).
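The branching in steps 7) to 9) and the address extraction in step 11) can be sketched in Python over the bytes buffered for one connection. The detection heuristics are assumptions, not the patent's stated tests: an RTMP client stream is recognised by its leading version byte 0x03, RTMPT by HTTP POST requests to paths such as /open/, /send/, /idle/, /close/ or /fcs/ident, and the "Reference" information is taken to be the HTTP Referer header. The real program works in C on data delivered by libnids; this sketch only shows the classification logic.

```python
RTMPT_PREFIXES = (b"/open/", b"/send/", b"/idle/", b"/close/", b"/fcs/ident")


def classify(buf: bytes) -> str:
    """Return 'rtmp', 'rtmpt', 'http' or 'other' for the start of a client stream."""
    if buf[:1] == b"\x03":  # RTMP handshake starts with protocol version 3
        return "rtmp"
    request_line = buf.split(b"\r\n", 1)[0].split(b" ")
    if len(request_line) == 3 and request_line[2].startswith(b"HTTP/"):
        method, path = request_line[0], request_line[1]
        if method == b"POST" and path.startswith(RTMPT_PREFIXES):
            return "rtmpt"  # RTMP tunnelled over HTTP
        return "http"
    return "other"


def http_request_info(buf: bytes):
    """Return (requested URL, Referer or None) for a buffered HTTP request."""
    head = buf.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    _method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return "http://" + headers.get("host", "") + path, headers.get("referer")
```

For a media request, the second value returned by http_request_info() is what would link the program address back to its playback page, or to an intermediate connection whose own Referer has to be followed.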
6, Human-assisted clicking and click rule generation
For pages that cannot be played by script-injection or visual-feature simulated clicking, a Chrome auxiliary click and rule generation extension is developed. It records the web page elements clicked manually on a sample page, the click order, and the click positions, and automatically generates a click rule that the simulated click extension uses to click similar pages automatically.
Main flow of the human-assisted click plug-in:
1) Manually open the Chrome browser and load the auxiliary click and rule generation extension.
2) Manually open the simulated click management page.
3) Manually select, from the management page, a page to be clicked. The browser loads the selected playback page, and the extension injects a click event recording script into the playback page and into each iframe page within it.
4) The user clicks the relevant elements on the page; the injected recording scripts record, in each page, the order of the user's clicks and the elements clicked.
5) The user closes the playback page and, depending on the playback result, confirms or cancels the click behavior in the management page.
6) For click behavior the user confirms, the extension automatically extracts feature values of the clicked elements, ensuring that each clicked element can be uniquely identified by one or more attribute features, generates a click rule from the click order and the time intervals, and records it in the rule base.
7) Repeat 3) to 6) for other pages in turn until there is nothing more to click.
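Step 6) requires that each clicked element be uniquely identifiable by one or more attributes. One way to satisfy that, shown here as a hedged Python sketch, is to search for the smallest attribute subset that matches no other element on the page. The element and click dictionaries, the max_attrs limit, and the rule layout are hypothetical; the source does not describe how the feature values are chosen.

```python
from itertools import combinations


def unique_signature(target, elements, max_attrs=3):
    """Smallest set of the target's attributes that matches only the target.

    `target` must be one of `elements`; returns None if no subset of up to
    `max_attrs` attributes is unique.
    """
    names = sorted(target)
    for size in range(1, max_attrs + 1):
        for combo in combinations(names, size):
            matches = [e for e in elements
                       if all(e.get(name) == target[name] for name in combo)]
            if len(matches) == 1:
                return {name: target[name] for name in combo}
    return None


def build_rule(url_pattern, clicks, elements):
    """Turn recorded clicks into an ordered click rule with time intervals."""
    steps, previous_time = [], None
    for order, click in enumerate(sorted(clicks, key=lambda c: c["time"]), start=1):
        signature = unique_signature(click["element"], elements)
        if signature is None:
            return None  # rule generation fails for this page
        interval = 0.0 if previous_time is None else click["time"] - previous_time
        steps.append({"order": order, "signature": signature,
                      "position": click["position"], "interval": interval})
        previous_time = click["time"]
    return {"url_pattern": url_pattern, "clicks": steps}
```

Preferring the smallest unique subset keeps rules tolerant of small page changes, at the cost of a combinatorial search that max_attrs keeps bounded.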
7, Excluding interfering audio/video data
Audio/video playback is mixed with various kinds of interfering audio/video data, mainly: audio/video advertisements floating over the page that play automatically, interfering resources produced by multi-window playlists, background music on forum-type websites, and advertisements inserted before the real source plays. Identifying and excluding these resources as far as possible effectively improves the accuracy of audio/video program address identification.
The present invention combines web page visual features, play order, file size, playing duration and other information to compute a weighted score for the likelihood that an audio/video program is interference, and combines this with regular-expression matching against known interfering program addresses to distinguish the main audio/video program in a page from interfering ones, improving the accuracy of audio/video program address resolution.
Floating page advertisements and playback window lists have distinctive positions and groupings. Based on information such as the positions of the structure blocks produced by visual-feature page segmentation, the present invention filters out advertisement players located in marginal structure blocks. For background music, the present invention on the one hand filters audio/video resources loaded through embed tags by searching for element tags, and on the other hand filters Flash or HTML5 players located near the edge of the main structure block.
Advertisements played before the real source, and the various audio/video data played immediately when the page loads, are hard to remove. Such advertisements usually have the following characteristics: they play before the real audio/video; the file is small; the playing duration is short; the definition (bit rate) is low; the resource link often contains an obvious marker (for example, keywords such as ad or advert); the server domain name may be that of the website's static resource server; and the audio/video request carries no cookie information. All of these can serve as features for judging whether a resource is an advertisement. To exclude advertisement resources accurately without deleting real resources by mistake, the present invention excludes interfering audio/video resources by weighted scoring.
For a particular playback page, interfering audio/video program addresses are excluded as follows:
1) Obtain all audio/video program addresses resolved during address resolution based on web page content rules and address resolution based on simulated clicking, together with each program's visual position in the page, play order, program size, playing duration, and other information.
2) Match each audio/video program address against the interfering audio/video rules in the rule base using regular expressions. If the match succeeds, the address is judged to be an interfering program address. Otherwise, go to step 3).
3) According to the visual positions of all audio/video programs in the page, extract the web page element of the main audio/video program, and exclude audio/video addresses that appear in corners or in groups within playback window lists.
4) Identify background music addresses from the tag that contains the audio/video program, and exclude them.
5) Score the program by weights. The features scored are: the time the program was produced, program file size, program playing duration, program bit rate, program link features, audio/video server domain name type, and Cookie attribute value. The present invention assigns different scores to these feature values; the weight settings are shown in Table 1.
6) If the computed program weight is greater than 15, the address is judged to be an interfering program address; go to step 8). Otherwise, go to step 7).
7) Mark this program address as the main audio/video program address.
8) This program address is highly likely to be an interfering program address; mark it as a suspected interfering program address.
Table 1: Weight scoring rules
Feature | Feature value | Minimum score | Feature value | Maximum score
Resource production time | Later | 1 | Earlier | 5
Resource file size | Large | 2 | Small | 4
Resource playing duration | Long | 1 | Short | 4
Resource bit rate | High | 2 | Low | 4
Resource link contains a feature keyword | No | 1 | Yes | 3
Domain name matches the task's service domain | Consistent | 1 | Inconsistent | 2
Request carries a Cookie value | Yes | 1 | No | 3
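The regular-expression check of step 2) and the scoring of steps 5) to 8) can be sketched in Python using the scores of Table 1 and the threshold of 15. How "large" versus "small" or "long" versus "short" is decided is not stated in the source, so this sketch assumes the caller has already reduced each feature to a boolean; the feature names are hypothetical. Steps 3) and 4) (visual position and background music tags) are not modelled.

```python
import re

# (score when the feature is absent, score when it is present), from Table 1.
WEIGHTS = {
    "produced_first": (1, 5),    # resource produced earlier than the others
    "small_file": (2, 4),
    "short_duration": (1, 4),
    "low_bitrate": (2, 4),
    "link_has_keyword": (1, 3),  # e.g. "ad" or "advert" in the link
    "foreign_domain": (1, 2),    # domain differs from the task's service domain
    "no_cookie": (1, 3),
}
THRESHOLD = 15


def score(features):
    """Sum the Table 1 score of every feature."""
    return sum(high if features[name] else low
               for name, (low, high) in WEIGHTS.items())


def classify_program(address, features, interference_patterns):
    """Label one program address as in steps 2) and 5) to 8)."""
    if any(re.search(pattern, address) for pattern in interference_patterns):
        return "interference"
    if score(features) > THRESHOLD:
        return "suspected interference"
    return "main"
```

With these values the total ranges from 9 (no advertisement-like feature) to 25 (all of them), so the threshold of 15 sits below the midpoint of that range.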

Claims (9)

1. A network audio/video address resolution method based on visual features and stream restoration, the steps of which are:
1) identifying the video playback element of the currently playing web page according to the page's behavior and content;
2) injecting into the video playback element a control instruction or an execution script for controlling video playback, simulating the user's page click behavior to control playback of the player in the currently playing web page, and then proceeding to step 5); if playback fails, proceeding to step 3);
3) calculating the coordinate offset of the video playback element to be clicked from the visual distribution of web page elements in the content of the currently playing web page, then controlling the mouse to move according to this coordinate offset and click the player to play, and then proceeding to step 5); if playback fails, proceeding to step 4);
4) clicking the player in the currently playing page to play according to a set click rule, and then proceeding to step 5);
5) obtaining the network data between the player and the audio/video server, and extracting the network audio/video address of the currently playing web page from this network data.
2. The method of claim 1, characterized in that the currently playing web page is first fetched and parsed using web page content resolution rules; if no video address is obtained, step 1) is carried out.
3. The method of claim 1, characterized in that the network audio/video address of the currently playing web page is extracted from the network data as follows: all audio/video program addresses are extracted from the network data as candidate program addresses of the currently playing web page, regular-expression matching is then performed against known interfering program addresses, and interfering audio/video program addresses are filtered out.
4. The method of claim 3, characterized in that the display position in the currently playing web page of the playback window corresponding to each candidate program address, the play order, and the playing duration information are extracted; the corresponding audio/video program is then weighted according to the display position, play order, and playing duration information, and candidate program addresses whose weighted value is below a set threshold are filtered out.
5. The method of claim 3, characterized in that the audio/video program is extracted from the http, rtmp, and rtmpt data in the network data.
6. The method of claim 1, characterized in that the click rule is generated as follows: each class of sample page is first clicked with human assistance, and the order, position, and click mode of the web page elements the user clicks are recorded to generate the click rule for that class of sample page; wherein a sample page is a page for which steps 2) and 3) both fail to play and for which parsing with web page content resolution rules obtains no video address.
7. The method of any one of claims 1 to 6, characterized in that the video playback element of the playback page is identified as follows: the visual distribution and element types of the web page elements of the currently playing web page are analyzed according to the DOM of the page's behavior and content, and the video playback element is identified.
8. The method of any one of claims 1 to 6, characterized in that in step 2), if playback fails, the play functions in a pre-built player play function library are called one by one to play.
9. The method of any one of claims 1 to 6, characterized in that the xdotool tool is invoked by a system call to move the mouse and click the player to play.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510349166.9A CN106303757B (en) 2015-06-23 2015-06-23 A kind of view-based access control model feature and the network audio-video address resolution method of stream reduction

Publications (2)

Publication Number Publication Date
CN106303757A true CN106303757A (en) 2017-01-04
CN106303757B CN106303757B (en) 2019-07-16

Family

ID=57650769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510349166.9A Expired - Fee Related CN106303757B (en) 2015-06-23 2015-06-23 A kind of view-based access control model feature and the network audio-video address resolution method of stream reduction

Country Status (1)

Country Link
CN (1) CN106303757B (en)

Also Published As

Publication number Publication date
CN106303757B (en) 2019-07-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190716

Termination date: 20210623