CN108306918A - Method for automatically obtaining website access information based on dynamic program analysis - Google Patents

Info

Publication number
CN108306918A
Authority
CN
China
Prior art keywords
user
webpage
file
website
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710033453.8A
Other languages
Chinese (zh)
Other versions
CN108306918B (en)
Inventor
张卫丰
陈贵美
刘蕊成
赵晨
许蕾
周国强
张迎周
王子元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nupt Institute Of Big Data Research At Yancheng Co Ltd
Original Assignee
Nupt Institute Of Big Data Research At Yancheng Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nupt Institute Of Big Data Research At Yancheng Co Ltd
Priority to CN201710033453.8A
Publication of CN108306918A
Application granted
Publication of CN108306918B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/34 Network arrangements or protocols for supporting network services or applications involving the movement of software or configuration parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services

Abstract

The invention discloses a method, based on dynamic program analysis, for automatically obtaining information about a user's visits to a website in different time periods. First, dynamic instrumentation is used to capture the user's behavior while accessing a web page; the behavior is collected and stored in a variable, while a proxy caches the pages the user accesses, grouped by domain name. Then the tool Selenium is used with a timer to access the same web page automatically multiple times, and the files obtained are compared to find differences in the user's behavior. The invention is notably effective at obtaining a user's dynamic behavior automatically over multiple visits.

Description

Method for automatically obtaining website access information based on dynamic program analysis
Technical field
The present invention relates to a dynamic analysis framework that simulates a user automatically accessing a website multiple times at timed intervals, records the resulting behavior, and analyzes it. It belongs to the field of Internet applications.
Background art
With the development of the Internet, the JavaScript (JS) scripting language has been widely used in Web clients. The emergence of numerous development frameworks, such as Node.js, has made it feasible to write servers in JavaScript, and more and more JavaScript applications now run on the server side, which demonstrates the power of the language. Research on and application of JavaScript are accordingly becoming increasingly popular.
Selenium is a tool for testing Web application clients. Selenium tests run directly in the browser; based on JavaScript and combined with WebDriver, Selenium simulates the real operations of a user. It supports a variety of browsers (such as Safari, IE, Firefox, and Chrome) and runs on several operating systems.
Jalangi is a framework for dynamic analysis of front-end and back-end JavaScript code. By adding hooks, it can monitor every JavaScript operation, such as variable reads and writes, unary and binary operations, and function and method calls. Jalangi can therefore be used to instrument JavaScript code as needed in order to extract the relevant information.
Summary of the invention
Technical problem: The purpose of the present invention is to simulate a user accessing the same web page in different time periods, automatically record the user's dynamic behavior, serialize it into local files, and then compare the differences in web page content between time periods. The technique overcomes a deficiency of existing crawler technology, which can only fetch the existing content of web pages in batches and cannot record and analyze particular user behaviors in batches.
To achieve the above object, the present invention uses the JavaScript dynamic instrumentation tool Jalangi to instrument the JavaScript source code of the entire web page in order to obtain the required information (such as the JavaScript files executed for the user and the JavaScript variables that hold information), and then uses Selenium to read the JavaScript variables in the instrumented page. Since Selenium supports multiple languages (including Java), Java stream methods are used to save the obtained information to local files, and a timer is set so that the user's dynamic behavior and the accessed server information are obtained in batches. In addition, Selenium can execute a JS script to process the page through the executeScript method, so a JS script is executed to return the JS variable.
The method of the present invention comprises the following steps:
Step 1: Instrument the JavaScript code executed in the web page the user accesses.
Using the JavaScript dynamic instrumentation tool Jalangi, the JavaScript code executed for the user while the web page loads is instrumented to obtain its dynamic characteristics. Since JavaScript allows properties to be bound to objects dynamically, a JS variable can be added to the object that carries the required information; during page execution this JS variable is attached and passed along, thereby recording the user's dynamic behavior.
Step 2: Collect the dynamic behavior during a single visit by the user.
Step 2.1: A proxy server is configured in the Firefox browser, so that the web page the user loads passes through the proxy server. The proxy server calls Jalangi to perform the instrumentation, and the page returned to the user contains the JS variable added in Step 1. The executeScript method of Selenium is used to obtain the information held in this variable, and Java stream methods serialize it into a local file for persistent storage. In addition, executeScript can be used in Selenium to run a JS script that obtains other required information, for example the URLs in the src attributes of script tags.
Step 2.2: The mitmproxy proxy calls a Python script to cache the web pages the user accesses. The files are stored separately according to the domain name of the URL of the requested content, and the md5 code of each file is recorded (different files produce different md5 codes). The recorded files are then copied to local storage.
Step 3: Simulate the user accessing the same website multiple times at timed intervals, and record the information in batches.
Selenium can open and close the Firefox browser automatically and can therefore simulate a user accessing a website. A timer repeats this action at timed intervals, so that the same user accesses the same website multiple times. Because the files cached by mitmproxy are recorded continuously, after each visit to the website the cached files must be deleted before the website is visited again, so that each record corresponds to the result of exactly one visit. Therefore, after the page has been accessed, the file copy must be completed and this visit's cached files deleted within a specified period during which the browser is closed; a delay in the Selenium script ensures that the browser does not access the page during this period and resumes accessing the same page only after the cache has been deleted.
Step 4: Analyze and compare the user's dynamic behavior that has been obtained.
The batches of files obtained through mitmproxy are analyzed. If the files the user accesses differ, the md5 codes generated differ, so differences in the user's dynamic behavior when accessing the same website can be identified by comparing md5 codes. Successive visits by the user are distinguished by the time of each visit. The md5 codes of the files obtained in the first visit are compared with those of the files from each later visit; in the comparison result the domain name is the key, and the files that were present in the earlier visit but absent from the later one are the value.
The JS variable information obtained by Selenium can be analyzed in the same way.
Compared with the prior art, the present invention has the following advantages:
By using Selenium, Jalangi, a proxy server, and related techniques, the same user can be simulated accessing the same website multiple times at timed intervals, and by using Jalangi to instrument the JavaScript code of the web page, the user's dynamic behavior can readily be tracked, detected, and obtained.
In addition, a proxy server is configured in the browser. When the user accesses a web page, the page passes through the proxy server, which calls Jalangi to perform the instrumentation, so the returned page contains the JS variable that records the information. Selenium can obtain the contents of this variable automatically, and the variable does not exist before the page is instrumented. This effectively makes up for the shortcomings of existing crawler technology, which can only obtain the static content of the page itself and cannot capture what the same user accesses over multiple visits. Moreover, because the proxy server can cache the user's access information by domain name, including the JS and HTML before and after instrumentation, the differences between a user's multiple visits to a page can be compared, whereas crawler technology can only fetch the source code of the entire page and cannot distinguish content from different domain names.
Description of the drawings
Fig. 1 is a flow chart of obtaining the JS variable that stores the information when the user accesses a web page;
Fig. 2 is a flow chart of obtaining the md5 files under different domain names when the user accesses a web page;
Fig. 3 is a flow chart of obtaining the files when the user accesses the same web page multiple times;
Fig. 4 is a flow chart of comparing the files obtained.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings.
Step 1: Instrument the JavaScript code executed in the web page the user accesses.
The JavaScript dynamic instrumentation tool Jalangi is used, and Selenium simulates the user accessing the web page. Before starting the Firefox browser, Selenium configures the proxy server, including its IP address and port. The browser is then started, the URL the user is to access is passed in through the browser's get method, and the page is loaded. The JavaScript code executed for the user while the page loads is instrumented to obtain its dynamic characteristics. Since JavaScript allows properties to be bound to objects dynamically, a JS variable can be added to the object that carries the required information; during page execution this JS variable is attached and passed along, thereby recording the user's dynamic behavior.
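The idea of binding a recording variable to an object at run time and letting instrumented code fill it in can be illustrated with a small analogy. The sketch below is written in Python rather than JavaScript and does not use Jalangi; the names `instrument`, `Page`, and `behavior_log` are hypothetical and serve only to show the pattern of wrapping a call so that each execution is appended to a record attached to the object.

```python
import functools


def instrument(func, record):
    """Wrap func so that every call is appended to the shared record list."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        record.append({"function": func.__name__, "args": list(args), "result": result})
        return result
    return wrapper


class Page:
    """Hypothetical stand-in for an object whose behavior is to be traced."""

    def load_script(self, url):
        return f"loaded {url}"


page = Page()
# Bind the recording attribute at run time (the analogue of the added JS variable).
page.behavior_log = []
# Replace the method with an instrumented version that writes into that attribute.
page.load_script = instrument(page.load_script, page.behavior_log)
page.load_script("http://example.com/a.js")
```

After the call, `page.behavior_log` holds one entry describing the call, which is the kind of record the added JS variable is meant to carry through page execution.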
Step 2: Collect the dynamic behavior of a single visit by the user.
Step 2.1: A proxy server is configured in the Firefox browser, so that the web page the user loads passes through the proxy server. The proxy server calls Jalangi to perform the instrumentation, and the page returned to the user contains the JS variable added in Step 1. The executeScript method of Selenium is used to obtain the information held in this variable. The method returns a value of type Object, which can be converted into a format that is convenient to compare, such as JSON.
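A minimal sketch of this retrieval step follows, in Python rather than the Java described above. It assumes a driver object that offers `execute_script(script, *args)`, which is the Python binding's counterpart of executeScript; the variable name `__behaviorRecord` is a hypothetical placeholder, and `FakeDriver` exists only so the sketch runs without a browser.

```python
import json


def read_recorded_variable(driver, name="__behaviorRecord"):
    """Fetch a page-level JS variable and serialize it as canonical JSON text.

    The variable name is a hypothetical placeholder, not taken from the source.
    """
    value = driver.execute_script("return window[arguments[0]];", name)
    # sort_keys makes two records with the same content serialize identically,
    # which keeps later comparisons simple.
    return json.dumps(value, sort_keys=True, ensure_ascii=False)


class FakeDriver:
    """Minimal stand-in so the sketch runs without a browser."""

    def execute_script(self, script, *args):
        return {"scripts": ["a.js"], "count": 1}


text = read_recorded_variable(FakeDriver())
```

Serializing with sorted keys is a design choice made here for easier comparison; the source only states that the value is converted to a convenient format such as JSON.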
When Selenium is used to obtain the variable automatically, iframes need particular attention. A website may contain multiple iframes, and the user's dynamic behavior may span several of them. To obtain the variable inside an iframe, the browser's switchTo().frame() method must be used to switch into that iframe first. To move from one iframe to another, the switchTo().defaultContent() method must first be used to return to the default content, and only then can the target iframe be entered.
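The switching order described above can be sketched as follows. The sketch is in Python, where the corresponding Selenium calls are `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`; it only handles top-level iframes, the variable name `__behaviorRecord` is hypothetical, and the fake driver classes are stand-ins so the sketch runs without a browser.

```python
def collect_from_frames(driver, name="__behaviorRecord"):
    """Read one JS variable from the top document and from each top-level iframe."""
    script = "return window[arguments[0]];"
    driver.switch_to.default_content()
    values = [driver.execute_script(script, name)]
    frame_count = len(driver.find_elements("tag name", "iframe"))
    for index in range(frame_count):
        # Always return to the default content before entering the next iframe.
        driver.switch_to.default_content()
        driver.switch_to.frame(index)
        values.append(driver.execute_script(script, name))
    driver.switch_to.default_content()
    return values


class _FakeSwitchTo:
    def __init__(self, owner):
        self.owner = owner

    def default_content(self):
        self.owner.context = "top"

    def frame(self, index):
        self.owner.context = f"frame{index}"


class FakeDriver:
    """Stand-in with two iframes so the sketch runs without a browser."""

    def __init__(self):
        self.context = "top"
        self.switch_to = _FakeSwitchTo(self)

    def find_elements(self, by, value):
        return ["iframe-a", "iframe-b"] if self.context == "top" else []

    def execute_script(self, script, *args):
        return {"context": self.context}


driver = FakeDriver()
records = collect_from_frames(driver)
```

Nested iframes would need the same pattern applied recursively; that case is left out here.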
The information is then serialized into a local file with Java stream methods for persistent storage. The regular expression (?<=http://|\\.)[^.]*(com|cn|net|org|biz|info|cc|tv) is matched against the URL the user accesses to obtain the second-level domain of the website, which is used as the title of the file, and the time of the current visit is used as the file name, so that different visits can be distinguished and later comparison is easier.
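The naming scheme can be sketched in Python as below. Two points are assumptions rather than the source's method: Python's `re` module requires fixed-width lookbehinds, so the alternation is split into two lookbehinds, and an explicit `\.` is placed before the suffix so that the match includes the domain label. The way the domain and the timestamp are combined into one file name, and the function name `record_file_name`, are also hypothetical.

```python
import re
from datetime import datetime

# Adapted pattern: each lookbehind has a fixed width, as Python requires.
_DOMAIN = re.compile(
    r"(?:(?<=http://)|(?<=\.))[^.]*?\.(?:com|cn|net|org|biz|info|cc|tv)"
)


def record_file_name(url, when):
    """Build a file name from the site's domain and the time of the visit."""
    match = _DOMAIN.search(url)
    domain = match.group(0) if match else "unknown"
    return f"{domain}_{when.strftime('%Y%m%d-%H%M%S')}.json"


name = record_file_name(
    "http://www.example.com/news/index.html",
    datetime(2017, 1, 13, 9, 30, 0),
)
```

The pattern only covers `http://` URLs and the listed suffixes, mirroring the expression in the text; other schemes and suffixes would need extra alternatives.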
In addition, executeScript can be used in Selenium to run a JS script that obtains other required information, for example the URLs in the src attributes of script tags. Separate files can be created to store the different kinds of information obtained.
Step 2.2: The mitmproxy proxy calls a Python script to cache the web pages the user accesses. The files are stored separately according to the domain name of the URL of the requested content, and the md5 code of each file is recorded automatically. The recorded files are then copied to local storage.
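The grouping and hashing in this step can be sketched without a running proxy. The function below is a hypothetical helper that takes (URL, body) pairs; in an actual mitmproxy script it would presumably be fed from the response hook, which is not shown here.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlparse


def cache_by_domain(responses):
    """Group response bodies by the host of their URL and record each body's md5.

    responses: iterable of (url, body_bytes) pairs, e.g. collected by a proxy.
    Returns {domain: {md5_hex: body_bytes}}.
    """
    cache = defaultdict(dict)
    for url, body in responses:
        domain = urlparse(url).hostname or "unknown"
        digest = hashlib.md5(body).hexdigest()
        cache[domain][digest] = body
    return dict(cache)


sample = [
    ("http://example.com/index.html", b"<html></html>"),
    ("http://example.com/app.js", b"var a = 1;"),
    ("http://cdn.example.net/lib.js", b"var a = 1;"),
]
cached = cache_by_domain(sample)
```

Keying each entry by its md5 value means identical content under one domain is stored once, and the later comparison only needs the keys.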
The moment at which the files are copied is critical. A server is set up to listen on a port: when an access to the port is detected, copying starts, and once copying is finished the files cached by the mitmproxy proxy are deleted.
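The copy-then-delete action that the listening server triggers might look like the following Python sketch. The listening server itself is omitted; the function name `copy_and_clear`, the directory layout, and the label `visit-1` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_and_clear(cache_dir, archive_root, label):
    """Copy the proxy cache to archive_root/label, then empty the cache."""
    cache_dir = Path(cache_dir)
    target = Path(archive_root) / label
    shutil.copytree(cache_dir, target)
    # Delete only after the copy has finished, so no cached file is lost.
    for entry in cache_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return target


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    (cache / "example.com").mkdir(parents=True)
    (cache / "example.com" / "abc123").write_bytes(b"var a = 1;")
    saved = copy_and_clear(cache, Path(tmp) / "archive", "visit-1")
    copied = sorted(p.name for p in saved.rglob("*") if p.is_file())
    remaining = list(cache.iterdir())
```

Emptying the cache directory rather than removing it keeps the location the proxy script writes to in place for the next visit.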
Step 3: Simulate the user accessing the same website multiple times at timed intervals, and record the information in batches.
Selenium can open and close the Firefox browser automatically and can therefore simulate a user accessing a website. A timer repeats this action at timed intervals, so that the same user accesses the same website multiple times.
Because the files cached by mitmproxy are recorded continuously, after each visit to the website the cached files must be deleted before the website is visited again, so that each record corresponds to the result of exactly one visit. Therefore, after the page has been accessed, the file copy must be completed and this visit's cached files deleted within a specified period during which the browser is closed. The Java delay function Thread.sleep() is called in the Selenium program to ensure that the browser does not access the page during this period and continues to access the same page only after the cache has been deleted.
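The visit, archive, wait cycle can be sketched as a small loop. The sketch is in Python, with `time.sleep` playing the role of Thread.sleep(); the callables `visit` and `archive` are hypothetical placeholders for the browser session and for the copy-and-delete step, and the sleep function is injectable so the example runs instantly.

```python
import time


def run_visits(visit, archive, times, wait_seconds, sleep=time.sleep):
    """Repeat visit -> archive, pausing between rounds so the cache is clean."""
    for number in range(1, times + 1):
        visit(number)      # open the browser, load the page, read the variable
        archive(number)    # copy the cached files, then delete the cache
        if number < times:
            sleep(wait_seconds)  # browser stays idle until the next round


events = []
run_visits(
    visit=lambda n: events.append(f"visit {n}"),
    archive=lambda n: events.append(f"archive {n}"),
    times=2,
    wait_seconds=30,
    sleep=lambda s: events.append(f"wait {s}"),
)
```

Running the archive step before the wait guarantees that the cache is already empty when the next visit begins, which is the property the paragraph above requires.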
After the browser starts, a set waiting time elapses before the get() method is called with the URL to load the page. After the JS variable has been obtained with Selenium and saved, a further URL request is sent to the server as a signal to start copying the md5 files. The current time is passed as a parameter of this URL, and the server reads the value of the parameter and uses it as the file name.
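Passing the visit time as a URL parameter and reading it back on the server side can be sketched with the Python standard library. The address `http://127.0.0.1:8000/copy` and the parameter name `time` are assumptions made for illustration; the source does not specify them.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_signal_url(base, timestamp):
    """Client side: append the visit time as a query parameter."""
    return f"{base}?{urlencode({'time': timestamp})}"


def label_from_signal(url):
    """Server side: recover the visit time to use as the file name."""
    values = parse_qs(urlparse(url).query).get("time")
    return values[0] if values else None


signal = build_signal_url("http://127.0.0.1:8000/copy", "2017-01-13 09:30:00")
label = label_from_signal(signal)
```

Encoding the timestamp keeps spaces and colons out of the raw URL, and the server gets back exactly the string the client sent.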
Step 4: Analyze and compare the user's dynamic behavior that has been obtained.
The batches of files obtained through mitmproxy are analyzed. If the files the user accesses differ, the md5 codes generated differ, so differences in the user's dynamic behavior when accessing the same website can be identified by comparing md5 codes. Successive visits by the user are distinguished by the time of each visit. The md5 codes of the files obtained in the first visit are compared with those of the files from each later visit; in the comparison result the domain name is the key, and the files that were present in the earlier visit but absent from the later one are the value. Identical files have identical md5 codes; the differing files form the subset of md5 values in which the user's dynamic behavior differs from that of the most recent visit.
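The comparison can be sketched as a set difference per domain. The sketch below is in Python and assumes each visit has already been reduced to a mapping from domain name to the md5 values seen under it; the function name and the sample digests are hypothetical.

```python
def diff_against_first(first, later):
    """Return {domain: sorted md5s present in `first` but missing from `later`}.

    Both arguments map a domain name to a collection of md5 hex strings.
    """
    result = {}
    for domain, digests in first.items():
        missing = set(digests) - set(later.get(domain, ()))
        if missing:
            result[domain] = sorted(missing)
    return result


first_visit = {"example.com": {"aaa", "bbb"}, "cdn.example.net": {"ccc"}}
second_visit = {"example.com": {"aaa"}, "cdn.example.net": {"ccc", "ddd"}}
changes = diff_against_first(first_visit, second_visit)
```

Here `changes` keeps only `example.com`, because `bbb` appeared in the first visit and not in the second; files that are new in the later visit (`ddd`) are not reported, matching the rule that the value holds files present earlier but absent later.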
The present invention is not limited to the above examples; all technical solutions formed by equivalent substitution or equivalent replacement fall within the scope of protection claimed by the invention.

Claims (7)

1. A method for automatically obtaining website access information based on dynamic program analysis, characterized in that, based on dynamic program analysis, information about a user's access to a website in different time periods is obtained automatically, namely: first, instrumentation is used to instrument the JavaScript source code of the entire web page and obtain the user's dynamic behavior while accessing the page; then the dynamic behavior is collected and stored in a variable, while a proxy caches the web pages accessed by the user by domain name; then Selenium is used to obtain the JavaScript variable in the instrumented page, and a timer is set to access the same web page automatically multiple times; finally, the files obtained are compared to find differences in the user's dynamic behavior.
2. The method for automatically obtaining website access information based on dynamic program analysis according to claim 1, characterized by comprising the following steps:
1) using the JavaScript dynamic instrumentation tool Jalangi, and using Selenium to simulate the user accessing a web page, instrumenting the JavaScript code executed in the web page the user accesses, and adding a JS variable to the object that carries the required information; during page execution this JS variable is attached and passed along, thereby recording the user's dynamic behavior;
2) configuring a proxy server in the Firefox browser, so that the web page the user loads passes through the proxy server, which calls Jalangi to perform the instrumentation; and using the executeScript method of Selenium to obtain the information held in this variable;
3) using the mitmproxy proxy to call a Python script that caches the web pages the user accesses, the files being stored separately according to the domain name of the URL of the requested content, and the md5 code of each file being recorded;
4) simulating the user accessing the same website multiple times at timed intervals and recording the information in batches, that is, using Selenium to simulate the user accessing a website and using a timer to repeat this action multiple times;
5) analyzing the batches of files obtained through mitmproxy: because different files accessed by the user produce different md5 codes, differences in the user's dynamic behavior when accessing the same website can be identified by comparing md5 codes.
3. The method for automatically obtaining website access information according to claim 2, characterized in that, in step 1), the JavaScript code executed in the web page the user accesses is instrumented, namely:
the JavaScript dynamic instrumentation tool Jalangi is used, and the tool Selenium simulates the user accessing the web page; before starting the Firefox browser, Selenium configures the proxy server, including its IP address and port; the browser is then started, the URL the user is to access is passed in through the browser's get method, and the page is loaded; the JavaScript code executed for the user while the page loads is instrumented to obtain its dynamic characteristics, and a JS variable is added to the object that carries the required information; during page execution this JS variable is attached and passed along, thereby recording the user's dynamic behavior.
4. The method for automatically obtaining website access information according to claim 2, characterized in that, in step 2), the dynamic behavior of a single visit by the user is collected:
a proxy server is configured in the Firefox browser, so that the web page the user loads passes through the proxy server; the proxy server calls Jalangi to perform the instrumentation, and the page returned to the user contains the JS variable added in step 1); the executeScript method of Selenium is used to obtain the information held in this variable; the method returns a value of type Object, which can be converted into a format that is convenient to compare, such as JSON;
when Selenium is used to obtain the variable automatically, iframes need particular attention: a website may contain multiple iframes, and the user's dynamic behavior may span several of them; to obtain the variable inside an iframe, the browser's switchTo().frame() method must be used to switch into that iframe first; to move from one iframe to another, the switchTo().defaultContent() method must first be used to return to the default content, and only then can the target iframe be entered;
in addition, executeScript can be used in Selenium to run a JS script that obtains other required information, for example the URLs in the src attributes of script tags, and separate files can be created to store the different kinds of information obtained.
5. The method for automatically obtaining website access information according to claim 1, characterized in that, in step 3), a proxy is used to cache the web pages accessed by the user, namely:
the mitmproxy proxy calls a Python script to cache the web pages the user accesses; the files are stored separately according to the domain name of the URL of the requested content, and the md5 code of each file is recorded automatically; the recorded files are then copied to local storage; the moment at which the files are copied is critical, and a server is set up to listen on a port: when an access to the port is detected, copying starts, and once copying is finished the files cached by the mitmproxy proxy are deleted.
6. The method for automatically obtaining website access information according to claim 1, characterized in that, in step 4), the user is simulated accessing the same website multiple times at timed intervals and the information is recorded in batches, namely:
Selenium can open and close the Firefox browser automatically and can therefore simulate a user accessing a website; a timer repeats this action at timed intervals, so that the same user accesses the same website multiple times;
because the files cached by mitmproxy are recorded continuously, after each visit to the website the cached files are deleted before the website is visited again, so that each record corresponds to the result of exactly one visit; after the page has been accessed, the file copy is completed and this visit's cached files are deleted within a specified period during which the browser is closed, and the Java delay function Thread.sleep() is called in the Selenium program to ensure that the browser does not access the page during this period and continues to access the same page only after the cache has been deleted;
after the browser starts, a set waiting time elapses before the get() method is called with the URL to load the page; after the JS variable has been obtained with Selenium and saved, a further URL request is sent to the server as a signal to start copying the md5 files, the current time is passed as a parameter of this URL, and the server reads the value of the parameter and uses it as the file name.
7. The method for automatically obtaining website access information according to claim 1, characterized in that, in step 5), the user's dynamic behavior that has been obtained is analyzed and compared, namely:
the batches of files obtained through mitmproxy are analyzed: differences in the user's dynamic behavior when accessing the same website are identified by comparing md5 codes; successive visits by the user are distinguished by the time of each visit; the md5 codes of the files obtained in the first visit are compared with those of the files from each later visit, and in the comparison result the domain name is the key and the files that were present in the earlier visit but absent from the later one are the value; identical files have identical md5 codes, and the differing files form the subset of md5 values in which the user's dynamic behavior differs from that of the most recent visit.
CN201710033453.8A 2017-01-13 2017-01-13 Automatic website access information acquisition method based on program dynamic analysis Active CN108306918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710033453.8A CN108306918B (en) 2017-01-13 2017-01-13 Automatic website access information acquisition method based on program dynamic analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710033453.8A CN108306918B (en) 2017-01-13 2017-01-13 Automatic website access information acquisition method based on program dynamic analysis

Publications (2)

Publication Number Publication Date
CN108306918A true CN108306918A (en) 2018-07-20
CN108306918B CN108306918B (en) 2021-08-31

Family

ID=62872143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710033453.8A Active CN108306918B (en) 2017-01-13 2017-01-13 Automatic website access information acquisition method based on program dynamic analysis

Country Status (1)

Country Link
CN (1) CN108306918B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1987862A (en) * 2005-12-22 2007-06-27 国际商业机器公司 Method for analyzing state transition in web page
CN101515300A (en) * 2009-04-02 2009-08-26 阿里巴巴集团控股有限公司 Method and system for grabbing Ajax webpage content
CN101996196A (en) * 2009-08-28 2011-03-30 中国移动通信集团公司 Dynamic webpage acquisition method and device
CN103297469A (en) * 2012-02-25 2013-09-11 阿里巴巴集团控股有限公司 Method and device of collecting website data
US20160019229A1 (en) * 2014-07-15 2016-01-21 American Express Travel Related Services Company, Inc. Systems and methods for progressively launching websites
CN105468779A (en) * 2015-12-16 2016-04-06 中国科学院软件研究所 Browser compatibility detection oriented client Web application capture and playback system and method
CN106022132A (en) * 2016-05-30 2016-10-12 南京邮电大学 Real-time webpage Trojan detection method based on dynamic content analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李超琪: "基于插桩行为的动态检测技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109460529A (en) * 2018-10-15 2019-03-12 杭州安恒信息技术股份有限公司 A kind of front end micro services module communication means based on iframe
CN109828921A (en) * 2019-01-18 2019-05-31 上海极链网络科技有限公司 HTML5 webpage automated function test method, system and electronic equipment
CN112965873A (en) * 2021-03-04 2021-06-15 中国邮政储蓄银行股份有限公司 Page processing method and device, storage medium and processor
CN117422427A (en) * 2023-12-19 2024-01-19 广东省建设工程质量安全检测总站有限公司 Online batch analysis method and system for low-strain data
CN117422427B (en) * 2023-12-19 2024-03-15 广东省建设工程质量安全检测总站有限公司 Online batch analysis method and system for low-strain data

Also Published As

Publication number Publication date
CN108306918B (en) 2021-08-31

Similar Documents

Publication Publication Date Title
CN104766014B (en) For detecting the method and system of malice network address
US7770068B2 (en) Systems and methods for website monitoring and load testing via simulation
CN106503134B (en) Browser jumps to the method for data synchronization and device of application program
CN102385594B (en) The kernel control method of multi-core browser and device
US7343390B2 (en) Systems and methods for conducting internet content usage experiments
CN101354721B (en) Server,data processing device and method thereof
CN108306918A (en) A kind of website access information automatic obtaining method based on dynamically analyzing of program
US20100064234A1 (en) System and Method for Browser within a Web Site and Proxy Server
CN104426925B (en) Web page resources acquisition methods and device
WO2010096211A1 (en) Method and system of processing cookies across domains
CN102857369B (en) Website log saving system, method and apparatus
US7987243B2 (en) Method for media discovery
CN111163054B (en) Method and device for detecting malicious behavior of webpage
CN106603296A (en) Log processing method and device
CN106776318A (en) A kind of test script method for recording and system
CN110555146A (en) method and system for generating network crawler camouflage data
CN106599270B (en) Network data capturing method and crawler
CN108334619A (en) A kind of collecting method, device, computing device and storage medium
CN110851681A (en) Crawler processing method and device, server and computer readable storage medium
CN112637361A (en) Page proxy method, device, electronic equipment and storage medium
CN103458065A (en) Method for extracting video address based on Webkit kernel under HTML5 standard
CN103618760B (en) Processing method of cookie information in browser and browser
CN112612943A (en) Asynchronous processing framework-based data crawling method with automatic testing function
CN110633432A (en) Method, device, terminal equipment and medium for acquiring data
CN103117892A (en) Method and device for adding website access record

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20180720

Assignee: Jiangsu Yanan Information Technology Co.,Ltd.

Assignor: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG

Contract record no.: X2023980047097

Denomination of invention: A method for automatically obtaining website access information based on program dynamic analysis

Granted publication date: 20210831

License type: Common License

Record date: 20231117

EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20180720

Assignee: Yancheng Nongfu Technology Co.,Ltd.

Assignor: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG

Contract record no.: X2023980049126

Denomination of invention: A method for automatically obtaining website access information based on program dynamic analysis

Granted publication date: 20210831

License type: Common License

Record date: 20231203

Application publication date: 20180720

Assignee: Yanmi Technology (Yancheng) Co.,Ltd.

Assignor: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG

Contract record no.: X2023980049119

Denomination of invention: A method for automatically obtaining website access information based on program dynamic analysis

Granted publication date: 20210831

License type: Common License

Record date: 20231203