TWI402694B - System and method for controlling downloading data of webpage - Google Patents


Publication number
TWI402694B
Authority
TW
Taiwan
Prior art keywords
data
download
list
downloaded
module
Prior art date
Application number
TW96131388A
Other languages
Chinese (zh)
Other versions
TW200910111A (en)
Inventor
Chung I Lee
Chien Fa Yeh
Da-Peng Li
Zhi-Hong Li
Original Assignee
Hon Hai Prec Ind Co Ltd
Priority date
Filing date
Publication date
Application filed by Hon Hai Prec Ind Co Ltd
Priority to TW96131388A
Publication of TW200910111A
Application granted
Publication of TWI402694B

Landscapes

  • Information Transfer Between Computers (AREA)

Description

Web page data download control system and method

The present invention relates to a system and method for controlling the downloading of web page data.

With the spread of the Internet, downloading data from websites into a database has become increasingly common. The data to be downloaded is usually not shown on a single page but is split across many pages. In practice, every download run executes from beginning to end, that is, from the first page through the last page. Data that has already been downloaded is processed again, which is inefficient and wastes network resources.

In view of the above, it is necessary to provide a web page data download control system that controls page turning by comparing the number of items in the download data list with the number of download commands generated.

It is also necessary to provide a web page data download control method that controls page turning by comparing the number of items in the download data list with the number of download commands generated.

A web page data download control system comprises an application server and a database connected to the application server. The application server comprises: a setting module for setting data download control parameters, the parameters including the website address of the data to be downloaded and the time range within which downloading is permitted; a parsing module for parsing the data list of a web page under the set website address; an acquisition module for acquiring the data information of the parsed data list, the data information including the publication time of each item; a calculation module for counting all items in the data list; a determination module for determining, from the acquired data information, whether each item in the data list has already been downloaded to the database and, from each item's publication time, whether the item falls within the permitted download time range; a generation module for generating scripting-language download commands for the items in the data list that fall within the permitted download time range and have not been downloaded; and a download module for downloading the corresponding data using the generated scripting-language download commands and saving the downloaded data to the database. The calculation module also counts the generated scripting-language download commands, and the determination module also determines whether the number of items in the data list equals the number of generated scripting-language download commands. A page-turning module performs the page-turning action on the web page when the number of items in the data list equals the number of generated scripting-language download commands.

A web page data download control method comprises the following steps: setting data download control parameters, the parameters including the website address of the data to be downloaded and the time range within which downloading is permitted; parsing the data list of a web page under the set website address; acquiring the data information of the parsed data list, the data information including the publication time of each item; counting all items in the data list; determining, from the acquired data information, whether each item in the data list has already been downloaded to the database and, from each item's publication time, whether the item falls within the permitted download time range; if an item in the data list has not been downloaded and its publication time falls within the permitted download time range, generating a scripting-language download command for that item; downloading the corresponding data using the generated scripting-language download commands and saving the downloaded data to the database; counting the generated scripting-language download commands; determining whether the number of all items in the data list equals the number of generated scripting-language download commands; and, if the two numbers are equal, performing the page-turning action.

Compared with the prior art, the web page data download control system and method decide whether to turn the page according to whether the parsed web page contains data that has already been downloaded. This avoids running the program from the first page to the last page every time, improves download efficiency, and saves network resources.

FIG. 1 is a hardware architecture diagram of a preferred embodiment of the web page data download control system of the present invention. The system includes an application server 1, a database 2, a client 3, a firewall 4, and a network 5. The application server 1 is connected to the network 5 through the firewall 4; it downloads data published on websites and saves the downloaded data to the database 2. The application server 1 may be a personal computer, a network server, or any other suitable data processing device. The firewall 4 controls the information security of the network 5. The network 5 may be the Internet or a local area network.

The application server 1 is connected to the database 2, which stores the data downloaded by the application server 1. The database 2 may be built into the application server 1 or located outside it.

In addition, the application server 1 is connected to at least one client 3. The client 3 provides an interactive interface through which the user enters download information for the application server 1.

FIG. 2 is a functional module diagram of the application server of FIG. 1. The application server 1 includes a setting module 10, a parsing module 12, a calculation module 14, a determination module 16, an acquisition module 18, a generation module 20, a download module 22, and a page-turning module 24.

The setting module 10 sets the data download control parameters and saves them to the database 2. The parameters include the website address from which data is to be downloaded, the time range within which downloading is permitted, and the path under which downloaded data is saved.
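The three control parameters can be pictured as one small record. The following is a minimal sketch, not the patented implementation: the class name, the field names, the placeholder URL, and the choice of an inclusive date range are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DownloadControlParams:
    """Hypothetical container for the parameters set by the setting module."""
    site_url: str        # website address (URL) to download from
    allowed_from: date   # start of the permitted time range (assumed inclusive)
    allowed_to: date     # end of the permitted time range (assumed inclusive)
    save_path: str       # path under which downloaded data is saved

    def __post_init__(self):
        # Reject an empty or reversed range early.
        if self.allowed_from > self.allowed_to:
            raise ValueError("allowed_from must not be later than allowed_to")

    def allows(self, published: date) -> bool:
        """Return True if a publication date lies inside the permitted range."""
        return self.allowed_from <= published <= self.allowed_to


# Example values; the URL is a placeholder, not taken from the patent.
params = DownloadControlParams(
    site_url="https://news.example.com/tech/",
    allowed_from=date(2007, 8, 1),
    allowed_to=date(2007, 8, 24),
    save_path="downloads/tech",
)
```

Keeping the range check on the record itself means the later per-item test only needs the item's publication date.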

The parsing module 12 parses the data list of a web page under the set website address. The data information of the data list includes the date, the address, and the title of each downloadable item.

The calculation module 14 counts all items in the data list of the current web page and counts the generated scripting-language download commands. In this embodiment, the scripting language is XQuery.

The determination module 16 determines whether a next item exists in the data list of the current web page.

The acquisition module 18 acquires the data information of the parsed data list.

The determination module 16 also determines whether each item in the data list of the current web page has already been downloaded to the database and, from each item's publication time, whether the item falls within the permitted download time range.

The generation module 20 generates scripting-language (XQuery) commands for the items in the current page's data list that fall within the permitted download time range and have not been downloaded. In this embodiment, when an item in the data list is parsed, the generation module 20 generates one XQuery command if the item has not been downloaded and falls within the permitted download date range. The number of items downloaded equals the number of commands generated by the generation module 20.

The determination module 16 also determines whether every item in the data list of the current web page has been checked.

The download module 22 downloads the corresponding data using the generated scripting-language download commands and saves the downloaded data to the application server 1 as an Extensible Markup Language (XML) file.

The determination module 16 also determines whether the number of items in the current page's data list is greater than the number of generated XQuery commands.

The page-turning module 24 performs the page-turning action on the web page when the number of items in the current page's data list equals the number of generated scripting-language download commands.

FIG. 3 is a flowchart of a preferred embodiment of the web page data download control method of the present invention. First, in step S10, the setting module 10 sets the data download control parameters and saves them to the database 2. The parameters include the website address from which data is downloaded, the time range within which downloading is permitted, and the path under which downloaded data is saved. The website address is a URL. For example, to download technology news from the Sina website, the setting module 10 sets in the system the URL of the technology news section of the Sina website.

Step S11: the parsing module 12 parses the data list of the current web page under the set website address to obtain the data information of the data list. The data list of the current web page is a list containing multiple items. The data information of the data list includes the publication date, the address, and the title of each downloadable item in the list. Data available for download on the website is displayed on the page as a data list that holds the items one by one, and the parsing module 12 parses the information of each item stored in the data list of the web page.
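As an illustration of step S11, the sketch below extracts the title, address, and publication date of each item from one page. The embodiment does not describe the page markup, so the `<li>` / `<a>` / `<span class="date">` layout, the class name `ListParser`, and the helper `parse_list` are assumptions; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class ListParser(HTMLParser):
    """Collects one record per <li> in a hypothetical list page."""

    def __init__(self):
        super().__init__()
        self.items = []      # finished records
        self._cur = None     # record being built, or None outside <li>
        self._field = None   # field that incoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self._cur = {"title": "", "url": "", "published": ""}
        elif self._cur is not None and tag == "a":
            self._cur["url"] = attrs.get("href") or ""
            self._field = "title"
        elif self._cur is not None and tag == "span" and attrs.get("class") == "date":
            self._field = "published"

    def handle_data(self, data):
        if self._cur is not None and self._field:
            self._cur[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._cur is not None:
            self.items.append(self._cur)
            self._cur = None


def parse_list(html_text):
    """Return the list of item records found in one page of markup."""
    parser = ListParser()
    parser.feed(html_text)
    parser.close()
    return parser.items


SAMPLE_PAGE = (
    '<ul>'
    '<li><a href="/tech/1.html">First story</a> <span class="date">2007-08-20</span></li>'
    '<li><a href="/tech/2.html">Second story</a> <span class="date">2007-08-19</span></li>'
    '</ul>'
)
items = parse_list(SAMPLE_PAGE)
```

The length of the returned list is the item count of step S12.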

Step S12: the calculation module 14 counts all items in the data list of the current web page.

Step S13: the determination module 16 determines whether a next item exists in the data list of the current web page.

Step S14: if a next item exists in the data list of the current web page, the acquisition module 18 acquires the basic information of that item, including its publication date, download address, and title.

Step S15: the determination module 16 determines whether that item has already been downloaded to the database. Specifically, the database 2 is queried by the item's title and download address (URL). If the database 2 contains the item, the item has already been downloaded; if the database 2 does not contain it, the item has not yet been downloaded.
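The lookup in step S15 amounts to an existence query keyed on title and URL. A minimal sketch with Python's built-in `sqlite3` module follows; the table name `items`, its two columns, and the function names are assumptions, since the embodiment does not describe the schema of the database 2.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical store of already-downloaded items."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " title TEXT NOT NULL,"
        " url TEXT NOT NULL,"
        " PRIMARY KEY (title, url))"
    )
    return conn


def is_downloaded(conn, title, url):
    """Return True if an item with this title and URL is already stored."""
    row = conn.execute(
        "SELECT 1 FROM items WHERE title = ? AND url = ? LIMIT 1",
        (title, url),
    ).fetchone()
    return row is not None


def mark_downloaded(conn, title, url):
    """Record an item as downloaded; repeating the call is harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO items (title, url) VALUES (?, ?)",
        (title, url),
    )
    conn.commit()
```

Using both title and URL as the key follows the text of step S15; a real deployment might prefer the URL alone if titles can be edited after publication.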

Step S16: if the item has not yet been downloaded to the database, the determination module 16 determines whether the item's publication time falls within the set time range in which downloading is permitted.

Step S17: if the item's publication time falls within the set time range, the generation module 20 generates an XQuery command for downloading that item. One item corresponds to one XQuery command.
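Steps S15 to S17 together act as a filter that emits one command per qualifying item. The sketch below shows that filter under stated assumptions: the text of the XQuery command is not given in the embodiment, so `DownloadCommand` is only a stand-in record holding the URL, and `Item`, `build_commands`, and the inclusive range are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Item:
    title: str
    url: str
    published: date


@dataclass(frozen=True)
class DownloadCommand:
    """Stand-in for one XQuery download command (its real text is not given)."""
    url: str


def build_commands(items, already_downloaded, start, end):
    """Return one command per item that is new and inside [start, end]."""
    commands = []
    for item in items:
        if already_downloaded(item):                 # S15: skip known items
            continue
        if not (start <= item.published <= end):     # S16: skip out-of-range items
            continue
        commands.append(DownloadCommand(item.url))   # S17: one command per item
    return commands


page = [
    Item("New, in range", "/tech/3.html", date(2007, 8, 20)),
    Item("Already stored", "/tech/2.html", date(2007, 8, 19)),
    Item("Too old", "/tech/1.html", date(2007, 6, 1)),
]
stored_urls = {"/tech/2.html"}
commands = build_commands(
    page,
    already_downloaded=lambda item: item.url in stored_urls,
    start=date(2007, 8, 1),
    end=date(2007, 8, 24),
)
```

Because every command comes from exactly one item, the command count can never exceed the item count, which is what makes the later comparison meaningful.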

Step S18: the determination module 16 determines whether every item in the data list of the current web page has been checked.

Step S19: when every item in the data list of the current web page has been checked, the download module 22 downloads the corresponding data of the current web page using the generated XQuery commands and saves the downloaded data to the application server 1 as an Extensible Markup Language (XML) file under the set save path.
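Saving downloaded items as XML, as in step S19, could look like the sketch below. The element and attribute names (`items`, `item`, `title`, `content`, `url`, `published`) are invented for illustration because the embodiment does not define the file layout; the code relies only on Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def items_to_xml(items):
    """Serialize downloaded items to an XML string (hypothetical layout)."""
    root = ET.Element("items")
    for it in items:
        el = ET.SubElement(root, "item", url=it["url"], published=it["published"])
        ET.SubElement(el, "title").text = it["title"]
        ET.SubElement(el, "content").text = it["content"]
    # encoding="unicode" makes tostring return str instead of bytes.
    return ET.tostring(root, encoding="unicode")


xml_text = items_to_xml([
    {
        "title": "Chips & boards",
        "url": "https://news.example.com/tech/3.html",  # placeholder URL
        "published": "2007-08-20",
        "content": "Body text of the story.",
    }
])
```

ElementTree escapes characters such as `&` on output, so titles can be passed in unescaped; writing `xml_text` to a file under the configured save path is left out of the sketch.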

Step S20: the calculation module 14 counts the XQuery download commands generated for the data list of the current web page.

Step S21: the determination module 16 determines whether the number of items in the data list of the current web page is greater than the number of generated XQuery commands.

Step S22: if the number of all items in the data list of the current web page equals the number of generated XQuery commands, the page-turning module 24 performs the page-turning action on the web page, and the flow returns to step S11.

In step S13, if no next item exists in the data list of the current web page, the flow goes to step S20.

In step S15, if the item in the data list of the current web page has already been downloaded, the flow returns to step S13.

In step S16, if the item's publication time does not fall within the set time range in which downloading is permitted, the flow returns to step S13.

In step S18, if some items in the data list of the current web page have not yet been checked, the flow returns to step S13.

In step S22, if the number of items in the data list of the current web page does not equal the number of generated XQuery commands, the flow ends.
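The whole flow of FIG. 3 reduces to a loop that keeps turning pages only while every item on the current page produced a download command. The following sketch runs that loop over in-memory pages under stated assumptions: the function name `crawl`, the tuple layout of an item, and the use of a set in place of the database are hypothetical, and the actual download is replaced by recording the item as stored.

```python
from datetime import date


def crawl(pages, downloaded, start, end):
    """Walk pages in order and return how many pages were visited.

    pages: list of pages, each a list of (title, url, published) tuples.
    downloaded: set of (title, url) pairs already stored; updated in place.
    """
    visited = 0
    for items in pages:                          # S11: parse the current page
        visited += 1
        total = len(items)                       # S12: count items on the page
        commands = []
        for title, url, published in items:      # S13/S14: next item, if any
            if (title, url) in downloaded:       # S15: already downloaded
                continue
            if not (start <= published <= end):  # S16: outside permitted range
                continue
            commands.append((title, url))        # S17: one command per item
        downloaded.update(commands)              # S19: stand-in for the download
        if total != len(commands):               # S20-S22: counts differ, stop
            break
    return visited


pages = [
    [("a", "/a", date(2007, 8, 21)), ("b", "/b", date(2007, 8, 20))],
    [("c", "/c", date(2007, 8, 10)), ("old", "/old", date(2007, 8, 9))],
    [("never", "/never", date(2007, 8, 1))],
]
seen = {("old", "/old")}
visited = crawl(pages, seen, date(2007, 8, 1), date(2007, 8, 24))
```

In this run the second page contains an item that is already stored, so the counts differ there and the third page is never requested. The sketch stops when the page list is exhausted; how the described flow behaves on an empty final page is not stated in the embodiment.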

Application server ... 1

Database ... 2

Client ... 3

Firewall ... 4

Network ... 5

Setting module ... 10

Parsing module ... 12

Calculation module ... 14

Determination module ... 16

Acquisition module ... 18

Generation module ... 20

Download module ... 22

Page-turning module ... 24

FIG. 1 is a hardware architecture diagram of a preferred embodiment of the web page data download control system of the present invention.

FIG. 2 is a functional module diagram of the application server of FIG. 1.

FIG. 3 is a flowchart of a preferred embodiment of the web page data download control method of the present invention.

Set parameters ... S10

Parse the data list information of the current page ... S11

Count the items in the current page's list ... S12

Does a next item exist? ... S13

Acquire the basic information of the item ... S14

Has the acquired item already been downloaded? ... S15

Is the item's publication time within the set time range? ... S16

Generate an XQuery command for downloading the item ... S17

Have all items been checked? ... S18

Download the current page's data and save it to the database ... S19

Count the XQuery download commands generated for the current page ... S20

Does the number of list items equal the number of generated XQuery commands? ... S21

Turn to the next page ... S22

Claims (7)

1. A web page data download control system, comprising an application server and a database connected to the application server, wherein the application server comprises: a setting module for setting data download control parameters, the parameters including the website address of the data to be downloaded and the time range within which downloading is permitted; a parsing module for parsing the data list of a web page under the set website address; an acquisition module for acquiring the data information of the parsed data list, the data information including the publication time of each item; a calculation module for counting all items in the data list; a determination module for determining, from the acquired data information, whether each item in the data list has already been downloaded to the database and, from each item's publication time, whether the item falls within the permitted download time range; a generation module for generating scripting-language download commands for the items in the data list that fall within the permitted download time range and have not been downloaded; a download module for downloading the corresponding data using the generated scripting-language download commands and saving the downloaded data to the database; the calculation module being further used for counting the generated scripting-language download commands; the determination module being further used for determining whether the number of items in the data list equals the number of generated scripting-language download commands; and a page-turning module for performing the page-turning action on the web page when the number of items in the data list equals the number of generated scripting-language download commands.

2. The web page data download control system of claim 1, wherein the data download control parameters further include a path for saving the downloaded data, and the download module, when downloading data, downloads the data to the database according to that path.

3. The web page data download control system of claim 1, wherein the data information of the data list includes the download address and the title of each item, and the determination module uses the download address and the title of each item to determine whether the item has already been downloaded to the database.

4. A web page data download control method, comprising the following steps: setting data download control parameters, the parameters including the website address of the data to be downloaded and the time range within which downloading is permitted; parsing the data list of a web page under the set website address; acquiring the data information of the parsed data list, the data information including the publication time of each item; counting all items in the data list; determining, from the acquired data information, whether each item in the data list has already been downloaded to the database and, from each item's publication time, whether the item falls within the permitted download time range; if an item in the data list has not been downloaded and its publication time falls within the permitted download time range, generating a scripting-language download command for that item; downloading the corresponding data using the generated scripting-language download commands and saving the downloaded data to the database; counting the generated scripting-language download commands; determining whether the number of all items in the data list equals the number of generated scripting-language download commands; and, if the number of all items in the data list equals the number of generated scripting-language download commands, performing the page-turning action.

5. The web page data download control method of claim 4, further comprising the step of ending the flow if the number of all items in the data list does not equal the number of generated scripting-language download commands.

6. The web page data download control method of claim 4, wherein the data download control parameters further include a path for saving the downloaded data, and the data is downloaded to the database according to that path.

7. The web page data download control method of claim 4, wherein the data information of the data list further includes the download address and the title of each item in the data list, and the download address and the title of each item are used to determine whether the item has already been downloaded to the database.

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW96131388A TWI402694B (en) 2007-08-24 2007-08-24 System and method for controlling downloading data of webpage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW96131388A TWI402694B (en) 2007-08-24 2007-08-24 System and method for controlling downloading data of webpage

Publications (2)

Publication Number Publication Date
TW200910111A TW200910111A (en) 2009-03-01
TWI402694B true TWI402694B (en) 2013-07-21

Family

ID=44724258

Family Applications (1)

Application Number Title Priority Date Filing Date
TW96131388A TWI402694B (en) 2007-08-24 2007-08-24 System and method for controlling downloading data of webpage

Country Status (1)

Country Link
TW (1) TWI402694B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103885965A (en) * 2012-12-21 2014-06-25 鸿富锦精密工业(深圳)有限公司 Page loading management method and page loading management system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060036657A1 (en) * 2004-08-10 2006-02-16 Palo Alto Research Center Incorporated Full-text search integration in XML database
TW200712947A (en) * 2005-09-28 2007-04-01 Inventec Appliances Corp Computer netwok information system, search system and search method thereof

Also Published As

Publication number Publication date
TW200910111A (en) 2009-03-01

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees