CN101179525A - Method, system and device for obtaining file information - Google Patents
- Publication number
- CN101179525A
- Authority
- CN
- China
- Prior art keywords
- file
- download
- client
- fileinfo
- binary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The invention relates to network download technology, and in particular to a method, system and device for obtaining file information. It addresses a problem in the prior art: a user cannot obtain accurate information about a file before downloading it, so the quality of the downloaded file cannot be guaranteed. In the invention, a client obtains the file identifier of a binary file whose download has not been completed; a server determines the file information corresponding to that file identifier according to an established correspondence between file identifiers and file information; and the client displays the file information determined by the server to the user. With the invention, the user can learn the quality of a file before downloading it, which reduces re-downloads caused by poor-quality files and saves the user's time and download bandwidth.
Description
Technical field
The present invention relates to network download technology, and in particular to a method, system and device for obtaining file information.
Background technology
With the continuous development of Internet technology, network bandwidth keeps increasing. Higher network speeds lead more and more users to download the files they need over the network.
However, when downloading a file from the Internet, a user can learn about its content only from the description on the download page. That description may be very inaccurate, so the user cannot know in advance what the file actually contains. For example, when a user downloads a film, the information on the download page may differ from the film actually downloaded: the page may describe a clear version with Chinese subtitles, while the downloaded film is blurry and has no subtitles, or the downloaded file may not even be the file the page describes.
These problems are usually discovered only after the download has finished. The user then typically deletes the file and looks for another download address, which wastes the user's time and download bandwidth.
In summary, the user has no way of knowing the actual content of a file before downloading it. If the downloaded file turns out not to meet the user's requirements, time and download bandwidth have been wasted.
Summary of the invention
Embodiments of the invention provide a method, system and device for obtaining file information, in order to solve the prior-art problem that a user cannot obtain accurate information about a file before downloading it, so that the quality of the downloaded file cannot be guaranteed.
A method for obtaining file information provided by an embodiment of the invention comprises:
a client obtains the file identifier of a binary file whose download has not been completed;
a server determines the file information corresponding to the file identifier obtained by the client, according to an established correspondence between file identifiers and file information;
the client displays the file information determined by the server to the user.
A system for obtaining file information provided by an embodiment of the invention comprises:
a client, used to obtain the file identifier of a binary file whose download has not been completed, and to display the file information determined by the server to the user;
a server, used to determine the file information corresponding to the obtained file identifier, according to an established correspondence between file identifiers and file information.
A client provided by an embodiment of the invention comprises:
an acquisition module, used to obtain the file identifier of a binary file whose download has not been completed;
a display module, used to display the file information determined by the server to the user.
A server provided by an embodiment of the invention comprises:
an establishing module, used to establish the correspondence between file identifiers and file information;
a determining module, used to determine the file information corresponding to the file identifier obtained by the client, according to the correspondence established by the establishing module.
In the embodiments of the invention, the client obtains the file identifier of a binary file whose download has not been completed; the server determines the file information corresponding to that file identifier according to an established correspondence between file identifiers and file information; and the client displays the file information determined by the server to the user. The user can therefore learn the quality of a file before downloading it, which reduces cases in which a poor-quality file is downloaded and another file must then be downloaded, saving the user's time and download bandwidth.
Description of drawings
Fig. 1 is a schematic structural diagram of the system for obtaining file information according to an embodiment of the invention;
Fig. 2A is a schematic structural diagram of the client according to an embodiment of the invention;
Fig. 2B is a schematic structural diagram of the server according to an embodiment of the invention;
Fig. 3 is a schematic flow chart of the method for obtaining file information according to an embodiment of the invention.
Embodiment
In this embodiment, the file information corresponding to the file identifier of the binary file the user is downloading is determined according to an established correspondence between file identifiers and file information, so that before downloading the file the user can judge whether its content meets their needs.
The file identifier is used to identify a file. It may be a file name, the Uniform Resource Locator (URL) at which the file is stored, a content signature (Content Identity, CID), or the like. The content signature CID is obtained by applying a preset algorithm to the content data of the binary file. The preset algorithm may be any algorithm that produces different results for the content data of different binary files, so that the result (the content signature) uniquely identifies the binary file; alternatively, it may be an algorithm whose results repeat at an extremely low, acceptable rate.
The file information includes, but is not limited to, one or more of the following:
file title, content description, download time, release date, preview pictures, video clips, and so on.
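The fields listed above can be pictured as a simple record. The following is only an illustrative sketch: the class name `FileInfo`, the field names, and the choice of strings and lists as field types are assumptions made here, not something the patent specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileInfo:
    """Hypothetical record holding the kinds of file information listed above."""
    title: str                                # file title
    description: str = ""                     # content description
    download_time: Optional[str] = None       # "download time" as named in the text
    release_date: Optional[str] = None        # release date
    preview_images: List[str] = field(default_factory=list)  # preview pictures
    video_clips: List[str] = field(default_factory=list)     # video clips


# Example record; the values are made up for illustration.
info = FileInfo(title="Example film",
                description="Clear version with Chinese subtitles")
```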
The embodiments of the invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the system for obtaining file information according to an embodiment of the invention comprises a client 10 and a server 11.
The client 10 is connected to the server 11 and is used to obtain the file identifier of a binary file whose download has not been completed, and to display the file information determined by the server 11 to the user.
A binary file whose download has not been completed may be one for which a download task has been created but downloading has not yet begun, or one whose download has begun but not yet finished.
The client 10 may further comprise an acquisition module 100 and a display module 101.
The acquisition module 100 is used to obtain the file identifier of a binary file whose download has not been completed.
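The acquisition module 100 may further comprise a checking module 1000 and a processing module 1001.
The checking module 1000 is used to check whether the user has created a task for downloading a binary file.
The processing module 1001 is used to obtain the file identifier of the binary file to be downloaded in the task when the user creates a task for downloading a binary file.
The display module 101 is used to display the file information determined by the server 11 to the user.
Specifically, the display position can be set as needed; for example, the information may be shown on the new-download-task panel or in the web browser.
The client 10 may further comprise a judging module 102.
The judging module 102 is used to check, after the display module 101 has displayed the determined file information to the user, whether the user chooses to download the binary file; if so, the binary file is downloaded; otherwise, the download is cancelled.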
The server 11 is connected to the client 10 and is used to determine the file information corresponding to the file identifier obtained by the client 10, according to a pre-established correspondence between file identifiers and file information.
The server 11 may further comprise an establishing module 110 and a determining module 120.
The establishing module 110 is used to establish the correspondence between file identifiers and file information.
The determining module 120 is used to determine, according to the correspondence established by the establishing module 110, the file information corresponding to the file identifier obtained by the client 10.
In a specific implementation, the file identifier and the file information must each be determined before the correspondence between them is established.
There are many ways to determine the file identifier. For example, a hash operation may be performed on the content data of each binary file to obtain a hash value of the file content; this hash value can uniquely represent the content of the binary file.
The hash algorithm may be Message-Digest Algorithm 5 (MD5), MD4, the Secure Hash Algorithm (SHA), the Secure Hash Standard (SHS), or a similar algorithm.
One way to compute the file content signature is to take 20 KB of data from each of the front, middle and end of the binary file (or any other several segments of the file), hash these three parts together with one of the aforementioned algorithms to obtain a single value, and use that value as the file identifier.
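A minimal sketch of this sampled-hash approach follows. The function name `sampled_file_id`, the exact position of the "middle" segment, the use of MD5, and the fallback that hashes small files whole are all assumptions made for illustration; the text fixes none of them.

```python
import hashlib

SEGMENT_SIZE = 20 * 1024  # 20 KB per sampled segment, as in the example above


def sampled_file_id(data: bytes, segment_size: int = SEGMENT_SIZE) -> str:
    """Hash the front, middle and end segments of `data` together (hypothetical helper)."""
    n = len(data)
    if n <= 3 * segment_size:
        # Assumption: a file too small to sample three segments is hashed whole.
        return hashlib.md5(data).hexdigest()
    mid_start = (n - segment_size) // 2  # assumption: middle segment is centred
    h = hashlib.md5()
    h.update(data[:segment_size])                       # front segment
    h.update(data[mid_start:mid_start + segment_size])  # middle segment
    h.update(data[-segment_size:])                      # end segment
    return h.hexdigest()
```

Because only three segments are read, two files that differ solely outside those segments receive the same identifier. This is the kind of low but non-zero repetition rate that the description of the preset algorithm allows for.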
Another way to compute the file content signature is to divide the content data of the binary file into N blocks of a fixed length (for example 20 KB, 30 KB or any other value) and hash each block with one of the aforementioned algorithms; each resulting value is called a block content fingerprint (Block Content Identity, BCID). All the BCIDs are then hashed together once more, and the resulting value, called the global content fingerprint (Global Content Identity, GCID), is used as the content signature of the file.
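The block-then-global scheme can be sketched as below. The helper names `block_ids` and `global_id`, the use of MD5, and the decision to concatenate the raw digest bytes before the final hash are assumptions for illustration only.

```python
import hashlib
from typing import List

BLOCK_SIZE = 20 * 1024  # 20 KB blocks, one of the example lengths in the text


def block_ids(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Return one BCID (raw MD5 digest) per fixed-length block; the last block may be shorter."""
    return [hashlib.md5(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def global_id(data: bytes, block_size: int = BLOCK_SIZE) -> str:
    """Hash all BCIDs together once more to obtain the GCID (hex string)."""
    # Assumption: the BCIDs are joined as raw bytes, in block order.
    return hashlib.md5(b"".join(block_ids(data, block_size))).hexdigest()
```

Unlike the three-segment approach, every byte of the file contributes to the GCID, and the per-block BCIDs remain available if individual blocks need to be compared.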
Of course, the method the client 10 uses to obtain the file identifier of a binary file whose download has not been completed must be the same as the method the server 11 uses to determine file identifiers.
For example, suppose the server 11 determines a file identifier by taking 32 KB of data from each of the head, middle and tail of a file, computing the MD5 value of each, concatenating the three MD5 values in order, and computing the MD5 of the concatenated data, with the result used as the file identifier. The client 10 must then do the same for the binary file being downloaded: take 32 KB from each of the head, middle and tail, compute the MD5 value of each, concatenate the three values in order, and compute the MD5 of the concatenation, using the result as the file identifier of the file being downloaded.
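The worked example above might look like the following when both sides share one function. The name `three_point_md5_id`, the centred position of the middle chunk, and the concatenation of raw 16-byte digests (rather than hex strings) are assumptions; what matters, as the text says, is that client and server make the same choices.

```python
import hashlib

CHUNK = 32 * 1024  # 32 KB taken from the head, middle and tail


def three_point_md5_id(data: bytes, chunk: int = CHUNK) -> str:
    """MD5 each of three chunks, concatenate the digests in order, then MD5 the result."""
    n = len(data)
    mid_start = max((n - chunk) // 2, 0)  # assumption: middle chunk is centred
    parts = [
        data[:chunk],                      # head
        data[mid_start:mid_start + chunk], # middle
        data[-chunk:],                     # tail
    ]
    digests = [hashlib.md5(p).digest() for p in parts]
    return hashlib.md5(b"".join(digests)).hexdigest()


# Client and server call the same function, so identical content yields identical identifiers.
content = bytes(range(256)) * 512
server_side_id = three_point_md5_id(content)
client_side_id = three_point_md5_id(content)
```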
The file information is determined in one or more of the following ways:
periodically searching web pages for information describing files, for example by collecting it with a web crawler (spider);
users publishing file information through an information publishing platform, for example one that provides a user interface for entering file information;
download software collecting file-related information (that is, when a user downloads the file corresponding to a Uniform Resource Locator (URL) with the download software, the client download software collects the descriptive information on the web page for that URL's file).
Several different pieces of file information may be obtained for the same file by the above methods. The most accurate one can be determined by calculating a weight for each. For example, when multiple pieces of file information have been obtained for one file, the number of occurrences of each is counted, and the one that occurs most often is taken as the most accurate description of the file.
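The occurrence-count weighting in the example can be sketched in a few lines. The function name `most_common_info` and the representation of each candidate description as a plain string are assumptions; ties are resolved in favour of the description seen first, which is simply how `collections.Counter` behaves and not something the text prescribes.

```python
from collections import Counter
from typing import Optional, Sequence


def most_common_info(candidates: Sequence[str]) -> Optional[str]:
    """Return the description that occurs most often among those collected for one file."""
    if not candidates:
        return None  # nothing collected for this file
    return Counter(candidates).most_common(1)[0][0]


# Descriptions gathered for the same file from different sources (made-up values).
collected = [
    "clear version, Chinese subtitles",
    "blurry, no subtitles",
    "clear version, Chinese subtitles",
]
best = most_common_info(collected)
```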
Of course, the ways of obtaining file information in this embodiment are not limited to those above; any way of obtaining file information is applicable to this embodiment.
In a specific implementation, the entity that stores the correspondence between file identifiers and file information may be a database, a file, or some other form that the server 11 can query; the correspondence may also be stored in the server 11 itself as required.
If a database is used, it can be implemented with relational database technology. For example, relational database software can be installed on the server, and the application programming interface (API) provided by the relational database vendor can be used. Relational databases generally use Structured Query Language (SQL) as the interface for managing database content.
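As one possible illustration of a relational store for the correspondence, the sketch below uses SQLite through Python's standard `sqlite3` module. The table name `file_info`, its columns, and the helper names `open_store`, `put_info` and `get_info` are hypothetical; the patent does not name a database product or schema.

```python
import sqlite3
from typing import Dict, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store mapping file identifiers to file information."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_info ("
        " file_id TEXT PRIMARY KEY,"   # the file identifier, e.g. a content signature
        " title TEXT NOT NULL,"
        " description TEXT)"
    )
    return conn


def put_info(conn: sqlite3.Connection, file_id: str, title: str, description: str) -> None:
    """Establish (or update) the correspondence for one file identifier."""
    conn.execute(
        "INSERT OR REPLACE INTO file_info (file_id, title, description) VALUES (?, ?, ?)",
        (file_id, title, description),
    )
    conn.commit()


def get_info(conn: sqlite3.Connection, file_id: str) -> Optional[Dict[str, str]]:
    """Look up the file information for an identifier sent by a client."""
    row = conn.execute(
        "SELECT title, description FROM file_info WHERE file_id = ?", (file_id,)
    ).fetchone()
    return None if row is None else {"title": row[0], "description": row[1]}
```

Using the file identifier as the primary key keeps one record per file, which fits the idea of storing only the description judged most accurate.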
Shown in Fig. 2 A, embodiment of the invention client comprises: acquisition module 200 and display module 201.
Embodiment of the invention client can further include: judge module 202.
Wherein, acquisition module 200, display module 201 and judge module 202 are identical with judge module 102 functions with acquisition module 100, display module 101 among Fig. 1, repeat no more.
Shown in Fig. 2 B, embodiment of the invention server comprises: set up module 210 and determination module 220.
Wherein, set up among module 210 and determination module 220 and Fig. 1 to set up module 110 identical with determination module 120 functions, repeat no more.
As shown in Fig. 3, the method for obtaining file information according to an embodiment of the invention comprises the following steps:
Step 300: the client obtains the file identifier of a binary file whose download has not been completed.
A binary file whose download has not been completed may be one for which a download task has been created but downloading has not yet begun, or one whose download has begun but not yet finished.
Step 300 may further comprise:
when the user creates a download task, the client obtains the file identifier of the binary file to be downloaded in the task.
Step 301: the server determines the file information corresponding to the file identifier obtained by the client, according to a pre-established correspondence between file identifiers and file information.
Step 302: the client displays the file information determined by the server to the user.
Specifically, the display position can be set as needed; for example, the information may be shown on the new-download-task panel or in the browser.
Further, after step 302, the method may also comprise:
the client checks whether the user chooses to download the binary file; if so, it downloads the binary file; otherwise, it cancels the download.
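Steps 300 to 302 and the follow-up decision can be tied together in a small in-process sketch. Everything named here (`Server`, `lookup`, `handle_download_task`, the `confirm` callback, and the placeholder text shown when no record exists) is hypothetical; a real client and server would communicate over a network, which is omitted.

```python
from typing import Callable, Dict, Optional


class Server:
    """Stand-in for the server: holds the pre-established correspondence."""

    def __init__(self, mapping: Dict[str, str]) -> None:
        self._mapping = dict(mapping)

    def lookup(self, file_id: str) -> Optional[str]:
        # Step 301: determine the file information for the identifier.
        return self._mapping.get(file_id)


def handle_download_task(file_id: str, server: Server,
                         confirm: Callable[[str], bool]) -> str:
    """Run the flow for a file identifier already obtained in step 300."""
    info = server.lookup(file_id)
    # Step 302: what would be displayed to the user.
    # Assumption: a placeholder is shown when the server has no record.
    shown = info if info is not None else "no information available"
    # After step 302: download if the user chooses to, otherwise cancel.
    return "download" if confirm(shown) else "cancel"


server = Server({"id-1": "clear version, Chinese subtitles"})
wants_clear_version = lambda shown: "clear version" in shown
decision = handle_download_task("id-1", server, wants_clear_version)
```

Here `confirm` stands in for the user's choice; in this example the user accepts only files described as a clear version.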
In a specific implementation, the file identifier and the file information must each be determined before the correspondence between them is established.
There are many ways to determine the file identifier. For example, a hash operation may be performed on the content data of each binary file to obtain a hash value of the file content; this hash value can uniquely represent the content of the binary file.
The hash algorithm may be MD5, MD4, SHA, SHS, or a similar algorithm.
One way to compute the file content signature is to take 20 KB of data from each of the front, middle and end of the binary file (or any other several segments of the file), hash these three parts together with one of the aforementioned algorithms to obtain a single value, and use that value as the file identifier.
Another way to compute the file content signature is to divide the content data of the binary file into N blocks of a fixed length (for example 20 KB, 30 KB or any other value) and hash each block with one of the aforementioned algorithms; each resulting value is called a BCID. All the BCIDs are then hashed together once more, and the resulting value, called the GCID, is used as the content signature of the file.
Of course, the method the client uses in step 300 to obtain the file identifier of a binary file whose download has not been completed must be the same as the method the server uses to determine file identifiers.
For example, suppose the server determines a file identifier by taking 32 KB of data from each of the head, middle and tail of a file, computing the MD5 value of each, concatenating the three MD5 values in order, and computing the MD5 of the concatenated data, with the result used as the file identifier. The client must then do the same for the binary file being downloaded: take 32 KB from each of the head, middle and tail, compute the MD5 value of each, concatenate the three values in order, and compute the MD5 of the concatenation, using the result as the file identifier of the file being downloaded.
The file information is determined in one or more of the following ways:
periodically searching web pages for information describing files, for example by collecting it with a web crawler (spider);
users publishing file information through an information publishing platform, for example one that provides a user interface for entering file information;
client download software collecting file-related information (that is, when a user downloads the file corresponding to a URL with the client download software, the software collects the descriptive information on the web page for that URL's file).
Several different pieces of file information may be obtained for the same file by the above methods. The most accurate one can be determined by calculating a weight for each. For example, when multiple pieces of file information have been obtained for one file, the number of occurrences of each is counted, and the one that occurs most often is taken as the most accurate description of the file.
Of course, the ways of determining file information in this embodiment are not limited to those above; any way of determining file information is applicable to this embodiment.
In a specific implementation, the entity that stores the correspondence between file identifiers and file information may be a database, a file, or some other form.
If a database is used, it can be implemented with relational database technology. For example, relational database software can be installed on the server, and the API provided by the relational database vendor can be used. Relational databases generally use SQL as the interface for managing database content.
Obviously, those skilled in the art should understand that the modules or steps of the invention described above can be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; or they may each be made into a separate integrated circuit module; or several of the modules or steps may be made into a single integrated circuit module. The invention is therefore not restricted to any specific combination of hardware and software. It should be understood that such variations in implementation are obvious to those skilled in the art and do not depart from the spirit and scope of protection of the invention.
As can be seen from the above embodiments: the client obtains the file identifier of a binary file whose download has not been completed; the server determines the file information corresponding to that file identifier according to an established correspondence between file identifiers and file information; and the client displays the file information determined by the server to the user. The user can thus learn the actual content of a file before downloading it, which reduces cases in which another file must be downloaded because the content did not match expectations, saving the user's time and download bandwidth. It also protects the user from fraudulent download practices by some websites (that is, a website that, for its own benefit, describes file A but actually serves file B).
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.
Claims (15)
1. A method for obtaining file information, characterized in that the method comprises:
a client obtaining the file identifier of a binary file whose download has not been completed;
a server determining, according to an established correspondence between file identifiers and file information, the file information corresponding to the file identifier obtained by the client;
the client displaying the file information determined by the server to the user.
2. The method of claim 1, characterized in that the file identifier is a content signature of the binary file; the content signature is obtained by applying a preset algorithm to the content data of the binary file; and the preset algorithm is an algorithm that produces different results for the content data of different binary files.
3. The method of claim 1, characterized in that, after the client displays the file information determined by the server to the user, the method further comprises:
the client checking whether the user chooses to download the binary file; if so, downloading the binary file; otherwise, cancelling the download of the binary file.
4. The method of claim 1, characterized in that the client obtaining the file identifier of a binary file whose download has not been completed comprises:
when the user creates a task for downloading a binary file, obtaining the file identifier of the binary file to be downloaded in the task.
5. A system for obtaining file information, characterized in that the system comprises:
a client, used to obtain the file identifier of a binary file whose download has not been completed, and to display the file information determined by the server to the user;
a server, used to determine the file information corresponding to the obtained file identifier, according to an established correspondence between file identifiers and file information.
6. The system of claim 5, characterized in that the file identifier is a content signature of the binary file; the content signature is obtained by applying a preset algorithm to the content data of the binary file; and the preset algorithm is an algorithm that produces different results for the content data of different binary files.
7. The system of claim 5, characterized in that the client comprises:
an acquisition module, used to obtain the file identifier of a binary file whose download has not been completed;
a display module, used to display the file information determined by the server to the user;
and the server comprises:
an establishing module, used to establish the correspondence between file identifiers and file information;
a determining module, used to determine, according to the correspondence established by the establishing module, the file information corresponding to the file identifier obtained by the client.
8. The system of claim 7, characterized in that the acquisition module comprises:
a checking module, used to check whether the user has created a task for downloading a binary file;
a processing module, used to obtain the file identifier of the binary file to be downloaded in the download task when the user creates a task for downloading a binary file.
9. The system of claim 7, characterized in that the client further comprises:
a judging module, used to check, after the display module has displayed the determined file information to the user, whether the user chooses to download the file; if so, the binary file is downloaded; otherwise, the download of the binary file is cancelled.
10. A client, characterized in that the client comprises:
an acquisition module, used to obtain the file identifier of a binary file whose download has not been completed;
a display module, used to display the file information determined by a server to the user.
11. The client of claim 10, characterized in that the acquisition module comprises:
a checking module, used to check whether the user has created a task for downloading a binary file;
a processing module, used to obtain the file identifier of the binary file to be downloaded in the task when the user creates a task for downloading a binary file.
12. The client of claim 10, characterized in that the client further comprises:
a judging module, used to check, after the display module has displayed the determined file information to the user, whether the user chooses to download the binary file; if so, the binary file is downloaded; otherwise, the download of the binary file is cancelled.
13. The client of claim 10, characterized in that the file identifier is a content signature of the binary file; the content signature is obtained by applying a preset algorithm to the content data of the binary file; and the preset algorithm is an algorithm that produces different results for the content data of different binary files.
14. A server, characterized in that the server comprises:
an establishing module, used to establish the correspondence between file identifiers and file information;
a determining module, used to determine, according to the correspondence established by the establishing module, the file information corresponding to the file identifier obtained by a client.
15. The server of claim 14, characterized in that the file identifier is a content signature of the binary file; the content signature is obtained by applying a preset algorithm to the content data of the binary file; and the preset algorithm is an algorithm that produces different results for the content data of different binary files.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2007101606112A CN101179525A (en) | 2007-12-21 | 2007-12-21 | Method, system and device for obtaining file information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2007101606112A CN101179525A (en) | 2007-12-21 | 2007-12-21 | Method, system and device for obtaining file information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101179525A true CN101179525A (en) | 2008-05-14 |
Family
ID=39405615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2007101606112A Pending CN101179525A (en) | 2007-12-21 | 2007-12-21 | Method, system and device for obtaining file information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101179525A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101873355A (en) * | 2010-06-28 | 2010-10-27 | 深圳市迅雷网络技术有限公司 | Method, device and system for downloading file |
CN103747241A (en) * | 2013-12-23 | 2014-04-23 | 乐视致新电子科技(天津)有限公司 | Method and apparatus for detecting integrity of video |
CN106941510A (en) * | 2016-01-05 | 2017-07-11 | 广州市动景计算机科技有限公司 | A kind of offline download method, equipment and system |
CN107241446A (en) * | 2017-07-31 | 2017-10-10 | 广州优视网络科技有限公司 | Document transmission method, device and the terminal device and storage medium of application program |
CN114785773A (en) * | 2022-04-27 | 2022-07-22 | 广州宸祺出行科技有限公司 | File transmission method and device for converting file data into messages |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107729352B (en) | Page resource loading method and terminal equipment | |
CN108052334B (en) | Page jump method, device, computer equipment and storage medium | |
US8429201B2 (en) | Updating a database from a browser | |
EP1958119B1 (en) | System and method for appending security information to search engine results | |
US8788925B1 (en) | Authorized syndicated descriptions of linked web content displayed with links in user-generated content | |
CN106911693B (en) | Method and device for detecting hijacking of webpage content and terminal equipment | |
US20070162566A1 (en) | System and method for using a mobile device to create and access searchable user-created content | |
US20080059544A1 (en) | System and method for providing secure third party website histories | |
CN102722439B (en) | Method, device and system for improving running stability of FLASH assembly | |
CN108021598B (en) | Page extraction template matching method and device and server | |
US7913249B1 (en) | Software installation checker | |
CN112596932A (en) | Service registration and interception method and device, electronic equipment and readable storage medium | |
CN113343312B (en) | Page tamper-proof method and system based on front-end embedded point technology | |
CN101179525A (en) | Method, system and device for obtaining file information | |
US9514228B2 (en) | Banning tags | |
US20070100914A1 (en) | Automated process for identifying and delivering domain specific unstructured content for advanced business analysis | |
CN106372179A (en) | Method and system for detecting document change and synchronization | |
CN101282341A (en) | Method, system and apparatus for obtaining document related information | |
CN116708178A (en) | Method, device, equipment, medium and product for backtracking change history of network equipment | |
CN116595047A (en) | Rights management method, rights management device, electronic device and computer-readable storage medium | |
US20100205191A1 (en) | User access time based content filtering | |
EP3151519A1 (en) | An intelligent system of unified content posting | |
CN107220306B (en) | Searching method and device | |
CN111865576A (en) | Method and device for synchronizing URL classification data | |
CN101340463A (en) | Method and apparatus for determining network resource type |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Open date: 20080514 |