CN104537017A - File search method and device based on path - Google Patents

Publication number
CN104537017A
Authority
CN
China
Prior art keywords
path
absolute path
checked
file
lexicographical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410795855.8A
Other languages
Chinese (zh)
Other versions
CN104537017B (en)
Inventor
薛贞文
张程伟
于传帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Huawei Technology Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201410795855.8A
Publication of CN104537017A
Application granted
Publication of CN104537017B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Abstract

The invention discloses a path-based file search method and device, to solve the problems in the prior art that the path partition mapping table is large, occupies a large amount of storage space, and degrades file search performance. The method comprises: obtaining a path partition mapping table that stores, for each piece of partition information, the absolute path with the largest lexicographical order in the corresponding partition; searching the absolute paths in the path partition mapping table and, among those whose lexicographical order is greater than or equal to that of the path of the file to be queried, taking the one with the smallest lexicographical order as the target absolute path; and determining, according to the target absolute path and the path partition mapping table, the file set to which the file to be queried belongs. In this way, the path partition mapping table used to locate the file set contains few entries, which greatly saves storage space and improves both file retrieval performance and the efficiency of querying and updating the path partition mapping table.

Description

Path-based file search method and device
Technical Field
The present invention relates to the field of computer technology, and in particular to a path-based file search method and device.
Background
A traditional file system uses a metadata management structure based on a directory tree: a directory tree is constructed over all files in the file system, as shown in Fig. 1, and files are managed through it. This management approach suits scenarios in which the file system holds a small number of files and directories.
In current mass file systems, the number of managed files reaches millions or even hundreds of millions, and the traditional directory-tree-based management approach clearly cannot meet the file retrieval performance requirements of such systems. To address this, a mass file system is usually partitioned along the directory tree, as shown in Fig. 2. Each partition contains a certain number of files or directories (for example, 10,000 directories or 100,000 files). At query time, one or a few partitions are first selected according to the input file path, and a finer search using other file attributes is then performed within those partitions, which speeds up metadata search.
To select one or several partitions from all partitions by input path, the prior art usually partitions by directory (for example, one partition contains 10,000 directories) and then maintains a mapping table from all directory paths to partitions (the path partition mapping table for short), as shown in Table 1. At query time, the partition corresponding to the path entered by the user is looked up in the path partition mapping table, and the required file is then searched for within the selected partitions.
Table 1: Path partition mapping table built for the directory tree partitioning in Fig. 2
Using a path partition mapping table to map an input path to partitions is simple and intuitive. However, a mass file system contains a very large number of directories. If every path is added to the path partition mapping table, the table becomes very large and occupies considerable storage space, querying and updating the table becomes slow, and file search performance drops sharply.
Summary of the Invention
Embodiments of the present invention provide a path-based file search method and device, to solve the problems in the prior art that the path partition mapping table is large, occupies considerable storage space, is slow to query and update, and degrades search performance.
The specific technical solutions provided by the embodiments of the present invention are as follows:
In a first aspect, a path-based file search method comprises:
obtaining the path of a file to be queried and a path partition mapping table, wherein the path partition mapping table stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the partition corresponding to that partition information;
determining the lexicographical order of the path of the file to be queried as the query lexicographical order; and
determining, for each absolute path in the path partition mapping table, the corresponding absolute path lexicographical order;
searching all the absolute path lexicographical orders based on the query lexicographical order to obtain the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order, and taking the absolute path corresponding to the obtained absolute path lexicographical order as the target absolute path;
determining, according to the obtained target absolute path and the path partition mapping table, the partition information to which the target absolute path belongs as the target partition information; and
taking all files in the partition corresponding to the target partition information as the file set to which the file to be queried belongs.
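The steps of the first aspect can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function names `path_key` and `find_partition` are hypothetical, the table rows are taken from Table 2 of the description, and the lexicographical order is assumed to be plain string comparison on the path with separators removed.

```python
from typing import List, Optional, Tuple

# Rows of (largest absolute path in partition, partition id), as in Table 2.
TABLE: List[Tuple[str, str]] = [
    ("/a1/b1/c1/", "P1"),
    ("/a1/b2/c2/", "P2"),
    ("/a1/b2/c3/d2/e5/", "P3"),
    ("/a2/b5/", "P4"),
    ("/a2/b5/c5/d4/e7/", "P5"),
]

def path_key(path: str) -> str:
    """Assumed ordering key: path components joined without separators."""
    return "".join(part for part in path.split("/") if part)

def find_partition(query_path: str, table: List[Tuple[str, str]]) -> Optional[str]:
    """Return the partition whose largest path is the smallest one >= the query."""
    query_key = path_key(query_path)
    # Keep only rows whose key is greater than or equal to the query key.
    candidates = [
        (path_key(max_path), partition)
        for max_path, partition in table
        if path_key(max_path) >= query_key
    ]
    if not candidates:
        return None  # the query sorts after every partition (assumed handling)
    # The smallest remaining key identifies the target absolute path.
    return min(candidates)[1]

print(find_partition("/a1/b3/", TABLE))
```

This version compares the query against every row; it is meant to mirror the wording of the steps rather than to be efficient.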
With reference to the first aspect, in a first possible implementation, before obtaining the path of the file to be queried and the path partition mapping table, the method further comprises:
constructing a directory tree for all locally stored files; and
partitioning the directory tree according to the absolute path lexicographical orders of all the files to generate a plurality of partitions;
wherein the absolute path lexicographical order ranges of any two partitions do not intersect, and the range of any partition runs from the minimum to the maximum absolute path lexicographical order in that partition.
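One simple way to obtain partitions with non-intersecting ranges is to sort all paths and cut the sorted list into consecutive chunks. The sketch below assumes a fixed chunk size, which the patent does not prescribe; `partition_by_lex_order`, `partition_size`, and the sample paths are hypothetical, and the ordering keys are assumed to be distinct.

```python
from typing import Dict, List

def path_key(path: str) -> str:
    """Assumed ordering key: path components joined without separators."""
    return "".join(part for part in path.split("/") if part)

def partition_by_lex_order(paths: List[str], partition_size: int) -> Dict[str, List[str]]:
    """Sort paths by key and cut them into consecutive, non-overlapping chunks."""
    ordered = sorted(paths, key=path_key)
    partitions: Dict[str, List[str]] = {}
    for start in range(0, len(ordered), partition_size):
        name = f"P{start // partition_size + 1}"
        partitions[name] = ordered[start:start + partition_size]
    return partitions

# Hypothetical sample paths, listed in arbitrary order.
PATHS = [
    "/a1/b2/c3", "/a1/b1/c1", "/a2/b5",
    "/a1/b2/c2", "/a1/b1/c1/d1", "/a1/b2/c3/d2/e5",
]

for name, members in partition_by_lex_order(PATHS, 2).items():
    # Each partition covers the range [key of first member, key of last member].
    print(name, path_key(members[0]), path_key(members[-1]))
```

Because each chunk is a contiguous slice of the sorted list, the largest key of one partition is always smaller than the smallest key of the next.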
With reference to the first aspect or its first possible implementation, in a second possible implementation, before searching all the absolute path lexicographical orders based on the query lexicographical order, the method further comprises:
sorting all the absolute path lexicographical orders in ascending order.
With reference to the second possible implementation of the first aspect, in a third possible implementation, searching all the absolute path lexicographical orders based on the query lexicographical order to obtain the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order comprises:
in the absolute path lexicographical orders sorted in ascending order, starting from the first one, successively selecting two adjacent values as a first absolute path lexicographical order and a second absolute path lexicographical order; and
when the query lexicographical order is greater than the first absolute path lexicographical order and less than or equal to the second absolute path lexicographical order, determining that the second absolute path lexicographical order is the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order.
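The adjacent-pair scan described above can be sketched as follows. The function name `scan_for_target` is hypothetical. The text only covers the case where the query falls between two adjacent keys, so the handling of a query that is not greater than the first key, or greater than every key, is an assumption added here.

```python
from typing import List, Optional

def scan_for_target(query_key: str, sorted_keys: List[str]) -> Optional[str]:
    """Walk adjacent pairs of an ascending key list to find the smallest key >= query."""
    if not sorted_keys:
        return None
    # Assumed edge case: the query is not greater than the very first key.
    if query_key <= sorted_keys[0]:
        return sorted_keys[0]
    for first, second in zip(sorted_keys, sorted_keys[1:]):
        # The condition from the text: first < query <= second.
        if first < query_key <= second:
            return second
    return None  # assumed edge case: the query is greater than every key

# Keys of the absolute paths listed in Table 2, already in ascending order.
KEYS = ["a1b1c1", "a1b2c2", "a1b2c3d2e5", "a2b5", "a2b5c5d4e7"]
print(scan_for_target("a1b3", KEYS))
```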
With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in a fourth possible implementation, taking all files in the partition corresponding to the target partition information as the file set to which the file to be queried belongs comprises:
if the path of the file to be queried is an absolute path, determining that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs;
otherwise, judging whether the path of the file to be queried is a prefix path of the target absolute path;
when the path of the file to be queried is not a prefix path of the target absolute path, determining that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs; and
when the path of the file to be queried is a prefix path of the target absolute path, adding all files in the partition corresponding to the target partition information to the initial set to which the file to be queried belongs.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, after adding all files in the partition corresponding to the target partition information to the initial set to which the file to be queried belongs, the method further comprises:
sorting the path partition mapping table in ascending lexicographical order of the absolute paths;
selecting, in the path partition mapping table, the absolute path immediately following the target absolute path as a second target absolute path, and judging whether the path of the file to be queried is a prefix path of the second target absolute path; and
when the path of the file to be queried is a prefix path of the second target absolute path, taking the partition information corresponding to the second target absolute path as second target partition information, and adding all files in the partition corresponding to the second target partition information to the file set to which the file to be queried currently belongs.
In a second aspect, a path-based file search device comprises:
an acquiring unit, configured to obtain the path of a file to be queried and a path partition mapping table, wherein the path partition mapping table stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the partition corresponding to that partition information;
a first determining unit, configured to determine the lexicographical order of the path of the file to be queried as the query lexicographical order, and to determine, for each absolute path in the path partition mapping table, the corresponding absolute path lexicographical order;
a processing unit, configured to search all the absolute path lexicographical orders based on the query lexicographical order to obtain the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order, and to take the absolute path corresponding to the obtained absolute path lexicographical order as the target absolute path;
a second determining unit, configured to determine, according to the obtained target absolute path and the path partition mapping table, the partition information to which the target absolute path belongs as the target partition information; and
an execution unit, configured to take all files in the partition corresponding to the target partition information as the file set to which the file to be queried belongs.
With reference to the second aspect, in a first possible implementation, the device further comprises:
a partitioning unit, configured to, before the path of the file to be queried and the path partition mapping table are obtained, construct a directory tree for all locally stored files, and partition the directory tree according to the absolute path lexicographical orders of all the files to generate a plurality of partitions;
wherein the absolute path lexicographical order ranges of any two partitions do not intersect, and the range of any partition runs from the minimum to the maximum absolute path lexicographical order in that partition.
With reference to the second aspect or its first possible implementation, in a second possible implementation, the processing unit is further configured to:
sort all the absolute path lexicographical orders in ascending order before searching them based on the query lexicographical order.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the processing unit is specifically configured to:
in the absolute path lexicographical orders sorted in ascending order, starting from the first one, successively select two adjacent values as a first absolute path lexicographical order and a second absolute path lexicographical order; and
when the query lexicographical order is greater than the first absolute path lexicographical order and less than or equal to the second absolute path lexicographical order, determine that the second absolute path lexicographical order is the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order.
With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in a fourth possible implementation, the execution unit is specifically configured to:
if the path of the file to be queried is an absolute path, determine that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs;
otherwise, judge whether the path of the file to be queried is a prefix path of the target absolute path;
when the path of the file to be queried is not a prefix path of the target absolute path, determine that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs; and
when the path of the file to be queried is a prefix path of the target absolute path, add all files in the partition corresponding to the target partition information to the initial set to which the file to be queried belongs.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation, after adding all files in the partition corresponding to the target partition information to the initial set to which the file to be queried belongs, the execution unit is further configured to:
sort the path partition mapping table in ascending lexicographical order of the absolute paths;
select, in the path partition mapping table, the absolute path immediately following the target absolute path as a second target absolute path, and judge whether the path of the file to be queried is a prefix path of the second target absolute path; and
when the path of the file to be queried is a prefix path of the second target absolute path, take the partition information corresponding to the second target absolute path as second target partition information, and add all files in the partition corresponding to the second target partition information to the file set to which the file to be queried currently belongs.
With the technical solution of the present invention, the path of the file to be queried is obtained together with a path partition mapping table that stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the corresponding partition. The lexicographical order of the path of the file to be queried is taken as the query lexicographical order, and the corresponding absolute path lexicographical order is determined for each absolute path in the path partition mapping table. All the absolute path lexicographical orders are searched based on the query lexicographical order to obtain the smallest one among those greater than or equal to the query lexicographical order, and the corresponding absolute path is taken as the target absolute path. According to the target absolute path and the path partition mapping table, the partition information to which the target absolute path belongs is determined as the target partition information, and all files in the corresponding partition are taken as the file set to which the file to be queried belongs. In this way, the file set to which the file to be queried belongs can be found in a large file storage system, and the number of entries in the path partition mapping table used to locate the file set equals the number of partitions. Compared with the path partition mapping table of the prior art, this greatly saves storage space and improves both file retrieval performance and the efficiency of querying and updating the path partition mapping table.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of directory-tree-based file structure management in the prior art;
Fig. 2 is a schematic diagram of a directory-tree-based partition structure in the prior art;
Fig. 3 is a flowchart of a path-based file search method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of partitioning a directory tree by absolute path lexicographical order according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a path-based file search device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
With the path-based file search method provided by the present invention, the path of the file to be queried is obtained together with a path partition mapping table that stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the corresponding partition. The lexicographical order of the path of the file to be queried is taken as the query lexicographical order, and the corresponding absolute path lexicographical order is determined for each absolute path in the path partition mapping table. All the absolute path lexicographical orders are searched based on the query lexicographical order to obtain the smallest one among those greater than or equal to the query lexicographical order, and the corresponding absolute path is taken as the target absolute path. According to the target absolute path and the path partition mapping table, the partition information to which the target absolute path belongs is determined as the target partition information, and all files in the corresponding partition are taken as the file set to which the file to be queried belongs. In this way, the file set to which the file to be queried belongs can be found in a large file storage system, and the number of entries in the path partition mapping table used to locate the file set equals the number of partitions. Compared with the path partition mapping table of the prior art, this greatly saves storage space and improves both file retrieval performance and the efficiency of querying and updating the path partition mapping table.
Embodiments of the present invention provide a path-based file search method applied to a terminal device that stores a large number of files, such as a computer or a server. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
As shown in Fig. 3, the specific processing flow of the path-based file search method provided by an embodiment of the present invention comprises:
Step 301: Obtain the path of the file to be queried and a path partition mapping table, wherein the path partition mapping table stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the partition corresponding to that partition information.
Here, the partition information is information such as the identifier of the partition.
Before step 301 is performed, the method further comprises:
constructing a directory tree for all locally stored files; and
partitioning the directory tree according to the absolute path lexicographical orders of all the files to generate a plurality of partitions;
wherein the absolute path lexicographical order ranges of any two partitions do not intersect, and the range of any partition runs from the minimum to the maximum absolute path lexicographical order in that partition.
An absolute path is the absolute location of a file or directory in the directory tree. For example, the absolute path of file c3 in Fig. 1 is "/a1/b2/c3", and the character string of this absolute path is "a1b2c3"; the lexicographical order of an absolute path is determined by its character string.
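The mapping from an absolute path to its character string can be sketched as below. The helper name `path_key` is hypothetical, and the sketch assumes that lexicographical order means ordinary character-by-character string comparison, which is how Python compares strings.

```python
def path_key(path: str) -> str:
    """Join the components of an absolute path, dropping the '/' separators."""
    return "".join(part for part in path.split("/") if part)

print(path_key("/a1/b2/c3"))  # a1b2c3

# Ordinary string comparison then gives the ordering between two paths.
print(path_key("/a1/b2/c2") < path_key("/a1/b2/c3"))  # True
```

A trailing separator does not change the key, so "/a1/b2/c3" and "/a1/b2/c3/" map to the same string under this assumption.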
Taking the directory tree in Fig. 1 as an example, partitioning must ensure that the absolute path lexicographical order ranges of any two partitions do not intersect. Fig. 4 shows the directory tree of Fig. 1 after partitioning: the absolute path lexicographical order range of partition P2 is "a1b1c1d1" to "a1b2c2", and that of partition P3 is "a1b2c3" to "a1b2c3d2e5". Clearly, the ranges of partitions P2 and P3 do not intersect.
After the directory tree has been partitioned according to the absolute path lexicographical orders of all the files and the partitions have been generated, the path partition mapping table is built. It stores the correspondence between each piece of partition information and the absolute path with the largest lexicographical order in the corresponding partition. For the partitions in Fig. 4, the resulting path partition mapping table is shown in Table 2:
Table 2: Path partition mapping table built for the directory tree partitioning in Fig. 4
Absolute path with the largest lexicographical order in the partition    Partition
/a1/b1/c1/                                                               P1
/a1/b2/c2/                                                               P2
/a1/b2/c3/d2/e5/                                                         P3
/a2/b5/                                                                  P4
/a2/b5/c5/d4/e7/                                                         P5
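Building such a table from partition contents amounts to taking the largest path of each partition. The following sketch is illustrative only: `build_mapping_table` is a hypothetical name, and the partition contents list just the range endpoints that the text gives for P2 and P3, since the full membership of each partition in Fig. 4 is not reproduced here.

```python
from typing import Dict, List, Tuple

def path_key(path: str) -> str:
    """Assumed ordering key: path components joined without separators."""
    return "".join(part for part in path.split("/") if part)

def build_mapping_table(partitions: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Produce one (largest path, partition id) row per partition, sorted by key."""
    rows = [(max(paths, key=path_key), name) for name, paths in partitions.items()]
    rows.sort(key=lambda row: path_key(row[0]))
    return rows

# Only the range endpoints stated for P2 and P3; other members are omitted.
partitions = {
    "P3": ["/a1/b2/c3", "/a1/b2/c3/d2/e5"],
    "P2": ["/a1/b1/c1/d1", "/a1/b2/c2"],
}

table = build_mapping_table(partitions)
print(table)
print(len(table) == len(partitions))  # one row per partition
```

The table holds one row per partition regardless of how many files or directories each partition contains, which is the source of the space saving claimed for this scheme.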
Step 302: Determine the lexicographical order of the path of the file to be queried as the query lexicographical order, and determine, for each absolute path in the path partition mapping table, the corresponding absolute path lexicographical order.
The embodiment of the present invention queries by lexicographical order: it determines the lexicographical order of the path of the file to be queried and the absolute path lexicographical order of each absolute path in the path partition mapping table.
Step 303: Search all the absolute path lexicographical orders based on the query lexicographical order to obtain the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order, and take the absolute path corresponding to the obtained absolute path lexicographical order as the target absolute path.
Specifically, before all the absolute path lexicographical orders are searched based on the query lexicographical order, the method further comprises:
sorting all the absolute path lexicographical orders in ascending order.
Specifically, searching all the absolute path lexicographical orders based on the query lexicographical order to obtain the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order comprises:
in the absolute path lexicographical orders sorted in ascending order, starting from the first one, successively selecting two adjacent values as a first absolute path lexicographical order and a second absolute path lexicographical order; and
when the query lexicographical order is greater than the first absolute path lexicographical order and less than or equal to the second absolute path lexicographical order, determining that the second absolute path lexicographical order is the smallest absolute path lexicographical order among those greater than or equal to the query lexicographical order.
When the number of absolute path lexicographical orders is large, the smallest absolute path lexicographical order greater than or equal to the query lexicographical order can also be found by binary search or a similar method.
This avoids comparing the query lexicographical order with every absolute path lexicographical order one by one, which shortens search time and improves search efficiency.
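For the binary search variant, Python's standard `bisect_left` returns exactly the index of the first element that is greater than or equal to the query in a sorted list. The sketch below uses it on the rows of Table 2; `lookup` and `path_key` are hypothetical names, and returning `None` when the query sorts after every row is an assumption.

```python
from bisect import bisect_left
from typing import Optional, Tuple

def path_key(path: str) -> str:
    """Assumed ordering key: path components joined without separators."""
    return "".join(part for part in path.split("/") if part)

# Rows of Table 2, already in ascending order of their keys.
TABLE = [
    ("/a1/b1/c1/", "P1"),
    ("/a1/b2/c2/", "P2"),
    ("/a1/b2/c3/d2/e5/", "P3"),
    ("/a2/b5/", "P4"),
    ("/a2/b5/c5/d4/e7/", "P5"),
]
KEYS = [path_key(max_path) for max_path, _ in TABLE]

def lookup(query_path: str) -> Optional[Tuple[str, str]]:
    """Binary-search the sorted keys for the first one >= the query key."""
    index = bisect_left(KEYS, path_key(query_path))
    return TABLE[index] if index < len(KEYS) else None

print(lookup("/a1/b3/"))  # ('/a2/b5/', 'P4')
```

With n partitions this needs on the order of log n comparisons instead of up to n for the sequential scan.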
Step 304: According to the obtained target absolute path and the path partition mapping table, determine the partition information to which the target absolute path belongs as the target partition information.
That is, the partition information corresponding to the target absolute path is looked up in the path partition mapping table and taken as the target partition information.
Step 305: Take all files in the partition corresponding to the target partition information as the file set to which the file to be queried belongs.
Specifically, taking all files in the partition corresponding to the target partition information as the file set to which the file to be queried belongs comprises:
if the path of the file to be queried is an absolute path, determining that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs;
otherwise, that is, when the path of the file to be queried is a prefix path, judging whether the path of the file to be queried is a prefix path of the target absolute path;
when the path of the file to be queried is not a prefix path of the target absolute path, determining that all files in the partition corresponding to the target partition information form the file set to which the file to be queried belongs; and
when the path of the file to be queried is a prefix path of the target absolute path, adding all files in the partition corresponding to the target partition information to the initial set to which the file to be queried belongs.
Specifically, after all files in the partition corresponding to the target partition information are added to the initial set to which the file to be queried belongs, the method further comprises:
sorting the path partition mapping table in ascending lexicographical order of the absolute paths;
selecting, in the path partition mapping table, the absolute path immediately following the target absolute path as a second target absolute path, and judging whether the path of the file to be queried is a prefix path of the second target absolute path; and
when the path of the file to be queried is a prefix path of the second target absolute path, taking the partition information corresponding to the second target absolute path as second target partition information, and adding all files in the partition corresponding to the second target partition information to the file set to which the file to be queried currently belongs.
Clearly, compared with the prior art, the method provided by the embodiment of the present invention supports file search not only by absolute path but also by prefix path, and is therefore more widely applicable.
Based on the above embodiment, an embodiment of the present invention further provides a specific flow of the path-based file search method:
First, the directory tree is partitioned by absolute path lexicographical order and the path partition mapping table is built. The table stores the correspondence between each piece of partition information P(i) and the absolute path max-path(i) with the largest lexicographical order in the corresponding partition, and its absolute paths are sorted in ascending lexicographical order, as shown in Table 2;
The path input-path of the file to be queried is entered into the terminal device. Because the path partition mapping table is ordered, a sequential scan or binary search can be used to find in it the absolute path max-path(k) that satisfies lex(max-path(k-1)) < lex(input-path) ≤ lex(max-path(k)), k ≥ 2, where lex(p) denotes the lexicographical order of path p;
The partition P(k) corresponding to the absolute path max-path(k) is added to the file set to which the file to be queried belongs;
If input-path is an absolute path, the whole flow ends, and P(k) is the final result, that is, the file set to which the file to be queried belongs;
If input-path is not an absolute path, it is treated as a prefix path, and it is further judged whether input-path is a prefix path of max-path(k). If input-path is not a prefix path of max-path(k), the whole flow ends, and P(k) is the final result, that is, the file set to which the file to be queried belongs. If input-path is a prefix path of max-path(k), P(k) is added to the file set to which the file to be queried belongs, k is incremented by 1, and it is judged whether input-path is a prefix path of max-path(k+1). This continues until input-path is found not to be a prefix path of some max-path(n), at which point P(n) is added to the file set to which the file to be queried belongs and the whole flow ends.
Example 1: the directory tree in Fig. 4 is again partitioned, and the corresponding path partition map table is Table 2. Suppose the input absolute path is input-path = "/a1/b3/". With reference to Table 2, input-path is compared lexicographically with the absolute path max-path that is lexicographically largest in each partition of the path partition map table. Because the max-path entries in the table are ordered, a method such as binary search or sequential scan can quickly find the absolute path that satisfies the condition max-path(k-1) < input-path ≤ max-path(k) in lexicographic order, where k ≥ 2; that is, "/a1/b2/c3/d2/e5/" < "/a1/b3/" ≤ "/a2/b5/". The partition P4 corresponding to "/a2/b5/" is then added to the file set S to which the file to be checked belongs. Because input-path is an absolute path, the set S (containing only partition P4) is returned, and the flow ends.
Example 2: the directory tree in Fig. 4 is again partitioned, and the corresponding path partition map table is Table 2. Suppose the input path is input-path = "/a1/b2/". With reference to Table 2, input-path is compared lexicographically with the absolute path max-path that is lexicographically largest in each partition of the path partition map table. Because the max-path entries in the table are ordered, a method such as binary search or sequential scan can quickly find the absolute path that satisfies the condition max-path(k-1) < input-path ≤ max-path(k) in lexicographic order, where k ≥ 2; that is, "/a1/b1/c1/" < "/a1/b2/" ≤ "/a1/b2/c2/". The partition P2 corresponding to "/a1/b2/c2/" is then added to the file set S to which the file to be checked belongs. Because it cannot be determined whether input-path is an absolute path, input-path is regarded as a prefix path. In the path partition map table (Table 2), the entry following "/a1/b2/c2/" is "/a1/b2/c3/d2/e5/", and input-path is a prefix path of "/a1/b2/c3/d2/e5/"; therefore, the partition P3 corresponding to "/a1/b2/c3/d2/e5/" is added to the file set S to which the file to be checked belongs. The next entry is then judged. Because input-path is not a prefix path of "/a2/b5/", the file set S to which the file to be checked belongs (containing the two partitions P2 and P3) is returned, and the flow ends.
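The flow and the two examples above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the function name `search_partitions`, the `is_absolute` flag, and the list `TABLE_2` are hypothetical. `TABLE_2` holds only the rows quoted in the examples, and the label P1 for the first row is inferred. The prefix loop follows Example 2: it stops at the first row whose max-path does not start with input-path and does not add that row's partition.

```python
from bisect import bisect_left

# Hypothetical stand-in for Table 2: (max_path, partition) rows,
# sorted in ascending lexicographic order of max_path.
TABLE_2 = [
    ("/a1/b1/c1/", "P1"),
    ("/a1/b2/c2/", "P2"),
    ("/a1/b2/c3/d2/e5/", "P3"),
    ("/a2/b5/", "P4"),
]


def search_partitions(table, input_path, is_absolute):
    """Return the partitions that may hold files matching input_path."""
    max_paths = [max_path for max_path, _ in table]
    # Binary search for the first row k with input_path <= max_path(k).
    k = bisect_left(max_paths, input_path)
    if k == len(table):
        return []  # assumption: no partition when input_path sorts after every row
    result = [table[k][1]]
    if is_absolute:
        return result
    # Prefix query: stop at once if input_path is not a prefix of max_path(k).
    if not max_paths[k].startswith(input_path):
        return result
    # Otherwise keep taking the following rows while input_path is a prefix.
    k += 1
    while k < len(table) and max_paths[k].startswith(input_path):
        result.append(table[k][1])
        k += 1
    return result
```

With these rows, `search_partitions(TABLE_2, "/a1/b3/", True)` gives `["P4"]` and `search_partitions(TABLE_2, "/a1/b2/", False)` gives `["P2", "P3"]`, matching Examples 1 and 2. Python compares strings code point by code point, which is used here as the lexicographic order.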
Based on the above embodiment, the present invention further provides a path-based file search device. Referring to Fig. 5, the device comprises an acquiring unit 501, a first determining unit 502, a processing unit 503, a second determining unit 504 and a running unit 505, wherein:
The acquiring unit 501 is configured to obtain the path of the file to be checked and the path partition map table, wherein the path partition map table stores the correspondence between each partition information and the absolute path that is lexicographically largest in the partition corresponding to that partition information;
The first determining unit 502 is configured to determine the lexicographic order of the path of the file to be checked as the lexicographic order to be checked, and to determine the corresponding absolute path lexicographic order for each absolute path in the path partition map table;
The processing unit 503 is configured to search all the absolute path lexicographic orders based on the lexicographic order to be checked, obtain the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked, and take the absolute path corresponding to the obtained absolute path lexicographic order as the target absolute path;
The second determining unit 504 is configured to determine, according to the obtained target absolute path and the path partition map table, the partition information to which the target absolute path belongs, as the target partition information;
The running unit 505 is configured to take all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs.
The path-based file search device provided by the embodiment of the present invention further comprises:
A partitioning unit 500, configured to, before the target path of the file to be checked and the path partition map table are obtained, build a directory tree for all locally stored files, and partition the directory tree according to the absolute path lexicographic orders corresponding to all the files, generating a plurality of partitions;
Wherein the intersection of the absolute path lexicographic order ranges corresponding to any two partitions is empty, and the absolute path lexicographic order range corresponding to any partition runs from the minimum absolute path lexicographic order to the maximum absolute path lexicographic order in that partition.
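The work of this unit can be illustrated with a short Python sketch. It assumes that the directory tree has already been flattened into a list of absolute paths and that each partition holds a fixed number of paths; the text does not say how partition boundaries are chosen, so `partition_size`, the function name `build_partition_table`, and the `P1`, `P2`, ... labels are hypothetical.

```python
def build_partition_table(paths, partition_size):
    """Split absolute paths into consecutive lexicographic ranges.

    Returns (table, partitions): table is a list of (max_path, name) rows in
    ascending order of max_path, and partitions maps each name to its paths.
    """
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    ordered = sorted(set(paths))  # unique paths in ascending lexicographic order
    table = []
    partitions = {}
    for start in range(0, len(ordered), partition_size):
        chunk = ordered[start:start + partition_size]
        name = f"P{len(table) + 1}"
        partitions[name] = chunk
        # Only the lexicographically largest path of each partition is stored.
        table.append((chunk[-1], name))
    return table, partitions
```

Because the chunks are consecutive slices of one sorted list, every path in a partition is larger than every path in the preceding partition, so the ranges do not intersect, and the table has exactly one row per partition.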
The processing unit 503 is further configured to:
Before searching all the absolute path lexicographic orders based on the lexicographic order to be checked, sort all the absolute path lexicographic orders in ascending lexicographic order.
The processing unit 503 is specifically configured to:
Among all the absolute path lexicographic orders sorted in ascending lexicographic order, starting from the first absolute path lexicographic order, successively select two adjacent ones as the first absolute path lexicographic order and the second absolute path lexicographic order;
When the lexicographic order to be checked is judged to be greater than the first absolute path lexicographic order and less than or equal to the second absolute path lexicographic order, determine that the second absolute path lexicographic order is the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked.
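The adjacent-pair scan described for the processing unit can be sketched in Python as below. The function name `find_target_sequential` is hypothetical, and two boundary cases that the text leaves open are handled by assumption: a query no larger than the first entry returns the first entry, and a query larger than every entry returns `None`.

```python
def find_target_sequential(sorted_max_paths, query):
    """Return the smallest entry >= query by scanning adjacent pairs, or None."""
    if not sorted_max_paths:
        return None
    # Assumption: the first entry has no predecessor, so test it on its own.
    if query <= sorted_max_paths[0]:
        return sorted_max_paths[0]
    # Walk over (first, second) neighbours in ascending order.
    for first, second in zip(sorted_max_paths, sorted_max_paths[1:]):
        if first < query <= second:
            return second
    return None  # assumption: query sorts after every entry
```

The scan takes time proportional to the number of partitions. A binary search over the same sorted list returns the same entry in logarithmic time, which is why the flow described earlier allows either method.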
The running unit 505 is specifically configured to:
If the path of the file to be checked is an absolute path, determine that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
Otherwise, judge whether the path of the file to be checked is a prefix path of the target absolute path;
When the path of the file to be checked is judged not to be a prefix path of the target absolute path, determine that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
When the path of the file to be checked is judged to be a prefix path of the target absolute path, add all files in the partition corresponding to the target partition information to the initial set to which the file to be checked belongs.
After adding all files in the partition corresponding to the target partition information to the initial set to which the file to be checked belongs, the running unit 505 is further configured to:
Sort the path partition map table in ascending lexicographic order of the absolute paths;
In the path partition map table, select the absolute path immediately following the target absolute path as the second target absolute path, and judge whether the path of the file to be checked is a prefix path of the second target absolute path;
When the path of the file to be checked is judged to be a prefix path of the second target absolute path, take the partition information corresponding to the second target absolute path as the second target partition information, and add all files in the partition corresponding to the second target partition information to the file set to which the file to be checked currently belongs.
Based on the above embodiment, the present invention further provides a terminal device 600, for example a computer. Referring to Fig. 6, the device comprises a transceiver 601, a processor 602, a bus 603 and a memory 604, wherein:
The transceiver 601 and the processor 602 are interconnected by the bus 603. The bus 603 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in Fig. 6, but this does not mean that there is only one bus or only one type of bus.
The transceiver 601 is configured to obtain the path of the file to be checked.
The processor 602 is configured to implement the path-based file search method shown in Fig. 3 of the embodiment of the present invention, comprising:
Obtaining the path of the file to be checked and the path partition map table, wherein the path partition map table stores the correspondence between each partition information and the absolute path that is lexicographically largest in the partition corresponding to that partition information;
Determining the lexicographic order of the path of the file to be checked as the lexicographic order to be checked, and determining the corresponding absolute path lexicographic order for each absolute path in the path partition map table;
Searching all the absolute path lexicographic orders based on the lexicographic order to be checked, obtaining the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked, and taking the absolute path corresponding to the obtained absolute path lexicographic order as the target absolute path;
Determining, according to the obtained target absolute path and the path partition map table, the partition information to which the target absolute path belongs, as the target partition information;
Taking all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs.
The terminal device 600 further comprises the memory 604, which is used to store a program and the path partition map table. Specifically, the program may include program code, and the program code includes computer operating instructions. The memory 604 may include random access memory (RAM), and may also include non-volatile memory, for example at least one disk memory. The processor 602 executes the application program stored in the memory 604 to implement the path-based file search method described above.
In summary, in the path-based file search method and device provided by the embodiment of the present invention, the method obtains the path of the file to be checked and a path partition map table that stores the correspondence between each partition information and the absolute path that is lexicographically largest in the partition corresponding to that partition information; takes the lexicographic order of the path of the file to be checked as the lexicographic order to be checked, and determines the corresponding absolute path lexicographic order for each absolute path in the path partition map table; searches all the absolute path lexicographic orders based on the lexicographic order to be checked, obtains the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked, and takes the absolute path corresponding to the obtained absolute path lexicographic order as the target absolute path; determines, according to the obtained target absolute path and the path partition map table, the partition information to which the target absolute path belongs, as the target partition information; and takes all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs. In this way, the file set to which the file to be checked belongs can be found in a large file storage system, and the number of entries in the path partition map table used to locate the file set equals the number of partitions. Compared with the path partition map table of the prior art, this greatly saves storage space and improves file retrieval performance, while also improving the efficiency of querying and updating the path partition map table.
Although the preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Clearly, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. Thus, if these modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these changes and variations.

Claims (12)

1. A path-based file search method, characterized by comprising:
Obtaining the path of a file to be checked and a path partition map table, wherein the path partition map table stores the correspondence between each partition information and the absolute path that is lexicographically largest in the partition corresponding to that partition information;
Determining the lexicographic order of the path of the file to be checked as the lexicographic order to be checked; and
Determining the corresponding absolute path lexicographic order for each absolute path in the path partition map table;
Searching all the absolute path lexicographic orders based on the lexicographic order to be checked, obtaining the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked, and taking the absolute path corresponding to the obtained absolute path lexicographic order as the target absolute path;
Determining, according to the obtained target absolute path and the path partition map table, the partition information to which the target absolute path belongs, as the target partition information;
Taking all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs.
2. the method for claim 1, is characterized in that, before the destination path obtaining file to be checked and path partition map table, also comprises:
For the All Files structure directory tree that this locality is preserved;
The absolute path lexcographical order corresponding according to All Files carries out subregion to described directory tree, generates multiple subregion;
Wherein, the common factor of the absolute path lexcographical order scope that any two subregions are corresponding is empty, and the absolute path lexcographical order scope that any one subregion is corresponding is from absolute path lexcographical order minimum value to absolute path lexcographical order maximal value.
3. The method of claim 1 or 2, characterized in that, before searching all the absolute path lexicographic orders based on the lexicographic order to be checked, the method further comprises:
Sorting all the absolute path lexicographic orders in ascending lexicographic order.
4. The method of claim 3, characterized in that searching all the absolute path lexicographic orders based on the lexicographic order to be checked and obtaining the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked comprises:
Among all the absolute path lexicographic orders sorted in ascending lexicographic order, starting from the first absolute path lexicographic order, successively selecting two adjacent ones as the first absolute path lexicographic order and the second absolute path lexicographic order;
When the lexicographic order to be checked is judged to be greater than the first absolute path lexicographic order and less than or equal to the second absolute path lexicographic order, determining that the second absolute path lexicographic order is the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked.
5. The method of any one of claims 1-4, characterized in that taking all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs comprises:
If the path of the file to be checked is an absolute path, determining that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
Otherwise, judging whether the path of the file to be checked is a prefix path of the target absolute path;
When the path of the file to be checked is judged not to be a prefix path of the target absolute path, determining that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
When the path of the file to be checked is judged to be a prefix path of the target absolute path, adding all files in the partition corresponding to the target partition information to the initial set to which the file to be checked belongs.
6. The method of claim 5, characterized in that, after all files in the partition corresponding to the target partition information are added to the initial set to which the file to be checked belongs, the method further comprises:
Sorting the path partition map table in ascending lexicographic order of the absolute paths;
In the path partition map table, selecting the absolute path immediately following the target absolute path as the second target absolute path, and judging whether the path of the file to be checked is a prefix path of the second target absolute path;
When the path of the file to be checked is judged to be a prefix path of the second target absolute path, taking the partition information corresponding to the second target absolute path as the second target partition information, and adding all files in the partition corresponding to the second target partition information to the file set to which the file to be checked currently belongs.
7. A path-based file search device, characterized by comprising:
An acquiring unit, configured to obtain the path of a file to be checked and a path partition map table, wherein the path partition map table stores the correspondence between each partition information and the absolute path that is lexicographically largest in the partition corresponding to that partition information;
A first determining unit, configured to determine the lexicographic order of the path of the file to be checked as the lexicographic order to be checked, and to determine the corresponding absolute path lexicographic order for each absolute path in the path partition map table;
A processing unit, configured to search all the absolute path lexicographic orders based on the lexicographic order to be checked, obtain the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked, and take the absolute path corresponding to the obtained absolute path lexicographic order as the target absolute path;
A second determining unit, configured to determine, according to the obtained target absolute path and the path partition map table, the partition information to which the target absolute path belongs, as the target partition information;
A running unit, configured to take all files in the partition corresponding to the target partition information as the file set to which the file to be checked belongs.
8. The device of claim 7, characterized by further comprising:
A partitioning unit, configured to, before the target path of the file to be checked and the path partition map table are obtained, build a directory tree for all locally stored files, and partition the directory tree according to the absolute path lexicographic orders corresponding to all the files, generating a plurality of partitions;
Wherein the intersection of the absolute path lexicographic order ranges corresponding to any two partitions is empty, and the absolute path lexicographic order range corresponding to any partition runs from the minimum absolute path lexicographic order to the maximum absolute path lexicographic order in that partition.
9. The device of claim 7 or 8, characterized in that the processing unit is further configured to:
Before searching all the absolute path lexicographic orders based on the lexicographic order to be checked, sort all the absolute path lexicographic orders in ascending lexicographic order.
10. The device of claim 9, characterized in that the processing unit is specifically configured to:
Among all the absolute path lexicographic orders sorted in ascending lexicographic order, starting from the first absolute path lexicographic order, successively select two adjacent ones as the first absolute path lexicographic order and the second absolute path lexicographic order;
When the lexicographic order to be checked is judged to be greater than the first absolute path lexicographic order and less than or equal to the second absolute path lexicographic order, determine that the second absolute path lexicographic order is the smallest absolute path lexicographic order among those greater than or equal to the lexicographic order to be checked.
11. The device of any one of claims 7-10, characterized in that the running unit is specifically configured to:
If the path of the file to be checked is an absolute path, determine that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
Otherwise, judge whether the path of the file to be checked is a prefix path of the target absolute path;
When the path of the file to be checked is judged not to be a prefix path of the target absolute path, determine that all files in the partition corresponding to the target partition information are the file set to which the file to be checked belongs;
When the path of the file to be checked is judged to be a prefix path of the target absolute path, add all files in the partition corresponding to the target partition information to the initial set to which the file to be checked belongs.
12. The device of claim 11, characterized in that, after adding all files in the partition corresponding to the target partition information to the initial set to which the file to be checked belongs, the running unit is further configured to:
Sort the path partition map table in ascending lexicographic order of the absolute paths;
In the path partition map table, select the absolute path immediately following the target absolute path as the second target absolute path, and judge whether the path of the file to be checked is a prefix path of the second target absolute path;
When the path of the file to be checked is judged to be a prefix path of the second target absolute path, take the partition information corresponding to the second target absolute path as the second target partition information, and add all files in the partition corresponding to the second target partition information to the file set to which the file to be checked currently belongs.
CN201410795855.8A 2014-12-18 2014-12-18 A kind of file search method and device based on path Active CN104537017B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410795855.8A CN104537017B (en) 2014-12-18 2014-12-18 A kind of file search method and device based on path

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410795855.8A CN104537017B (en) 2014-12-18 2014-12-18 A kind of file search method and device based on path

Publications (2)

Publication Number Publication Date
CN104537017A true CN104537017A (en) 2015-04-22
CN104537017B CN104537017B (en) 2018-05-04

Family

ID=52852545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410795855.8A Active CN104537017B (en) 2014-12-18 2014-12-18 A kind of file search method and device based on path

Country Status (1)

Country Link
CN (1) CN104537017B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427341A (en) * 2019-06-11 2019-11-08 福建奇点时空数字科技有限公司 A kind of knowledge mapping entity relationship method for digging based on paths ordering

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030204515A1 (en) * 2002-03-06 2003-10-30 Ori Software Development Ltd. Efficient traversals over hierarchical data and indexing semistructured data
US20080263072A1 (en) * 2000-11-29 2008-10-23 Virtual Key Graph Methods of Encoding a Combining Integer Lists in a Computer System, and Computer Software Product for Implementing Such Methods
CN101339570A (en) * 2008-08-12 2009-01-07 北京航空航天大学 Efficient distributed organization and management method for mass remote sensing data
US20090070382A1 (en) * 2007-09-11 2009-03-12 Mukund Satish Agrawal System and Method for Performing a File System Operation on a Specified Storage Tier
CN101551814A (en) * 2009-05-13 2009-10-07 广东威创视讯科技股份有限公司 Method for data management and data search
CN101937377A (en) * 2009-06-29 2011-01-05 百度在线网络技术(北京)有限公司 Data recovery method and device

Also Published As

Publication number Publication date
CN104537017B (en) 2018-05-04

Similar Documents

Publication Publication Date Title
CN110321344B (en) Information query method and device for associated data, computer equipment and storage medium
CN104679778B (en) A kind of generation method and device of search result
CN104809182B (en) Based on the web crawlers URL De-weight method that dynamically can divide Bloom Filter
CN101826107B (en) Hash data processing method and device
CN103164490B (en) A kind of efficient storage implementation method of not fixed-length data and device
CN106294352B (en) A kind of document handling method, device and file system
CN106407207B (en) Real-time newly-added data updating method and device
CN102298641A (en) Method for uniformly storing files and structured data based on key value bank
JP2010503117A (en) Dynamic fragment mapping
CN105468642A (en) Data storage method and apparatus
CN106471501B (en) Data query method, data object storage method and data system
CN103902702A (en) Data storage system and data storage method
CN102915382A (en) Method and device for carrying out data query on database based on indexes
CN103246549B (en) A kind of method and system of data conversion storage
CN110109910A (en) Data processing method and system, electronic equipment and computer readable storage medium
CN104298736A (en) Method and device for aggregating and connecting data as well as database system
CN103858125A (en) Repeating data processing methods, devices, storage controller and storage node
CN103186622A (en) Updating method of index information in full text retrieval system and device thereof
CN104965826A (en) Search method and search apparatus based on a browser
CN104636349A (en) Method and equipment for compression and searching of index data
CN105183391B (en) The method and apparatus that data store under a kind of distributed data platform
CN107423321B (en) Method and device suitable for cloud storage of large-batch small files
CN104346347A (en) Data storage method, device, server and system
CN106649385B (en) Data reordering method and device based on HBase database
CN100399338C (en) A sorting method of data record

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220831

Address after: No. 1899 Xiyuan Avenue, high tech Zone (West District), Chengdu, Sichuan 610041

Patentee after: Chengdu Huawei Technologies Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.