CN109408571B - Object enumeration method, device and computer storage medium - Google Patents

Object enumeration method, device and computer storage medium

Info

Publication number
CN109408571B
Authority
CN
China
Prior art keywords
interval
enumeration
objects
enumerated
bucket
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811146403.1A
Other languages
Chinese (zh)
Other versions
CN109408571A (en)
Inventor
刘畅
袁立非
赵梓健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to CN201811146403.1A priority Critical patent/CN109408571B/en
Publication of CN109408571A publication Critical patent/CN109408571A/en
Application granted granted Critical
Publication of CN109408571B publication Critical patent/CN109408571B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose an object enumeration method, an object enumeration device, and a computer storage medium. The object enumeration method is applied to the object enumeration device and comprises the following steps: sampling the keys of the objects in a bucket at a predetermined interval to obtain an interval mark set, wherein the interval mark set comprises a plurality of interval marks that are the results of sampling the keys of the objects in the bucket; synchronously enumerating the objects corresponding to any two adjacent interval marks in the interval mark set to obtain a plurality of enumerated object results; and merging the plurality of enumerated object results. Because the interval marks in the interval mark set divide all objects of the bucket into a plurality of intervals, and the objects corresponding to any two adjacent interval marks are enumerated synchronously, all objects in the bucket can be enumerated quickly.

Description

Object enumeration method, device and computer storage medium
Technical Field
The invention relates to the technical field of interface calling, in particular to an object enumeration method, device and computer storage medium.
Background
A bucket is a container in an object store for storing objects, where each object is uniquely identified by a key. Among the object interfaces, the enumerate objects (list objects) interface enumerates the objects in a bucket. When the enumerate objects interface is called, at least a mark value (marker) is passed as an input parameter; the marker indicates the starting position of that enumeration.
To obtain all objects in a bucket, the current solution is to call the enumerate objects interface in a serial loop until its return value is empty. Specifically, on the first call the marker is set to a null value (null); on each subsequent call the marker is set to the key of the last object returned by the previous call. The loop continues until the return value of the enumerate objects interface is null, and the objects returned by all calls are then merged to obtain all the objects in the bucket.
Because each call in the loop depends on the return value of the previous call, and the maximum number of objects returned per call is a constant, the enumeration efficiency is very low. For example, assuming that each call takes 0.5 seconds to return and returns at most 1000 objects, enumerating 10 million objects requires 10000 calls and takes 10000 × 0.5 = 5000 seconds.
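The arithmetic in this example can be reproduced with a short calculation. The following is a minimal sketch; the function name `serial_enumeration_cost` is hypothetical, and it ignores any extra final call that a real loop might need to observe an empty or short page.

```python
import math

def serial_enumeration_cost(total_objects, max_per_call=1000, seconds_per_call=0.5):
    """Estimate the number of calls and the elapsed time of a serial loop.

    Each call returns at most max_per_call objects and cannot start until
    the previous call has returned, so the per-call times simply add up.
    """
    calls = math.ceil(total_objects / max_per_call)
    return calls, calls * seconds_per_call

calls, seconds = serial_enumeration_cost(10_000_000)
print(calls, seconds)  # 10000 5000.0
```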
Disclosure of Invention
The application provides an object enumeration method, device and computer storage medium, which can solve the problem that in the prior art, the time consumption for enumerating objects in a storage bucket is long.
In a first aspect, the present application provides an object enumeration method, including:
sampling the keys of the objects in a bucket at a predetermined interval to obtain an interval mark set, wherein the interval mark set comprises a plurality of interval marks that are the results of sampling the keys of the objects in the bucket; synchronously enumerating the objects corresponding to any two adjacent interval marks in the interval mark set to obtain a plurality of enumerated object results; and merging the plurality of enumerated object results.
Therefore, the interval marks in the interval mark set divide all objects of the bucket into a plurality of intervals, and the objects corresponding to any two adjacent interval marks in the interval mark set can be enumerated at the same time, so that all the objects in the bucket can be enumerated quickly and the time for enumerating all the objects in the bucket is shortened.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the synchronously enumerating objects corresponding to any two adjacent interval labels in the interval label set includes:
allocating the plurality of interval marks to a plurality of processing units, one interval mark per processing unit, so that each processing unit uses its own interval mark as an input parameter of the enumerate objects interface and enumerates objects from the bucket until the enumerated object result includes an object identified by an interval mark in the interval mark set other than its own, or until the number of objects in the enumerated object result is less than the maximum return value of the enumerate objects interface.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the first interval mark of the plurality of interval marks is a null value.
With reference to the first possible implementation manner of the first aspect and the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, after the merging of the plurality of enumerated object results, the method further includes: deduplicating the merged enumerated object results.
As can be seen, by implementing the above optional embodiment, the multiple processing units invoke the enumerate object interface in parallel to enumerate the objects corresponding to any two adjacent interval labels in the interval label set, so that all the objects in the bucket can be enumerated quickly, and the time required for enumerating all the objects in the bucket is shortened.
In a second aspect, the present application provides an object enumeration apparatus comprising means for performing the method of the first aspect and possible method embodiments of the first aspect described above. Specifically, the object enumeration device includes:
a sampling module, configured to sample the keys of the objects in a bucket at a predetermined interval to obtain an interval mark set, where the interval mark set includes a plurality of interval marks that are the results of sampling the keys of the objects in the bucket; an enumeration module, configured to synchronously enumerate the objects corresponding to any two adjacent interval marks in the interval mark set to obtain a plurality of enumerated object results; and a merging module, configured to merge the plurality of enumerated object results.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the enumeration module is specifically configured to:
allocate the plurality of interval marks to a plurality of processing units, one interval mark per processing unit, so that each processing unit uses its own interval mark as an input parameter of the enumerate objects interface and enumerates objects from the bucket until the enumerated object result includes an object identified by an interval mark in the interval mark set other than its own, or until the number of objects in the enumerated object result is less than the maximum return value of the enumerate objects interface.
With reference to the second aspect and the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the first interval mark of the plurality of interval marks is a null value.
With reference to the second aspect, the first possible implementation manner of the second aspect, and the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the object enumeration device further includes: a deduplication module, configured to deduplicate the merged enumerated object results.
Based on the same inventive concept, for the principle by which the object enumeration device solves the problem and for its advantageous effects, reference may be made to the method of the first aspect and each possible implementation manner of the first aspect, together with the advantageous effects they bring. The implementation of the object enumeration device may likewise refer to the method of the first aspect and each possible implementation manner of the first aspect, and repeated details are omitted.
In a third aspect, the present application provides an object enumeration apparatus, including: a memory for storing one or more programs. For the implementation manners and advantageous effects of this object enumeration apparatus, reference may be made to the method of the first aspect and each possible implementation manner of the first aspect, together with the advantageous effects they bring, and repeated details are omitted.
In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored. The computer program includes computer instructions that, when executed by a processor, cause the processor to execute the method of the first aspect and each possible implementation manner of the first aspect. For the advantageous effects, reference may be made to the first aspect, and repeated details are omitted here.
In a fifth aspect, the present application provides a computer program product. The computer program product includes computer instructions that, when executed by a processor, cause the processor to execute the method of the first aspect and each possible implementation manner of the first aspect. For the advantageous effects, reference may be made to the first aspect, and repeated details are omitted.
Drawings
FIG. 1 is a schematic diagram of buckets and objects in an embodiment of the present application;
FIG. 2 is a schematic flowchart of an object enumeration method according to an embodiment of the present application;
FIG. 3 is a block diagram of an object enumeration device according to an embodiment of the present application;
FIG. 4 is a schematic block diagram of another object enumeration device provided in an embodiment of the present application.
Detailed Description
The following description will be made with reference to the drawings in the embodiments of the present application.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a bucket and objects. The bucket 11 is a container for storing objects 12 in the object store. As shown in FIG. 1, each object 12 is stored in one bucket 11. For example, an object 12 named photos/puppy.jpg stored in the johnsmith bucket 11 may be addressed using the URL (Uniform Resource Locator) http://johnsmith.xx/photos/puppy.jpg, where xx is the vendor domain name.
The object 12 is the basic entity of the object store and consists of object data and metadata. The data portion is opaque to S3; the metadata is a set of name-value pairs that describe the object 12. The metadata may include default metadata (e.g., last modified date) and standard HTTP (HyperText Transfer Protocol) metadata (e.g., content-type).
In the bucket, an object 12 is uniquely identified by a key (i.e., the name of the object) and a version number (version ID). A key is the unique identifier of an object 12 in a bucket 11, and each object 12 in a bucket 11 has exactly one key. Since the combination of bucket 11, key, and version ID uniquely identifies each object 12, the object store can be viewed as a basic data mapping between "bucket 11 + key + version" and the object 12 itself. Each object 12 in the object store is uniquely addressable by a combination of a network (Web) service end node, bucket name, key, and (optionally) version ID.
In the object store, the enumerate objects (list objects) interface is used to list the objects 12 in a bucket 11. The interface may return up to 1000 objects 12 per call. For each enumerated object 12, the result may include information such as the key, the size, the last modification time, and the hash value generated by MD5 (Message Digest Algorithm 5), i.e., the MD5 value of the object 12.
The input parameters of the enumerate objects interface include a marker. The marker is the key of one object 12 and specifies the starting position of a query: after the marker is set, the object result returned by the enumerate objects interface contains the objects 12 that, in lexicographic order, come after the object 12 identified by that key.
The input parameters of the enumerate objects interface also include a maximum return value (max-keys). max-keys specifies the maximum number of objects that the interface can return in one call; after max-keys is set, the returned object result contains at most the first max-keys objects 12 in lexicographic order. max-keys takes values in the range [1, n]. When max-keys is not set, it defaults to n. When the set max-keys exceeds n, n is used instead, where n is a positive integer and is normally 1000.
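The marker and max-keys semantics described above can be illustrated with an in-memory stand-in for the interface. This is a minimal sketch, not a real storage SDK: `list_objects`, `DEFAULT_MAX_KEYS`, and the sorted key list are assumptions made for illustration, and values of max-keys below 1 are not handled.

```python
import bisect

DEFAULT_MAX_KEYS = 1000  # plays the role of "n" in the text

def list_objects(sorted_keys, marker=None, max_keys=None):
    """Hypothetical stand-in for the enumerate objects interface.

    Returns up to max_keys keys that sort strictly after `marker`,
    in lexicographic order. A marker of None starts from the first key.
    """
    # An unset or oversized max_keys falls back to the default n.
    if max_keys is None or max_keys > DEFAULT_MAX_KEYS:
        max_keys = DEFAULT_MAX_KEYS
    start = 0 if marker is None else bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + max_keys]

keys = sorted(f"obj-{i:05d}" for i in range(2500))
first_page = list_objects(keys)                                   # marker is null
next_page = list_objects(keys, marker=first_page[-1], max_keys=5000)
print(len(first_page), first_page[-1], next_page[0], len(next_page))
```

The second call asks for 5000 keys but is capped at the default, and it resumes with the key immediately after the marker.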
In embodiments of the present application, the object enumeration device may first invoke the enumerate object interface through a serial loop to enumerate a list of keys for objects in the bucket. The following describes a method for serially looping through an enumerate object interface to enumerate a list of object keys in the bucket.
Specifically, when the enumerate objects interface is called for the first time, the object enumeration device sets the marker to null and enumerates from the first object in the bucket; on each subsequent call, the object enumeration device sets the marker to the key of the last object returned by the previous call. The object enumeration device calls the enumerate objects interface in a serial loop until the last object in the bucket has been enumerated. Once the last object in the bucket is determined to have been enumerated, the object enumeration device merges the object results returned by each call to obtain all the objects in the bucket.
In an embodiment of the application, the object enumeration device may determine that the last object in the bucket has been enumerated when the number of objects included in the object result returned by the enumerate objects interface is less than the maximum return value (max-keys) of the interface.
It will be appreciated that, to maximize the efficiency of object enumeration, the object enumeration device may set max-keys to an integer no less than n on each call, or may leave max-keys unset, so that the enumerate objects interface can return up to n objects per call. In this case, the object enumeration device may determine that the last object in the bucket has been enumerated when it detects that the number of objects included in the returned object result is less than n.
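The serial loop and its termination condition can be sketched as follows. The in-memory `list_objects` function is a hypothetical stand-in for the real interface, and `enumerate_all` is an illustrative name rather than anything defined in the application.

```python
import bisect

def list_objects(sorted_keys, marker=None, max_keys=1000):
    """Hypothetical stand-in: up to max_keys keys strictly after marker."""
    start = 0 if marker is None else bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + max_keys]

def enumerate_all(sorted_keys, max_keys=1000):
    """Serially loop over the interface until a short page is returned."""
    marker = None          # first call: marker is null
    result = []
    while True:
        page = list_objects(sorted_keys, marker, max_keys)
        result.extend(page)
        # Fewer objects than max-keys means the last object was reached.
        if len(page) < max_keys:
            return result
        marker = page[-1]  # next call starts after the last returned key

keys = sorted(f"obj-{i:05d}" for i in range(2500))
all_keys = enumerate_all(keys)
print(len(all_keys))  # 2500
```

When the object count is an exact multiple of max-keys, this loop makes one additional call that returns an empty page before stopping.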
In practice, there are often add/delete operations on objects in a bucket, for example, adding objects to a bucket, or deleting portions of objects in a bucket. The embodiment of the invention provides an object enumeration method, which can be particularly applied to object enumeration equipment, wherein the object enumeration equipment is demand side equipment for object enumeration and is used for enumerating objects after addition and deletion in a storage bucket. As shown in fig. 2, the object enumeration method includes the steps of:
S21, the keys of the objects in the bucket are sampled at a predetermined interval to obtain an interval mark set. The predetermined interval is the interval at which the keys are sampled. The interval mark set includes a plurality of interval marks, which include the results of sampling the keys of the objects in the bucket.
The keys of the objects in the buckets can be obtained by serially calling the enumerated object interface, and the specific method is as above, and repeated parts are not described again.
After enumerating the keys of the objects in the bucket, the object enumeration device may sample the keys of the objects in the bucket at predetermined intervals to obtain a set of interval labels.
Sampling the keys of the objects in the bucket at the predetermined interval to obtain the interval mark set may specifically include: sampling the keys of the objects in the bucket at the predetermined interval to obtain a sampling result, where the sampling result is the keys of one or more objects; and using the keys of the one or more objects as interval marks to generate the interval mark set.
As can be seen, the interval mark set includes the result of sampling the keys of the objects in the bucket. In an embodiment of the present application, the interval mark set also includes a null value. Thus, the interval mark set includes a plurality of interval marks.
In an embodiment of the present application, the first interval mark in the interval mark set is null, and the keys of the one or more objects appear in the interval mark set in the same order as they appear in the bucket.
Wherein the predetermined interval is a sampling interval of the key.
For example, when the bucket contains 20000 objects, if the predetermined interval is 10000, the sampling result is the key of the 10000th object in the bucket; if the predetermined interval is 5000, the sampling result is the key of the 5000th object, the key of the 10000th object, and the key of the 15000th object in the bucket.
In an embodiment of the present application, the predetermined interval may be determined according to the number of objects contained in the bucket and the number of processing units performing interval object enumeration on the bucket, so as to maximize efficiency of object enumeration.
For example, when the number of objects contained in the bucket and the number of processing units performing interval object enumeration on the bucket are 20000 and 2, respectively, the predetermined interval may be set to 10000 to equally divide the invocation of the enumerate object interface to two processing units to maximize the efficiency of object enumeration.
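One simple way to derive the predetermined interval from these two quantities is to divide the object count evenly among the processing units. This is only a sketch of one possible rule; the application does not fix a formula, and the function name `predetermined_interval` is hypothetical.

```python
import math

def predetermined_interval(total_objects, processing_units):
    """Split the bucket into roughly equal shares, one per processing unit."""
    if total_objects < 1 or processing_units < 1:
        raise ValueError("both arguments must be positive")
    return math.ceil(total_objects / processing_units)

print(predetermined_interval(20000, 2))  # 10000
```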
It is to be understood that all objects in the bucket are partitioned into intervals by the keys of the one or more objects, and each interval mark in the set of interval marks corresponds to an interval.
In an embodiment of the present application, when only one key of an object is included in the interval mark set, the null value and the key of the object may be respectively used as a first interval mark and a second interval mark (i.e. a last interval mark) in the interval mark set. When the interval marker set includes keys of multiple objects, the null value may be used as the first interval marker in the interval marker set, and the sequence of the keys of the multiple objects in the interval marker set may be the same as the sequence of the keys in the bucket.
For example, when the bucket contains 20000 objects, if the sampling result is the key of the 10000th object in the bucket, the null value and the key of the 10000th object can be used as the first interval mark and the second interval mark (i.e., the last interval mark) in the interval mark set, respectively.
When the bucket contains 20000 objects, if the sampling result is the key of the 5000th object, the key of the 10000th object, and the key of the 15000th object in the bucket, then the null value, the key of the 5000th object, the key of the 10000th object, and the key of the 15000th object may be used as the first, second, third, and fourth (i.e., last) interval marks in the interval mark set, respectively.
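The construction of the interval mark set in these examples can be sketched as follows. The function name `build_interval_marks` is hypothetical, `None` stands for the null value, and the choice not to sample the very last key (which would produce an empty final interval) is an assumption inferred from the two examples.

```python
def build_interval_marks(sorted_keys, interval):
    """Return [None, key of the interval-th object, key of the 2*interval-th, ...].

    The null value comes first, and sampled keys keep their bucket order.
    """
    if interval < 1:
        raise ValueError("interval must be positive")
    sampled = [sorted_keys[i]
               for i in range(interval - 1, len(sorted_keys) - 1, interval)]
    return [None] + sampled

# Keys k00001 .. k20000, so that the i-th object has key k<i>.
keys = [f"k{i:05d}" for i in range(1, 20001)]
print(build_interval_marks(keys, 10000))  # [None, 'k10000']
print(build_interval_marks(keys, 5000))   # [None, 'k05000', 'k10000', 'k15000']
```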
Thus, when the interval mark set includes m interval marks, the interval corresponding to the first interval mark (i.e., the first interval) includes the first object in the bucket through the object identified by the second interval mark; the interval corresponding to the m-th interval mark (i.e., the m-th interval) includes the object following the object identified by the m-th interval mark through the last object in the bucket; and the interval corresponding to the i-th (1 < i < m) interval mark (i.e., the i-th interval) includes the object following the object identified by the i-th interval mark through the object identified by the (i+1)-th interval mark.
S22, synchronously enumerating the objects corresponding to any two adjacent interval marks in the interval mark set to obtain a plurality of enumerated object results.
When the interval mark set includes m interval marks: if the two adjacent interval marks are the first interval mark and the second interval mark, the objects corresponding to them include the first object in the bucket through the object identified by the second interval mark; if the two adjacent interval marks are the m-th interval mark and the first interval mark, the objects corresponding to them include the object following the object identified by the m-th interval mark through the last object in the bucket; if the two adjacent interval marks are the i-th (1 < i < m) interval mark and the (i+1)-th interval mark, the objects corresponding to them are the object following the object identified by the i-th interval mark through the object identified by the (i+1)-th interval mark.
That is to say, the object corresponding to any two adjacent interval marks is the object included in the interval corresponding to the previous interval mark in the any two adjacent interval marks.
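The correspondence between interval marks and the objects in each interval can be made concrete with a small in-memory sketch. The names `interval_bounds` and `objects_in_interval` are hypothetical, and a `None` lower or upper bound stands for the start or end of the bucket respectively.

```python
import bisect

def interval_bounds(marks):
    """Pair each interval mark with the next one; the last pairs with None."""
    return list(zip(marks, marks[1:] + [None]))

def objects_in_interval(sorted_keys, lower, upper):
    """Keys strictly after `lower` up to and including `upper`."""
    start = 0 if lower is None else bisect.bisect_right(sorted_keys, lower)
    end = len(sorted_keys) if upper is None else bisect.bisect_right(sorted_keys, upper)
    return sorted_keys[start:end]

keys = list("abcdefghij")
marks = [None, "c", "f"]
intervals = [objects_in_interval(keys, lo, hi) for lo, hi in interval_bounds(marks)]
print(intervals)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i', 'j']]
```

The intervals do not overlap and together cover every key, which is what allows them to be enumerated independently.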
As an optional implementation manner, the object enumeration device may synchronously enumerate the objects corresponding to any two adjacent interval marks in the interval mark set as follows: the plurality of interval marks are allocated to a plurality of processing units, one interval mark per processing unit; each processing unit uses its own interval mark as an input parameter of the enumerate objects interface and enumerates objects from the bucket until the enumerated object result includes an object identified by an interval mark in the interval mark set other than its own, or until the number of objects in the enumerated object result is less than the maximum return value of the enumerate objects interface.
It is understood that the plurality of processing units synchronously execute the step of enumerating the objects from the buckets with the interval markers corresponding to themselves as entries of the enumerate object interface. In embodiments of the present application, the plurality of processing units may be a plurality of threads or a plurality of processes.
When the plurality of processing units are a plurality of threads, each thread uses its corresponding interval mark as an input parameter of the enumerate objects interface and enumerates objects from the bucket, which may specifically include:
S221, judging whether the interval mark set is empty;
S222, when the interval mark set is not empty, acquiring a target interval mark from the interval mark set, and deleting the target interval mark from the interval mark set;
S223, performing interval object enumeration on the interval corresponding to the target interval mark to obtain an object enumeration result corresponding to the target interval mark.
Wherein the plurality of threads are all deployed in the object enumeration device.
In an embodiment of the present application, whether the interval flag set is empty is used to indicate whether the multiple threads have performed interval object enumeration on all intervals.
Specifically, if the interval flag set is not empty, which indicates that the multiple threads do not perform interval object enumeration on all intervals, the multiple threads may perform steps S222 to S223; if the interval flag set is empty, indicating that the multiple threads have performed interval object enumeration for each interval, the object enumeration device may perform step S23.
When any one of the threads acquires a target interval marker from the interval marker set, the target interval marker becomes an interval marker corresponding to the thread itself. The target interval mark may be any interval mark in the interval mark set. As an alternative embodiment, the multiple threads may obtain the first interval marker in the interval marker set as the target interval marker.
After the target interval mark is acquired, any one thread can delete the target interval mark from the interval mark set to update the interval mark set, so that other threads are prevented from executing interval object enumeration on the interval corresponding to the target interval mark.
In an embodiment of the application, the plurality of threads may serially and circularly call the enumeration object interface to perform the interval object enumeration on the interval corresponding to the target interval mark to obtain the object enumeration result corresponding to the target interval mark.
Specifically, when the enumerated object interface is called for the first time, the multiple threads can set the marker as the target interval mark; each subsequent cycle calls the enumerated object interface, the multiple threads can set the marker to the key of the last object in the object result returned by the enumerated object interface at the last interface call. The multiple threads call the enumerate object interface in a loop until the last object in the interval corresponding to the target interval marker is enumerated. When the last object in the interval corresponding to the target interval mark is determined to be enumerated, the object enumeration of the current interval is indicated to be finished, and the multiple threads merge the object results returned by the enumerated object interface each time to obtain the object enumeration result corresponding to the target interval mark.
As an optional implementation manner, the multiple threads may detect whether an object result returned by the enumeration object interface includes an object identified by a section marker different from a section marker corresponding to the thread (i.e., the target section marker) in the section marker set; if it is detected that the object result returned by the enumerate object interface includes an object identified by a section marker different from the target section marker, the multiple threads may determine that the last object in the section corresponding to the target section marker has been enumerated.
In a specific embodiment, before synchronously enumerating the objects corresponding to any two adjacent interval marks in the interval mark set to obtain a plurality of enumerated object results, the object enumeration device may further copy the interval mark set to obtain a first interval mark set. Because threads delete target interval marks from the interval mark set as they work, the copy preserves every mark for boundary detection. Accordingly, detecting whether the object result returned by the enumerate objects interface includes an object identified by an interval mark different from the thread's own interval mark (i.e., the target interval mark) may specifically include: detecting whether the object result returned by the enumerate objects interface includes an object identified by an interval mark in the first interval mark set that is different from the target interval mark.
As another optional implementation, the multiple threads may determine whether the number of objects included in the object result returned by the enumerate object interface is less than max-keys; if the number of objects included in the object result returned by the enumeration object interface is judged to be less than max-keys, the plurality of threads can determine that the last object in the interval corresponding to the target interval mark is enumerated.
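Both termination conditions for enumerating a single interval can be combined in one loop. The following is a minimal sketch under stated assumptions: `list_objects` is a hypothetical in-memory stand-in for the interface, `enumerate_interval` is an illustrative name, and the sketch stops exactly at the boundary object rather than keeping the rest of the page (keeping it would produce overlap between intervals that a later deduplication would remove).

```python
import bisect

def list_objects(sorted_keys, marker=None, max_keys=1000):
    """Hypothetical stand-in: up to max_keys keys strictly after marker."""
    start = 0 if marker is None else bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + max_keys]

def enumerate_interval(sorted_keys, target_mark, all_marks, max_keys=1000):
    """Enumerate the interval that starts at target_mark."""
    other_marks = {m for m in all_marks if m is not None and m != target_mark}
    marker, result = target_mark, []   # first call uses the target mark
    while True:
        page = list_objects(sorted_keys, marker, max_keys)
        for key in page:
            result.append(key)
            if key in other_marks:
                return result          # reached an object identified by another mark
        if len(page) < max_keys:
            return result              # short page: end of the bucket
        marker = page[-1]

keys = [f"k{i:02d}" for i in range(25)]
marks = [None, "k09", "k19"]
print(enumerate_interval(keys, "k09", marks, max_keys=4))  # k10 through k19
```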
Further, after obtaining the object enumeration result corresponding to the target section mark by performing the section object enumeration corresponding to the target section mark, the plurality of threads may return to the execution of steps S221 to S223.
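The loop over steps S221 to S223 can be sketched with a pool of threads that share the interval mark set. Everything here is illustrative: `list_objects` is an in-memory stand-in for the interface, `parallel_enumerate` and `enumerate_interval` are hypothetical names, and a lock-protected list is one assumed way to make "acquire and delete" atomic.

```python
import bisect
import threading

def list_objects(sorted_keys, marker=None, max_keys=1000):
    """Hypothetical stand-in: up to max_keys keys strictly after marker."""
    start = 0 if marker is None else bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + max_keys]

def enumerate_interval(sorted_keys, target, all_marks, max_keys):
    others = {m for m in all_marks if m is not None and m != target}
    marker, result = target, []
    while True:
        page = list_objects(sorted_keys, marker, max_keys)
        for key in page:
            result.append(key)
            if key in others:
                return result
        if len(page) < max_keys:
            return result
        marker = page[-1]

def parallel_enumerate(sorted_keys, marks, num_threads=4, max_keys=1000):
    all_marks = list(marks)   # untouched copy used for boundary detection
    pending = list(marks)     # shared interval mark set, consumed by threads
    lock = threading.Lock()
    results = {}

    def worker():
        while True:
            with lock:
                if not pending:            # S221: is the set empty?
                    return
                target = pending.pop(0)    # S222: acquire and delete
            res = enumerate_interval(sorted_keys, target, all_marks, max_keys)  # S223
            with lock:
                results[target] = res

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

keys = [f"k{i:03d}" for i in range(100)]
marks = [None, "k024", "k049", "k074"]
results = parallel_enumerate(keys, marks, num_threads=3, max_keys=10)
print({m: len(r) for m, r in results.items()})
```

Because each thread returns to S221 after finishing an interval, the number of threads does not need to match the number of interval marks.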
When the plurality of processing units are a plurality of processes, each process uses its corresponding interval mark as an input parameter of the enumerate objects interface and enumerates objects from the bucket, which may specifically include:
S224, judging whether the interval mark set is empty;
S225, when the interval mark set is not empty, acquiring a target interval mark from the interval mark set, and deleting the target interval mark from the interval mark set;
S226, performing interval object enumeration on the interval corresponding to the target interval mark to obtain an object enumeration result corresponding to the target interval mark;
S227, decrementing a preset countdown counter.
Wherein, the processes can be virtual machine processes, host processes or container processes.
As an alternative embodiment, the plurality of processes may be all deployed in the object enumeration device.
As another alternative, the multiple processes may all be deployed in other devices. For example, the multiple processes may be deployed in different devices, respectively, to maximize utilization of the computing power of each device.
As yet another alternative, the processes may be deployed in the object enumeration device and other devices, respectively. For example, some of the processes may be deployed in the object enumeration device, and other processes may be deployed in other devices.
Specifically, if the interval flag set is not empty, which indicates that the processes do not perform interval object enumeration on all intervals, the processes may perform steps S225 to S227; if the interval flag set is empty, indicating that the multiple processes have performed interval object enumeration on each interval, the object enumeration device may perform step S23.
For specific technical details of steps S224 to S226, reference may be made to the related descriptions of steps S221 to S223, and repeated descriptions are omitted.
In an embodiment of the application, after the object enumeration result is obtained, the processes may further obtain a URL set corresponding to the object enumeration result and store the URL set in a preset distributed queue. The URL set comprises the URL corresponding to each object in the object enumeration result.
Each time a process finishes one interval object enumeration, it subtracts 1 from the current value of the preset countdown counter. The preset countdown counter is provided with an initial value.
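The loop formed by steps S224 to S227 can be sketched as follows. This is only an illustrative sketch: to stay self-contained, the workers are simulated by threads in one interpreter, and the shared interval marker set and countdown counter are held in a hypothetical `SharedState` object guarded by a lock. Real processes spread over several devices would need a shared store, which is not shown. `enumerate_interval` is an assumed callable that performs the interval object enumeration for one target marker.

```python
import threading


class SharedState:
    """In-memory stand-in (an assumption) for the state shared by all workers."""

    def __init__(self, markers):
        self.lock = threading.Lock()
        self.marker_set = list(markers)  # interval marker set
        self.countdown = len(markers)    # countdown counter with its initial value
        self.results = {}                # target marker -> object enumeration result

    def take_marker(self):
        """S224/S225: return a target marker and delete it from the set.

        Returns None when the set is empty. The null first marker is the
        empty string here, so None is unambiguous.
        """
        with self.lock:
            if not self.marker_set:
                return None
            return self.marker_set.pop(0)

    def finish(self, target, result):
        """Record the result of S226, then S227: subtract 1 from the counter."""
        with self.lock:
            self.results[target] = result
            self.countdown -= 1


def worker(state, enumerate_interval):
    """One processing unit: repeat S224 to S227 until no marker is left."""
    while True:
        target = state.take_marker()
        if target is None:
            return
        state.finish(target, enumerate_interval(target))


def run_workers(markers, enumerate_interval, worker_count):
    """Start the workers, wait for them, and return the shared state."""
    state = SharedState(markers)
    threads = [
        threading.Thread(target=worker, args=(state, enumerate_interval))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state
```

Deleting the target marker under the lock is what keeps two workers from enumerating the same interval.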
S23, merging the plurality of enumerated object results.
After interval object enumeration has been performed on all intervals, the object enumeration device may merge the enumerated object results to obtain all objects currently in the bucket.
As an optional implementation, after merging the plurality of enumerated object results, the object enumeration device may further deduplicate the merged results to obtain all objects currently in the bucket.
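Merging followed by deduplication can be sketched as below. The sketch assumes each enumerated object result is a list of object keys, and that duplicates arise where the last page of one interval overlaps the start of the next (an inference from the stop condition of the interval enumeration, not a statement of the embodiment). The function names are hypothetical.

```python
def merge_results(enumerated_results):
    """Concatenate the per-interval results into one list (duplicates kept)."""
    merged = []
    for result in enumerated_results:
        merged.extend(result)
    return merged


def deduplicate(merged_keys):
    """Drop repeated keys and return the remaining keys in sorted order."""
    return sorted(set(merged_keys))
```

Sorting after deduplication is a design choice of this sketch, so that the output order does not depend on the order in which the intervals completed.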
When the processing units are a plurality of processes, the object enumeration device is triggered to merge the enumerated object results when the current value of the preset countdown counter reaches a designated value.
The designated value is the difference between the initial value and the number of interval markers in the interval marker set. It is to be understood that when the current value of the preset countdown counter equals the designated value, the processes have performed interval object enumeration on every interval.
In an embodiment of the application, the object enumeration device may continuously detect the current value of the preset countdown counter. For example, the object enumeration device may poll the preset countdown counter.
As an alternative embodiment, the initial value may be set to the number of interval markers in the interval marker set. In this embodiment, when the value of the preset countdown counter is 0, the object enumeration device may first obtain each URL set from the preset distributed queue, then perform addressing according to each URL set to obtain each object enumeration result, and then merge the obtained object enumeration results to obtain all objects currently in the bucket.
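The trigger and collection logic described above might look like the following sketch. Everything named here is an assumption made for illustration: `queue.Queue` stands in for the preset distributed queue, `read_counter` is a hypothetical callable returning the current counter value, and `fetch` is a hypothetical addressing function that resolves one URL to its object.

```python
import queue
import time


def designated_value(initial_value, marker_count):
    """Counter value reached once every interval has been enumerated."""
    return initial_value - marker_count


def wait_and_merge(read_counter, url_queue, fetch, target_value, poll_interval=0.01):
    """Poll the counter, then drain the queue and resolve every URL."""
    # Poll until the countdown counter reaches the designated value.
    while read_counter() != target_value:
        time.sleep(poll_interval)

    merged = []
    while True:
        try:
            url_set = url_queue.get_nowait()  # one URL set per interval
        except queue.Empty:
            break
        for url in url_set:
            merged.append(fetch(url))  # addressing: URL -> object
    return merged
```

When the initial value equals the number of interval markers, `designated_value` evaluates to 0, which matches the embodiment in which merging starts once the counter reads 0.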
It is understood that, after all objects currently in the bucket are obtained, the object enumeration method of the embodiment of the present application may generate a new interval marker set from those objects, so that steps S21 to S23 can be performed based on the new interval marker set the next time the objects in the bucket need to be enumerated.
In the object enumeration method, a plurality of processing units synchronously perform interval object enumeration to obtain a plurality of object enumeration results, and the object enumeration results are then merged to obtain all objects in the bucket. Compared with a technical solution in which a single thread calls the enumeration object interface serially in a loop until all objects in the bucket have been enumerated, calling the enumeration object interface in parallel from a plurality of processing units until interval object enumeration has been performed on all intervals allows all objects in the bucket to be enumerated quickly.
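The contrast between the serial loop and the parallel scheme can be illustrated end to end with the sketch below. It is a sketch under stated assumptions rather than the claimed implementation: `list_page` is a hypothetical in-memory enumeration object interface over a sorted key list, threads from `ThreadPoolExecutor` play the role of the processing units, and the null first marker is an empty string. With an in-memory list there is no real speed-up; the point is only that both paths return the same objects.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def list_page(keys, marker, max_keys):
    """Hypothetical enumeration object interface: keys strictly after marker."""
    start = bisect_right(keys, marker) if marker else 0
    return keys[start:start + max_keys]


def serial_enumerate(keys, max_keys):
    """Baseline: one caller pages through the whole bucket in a loop."""
    result, marker = [], ""
    while True:
        page = list_page(keys, marker, max_keys)
        result.extend(page)
        if len(page) < max_keys:
            return result
        marker = page[-1]


def enumerate_interval(keys, target, markers, max_keys):
    """One processing unit: page from `target` until a boundary or a short page."""
    others = set(markers) - {target}
    result, marker = [], target
    while True:
        page = list_page(keys, marker, max_keys)
        result.extend(page)
        if len(page) < max_keys or others.intersection(page):
            return result
        marker = page[-1]


def parallel_enumerate(keys, markers, max_keys):
    """One task per interval marker, then merge and deduplicate."""
    with ThreadPoolExecutor(max_workers=len(markers)) as pool:
        parts = pool.map(
            lambda m: enumerate_interval(keys, m, markers, max_keys), markers
        )
        merged = [key for part in parts for key in part]
    return sorted(set(merged))
```

Against a remote object store each page request is a network round trip, which is where issuing the per-interval loops concurrently would be expected to save time.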
For incremental migration of object buckets, after all objects in the bucket have been enumerated quickly, the object enumeration method according to the embodiment of the present application may obtain an incremental object set of a source bucket by filtering on the latest modification time of the objects, and synchronize the incremental object set to a destination bucket. If the source bucket contains a large number of objects, the object enumeration method according to the embodiment of the present application may greatly shorten the time needed to obtain the incremental object set, thereby shortening the overall object migration time and the service interruption time during cutover.
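Filtering by latest modification time can be sketched as follows. The record layout (a dictionary with `key` and `last_modified` fields) and the `last_sync_time` cutoff are assumptions made for illustration; the embodiment does not specify them.

```python
from datetime import datetime, timezone


def incremental_object_set(enumerated_objects, last_sync_time):
    """Keep only objects modified after the previous synchronization."""
    return [
        obj for obj in enumerated_objects
        if obj["last_modified"] > last_sync_time
    ]


# Usage: objects changed since the last synchronization form the incremental set.
last_sync = datetime(2018, 9, 1, tzinfo=timezone.utc)
objects = [
    {"key": "a.txt", "last_modified": datetime(2018, 8, 20, tzinfo=timezone.utc)},
    {"key": "b.txt", "last_modified": datetime(2018, 9, 15, tzinfo=timezone.utc)},
    {"key": "c.txt", "last_modified": datetime(2018, 9, 28, tzinfo=timezone.utc)},
]
increment = incremental_object_set(objects, last_sync)
```

Only the objects in `increment` would then need to be copied to the destination bucket.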
Referring to fig. 3, fig. 3 is a schematic frame diagram of an object enumeration device 30 according to an embodiment of the present application. As shown in fig. 3, the object enumeration device 30 may include a sampling module 31, an enumeration module 32, and a merge module 33.
The sampling module 31 is configured to sample keys of the objects in a bucket at predetermined intervals to obtain an interval marker set, where the interval marker set includes a plurality of interval markers, and the interval markers are the sampling results of sampling the keys of the objects in the bucket.
As an alternative embodiment, the first interval marker of the plurality of interval markers is null.
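One possible reading of the sampling performed by the sampling module 31 is sketched below. It assumes that the keys are available as a sorted list, that "at predetermined intervals" means taking every `interval`-th key, and that the null first marker is represented by an empty string; the function name is hypothetical.

```python
def sample_interval_markers(sorted_keys, interval):
    """Return a null first marker followed by every interval-th key."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    markers = [""]  # null first marker: enumeration starts at the first object
    markers.extend(sorted_keys[interval - 1::interval])
    return markers
```

Starting with a null marker lets the first interval begin at the very first object in the bucket, which no sampled key could otherwise cover.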
The enumeration module 32 is configured to synchronously enumerate the objects corresponding to any two adjacent interval markers in the interval marker set to obtain a plurality of enumerated object results.
As an optional implementation manner, the enumeration module 32 is specifically configured to allocate the plurality of interval markers to a plurality of processing units, where one interval marker corresponds to one processing unit, so that each processing unit uses its own interval marker as an input parameter of an enumeration object interface and enumerates objects in the bucket until an enumerated object result includes an object identified by an interval marker in the interval marker set other than its own interval marker, or until the number of objects included in the enumerated object result is smaller than the maximum return value of the enumeration object interface.
The merging module 33 is configured to merge the multiple enumerated object results.
As an optional implementation manner, the object enumeration device 30 may further include a deduplication module 34, configured to deduplicate the multiple enumerated object results after merging.
Based on the same inventive concept, the principle by which the object enumeration device 30 provided in the embodiment of the present application solves the problem, and its beneficial effects, are similar to those of the object enumeration method embodiment of the present application. The implementation of the object enumeration device 30 may therefore refer to the implementation of the object enumeration method shown in fig. 2, and the details are not described again.
Referring to fig. 4, fig. 4 is a schematic frame diagram of another object enumeration device 40 according to an embodiment of the present application. As shown in fig. 4, the object enumeration device 40 may include: a bus 41, a processor 42, a memory 43, and an input/output interface 44. The bus 41 is used for interconnecting the processor 42, the memory 43 and the input/output interface 44 and enabling the above elements to communicate with each other. The memory 43 is used to store one or more computer programs comprising computer instructions. The input/output interface 44 is used to control the communication connections between the object enumeration device 40 and other devices.
In particular, the processor 42 is configured to invoke the computer instructions to implement the object enumeration method as shown in FIG. 2.
The processor 42 may be a Central Processing Unit (CPU). The memory 43 may be any type of memory, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), or a Non-Volatile Random Access Memory (NVRAM).
Based on the same inventive concept, the principle by which the object enumeration device 40 provided in the embodiment of the present application solves the problem, and its beneficial effects, are similar to those of the object enumeration method embodiment of the present application. The implementation of the object enumeration device 40 may therefore refer to the implementation of the object enumeration method shown in fig. 2, and the details are not described again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a ROM or a RAM.

Claims (10)

1. An object enumeration method, comprising:
sampling keys of objects in a bucket at predetermined intervals to obtain an interval marker set, wherein the interval marker set comprises a plurality of interval markers, the plurality of interval markers being sampling results of sampling the keys of the objects in the bucket;
synchronously enumerating objects corresponding to any two adjacent interval markers in the interval marker set to obtain a plurality of enumerated object results; and
merging the plurality of enumerated object results.
2. The method of claim 1, wherein said synchronously enumerating objects corresponding to any two adjacent interval markers in said interval marker set comprises:
allocating the plurality of interval markers to a plurality of processing units respectively, wherein one interval marker corresponds to one processing unit, so that each of the plurality of processing units takes its own interval marker as an input parameter of an enumeration object interface and enumerates objects in the bucket until an enumerated object result comprises an object identified by an interval marker in the interval marker set other than its own interval marker, or until the number of objects contained in the enumerated object result is less than the maximum return value of the enumeration object interface.
3. The method of claim 1 or 2, wherein a first interval marker of the plurality of interval markers is null.
4. The method of any of claims 1 to 3, wherein after said merging said plurality of enumerated object results, said method further comprises:
deduplicating the merged plurality of enumerated object results.
5. An object enumeration device, comprising:
a sampling module, configured to sample keys of objects in a bucket at predetermined intervals to obtain an interval marker set, wherein the interval marker set comprises a plurality of interval markers, the plurality of interval markers being sampling results of sampling the keys of the objects in the bucket;
an enumeration module, configured to synchronously enumerate objects corresponding to any two adjacent interval markers in the interval marker set to obtain a plurality of enumerated object results; and
a merging module, configured to merge the plurality of enumerated object results.
6. The device of claim 5, wherein the enumeration module is specifically configured to:
allocate the plurality of interval markers to a plurality of processing units respectively, wherein one interval marker corresponds to one processing unit, so that each of the plurality of processing units takes its own interval marker as an input parameter of an enumeration object interface and enumerates objects in the bucket until an enumerated object result comprises an object identified by an interval marker in the interval marker set other than its own interval marker, or until the number of objects contained in the enumerated object result is less than the maximum return value of the enumeration object interface.
7. The device of claim 5 or 6, wherein a first interval marker of the plurality of interval markers is null.
8. The device according to any one of claims 5 to 7, further comprising:
a deduplication module, configured to deduplicate the merged plurality of enumerated object results.
9. An object enumeration device, wherein the object enumeration device comprises a processor and a memory, the processor and the memory being interconnected, wherein the processor executes computer instructions in the memory to implement the method of any of claims 1-4.
10. A computer storage medium, characterized in that the computer storage medium stores a computer program comprising computer instructions that, when executed by a processor, cause the processor to perform the method according to any of claims 1-4.
CN201811146403.1A 2018-09-28 2018-09-28 Object enumeration method, device and computer storage medium Active CN109408571B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811146403.1A CN109408571B (en) 2018-09-28 2018-09-28 Object enumeration method, device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811146403.1A CN109408571B (en) 2018-09-28 2018-09-28 Object enumeration method, device and computer storage medium

Publications (2)

Publication Number Publication Date
CN109408571A CN109408571A (en) 2019-03-01
CN109408571B true CN109408571B (en) 2022-03-29

Family

ID=65466568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811146403.1A Active CN109408571B (en) 2018-09-28 2018-09-28 Object enumeration method, device and computer storage medium

Country Status (1)

Country Link
CN (1) CN109408571B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150093B (en) * 2023-03-04 2023-11-03 北京大道云行科技有限公司 Method for realizing object storage enumeration of objects and electronic equipment

Also Published As

Publication number Publication date
CN109408571A (en) 2019-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220217

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Applicant after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd.

GR01 Patent grant