CN111538865B - Multiparty set synchronization method and device and electronic equipment - Google Patents


Info

Publication number
CN111538865B
CN111538865B CN202010230302.3A CN202010230302A CN111538865B CN 111538865 B CN111538865 B CN 111538865B CN 202010230302 A CN202010230302 A CN 202010230302A CN 111538865 B CN111538865 B CN 111538865B
Authority
CN
China
Prior art keywords
synchronization
cuckoo
filtering
fingerprints
elements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010230302.3A
Other languages
Chinese (zh)
Other versions
CN111538865A (en)
Inventor
郭得科
罗来龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202010230302.3A priority Critical patent/CN111538865B/en
Publication of CN111538865A publication Critical patent/CN111538865A/en
Application granted granted Critical
Publication of CN111538865B publication Critical patent/CN111538865B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Abstract

The invention provides a multi-party set synchronization method, a multi-party set synchronization device, and electronic equipment. The method comprises the following steps: constructing a marked cuckoo filter (MCF) data structure and defining the operations of the marked cuckoo filter; representing the local elements of each synchronization participant with a marked cuckoo filter to obtain a local set for each synchronization participant; sending the local sets to a central node for aggregation to obtain a global set; sending the global set to each synchronization participant; having each synchronization participant traverse the global set to determine its difference elements; and having each synchronization participant acquire the difference elements from the global set, thereby completing multi-party set synchronization.

Description

Multiparty set synchronization method and device and electronic equipment
Technical Field
The present invention relates to the field of distributed systems, and in particular, to a method and an apparatus for synchronizing a multi-party set, and an electronic device.
Background
Synchronizing the contents of sets held by multiple parties is a basic function in distributed and networked systems. Existing data structures cannot simultaneously meet the requirements of space efficiency, fusibility, and information integrity that multi-party synchronization scenarios impose, and existing approaches cannot optimize the transmission overhead of synchronization at the global level.
Disclosure of Invention
In view of the above, the present invention aims to provide a multi-party set synchronization method, apparatus and electronic device, which can optimize the transmission overhead of synchronization from the global level and meet the requirements of space efficiency, fusibility and information integrity in multi-party synchronization scenarios.
Based on the above object, the present invention provides a multi-party set synchronization method, comprising:
constructing a marked cuckoo filter data structure and defining the operations of the marked cuckoo filter;
representing the local elements of each synchronization participant with a marked cuckoo filter to obtain a local set for each synchronization participant;
sending the local sets to a central node for aggregation to obtain a global set;
sending the global set to each of the synchronization participants;
having each synchronization participant traverse the global set to determine its difference elements;
and having each synchronization participant acquire the difference elements from the global set so as to complete multi-party set synchronization.
In some embodiments, the marked cuckoo filter data structure specifically includes:
a homogeneous marked cuckoo filter: consisting of m cells, each cell containing b slots and therefore storing at most b element fingerprints, where an element fingerprint may be stored in either of its candidate cells; each slot has two fields, a fingerprint field that stores the element fingerprint and a flag field that records the membership information of the stored element; the fingerprint field contains f bits and the flag field contains n bits, where n is the number of sets participating in synchronization; any element x has two candidate cells, whose indexes are $h_1(x) = \mathrm{hash}(x) \bmod m$ and $h_2(x) = \mathrm{hash}(x) \odot (\mathrm{hash}(\eta_x) \bmod m)$, where $\eta_x$ denotes the element fingerprint of x and hash denotes a hash function;
a heterogeneous marked cuckoo filter: the number of cells and the capacity of each cell are related by $m_i = \alpha |S_i| / b_i$, where $m_i$ and $b_i$ denote the number of cells and the capacity of each cell, respectively, and $\alpha \ge 1$ is a constant; the indexes of element x in its two candidate cells are $h_1(x) = \mathrm{hash}(\eta_x) \bmod m$ and
Figure SMS_1
In some embodiments, the operations of the marked cuckoo filter specifically include:
element insertion: computing the element fingerprint through a hash function and inserting it into one of the two candidate cells; performing reassignment when all slots in both candidate cells are occupied;
element query: traversing the two candidate cells and, when the queried element is found, returning the flag field of the slot that holds it; otherwise, returning an error;
element deletion:
when an element is deleted from one set: first determining the candidate cells of the fingerprint of the element to be deleted, and returning an error if the fingerprint is not present or if the i-th bit of the flag field of the slot storing it is 0; otherwise, resetting the i-th bit of the flag field of that slot to 0; and, if all bits of the flag field of that slot are then 0, deleting the fingerprint;
when an element is deleted from all sets: deleting the fingerprint of the element and setting all bits of the flag field of the slot that held it to 0;
aggregation of multiple marked cuckoo filters: given two marked cuckoo filter vectors $\mathrm{MCF}_i$ and $\mathrm{MCF}_j$, adding the membership information of the elements common to both vectors to $\mathrm{MCF}_i$, and inserting the remaining elements of $\mathrm{MCF}_j$ into $\mathrm{MCF}_i$.
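The aggregation operation above can be sketched in Python for the homogeneous case. This is a minimal illustration, not the patented implementation: the list-of-lists layout (each slot is a `[fingerprint, flags]` pair), the function name `aggregate`, and the caller-supplied `alt(index, fingerprint)` helper that returns the other candidate cell are all assumptions made for the example. When both candidate cells are full, a complete filter would fall back to reassignment; the sketch simply raises an error.

```python
def aggregate(cells_i, cells_j, alt, b):
    """Merge cells_j into cells_i in place (both filters have the same m and b).

    cells_*: list of cells; each cell is a list of [fingerprint, flags] slots.
    alt:     hypothetical helper, alt(index, fingerprint) -> other candidate cell.
    b:       number of slots per cell.
    """
    for idx, cell in enumerate(cells_j):
        for fp, flags in cell:
            candidates = (idx, alt(idx, fp))
            for c in candidates:
                hit = next((s for s in cells_i[c] if s[0] == fp), None)
                if hit is not None:
                    hit[1] |= flags  # common element: merge membership bits
                    break
            else:
                # Fingerprint not yet in cells_i: insert it with its flags.
                for c in candidates:
                    if len(cells_i[c]) < b:
                        cells_i[c].append([fp, flags])
                        break
                else:
                    raise OverflowError("both candidate cells are full")
    return cells_i
```

Because the flag fields are combined with a bitwise OR, merging never loses membership information, which is the property the text calls fusibility.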
In some embodiments, the reassignment specifically includes:
when all slots in the two candidate cells are occupied, randomly kicking out an already stored element fingerprint to free a slot, and then storing the new element fingerprint and its membership information in the freed slot; the kicked-out fingerprint is reassigned to its other candidate cell; the process ends when no further fingerprint is kicked out or when the number of reassignments reaches a given upper limit.
In some embodiments, sending the local sets to a central node for aggregation specifically includes:
selecting the synchronization participant with the greatest node degree as the central node;
for homogeneous marked cuckoo filters: selecting an aggregation path with a minimum spanning tree algorithm, and transmitting the local sets to the central node along the aggregation path for aggregation;
for heterogeneous marked cuckoo filters: constructing an empty marked cuckoo filter whose capacity is determined by the number of element fingerprints in the local sets to be aggregated, and inserting the element fingerprints of those local sets into the empty marked cuckoo filter one by one.
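The selection of the central node and of the aggregation path can be illustrated with a short Python sketch. The text specifies only "greatest node degree" and "a minimum spanning tree algorithm"; the use of Prim's algorithm, the adjacency-dictionary format, and the function names below are assumptions for illustration.

```python
import heapq


def central_node(adj):
    """Pick the participant with the greatest node degree."""
    return max(adj, key=lambda v: len(adj[v]))


def minimum_spanning_tree(adj, root):
    """Prim's algorithm on an undirected, connected graph.

    adj:  {node: {neighbor: edge_weight}}
    Returns {node: parent}, i.e. the next hop of each node toward the root.
    """
    parent = {root: None}
    heap = [(w, root, v) for v, w in adj[root].items()]
    heapq.heapify(heap)
    while heap and len(parent) < len(adj):
        _, u, v = heapq.heappop(heap)
        if v in parent:
            continue  # already connected through a cheaper edge
        parent[v] = u
        for nxt, w in adj[v].items():
            if nxt not in parent:
                heapq.heappush(heap, (w, v, nxt))
    return parent


# Hypothetical four-participant topology with link costs.
adj = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
root = central_node(adj)
tree = minimum_spanning_tree(adj, root)
```

In this sketch each participant forwards its local filter to its `parent`, so filters can be merged hop by hop on the way to the central node.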
In some embodiments, having each synchronization participant traverse the global set to determine its difference elements specifically includes:
the difference elements include missing elements and unique elements;
each synchronization participant traverses the global set, adds the fingerprints and membership information of its missing elements to a missing-element set, and adds the fingerprints and membership information of its unique elements to a unique-element set.
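A possible reading of this traversal is sketched below in Python. The global set is modeled as a plain list of `(fingerprint, flags)` pairs, and the definitions used are assumptions, since this passage does not spell them out: an element is treated as missing for participant i when bit i of its flag field is 0, and as unique when bit i is the only bit set.

```python
def classify(global_slots, i):
    """Split the global set into missing and unique elements for participant i.

    global_slots: iterable of (fingerprint, flags) pairs from the global filter.
    i:            1-based index of the participant's set S_i.
    """
    bit = 1 << (i - 1)
    missing, unique = [], []
    for fp, flags in global_slots:
        if not flags & bit:
            missing.append((fp, flags))  # held elsewhere, absent from S_i
        elif flags == bit:
            unique.append((fp, flags))   # held by S_i only (assumed meaning)
    return missing, unique
```

The flag field kept with each missing element tells the participant which other parties hold it, which is what allows a provider to be chosen later.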
In some implementations, when the synchronization participant dynamically joins and leaves:
when a synchronization participant participating in synchronization leaves, if the leaving synchronization participant is not a leaf node of a minimum spanning tree in an aggregation path, splitting the minimum spanning tree into a plurality of subtrees, and connecting the subtrees by using a minimum edge among the subtrees to obtain a new minimum spanning tree;
When a new synchronization participant joins, selecting a minimum edge between the new synchronization participant and an existing synchronization participant, and joining the new synchronization participant into an existing minimum spanning tree through the minimum edge.
Based on the same inventive concept, the invention also provides a multi-party set synchronization device, comprising:
a data structure construction module configured to construct a marked cuckoo filter data structure and define the operations of the marked cuckoo filter;
a local element representation module configured to represent the local elements of each synchronization participant with a marked cuckoo filter to obtain a local set for each synchronization participant;
an aggregation module configured to send the local sets to a central node for aggregation to obtain a global set;
a set sending module configured to send the global set to each of the synchronization participants;
a difference element determination module configured to have each synchronization participant traverse the global set and determine its difference elements;
and a synchronization module configured to have each synchronization participant acquire the difference elements from the global set so as to complete multi-party set synchronization.
Based on the same inventive concept, the present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any one of claims 1 to 7 when executing the program.
Based on the same inventive concept, the present invention also provides a non-transitory computer readable storage medium, characterized in that the non-transitory computer readable storage medium stores computer instructions for causing the computer to perform the method of any one of claims 1 to 7.
As can be seen from the above, the multi-party set synchronization method, device, and electronic equipment provided by the invention target multi-party set synchronization in distributed scenarios. They construct the marked cuckoo filter and use it as the data structure for summarizing sets, and they aggregate and distribute the sets along a minimum spanning tree, which shortens the transmission path. Selecting a central node and aggregating while transmitting further reduces the transmission cost. The approach meets the requirements of space efficiency, fusibility, and information integrity in multi-party synchronization scenarios and saves bandwidth.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a structural diagram of a marked cuckoo filter according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an aggregation and distribution process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present invention should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present disclosure pertains. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
As users move computing and data to the cloud, cloud services such as Dropbox, Google Drive, and OneDrive continue to emerge to help users access data from different devices and collaborate on the same data. Because multiple copies of the data coexist on user devices, cloud servers, and edge servers, the copies must be synchronized periodically when a user updates data from different devices, to ensure consistency and correctness. The same multi-party set synchronization problem is also common in wireless sensor networks, software-defined networks, content distribution networks, blockchain transaction processing, and other distributed scenarios.
The multi-party set synchronization problem decomposes naturally into two dimensions: 1) set representation: how to represent the set elements; 2) synchronization strategy: how the synchronizing parties interact to determine and transmit the difference elements. Existing set representation methods rely primarily on linear sketch data structures such as hash trees, Bloom filters, and invertible Bloom lookup tables (IBLT). In particular, if the set elements can be represented as integers, a characteristic polynomial of the set built from these integers can also be used as the set sketch. Synchronization strategies are then built on top of these set representation methods. Typically, the sketches need to be exchanged between the parties involved in the synchronization. After obtaining the sketches of the other synchronization participants, the local synchronization participant can infer from them the difference elements that need to be transmitted.
However, current sketch data structures do not meet the new demands of multi-party set synchronization scenarios. Specifically, when multiple parties participate, the sketch data structure needs the following characteristics: 1) space efficiency: the space used is far smaller than the original data size of the set elements; 2) fusibility (mergeability): several sketches can be merged into one sketch without losing any information; 3) information integrity: both the content information (e.g., element fingerprints) and the membership information (which set or sets an element belongs to) of the set elements can be recorded in the sketch. Space efficiency and fusibility keep the transmission overhead of exchanging sketches between synchronization participants low. Information integrity ensures synchronization accuracy. Existing sketch data structures do not have all three characteristics at the same time. Bloom filters and their variants are all space-efficient, but most of them lack fusibility. IBLT is both space-efficient and fusible, but does not achieve information integrity.
Moreover, existing synchronization strategies do not jointly consider the positions of the synchronizing parties in the network and the distribution of data among them, so the transmission overhead cannot be optimized globally. As a result, these strategies often cause unnecessary sketch exchanges or data transfers. Currently, most cloud storage services deploy replica policies to achieve set synchronization. They typically use a cloud server as a central node that collects the data held on the user devices, that is, on each of the other synchronization participants. The user devices upload updated data blocks to the cloud or download missing data blocks from the cloud, and in this way the parties are synchronized. Unfortunately, the transmission overhead of this strategy depends heavily on the physical locations of the synchronization participants and the distances between them. The strategy also wastes a large amount of network bandwidth, because a synchronization participant can obtain data only from the cloud server even when one of its neighbor nodes holds the data it needs. Other possible strategies, such as exchanging sketches and transmitting difference elements via all-to-all or gossip protocols, may occupy network bandwidth even more heavily.
For this purpose, the present application first designs a marked cuckoo filter (Marked Cuckoo Filter, MCF) data structure for representing multi-party sets; the marked cuckoo filter is hereinafter referred to as MCF. Based on this, the application further proposes a multi-party set synchronization strategy, MCFsyn. MCFsyn aggregates and distributes the MCFs generated by the parties along the minimum spanning tree among the synchronization participants. Each synchronization participant traverses the global MCF, which records the information of the entire union, and identifies its missing and unique set elements. For missing elements, MCFsyn lets the synchronization participant select the best provider of the element content, thereby minimizing transmission overhead. Experiments show that MCFsyn can outperform other methods in synchronization accuracy and transmission overhead.
The application first proposes a new variant of the cuckoo filter, the marked cuckoo filter (MCF), for representing multi-party sets. The MCF adds a flag field to each slot to indicate the membership information of the stored element before synchronization is completed. For example, given three sets $S_1$, $S_2$, and $S_3$, the MCF uses a three-bit flag field to represent the membership information of each stored element. If the element exists in set $S_i$, the i-th bit of the flag field is set to 1. The MCF naturally inherits the functions of the standard cuckoo filter, including element insertion, query, and deletion. In addition, the MCF supports aggregation, so that MCFs from different synchronization participants can be merged into a single MCF. With this design, the MCF achieves space efficiency, fusibility, and information integrity at the same time.
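The three-set example can be made concrete with a small Python sketch that packs membership into an integer bit mask. Storing the flag field as an integer with bit i-1 standing for set $S_i$, and the helper names `flag_field` and `members`, are illustrative assumptions rather than details given in the text.

```python
def flag_field(member_of, n):
    """Build an n-bit flag field; the bit for S_i is set when i is in member_of."""
    flags = 0
    for i in member_of:
        if not 1 <= i <= n:
            raise ValueError("set index out of range")
        flags |= 1 << (i - 1)
    return flags


def members(flags, n):
    """Decode a flag field back into the list of set indexes."""
    return [i for i in range(1, n + 1) if (flags >> (i - 1)) & 1]


# An element held by S_1 and S_3 out of three sets.
flags = flag_field([1, 3], n=3)
```

Here `flags` equals `0b101`, and `members(flags, 3)` recovers `[1, 3]`, so the slot records exactly which sets hold the element.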
Based on the MCF data structure, the application proposes a novel multi-party set synchronization strategy, MCFsyn. MCFsyn has five main steps: 1) each synchronization participant represents its set elements as an MCF; 2) the MCFs generated by the synchronization participants are aggregated into a global MCF; 3) the global MCF is distributed to each synchronization participant; 4) each synchronization participant traverses the global MCF to determine its missing elements; 5) each synchronization participant acquires its missing elements with minimal transmission overhead. Each step requires joint consideration of synchronization delay, transmission overhead, and the underlying network topology.
The invention aims to provide a multiparty set synchronization method, a multiparty set synchronization device and electronic equipment, wherein the multiparty set synchronization method, the multiparty set synchronization device and the electronic equipment can optimize synchronous transmission overhead from a global level and meet the requirements of space efficiency, fusibility and information integrity under multiparty synchronization scenes.
The invention is described in further detail below with reference to FIG. 1 (structure of the marked cuckoo filter), FIG. 2 (aggregation and distribution process), and FIG. 3 (hardware structure of the electronic device), each according to an embodiment of the present invention. The invention provides a multi-party set synchronization method, which comprises the following steps:
S1: constructing a marked cuckoo filter data structure and defining the operations of the marked cuckoo filter;
the marked cuckoo filter data structure specifically comprises:
a homogeneous marked cuckoo filter: consisting of m cells, each cell containing b slots and therefore storing at most b element fingerprints, where an element fingerprint may be stored in either of its candidate cells; each slot has two fields, a fingerprint field that stores the element fingerprint and a flag field that records the membership information of the stored element; the fingerprint field contains f bits and the flag field contains n bits, where n is the number of sets participating in synchronization; any element x has two candidate cells, whose indexes are $h_1(x) = \mathrm{hash}(x) \bmod m$ and $h_2(x) = \mathrm{hash}(x) \odot (\mathrm{hash}(\eta_x) \bmod m)$, where $\eta_x$ denotes the element fingerprint of x and hash denotes a hash function;
As shown in FIG. 1, the MCF consists of m cells. Each cell contains b slots, so each cell stores at most b element fingerprints. Each slot has two fields: a fingerprint field that stores the element fingerprint and a flag field that records the membership information of the stored element. The fingerprint field contains f bits; the flag field contains n bits, where n is the number of sets involved in synchronization. If the element exists in set $S_i$, the i-th bit of the flag field is set to 1. The MCF provides two candidate cells for any element x, with indexes $h_1(x) = \mathrm{hash}(x) \bmod m$ and $h_2(x) = \mathrm{hash}(x) \odot (\mathrm{hash}(\eta_x) \bmod m)$. The element fingerprint $\eta_x$ can be stored in either of these cells. If all slots in the two candidate cells are occupied, the MCF randomly kicks out an already stored element fingerprint and stores $\eta_x$ and its membership information in the freed slot. The kicked-out fingerprint is reassigned to its other candidate cell. This reassignment procedure ends when no further fingerprint is kicked out (the insertion succeeds) or when the number of reassignments reaches a given upper limit max (the insertion fails).
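The candidate-cell computation can be sketched in Python as follows. Several choices here are assumptions, not statements of the patent: the operator written ⊙ is taken to be bitwise XOR as in the standard cuckoo filter, m is assumed to be a power of two so that the XOR result stays below m, SHA-256 stands in for the unspecified hash function, and the function names are hypothetical.

```python
import hashlib


def _hash(data: bytes) -> int:
    """Stand-in hash function (assumption): first 8 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def fingerprint(x: bytes, f: int) -> int:
    """f-bit element fingerprint eta_x (salted so it differs from the index hash)."""
    return _hash(b"fp:" + x) % (1 << f)


def alt_cell(index: int, eta: int, m: int) -> int:
    """Other candidate cell, computed from the current cell and the fingerprint."""
    return index ^ (_hash(eta.to_bytes(8, "big")) % m)


def candidate_cells(x: bytes, m: int, f: int):
    """Return (h1, h2) for element x in a filter with m cells."""
    if m <= 0 or m & (m - 1):
        raise ValueError("this sketch assumes m is a power of two")
    eta = fingerprint(x, f)
    h1 = _hash(x) % m
    return h1, alt_cell(h1, eta, m)


h1, h2 = candidate_cells(b"element", m=8, f=8)
```

Under the XOR assumption, either cell can be recovered from the other using the fingerprint alone, which is what lets a kicked-out fingerprint be moved without access to the original element.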
a heterogeneous marked cuckoo filter: the number of cells and the capacity of each cell are related by $m_i = \alpha |S_i| / b_i$, where $m_i$ and $b_i$ denote the number of cells and the capacity of each cell, respectively, and $\alpha \ge 1$ is a constant; the indexes of element x in its two candidate cells are $h_1(x) = \mathrm{hash}(\eta_x) \bmod m$ and
Figure SMS_2
In practice, the set sizes of the synchronization participants vary widely. For example, a participant that has been offline for a long time may hold far fewer elements than the others, while a participant that has just been updated may hold more. In this case, representing the different sets with homogeneous MCFs is not economical. Therefore, the MCFsyn design with heterogeneous MCFs is considered here, so that each synchronization participant can size its MCF according to the size of its local set.
Unlike homogeneous MCFs, heterogeneous MCFs have different numbers of cells, so the candidate cells of the same element differ from one MCF to another. This makes the aggregation operation described above invalid. The MCF data structure is therefore redesigned so that the candidate cells of an element are determined only by its fingerprint. Specifically, given an element x and its fingerprint $\eta_x$, the two corresponding candidate cells are $h_1(x) = \mathrm{hash}(\eta_x) \bmod m$ and
Figure SMS_3
As a result, when element fingerprints from two different MCFs are inserted into their aggregated MCF, the candidate cells are determined by the fingerprints alone. For participant $P_i$, the MCF length is calculated as $m_i = \alpha |S_i| / b_i$, where $m_i$ and $b_i$ denote the number of cells of $\mathrm{MCF}_i$ and the capacity of each cell, respectively, and $\alpha \ge 1$ is a constant.
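The sizing rule $m_i = \alpha |S_i| / b_i$ can be illustrated with a few lines of Python. Rounding up to a whole number of cells and the lower bound of one cell are assumptions added so the result is usable; the text gives only the ratio.

```python
import math


def num_cells(set_size: int, b: int, alpha: float = 1.0) -> int:
    """Number of cells m_i for a local set of |S_i| = set_size elements.

    b:     slots per cell (b_i).
    alpha: over-provisioning constant, alpha >= 1.
    """
    if alpha < 1:
        raise ValueError("alpha must be at least 1")
    if b < 1:
        raise ValueError("b must be at least 1")
    return max(1, math.ceil(alpha * set_size / b))


small = num_cells(10, b=4)      # a participant that was offline for a long time
large = num_cells(1000, b=4)    # a participant with many local elements
```

With these inputs the small participant needs 3 cells and the large one 250, which shows why a single homogeneous size would waste space on one side or overflow on the other.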
In addition to element fingerprints, a multi-party set representation must also record the membership information of the elements. For this purpose, the MCF adds extra bits to each slot to record this information. Based on this design, the MCF supports element-oriented set representation operations, including element insertion, query, and deletion. In addition, the MCF supports aggregation between MCF vectors. With its fingerprint and flag fields, the MCF naturally supports multi-party set representation. The MCF is therefore used in this application as the basic data structure supporting the multi-party synchronization strategy.
Element insertion: computing the element fingerprint through a hash function and inserting it into one of the two candidate cells; performing reassignment when all slots in both candidate cells are occupied; the specific process is as follows:
When inserting an element x of set $S_i$, the MCF first computes the f-bit element fingerprint $\eta_x$ through a hash function. The candidate cells of x are then computed by the functions $h_1(x)$ and $h_2(x)$. If either candidate cell has an unoccupied slot, $\eta_x$ is placed in that slot and the i-th bit of the slot's flag field is set to 1. Otherwise, reassignment is performed: the MCF kicks out a randomly chosen fingerprint already stored in one of the two candidate cells to make room for $\eta_x$, and stores $\eta_x$ and its membership information in the freed slot. The kicked-out fingerprint and its membership information are reassigned to its other candidate cell. Reassignment ends when no further fingerprint is kicked out or when the number of reassignments reaches a given upper limit max. When a fingerprint is kicked out of a slot, the 1 bits in that slot's flag field are reset to 0; accordingly, when the fingerprint is reinserted, the corresponding bits of the flag field of its new slot are set to 1. This keeps the element information correct throughout the reinsertion process.
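The insertion procedure, including reassignment, might look like the following Python sketch. It is an illustration under stated assumptions, not the patented implementation: the class name `MCF`, the SHA-256 stand-in hash, XOR for the ⊙ operator, a power-of-two m, and the early check that merges flag bits when the fingerprint is already stored are all choices made for the example.

```python
import hashlib
import random


def _h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class MCF:
    def __init__(self, m, b, f, n, max_kicks=500, seed=0):
        assert m > 0 and m & (m - 1) == 0, "sketch assumes m is a power of two"
        self.m, self.b, self.f, self.n = m, b, f, n
        self.max_kicks = max_kicks
        self.cells = [[] for _ in range(m)]  # each slot: [fingerprint, flags]
        self.rng = random.Random(seed)

    def fingerprint(self, x: bytes) -> int:
        return _h(b"fp:" + x) % (1 << self.f)

    def alt(self, idx: int, eta: int) -> int:
        return idx ^ (_h(eta.to_bytes(8, "big")) % self.m)

    def cells_of(self, x: bytes):
        h1 = _h(x) % self.m
        return h1, self.alt(h1, self.fingerprint(x))

    def insert(self, x: bytes, i: int) -> bool:
        """Insert element x of set S_i; return False if reassignment gives up."""
        eta, bit = self.fingerprint(x), 1 << (i - 1)
        h1, h2 = self.cells_of(x)
        for idx in (h1, h2):  # assumption: reuse an existing slot if present
            for slot in self.cells[idx]:
                if slot[0] == eta:
                    slot[1] |= bit
                    return True
        for idx in (h1, h2):  # free slot in a candidate cell
            if len(self.cells[idx]) < self.b:
                self.cells[idx].append([eta, bit])
                return True
        idx, item = self.rng.choice((h1, h2)), [eta, bit]
        for _ in range(self.max_kicks):  # reassignment ("kicking")
            victim = self.rng.randrange(self.b)
            item, self.cells[idx][victim] = self.cells[idx][victim], item
            idx = self.alt(idx, item[0])  # victim moves to its other cell
            if len(self.cells[idx]) < self.b:
                self.cells[idx].append(item)  # flags travel with the fingerprint
                return True
        return False  # upper limit reached; the last kicked item is dropped


mcf = MCF(m=64, b=4, f=16, n=3)
items = [f"item-{k}".encode() for k in range(20)]
results = [mcf.insert(x, 2) for x in items]
```

Moving the whole `[fingerprint, flags]` pair during a kick is one way to realize the requirement that membership information stays correct across reinsertion.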
Element query: traversing the two candidate cells, and returning the mark domain of the storage tank of the queried element when the queried element is found; otherwise, returning an error; the specific process comprises the following steps:
when querying element x, the MCF only needs to check two candidate cells of element x. If the element fingerprint eta x Can be found in either of these two cells, the MCF returns the flag field of its slot to explicitly give membership information for element x. Otherwise, the MCF returns False, meaning that element x is not an element in any collection. The temporal complexity of element queries is also a constant level, as only two cells will be traversed. The false positive misjudgment probability of the MCF membership query is
Figure SMS_4
where f and b are the element fingerprint length and the number of storage slots in each cell, respectively.
Element deletion: when an element is deleted from one set: first, the candidate cell positions of the element fingerprint of the element to be deleted are determined, and an error is returned if the element fingerprint does not exist or if the i-th bit of the flag field of the storage slot storing the element fingerprint is 0; otherwise, the i-th bit of the flag field of the storage slot storing the element fingerprint is reset to 0; if all bits of the flag field of that storage slot are then 0, the element fingerprint is deleted; when an element is deleted from all sets: the element fingerprint of the element to be deleted is deleted and all bits of the flag field of the storage slot in which it resides are set to 0; the specific process comprises the following steps:
the delete operation is critical when representing a dynamic set. The MCF supports deleting an element x from a particular set S_i, and also supports deleting x from all sets simultaneously. Specifically, to delete element x from set S_i, the MCF first determines the candidate cell positions of the element fingerprint η_x. If η_x is not present in its candidate cells, or if the i-th bit of the flag field of the slot storing the element fingerprint is 0, the MCF returns False, indicating that the element is not in set S_i. Otherwise, the i-th bit of the flag field of the slot storing the element fingerprint is reset to 0. After that, if all bits of the flag field of the slot are 0, the element fingerprint is also deleted. Deleting element x from all sets directly is simpler: the MCF only needs to delete η_x and set all bits of the flag field of its slot to 0. It follows that the time complexity of the element deletion operation is constant.
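The two deletion modes can be sketched as follows. This is an illustrative sketch only: the table layout (a list of cells, each a list of `[fingerprint, flags]` slots) and the function names `delete_from_set` and `delete_from_all` are assumptions, and the candidate cell indices are passed in directly rather than derived from hash functions.

```python
def delete_from_set(cells, candidates, fp, i):
    """Clear bit i for fingerprint fp; return False if fp is not a member of S_i."""
    for k in candidates:
        for idx, slot in enumerate(cells[k]):
            if slot[0] == fp:
                if not slot[1] & (1 << i):
                    return False           # fingerprint stored, but not for set i
                slot[1] &= ~(1 << i)       # reset the i-th flag bit
                if slot[1] == 0:
                    del cells[k][idx]      # no set holds the element any more
                return True
    return False                           # fingerprint not in either candidate cell


def delete_from_all(cells, candidates, fp):
    """Remove fingerprint fp together with its whole flag field."""
    for k in candidates:
        for idx, slot in enumerate(cells[k]):
            if slot[0] == fp:
                del cells[k][idx]
                return True
    return False


# Fingerprint 7 currently belongs to S_0 and S_1 and sits in cell 0.
cells = [[[7, 0b011]], []]
r_absent = delete_from_set(cells, (0, 1), 7, 2)   # not in S_2
r_first = delete_from_set(cells, (0, 1), 7, 0)    # leaves only bit 1
after_first = [list(s) for s in cells[0]]
r_second = delete_from_set(cells, (0, 1), 7, 1)   # last bit cleared, slot freed
```

Both functions inspect at most two cells, which is consistent with the constant time complexity stated above.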
Figure SMS_5
Table 1 algorithm 1
Aggregation of multiple MCFs: given two MCF vectors MCF_i and MCF_j, the membership information of the elements common to the two vectors is added to the vector MCF_i, and the difference elements in MCF_j are inserted into MCF_i; the specific process comprises the following steps:
the aggregation operation fuses the element fingerprint information and membership information from several isomorphic MCFs into one MCF. This operation is significant for reducing the transmission overhead of set synchronization. As shown in algorithm 1, given two MCF vectors MCF_i and MCF_j, the basic idea of the aggregation operation is to add the membership information of the elements common to the two vectors to MCF_i, while the difference elements in MCF_j are inserted into MCF_i. Specifically, the algorithm traverses the entire MCF_j vector. For any stored element fingerprint MCF_j[k][r].fingerprint (0 ≤ k ≤ m−1 and 0 ≤ r ≤ b−1), the algorithm searches for that element fingerprint in MCF_i. If the element fingerprint is found in MCF_i, the flag field of the corresponding storage slot is recalculated as the bitwise OR of its current flag field and MCF_j[k][r].mark (lines 4 to 6 in algorithm 1). Otherwise, MCF_j[k][r].fingerprint is inserted into MCF_i. Finally, the updated MCF_i is returned as the result of the aggregation. The time complexity of this algorithm is O(mb).
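The aggregation of two isomorphic MCFs can be sketched as below. The sketch is not algorithm 1 itself: the helper `alt_index`, the XOR-based alternate-cell rule (m a power of two), and the decision to raise an error instead of kicking when both candidate cells are full are assumptions made to keep the example short.

```python
import hashlib


def alt_index(k, fp, m):
    """Other candidate cell of fingerprint fp stored in cell k (assumed rule)."""
    digest = hashlib.sha256(fp.to_bytes(8, "big")).digest()
    return k ^ (int.from_bytes(digest[:8], "big") % m)


def aggregate(mcf_i, mcf_j, b):
    """Merge mcf_j into mcf_i; both are lists of cells of [fingerprint, flags] slots."""
    m = len(mcf_i)
    assert m == len(mcf_j) and m & (m - 1) == 0
    for k in range(m):
        for fp, flags in mcf_j[k]:
            cands = (k, alt_index(k, fp, m))
            hit = next((s for c in cands for s in mcf_i[c] if s[0] == fp), None)
            if hit is not None:
                hit[1] |= flags            # common element: OR the flag fields
                continue
            free = next((c for c in cands if len(mcf_i[c]) < b), None)
            if free is None:
                # A full implementation would fall back to the kicking procedure.
                raise OverflowError("both candidate cells are full")
            mcf_i[free].append([fp, flags])  # difference element: insert it
    return mcf_i


mcf_i = [[[5, 0b001]], []]                   # fingerprint 5 known to S_0
mcf_j = [[[5, 0b010]], [[9, 0b100]]]         # 5 known to S_1, 9 known to S_2
merged = aggregate(mcf_i, mcf_j, b=2)
```

Every slot of MCF_j is visited once and each visit touches two cells of MCF_i, which matches the O(mb) bound given in the text.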
Based on the above synchronization framework, two challenges remain to be resolved. First, the paths used to aggregate and distribute the MCFs have a significant impact on the overall transmission overhead. Gossiping or broadcasting these sketches can also serve the purpose, but results in significant delay and transmission overhead. It is therefore necessary to take the underlying network into account and plan the transmission paths reasonably. Second, a missing element may exist on multiple synchronization participants, and selecting the optimal sender of the missing element is also challenging. To solve these two problems, the present application first presents an abstraction of the synchronization group before introducing the concrete scheme.
Consider n synchronization participants, denoted P_1, …, P_n, each holding one local set. As shown in fig. 2, the present application abstracts the synchronization group as a complete graph. Each edge in the graph has a weight that represents the number of physical network hops between the corresponding pair of nodes. For example, the weight w_{i,j} represents the number of hops between participants P_i and P_j. In the aggregation step, the sketches of all participants are sent to the central node. Based on the above definitions, the specific design details of the multi-party set synchronization method MCFsyn are as follows:
s2: the local elements of each synchronous participant are respectively represented by using the mark cuckoo filtering, so that a local set of each synchronous participant is obtained:
each synchronization participant represents its local elements with one MCF as a sketch of its set. The MCFs used by different participants may be isomorphic or heterogeneous, according to specific requirements. Isomorphic MCFs share the same parameter configuration, including the MCF length m, the hash functions used to generate element fingerprints and compute candidate cells, and the capacity b of each cell, which greatly simplifies the subsequent aggregation and extraction steps; heterogeneous MCFs reduce the storage overhead and the overhead of aggregation and distribution, reducing bandwidth waste.
S3: the local set is sent to a central node for aggregation, and a global set is obtained:
the sending of the local set to the central node for aggregation specifically comprises: selecting the synchronization participant having the greatest node degree as the central node; for isomorphic MCFs: selecting an aggregation path through a minimum spanning tree algorithm, and transmitting the local sets to the central node along the aggregation path for aggregation; for heterogeneous MCFs: constructing an empty MCF whose capacity is determined by the number of element fingerprints in the local sets to be aggregated, and inserting the element fingerprints of the local sets to be aggregated into the empty MCF one by one.
The method specifically comprises the following steps:
all synchronization participants send their sketches to a central node for aggregation, so that the global sketch of the union is obtained. For privacy protection, the data to be synchronized may only be transmitted between the participants involved in the synchronization, and therefore the selected central node must also be one of those participants.
After constructing the synchronization group topology, MCFsyn computes the minimum spanning tree (MST) of the graph as the path for sketch aggregation. One of the synchronization participants is selected as the central node and is responsible for collecting and aggregating the MCFs from the other participants. Notably, the choice of the central node has no impact on the transmission overhead of the overall aggregation process. MCFsyn nevertheless prefers the participant with the greatest node degree in the MST as the central node, so that participants at the leaf nodes of the MST can transmit their MCFs toward the central node in parallel. In fig. 2, participant P_3 is selected as the central node; participants P_2 and P_5 can thus simultaneously transfer their MCFs toward P_3 along the MST. Intermediate nodes between the leaf nodes and the central node aggregate the MCFs from their child nodes with their local MCFs and send the aggregation result to their parent node. For example, in fig. 2, after receiving MCF_5 from P_5, participant P_1 aggregates the received MCF_5 with its local MCF_1 and sends the aggregation result to its parent node P_3. This aggregate-while-transmitting policy can significantly reduce the transmission overhead of aggregation.
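The path planning step can be sketched with Prim's algorithm followed by a degree count. The text does not name a particular MST algorithm, so the use of Prim's algorithm, the function names, and the hop-count matrix `w` below are all illustrative assumptions; the matrix does not reproduce fig. 2.

```python
def prim_mst(w):
    """Return the MST edges of a complete graph given a symmetric weight matrix."""
    n = len(w)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        u, v = min(
            ((a, c) for a in in_tree for c in range(n) if c not in in_tree),
            key=lambda e: w[e[0]][e[1]],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


def central_node(edges, n):
    """Participant with the greatest node degree in the tree."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(range(n), key=lambda p: degree[p])


# Hypothetical hop counts between five participants (indices 0..4).
w = [
    [0, 5, 1, 6, 3],
    [5, 0, 2, 7, 8],
    [1, 2, 0, 4, 9],
    [6, 7, 4, 0, 10],
    [3, 8, 9, 10, 0],
]
tree = prim_mst(w)
center = central_node(tree, len(w))
total_hops = sum(w[u][v] for u, v in tree)
```

With these weights the tree consists of the edges 0-2, 1-2, 0-4 and 2-3, so participant 2 has degree 3 and is chosen as the central node; its three neighbours can forward their sketches in parallel.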
In the heterogeneous case, given MCF_i and MCF_j, an empty MCF, denoted MCF_o, is created, whose capacity is calculated as m_o b_o = α|S_i ∪ S_j|. The value of |S_i ∪ S_j| can be obtained from the number of element fingerprints stored in MCF_i and MCF_j. The element fingerprints in MCF_i and MCF_j are then inserted into MCF_o one by one. Thus, the time complexity of the aggregation operation increases from O(mb) in the isomorphic case to O(m_i b_i + m_j b_j).
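The capacity rule m_o b_o = α|S_i ∪ S_j| can be illustrated with a few lines of Python. The function name `merged_capacity`, the default values of α and b_o, and the use of the sum of the two fingerprint counts as an upper bound on |S_i ∪ S_j| are assumptions; the text only states that the union size is derived from the stored fingerprint counts.

```python
import math


def merged_capacity(count_i, count_j, alpha=1.5, b_o=4):
    """Return (m_o, b_o) such that m_o * b_o >= alpha * (count_i + count_j)."""
    union_upper_bound = count_i + count_j   # assumption: treat the sets as disjoint
    m_o = math.ceil(alpha * union_upper_bound / b_o)
    return m_o, b_o


# MCF_i stores 100 fingerprints and MCF_j stores 50.
m_o, b_o = merged_capacity(100, 50)
```

Here 1.5 × 150 = 225 slots are requested, which rounds up to 57 cells of 4 slots each.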
As the synchronization participants dynamically join and leave:
when a synchronization participant participating in synchronization leaves, if the leaving synchronization participant is not a leaf node of a minimum spanning tree in an aggregation path, splitting the minimum spanning tree into a plurality of subtrees, and connecting the subtrees by using a minimum edge among the subtrees to obtain a new minimum spanning tree;
When a new synchronization participant joins, selecting a minimum edge between the new synchronization participant and an existing synchronization participant, and joining the new synchronization participant into an existing minimum spanning tree through the minimum edge.
The method specifically comprises the following steps:
in the synchronization group topology, if a leaf node of the constructed minimum spanning tree (MST) leaves the synchronization group, the rest of the MST is not affected, and only the corresponding bit of the flag field in the MCF needs to be ignored or deleted. However, when a non-leaf node of the MST leaves the synchronization group, the MST is split into multiple subtrees, and MCFsyn needs to construct a new MST. This problem can be solved by the following cut property and inferences.
Theorem 1 (cut property of the minimum spanning tree): let G(V, E) represent a graph and (X, V−X) a cut of graph G. If edge e is the least costly of all edges crossing the cut, then every minimum spanning tree of graph G contains edge e.
Inference 1: let G(V, E) represent a graph, and let T represent a minimum spanning tree of graph G.
Figure SMS_6
is a connected subgraph of graph G, where
Figure SMS_7
Figure SMS_8
is a subtree of T that covers the nodes in the subgraph
Figure SMS_9
Then
Figure SMS_10
is a minimum spanning tree of the graph
Figure SMS_11
Proof: consider the cut
Figure SMS_14
and the minimum spanning tree T. According to theorem 1, one obtains:
Figure SMS_15
where
Figure SMS_17
is the subtree of T that covers the nodes of
Figure SMS_13
and e is the least costly of the edges connecting
Figure SMS_16
and
Figure SMS_18
If
Figure SMS_19
is not a minimum spanning tree of the subgraph
Figure SMS_12
it can be replaced by a minimum spanning tree of that subgraph with a smaller cost, so that graph G has a spanning tree with a smaller cost than T. This contradicts the fact that T is a minimum spanning tree of graph G, which proves the correctness of inference 1.
Inference 2: let the complete graph G(V, E) represent a synchronization group topology with minimum spanning tree T. When synchronization participant P_i leaves the synchronization group, T may be split into multiple subtrees. By connecting these subtrees one by one with the minimum edges between them, without creating a cycle, a minimum spanning tree of the synchronization group without P_i is obtained.
Inference 2 is a natural consequence of theorem 1 and inference 1 and is not proved further herein. Based on this inference, MCFsyn obtains a new minimum spanning tree by connecting the subtrees using the minimum edges between them. When adding these edges, no cycle may be introduced into the minimum spanning tree.
When a new synchronization participant joins the synchronization group, the required operations are quite simple. When P_{n+1} joins the synchronization group, MCFsyn only needs to select the minimum edge between P_{n+1} and the existing synchronization participants and add P_{n+1} to the existing minimum spanning tree through that edge. Furthermore, an extra bit needs to be introduced into the flag field of the MCF data structure to record the element membership information of the set S_{n+1} held by the new participant.
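The repair after a departure (inference 2) and the join rule can be sketched with a union-find structure. The function names, the weight matrix `w`, and the starting tree are hypothetical. The join helper follows the text literally by attaching the newcomer through its cheapest edge and does not re-examine the rest of the tree.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def repair_after_leave(w, tree_edges, gone):
    """Reconnect the subtrees left after participant `gone` departs."""
    nodes = [v for v in range(len(w)) if v != gone]
    parent = {v: v for v in nodes}
    kept = [(u, v) for u, v in tree_edges if gone not in (u, v)]
    for u, v in kept:                                  # surviving subtrees
        parent[find(parent, u)] = find(parent, v)
    candidates = sorted(
        (w[u][v], u, v) for idx, u in enumerate(nodes) for v in nodes[idx + 1:]
    )
    for _, u, v in candidates:                         # cheapest edges first
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:                                   # joins two subtrees, no cycle
            parent[ru] = rv
            kept.append((u, v))
    return kept


def add_participant(w, tree_edges, new):
    """Attach a new participant through its minimum edge, as the text describes."""
    existing = {v for edge in tree_edges for v in edge}
    nearest = min(existing, key=lambda v: w[new][v])
    return tree_edges + [(nearest, new)]


# Hypothetical hop counts and the minimum spanning tree of the five participants.
w = [
    [0, 5, 1, 6, 3],
    [5, 0, 2, 7, 8],
    [1, 2, 0, 4, 9],
    [6, 7, 4, 0, 10],
    [3, 8, 9, 10, 0],
]
tree = [(0, 2), (2, 1), (0, 4), (2, 3)]
repaired = repair_after_leave(w, tree, gone=2)         # the central node leaves
repaired_cost = sum(w[u][v] for u, v in repaired)
```

After participant 2 leaves, only edge 0-4 survives; the sketch reconnects participants 1 and 3 through edges 0-1 and 0-3, giving a tree of total weight 14 over the four remaining participants.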
S4: transmitting the global set to each of the synchronization participants:
the central node distributes the aggregated global sketch to each synchronization participant. In this way, each participant readily learns the information of the elements in the union.
The central node aggregates the MCFs sent by its child nodes into a global MCF that serves as a sketch of the union set. The global MCF, denoted MCF_o, represents the fingerprint information and membership information of all elements of the union. As shown in fig. 2, MCFsyn distributes MCF_o from the central node to each synchronization participant via the MST.
S5: traversing the global set by the synchronous participants to determine difference elements:
the difference elements include missing elements and unique elements; the synchronous participants traverse the global set, element fingerprints of the missing elements and membership information thereof are added to the missing element set, and element fingerprints of the unique elements and membership information thereof are added to the unique element set.
The method specifically comprises the following steps:
after receiving MCF_o, each participant traverses MCF_o to determine its missing elements and unique elements. Algorithm 2 shows the extraction performed by participant P_i. For an element fingerprint stored in any slot, if the value of the i-th bit in the flag field of the slot is 0, the element does not belong to set S_i. Thus, the element fingerprint and its membership information are added to the set
Figure SMS_20
(lines 6 to 7 in algorithm 2). In addition, if only the i-th bit is 1, the element exists only in set S_i and is added to the set
Figure SMS_21
(lines 8 to 9 in algorithm 2). The time complexity of the extraction operation is O(mb), because all slots in MCF_o need to be traversed.
Figure SMS_22
Figure SMS_23
Table 2 algorithm 2
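The extraction step can be sketched as a single pass over the global MCF. This is an illustration rather than a transcription of algorithm 2: the function name `extract`, the table layout (cells holding `(fingerprint, flags)` pairs), and the sample data are assumptions.

```python
def extract(global_mcf, i):
    """Split the global sketch into elements missing at P_i and elements unique to P_i."""
    bit = 1 << i
    missing, unique = [], []
    for cell in global_mcf:
        for fp, flags in cell:
            if not flags & bit:
                missing.append((fp, flags))   # i-th bit is 0: P_i lacks the element
            elif flags == bit:
                unique.append((fp, flags))    # only the i-th bit is 1
    return missing, unique


# Hypothetical global MCF for three participants (bit j stands for S_j).
global_mcf = [
    [(3, 0b001), (8, 0b110)],
    [(5, 0b111)],
    [],
    [(2, 0b010)],
]
missing_0, unique_0 = extract(global_mcf, 0)
missing_1, unique_1 = extract(global_mcf, 1)
```

Elements whose flag field has the i-th bit set together with other bits, such as fingerprint 5 above, fall into neither list because P_i already holds them and so do other participants.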
S6: the synchronization participant acquires the difference element from the global set to complete multi-party set synchronization:
for a participant P_i, the unique elements in its set
Figure SMS_24
are pushed by P_i to the other participants. Pushing the elements by broadcast saves time, whereas transmitting them along the MST saves transmission overhead. For
Figure SMS_25
if only one bit in the corresponding flag field is 1, the element is a unique element of the participant to which that bit corresponds. In this case, P_i only needs to receive the element content from that holder. Conversely, if multiple bits in the flag field are 1, P_i selects, as the content sender of the element, the holder with which it has the smallest edge weight in the synchronization group topology. Such a policy may be implemented by maintaining a preference list on each synchronization participant. In this list, neighbor nodes with lower edge weights have higher priority, so that the optimal element content sender is selected.
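The sender selection policy can be sketched as follows. The function names `preference_list` and `choose_sender` and the hop-count matrix `w` are hypothetical; the sketch only shows how a flag field and the edge weights together determine the content sender.

```python
def preference_list(w, i):
    """Other participants ordered by increasing edge weight from P_i."""
    return sorted((j for j in range(len(w)) if j != i), key=lambda j: w[i][j])


def choose_sender(flags, i, w):
    """Pick the holder of an element that is closest to P_i, or None if nobody holds it."""
    for j in preference_list(w, i):
        if (flags >> j) & 1:          # participant j holds the element
            return j
    return None


# Hypothetical hop counts between five participants (indices 0..4).
w = [
    [0, 5, 1, 6, 3],
    [5, 0, 2, 7, 8],
    [1, 2, 0, 4, 9],
    [6, 7, 4, 0, 10],
    [3, 8, 9, 10, 0],
]
prefs_3 = preference_list(w, 3)
single_holder = choose_sender(0b00100, 3, w)    # held only by participant 2
several_holders = choose_sender(0b10011, 3, w)  # held by participants 0, 1 and 4
```

For participant 3 the preference list is 2, 0, 1, 4, so an element held by participants 0, 1 and 4 is fetched from participant 0, the nearest holder.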
Based on the same inventive concept, the invention also provides a multiparty set synchronization device, comprising:
a data structure construction module configured to construct a flag cuckoo filtering data structure and define an operation of the flag cuckoo filtering;
the local element representation module is configured to respectively represent the local elements of each synchronous participant by using the mark cuckoo filtering to obtain a local set of each synchronous participant;
the aggregation module is configured to send the local set to a central node for aggregation to obtain a global set;
a set sending module configured to send the global set to each of the synchronization participants;
A difference element determination module configured to cause the synchronization participant to traverse the global set, determining a difference element;
and the synchronization module is configured to enable the synchronization participant to acquire the difference element from the global set so as to complete multi-party set synchronization.
The device of the foregoing embodiment is configured to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Based on the same inventive concept, the invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, characterized in that the processor executes the program to implement the method according to the above embodiment.
Based on the same inventive concept, the present invention also provides a non-transitory computer readable storage medium, characterized in that the non-transitory computer readable storage medium stores computer instructions for causing the computer to perform the method according to the above embodiments.
It should be noted that, the method of the embodiment of the present invention may be performed by a single device, for example, a computer or a server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one of the devices may perform only one or more steps of the method of an embodiment of the present invention, the devices interacting with each other to accomplish the method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Fig. 3 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit ), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory ), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the invention. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present invention is to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The embodiments of the invention are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the present invention should be included in the scope of the present invention.

Claims (8)

1. A multi-party set synchronization method, comprising:
constructing a flag cuckoo filtering data structure and defining the operation of the flag cuckoo filtering;
the local elements of each synchronous participant are respectively represented by using the mark cuckoo filtering, so that a local set of each synchronous participant is obtained;
the local set is sent to a central node for aggregation, and a global set is obtained;
transmitting the global set to each of the synchronization participants;
Traversing the global set by the synchronous participants to determine difference elements;
enabling the synchronous participants to acquire the difference elements from the global set so as to complete multi-party set synchronization;
the tag cuckoo filtering data structure specifically comprises:
isomorphic flag cuckoo filtering: consists of m cells, each cell containing b storage slots and storing a maximum of b element fingerprints, which can be stored in any one candidate cell; each storage slot has two fields, a fingerprint field for storing the fingerprint information of the element and a flag field for recording the membership information of the stored element; the fingerprint field comprises f bits and the flag field comprises n bits, wherein n is the number of sets participating in synchronization; two candidate cells are provided for any element x, wherein the indexes of the two candidate cells of element x are respectively: h_1(x) = hash(x) % m and h_2(x) = hash(x) ⊙ (hash(η_x) % m), wherein η_x represents the element fingerprint of element x and hash represents a hash function;
heterogeneous flag cuckoo filtering: the relationship between the number of cells and the capacity of each cell is specifically: m_i = α|S_i|/b_i, wherein m_i and b_i respectively represent the number of cells and the capacity of each cell, and α ≥ 1 is a constant; the indexes of the two candidate cells of element x are respectively: h_1(x) = hash(η_x) % m and h_2(x) = (h_1(x) ⊕ η_x) % m;
Each synchronous participant uses one MCF to represent the local element, and the MCFs used by different participants adopt isomorphic MCFs or heterogeneous MCFs according to specific requirements;
the sending the local set to the central node for aggregation specifically comprises:
selecting one of the synchronous participants having the greatest node degree as a central node;
the isomorphic marker cuckoo filtering: selecting an aggregation path through a minimum spanning tree algorithm, and transmitting the local set to a central node through the aggregation path for aggregation;
for the heterogeneous flag cuckoo filtering: constructing an empty flag cuckoo filter, wherein the capacity of the empty flag cuckoo filter is determined by the number of element fingerprints of the local sets to be aggregated; and inserting the element fingerprints in the local sets to be aggregated into the empty flag cuckoo filter one by one.
2. The multi-party set synchronization method of claim 1, wherein the operation of flag cuckoo filtering specifically comprises:
element insertion: calculating element fingerprints through a hash function, and then inserting the element fingerprints into the two candidate cells; reassigning when all the storage slots in both of the candidate cells are occupied;
element query: traversing the two candidate cells, and returning the flag field of the storage slot of the queried element when the queried element is found; otherwise, returning an error;
element deletion:
when an element is deleted from one set: firstly, determining the candidate cell positions of the element fingerprint of the element to be deleted, and returning an error if the element fingerprint of the element to be deleted does not exist or the i-th bit of the flag field of the storage slot storing the element fingerprint of the element to be deleted is 0; otherwise, resetting to 0 the i-th bit of the flag field of the storage slot storing the element fingerprint of the element to be deleted; if all bits of the flag field of the storage slot storing the element fingerprint of the element to be deleted are 0, deleting the element fingerprint of the element to be deleted;
when an element is deleted from all sets: deleting the element fingerprint of the element to be deleted and setting to 0 all bits of the flag field of the storage slot in which the element fingerprint of the element to be deleted is located;
aggregation of a plurality of flag cuckoo filters: given two flag cuckoo filtering vectors MCF_i and MCF_j, adding the membership information of the elements common to the two vectors to the vector MCF_i, and inserting the difference elements in MCF_j into MCF_i.
3. The multi-party set synchronization method according to claim 2, wherein said reassigning specifically comprises:
when all storage slots in the two candidate cells are occupied, randomly kicking out an element fingerprint which is already stored to obtain an empty storage slot, and then storing the element fingerprint to be stored and its membership information in the empty storage slot; the kicked-out element fingerprint is reassigned to its other candidate cell; the reassignment ends when no further element fingerprint is kicked out or the number of reassignments reaches a given upper limit.
4. The method for synchronizing a set of parties according to claim 1, wherein the step of causing the synchronized parties to traverse the global set, determining the difference element comprises:
the difference elements include missing elements and unique elements;
the synchronous participants traverse the global set, element fingerprints of the missing elements and membership information thereof are added to the missing element set, and element fingerprints of the unique elements and membership information thereof are added to the unique element set.
5. The multi-party set synchronization method of claim 4, wherein when the synchronization participants dynamically join and leave:
When a synchronization participant participating in synchronization leaves, if the leaving synchronization participant is not a leaf node of a minimum spanning tree in an aggregation path, splitting the minimum spanning tree into a plurality of subtrees, and connecting the subtrees by using a minimum edge among the subtrees to obtain a new minimum spanning tree;
when a new synchronization participant joins, selecting a minimum edge between the new synchronization participant and an existing synchronization participant, and joining the new synchronization participant into an existing minimum spanning tree through the minimum edge.
6. A multi-party set synchronization apparatus, comprising:
a data structure construction module configured to construct a marked cuckoo filter (MCF) data structure and define the operations of the marked cuckoo filter;
a local element representation module configured to represent the local elements of each synchronization participant with a marked cuckoo filter to obtain a local set of each synchronization participant;
an aggregation module configured to send the local sets to a central node for aggregation to obtain a global set;
a set sending module configured to send the global set to each of the synchronization participants;
a difference element determination module configured to cause each synchronization participant to traverse the global set and determine the difference elements;
the synchronization module is configured to enable the synchronization participant to acquire the difference elements from the global set so as to complete multi-party set synchronization;
the marked cuckoo filter data structure specifically comprises:
isomorphic marked cuckoo filter: consisting of m cells, each cell containing b storage slots and storing at most b element fingerprints, wherein an element fingerprint can be stored in any one of its candidate cells; each storage slot has two fields, a fingerprint field for storing the fingerprint information of the element and a flag field for recording the membership information of the stored element; the fingerprint field comprises f bits and the flag field comprises n bits, where n is the number of sets participating in synchronization; two candidate cells are provided for any element x, and the indexes of the two candidate cells of element x are respectively $h_1(x) = \mathrm{hash}(x) \bmod m$ and $h_2(x) = \mathrm{hash}(x) \odot (\mathrm{hash}(\eta_x) \bmod m)$, where $\eta_x$ denotes the element fingerprint of element x and hash denotes a hash function;
heterogeneous marked cuckoo filter: the relationship between the number of cells and the capacity of each cell is $m_i = \alpha |S_i| / b_i$, where $m_i$ and $b_i$ respectively denote the number of cells and the capacity of each cell, and $\alpha \ge 1$ is a constant; the indexes of the two candidate cells of element x are respectively $h_1(x) = \mathrm{hash}(\eta_x) \bmod m$ and $h_2(x) = (h_1(x) \oplus \eta_x) \bmod m$;
each synchronization participant uses one MCF to represent its local elements, and the MCFs used by different participants are isomorphic MCFs or heterogeneous MCFs according to specific requirements;
sending the local sets to the central node for aggregation specifically comprises:
selecting the synchronization participant having the greatest node degree as the central node;
for isomorphic marked cuckoo filters: selecting an aggregation path through a minimum spanning tree algorithm, and transmitting the local sets to the central node along the aggregation path for aggregation;
for heterogeneous marked cuckoo filters: constructing an empty marked cuckoo filter, wherein the capacity of the empty marked cuckoo filter is determined by the number of element fingerprints in the local sets to be aggregated; and inserting the element fingerprints in the local sets to be aggregated into the empty marked cuckoo filter one by one.
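Three of the quantities named in the preceding apparatus claim can be sketched as small helpers: the cell count of a heterogeneous filter, the two candidate-cell indexes of a fingerprint in a heterogeneous filter, and the choice of the central node. The function names, the use of `zlib.crc32` as the hash function, and rounding the cell count up to an integer are assumptions for illustration only.

```python
import math
import zlib


def cell_count(alpha, set_size, b):
    """m_i = alpha * |S_i| / b_i, rounded up to whole cells (rounding assumed)."""
    assert alpha >= 1
    return math.ceil(alpha * set_size / b)


def heterogeneous_indexes(fp, m):
    """Candidate cells computed from the fingerprint alone:
    h1 = hash(fp) % m and h2 = (h1 XOR fp) % m."""
    h1 = zlib.crc32(fp.to_bytes(4, "big")) % m
    h2 = (h1 ^ fp) % m
    return h1, h2


def pick_central(adjacency):
    """Choose the participant with the greatest node degree as the central node."""
    return max(adjacency, key=lambda v: len(adjacency[v]))


m = cell_count(alpha=1.5, set_size=100, b=4)      # 150 slots over 4-slot cells
h1, h2 = heterogeneous_indexes(0xAB, m)
central = pick_central({"A": {"B", "C"}, "B": {"A"}, "C": {"A"}})
```

Because both heterogeneous indexes depend only on the fingerprint, a central node can re-insert received fingerprints into a filter of a different size without access to the original elements.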
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 5 when the program is executed by the processor.
8. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 5.
CN202010230302.3A 2020-03-27 2020-03-27 Multiparty set synchronization method and device and electronic equipment Active CN111538865B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010230302.3A CN111538865B (en) 2020-03-27 2020-03-27 Multiparty set synchronization method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010230302.3A CN111538865B (en) 2020-03-27 2020-03-27 Multiparty set synchronization method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111538865A CN111538865A (en) 2020-08-14
CN111538865B CN111538865B (en) 2023-06-02

Family

ID=71974826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010230302.3A Active CN111538865B (en) 2020-03-27 2020-03-27 Multiparty set synchronization method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111538865B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579311B (en) * 2022-03-04 2023-05-30 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, device, equipment and storage medium for executing distributed computing task

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630955A (en) * 2015-12-24 2016-06-01 Huazhong University of Science and Technology Method for efficiently managing members of dynamic data set
CN108804226A (en) * 2018-05-28 2018-11-13 National University of Defense Technology, Chinese People's Liberation Army Graph segmentation and division method for distributed graph computation
CN109416694A (en) * 2016-07-11 2019-03-01 Microsoft Technology Licensing, LLC Key-value storage system including a resource-efficient index
CN109819030A (en) * 2019-01-22 2019-05-28 Northwest University Data resource pre-scheduling method based on edge computing
CN110222088A (en) * 2019-05-20 2019-09-10 Huazhong University of Science and Technology Data approximation set representation method and system based on insertion position selection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10579633B2 (en) * 2017-08-31 2020-03-03 Micron Technology, Inc. Reducing probabilistic filter query latency

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
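L. Luo. The Consistent Cuckoo Filter. IEEE. 2019, full text. *
Jiang Jie; Yang Tong; Zhang Mengyu; Dai Yafei; Huang Liang; Zheng Lianqing. DCuckoo: a high-performance hash table based on on-chip summaries. Journal of Computer Research and Development. 2017, (11), full text. *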

Also Published As

Publication number Publication date
CN111538865A (en) 2020-08-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant