CN112416950B - Design method and device of three-dimensional sketch structure - Google Patents

Publication number
CN112416950B
Authority
CN
China
Prior art keywords
weight
bucket
edge
node
query
Legal status
Active
Application number
CN202110093365.3A
Other languages
Chinese (zh)
Other versions
CN112416950A (en)
Inventor
蔡志平
侯昌盛
侯冰楠
周桐庆
胡罡
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202110093365.3A
Publication of CN112416950A
Application granted
Publication of CN112416950B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G06F16/2255 Hash tables
    • G06F16/2264 Multidimensional index structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a design method and device of a three-dimensional sketch structure, a computer device, and a storage medium. The method comprises: constructing a three-dimensional sketch structure for storing stream data information; dividing each bucket in the three-dimensional bucket array into three regions; acquiring stream data information of a graph structure; and updating the stream data information into the three-dimensional sketch structure according to the hash functions corresponding to different depths. The invention combines representative-key retention with majority voting. The constructed three-dimensional sketch structure maintains both the structure information and the weight information of the graph data, requires only a single update pass over the raw stream data, and guarantees a constant probabilistic error bound. A majority voting algorithm selects the stored keys so that the most representative graph edges are recorded, which reduces the error bound and makes the structure invertible, thereby improving query efficiency.

Description

Design method and device of three-dimensional sketch structure
Technical Field
The application relates to the technical field of computer networks, in particular to a design method and device of a three-dimensional sketch structure, computer equipment and a storage medium.
Background
In the past decade, graph structures have been used extensively to model complex structured data in interactive applications, such as network traffic and social networks. As a sequence of data that changes over time, stream data that preserves graph structure can continuously describe entities (e.g., social media users) and the connections between them (e.g., interactions between users), and it forms the basis for services such as network anomaly detection and community discovery. However, analyzing massive amounts of stream data in real time is challenging. For example, each link in a large ISP or data center carries millions of data packets per second. Under such conditions, conventional data structures (e.g., adjacency lists) are not suitable for storing the graph structure in stream data.
In stream data applications, obtaining an approximate result quickly is often preferable to sacrificing efficiency for an exact result. To meet this requirement, sketches are widely used: a sketch generates a summary in a space-saving manner, supports approximate weight estimation, and can answer elephant flow (heavy-hitter) queries or mine top-k items. A sketch is a stream data summary structure that stores a fixed number of entries in units of buckets. Classical sketches (e.g., Count Sketch, K-ary Sketch, and Count-Min Sketch) use multiple hash functions to linearly project the stream data into a lower-dimensional space that preserves its aggregate features.
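As a concrete illustration of the classical sketches mentioned above, the following minimal Python sketch shows a Count-Min Sketch. It is background only and not part of the disclosed method; the class name `CountMinSketch`, the salted BLAKE2 hashing, and the parameters `width` and `depth` are assumptions chosen for illustration.

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min Sketch: `depth` rows of `width` counters."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # Salting the key with the row number stands in for
        # independent hash functions (an illustrative assumption).
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key: str, weight: int = 1) -> None:
        # Every row receives the update exactly once.
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += weight

    def estimate(self, key: str) -> int:
        # With non-negative weights, collisions can only inflate a counter,
        # so the minimum over the rows is the tightest estimate.
        return min(self.rows[row][self._index(key, row)] for row in range(self.depth))
```

With non-negative weights the estimate never falls below the true weight of a key, which is why the minimum over rows is taken.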
Some previous graph sketch techniques (e.g., GSS, TCM, gMatrix) improved on the structure of the classical sketch, recording stream data in a generic way that preserves graph structure and supporting real-time updates and queries. gSketch is designed around edge frequencies, so it can only answer edge-frequency queries and not more complex graph structure queries. TCM and GSS aim to preserve graph structure and support various types of queries, but they do not support reverse hash queries: to obtain the edges or nodes of interest, they must either traverse the entire key space or store additional index tables that record keys and their hash values. gMatrix recovers keys by pruning the key space with modular hashing, but this recovery requires enumerating entries in the sub-key space, which is computationally expensive.
Therefore, the prior art cannot achieve both estimation accuracy and query efficiency.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, an apparatus, a computer device, and a storage medium for designing a three-dimensional sketch structure, which can achieve both estimation accuracy and query efficiency.
A method of designing a three-dimensional sketch structure, the method comprising:
constructing a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
dividing each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and initializing the bucket;
acquiring stream data information of a graph structure, and calculating the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths;
updating the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge; updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the information of the acquired edge; and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
continuously acquiring stream data information, updating the information of all edges in the stream data information into the three-dimensional sketch structure, and performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
In one embodiment, the method further comprises: updating, according to the bucket index $(h_k(s), h_k(d), k)$ of an edge $(s, d)$ in the stream data information, the bucket at each depth corresponding to the edge, where $k$ is the depth index of the bucket and $h_k$ is the hash function corresponding to depth layer $k$, comprising the following steps:
updating the sum of the weight values in the bucket to:
$$V_{h_k(s),h_k(d),k} \leftarrow V_{h_k(s),h_k(d),k} + \omega$$
where $\omega$ is the weight information of the edge, and $V_{h_k(s),h_k(d),k}$ is the sum of the weight values in the bucket to which edge $(s, d)$ is mapped at depth layer $k$;
comparing the key with the largest current weight in the bucket, $K_{h_k(s),h_k(d),k}$, with edge $(s, d)$; when $K_{h_k(s),h_k(d),k} = (s, d)$, updating the value of the indication counter to:
$$C_{h_k(s),h_k(d),k} \leftarrow C_{h_k(s),h_k(d),k} + \omega$$
where $K_{h_k(s),h_k(d),k}$ is the key with the largest current weight in the bucket to which edge $(s, d)$ is mapped at depth layer $k$, and $C_{h_k(s),h_k(d),k}$ is the value of the indication counter in that bucket;
when $K_{h_k(s),h_k(d),k} \neq (s, d)$, updating the value of the indication counter to:
$$C_{h_k(s),h_k(d),k} \leftarrow C_{h_k(s),h_k(d),k} - \omega$$
determining the sign of $C_{h_k(s),h_k(d),k}$: when $C_{h_k(s),h_k(d),k} \ge 0$, the value of $K_{h_k(s),h_k(d),k}$ remains unchanged; when $C_{h_k(s),h_k(d),k} < 0$, updating $K_{h_k(s),h_k(d),k}$ and $C_{h_k(s),h_k(d),k}$ to:
$$K_{h_k(s),h_k(d),k} \leftarrow (s, d)$$
$$C_{h_k(s),h_k(d),k} \leftarrow -C_{h_k(s),h_k(d),k}$$
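The per-bucket update described in this embodiment can be sketched in Python as follows. This is a minimal sketch, assuming a weighted majority-vote rule: the edge weight is added to the counter when the stored key matches the incoming edge, subtracted otherwise, and the key is swapped (with the counter negated) when the counter becomes negative. The names `Bucket` and `update_bucket` are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


@dataclass
class Bucket:
    total: int = 0              # sum of the weights of all edges mapped here
    key: Optional[Edge] = None  # candidate edge with the largest weight
    counter: int = 0            # indication counter of the majority vote


def update_bucket(bucket: Bucket, edge: Edge, weight: int) -> None:
    """Apply one stream item (edge, weight) to a single bucket."""
    bucket.total += weight
    if bucket.key == edge:
        # The incoming edge supports the stored key.
        bucket.counter += weight
    else:
        # The incoming edge votes against the stored key.
        bucket.counter -= weight
        if bucket.counter < 0:
            # The stored key lost the vote: replace it and flip the counter.
            bucket.key = edge
            bucket.counter = -bucket.counter
```

For example, feeding the items (a, b) with weight 3, (c, d) with weight 1, and (c, d) with weight 5 into one empty bucket leaves it with `total == 9`, `key == ("c", "d")`, and `counter == 3`. An empty bucket (`key is None`) adopts the first edge it sees, because the counter immediately becomes negative and is flipped.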
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the queries comprise edge weight queries, node weight queries, elephant flow edge queries, elephant flow node queries, mutation flow queries, mutation node queries, subgraph weight queries, and path reachability queries.
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the edge weight query comprises the following steps:
obtaining an edge $(s, d)$ to be queried, and comparing, in the bucket at depth $k$, whether the value of the stored key $K_{h_k(s),h_k(d),k}$ is the same as $(s, d)$; if so, the edge weight estimate of edge $(s, d)$ at depth layer $k$ is:
$$\hat{\omega}_k(s, d) = \frac{V_{h_k(s),h_k(d),k} + C_{h_k(s),h_k(d),k}}{2}$$
otherwise, the edge weight estimate of edge $(s, d)$ at depth layer $k$ is:
$$\hat{\omega}_k(s, d) = \frac{V_{h_k(s),h_k(d),k} - C_{h_k(s),h_k(d),k}}{2}$$
where $V_{h_k(s),h_k(d),k}$ and $C_{h_k(s),h_k(d),k}$ are the sum of the weight values and the value of the indication counter in that bucket, and $\hat{\omega}_k(s, d)$ is the edge weight estimate of edge $(s, d)$ at depth layer $k$;
taking the minimum of the edge weight estimates of edge $(s, d)$ over all depths to obtain the edge weight estimate of edge $(s, d)$:
$$\hat{\omega}(s, d) = \min_{k} \hat{\omega}_k(s, d)$$
where $\hat{\omega}(s, d)$ is the edge weight estimate of the queried edge $(s, d)$.
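The edge weight query can be illustrated with the short Python sketch below. It assumes the per-bucket estimator commonly used by majority-vote sketches such as MV-Sketch, namely (V + C) / 2 when the stored key equals the queried edge and (V - C) / 2 otherwise, followed by the minimum over all depths. The function name `estimate_edge_weight` and the tuple layout `(total, key, counter)` are hypothetical.

```python
from typing import List, Optional, Tuple

Edge = Tuple[str, str]
# One entry per depth: (sum of weights V, stored key K, indication counter C)
BucketView = Tuple[int, Optional[Edge], int]


def estimate_edge_weight(buckets: List[BucketView], edge: Edge) -> float:
    """Estimate the weight of `edge` from the bucket it maps to at each depth.

    `buckets` must contain at least one entry; min() raises ValueError otherwise.
    """
    per_depth = []
    for total, key, counter in buckets:
        if key == edge:
            # Stored key matches: the counter counts in favour of the edge.
            per_depth.append((total + counter) / 2)
        else:
            # Stored key differs: the counter counts against the edge.
            per_depth.append((total - counter) / 2)
    # The smallest per-depth estimate is taken as the final estimate.
    return min(per_depth)
```

As a check, a bucket holding `(9, ("c", "d"), 3)` after the items (a, b):3, (c, d):1, (c, d):5 yields 6.0 for edge (c, d) and 3.0 for edge (a, b), which match the true weights in this collision-only example.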
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the node weight query comprises the following steps:
obtaining a node $x$ to be queried;
for each bucket in row $h_k(x)$ of depth layer $k$, if [condition image not reproduced], the output weight estimate of node $x$ at column $j$ of depth layer $k$ is:
[formula image not reproduced]
otherwise, the output weight estimate of node $x$ at column $j$ of depth layer $k$ is:
[formula image not reproduced]
where $w$ denotes the total depth of the three-dimensional bucket array, and $h$ denotes the total number of columns of the three-dimensional bucket array;
obtaining, from the output weight estimates of node $x$ at the columns of depth layer $k$, the output weight estimate of node $x$ at depth layer $k$:
[formula image not reproduced]
obtaining, from the output weight estimates of node $x$ at the depth layers, the output weight estimate of node $x$:
[formula image not reproduced]
for each bucket in column $h_k(x)$ of depth layer $k$, if [condition image not reproduced], the input weight estimate of node $x$ at row $i$ of depth layer $k$ is:
[formula image not reproduced]
otherwise, the input weight estimate of node $x$ at row $i$ of depth layer $k$ is:
[formula image not reproduced]
obtaining, from the input weight estimates of node $x$ at the rows of depth layer $k$, the input weight estimate of node $x$ at depth layer $k$:
[formula image not reproduced]
where $h$ denotes the total number of rows of the three-dimensional bucket array;
obtaining, from the input weight estimates of node $x$ at the depth layers, the input weight estimate of node $x$:
[formula image not reproduced]
obtaining the node weight estimate of node $x$ according to the output weight estimate and the input weight estimate of node $x$.
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the subgraph weight query comprises the following steps:
obtaining the node subset $V_s = \{v_1, v_2, \dots, v_n\}$ of the subgraph to be queried, where $n$ denotes the number of nodes in the node subset, and $v_i$ denotes the node with index $i$, $1 \le i \le n$;
determining the edge subset $E_s = \{e_1, e_2, \dots, e_m\}$ of the subgraph from the nodes in the node subset, where $m$ denotes the number of edges in the edge subset [constraint on $m$: image not reproduced], and $e_i$ denotes the edge with index $i$, $1 \le i \le m$;
obtaining the weights $\hat{\omega}(e_1), \dots, \hat{\omega}(e_m)$ of all edges in the edge subset through the edge weight query;
taking the sum of the weights of all edges in $E_s$, $\sum_{i=1}^{m} \hat{\omega}(e_i)$, as the weight estimate of the subgraph.
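The subgraph weight query reduces to summing edge weight estimates, as the following minimal Python sketch shows. How the edge subset is derived from the node subset is an assumption here: the sketch keeps every candidate edge whose two endpoints both lie in the node subset. The names `subgraph_weight`, `candidate_edges`, and `edge_weight` are hypothetical, and `edge_weight` stands in for whatever edge weight query is available.

```python
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]


def subgraph_weight(
    nodes: Set[str],
    candidate_edges: Iterable[Edge],
    edge_weight: Callable[[Edge], float],
) -> float:
    """Sum the estimated weights of the edges induced by `nodes`."""
    # Edge subset: candidate edges with both endpoints inside the node subset.
    # Using a set removes duplicate candidates so no edge is counted twice.
    edges = {(s, d) for s, d in candidate_edges if s in nodes and d in nodes}
    return sum(edge_weight(edge) for edge in edges)
```

Passing the edge weight query as a callable keeps the aggregation independent of how the individual estimates are produced.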
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the path reachability query comprises the following steps:
summing the weight sums $V_{i,j,k}$ of the buckets $B_{i,j,k}$ in the three-dimensional sketch structure to obtain the total weight $F$ of the data stream in the current period, where $(i, j, k)$ denotes the indices of a bucket in the row, column, and depth dimensions;
traversing all buckets $B_{i,j,k}$; when $V_{i,j,k} \ge \phi F$, reading the key $K_{i,j,k}$ stored in $B_{i,j,k}$ and obtaining the edge weight $\hat{\omega}(K_{i,j,k})$ of $K_{i,j,k}$ through the edge weight query; when $\hat{\omega}(K_{i,j,k}) \ge \phi F$, determining that $K_{i,j,k}$ is an elephant flow edge, where $\phi$ is a preset threshold parameter;
performing the path reachability query with a breadth-first search algorithm over all elephant flow edges whose weight is greater than $\phi F$, wherein the path reachability query determines whether a source node $a$ can be connected to a target node $b$ through edges whose weight is at least $\phi F$.
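The last two steps of the path reachability query can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the edge weight estimates are already available in a dictionary, the elephant flow threshold is taken to be `phi * total`, and edges are treated as directed. The function names `elephant_edges` and `reachable` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def elephant_edges(estimates: Dict[Edge, float], total: float, phi: float) -> List[Edge]:
    """Keep the edges whose estimated weight reaches the threshold phi * total."""
    threshold = phi * total
    return [edge for edge, weight in estimates.items() if weight >= threshold]


def reachable(edges: List[Edge], source: Hashable, target: Hashable) -> bool:
    """Breadth-first search over the directed graph formed by `edges`."""
    adjacency = defaultdict(list)
    for s, d in edges:
        adjacency[s].append(d)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Because the search runs only over the filtered elephant flow edges, its cost depends on the number of heavy edges rather than on the size of the full graph.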
An apparatus for designing a three-dimensional sketch structure, the apparatus comprising:
a three-dimensional sketch structure construction module, configured to construct a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
a bucket initialization module, configured to divide each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and to initialize the bucket;
a bucket index determining module, configured to acquire stream data information of a graph structure, and to calculate the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths;
a bucket updating module, configured to update the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge; updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the information of the acquired edge; and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
a statistics and query module, configured to continuously acquire stream data information, update the information of all edges in the stream data information into the three-dimensional sketch structure, and perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
constructing a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
dividing each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and initializing the bucket;
acquiring stream data information of a graph structure, and calculating the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths;
updating the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge; updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the information of the acquired edge; and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
continuously acquiring stream data information, updating the information of all edges in the stream data information into the three-dimensional sketch structure, and performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
constructing a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
dividing each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and initializing the bucket;
acquiring stream data information of a graph structure, and calculating the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths;
updating the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge; updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the information of the acquired edge; and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
continuously acquiring stream data information, updating the information of all edges in the stream data information into the three-dimensional sketch structure, and performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
According to the above design method and device of the three-dimensional sketch structure, computer device, and storage medium, a three-dimensional sketch structure for storing stream data information is constructed, and each bucket in the three-dimensional bucket array is divided into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace that key. Stream data information of a graph structure is acquired; the sum of the weight values in the bucket is updated according to the edge information in the stream data information and the hash functions corresponding to different depths; the value of the indication counter is updated through a majority voting algorithm according to the key with the largest current weight in the bucket and the edge information; and whether to retain or replace the key with the largest current weight is determined according to the updated value of the indication counter. The three-dimensional sketch structure constructed by the invention maintains both the structure information and the weight information of the graph data, requires only a single update pass over the raw stream data, and has a constant probabilistic error bound. A majority voting algorithm is used to select the stored keys so as to record the most representative graph edges, which reduces the error bound and provides invertibility, thereby improving query efficiency.
Drawings
FIG. 1 is an application scenario diagram of a design method of a three-dimensional sketch structure in one embodiment;
FIG. 2 is a schematic diagram of a three-dimensional sketch structure in one embodiment;
FIG. 3 is a flow diagram illustrating the updating of values in a bucket according to one embodiment;
FIG. 4 is a block diagram of a device for designing a three-dimensional sketch structure in one embodiment;
FIG. 5 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The design method of the three-dimensional sketch structure provided by the application can be applied in the following application environment. A three-dimensional sketch structure for storing stream data information is constructed; each bucket in the three-dimensional bucket array is divided into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace that key; and the bucket is initialized. Stream data information of a graph structure is acquired; the sum of the weight values in the bucket is updated according to the edge information in the stream data information and the hash functions corresponding to different depths; the value of the indication counter is updated through a majority voting algorithm according to the key with the largest current weight in the bucket and the edge information; and whether to retain or replace the key with the largest current weight is determined according to the updated value of the indication counter.
In one embodiment, as shown in FIG. 1, a method for designing a three-dimensional sketch structure is provided, comprising the following steps:
Step 102: constructing a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions.
A sketch is a probabilistic data structure with high reliability, used when the characteristics (frequency of occurrence, flow size) of a large number of elements in stream data need to be counted. It applies hash functions to the stream data multiple times to map events to frequencies, estimates the size of an event within a reasonable deviation, and significantly reduces memory usage. A sketch may model the stream data in (key, value) form, where the "key" is one or more fields of a data packet header, such as a source address or a combination of a source address and a destination address, and the "value" is the stored characteristic, such as the number of data packets.
The graph structure referred to in the stream data is a discrete structure formed by nodes and edges connecting the nodes, and the graph structure may be represented by G = (V, E), where V represents a node set, E represents an edge set, and a relationship between any two nodes in the graph structure is represented by an adjacency matrix, when no weight is referred to in the graph structure, an element of the adjacency matrix is 0 or 1, 0 represents that the corresponding node has no relationship, and 1 represents that the corresponding node has relationship; when weights are involved in the graph structure, the element values of the adjacency matrix are weight values between nodes.
Classical sketch is designed as a two-dimensional bucket array and cannot query for structure-based information (e.g., path and subgraph based queries), so applying classical sketch methods directly to stream data would lose graph structure information.
The three-dimensional sketch structure DMatrix provided by the invention expands the sketch structure into three dimensions, and the DMatrix is a structure with three dimensionsh×h×wThe first two dimensions of the three-dimensional sketch structure of each bucket represent the length and the width of the bucket array, and the third dimension represents the depth of the bucket array. Use of DMatrixwIndependent hash functions, each defining a hash function from a node in the set of nodes to 1,h]a mapping of integers within a range. The two-dimensional arrays with the same depth are set to have the same length and width, and the same hash function is used for the same depth.
Step 104: divide each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to decide whether to retain or replace the key with the largest current weight, and initialize the buckets.
To make queries reversible, the method of the invention reserves space in each bucket for storing the key of an edge or node. Because of hash collisions, different edges may be mapped to the same bucket, so several keys may compete for the same bucket. To handle this, LD-Sketch expands an associative array of keys in the bucket to accommodate more candidate keys, but the cost of dynamic memory allocation is very high. To save computation and storage overhead, DMatrix retains only one key in each bucket. For a bucket with hash collisions, one key is selected for storage. The bucket in fact stores the aggregate weight of all edges mapped to it, that is, all edges with the same hash value, and the key selected for storage should be the one most representative of this aggregate. The invention therefore stores the key with the largest weight in the bucket. To select the stored key, the majority vote algorithm (MJRTY) is applied, which records the key with the largest weight in the bucket. MJRTY processes a sequence of votes and attempts to find the majority vote. In a single pass over the votes, it stores the candidate majority vote observed so far in the stream, together with an indication counter that records whether the currently stored vote is still the candidate majority vote.
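The weighted majority vote described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `weighted_majority` and the sample votes are hypothetical, and each vote is assumed to carry a non-negative weight.

```python
def weighted_majority(stream):
    """Single pass over (key, weight) votes; returns (candidate, counter)."""
    candidate, counter = None, 0
    for key, f in stream:
        if key == candidate:
            counter += f          # a vote for the stored candidate strengthens it
        else:
            counter -= f          # a vote for another key weakens it
            if counter < 0:       # the challenger outweighs the candidate: swap
                candidate, counter = key, -counter
    return candidate, counter

# Hypothetical votes: edge "e1" carries 7 of the 10 units of weight.
votes = [("e1", 3), ("e2", 1), ("e1", 4), ("e3", 2)]
print(weighted_majority(votes))  # ('e1', 4)
```

Only one candidate and one counter are kept, which is why a bucket needs no dynamically allocated key list.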
We use $B[i][j][k]$ to denote the bucket located in row $i$, column $j$ at depth $k$, where $1 \le i, j \le h$ and $1 \le k \le w$. As shown in fig. 2, the memory area of such a bucket is divided into three parts: the first block, $V[i][j][k]$, records the sum of the weights of all edges hashed to the bucket; the second block, $K[i][j][k]$, stores the key with the largest current weight in the bucket; the third block, $C[i][j][k]$, is an indication counter that, as in the majority voting algorithm, checks whether the key stored in $K[i][j][k]$ should be retained or replaced.
Initializing a bucket means initializing its three blocks: the weight sum $V[i][j][k]$, the stored key $K[i][j][k]$ and the indication counter $C[i][j][k]$ are all initialized to 0.
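A minimal Python sketch of the bucket layout and its initialization follows. The names `Bucket` and `make_dmatrix` are hypothetical, indices are 0-based rather than 1-based, the array is indexed as `buckets[k][i][j]` (depth first), and `None` stands in for the initial key value 0; all of these are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Edge = Tuple[str, str]

@dataclass
class Bucket:
    V: int = 0                 # sum of the weights of all edges hashed to this bucket
    K: Optional[Edge] = None   # key with the largest current weight (None = empty)
    C: int = 0                 # indication counter used by the majority vote

def make_dmatrix(h: int, w: int):
    """Return w depth layers, each an h x h array of freshly initialized buckets."""
    return [[[Bucket() for _ in range(h)] for _ in range(h)] for _ in range(w)]

buckets = make_dmatrix(h=4, w=3)
print(len(buckets), len(buckets[0]), len(buckets[0][0]))  # 3 4 4
```

Each bucket is a separate object, so updating one bucket never touches another.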
Step 106: acquire stream data information of the graph structure, and calculate the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths.
Stream data information of the graph structure is obtained. According to the node information $x$ and $y$ of an edge $(x, y)$ in the stream data information and the hash functions corresponding to the different depths, the bucket index of the edge at each depth is calculated as $(h_k(x), h_k(y))$, where $k$ is the depth number of the bucket and $h_k$ is the hash function corresponding to depth layer $k$.
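The index computation can be sketched as follows. The text only requires one hash function per depth; the use of salted BLAKE2b from Python's `hashlib`, the function names `node_hash` and `bucket_indices`, and the 0-based index range are assumptions made for illustration.

```python
import hashlib

def node_hash(node: str, k: int, h: int) -> int:
    """Hash `node` into range(h) with a function specific to depth layer k."""
    salt = k.to_bytes(8, "little")  # a different salt per depth gives a different function
    digest = hashlib.blake2b(node.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % h

def bucket_indices(x: str, y: str, h: int, w: int):
    """Return the (row, column, depth) index of edge (x, y) at every depth."""
    return [(node_hash(x, k, h), node_hash(y, k, h), k) for k in range(w)]

idx = bucket_indices("a", "b", h=8, w=3)
print(idx)
```

Because the row depends only on the source node and the column only on the destination node, all outgoing edges of a node share a row and all incoming edges share a column at each depth.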
Step 108: update the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge, updating the value of the indication counter by a majority voting algorithm according to the key with the largest current weight in the bucket and the acquired information of the edge, and retaining or replacing the key with the largest current weight according to the updated value of the indication counter.
The update procedure for DMatrix is as follows. For an incoming edge $(x, y)$ with weight $f$, first compute the bucket index $(h_k(x), h_k(y))$ at each depth $k$. Then increase the weight sum $V[h_k(x)][h_k(y)][k]$ by $f$ and check whether the key stored in $K[h_k(x)][h_k(y)][k]$ is $(x, y)$. If so, increase the value of the indication counter $C[h_k(x)][h_k(y)][k]$ by $f$; otherwise, decrease the value of $C[h_k(x)][h_k(y)][k]$ by $f$. If $C[h_k(x)][h_k(y)][k]$ drops below 0, replace the key stored in $K[h_k(x)][h_k(y)][k]$ with $(x, y)$ and set $C[h_k(x)][h_k(y)][k]$ to its absolute value.
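The update procedure can be sketched end to end in Python. This is an illustrative sketch only: buckets are plain dictionaries with fields `V`, `K` and `C`, the per-depth hash is a salted BLAKE2b (an assumption, any family of independent hash functions would do), indices are 0-based, and the names `new_dmatrix`, `node_hash` and `update` are hypothetical.

```python
import hashlib

def node_hash(node, k, h):
    """Depth-specific hash of a node into range(h) (assumed hash family)."""
    d = hashlib.blake2b(node.encode("utf-8"), digest_size=8,
                        salt=k.to_bytes(8, "little")).digest()
    return int.from_bytes(d, "little") % h

def new_dmatrix(h, w):
    """w layers of h x h buckets, all three regions initialized to empty/zero."""
    return [[[{"V": 0, "K": None, "C": 0} for _ in range(h)]
             for _ in range(h)] for _ in range(w)]

def update(buckets, x, y, f):
    """Insert edge (x, y) with weight f into one bucket per depth layer."""
    h = len(buckets[0])
    for k, layer in enumerate(buckets):
        b = layer[node_hash(x, k, h)][node_hash(y, k, h)]
        b["V"] += f                      # the weight sum always grows by f
        if b["K"] == (x, y):
            b["C"] += f                  # same key: counter increases
        else:
            b["C"] -= f                  # different key: counter decreases
            if b["C"] < 0:               # the new edge now dominates the bucket
                b["K"] = (x, y)
                b["C"] = -b["C"]

dm = new_dmatrix(h=4, w=2)
update(dm, "a", "b", 5)
update(dm, "a", "b", 2)
print(dm[0][node_hash("a", 0, 4)][node_hash("b", 0, 4)])
```

Each incoming edge touches exactly one bucket per depth, so the cost of an update is proportional to the depth $w$ and independent of the array size $h$.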
Step 110: continue to acquire stream data information, update the information of all edges in the stream data information into the three-dimensional sketch structure, and perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
Queries on stream data information that retains the graph structure can be classified into four types: edge-based queries, node-based queries, subgraph-based queries and path-based queries. Edge-based queries are divided into edge weight queries, elephant flow (heavy-hitter) edge queries and mutation flow (heavy-changer) edge queries. Node-based queries are divided into node weight queries, elephant flow (heavy-hitter) node queries and mutation flow (heavy-changer) node queries. The subgraph-based query is mainly the subgraph weight query, which returns the total traffic weight of the subgraph composed of all nodes in a node set V and the edges connecting them. The path-based query is mainly the path reachability query, which determines whether a connected path exists between node a and node b.
In the above design method of the three-dimensional sketch structure, a three-dimensional sketch structure for storing stream data information is constructed, and each bucket in the three-dimensional bucket array is divided into three regions that respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to decide whether to retain or replace that key. Stream data information of a graph structure is acquired; the weight sum in each mapped bucket is updated according to the edge information in the stream data information and the hash functions corresponding to the different depths; the value of the indication counter is updated by a majority voting algorithm according to the key with the largest current weight in the bucket and the edge information; and the key with the largest current weight is retained or replaced according to the updated value of the indication counter. The three-dimensional sketch structure constructed by the invention maintains the structure information and the weight information of the graph data at the same time, updates each item of stream data only once, and has a constant probabilistic error range. A majority voting algorithm is used to select the stored key so that the most representative graph edge is recorded, which reduces the error range and makes the sketch reversible, thereby improving query efficiency.
In one embodiment, the method further comprises the following. As shown in fig. 3, the bucket at each depth corresponding to an edge $(x, y)$ in the stream data information is updated according to the bucket index $(h_k(x), h_k(y))$ of the edge, where $k$ is the depth number of the bucket and $h_k$ is the hash function corresponding to depth layer $k$. The update comprises the following steps. The sum of the weight values in the bucket is updated according to the weight information of the edge: $V[h_k(x)][h_k(y)][k] \leftarrow V[h_k(x)][h_k(y)][k] + f$, where $f$ is the weight information of the edge and $V[h_k(x)][h_k(y)][k]$ is the sum of the weight values in the bucket to which edge $(x, y)$ is mapped at depth layer $k$. The key with the largest current weight in the bucket, $K[h_k(x)][h_k(y)][k]$, is compared with edge $(x, y)$. When $K[h_k(x)][h_k(y)][k] = (x, y)$, the value of the indication counter is updated as $C[h_k(x)][h_k(y)][k] \leftarrow C[h_k(x)][h_k(y)][k] + f$, where $K[h_k(x)][h_k(y)][k]$ is the key with the largest current weight in the bucket to which edge $(x, y)$ is mapped at depth layer $k$, and $C[h_k(x)][h_k(y)][k]$ is the value of the indication counter in that bucket. When $K[h_k(x)][h_k(y)][k] \ne (x, y)$, the value of the indication counter is updated as $C[h_k(x)][h_k(y)][k] \leftarrow C[h_k(x)][h_k(y)][k] - f$. The sign of $C[h_k(x)][h_k(y)][k]$ is then checked: when $C[h_k(x)][h_k(y)][k] \ge 0$, the value of $K[h_k(x)][h_k(y)][k]$ remains unchanged; when $C[h_k(x)][h_k(y)][k] < 0$, $K[h_k(x)][h_k(y)][k]$ and $C[h_k(x)][h_k(y)][k]$ are updated as $K[h_k(x)][h_k(y)][k] \leftarrow (x, y)$ and $C[h_k(x)][h_k(y)][k] \leftarrow -C[h_k(x)][h_k(y)][k]$.
In one embodiment, the method further comprises: performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the queries comprise the edge weight query, node weight query, elephant flow edge query, elephant flow node query, mutation flow edge query, mutation flow node query, subgraph weight query and path reachability query.
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the edge weight query comprises the following steps. Obtain the edge $(x, y)$ to be queried. For the bucket at depth $k$, compare the value of $K[h_k(x)][h_k(y)][k]$ with $(x, y)$. If they are the same, the edge weight estimate of edge $(x, y)$ at depth layer $k$ is $\hat{S}_k(x, y) = (V[h_k(x)][h_k(y)][k] + C[h_k(x)][h_k(y)][k]) / 2$; otherwise, the edge weight estimate of edge $(x, y)$ at depth layer $k$ is $\hat{S}_k(x, y) = (V[h_k(x)][h_k(y)][k] - C[h_k(x)][h_k(y)][k]) / 2$, where $\hat{S}_k(x, y)$ is the edge weight estimate of edge $(x, y)$ at depth layer $k$. Taking the minimum of the edge weight estimates of edge $(x, y)$ over all depths gives the edge weight estimate of edge $(x, y)$: $\hat{S}(x, y) = \min_{1 \le k \le w} \hat{S}_k(x, y)$, where $\hat{S}(x, y)$ is the edge weight estimate of the queried edge $(x, y)$.
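The edge weight query can be sketched as a function over the buckets that the edge maps to, one per depth. The per-depth rule used here, $(V + C)/2$ when the stored key matches and $(V - C)/2$ otherwise, is the usual bound for a majority-vote counter and is an assumption of this sketch; the function name `edge_weight_estimate` and the sample bucket contents are hypothetical.

```python
def edge_weight_estimate(per_depth_buckets, edge):
    """per_depth_buckets: one (V, K, C) triple per depth for the queried edge."""
    estimates = []
    for V, K, C in per_depth_buckets:
        if K == edge:
            estimates.append((V + C) / 2)   # edge is the stored candidate
        else:
            estimates.append((V - C) / 2)   # edge lost the vote in this bucket
    return min(estimates)                   # the tightest bound over all depths

ab = ("a", "b")
# Hypothetical buckets for an edge of true weight 7 that collides at depths 0 and 2.
layers = [(11, ab, 7), (7, ab, 7), (20, ("c", "d"), 6)]
print(edge_weight_estimate(layers, ab))  # 7.0
```

At depth 0 the collision inflates the estimate to 9, and taking the minimum over depths discards that looser bound.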
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the node weight query comprises the following steps. Obtain the node $x$ to be queried. For each bucket in row $h_k(x)$ at depth $k$ ($1 \le k \le w$), if the source node of the key $K[h_k(x)][j][k]$ is $x$, the output weight estimate of node $x$ at column $j$ of depth layer $k$ is $\hat{O}_{k,j}(x) = (V[h_k(x)][j][k] + C[h_k(x)][j][k]) / 2$; otherwise, the output weight estimate of node $x$ at column $j$ of depth layer $k$ is $\hat{O}_{k,j}(x) = (V[h_k(x)][j][k] - C[h_k(x)][j][k]) / 2$, where $w$ is the total depth of the three-dimensional bucket array, $h$ is the total number of columns of the three-dimensional bucket array, and $\hat{O}_{k,j}(x)$ is the output weight estimate of node $x$ at column $j$ of depth layer $k$. From the output weight estimates of node $x$ at the $h$ columns of depth layer $k$, the output weight estimate of node $x$ at depth layer $k$ is obtained as $\hat{O}_k(x) = \sum_{j=1}^{h} \hat{O}_{k,j}(x)$, where $\hat{O}_k(x)$ is the output weight estimate of node $x$ at depth layer $k$. From the output weight estimates of node $x$ at the $w$ depths, the output weight estimate of node $x$ is obtained as $\hat{O}(x) = \min_{1 \le k \le w} \hat{O}_k(x)$, where $\hat{O}(x)$ is the output weight estimate of node $x$.
For each bucket in column $h_k(x)$ at depth $k$ ($1 \le k \le w$), if the destination node of the key $K[i][h_k(x)][k]$ is $x$, the input weight estimate of node $x$ at row $i$ of depth layer $k$ is $\hat{I}_{k,i}(x) = (V[i][h_k(x)][k] + C[i][h_k(x)][k]) / 2$; otherwise, the input weight estimate of node $x$ at row $i$ of depth layer $k$ is $\hat{I}_{k,i}(x) = (V[i][h_k(x)][k] - C[i][h_k(x)][k]) / 2$, where $\hat{I}_{k,i}(x)$ is the input weight estimate of node $x$ at row $i$ of depth layer $k$. From the input weight estimates of node $x$ at the $h$ rows of depth layer $k$, the input weight estimate of node $x$ at depth layer $k$ is obtained as $\hat{I}_k(x) = \sum_{i=1}^{h} \hat{I}_{k,i}(x)$, where $h$ is the total number of rows of the three-dimensional bucket array and $\hat{I}_k(x)$ is the input weight estimate of node $x$ at depth layer $k$. From the input weight estimates of node $x$ at the $w$ depths, the input weight estimate of node $x$ is obtained as $\hat{I}(x) = \min_{1 \le k \le w} \hat{I}_k(x)$, where $\hat{I}(x)$ is the input weight estimate of node $x$. The node weight estimate of node $x$ is obtained from its output weight estimate $\hat{O}(x)$ and its input weight estimate $\hat{I}(x)$.
In a directed graph, a node has an in-degree and an out-degree. The sum of the weights of the edges entering a node is called its inflow weight, and the sum of the weights of the edges leaving a node is called its outflow weight. The inflow weight and the outflow weight of a node can each be calculated with the three-dimensional sketch structure provided by the invention.
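The outflow half of the node weight query can be sketched as follows. Several points are assumptions of this sketch rather than statements of the text: the per-bucket rule $(V \pm C)/2$, the test on the source node of the stored key, summation over the columns of the row, and the minimum over depths. The names `out_weight_estimate` and `row_of` are hypothetical, and `row_of` is a lookup table standing in for the per-depth hash $h_k$.

```python
def out_weight_estimate(layers, x, row_of):
    """layers[k][i][j] is a (V, K, C) triple; row_of(x, k) plays the role of h_k(x)."""
    per_depth = []
    for k, layer in enumerate(layers):
        total = 0.0
        for V, K, C in layer[row_of(x, k)]:       # every outgoing edge of x lands in this row
            if K is not None and K[0] == x:
                total += (V + C) / 2              # stored key starts at x
            else:
                total += (V - C) / 2              # bucket is dominated by another node
        per_depth.append(total)
    return min(per_depth)

# Hypothetical 2 x 2 x 2 sketch holding a->b (5), a->c (3) and d->e (4).
depth0 = [[(5, ("a", "b"), 5), (3, ("a", "c"), 3)], [(4, ("d", "e"), 4), (0, None, 0)]]
depth1 = [[(0, None, 0), (0, None, 0)], [(5, ("a", "b"), 5), (7, ("d", "e"), 1)]]
rows = {("a", 0): 0, ("a", 1): 1}
print(out_weight_estimate([depth0, depth1], "a", lambda x, k: rows[(x, k)]))  # 8.0
```

The inflow weight would be computed the same way over a column, testing the destination node of the stored key instead of the source node.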
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the subgraph weight query comprises the following steps. Obtain the node subset $V' = \{v_1, v_2, \dots, v_n\}$ of the subgraph to be queried, where $n$ is the number of nodes in the node subset and $v_i$ is the node with serial number $i$, $1 \le i \le n$. Determine the edge subset $E' = \{e_1, e_2, \dots, e_m\}$ of the subgraph from the nodes in the node subset, where $m$ is the number of edges in the edge subset and $e_i$ is the edge with serial number $i$, $1 \le i \le m$. Obtain the weights $\hat{S}(e_i)$ of all edges in the edge subset by the edge weight query. The sum of the weights of all edges in $E'$, $\sum_{i=1}^{m} \hat{S}(e_i)$, is taken as the weight estimate of the subgraph.
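A minimal sketch of the subgraph weight query is shown below. The text does not say how the edge subset is derived from the node subset; enumerating every ordered pair of distinct nodes is an assumption of this sketch. `subgraph_weight` is a hypothetical name, and the `edge_weight` argument stands in for the edge weight query (here a plain dictionary lookup).

```python
from itertools import permutations

def subgraph_weight(nodes, edge_weight):
    """Sum the estimated weights of all candidate edges among `nodes`."""
    edges = list(permutations(nodes, 2))      # every ordered pair (u, v) with u != v
    return sum(edge_weight(e) for e in edges)

# Hypothetical edge weight estimates; absent edges are estimated as 0.
known = {("a", "b"): 7.0, ("b", "c"): 2.0, ("c", "a"): 1.5}
total = subgraph_weight(["a", "b", "c"], lambda e: known.get(e, 0.0))
print(total)  # 10.5
```

With $n$ nodes this issues $n(n-1)$ edge weight queries, so the cost grows quadratically with the size of the node subset.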
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the path reachability query comprises the following steps. The weight sums $V[i][j][k]$ of the buckets $B[i][j][k]$ in the three-dimensional sketch structure are summed to obtain the total weight $F$ of the data stream in the current epoch, where $i$, $j$ and $k$ are the row, column and depth indices of the bucket. All buckets $B[i][j][k]$ are traversed; when $V[i][j][k] \ge \phi F$, the key $(x, y)$ stored in $K[i][j][k]$ is read and the edge weight $\hat{S}(x, y)$ of $(x, y)$ is obtained by the edge weight query; when $\hat{S}(x, y) \ge \phi F$, $(x, y)$ is determined to be an elephant flow edge, where $\phi$ is a preset threshold parameter. The path reachability query is then performed by a breadth-first search algorithm over all elephant flow edges with weight of at least $\phi F$, where the path reachability query determines whether the source node a can be connected to the target node b through edges with weight of at least $\phi F$.
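The final step, breadth-first search over the elephant flow edges, can be sketched as follows. The function name `reachable` and the sample edge list are hypothetical; the input is assumed to be the list of directed edges that have already passed the threshold test.

```python
from collections import deque

def reachable(heavy_edges, a, b):
    """Breadth-first search from a to b over directed heavy edges."""
    adj = {}
    for u, v in heavy_edges:
        adj.setdefault(u, []).append(v)   # adjacency list of the heavy subgraph
    seen, queue = {a}, deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

edges = [("a", "b"), ("b", "c"), ("d", "a")]
print(reachable(edges, "a", "c"), reachable(edges, "c", "a"))  # True False
```

Restricting the search to heavy edges keeps the searched graph small, at the cost of ignoring paths that pass through light edges.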
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the elephant flow edge query comprises the following steps. Given a threshold parameter $\phi$, to query all elephant flows, first calculate the total weight $F$ of all flows in the epoch. Then examine each bucket $B[i][j][k]$: if $V[i][j][k] \ge \phi F$, let $(x, y) = K[i][j][k]$ and obtain $\hat{S}(x, y)$ using the edge weight query operation. Return all edges $(x, y)$ satisfying $\hat{S}(x, y) \ge \phi F$.
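The elephant flow edge query can be sketched as a scan over all buckets. Two points are assumptions of this sketch: the total weight `F` is taken from a single depth layer (each layer receives every edge exactly once), and both threshold tests use $\phi F$. The name `heavy_hitter_edges` is hypothetical, and `edge_weight` stands in for the edge weight query.

```python
def heavy_hitter_edges(layers, edge_weight, phi):
    """layers[k][i][j] is a (V, K, C) triple; returns the set of heavy edges."""
    F = sum(V for row in layers[0] for V, _, _ in row)   # total weight of the epoch
    found = set()
    for layer in layers:
        for row in layer:
            for V, K, _ in row:
                # Only a bucket that is heavy itself can hold a heavy edge.
                if K is not None and V >= phi * F and edge_weight(K) >= phi * F:
                    found.add(K)
    return found

# Hypothetical 2 x 2 x 2 sketch holding a->b (5), a->c (3) and d->e (4).
depth0 = [[(5, ("a", "b"), 5), (3, ("a", "c"), 3)], [(4, ("d", "e"), 4), (0, None, 0)]]
depth1 = [[(0, None, 0), (0, None, 0)], [(5, ("a", "b"), 5), (7, ("d", "e"), 1)]]
estimates = {("a", "b"): 5.0, ("d", "e"): 4.0}
heavy = heavy_hitter_edges([depth0, depth1], lambda e: estimates.get(e, 0.0), phi=0.4)
print(heavy)  # {('a', 'b')}
```

The bucket at depth 1 holding key `("d", "e")` passes the first test because of a collision, and the second test on the edge weight estimate removes it.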
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the elephant flow node query comprises the following steps.
Elephant flows on the node outflow weight are taken as an example. Given a threshold parameter $\phi$, first calculate the total weight $F$ of all flows in the epoch. Then, for each depth, check the sum of each row in the bucket array: for the $i$-th row at depth $k$, if $\sum_{j=1}^{h} V[i][j][k] \ge \phi F$, let $x$ be the source node of a key $K[i][j][k]$ stored in that row and obtain its outflow weight estimate $\hat{O}(x)$ using the node weight query operation. Return all nodes $x$ satisfying $\hat{O}(x) \ge \phi F$.
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the mutation flow edge query comprises the following steps. Given a threshold parameter $\phi$, to query all mutation flow edges, first obtain all distinct edges $(x, y)$ from the bucket arrays of two adjacent epochs and calculate the total flow change $\tilde{D}$ of the two adjacent epochs. $\tilde{D}$ is an estimate of the true total flow change $D$, and $\tilde{D} \le D$, since hash collisions cancel weight changes in opposite directions. Then let $U(x, y)$ and $L(x, y)$ denote the upper and lower limits of the edge weight $S(x, y)$, respectively. The upper limit of the edge weight is assigned the result returned by the edge weight query, i.e. $U(x, y) = \hat{S}(x, y)$. For the bucket array at depth $k$, $1 \le k \le w$, edge $(x, y)$ is mapped to bucket $B[h_k(x)][h_k(y)][k]$; if $K[h_k(x)][h_k(y)][k] = (x, y)$, let $L_k(x, y) = C[h_k(x)][h_k(y)][k]$; otherwise, let $L_k(x, y) = 0$. The lower limit of the edge weight is then assigned the maximum of the lower limits estimated at the different depths, i.e. $L(x, y) = \max_{1 \le k \le w} L_k(x, y)$. Let $U^1(x, y)$ and $L^1(x, y)$ denote the upper and lower limits of the edge weight $S^1(x, y)$ in the previous epoch, and let $U^2(x, y)$ and $L^2(x, y)$ denote the upper and lower limits of the edge weight $S^2(x, y)$ in the current epoch. The estimated maximum weight change of edge $(x, y)$ is given by $\hat{D}(x, y) = \max\{|U^1(x, y) - L^2(x, y)|, |U^2(x, y) - L^1(x, y)|\}$. Finally, each edge appearing in the two adjacent epochs is examined. For each edge, if the weight sum of its bucket is at least $\phi \tilde{D}$ in the previous epoch or in the current epoch, the estimate $\hat{D}(x, y)$ of its weight change is calculated; if $\hat{D}(x, y) \ge \phi \tilde{D}$, edge $(x, y)$ is returned, and a mutation flow is considered to have appeared on edge $(x, y)$.
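The bound computation for one edge can be sketched as follows. It assumes the upper limit is the edge weight estimate (minimum over depths of $(V \pm C)/2$), the lower limit is the largest counter among the buckets whose stored key is the edge, and the change estimate is the larger of the two cross-epoch gaps; these rules, the function names `bounds` and `max_change`, and the sample values of $\phi$ and $\tilde{D}$ are assumptions made for illustration.

```python
def bounds(per_depth, edge):
    """Return (upper, lower) weight limits from the edge's per-depth (V, K, C) buckets."""
    upper = min((V + C) / 2 if K == edge else (V - C) / 2 for V, K, C in per_depth)
    lower = max(C if K == edge else 0 for V, K, C in per_depth)
    return upper, lower

def max_change(prev, curr, edge):
    """Largest weight change consistent with the limits of two adjacent epochs."""
    u1, l1 = bounds(prev, edge)
    u2, l2 = bounds(curr, edge)
    return max(abs(u1 - l2), abs(u2 - l1))

edge = ("a", "b")
prev = [(11, edge, 7), (7, edge, 7)]      # previous epoch: true weight 7
curr = [(30, edge, 26), (28, edge, 28)]   # current epoch: true weight 28
change = max_change(prev, curr, edge)
phi, total_change = 0.5, 25               # hypothetical threshold and total flow change
print(change, change >= phi * total_change)  # 21.0 True
```

Using the upper limit of one epoch against the lower limit of the other makes the change estimate err on the side of reporting an edge rather than missing it.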
In one embodiment, the method further comprises performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the mutation flow node query comprises the following steps. Given a threshold parameter $\phi$, to query all mutation flow nodes, first obtain all distinct nodes $x$ from the bucket arrays of two adjacent epochs and calculate the total flow change $\tilde{D}$ of the two adjacent epochs.
Then let $U(x)$ and $L(x)$ denote the upper and lower limits of the node weight $S(x)$, respectively. The mutation flow node query operation is illustrated below using the node outflow weight $O(x)$ as an example. The upper limit of the node weight is assigned the result returned by the node weight query, i.e. $U(x) = \hat{O}(x)$. For the bucket array at depth $k$, $1 \le k \le w$, node $x$ is mapped to row $h_k(x)$ of the bucket array; for each column $j$, if the source node of the key $K[h_k(x)][j][k]$ is $x$, let $L_{k,j}(x) = C[h_k(x)][j][k]$; otherwise, let $L_{k,j}(x) = 0$. Then let $L_k(x) = \sum_{j=1}^{h} L_{k,j}(x)$. The lower limit of the node weight is assigned the maximum of the lower limits estimated at the different depths, i.e. $L(x) = \max_{1 \le k \le w} L_k(x)$. Let $U^1(x)$ and $L^1(x)$ denote the upper and lower limits of the node weight $S^1(x)$ in the previous epoch, and let $U^2(x)$ and $L^2(x)$ denote the upper and lower limits of the node weight $S^2(x)$ in the current epoch. The estimated maximum weight change of node $x$ is given by $\hat{D}(x) = \max\{|U^1(x) - L^2(x)|, |U^2(x) - L^1(x)|\}$. Finally, each node appearing in the two adjacent epochs is examined. For each node, if the weight sum of its row is at least $\phi \tilde{D}$ in the previous epoch or in the current epoch, the estimate $\hat{D}(x)$ of its weight change is calculated; if $\hat{D}(x) \ge \phi \tilde{D}$, node $x$ is returned, and a mutation flow is considered to have appeared on node $x$.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not limited to the exact order shown and described and may be performed in other orders. Moreover, at least some of the steps in fig. 1 may comprise multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a design apparatus of a three-dimensional sketch structure, comprising a three-dimensional sketch structure constructing module 402, a bucket initialization module 404, a bucket index determining module 406, a bucket updating module 408 and a statistics and query module 410, wherein:
the three-dimensional sketch structure constructing module 402 is configured to construct a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
the bucket initialization module 404 is configured to divide each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to decide whether to retain or replace the key with the largest current weight, and to initialize the buckets;
the bucket index determining module 406 is configured to obtain stream data information of the graph structure, and to calculate the bucket index of each edge at each depth according to the node information of the edge in the stream data information and the hash functions corresponding to the different depths;
the bucket updating module 408 is configured to update the bucket at each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge, updating the value of the indication counter by a majority voting algorithm according to the key with the largest current weight in the bucket and the acquired information of the edge, and retaining or replacing the key with the largest current weight according to the updated value of the indication counter;
and the statistics and query module 410 is configured to continue to acquire stream data information, update the information of all edges in the stream data information into the three-dimensional sketch structure, and perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
The bucket updating module 408 is further configured to update the bucket at each depth corresponding to an edge $(x, y)$ in the stream data information according to the bucket index $(h_k(x), h_k(y))$ of the edge, where $k$ is the depth number of the bucket and $h_k$ is the hash function corresponding to depth layer $k$. The update comprises the following steps. The sum of the weight values in the bucket is updated according to the weight information of the edge: $V[h_k(x)][h_k(y)][k] \leftarrow V[h_k(x)][h_k(y)][k] + f$, where $f$ is the weight information of the edge and $V[h_k(x)][h_k(y)][k]$ is the sum of the weight values in the bucket to which edge $(x, y)$ is mapped at depth layer $k$. The key with the largest current weight in the bucket, $K[h_k(x)][h_k(y)][k]$, is compared with edge $(x, y)$. When $K[h_k(x)][h_k(y)][k] = (x, y)$, the value of the indication counter is updated as $C[h_k(x)][h_k(y)][k] \leftarrow C[h_k(x)][h_k(y)][k] + f$, where $K[h_k(x)][h_k(y)][k]$ is the key with the largest current weight in the bucket to which edge $(x, y)$ is mapped at depth layer $k$, and $C[h_k(x)][h_k(y)][k]$ is the value of the indication counter in that bucket. When $K[h_k(x)][h_k(y)][k] \ne (x, y)$, the value of the indication counter is updated as $C[h_k(x)][h_k(y)][k] \leftarrow C[h_k(x)][h_k(y)][k] - f$. The sign of $C[h_k(x)][h_k(y)][k]$ is then checked: when $C[h_k(x)][h_k(y)][k] \ge 0$, the value of $K[h_k(x)][h_k(y)][k]$ remains unchanged; when $C[h_k(x)][h_k(y)][k] < 0$, $K[h_k(x)][h_k(y)][k]$ and $C[h_k(x)][h_k(y)][k]$ are updated as $K[h_k(x)][h_k(y)][k] \leftarrow (x, y)$ and $C[h_k(x)][h_k(y)][k] \leftarrow -C[h_k(x)][h_k(y)][k]$.
The bucket updating module 408 is further configured to perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, where the edge weight query comprises the following steps. The edge $e$ to be queried is obtained. For each depth $k$, the key $K_k$ stored in the bucket to which $e$ is mapped is compared with $e$. If they are the same, the edge weight estimate of $e$ at depth layer $k$ is $\hat{w}_k(e) = (V_k + C_k)/2$; otherwise, the edge weight estimate of $e$ at depth layer $k$ is $\hat{w}_k(e) = (V_k - C_k)/2$, where $V_k$ is the sum of the weight values and $C_k$ is the value of the indication counter in that bucket, and $\hat{w}_k(e)$ is the edge weight estimate of the edge $e$ at depth layer $k$. The minimum of the edge weight estimates of $e$ over all depths is taken as the edge weight estimate of $e$: $\hat{w}(e) = \min_k \hat{w}_k(e)$, where $\hat{w}(e)$ is the edge weight estimate of the queried edge $e$.
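The edge weight query described above can be illustrated with a small, self-contained sketch. It makes several assumptions: the per-depth estimate uses the standard majority-vote sketch estimator, $(V + C)/2$ when the stored key matches and $(V - C)/2$ otherwise; the row index is taken from the source node and the column index from the destination node; and a salted BLAKE2 hash stands in for the per-depth hash functions. The class `Sketch3D` and its methods are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


@dataclass
class Bucket:
    total: int = 0              # sum of weights of all edges mapped to this bucket
    key: Optional[Edge] = None  # candidate edge with the largest current weight
    counter: int = 0            # indication counter


class Sketch3D:
    """Hypothetical depth x width x width bucket array, one hash per depth."""

    def __init__(self, depth: int, width: int) -> None:
        self.depth = depth
        self.width = width
        self.layers: List[List[List[Bucket]]] = [
            [[Bucket() for _ in range(width)] for _ in range(width)]
            for _ in range(depth)
        ]

    def _index(self, node: str, k: int) -> int:
        # Salting with the depth number stands in for "a different hash
        # function per depth"; the choice of BLAKE2 is an assumption.
        digest = hashlib.blake2b(
            node.encode("utf-8"), digest_size=8, salt=k.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def _bucket(self, edge: Edge, k: int) -> Bucket:
        source, destination = edge
        return self.layers[k][self._index(source, k)][self._index(destination, k)]

    def update(self, edge: Edge, weight: int) -> None:
        for k in range(self.depth):
            b = self._bucket(edge, k)
            b.total += weight
            if b.key == edge:
                b.counter += weight
            else:
                b.counter -= weight
                if b.counter < 0:
                    b.key, b.counter = edge, -b.counter

    def edge_weight(self, edge: Edge) -> float:
        estimates = []
        for k in range(self.depth):
            b = self._bucket(edge, k)
            if b.key == edge:
                estimates.append((b.total + b.counter) / 2)  # assumed estimator
            else:
                estimates.append((b.total - b.counter) / 2)  # assumed estimator
        return min(estimates)  # minimum over all depths


sketch = Sketch3D(depth=3, width=16)
for edge, weight in [(("a", "b"), 5), (("c", "d"), 2), (("a", "b"), 3)]:
    sketch.update(edge, weight)
print(sketch.edge_weight(("a", "b")))  # never below the true weight of 8
```

Under these assumptions each per-depth estimate is an upper bound on the true edge weight, which is why taking the minimum over depths tightens the result.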
The bucket updating module 408 is further configured to perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, where the node weight query comprises the following steps. The node $x$ to be queried is obtained.

Output weight: for each bucket in the row associated with node $x$ at depth layer $k$, if the condition [formula] holds, the output weight estimate of node $x$ in column $j$ of depth layer $k$ is [formula]; otherwise, the output weight estimate of node $x$ in column $j$ of depth layer $k$ is [formula]. Here $k$ ranges up to the total depth of the three-dimensional bucket array and $j$ ranges up to its total number of columns. From the output weight estimates of node $x$ in the columns of depth layer $k$, the output weight estimate of node $x$ at depth layer $k$ is obtained as [formula]. From the output weight estimates of node $x$ at the individual depths, the output weight estimate of node $x$ is obtained as [formula].

Input weight: for each bucket in the column associated with node $x$ at depth layer $k$, if the condition [formula] holds, the input weight estimate of node $x$ in row $i$ of depth layer $k$ is [formula]; otherwise, the input weight estimate of node $x$ in row $i$ of depth layer $k$ is [formula]. Here $i$ ranges up to the total number of rows of the three-dimensional bucket array. From the input weight estimates of node $x$ in the rows of depth layer $k$, the input weight estimate of node $x$ at depth layer $k$ is obtained as [formula]. From the input weight estimates of node $x$ at the individual depths, the input weight estimate of node $x$ is obtained as [formula].

Finally, the node weight estimate of node $x$ is obtained from its output weight estimate and its input weight estimate.
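The row-wise and column-wise aggregation in the node weight query can be illustrated as follows. The per-bucket two-case rule is given by formulas that are not reproduced here, so this sketch deliberately substitutes a simpler technique: it adds up the plain weight sums of the buckets in the node's row (output weight) or column (input weight) and takes the minimum over depths, which yields an over-estimate. How the two estimates are combined into a single node weight is also left open, so the function returns both. The names `node_index`, `build_totals` and `node_weights`, and the salted BLAKE2 hash, are assumptions.

```python
import hashlib
from typing import List, Sequence, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


def node_index(node: str, k: int, width: int) -> int:
    """Hash a node to a row/column index at depth k (hash choice is assumed)."""
    digest = hashlib.blake2b(
        node.encode("utf-8"), digest_size=8, salt=k.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % width


def build_totals(
    edges: Sequence[Tuple[Edge, int]], depth: int, width: int
) -> List[List[List[int]]]:
    """Keep only the per-bucket weight sums of a depth x width x width array."""
    totals = [[[0] * width for _ in range(width)] for _ in range(depth)]
    for (source, destination), weight in edges:
        for k in range(depth):
            row = node_index(source, k, width)
            col = node_index(destination, k, width)
            totals[k][row][col] += weight
    return totals


def node_weights(totals: List[List[List[int]]], node: str) -> Tuple[int, int]:
    """Return (output estimate, input estimate) for a node."""
    width = len(totals[0])
    out_per_depth, in_per_depth = [], []
    for k, layer in enumerate(totals):
        idx = node_index(node, k, width)
        out_per_depth.append(sum(layer[idx][j] for j in range(width)))  # row sum
        in_per_depth.append(sum(layer[i][idx] for i in range(width)))   # column sum
    return min(out_per_depth), min(in_per_depth)


edges = [(("a", "b"), 3), (("a", "c"), 4), (("d", "a"), 2)]
totals = build_totals(edges, depth=3, width=8)
out_w, in_w = node_weights(totals, "a")
print(out_w, in_w)  # upper bounds on the true values 7 and 2
```

Replacing the plain bucket sum with a key-aware estimate, as the two-case rule in the text suggests, would tighten these bounds when unrelated edges collide in the same row or column.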
The bucket updating module 408 is further configured to perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, where the subgraph weight query comprises the following steps. The node subset $X = \{x_1, x_2, \ldots, x_n\}$ of the subgraph to be queried is obtained, where $n$ is the number of nodes in the node subset and $x_i$ denotes the node with sequence number $i$, $1 \leq i \leq n$. The edge subset $E = \{e_1, e_2, \ldots, e_m\}$ of the subgraph is determined from the nodes in the node subset, where $m$ is the number of edges in the edge subset and $e_i$ denotes the edge with sequence number $i$, $1 \leq i \leq m$. The weights $\hat{w}(e_i)$ of all edges in the edge subset are obtained through the edge weight query, and the sum of the weights of all edges in $E$, $\sum_{i=1}^{m} \hat{w}(e_i)$, is taken as the weight estimate of the subgraph.
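The subgraph weight query reduces to repeated edge weight queries, which the following minimal sketch illustrates. It assumes that the edge subset consists of every ordered pair of distinct nodes in the node subset, which is one possible reading of "determined from the nodes". The function `subgraph_weight` is a hypothetical name, and the exact dictionary lookup merely stands in for the sketch's edge weight query.

```python
from itertools import permutations
from typing import Callable, Dict, Iterable, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


def subgraph_weight(
    nodes: Iterable[str], edge_weight: Callable[[Edge], float]
) -> float:
    """Sum the edge weight estimates over all ordered node pairs."""
    node_list = list(dict.fromkeys(nodes))        # drop duplicates, keep order
    candidate_edges = permutations(node_list, 2)  # assumed edge subset
    return sum(edge_weight(edge) for edge in candidate_edges)


# Stand-in for the edge weight query: exact weights kept in a dictionary.
weights: Dict[Edge, float] = {("a", "b"): 3, ("b", "c"): 4, ("c", "d"): 10}
estimate = subgraph_weight(["a", "b", "c"], lambda e: weights.get(e, 0))
print(estimate)
```

Because every candidate edge costs one query, the work grows quadratically with the size of the node subset under this reading.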
The bucket updating module 408 is further configured to perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, where the path reachability query comprises the following steps. The sums of weight values $V_{i,j,k}$ of the buckets $B_{i,j,k}$ in the three-dimensional sketch structure are summed to obtain the total weight $F$ of the data stream in the current period, where $i$, $j$ and $k$ are the sequence numbers of the bucket in the row, column and depth dimensions. All buckets $B_{i,j,k}$ are traversed: when $V_{i,j,k} \geq \phi F$, the key $K_{i,j,k}$ stored in $B_{i,j,k}$ is read and the edge weight of $K_{i,j,k}$ is obtained through the edge weight query; when that edge weight is at least $\phi F$, $K_{i,j,k}$ is determined to be an elephant flow edge, where $\phi$ is a preset threshold parameter. A path reachability query is then performed with a breadth-first search algorithm over all elephant flow edges whose weight is greater than $\phi F$, where the path reachability query determines whether the source node $a$ can be connected to the target node $b$ through edges whose weight is at least $\phi F$.
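The final step of the path reachability query, a breadth-first search restricted to heavy edges, may be sketched as follows. The sketch assumes the threshold is the product of the preset parameter and the total weight, and that the candidate edges and their weight estimates have already been collected from the buckets. The function name `reachable` and its parameters are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


def reachable(
    candidates: Iterable[Tuple[Edge, float]],
    total_weight: float,
    phi: float,
    source: str,
    target: str,
) -> bool:
    """Breadth-first search over edges whose weight reaches phi * total_weight."""
    threshold = phi * total_weight  # assumed form of the threshold
    adjacency: Dict[str, List[str]] = {}
    for (src, dst), weight in candidates:
        if weight >= threshold:     # keep elephant flow edges only
            adjacency.setdefault(src, []).append(dst)

    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


candidates = [(("a", "b"), 50), (("b", "c"), 40), (("c", "d"), 1), (("a", "d"), 2)]
total = sum(weight for _, weight in candidates)
print(reachable(candidates, total, 0.1, "a", "c"))
print(reachable(candidates, total, 0.1, "a", "d"))
```

Filtering before the search keeps the graph small, since only the heavy edges recovered from the buckets are ever materialised.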
For specific limitations of the apparatus for designing a three-dimensional sketch structure, reference may be made to the limitations of the method for designing a three-dimensional sketch structure above, which are not repeated here. Each module in the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 5. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of designing a three-dimensional sketch structure. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the method embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A method for designing a three-dimensional sketch structure, the method comprising:
constructing a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
dividing each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and initializing the bucket;
acquiring stream data information of a graph structure, and calculating a bucket index of each edge at each depth according to node information of the edge in the stream data information and the hash functions corresponding to the different depths;
updating the bucket of each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge, updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the acquired information of the edge, and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
continuously acquiring stream data information, updating the information of all edges in the stream data information into the three-dimensional sketch structure, and performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
2. The method according to claim 1, wherein updating the bucket of each depth corresponding to the edge according to the bucket index, the updating comprising updating the sum of the weight values in the bucket according to the weight information of the edge, updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the acquired information of the edge, and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter, comprises:
updating the bucket of each depth corresponding to the edge $e$ in the stream data information according to the bucket index of the edge $e$ computed with $h_k$, where $k$ is the depth sequence number of the bucket and $h_k$ is the hash function corresponding to depth layer $k$, through the following steps:
updating the sum of the weight values in the bucket to:
$V_k \leftarrow V_k + w$
where $w$ is the weight information of the edge and $V_k$ is the sum of the weight values in the bucket to which the edge $e$ is mapped at depth layer $k$;
comparing the key $K_k$ with the largest current weight in the bucket with the edge $e$, and, when $K_k = e$, updating the value of the indication counter to:
$C_k \leftarrow C_k + w$
where $K_k$ is the key with the largest current weight in the bucket to which the edge $e$ is mapped at depth layer $k$, and $C_k$ is the value of the indication counter in the bucket to which the edge $e$ is mapped at depth layer $k$;
when $K_k \neq e$, updating the value of the indication counter to:
$C_k \leftarrow C_k - w$
checking the sign of $C_k$: when $C_k \geq 0$, the value of $K_k$ remains unchanged; when $C_k < 0$, updating $K_k$ and $C_k$ to:
$K_k \leftarrow e$
$C_k \leftarrow -C_k$
3. The method of claim 2, wherein performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure comprises:
performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the queries comprise an edge weight query, a node weight query, an elephant flow edge query, an elephant flow node query, a mutation flow query, a mutation node query, a subgraph weight query and a path reachability query.
4. The method of claim 3, wherein performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure comprises:
performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the edge weight query comprises the following steps:
obtaining the edge $e$ to be queried, and comparing, for each depth $k$, the key $K_k$ stored in the bucket to which $e$ is mapped with $e$; if they are the same, the edge weight estimate of the edge $e$ at depth layer $k$ is:
$\hat{w}_k(e) = (V_k + C_k)/2$
otherwise, the edge weight estimate of the edge $e$ at depth layer $k$ is:
$\hat{w}_k(e) = (V_k - C_k)/2$
where $V_k$ is the sum of the weight values and $C_k$ is the value of the indication counter in that bucket, and $\hat{w}_k(e)$ is the edge weight estimate of the edge $e$ at depth layer $k$;
taking the minimum of the edge weight estimates of the edge $e$ over all depths as the edge weight estimate of the edge $e$:
$\hat{w}(e) = \min_k \hat{w}_k(e)$
where $\hat{w}(e)$ is the edge weight estimate of the queried edge $e$.
5. The method of claim 3, wherein performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure comprises:
performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the node weight query comprises the following steps:
obtaining the node $x$ to be queried;
for each bucket in the row associated with node $x$ at depth layer $k$, if the condition [formula] holds, the output weight estimate of node $x$ in column $j$ of depth layer $k$ is:
[formula]
otherwise, the output weight estimate of node $x$ in column $j$ of depth layer $k$ is:
[formula]
where $k$ ranges up to the total depth of the three-dimensional bucket array and $j$ ranges up to its total number of columns;
obtaining, from the output weight estimates of node $x$ in the columns of depth layer $k$, the output weight estimate of node $x$ at depth layer $k$:
[formula]
obtaining, from the output weight estimates of node $x$ at the individual depths, the output weight estimate of node $x$:
[formula]
for each bucket in the column associated with node $x$ at depth layer $k$, if the condition [formula] holds, the input weight estimate of node $x$ in row $i$ of depth layer $k$ is:
[formula]
otherwise, the input weight estimate of node $x$ in row $i$ of depth layer $k$ is:
[formula]
where $i$ ranges up to the total number of rows of the three-dimensional bucket array;
obtaining, from the input weight estimates of node $x$ in the rows of depth layer $k$, the input weight estimate of node $x$ at depth layer $k$:
[formula]
obtaining, from the input weight estimates of node $x$ at the individual depths, the input weight estimate of node $x$:
[formula]
obtaining the node weight estimate of node $x$ from the output weight estimate and the input weight estimate of node $x$.
6. The method of claim 4, wherein performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure comprises:
performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the subgraph weight query comprises the following steps:
obtaining the node subset $X = \{x_1, x_2, \ldots, x_n\}$ of the subgraph to be queried, where $n$ is the number of nodes in the node subset and $x_i$ denotes the node with sequence number $i$, $1 \leq i \leq n$;
determining the edge subset $E = \{e_1, e_2, \ldots, e_m\}$ of the subgraph from the nodes in the node subset, where $m$ is the number of edges in the edge subset and $e_i$ denotes the edge with sequence number $i$, $1 \leq i \leq m$;
obtaining the weights $\hat{w}(e_i)$ of all edges in the edge subset through the edge weight query;
taking the sum of the weights of all edges in $E$, $\sum_{i=1}^{m} \hat{w}(e_i)$, as the weight estimate of the subgraph.
7. The method of claim 4, wherein performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure comprises:
performing statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure, wherein the path reachability query comprises the following steps:
summing the sums of weight values $V_{i,j,k}$ of the buckets $B_{i,j,k}$ in the three-dimensional sketch structure to obtain the total weight $F$ of the data stream in the current period, where $i$, $j$ and $k$ are the sequence numbers of the bucket in the row, column and depth dimensions;
traversing all buckets $B_{i,j,k}$ and, when $V_{i,j,k} \geq \phi F$, reading the key $K_{i,j,k}$ stored in $B_{i,j,k}$ and obtaining the edge weight of $K_{i,j,k}$ through the edge weight query; when that edge weight is at least $\phi F$, determining that $K_{i,j,k}$ is an elephant flow edge, where $\phi$ is a preset threshold parameter;
performing a path reachability query with a breadth-first search algorithm over all elephant flow edges whose weight is greater than $\phi F$, wherein the path reachability query determines whether the source node $a$ can be connected to the target node $b$ through edges whose weight is at least $\phi F$.
8. An apparatus for designing a three-dimensional sketch structure, the apparatus comprising:
a three-dimensional sketch structure construction module, configured to construct a three-dimensional sketch structure for storing stream data information, wherein the three-dimensional sketch structure comprises a three-dimensional bucket array, the total number of rows and the total number of columns of the three-dimensional bucket array are the same, and different depths of the three-dimensional bucket array correspond to different hash functions;
a bucket initialization module, configured to divide each bucket in the three-dimensional bucket array into three regions, which respectively store the sum of the weight values of all edges mapped to the bucket, the key with the largest current weight mapped to the bucket, and the value of an indication counter used to indicate whether to retain or replace the key with the largest current weight, and to initialize the bucket;
a bucket index determining module, configured to acquire stream data information of a graph structure and to calculate a bucket index of each edge at each depth according to node information of the edge in the stream data information and the hash functions corresponding to the different depths;
a bucket updating module, configured to update the bucket of each depth corresponding to the edge according to the bucket index, wherein the updating comprises: updating the sum of the weight values in the bucket according to the weight information of the edge, updating the value of the indication counter through a majority voting algorithm according to the key with the largest current weight in the bucket and the acquired information of the edge, and determining whether to retain or replace the key with the largest current weight according to the updated value of the indication counter;
a statistics and query module, configured to continuously acquire stream data information, update the information of all edges in the stream data information into the three-dimensional sketch structure, and perform statistics and queries on the stream data according to the data stored in the three-dimensional sketch structure.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202110093365.3A 2021-01-25 2021-01-25 Design method and device of three-dimensional sketch structure Active CN112416950B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110093365.3A CN112416950B (en) 2021-01-25 2021-01-25 Design method and device of three-dimensional sketch structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110093365.3A CN112416950B (en) 2021-01-25 2021-01-25 Design method and device of three-dimensional sketch structure

Publications (2)

Publication Number Publication Date
CN112416950A CN112416950A (en) 2021-02-26
CN112416950B true CN112416950B (en) 2021-03-26

Family

ID=74782483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110093365.3A Active CN112416950B (en) 2021-01-25 2021-01-25 Design method and device of three-dimensional sketch structure

Country Status (1)

Country Link
CN (1) CN112416950B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant