CN114722242A - Binary counting type summarization method and device based on graph data stream and computer equipment - Google Patents
- Publication number
- CN114722242A (application CN202210248361.2A)
- Authority
- CN
- China
- Prior art keywords
- bucket
- dimensional array
- directed edge
- weighted
- updated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9014—Indexing; Data structures therefor; Storage structures hash tables
Abstract
The application relates to a binary counting type summarization method and device based on graph data streams, computer equipment, and a storage medium. The method comprises the following steps: constructing a two-dimensional array abstract structure; inserting a weighted directed edge into the two-dimensional array abstract structure, hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge, then finding the corresponding bucket in the two-dimensional array abstract structure, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure; and calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, and updating the updated two-dimensional array abstract structure by using the obtained weight prediction value of the directed edge to obtain a binary counting abstract. With this method, graph data streams, which express binary relations, can be stored and queried.
Description
Technical Field
The present application relates to the field of network communication technologies, and in particular, to a binary counting type summarization method and apparatus based on graph data streams, a computer device, and a storage medium.
Background
With the development of network communication technology, graph data streams have emerged. A graph data stream is a general data stream model that represents an unbounded sequence of continuously arriving data records, each of which corresponds to a weighted directed edge in a graph. Many data streams in the field of network communications fall into this category; for example, a network flow in a computer network corresponds to one data transmission from a source host to a specific destination host, and a message in a social network corresponds to one information interaction between two online accounts. Because the storage and processing capabilities of computer systems are limited, computing a fixed-size summary of a graph data stream in a small storage space has become an important means of improving the scalability of graph data stream processing.
A traditional counting type abstract consists of one-dimensional arrays and can only store and query a unary relation: for a key-value pair, a hash function maps the input key to a random position in an array, and the array element at that position is then updated or queried. A graph data stream, however, reflects a binary relation between vertices, and each access concerns one directed edge, which is identified by two vertices rather than by a single one-dimensional input key, so the traditional counting type abstract is no longer applicable.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a binary counting type summarization method and apparatus based on graph data streams, a computer device, and a storage medium that are capable of storing and querying graph data streams expressing binary relations.
A method for binary-counting summarization based on graph data streams, the method comprising:
constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
inserting a weighted directed edge into the two-dimensional array abstract structure, and hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge;
finding the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
for a directed edge to be queried in the updated two-dimensional array abstract structure, hashing its source vertex and destination vertex with the hash functions to obtain the row index and the column index of the bucket corresponding to the directed edge; and calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge;
and updating the updated two-dimensional array abstract structure by using the weight prediction value of the directed edge to obtain a binary counting abstract.
In one embodiment, updating the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated two-dimensional array abstract structure includes:
updating the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated counter of the bucket;
and constructing the updated two-dimensional array abstract structure according to the updated counter of the bucket.
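The algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + c$, wherein $I[l]$ denotes the $l$-th two-dimensional array, $\big(h_r^l(s), h_c^l(t)\big)$ denotes the index of the bucket, namely the bucket in row $h_r^l(s)$ and column $h_c^l(t)$, $s$ denotes the source vertex, $t$ denotes the destination vertex, and $c$ denotes the weight.
The weighted algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + \mathrm{sign}(s,t) \times c$, wherein $\mathrm{sign}(s,t) = H_r(s) \times H_c(t)$, $\mathrm{sign}(s,t) \in \{+1, -1\}$, $\mathrm{sign}(s,t)$ denotes a random value calculated from the directed edge, and $H_r(\cdot)$, $H_c(\cdot)$ denote 2 hash functions.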
In one embodiment, calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge includes:
for the counters of the buckets updated by the algebraic sum rule, taking the minimum over the buckets as the weight prediction value of the directed edge, namely $\min_{l \in [1,k]} x_l$,
where $x_l$ denotes the aggregate value of the counter of the bucket $I[l]\big(h_r^l(u), h_c^l(v)\big)$, $u$ denotes the source vertex of the directed edge, $v$ denotes the destination vertex of the directed edge, $l \in [1,k]$, and $k$ denotes the number of two-dimensional arrays.
In one embodiment, for the counters of the buckets updated by the weighted algebraic sum rule, the median over the buckets is taken as the weight prediction value of the directed edge, namely $\operatorname{median}_{l \in [1,k]} x_l$.
A binary-counting summarization apparatus based on graph data streams, the apparatus comprising:
the two-dimensional array abstract structure constructing module is used for constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
the weighted directed edge inserting module is used for inserting a weighted directed edge into the two-dimensional array abstract structure, and hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge;
the two-dimensional array abstract structure updating module is used for finding the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
the binary counting type abstract constructing module is used for, for a directed edge to be queried in the updated two-dimensional array abstract structure, hashing its source vertex and destination vertex with the hash functions to obtain the row index and the column index of the bucket corresponding to the directed edge; calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge; and updating the updated two-dimensional array abstract structure by using the weight prediction value of the directed edge to obtain a binary counting abstract.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
inserting a weighted directed edge into the two-dimensional array abstract structure, and hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge;
finding the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
for a directed edge to be queried in the updated two-dimensional array abstract structure, hashing its source vertex and destination vertex with the hash functions to obtain the row index and the column index of the bucket corresponding to the directed edge; and calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge;
and updating the updated two-dimensional array abstract structure by using the weight prediction value of the directed edge to obtain a binary counting abstract.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
inserting a weighted directed edge into the two-dimensional array abstract structure, and hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge;
finding the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
for a directed edge to be queried in the updated two-dimensional array abstract structure, hashing its source vertex and destination vertex with the hash functions to obtain the row index and the column index of the bucket corresponding to the directed edge; and calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge;
and updating the updated two-dimensional array abstract structure by using the weight prediction value of the directed edge to obtain a binary counting abstract.
According to the binary counting type summarization method, device, computer equipment and storage medium based on graph data streams, a two-dimensional array abstract structure is first constructed; each position in an array is called a bucket, each bucket maintains a counter for aggregating the count values of weighted directed edges, and each bucket can be accessed in constant time through its row and column indices. A weighted directed edge is inserted into the two-dimensional array abstract structure, and its source vertex and destination vertex are hashed with two hash functions, respectively, to obtain the row index and the column index of its bucket. The corresponding bucket is found in the structure according to the row index and the column index, and its counter is updated by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure. For a directed edge to be queried, its source vertex and destination vertex are hashed with the hash functions to obtain the row index and the column index of the corresponding bucket, and weight prediction values are calculated from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge. By constructing the two-dimensional array abstract structure and obtaining the binary counting abstract through the insertion and query operations on it, the invention flexibly supports storing and querying graph data streams, realizes online insertion and query of the weighted directed edges of a graph data stream, and flexibly supports approximate calculation of the weight result with two classes of operators, the algebraic sum and the weighted algebraic sum. When the inserted directed edges have few hash collisions, the precision of the weighted algebraic sum operator is higher than that of the algebraic sum operator.
Drawings
FIG. 1 is a flow diagram illustrating a binary-counting summarization method based on graph data flow, according to one embodiment;
FIG. 2 is a diagram illustrating a two-dimensional array abstract structure in one embodiment;
FIG. 3 is a block diagram of an embodiment of a binary-counting summarization apparatus based on graph data flow;
FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.
In one embodiment, as shown in fig. 1, there is provided a binary-counting summarization method based on graph data flow, comprising the following steps:
Step 102: constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in a two-dimensional array is referred to as a bucket, and each bucket maintains a counter.
The abstract structure is used for storing a graph data stream. A graph data stream is a general data stream model that represents an unbounded sequence of continuously arriving data records, and many data streams in the field of network communications can be classified as graph data streams. For example, a network flow in a computer network corresponds to one data transmission from a source host to a specific destination host, and a message in a social network corresponds to one information interaction between two online accounts.
A two-dimensional array abstract structure comprises $k$ two-dimensional arrays, and each two-dimensional array comprises $m_r \times m_c$ buckets ($m_r$ is the number of rows of the two-dimensional array and $m_c$ is its number of columns, both being parameters pre-configured by the system). Each position in an array is referred to as a "bucket"; each bucket maintains a counter for aggregating the count values of weighted directed edges, and each bucket can be accessed in constant time through its row and column indices. The $l$-th two-dimensional array is denoted $I[l]$, and the bucket in the $p$-th row and $q$-th column of that array is denoted $I[l](p, q)$ ($p \in [1, m_r]$, $q \in [1, m_c]$, $p$ and $q$ being positive integers). The larger the number of rows or columns, the more memory a two-dimensional array requires; each bucket contains one counter, denoted count. The two-dimensional array abstract structure supports an insert operation (inserting a weighted directed edge into the structure) and a query operation (querying the weight of a directed edge). Each two-dimensional array requires 2 hash functions to compute the row and column indices, so a two-dimensional array abstract structure requires $2k$ hash functions, which form a hash function family $\{h_r^l, h_c^l\}$ ($h_r^l$ denotes the hash function that computes the row index, $h_c^l$ denotes the hash function that computes the column index, $l \in [1, k]$) used in the insertion and query of directed edges.
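By way of non-limiting illustration, the structure described above could be laid out as follows. This is a minimal sketch that assumes plain Python lists and zero-based indices (the text uses one-based $p$ and $q$); the class name `TwoDimSummary` and its method names are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TwoDimSummary:
    """k two-dimensional arrays of m_r x m_c buckets, one counter per bucket."""
    k: int      # number of two-dimensional arrays
    m_r: int    # rows per array (pre-configured)
    m_c: int    # columns per array (pre-configured)
    arrays: List[List[List[int]]] = field(init=False)

    def __post_init__(self) -> None:
        # Build every row separately so that buckets never share storage.
        self.arrays = [
            [[0] * self.m_c for _ in range(self.m_r)]
            for _ in range(self.k)
        ]

    def bucket(self, l: int, p: int, q: int) -> int:
        """Constant-time read of the counter I[l](p, q), zero-based."""
        return self.arrays[l][p][q]

    def counter_count(self) -> int:
        """Total number of counters; grows with the rows and columns."""
        return self.k * self.m_r * self.m_c


summary = TwoDimSummary(k=3, m_r=4, m_c=5)
print(summary.counter_count())  # 60
```

The memory footprint is fixed at construction time, which is what allows the summary to keep a constant size however long the stream runs.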
Step 104: inserting a weighted directed edge into the two-dimensional array abstract structure, and hashing the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge.
When a weighted directed edge $\langle s, t, c \rangle$ is inserted into the two-dimensional array abstract structure, the hash function family is used to compute the row and column indices of the edge in the structure, in the same way that a hash function maps the key of an element of a hash table to the storage address of that element. For the $k$ two-dimensional arrays, the bucket indices of the weighted directed edge are denoted $\{(h_r^l(s), h_c^l(t))\}_{l \in [1,k]}$; for the $l$-th array, the bucket index is $(h_r^l(s), h_c^l(t))$, which denotes the bucket in row $h_r^l(s)$ and column $h_c^l(t)$.
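The index computation might look like the sketch below. The application does not name concrete hash functions, so the use of salted BLAKE2b from Python's `hashlib`, the string vertex identifiers, and the helper names `hash_to_range` and `bucket_indices` are all assumptions made for illustration; indices are zero-based here.

```python
import hashlib
from typing import List, Tuple


def hash_to_range(vertex: str, salt: bytes, modulus: int) -> int:
    """Map a vertex identifier to an integer in [0, modulus)."""
    digest = hashlib.blake2b(
        vertex.encode("utf-8"), digest_size=8, salt=salt
    ).digest()
    return int.from_bytes(digest, "big") % modulus


def bucket_indices(s: str, t: str, k: int, m_r: int, m_c: int) -> List[Tuple[int, int]]:
    """Return one (row, column) pair per two-dimensional array."""
    indices = []
    for l in range(k):
        # The row hash of array l sees only the source vertex ...
        row = hash_to_range(s, f"r{l}".encode(), m_r)
        # ... and the column hash sees only the destination vertex.
        col = hash_to_range(t, f"c{l}".encode(), m_c)
        indices.append((row, col))
    return indices


print(bucket_indices("host-a", "host-b", k=3, m_r=8, m_c=8))
```

Because the row depends only on the source vertex and the column only on the destination vertex, all edges leaving one vertex land in the same row of a given array.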
Step 106: finding the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure.
Each position in an array is called a bucket and is identified by its row and column numbers. The position given by the row index and the column index of the bucket of the weighted directed edge is located in the two-dimensional array abstract structure; that position is the bucket corresponding to the weighted directed edge. The counter of the corresponding bucket is updated by using the algebraic sum rule and the weighted algebraic sum rule, yielding the updated two-dimensional array abstract structure, that is, the two-dimensional array abstract structure after the insertion operation.
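As an illustrative sketch of the update step, the function below adds a weight to one bucket per array. It assumes that the bucket indices have already been computed, that the algebraic sum rule adds the weight unchanged, and that the weighted algebraic sum rule adds the weight multiplied by a sign of +1 or -1; the function name `update_buckets` and the hand-picked indices are hypothetical.

```python
from typing import List, Sequence, Tuple

Array2D = List[List[int]]


def update_buckets(
    arrays: Sequence[Array2D],
    indices: Sequence[Tuple[int, int]],
    weight: int,
    sign: int = 1,
) -> None:
    """Add sign * weight to the addressed bucket of every array.

    sign=1 corresponds to the algebraic sum rule (assumed form);
    sign in {+1, -1} corresponds to the weighted algebraic sum rule.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    for array, (row, col) in zip(arrays, indices):
        array[row][col] += sign * weight


# Two structures with k = 2 arrays of 2 x 2 buckets each.
plain = [[[0, 0], [0, 0]] for _ in range(2)]
signed = [[[0, 0], [0, 0]] for _ in range(2)]

edge_indices = [(0, 1), (1, 0)]  # hand-picked (row, column) per array
update_buckets(plain, edge_indices, weight=5)
update_buckets(signed, edge_indices, weight=5, sign=-1)

print(plain)   # [[[0, 5], [0, 0]], [[0, 0], [5, 0]]]
print(signed)  # [[[0, -5], [0, 0]], [[0, 0], [-5, 0]]]
```

Each insertion touches exactly one counter per array, so its cost is proportional to the number of arrays and independent of the size of the stream.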
The query operation is executed after the insertion operation is completed. For any given directed edge $\langle u, v \rangle$, the row and column indices in the updated two-dimensional array abstract structure are computed as $\{(h_r^l(u), h_c^l(v))\}_{l \in [1,k]}$, the index of the $l$-th bucket being $(h_r^l(u), h_c^l(v))$. Weight prediction values are then calculated from the counters of the updated buckets according to the two insertion rules, respectively, giving the weight prediction value of the directed edge in the updated two-dimensional array abstract structure, so that directed edges of the graph data stream can be queried conveniently.
In the binary counting type summarization method based on graph data streams described above, a two-dimensional array abstract structure is first constructed; each position in an array is called a "bucket", each bucket maintains a counter for aggregating the count values of weighted directed edges, and each bucket can be accessed in constant time through its row and column indices. A weighted directed edge is inserted into the two-dimensional array abstract structure, and its source vertex and destination vertex are hashed with two hash functions, respectively, to obtain the row index and the column index of its bucket. The corresponding bucket is found in the structure according to the row index and the column index, and its counter is updated by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure. For a directed edge to be queried, its source vertex and destination vertex are hashed with the hash functions to obtain the row index and the column index of the corresponding bucket, and weight prediction values are calculated from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge. By constructing the two-dimensional array abstract structure and obtaining the binary counting abstract through the insertion and query operations on it, the invention flexibly supports storing and querying graph data streams, realizes online insertion and query of the weighted directed edges of a graph data stream, and flexibly supports approximate calculation of the weight result with two classes of operators, the algebraic sum and the weighted algebraic sum. When the inserted directed edges have few hash collisions, the precision of the weighted algebraic sum operator is higher than that of the algebraic sum operator.
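The overall flow summarized above (construct, insert, query) can be put together in a compact, non-authoritative sketch. Several points are assumptions rather than content of the application: salted BLAKE2b stands in for the unspecified hash functions, one structure uses one operator selected by a `weighted` flag, indices are zero-based, and the weighted query multiplies each counter by the sign of the queried edge before taking the median. The class name `BinaryCountingSummary` is hypothetical.

```python
import hashlib
from statistics import median


def _h(vertex: str, salt: str, modulus: int) -> int:
    """Salted hash of a vertex identifier into [0, modulus)."""
    digest = hashlib.blake2b(
        vertex.encode("utf-8"), digest_size=8, salt=salt.encode()
    ).digest()
    return int.from_bytes(digest, "big") % modulus


class BinaryCountingSummary:
    def __init__(self, k: int, m_r: int, m_c: int, weighted: bool = False):
        self.k, self.m_r, self.m_c, self.weighted = k, m_r, m_c, weighted
        self.arrays = [[[0] * m_c for _ in range(m_r)] for _ in range(k)]

    def _buckets(self, s: str, t: str):
        # One (array, row, column) triple per two-dimensional array.
        for l in range(self.k):
            yield l, _h(s, f"r{l}", self.m_r), _h(t, f"c{l}", self.m_c)

    def _sign(self, s: str, t: str) -> int:
        if not self.weighted:
            return 1
        # Product of two +/-1 hashes, one per endpoint.
        return (2 * _h(s, "sr", 2) - 1) * (2 * _h(t, "sc", 2) - 1)

    def insert(self, s: str, t: str, c: int) -> None:
        sg = self._sign(s, t)
        for l, row, col in self._buckets(s, t):
            self.arrays[l][row][col] += sg * c

    def query(self, u: str, v: str):
        sg = self._sign(u, v)
        values = [sg * self.arrays[l][row][col]
                  for l, row, col in self._buckets(u, v)]
        return median(values) if self.weighted else min(values)


stream = [("a", "b", 4), ("b", "c", 1), ("a", "b", 6)]
summary = BinaryCountingSummary(k=3, m_r=64, m_c=64)
for s, t, c in stream:
    summary.insert(s, t, c)
print(summary.query("a", "b"))  # at least 10, the true total for <a, b>
```

With non-negative weights the algebraic sum variant can only overestimate, because colliding edges add to a counter and never subtract from it.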
In one embodiment, updating the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated two-dimensional array abstract structure includes:
updating the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated counter of the bucket;
and constructing the updated two-dimensional array abstract structure according to the updated counter of the bucket.
As shown in FIG. 2, the counters of the buckets in the two-dimensional array abstract structure are updated with the algebraic sum operator and the weighted algebraic sum operator.
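The algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + c$, wherein $I[l]$ denotes the $l$-th two-dimensional array, $\big(h_r^l(s), h_c^l(t)\big)$ denotes the index of the bucket, namely the bucket in row $h_r^l(s)$ and column $h_c^l(t)$, $s$ denotes the source vertex, $t$ denotes the destination vertex, and $c$ denotes the weight.
The weighted algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + \mathrm{sign}(s,t) \times c$, wherein $\mathrm{sign}(s,t) = H_r(s) \times H_c(t)$, $\mathrm{sign}(s,t) \in \{+1, -1\}$, $\mathrm{sign}(s,t)$ denotes a random value calculated from the directed edge, and $H_r(\cdot)$, $H_c(\cdot)$ denote 2 hash functions.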
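The sign computation can be illustrated as follows. The text only states that the sign is the product of two hash values and lies in {+1, -1}; deriving each factor from the lowest bit of a salted BLAKE2b digest is an assumption, as are the helper names `pm_one` and `sign`.

```python
import hashlib


def pm_one(vertex: str, salt: bytes) -> int:
    """Hash a vertex identifier to +1 or -1 using the lowest digest bit."""
    digest = hashlib.blake2b(
        vertex.encode("utf-8"), digest_size=1, salt=salt
    ).digest()
    return 1 if digest[0] & 1 else -1


def sign(s: str, t: str) -> int:
    """sign(s, t) as the product of a source hash and a destination hash."""
    return pm_one(s, b"sign-r") * pm_one(t, b"sign-c")


for edge in [("a", "b"), ("b", "a"), ("a", "c")]:
    print(edge, sign(*edge))
```

Since the square of the sign is always 1, multiplying a counter by the sign of the queried edge restores that edge's own contribution with its original sign, while contributions of colliding edges keep random signs and tend to cancel.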
In one embodiment, calculating weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge includes:
for the counters of the buckets updated by the algebraic sum rule, taking the minimum over the buckets as the weight prediction value of the directed edge, namely $\min_{l \in [1,k]} x_l$,
where $x_l$ denotes the aggregate value of the counter of the bucket $I[l]\big(h_r^l(u), h_c^l(v)\big)$, $u$ denotes the source vertex of the directed edge, $v$ denotes the destination vertex of the directed edge, $l \in [1,k]$, and $k$ denotes the number of two-dimensional arrays.
If a plurality of directed edges are inserted into the same bucket, the counter of that bucket is not less than the weight of any edge inserted into it, so the minimum over the buckets is taken as the weight prediction value of the directed edge.
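A small worked example makes the reasoning concrete. It is only a sketch: the bucket indices are chosen by hand so that two edges collide in the first array but not in the second, weights are assumed non-negative, and the helper names `insert` and `min_estimate` are hypothetical.

```python
def insert(arrays, indices, weight):
    """Algebraic sum update: add the weight to one bucket per array."""
    for l, (row, col) in enumerate(indices):
        arrays[l][row][col] += weight


def min_estimate(arrays, indices):
    """Predict the weight as the smallest of the addressed counters."""
    return min(arrays[l][row][col] for l, (row, col) in enumerate(indices))


# k = 2 arrays of 2 x 2 buckets, all counters initially zero.
arrays = [[[0, 0], [0, 0]] for _ in range(2)]

edge_a = [(0, 0), (1, 1)]  # buckets of edge A in array 0 and array 1
edge_b = [(0, 0), (0, 1)]  # edge B collides with A in array 0 only

insert(arrays, edge_a, 5)
insert(arrays, edge_b, 3)

print(arrays[0][0][0])               # 8: the shared bucket holds 5 + 3
print(min_estimate(arrays, edge_a))  # 5
print(min_estimate(arrays, edge_b))  # 3
```

The shared bucket overestimates both edges, but each edge has at least one bucket of its own, and taking the minimum picks that uncontaminated counter.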
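In one embodiment, for the counters of the buckets updated by the weighted algebraic sum rule, the median over the buckets is taken as the weight prediction value of the directed edge, namely $\operatorname{median}_{l \in [1,k]} x_l$, where $x_l$ denotes the aggregate value of the counter of the bucket of the directed edge in the $l$-th two-dimensional array and $k$ denotes the number of two-dimensional arrays.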
When the counters of the buckets are updated by using the weighted algebraic sum rule, positive and negative contributions partly cancel each other, so that the value of each bucket stays close to the true value; the median is therefore selected as the weight prediction value of the directed edge.
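The cancellation argument can be traced in a hand-built example. This sketch assumes that the query multiplies each counter by the sign of the queried edge before taking the median, as in the classic Count Sketch estimator; that step, the hand-picked indices and signs, and the helper names `insert_signed` and `median_estimate` are assumptions and are not stated in the application.

```python
from statistics import median


def insert_signed(arrays, indices, weight, sign):
    """Weighted algebraic sum update: add sign * weight per array."""
    for l, (row, col) in enumerate(indices):
        arrays[l][row][col] += sign * weight


def median_estimate(arrays, indices, sign):
    """Predict the weight as the median of the sign-corrected counters."""
    return median(
        sign * arrays[l][row][col] for l, (row, col) in enumerate(indices)
    )


# k = 3 arrays of 2 x 2 buckets, all counters initially zero.
arrays = [[[0, 0], [0, 0]] for _ in range(3)]

edge_a, sign_a = [(0, 0), (1, 1), (0, 1)], +1
edge_b, sign_b = [(0, 0), (0, 0), (1, 1)], -1  # collides with A in array 0

insert_signed(arrays, edge_a, 5, sign_a)
insert_signed(arrays, edge_b, 3, sign_b)

print(arrays[0][0][0])                          # 2: 5 and -3 partly cancel
print(median_estimate(arrays, edge_a, sign_a))  # 5
print(median_estimate(arrays, edge_b, sign_b))  # 3
```

The colliding bucket is wrong for both edges (2 for A and -2 for B after sign correction), but it is an outlier among the three readings, so the median discards it.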
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a binary counting type summarization apparatus based on graph data streams, including: a two-dimensional array abstract structure constructing module 302, a weighted directed edge inserting module 304, a two-dimensional array abstract structure updating module 306 and a binary counting type abstract constructing module 308, wherein:
a two-dimensional array abstract structure constructing module 302, configured to construct a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
a weighted directed edge inserting module 304, configured to insert a weighted directed edge into the two-dimensional array abstract structure, and hash the source vertex and the destination vertex of the weighted directed edge with two hash functions, respectively, to obtain the row index and the column index of the bucket of the weighted directed edge;
a two-dimensional array abstract structure updating module 306, configured to find the corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and update the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
a binary counting type abstract constructing module 308, configured to, for a directed edge to be queried in the updated two-dimensional array abstract structure, hash its source vertex and destination vertex with the hash functions to obtain the row index and the column index of the bucket corresponding to the directed edge; calculate weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge; and update the updated two-dimensional array abstract structure by using the weight prediction value of the directed edge to obtain a binary counting abstract.
In one embodiment, the two-dimensional array abstract structure updating module 306 is further configured to update the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated two-dimensional array abstract structure, including:
updating the counter of the corresponding bucket by using the algebraic sum rule and the weighted algebraic sum rule to obtain the updated counter of the bucket;
and constructing the updated two-dimensional array abstract structure according to the updated counter of the bucket.
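The algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + c$, wherein $I[l]$ denotes the $l$-th two-dimensional array, $\big(h_r^l(s), h_c^l(t)\big)$ denotes the index of the bucket, namely the bucket in row $h_r^l(s)$ and column $h_c^l(t)$, $s$ denotes the source vertex, $t$ denotes the destination vertex, and $c$ denotes the weight.
The weighted algebraic sum rule is $I[l]\big(h_r^l(s), h_c^l(t)\big) \leftarrow I[l]\big(h_r^l(s), h_c^l(t)\big) + \mathrm{sign}(s,t) \times c$, wherein $\mathrm{sign}(s,t) = H_r(s) \times H_c(t)$, $\mathrm{sign}(s,t) \in \{+1, -1\}$, $\mathrm{sign}(s,t)$ denotes a random value calculated from the directed edge, and $H_r(\cdot)$, $H_c(\cdot)$ denote 2 hash functions.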
In one embodiment, the binary counting type abstract constructing module 308 is further configured to calculate weight prediction values from the counters of the buckets updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight prediction value of the directed edge, including:
for the counters of the buckets updated by the algebraic sum rule, taking the minimum over the buckets as the weight prediction value of the directed edge, namely $\min_{l \in [1,k]} x_l$,
where $x_l$ denotes the aggregate value of the counter of the bucket $I[l]\big(h_r^l(u), h_c^l(v)\big)$, $u$ denotes the source vertex of the directed edge, $v$ denotes the destination vertex of the directed edge, $l \in [1,k]$, and $k$ denotes the number of two-dimensional arrays.
In one embodiment, for the counters of the buckets updated by the weighted algebraic sum rule, the median over the buckets is taken as the weight prediction value of the directed edge, namely $\operatorname{median}_{l \in [1,k]} x_l$.
For the specific limitations of the binary counting type summarization apparatus based on graph data streams, refer to the above limitations of the binary counting type summarization method based on graph data streams, which are not repeated here. The modules in the binary counting type summarization apparatus based on graph data streams can be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor of the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a binary-counting summarization method based on graph data streams. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 3 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the method in the above embodiments when the processor executes the computer program.
In an embodiment, a computer storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method in the above-mentioned embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A binary counting type summarization method based on graph data stream, characterized by comprising the following steps:
constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
inserting a weighted directed edge into a two-dimensional array abstract structure, and respectively calculating a source vertex and a destination vertex of the weighted directed edge by using two hash functions to obtain a row index and a column index of a bucket of the weighted directed edge;
finding a corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating a counter of the corresponding bucket by utilizing an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
calculating, according to the hash functions, the source vertex and the destination vertex of a directed edge of the updated two-dimensional array abstract structure to obtain the row index and the column index of the bucket corresponding to the directed edge; calculating weight predicted values from the bucket counters updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight predicted value of the directed edge;
and updating the updated two-dimensional array abstract structure by using the weight predicted value of the directed edge to obtain a binary counting abstract.
2. The method of claim 1, wherein updating the counter of the corresponding bucket using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure comprises:
updating the counter of the corresponding bucket by utilizing an algebraic sum rule and a weighted algebraic sum rule to obtain the updated counter of the bucket;
and constructing an updated two-dimensional array abstract structure according to the updated counter of the bucket.
5. The method of claim 4, wherein calculating weight predicted values from the bucket counters updated by the algebraic sum rule and by the weighted algebraic sum rule to obtain the weight predicted value of the directed edge comprises:
for the bucket counters updated by the algebraic sum rule, taking the minimum value over the corresponding buckets as the weight predicted value of the directed edge.
7. A binary-counting summarization apparatus based on graph data stream, the apparatus comprising:
the two-dimensional array abstract structure constructing module is used for constructing a two-dimensional array abstract structure; the two-dimensional array abstract structure comprises a plurality of two-dimensional arrays; each position in the two-dimensional array is called a bucket, and each bucket maintains a counter;
the weighted directed edge inserting module is used for inserting weighted directed edges into the two-dimensional array abstract structure, and calculating a source vertex and a destination vertex of each weighted directed edge by using two hash functions to obtain a row index and a column index of a bucket of each weighted directed edge;
the two-dimensional array abstract structure updating module is used for finding a corresponding bucket in the two-dimensional array abstract structure according to the row index and the column index of the bucket of the weighted directed edge, and updating a counter of the corresponding bucket by utilizing an algebraic sum rule and a weighted algebraic sum rule to obtain an updated two-dimensional array abstract structure;
the binary counting type abstract constructing module is used for calculating, according to the hash functions, the source vertex and the destination vertex of a directed edge of the updated two-dimensional array abstract structure to obtain the row index and the column index of the bucket corresponding to the directed edge; calculating weight predicted values from the bucket counters updated by the algebraic sum rule and by the weighted algebraic sum rule, respectively, to obtain the weight predicted value of the directed edge; and updating the updated two-dimensional array abstract structure by using the weight predicted value of the directed edge to obtain a binary counting abstract.
8. The apparatus of claim 7, wherein the two-dimensional array abstract structure updating module is further configured to update the counter of the corresponding bucket by using an algebraic sum rule and a weighted algebraic sum rule to obtain an updated counter of the bucket; and to construct an updated two-dimensional array abstract structure according to the updated counter of the bucket.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210248361.2A CN114722242A (en) | 2022-03-14 | 2022-03-14 | Binary counting type summarization method and device based on graph data stream and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210248361.2A CN114722242A (en) | 2022-03-14 | 2022-03-14 | Binary counting type summarization method and device based on graph data stream and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114722242A (en) | 2022-07-08 |
Family
ID=82238262
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210248361.2A Pending CN114722242A (en) | 2022-03-14 | 2022-03-14 | Binary counting type summarization method and device based on graph data stream and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114722242A (en) |
2022-03-14: CN application CN202210248361.2A (CN114722242A), active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918382A (en) | Data processing method, device, terminal and storage medium | |
CN111159329A (en) | Sensitive word detection method and device, terminal equipment and computer-readable storage medium | |
CN110162692B (en) | User label determination method and device, computer equipment and storage medium | |
CN112269830A (en) | Big data analysis method, system, computer equipment and storage medium thereof | |
CN106991080A (en) | A kind of quantile of data determines method and device | |
CN111078689B (en) | Data processing method and system of discontinuous pre-ordering traversal tree algorithm | |
CN105677645A (en) | Data sheet comparison method and device | |
CN109240893B (en) | Application running state query method and terminal equipment | |
US7734456B2 (en) | Method and apparatus for priority based data processing | |
CN112767032A (en) | Information processing method and device, electronic equipment and storage medium | |
CN114722242A (en) | Binary counting type summarization method and device based on graph data stream and computer equipment | |
CN111158732A (en) | Access data processing method and device, computer equipment and storage medium | |
CN116304251A (en) | Label processing method, device, computer equipment and storage medium | |
CN115759742A (en) | Enterprise risk assessment method and device, computer equipment and storage medium | |
Adan et al. | Analysis of structured Markov processes | |
Li et al. | Optimizing streaming graph partitioning via a heuristic greedy method and caching strategy | |
CN113780666A (en) | Missing value prediction method and device and readable storage medium | |
CN113360218A (en) | Service scheme selection method, device, equipment and storage medium | |
Kang | Stochastic coordinate-exchange optimal designs with complex constraints | |
Carrasco | Transient analysis of large Markov models with absorbing states using regenerative randomization | |
CN116501993B (en) | House source data recommendation method and device | |
Bordenave et al. | Markovian linearization of random walks on groups | |
CN113411395B (en) | Access request routing method, device, computer equipment and storage medium | |
CN112380494B (en) | Method and device for determining object characteristics | |
CN115827930B (en) | Data query optimization method, system and device for graph database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||