CN116595499B - Multi-department collaborative transaction data sharing traceability method - Google Patents
- Publication number
- CN116595499B (application number CN202310880579.4A)
- Authority
- CN
- China
- Prior art keywords
- transaction data
- attribute
- traceability
- acquiring
- tracing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/16—Program or content traceability, e.g. by watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/26—Discovering frequent patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to the technical field of transaction data processing and provides a multi-department collaborative transaction data sharing traceability method, comprising: acquiring transaction data in a sharing center; acquiring the attribute values and the attribute feature vector corresponding to each transaction data; acquiring the importance weight of each attribute according to the attribute values; acquiring the association stability of each attribute according to the attribute feature vectors; acquiring the attribute traceability index of each attribute according to the importance weight and the association stability; acquiring a database classification result according to the attribute feature vectors and the attribute traceability vectors; acquiring the tracing complexity according to the tracing difficulty of the transaction data; acquiring, according to the tracing complexity, the pruning strategy of a depth-first search algorithm during tracing on a directed acyclic graph; and completing the tracing of the transaction data of the sharing center. By analysing the tracing difficulty of different transaction data in different databases, the invention obtains the pruning strategy of the depth-first search algorithm on the directed acyclic graph, which improves tracing efficiency while preserving tracing accuracy.
Description
Technical Field
The invention relates to the technical field of transaction data processing, in particular to a multi-department collaborative transaction data sharing traceability method.
Background
Because the services and functions of different departments differ, a large amount of transaction data of many types is transferred among departments when they cooperate on a service. However, most transaction data are stored independently in each unit's own database and are difficult for other departments to access, so sharing is often obstructed. Safe and reliable shared traceability of multi-department transaction data therefore plays an important role in preventing privacy leakage of transaction data, preventing confusion of department authority, avoiding single points of failure, and so on.
Data tracing refers to determining the origin and integrity of data by tracking and analysing the flow path and change history of the data. The historical change information of the data is also called tracing information and includes the data source, the steps used to process the data, intermediate data, and so on. When a problem occurs while transaction data are in use, the tracing information makes it possible to locate where in the generation and processing of the data the problem may have arisen, which improves the utilization rate of the transaction data. Common data tracing approaches currently include the following. Methods based on data identification need extra storage space and are difficult to apply to tracing tasks with large data volumes. Methods based on data-flow tracing need a large amount of manual tracing and are inefficient when the tracing information is asymmetric. Methods based on data logs easily lead to several visitors sharing the same tracing information, and inaccurate visitor records for transaction data reduce the reliability of the tracing result.
Disclosure of Invention
The invention provides a multi-department collaborative transaction data sharing traceability method, which aims to solve the problem of low traceability efficiency in the process of carrying out transaction data traceability by utilizing a depth-first search algorithm, and adopts the following specific technical scheme:
the invention provides a multi-department collaborative transaction data sharing traceability method, which comprises the following steps:
acquiring transaction data and a processing mode in a flowing process thereof, and carrying out coding processing on the transaction data and the processing mode thereof;
acquiring attribute values and an attribute feature vector of each transaction data according to the decimal result of the coded transaction data, acquiring importance weights of different attributes of the transaction data according to the information gains of the attribute values of the transaction data, acquiring association stability of different attributes of the transaction data according to the attribute feature vectors of the transaction data in the data similarity set, and acquiring attribute traceability indexes of different attributes of the transaction data according to the importance weights and the association stability;
acquiring an attribute traceability vector of each transaction data according to the ordering result of the attribute traceability indexes, and acquiring database classification results of the transaction data with different contents according to the attribute feature vectors and the attribute traceability vectors;
acquiring a directed acyclic graph corresponding to each type of database according to the transaction data in each type of database, acquiring tracing complexity of the transaction data according to tracing difficulty of the transaction data, acquiring pruning strategies of a depth-first search algorithm in the tracing process of the directed acyclic graph according to the tracing complexity, and acquiring pruning results of nodes corresponding to each transaction data according to the pruning strategies;
and completing the tracing of the transaction data of the sharing center according to the pruning result of each node in the depth-first search algorithm, and recording and updating the tracing results in the blocks of the block chain in order of the tracing time of the transaction data.
Preferably, the method for encoding transaction data and the processing mode thereof comprises the following steps:
acquiring all transaction data of a sharing center, and recording the processing mode of each transaction data in a block chain mode in the transaction data sharing flow process, wherein the transaction data comprises type data of numbers, characters and letters, and the processing mode of the transaction data comprises the processing modes of creating, deleting, adding and moving the transaction data;
and coding all transaction data and processing modes of the sharing center by utilizing the UTF-8 coding technology, and converting different types of transaction data and processing modes of the transaction data into the same data format.
Preferably, the method for obtaining the attribute value and the attribute feature vector of each transaction data according to the decimal result corresponding to each transaction data includes:
for each transaction data, converting each eight-bit binary byte in the coding result into the corresponding decimal number, and obtaining the decimal result corresponding to the transaction data from the decimal numbers of all its eight-bit binary bytes;
acquiring a preset number of attribute values of the transaction data according to the decimal result corresponding to the transaction data, wherein the attribute values comprise the skewness, mean, standard deviation, variance, coefficient of variation, midrange, maximum, minimum, mode and outlier ratio of the decimal numbers in the decimal result, and taking the vector formed by all the attribute values as the attribute feature vector of the transaction data.
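The byte-to-decimal conversion and the ten attribute values can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, population (not sample) statistics are used, and the outlier rule (more than `z` standard deviations from the mean) is an assumption, since the text does not define how abnormal values are identified.

```python
import statistics
from collections import Counter


def decimal_result(text: str) -> list[int]:
    """Encode text as UTF-8 and return each byte as a decimal number (0-255)."""
    return list(text.encode("utf-8"))


def attribute_feature_vector(values: list[int], z: float = 2.0) -> list[float]:
    """Return the 10 attribute values in the order listed in the text.

    Assumes `values` is non-empty.
    """
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)  # population variance
    std = var ** 0.5
    # Population skewness; taken as 0 when all values are equal.
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
    cv = std / mean if mean else 0.0  # coefficient of variation
    hi, lo = max(values), min(values)
    midrange = (hi + lo) / 2
    mode = Counter(values).most_common(1)[0][0]
    # ASSUMPTION: a value is an outlier if it lies more than z std from the mean.
    outliers = sum(1 for v in values if std and abs(v - mean) > z * std)
    return [skew, mean, std, var, cv, midrange,
            float(hi), float(lo), float(mode), outliers / n]
```

For example, `attribute_feature_vector(decimal_result("some record"))` yields one 10-element vector per transaction data; any other outlier rule can be substituted without changing the rest of the pipeline.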
Preferably, the method for obtaining the importance weights of different attributes of the transaction data according to the information gain of the attribute values of the transaction data comprises the following steps:
the method comprises the steps of obtaining attribute values of all transaction data contained in a sharing center, taking attribute values representing the same attribute as attribute values of the same type of attribute, obtaining information gain of each type of attribute by using an information gain algorithm, and taking the ratio of the information gain of each type of attribute value to the accumulated sum of the information gains of all types of attribute values as an importance weight of each type of attribute.
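A minimal sketch of this weighting step is given below. It assumes the attribute values have already been discretized and that some class label is available for each transaction data; the text does not say which labels the information gain is computed against, so `labels` is a hypothetical input, as are the function names.

```python
import math
from collections import Counter


def entropy(labels: list) -> float:
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attr_values: list, labels: list) -> float:
    """Information gain of one discrete attribute with respect to the labels."""
    n = len(labels)
    groups: dict = {}
    for value, label in zip(attr_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def importance_weights(info_gains: list[float]) -> list[float]:
    """Each attribute's gain divided by the sum of all gains, as in the text."""
    total = sum(info_gains)
    if total <= 0:
        raise ValueError("sum of information gains must be positive")
    return [g / total for g in info_gains]
```

The weights sum to 1, so an attribute that separates the transaction data well receives a proportionally larger share.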
Preferably, the method for obtaining the association stability of different attributes of the transaction data according to the attribute feature vector of the transaction data in the data similarity set comprises the following steps:
in the formula (not reproduced in this text), the association stability of the b-th attribute of the a-th transaction data is computed over the data similarity set of the a-th transaction data, where m is the number of transaction data in that set and c denotes the c-th transaction data in the set; the formula uses the values of the b-th attribute of the a-th and c-th transaction data, the attribute feature vectors of the a-th and c-th transaction data, the cosine similarity between those attribute feature vectors, and a parameter adjustment factor.
Preferably, the method for obtaining attribute traceability indexes of different attributes of the transaction data according to the importance weight and the associated stability comprises the following steps:
for any attribute of each transaction data, acquiring the association stability of each type of attribute of the transaction data, taking the importance weight of each type of attribute as a first product factor, taking the difference result of the association stability of each type of attribute and the average value of the association stability of all attributes of the transaction data as a second product factor, and taking the product of the first product factor and the second product factor of each type of attribute as the attribute traceability index of each type of attribute.
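The product described above can be sketched in a few lines. The function name is hypothetical; `weights` and `stabilities` are assumed to be given per attribute, in the same order, for a single transaction data.

```python
def attribute_traceability_indexes(weights: list[float],
                                   stabilities: list[float]) -> list[float]:
    """Index of each attribute: weight * (stability - mean stability)."""
    if len(weights) != len(stabilities):
        raise ValueError("weights and stabilities must have the same length")
    mean_stability = sum(stabilities) / len(stabilities)
    return [w * (s - mean_stability) for w, s in zip(weights, stabilities)]
```

An attribute whose association stability is below the mean for that transaction data receives a negative index, so it ranks below every attribute with above-average stability regardless of its weight.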
Preferably, the method for obtaining the database classification result of the transaction data with different contents according to the attribute feature vector and the attribute traceability vector comprises the following steps:
acquiring attribute traceability indexes of each type of attribute of each transaction data, and taking the attribute traceability indexes of all attributes of each transaction data as attribute traceability vectors of each transaction data according to the sequence from big to small;
taking the attribute feature vector of each transaction data as a first row vector, taking the attribute traceability vector of each transaction data as a second row vector, and taking a matrix formed by the first row vector and the second row vector as a feature matrix of each transaction data;
and obtaining the measurement distance between the transaction data according to the similarity between the feature matrixes corresponding to the transaction data, obtaining clustering results of all the transaction data by using a DBSCAN clustering algorithm, and taking each clustering cluster as a database of the transaction data.
Preferably, the method for obtaining the measurement distance between the transaction data according to the similarity between the feature matrices corresponding to the transaction data comprises the following steps:
in the formula (not reproduced in this text), the measurement distance between the a-th and c-th transaction data is defined in terms of the attribute feature vectors of the a-th and c-th transaction data, the attribute traceability vectors of the a-th and c-th transaction data, the Pearson correlation coefficient between the two attribute feature vectors, and the Pearson correlation coefficient between the two attribute traceability vectors.
Preferably, the method for obtaining the tracing complexity of the transaction data according to the tracing difficulty of the transaction data comprises the following steps:
in the formulas (not reproduced in this text), the processing path complexity of the a-th transaction data is defined in terms of: the number of times the a-th transaction data serves as a data-processing input; the number of nodes on the path for the f-th such use as input; i, the i-th node on that path; the similarity between the transaction data corresponding to the i-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the i-th node and that of the similarity sequence corresponding to the (i+1)-th node;
the tracing path complexity of the a-th transaction data is defined in terms of: the number of times the a-th transaction data serves as a data-processing output; the number of nodes on the path for the k-th such use as output; j, the j-th node on that path; the similarity between the transaction data corresponding to the j-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the j-th node and that of the similarity sequence corresponding to the (j-1)-th node;
the tracing complexity of the a-th transaction data is obtained from the processing path complexity and the tracing path complexity, together with the average similarity between the a-th transaction data and all transaction data in the database where it is located and a parameter adjustment factor.
Preferably, the method for acquiring the pruning strategy of the depth-first search algorithm in the directed acyclic graph tracing process according to the tracing complexity comprises the following steps:
acquiring the tracing complexity of all the transaction data in each database, taking the tracing complexity as a mark of a corresponding node of the transaction data in the directed acyclic graph, acquiring the difference value of the tracing complexity of two nodes connected in the directed acyclic graph, and taking two nodes which have absolute values of the difference value of the tracing complexity smaller than a threshold value and have no data processing relationship as pruning node pairs;
and when the depth-first search (DFS) algorithm is used to trace the transaction data corresponding to one node of a pruning node pair, pruning all paths that start from the other node of the pair.
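The pruning strategy can be illustrated with the sketch below. It rests on several assumptions that the text does not settle: the DAG is an adjacency dictionary, "no data processing relationship" is read as "no direct edge in either direction", and the search visits the DAG from a given list of root nodes. All function and parameter names are hypothetical.

```python
from itertools import combinations


def has_edge(graph: dict, u, v) -> bool:
    """True if there is a direct edge between u and v in either direction."""
    return v in graph.get(u, ()) or u in graph.get(v, ())


def pruning_pairs(graph: dict, complexity: dict, threshold: float) -> set:
    """Node pairs with close tracing complexity and no direct edge (assumed reading)."""
    pairs = set()
    for u, v in combinations(sorted(complexity), 2):
        close = abs(complexity[u] - complexity[v]) < threshold
        if close and not has_edge(graph, u, v):
            pairs.add(frozenset((u, v)))
    return pairs


def dfs_trace(graph: dict, roots: list, target, pairs: set) -> list:
    """Iterative DFS over the DAG that skips every pruning partner of `target`."""
    pruned = {n for p in pairs if target in p for n in p if n != target}
    order, seen = [], set()
    stack = list(reversed(roots))
    while stack:
        node = stack.pop()
        if node in seen or node in pruned:
            continue  # paths starting at a pruned node are never expanded
        seen.add(node)
        order.append(node)
        # reversed() keeps the left-to-right child order of a recursive DFS
        stack.extend(reversed(graph.get(node, ())))
    return order
```

Because a pruned node is skipped before its successors are pushed, every path that starts from it is cut off, while nodes that are also reachable from unpruned nodes are still visited.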
The beneficial effects of the invention are as follows: the method and the system construct the attribute traceability index through analyzing the attribute feature vector and the content similarity of the transaction data coding result, and the attribute traceability index considers the attribute feature and the content similarity of the transaction data. And constructing tracing complexity based on tracing difficulty of different transaction data in the database corresponding DAG, wherein the tracing complexity considers different characteristics of the tracing difficulty of the different transaction data.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
FIG. 1 is a schematic flow chart of a multi-department collaborative transaction data sharing traceability method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of tracing difficulty of different nodes in a directed acyclic graph according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flowchart of a multi-department collaborative transaction data sharing tracing method according to an embodiment of the present invention is shown, and the method includes the following steps:
step S001, obtaining transaction data and a processing mode thereof in a sharing flow process, and recording the transaction data and a coding result of the processing mode thereof in a block chain mode.
During the cooperative work of multiple departments, the transaction data of the departments are stored in a transaction data sharing center, and each department acquires the transaction data it needs according to its established responsibilities and assigned tasks. The transaction data comprise several different types of data such as numbers, characters and letters, and the processing of the transaction data comprises several different processing modes such as creation, deletion, addition and movement. In the invention, the operations performed during the flow of the transaction data are recorded in the form of a block chain. Each block in the block chain consists of a block header and a block body: the block header contains information such as the hash value, the random number, the time and the Merkle root, and the block body contains information such as the transaction information and the data processing.
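A minimal sketch of such a block is shown below. The field names, the use of SHA-256, the reading of the header "hash value" as the previous block's hash, and the Merkle construction that duplicates the last node on odd levels are all assumptions for illustration; the text names the header fields but does not specify their layout.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(records: list[str]) -> str:
    """Hash records pairwise until one root remains (last node duplicated if odd)."""
    if not records:
        return sha256_hex(b"")
    level = [sha256_hex(r.encode("utf-8")) for r in records]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
                 for i in range(0, len(level), 2)]
    return level[0]


@dataclass
class Block:
    prev_hash: str       # ASSUMPTION: the header "hash value" links to the previous block
    records: list[str]   # block body: transaction information and processing operations
    nonce: int = 0       # the "random number" in the header
    timestamp: float = field(default_factory=time.time)

    def header(self) -> dict:
        return {"prev_hash": self.prev_hash, "nonce": self.nonce,
                "time": self.timestamp, "merkle_root": merkle_root(self.records)}

    def header_hash(self) -> str:
        return sha256_hex(json.dumps(self.header(), sort_keys=True).encode("utf-8"))
```

Since the Merkle root is part of the header, changing any recorded processing operation changes the header hash, which is what makes the recorded processing history tamper-evident.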
All transaction data of the transaction data sharing center are coded, different types of transaction data and processing modes of the transaction data are converted into the same data format, the conversion of the data format is completed by adopting a UTF-8 coding method, UTF-8 coding is a known technology, and specific processes are not repeated.
Thus, the transaction data and the coding result of the processing mode are obtained.
And step S002, constructing an attribute traceability index based on the attribute characteristics and the content similarity among the transaction data, and obtaining a database classification result based on the attribute characteristic vector and the attribute traceability vector.
During the cooperative work of several departments, each department acquires the transaction data it needs from the transaction data sharing center. When a department finds the acquired transaction data confused, or when abnormal events such as privacy leakage occur, the transaction data must be traced to its source to identify the specific person who caused the problem. When a piece of transaction data needs to be traced, the invention first finds, among the blocks of the block chain, the database whose data content matches that transaction data, and then obtains the tracing result of the transaction data from that database.
Tracing transaction data means finding, in the transaction data sharing center, the transaction data that may be the source of the data being traced, so the tracing process is also a process of finding similar transaction data. Different departments have different responsibilities and need different transaction data; in addition, different staff in the same department process the transaction data differently, so the encoding results of the transaction data differ to varying degrees.
To improve the tracing efficiency of the transaction data, the invention classifies the transaction data in the transaction data sharing center. For any transaction data, a letter is usually encoded as 1 byte, a Chinese character as 3 bytes, and more complex symbols as 4 bytes; that is, the encoded transaction data consists of binary bytes. The transaction data of the sharing center are therefore classified by extracting the attribute features of each transaction data, which speeds up classification while preserving the content of the transaction data and allows the subsequent tracing process to proceed quickly.
For the a-th transaction data, the decimal number corresponding to each eight-bit binary byte in the coding result is acquired to obtain the decimal result F(a) of the a-th transaction data. Ten attribute values of the decimal numbers in F(a) are then acquired: skewness, mean, standard deviation, variance, coefficient of variation, midrange, maximum, minimum, mode and outlier ratio. The vector formed by these 10 attribute values is taken as the attribute feature vector of the a-th transaction data. The total number of transaction data in the sharing center is denoted K. The attribute feature vectors of the K transaction data are acquired, and the information gain of each attribute is then obtained with an information gain algorithm, starting with the information gain of the 1st attribute, skewness. Information gain is a known technique, and its specific process is not described in detail.
Secondly, the longest common subsequence (LCS) between the coding result of the a-th transaction data and the coding result of each of the remaining K-1 transaction data is acquired, and the mean of the lengths of these K-1 longest common subsequences is recorded. The transaction data whose longest common subsequence is longer than this mean form the data similarity set of the a-th transaction data.
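The construction of the data similarity set can be sketched with a standard dynamic-programming LCS. The function names are hypothetical, and the coding results are assumed to be available as Python `bytes` objects, with at least two transaction data in the sharing center.

```python
def lcs_length(x: bytes, y: bytes) -> int:
    """Length of the longest common subsequence, using O(len(y)) memory."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if xi == yj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def data_similarity_set(encoded: list[bytes], a: int) -> list[int]:
    """Indices c != a whose LCS length with item a exceeds the mean LCS length."""
    lengths = {c: lcs_length(encoded[a], encoded[c])
               for c in range(len(encoded)) if c != a}
    mean_len = sum(lengths.values()) / len(lengths)
    return [c for c, n in lengths.items() if n > mean_len]
```

Each LCS costs time proportional to the product of the two encoding lengths, so building the set for one transaction data is linear in K but can be expensive for long records.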
Based on the above analysis, an attribute traceability index V is constructed to represent the importance of the different attributes of the transaction data in the tracing process, and the attribute traceability index of attribute b of the a-th transaction data is calculated from the following quantities (the formula images are not reproduced in this text):
The importance weight of attribute b is the ratio of the information gain of attribute b to the sum of the information gains of all n attributes, where n is the number of attributes extracted from the transaction data; in the invention n takes the empirical value 10. The larger the importance weight, the stronger the ability of attribute b to classify the transaction data of the sharing center.
The association stability of attribute b of the a-th transaction data is computed over the data similarity set of the a-th transaction data, where m is the number of transaction data in that set and c denotes the c-th transaction data in the set. It depends on the values of attribute b of the a-th and c-th transaction data, on the cosine similarity between the attribute feature vectors of the a-th and c-th transaction data, and on a parameter adjustment factor whose role is to keep the denominator from being 0 and whose value is 0.001. Cosine similarity is a known technique and is not described in detail. The larger the association stability, the more stable the association between attribute b and each transaction data in the data similarity set.
The attribute traceability index of attribute b of the a-th transaction data is the product of the importance weight of attribute b and the difference between the association stability of attribute b and the average association stability of all attributes of the a-th transaction data.
The attribute traceability index reflects the importance of the different attributes of the transaction data in the tracing process. The larger the difference in the distribution of attribute b values across the transaction data in the sharing center, the stronger the ability of attribute b to classify the transaction data, the larger the information gain of attribute b, and the larger its importance weight. The more similar the content of the transaction data within the data similarity set, the larger the cosine similarity between their attribute feature vectors; the closer the values of attribute b across these transaction data, the smaller the difference in attribute b between transaction data with similar content, and the more stable the association of attribute b with the a-th transaction data. That is, the larger the attribute traceability index, the more important the role that attribute b of the a-th transaction data plays in its tracing process. The attribute traceability index takes into account both the attribute features and the content similarity of the transaction data. Its benefit is that the transaction data are classified on the basis of their content, which helps determine the node positions and the directions between nodes in the directed acyclic graph (DAG) during the subsequent tracing process and improves the efficiency of transaction data tracing.
Further, attribute traceability indexes of each attribute are obtained, attribute traceability vectors of the a-th transaction data are obtained according to the attribute traceability indexes in sequence from large to small, and feature matrixes of the a-th transaction data are constructed by utilizing attribute feature vectors and attribute traceability vectors of the a-th transaction data, wherein the construction process of the feature matrixes is as follows: taking the attribute feature vector of the a-th transaction data as a first row vector, taking the attribute tracing vector of the a-th transaction data as a second row vector, and taking a matrix formed by the first row vector and the second row vector as a feature matrix of the a-th transaction data.
Further, the feature matrices of all transaction data in the data similarity set are obtained, the similarity between feature matrices is computed, and this similarity is used to construct the measurement distance in the DBSCAN clustering algorithm. The transaction data are clustered with the DBSCAN clustering algorithm, with the radius set to 0.5 and minPts set to 15, and each resulting cluster is used as a database of the transaction data. The similarity between feature matrices is composed of the similarities between the row vectors that make up the feature matrices. The measurement distance between the a-th and c-th transaction data is calculated:
In the formula, the quantities involved are the attribute feature vectors of the a-th and c-th transaction data, the attribute traceability vectors of the a-th and c-th transaction data, and the Pearson correlation coefficients between the two attribute feature vectors and between the two attribute traceability vectors, respectively.
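The following sketch shows how a pairwise distance matrix could be built from the two Pearson correlation coefficients. The exact distance formula of the invention is not reproduced in this text, so the mapping used here (one minus the mean of the two correlations) is an assumption chosen purely for illustration, as are all function names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as 0 in this sketch
    return cov / math.sqrt(var_x * var_y)


def measurement_distance(matrix_a, matrix_c):
    """ASSUMED distance: 1 minus the mean of the two row-wise correlations."""
    rho_features = pearson(matrix_a[0], matrix_c[0])  # attribute feature vectors
    rho_trace = pearson(matrix_a[1], matrix_c[1])     # attribute traceability vectors
    return 1.0 - (rho_features + rho_trace) / 2.0


def distance_matrix(matrices):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(matrices)
    return [[0.0 if i == j else measurement_distance(matrices[i], matrices[j])
             for j in range(n)] for i in range(n)]


m1 = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
m2 = [[2.0, 4.0, 6.0], [6.0, 4.0, 2.0]]
m3 = [[3.0, 2.0, 1.0], [1.0, 2.0, 3.0]]
distances = distance_matrix([m1, m2, m3])
```

A matrix of this kind could then be handed to any DBSCAN implementation that accepts precomputed distances, for example scikit-learn's `DBSCAN(eps=0.5, min_samples=15, metric="precomputed")`, using the radius and minPts values given in the description.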
Thus, the database classification results for transaction data with different contents in the blockchain are obtained.
Step S003: construct the processing path complexity and the trace-out path complexity based on the tracing difficulty of the transaction data in the directed acyclic graph, and obtain the tracing complexity from the processing path complexity and the trace-out path complexity.
Further, in the invention, the transaction data sharing center models the transaction data, that is, the different transaction data and the processing operations applied to them are represented as nodes, and the flow direction and processing order of the transaction data give the direction between nodes. If two transaction data in a database have the same processing mode and flow direction, their tracing difficulty in the DAG is similar. In the illustrated example, node 1 and node 2 respectively represent two instruction documents issued by a certain department. Each subordinate department needs to obtain the instruction documents and, after replying to them, distribute them to all of its staff. In this process the processing modes and flow directions of the transaction data corresponding to node 1 and node 2 are the same, so the tracing difficulty of the two nodes in the DAG containing node 1 and node 2 is similar.
In the invention, the DAG corresponding to each database is considered, and transaction data tracing is performed on the directed acyclic graph with the depth-first search (DFS) algorithm. During tracing, pruning is applied selectively to different transaction data according to their tracing difficulty. For example, if the tracing difficulties of node 1 and node 2 are similar, the tracing result of the transaction data at node 1 is considered not to exist at node 2, and node 2 is pruned when the depth-first search algorithm searches from node 1. From the data processing records of the transaction data in the blockchain, the number of times each node serves as a data processing result and the number of times it serves as a data processing input are obtained respectively. Each data processing operation adds one more path through the node in the DAG; a schematic diagram of the directed acyclic graph DAG is shown in Fig. 2. Taking a path on which the node serves as a data processing input as an example, the transaction data at the node reaches the next node through data processing; the greater the similarity between the transaction data at the two nodes, the simpler the change the transaction data undergoes in that processing step, and the easier the tracing.
Based on the analysis, a tracing complexity R is constructed here and used for representing the complexity of deep search of nodes corresponding to different transaction data in the DAG of each transaction database, and the tracing complexity of the transaction data a is calculated:
In the formula, the processing path complexity of the a-th transaction data is defined in terms of the following quantities: the number of times the a-th transaction data is used as the starting input of data processing; the number of nodes on the path for the f-th such use, with i denoting the i-th node on that path; the similarity between the transaction data corresponding to the i-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the i-th node and the Hurst exponent of the similarity sequence corresponding to the (i+1)-th node. The similarity sequence is the sequence formed, in node order, by the similarities between the a-th transaction data and the transaction data corresponding to each node on the path for the f-th use as data processing input. The Hurst exponent is a known technique, and its specific process is not repeated.
The trace-out path complexity of the a-th transaction data is defined in terms of the following quantities: the number of times the a-th transaction data is produced as a data processing output; the number of nodes on the path for the k-th such time, with j denoting the j-th node on that path; the similarity between the transaction data corresponding to the j-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the j-th node and the Hurst exponent of the similarity sequence corresponding to the (j-1)-th node. Here the similarity sequence is the sequence formed, in node order, by the similarities between the a-th transaction data and the transaction data corresponding to each node on the path for the k-th time as data processing output.
The formula also uses the average of the similarities between the a-th transaction data and all transaction data in the database in which it is located, and a parameter adjusting factor whose role is to avoid a denominator of 0; the value of the parameter adjusting factor is 0.001.
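The Hurst exponent is treated above as known technology. As a hedged illustration of what such an estimate involves, the sketch below uses a simplified rescaled-range (R/S) analysis on a numeric sequence. The window sizes, the function names and the omission of small-sample corrections are assumptions of this sketch, not the procedure of the invention.

```python
import math


def rescaled_range(chunk):
    """R/S statistic of one window, or None when the window is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean          # cumulative deviation from the mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None


def hurst_exponent(series, window_sizes=(4, 8, 16)):
    """Slope of log(mean R/S) against log(window size)."""
    points = []
    for size in window_sizes:
        chunks = [series[s:s + size] for s in range(0, len(series) - size + 1, size)]
        values = [r for r in map(rescaled_range, chunks) if r is not None and r > 0]
        if values:
            points.append((math.log(size), math.log(sum(values) / len(values))))
    if len(points) < 2:
        raise ValueError("series too short or too flat for the chosen window sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


trend = [float(t) for t in range(32)]   # steadily increasing sequence
alternating = [1.0, -1.0] * 16          # strictly alternating sequence
```

With these inputs the steadily increasing sequence yields an estimate close to 1 and the alternating sequence an estimate of 0, which matches the usual reading of the exponent as a measure of persistence. Similarity sequences along a short path contain few values, so any such estimate on them would be coarse.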
The tracing complexity reflects how complex the depth-first search is for the nodes corresponding to different transaction data in the DAG of each transaction database. When the a-th transaction data is used as a data processing input, the simpler the data operations performed at the nodes on the path, the smaller the change in the transaction data, and the smaller the processing path complexity. When the a-th transaction data is produced as a data processing output, the more numerous and complex the data operations it has undergone, the lower its similarity to the transaction data before processing, and the larger the trace-out path complexity. That is, the larger the tracing complexity, the more complex the data processing performed at the node corresponding to the a-th transaction data, and the more difficult the depth-first search during tracing.
Further, the tracing complexity of all transaction data in each database is obtained and used as the label of the corresponding node. If the absolute value of the difference between the tracing complexities of two nodes in the DAG is smaller than a threshold, set to 0.3, and no data processing relationship exists between the two nodes, then when the depth-first search (DFS) algorithm traces data from one of the nodes, all paths starting from the other node are pruned. The depth-first search DFS algorithm is a well-known technique, and its specific process is not described in detail.
The tracing complexity takes into account that different transaction data differ in tracing difficulty. Its beneficial effect is that the pruning strategy reduces the number of paths and nodes searched, so that pruning is applied adaptively to different nodes in the depth-first search DFS algorithm, which improves the efficiency of tracing transaction data in the DAG.
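The pruning rule can be sketched as an iterative depth-first search that skips every node paired with the start node. This is an illustrative sketch under stated assumptions: the DAG is a dictionary of successor lists, "no data processing relationship" is interpreted as "no direct edge in either direction", and the node names and complexity values are hypothetical.

```python
def pruned_nodes(start, edges, complexity, threshold=0.3):
    """Nodes forming a pruning pair with `start` under the assumed reading."""
    pruned = set()
    for node, value in complexity.items():
        if node == start:
            continue
        # ASSUMPTION: a data processing relationship means a direct edge.
        related = node in edges.get(start, ()) or start in edges.get(node, ())
        if not related and abs(value - complexity[start]) < threshold:
            pruned.add(node)
    return pruned


def trace(start, edges, complexity, threshold=0.3):
    """Depth-first traversal from `start` that never enters a pruned node."""
    skip = pruned_nodes(start, edges, complexity, threshold)
    visited, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen or node in skip:
            continue
        seen.add(node)
        visited.append(node)
        # Push successors in reverse so they are visited in listed order.
        for successor in reversed(edges.get(node, [])):
            stack.append(successor)
    return visited


# Hypothetical DAG: n1 -> n3 -> n5 and n2 -> n4 -> n5.
edges = {"n1": ["n3"], "n2": ["n4"], "n3": ["n5"], "n4": ["n5"], "n5": []}
complexity = {"n1": 1.0, "n2": 1.1, "n3": 2.0, "n4": 2.5, "n5": 3.0}
order = trace("n1", edges, complexity)
```

In this example n1 and n2 differ by 0.1 in tracing complexity and share no edge, so tracing from n1 prunes n2 and, with it, the branch through n4.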
So far, the tracing result of each transaction data in the database is obtained.
Step S004: mark the tracing results according to tracing time, and obtain the corresponding tracing result according to the transaction data sharing time.
The flow path, change process and source of the transaction data are analyzed from its tracing result. The tracing result is stored in blockchain form in the transaction data sharing center: the tracing time, tracing complexity and similar items are stored in the block header, and the flow path, the change process of the transaction data and similar items are stored in the block body.
Further, each time the transaction data is shared, the previous tracing result is updated once in the block body of the blockchain. In chronological order, the original tracing result is marked as the first sharing tracing result and the updated result as the second sharing tracing result. When a relevant department has doubts about shared transaction data, the corresponding tracing result is obtained according to the sharing time of the transaction data, and the data source, processing steps and results recorded in the tracing result are verified in detail. For example, during transaction data tracing it is verified whether each edge traversed is legal, that is, whether the data was transmitted and changed along the expected flow path. This can be checked by comparing the operation records of the data with the node identifiers, which ensures the accuracy of transaction data sharing.
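A minimal sketch of storing tracing results as hash-linked blocks and looking one up by sharing time is given below. The class names, the field layout, the use of SHA-256 and the lookup rule (latest tracing result not later than the sharing time) are assumptions made for illustration; the description itself only states which items go into the block header and the block body.

```python
import bisect
import hashlib
import json
from dataclasses import dataclass


@dataclass
class TraceBlock:
    trace_time: float   # header: tracing time
    complexity: float   # header: tracing complexity
    prev_hash: str      # header: digest of the previous block
    flow_path: list     # body: flow path of the transaction data
    changes: list       # body: change process of the transaction data

    def digest(self):
        payload = json.dumps(
            {"header": [self.trace_time, self.complexity, self.prev_hash],
             "body": [self.flow_path, self.changes]},
            sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TraceChain:
    """Append-only chain; blocks are assumed to arrive in chronological order."""

    def __init__(self):
        self.blocks = []

    def append(self, trace_time, complexity, flow_path, changes):
        prev_hash = self.blocks[-1].digest() if self.blocks else "0" * 64
        block = TraceBlock(trace_time, complexity, prev_hash,
                           list(flow_path), list(changes))
        self.blocks.append(block)
        return block

    def result_at(self, sharing_time):
        """Latest tracing result whose tracing time is not after sharing_time."""
        times = [block.trace_time for block in self.blocks]
        index = bisect.bisect_right(times, sharing_time) - 1
        return self.blocks[index] if index >= 0 else None

    def is_intact(self):
        """True when every block still points at the digest of its predecessor."""
        return all(cur.prev_hash == prev.digest()
                   for prev, cur in zip(self.blocks, self.blocks[1:]))


chain = TraceChain()
chain.append(1.0, 0.8, ["n1", "n3"], ["create"])
chain.append(2.0, 1.1, ["n1", "n3", "n5"], ["create", "add"])
first_result = chain.result_at(1.5)
```

Linking each header to the digest of the previous block means that a later change to an earlier tracing result is detectable, which supports the verification step described above.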
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (6)
1. The multi-department collaborative transaction data sharing traceability method is characterized by comprising the following steps of:
acquiring transaction data and a processing mode in a flowing process thereof, and carrying out coding processing on the transaction data and the processing mode thereof;
acquiring attribute values and attribute feature vectors of each transaction data according to the coded decimal results of each transaction data, acquiring importance weights of different attributes of the transaction data according to information gains of the attribute values of the transaction data, acquiring association stability of different attributes of the transaction data according to the attribute feature vectors of the transaction data in a data similarity set, acquiring attribute traceability indexes of different attributes of the transaction data according to the importance weights and the association stability, acquiring attribute traceability vectors of each transaction data according to ordering results of the attribute traceability indexes, and acquiring database classification results of the transaction data with different contents according to the attribute feature vectors and the attribute traceability vectors;
acquiring a directed acyclic graph corresponding to each type of database according to the transaction data in each type of database, acquiring tracing complexity of the transaction data according to tracing difficulty of the transaction data, acquiring pruning strategies of a depth-first search algorithm in the tracing process of the directed acyclic graph according to the tracing complexity, and acquiring pruning results of nodes corresponding to each transaction data according to the pruning strategies;
according to the pruning result of each node in the depth-first search algorithm, tracing the transaction data of the sharing center, and according to the sequence of the tracing time of the transaction data, recording and updating the tracing result in the block;
the method for acquiring the attribute value and the attribute feature vector of each transaction data according to the decimal result corresponding to each transaction data comprises the following steps:
for each transaction data, converting each eight-bit binary byte in the coding result into a corresponding decimal number, and obtaining the decimal result corresponding to each transaction data from the decimal numbers corresponding to all eight-bit binary bytes;
acquiring a preset number of attribute values of the transaction data according to the decimal result corresponding to the transaction data, wherein the attribute values comprise the skewness, mean, standard deviation, variance, coefficient of variation, median, maximum, minimum, mode and abnormal-value ratio of the decimal numbers in the decimal result, and taking the vector formed by all the attribute values as the attribute feature vector of the transaction data;
the method for acquiring the importance weight of different attributes of the transaction data according to the information gain of the attribute value of the transaction data comprises the following steps:
acquiring attribute values of all transaction data contained in a sharing center, taking attribute values representing the same attribute as attribute values of the same type of attribute, acquiring information gain of each type of attribute by using an information gain algorithm, and taking the ratio of the information gain of each type of attribute value to the accumulated sum of the information gains of all types of attribute values as the importance weight of each type of attribute;
the method for acquiring the association stability of different attributes of the transaction data according to the attribute feature vector of the transaction data in the data similarity set comprises the following steps:
in the formula, the quantities are: the association stability of the b-th attribute of the a-th transaction data; m, the number of transaction data in the data similarity set; c, denoting the c-th transaction data in the data similarity set; the values of the b-th attribute of the a-th and c-th transaction data, respectively; the attribute feature vectors of the a-th and c-th transaction data, respectively; the cosine similarity between the attribute feature vectors; and a parameter adjusting factor;
the method for acquiring the attribute traceability indexes of different attributes of the transaction data according to the importance weight and the associated stability comprises the following steps:
for any attribute of each transaction data, acquiring the association stability of each type of attribute of the transaction data, taking the importance weight of each type of attribute as a first product factor, taking the difference result of the association stability of each type of attribute and the average value of the association stability of all attributes of the transaction data as a second product factor, and taking the product of the first product factor and the second product factor of each type of attribute as the attribute traceability index of each type of attribute.
2. The multi-department collaborative transaction data sharing traceability method according to claim 1, wherein the method for encoding transaction data and processing modes thereof is as follows:
acquiring all transaction data of a sharing center, and recording the processing mode of each transaction data in a block chain mode in the transaction data sharing flow process, wherein the transaction data comprises type data of numbers, characters and letters, and the processing mode of the transaction data comprises the processing modes of creating, deleting, adding and moving the transaction data;
and coding all transaction data and processing modes of the sharing center by utilizing the UTF-8 coding technology, and converting different types of transaction data and processing modes of the transaction data into the same data format.
3. The method for sharing and tracing the transaction data of multiple departments according to claim 1, wherein the method for obtaining the database classification results of the transaction data of different contents according to the attribute feature vector and the attribute tracing vector is as follows:
acquiring attribute traceability indexes of each type of attribute of each transaction data, and taking the attribute traceability indexes of all attributes of each transaction data as attribute traceability vectors of each transaction data according to the sequence from big to small;
taking the attribute feature vector of each transaction data as a first row vector, taking the attribute traceability vector of each transaction data as a second row vector, and taking a matrix formed by the first row vector and the second row vector as a feature matrix of each transaction data;
and obtaining the measurement distance between the transaction data according to the similarity between the feature matrixes corresponding to the transaction data, obtaining clustering results of all the transaction data by using a DBSCAN clustering algorithm, and taking each clustering cluster as a database of the transaction data.
4. The multi-department collaborative transaction data sharing traceability method according to claim 3, wherein the method for obtaining the measurement distance between the transaction data according to the similarity between the corresponding feature matrices of the transaction data comprises the following steps:
in the formula, the quantities are: the measurement distance between the a-th and c-th transaction data; the attribute feature vectors of the a-th and c-th transaction data, respectively; the attribute traceability vectors of the a-th and c-th transaction data, respectively; and the Pearson correlation coefficients between the two attribute feature vectors and between the two attribute traceability vectors, respectively.
5. The multi-department collaborative transaction data sharing traceability method according to claim 1, wherein the method for obtaining traceability complexity of transaction data according to traceability difficulty of the transaction data comprises the following steps:
in the formula, the processing path complexity of the a-th transaction data is defined in terms of: the number of times the a-th transaction data is used as the starting input of data processing; the number of nodes on the path for the f-th such use, with i denoting the i-th node on that path; the similarity between the transaction data corresponding to the i-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the i-th node and the Hurst exponent of the similarity sequence corresponding to the (i+1)-th node;
the trace-out path complexity of the a-th transaction data is defined in terms of: the number of times the a-th transaction data is produced as a data processing output; the number of nodes on the path for the k-th such time, with j denoting the j-th node on that path; the similarity between the transaction data corresponding to the j-th node and the a-th transaction data; and the absolute value of the difference between the Hurst exponent of the similarity sequence corresponding to the j-th node and the Hurst exponent of the similarity sequence corresponding to the (j-1)-th node;
the traceability complexity of the a-th transaction data is further defined in terms of the average of the similarities between the a-th transaction data and all transaction data in the database in which it is located, and a parameter adjusting factor.
6. The multi-department collaborative transaction data sharing traceability method according to claim 1, wherein the method for obtaining pruning strategies of a depth-first search algorithm in a directed acyclic graph traceability process according to traceability complexity is as follows:
acquiring the tracing complexity of all the transaction data in each database, taking the tracing complexity as a mark of a corresponding node of the transaction data in the directed acyclic graph, acquiring the difference value of the tracing complexity of two nodes connected in the directed acyclic graph, and taking two nodes which have absolute values of the difference value of the tracing complexity smaller than a threshold value and have no data processing relationship as pruning node pairs;
and when tracing the transaction data corresponding to one node of a pruning node pair with the depth-first search DFS algorithm, pruning all paths that start from the other node of the pruning node pair.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310880579.4A CN116595499B (en) | 2023-07-18 | 2023-07-18 | Multi-department collaborative transaction data sharing traceability method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116595499A CN116595499A (en) | 2023-08-15 |
CN116595499B CN116595499B (en) | 2023-11-21 |
Family
ID=87599569
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310880579.4A Active CN116595499B (en) | 2023-07-18 | 2023-07-18 | Multi-department collaborative transaction data sharing traceability method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116595499B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118133202B (en) * | 2024-03-20 | 2024-09-27 | 国网河南省电力公司经济技术研究院 | Block chain-based energy data safe and efficient tracing method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110547A (en) * | 2019-04-08 | 2019-08-09 | 智链万源(北京)数字科技有限公司 | Data processing method of tracing to the source and device |
WO2020233325A1 (en) * | 2019-05-21 | 2020-11-26 | 深圳壹账通智能科技有限公司 | Blockchain-based tracing data acquisition method, and related apparatus |
CN112150152A (en) * | 2020-10-09 | 2020-12-29 | 浙江专线宝网阔物联科技有限公司 | B-F neural network traceable algorithm based on block chain and fuzzy cognitive mapping fusion |
CN114079567A (en) * | 2020-08-21 | 2022-02-22 | 东北大学秦皇岛分校 | Block chain-based universal IP tracing system and method |
CN115134250A (en) * | 2022-06-29 | 2022-09-30 | 北京计算机技术及应用研究所 | Network attack source tracing evidence obtaining method |
CN115271959A (en) * | 2022-08-17 | 2022-11-01 | 中国工商银行股份有限公司 | Distributed transaction log evidence storing and tracing method and system |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110547A (en) * | 2019-04-08 | 2019-08-09 | 智链万源(北京)数字科技有限公司 | Data processing method of tracing to the source and device |
WO2020233325A1 (en) * | 2019-05-21 | 2020-11-26 | 深圳壹账通智能科技有限公司 | Blockchain-based tracing data acquisition method, and related apparatus |
CN114079567A (en) * | 2020-08-21 | 2022-02-22 | 东北大学秦皇岛分校 | Block chain-based universal IP tracing system and method |
CN112150152A (en) * | 2020-10-09 | 2020-12-29 | 浙江专线宝网阔物联科技有限公司 | B-F neural network traceable algorithm based on block chain and fuzzy cognitive mapping fusion |
CN115134250A (en) * | 2022-06-29 | 2022-09-30 | 北京计算机技术及应用研究所 | Network attack source tracing evidence obtaining method |
CN115271959A (en) * | 2022-08-17 | 2022-11-01 | 中国工商银行股份有限公司 | Distributed transaction log evidence storing and tracing method and system |
Non-Patent Citations (1)
Title |
---|
Composite blockchain associated-event tracing method for financial activities; Li Su et al.; 《计算机科学》 (Computer Science); full text *
Also Published As
Publication number | Publication date |
---|---|
CN116595499A (en) | 2023-08-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110163261B (en) | Unbalanced data classification model training method, device, equipment and storage medium | |
CN111241241B (en) | Case retrieval method, device, equipment and storage medium based on knowledge graph | |
CN112035672B (en) | Knowledge graph completion method, device, equipment and storage medium | |
CN116595499B (en) | Multi-department collaborative transaction data sharing traceability method | |
WO2020232898A1 (en) | Text classification method and apparatus, electronic device and computer non-volatile readable storage medium | |
CN108170759A (en) | Method, apparatus, computer equipment and the storage medium of tip-offs about environmental issues processing | |
CN110442702B (en) | Searching method and device, readable storage medium and electronic equipment | |
CN114580392B (en) | Data processing system for identifying entity | |
CN110321426B (en) | Digest extraction method and device and computer equipment | |
CN114049926A (en) | Electronic medical record text classification method | |
CN112507170A (en) | Data asset directory construction method based on intelligent decision and related equipment thereof | |
CN112883066B (en) | Method for estimating multi-dimensional range query cardinality on database | |
CN116738009B (en) | Method for archiving and backtracking data | |
CN109686413A (en) | A kind of chemical molecular formula search method based on es inverted index | |
US20200142910A1 (en) | Data clustering apparatus and method based on range query using cf tree | |
CN116916195A (en) | Passive optical network management method, device and readable storage medium | |
CN112559823B (en) | Data standardized data acquisition method | |
US20230018525A1 (en) | Artificial Intelligence (AI) Framework to Identify Object-Relational Mapping Issues in Real-Time | |
CN112463964B (en) | Text classification and model training method, device, equipment and storage medium | |
CN115204147A (en) | Data feature fingerprint construction and similarity measurement method and index | |
CN115310606A (en) | Deep learning model depolarization method and device based on data set sensitive attribute reconstruction | |
CN115344734A (en) | Image retrieval method, image retrieval device, electronic equipment and computer-readable storage medium | |
CN109086373B (en) | Method for constructing fair link prediction evaluation system | |
CN113987536A (en) | Method and device for determining security level of field in data table, electronic equipment and medium | |
CN112528662A (en) | Entity category identification method, device, equipment and storage medium based on meta-learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||