CN112329028A - Abnormal data identification method, system, device and medium based on block chain - Google Patents

Info

Publication number
CN112329028A
Authority
CN
China
Prior art keywords
block
data
gradient
transaction
block chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011049107.7A
Other languages
Chinese (zh)
Other versions
CN112329028B (en)
Inventor
朱佳
陈善轩
马晓东
林志豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University
Priority to CN202011049107.7A
Publication of CN112329028A
Application granted
Publication of CN112329028B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Finance (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Automation & Control Theory (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a method, a system, a device and a storage medium for identifying abnormal data based on a block chain. The method comprises the following steps: acquiring transaction data and packaging it to obtain a block to be audited; uploading the block to be audited to a block chain and performing transaction verification on it, where the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of a machine learning model; and broadcasting the transaction verification result on the block chain, packaging the transaction data that has passed transaction verification to obtain a second block, and appending the second block to the block chain. The method performs binary-classification voting based on the gradient direction of the transaction data to eliminate the influence of malicious or abnormal data that a malicious party may submit, so that the transaction data stored on the block chain is more authentic and reliable. The invention can be widely applied in the technical field of block chains.

Description

Abnormal data identification method, system, device and medium based on block chain
Technical Field
The invention belongs to the technical field of block chains, and particularly relates to a block chain-based abnormal data identification method, system, device and storage medium.
Background
Block chain networks have become widely popular because of properties such as high transparency, decentralization, trustlessness, tamper resistance and anonymity. They embody the idea of distributed autonomy and are attracting growing attention from innovation-minded financial institutions. Block chain networks serve as the backbone of public distributed ledger systems that handle asset transactions, in the form of digital tokens, between peer-to-peer users. This is particularly true of networks that adopt open access policies, which are distinguished by their intrinsic characteristics of disintermediation, public accessibility of network functions and tamper resistance.
Although data on a block chain cannot be tampered with, this does not guarantee that the data is authentic and reliable. If the block chain is regarded as a database, every event and data item must be recorded accurately for the stored data to be authentic. In addition, in actual production applications, a malicious party may use malicious or abnormal data to participate in the block creation process, or may falsify data, thereby causing losses.
Disclosure of Invention
In view of the above, to at least partially solve one of the above technical problems, an embodiment of the present invention provides a block chain-based abnormal data identification method to ensure that the transaction data recorded in a block chain is both authentic and reliable; it also provides a system, a device and a storage medium that correspondingly implement the method.
In a first aspect, the present invention provides a method for identifying abnormal data based on a block chain, which includes the following steps:
acquiring transaction data, and packaging the transaction data to obtain a block to be audited;
uploading the block to be audited to a block chain, and performing transaction verification on the block to be audited; the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of the machine learning model;
and broadcasting the transaction verification result on the block chain, packaging the transaction data which completes the transaction verification to obtain a second block, and accessing the second block to the block chain.
In some embodiments of the present invention, the step of uploading the block to be audited to the block chain and performing transaction verification on it specifically includes: sending the transaction data in the block to be audited to the nodes in the block chain; dividing the transaction data and determining the gradient of the divided transaction data; and performing gradient fault tolerance using stochastic gradient descent based on gradient direction consistency, and carrying out node voting based on the result of the gradient fault tolerance.
In some embodiments of the present invention, the step of uploading the block to be audited to the block chain and performing transaction verification on it further includes: when the node voting result is not less than a preset threshold, updating the gradient of the node according to the gradient of the transaction data and updating the machine learning model.
In some embodiments of the present invention, the step of dividing the transaction data and determining the gradient of the divided transaction data specifically includes: dividing the transaction data to obtain a plurality of training data sets, and training on the training data sets to obtain a gradient descent model; and obtaining the gradient of the transaction data from the gradient descent model.
In some embodiments of the present invention, the step of performing gradient fault tolerance using stochastic gradient descent based on gradient direction consistency, and carrying out node voting based on the result of the gradient fault tolerance, specifically includes: acquiring the gradient of each node and the gradients of its adjacent nodes to construct a gradient set; obtaining the score of each node from the gradient set, and determining the average gradient according to the scores; and training a binary classification model according to the average gradient, and obtaining the voting result of the node from the binary classification model.
In some embodiments of the present invention, the step of training a binary classification model according to the average gradient and obtaining the voting result of the node from the binary classification model specifically includes: screening out malicious data according to the average gradient, identifying the node that submitted the malicious data as a Byzantine node, and removing the Byzantine node.
In some embodiments of the present invention, the step of dividing the transaction data to obtain a plurality of training data sets further comprises: acquiring historical data, and dividing the historical data together with the transaction data to obtain the plurality of training data sets.
In a second aspect, the technical solution of the present invention further provides a system for identifying abnormal data based on a block chain, including a data processing unit and the block chain; wherein:
the data processing unit is used for acquiring transaction data and packaging the transaction data to obtain a block to be audited, and for uploading the block to be audited to the block chain;
the block chain is used for receiving the block to be audited and performing transaction verification on it; the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of the machine learning model; the block chain also broadcasts the transaction verification result, packages the transaction data that has passed transaction verification to obtain a second block, and appends the second block to the chain.
In a third aspect, a technical solution of the present invention further provides an apparatus for identifying abnormal data based on a block chain, including:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor implements the block chain based abnormal data identifying method in the first aspect.
In a fourth aspect, the present invention also provides a storage medium in which a processor-executable program is stored, the processor-executable program being configured to implement the method as in the first aspect when executed by a processor.
Advantages and benefits of the present invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention:
the block chain-based abnormal data identification method provided by the invention uses model voting, taking the consensus result of the machine learning models to decide whether a transaction is accepted; binary-classification voting based on the gradient direction of the transaction data eliminates the influence of malicious or abnormal data that a malicious party may submit, so that the transaction data stored on the block chain is more authentic, reliable and credible.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flowchart illustrating the steps of a method for identifying abnormal data based on a blockchain according to an embodiment of the present invention;
FIG. 2 illustrates a preparation phase for transaction verification in accordance with an embodiment of the present invention;
FIG. 3 illustrates the verification phase of transaction verification in an embodiment of the present invention;
FIG. 4 illustrates a consensus phase of transaction verification in accordance with an embodiment of the present invention;
FIG. 5 is a schematic visualization of stochastic gradient descent based on gradient direction consistency in an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
The general idea of the invention is as follows. If a block chain is regarded as a database, every event and data item must be recorded accurately for the data stored on the chain to be authentic. To identify abnormal data in a block chain system, the invention provides an abnormal data identification method for the block chain environment that uses model voting: the consensus result of the machine learning models decides whether transaction data is accepted. Because a malicious party can use malicious or abnormal data to influence the training of a machine learning model, the method adopts a model based on gradient direction consistency and completes model training through online training in an environment subject to Byzantine attacks.
In a first aspect, as shown in fig. 1, the present embodiment provides a method for identifying abnormal data based on a block chain, which mainly includes steps S01-S03:
S01. Acquire the transaction data and package it to obtain the block to be audited. Specifically, the node packages the generated transaction records or transaction data into a block. More specifically, the transaction data and related information cached locally by the node are stored in the block body of a new block; a Merkle tree of the transaction data stored in the block is generated in the block body, and the value of the Merkle root is stored in the block header. The block header also records the parent hash value, that is, the hash value of the block header of the last block in the block chain, together with a random number (nonce). A hash value is generated with the SHA256 algorithm and filled into the block header of the current block, and a timestamp field is generated at the same time. The block further contains a difficulty value field, which can be adjusted according to the average block generation time over a preceding period in order to cope with the continuously changing total computing power of the whole network: if the total computing power changes, the system adjusts the difficulty value so that the expected time to complete the next block stays within a certain range. After the node finishes packaging the transaction data, it submits the block to the block chain to await subsequent auditing; this block is recorded as the block to be audited.
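The packaging in step S01 can be illustrated with a minimal sketch. The field names, the JSON serialization of the header, and the rule of duplicating the last hash on odd Merkle levels are assumptions made for illustration; the patent text only states that a Merkle root, a parent hash, a random number, a SHA256 hash, a timestamp and a difficulty value are placed in the block.

```python
import hashlib
import json
import time


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions):
    """Merkle root (hex string) of a list of transaction strings."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last hash when a level has odd length.
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


def build_pending_block(transactions, parent_hash, nonce, difficulty, timestamp=None):
    """Package transactions into a block that awaits auditing (hypothetical layout)."""
    header = {
        "parent_hash": parent_hash,
        "merkle_root": merkle_root(transactions),
        "nonce": nonce,
        "difficulty": difficulty,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    # Hash the header fields in a deterministic order, then store the result.
    header_bytes = json.dumps(header, sort_keys=True).encode("utf-8")
    header["hash"] = sha256_hex(header_bytes)
    return {"header": header, "body": list(transactions), "status": "to_be_audited"}
```

Because the Merkle root is part of the hashed header, changing any transaction in the body changes the block hash, which is what makes later tampering detectable.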
S02. Upload the block to be audited to the block chain and perform transaction verification on it; the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of the machine learning model. Specifically, the other nodes in the block chain receive the block to be audited uploaded by the node and verify its transaction data one by one with their trained machine learning models; that is, they vote on each transaction record, and the output of each node's model represents that node's vote. In this embodiment, the machine learning model is a binary classification model: an output of 0 represents an objection and 1 represents approval. In some optional embodiments, in the step of performing transaction verification on the block to be audited, when the node voting result is not less than the preset threshold, the gradient of the node is updated according to the gradient of the transaction data, and the machine learning model is updated.
In implementing the method, the machine learning model can be trained on a specified data set before carrying out transaction verification, or it can be trained online, as in steps S021 to S023, while carrying out transaction verification. Data in the same field is assumed to be independent and identically distributed, so malicious data uploaded by a malicious party will inevitably disturb the distribution of the original data. Data uploaded by a Byzantine node is malicious data, and an attack by a Byzantine node can occur at any stage: it can upload malicious gradients in the preparation stage to attack the system, bypass the defense mechanism and directly modify the final result in the integration stage, or transmit malicious voting results in the voting stage to influence the consensus. Steps S021 to S023 are therefore mainly responsible for identifying malicious data or other operations uploaded by a malicious party, and for preventing such data from affecting the convergence of the model.
S021. Send the transaction data in the block to be audited to the nodes in the block chain. As shown in fig. 2, in the preparation phase of the algorithm model, a node packs the unverified transaction data and sends the packed data to every node for receiving and checking; each node parses the packed data to obtain the transaction data. So that the nodes verify the block jointly, each node extracts a part of the data for verification by equidistant sampling.
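The equidistant sampling mentioned in step S021 can be sketched as follows. The patent text does not state the sampling interval or starting offset; this sketch assumes that node number node_index takes every num_nodes-th transaction starting at its own index, so that the nodes jointly cover the whole block. Both parameter names are hypothetical.

```python
def equidistant_sample(transactions, node_index, num_nodes):
    """Return every num_nodes-th transaction, starting at offset node_index."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    if not 0 <= node_index < num_nodes:
        raise ValueError("node_index must be in [0, num_nodes)")
    # Slicing with a step gives evenly spaced samples without copying the rest.
    return transactions[node_index::num_nodes]
```

Under this assumption the samples of different nodes are disjoint and their union is the full transaction list.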
S022. Divide the transaction data and determine the gradient of the divided transaction data. As shown in fig. 3, this step corresponds to the verification stage. After each node receives the block to be verified, it calculates the gradient generated by the data to be verified. Because a malicious party may use malicious data to influence model training, or may attempt to store abnormal data in the block chain, this embodiment provides an algorithm that filters abnormal gradients using stochastic gradient descent based on gradient direction consistency. In the verification stage, after each node receives the transaction data to be verified, the algorithm for calculating the gradient generated by that data can be further subdivided into steps S0221 and S0222:
S0221. Divide the transaction data to obtain a plurality of training data sets, and obtain a gradient descent model from the training data sets. In this embodiment, the data sets of divided transaction data received by the nodes are P_1, P_2, ..., P_N. The minimum batch size is
[formula image BDA0002708979810000051, not reproduced in the text]
The training data set obtained by batch division is
[formula image BDA0002708979810000052, not reproduced in the text]
The number of training rounds is E, the learning rate of the model is η, and the latest gradient descent model is W_t.
S0222. Obtain the gradient of the transaction data from the gradient descent model. Based on the batch data
[formula image BDA0002708979810000053, not reproduced in the text]
and the current model W_t, the gradient descent model is trained in a loop, calculating:
[formula image BDA0002708979810000054, not reproduced in the text]
This yields the gradient lg, where
[formula image BDA0002708979810000055, not reproduced in the text]
denotes the derivative function and the parameter η controls the rate of gradient descent. Finally, the node broadcasts its gradient value lg on the block chain and waits for the gradient values lg_1, lg_2, ..., lg_i, lg_{i+1}, ..., lg_N of the other nodes.
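Since the formula images for steps S0221 and S0222 are not available, the following is only a sketch of one plausible reading: each node splits its data into batches, runs E rounds of mini-batch gradient descent from the current model W_t with learning rate η, and reports the net change of the weights as its gradient lg. The logistic loss, the definition of lg as W_t minus the locally trained weights, and all function names are assumptions, not taken from the patent.

```python
import math


def split_batches(data, batch_size):
    """Divide the node's data into consecutive batches."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def batch_gradient(w, batch):
    """Gradient of the mean logistic loss over one batch of (features, label) pairs."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(batch)
    return grad


def local_gradient(w_t, data, batch_size, epochs, eta):
    """Train locally from w_t and return lg = w_t - w_local (assumed definition)."""
    w = list(w_t)
    for _ in range(epochs):
        for batch in split_batches(data, batch_size):
            g = batch_gradient(w, batch)
            # Standard descent step: move against the batch gradient.
            w = [wi - eta * gi for wi, gi in zip(w, g)]
    return [a - b for a, b in zip(w_t, w)]
```

With this definition lg equals η times the sum of the batch gradients applied during local training, so it keeps the direction information that the later consistency check relies on.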
S023. Perform gradient fault tolerance using stochastic gradient descent based on gradient direction consistency, and carry out node voting based on the result. As shown in fig. 4, the last stage is the consensus stage. Gradient fault tolerance is a voting process that determines whether an obtained gradient comes from abnormal transaction data. The models of the nodes constitute a voting committee and vote through their output results. If more than 1/2 of the votes approve, consensus is reached, the latest model is recorded and the current latest state is updated. If the approving votes do not exceed 1/2, the data is regarded as malicious data and the node that submitted it is identified as a Byzantine node. More specifically, step S023 may be further subdivided into steps S0231 to S0233:
S0231. Obtain the gradient of each node and the gradients of its adjacent nodes to construct a gradient set. That is, given the gradient values lg_1, lg_2, ..., lg_i, lg_{i+1}, ..., lg_N, where the number of nodes is N and the number of Byzantine nodes is f, find the n-f-2 gradients nearest to each lg_i; these form the nearest gradient set V_i of node i.
S0232. Obtain the score of each node from the gradient set, and determine the average gradient according to the scores. Calculate the score s(i) of each node:
[formula image BDA0002708979810000056, not reproduced in the text]
and at the same time obtain the set of nodes with the minimum score:
MinI = { i | s(i) ∈ min_{k∈N} s(k) }
Finally, the average over the set { V_i | i ∈ MinI } is calculated.
S0233. Train a binary classification model according to the average gradient, and obtain the voting result of the node from the binary classification model.
For example, suppose there are n honest nodes and f Byzantine nodes (n > f) on the block chain. After a node receives the gradients lg_1, lg_2, ..., lg_i, lg_{i+1}, ..., lg_N calculated by the other nodes, it finds the n-f-2 nearest gradients, calculates the score of each node i from them, and finally takes the averaged gradient as the consensus result. As shown in fig. 5, the dotted arrows indicate malicious gradients calculated from malicious data, and the solid arrows indicate normal gradients. Because the distribution of malicious data lies far from the distribution of the original normal data, which is how it attacks the model, the gradient of malicious data is generally far from the gradients calculated from normal data. The algorithm then calculates the tangential arrow from the nearest n-f-2 gradients.
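Steps S0231 to S0233 resemble the Krum aggregation rule for Byzantine-tolerant gradient descent, and can be sketched under that assumption. The score formula is an image that is not available in the text, so the score used here (sum of squared distances from a gradient to its n-f-2 nearest gradients) is borrowed from Krum. Averaging the minimum-score gradient together with its nearest set, and voting by the sign of a dot product, are further illustrative assumptions; the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two gradient vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_set(gradients, i, k):
    """Indices of the k gradients closest to gradients[i], excluding i itself."""
    others = [j for j in range(len(gradients)) if j != i]
    others.sort(key=lambda j: sq_dist(gradients[i], gradients[j]))
    return others[:k]


def aggregate(gradients, f):
    """Return (average gradient, per-node scores, indices with the minimum score)."""
    n = len(gradients)
    k = n - f - 2
    if k < 1:
        raise ValueError("need n - f - 2 >= 1")
    neighbours = [nearest_set(gradients, i, k) for i in range(n)]
    # Assumed score: sum of squared distances to the k nearest gradients.
    scores = [
        sum(sq_dist(gradients[i], gradients[j]) for j in neighbours[i])
        for i in range(n)
    ]
    best = min(scores)
    min_i = [i for i, s in enumerate(scores) if s == best]
    # Assumption: average the best-scoring gradients together with their nearest sets.
    selected = sorted({j for i in min_i for j in neighbours[i]} | set(min_i))
    dim = len(gradients[0])
    avg = [sum(gradients[j][d] for j in selected) / len(selected) for d in range(dim)]
    return avg, scores, min_i


def direction_vote(gradient, reference):
    """Vote 1 (approve) if the gradient points the same way as the reference, else 0."""
    dot = sum(g * r for g, r in zip(gradient, reference))
    return 1 if dot > 0 else 0
```

A gradient computed from malicious data lies far from the others, so it receives a large score, is left out of the average, and fails the direction check against the aggregated gradient.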
In some other embodiments, the step of dividing the transaction data to obtain a plurality of training data sets further comprises acquiring historical data and dividing the historical data together with the transaction data to obtain the training data sets. To match the feature distribution of the training data and avoid having too little verification data, the algorithm extracts existing records from the historical data and adds them to the verification data. After the verification data has been supplemented, each node calculates a gradient based on the resulting verification data set.
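Supplementing the verification data from history can be sketched as below. The patent text does not say how many historical records are taken or which ones; the minimum size min_size and the choice of the most recent records are assumptions for illustration.

```python
def supplement_validation(validation, history, min_size):
    """Pad the validation set with the most recent historical records if it is too small."""
    missing = max(0, min_size - len(validation))
    if missing == 0:
        # Nothing to add; return a copy so the caller's list is not shared.
        return list(validation)
    return list(validation) + list(history[-missing:])
```

If the history holds fewer records than are missing, all of it is used and the result may still be shorter than min_size.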
S03. Broadcast the transaction verification result on the block chain, package the transaction data that has passed transaction verification to obtain a second block, and append the second block to the block chain. That is, the transaction data that received more than 1/2 of the votes is selected and repackaged into a block.
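The selection in step S03 amounts to a majority tally over the 0/1 votes of the nodes. The sketch below assumes the votes are collected in a dictionary keyed by a transaction identifier; that data layout and the function name are hypothetical.

```python
def accepted_transactions(votes_by_tx, threshold=0.5):
    """Return the ids of transactions whose share of approving votes exceeds the threshold.

    votes_by_tx maps a transaction id to the list of 0/1 votes cast by the nodes.
    """
    accepted = []
    for tx_id, votes in votes_by_tx.items():
        # Strictly more than the threshold, matching "exceeds 1/2" in the text.
        if votes and sum(votes) / len(votes) > threshold:
            accepted.append(tx_id)
    return accepted
```

A transaction with exactly half of the votes is rejected here; an embodiment that accepts results "not less than" a preset threshold would use >= instead.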
In a second aspect, the technical solution of the present invention further provides a system for identifying abnormal data based on a block chain, including a data processing unit and the block chain; wherein:
the data processing unit is used for acquiring transaction data and packaging the transaction data to obtain a block to be audited, and for uploading the block to be audited to the block chain;
the block chain is used for receiving the block to be audited and performing transaction verification on it; the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of the machine learning model; the block chain also broadcasts the transaction verification result, packages the transaction data that has passed transaction verification to obtain a second block, and appends the second block to the chain.
In a third aspect, an embodiment of the present invention further provides an apparatus for identifying abnormal data based on a blockchain, where the apparatus includes at least one processor; at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor implements the block chain based abnormal data identifying method as in the first aspect.
An embodiment of the present invention further provides a storage medium storing a program which, when executed by a processor, implements the method in the first aspect.
From the above specific implementation process, it can be concluded that the technical solution provided by the present invention has the following advantages over the prior art:
1. The invention provides an online model training method based on gradient direction consistency, which filters out the abnormal gradients of abnormal data by exploiting the consistency of the normal data distribution, thereby training a credible binary classification model.
2. In addition to the tamper resistance of the block chain, the scheme of the invention makes the block chain genuinely trustworthy, because the content of the data itself is authentic and reliable.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the functions and/or features may be integrated in a single physical device and/or software module, or one or more of the functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
Wherein the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An abnormal data identification method based on a block chain, characterized by comprising the following steps:
acquiring transaction data, and packaging the transaction data to obtain a block to be audited;
uploading the block to be audited to a block chain, and performing transaction verification on the block to be audited; wherein the transaction verification comprises voting through a machine learning model according to the gradient direction of the transaction data, and determining whether to accept the transaction data of the block to be audited according to the voting result;
and broadcasting the transaction verification result on the block chain, packaging the transaction data that has completed transaction verification to obtain a second block, and connecting the second block to the block chain.
2. The method for identifying abnormal data based on a block chain according to claim 1, wherein the step of uploading the block to be audited to the block chain and performing transaction verification on the block to be audited specifically comprises:
sending the transaction data in the block to be audited to a node in the block chain;
dividing the transaction data, and determining the gradient of the divided transaction data;
and performing gradient fault tolerance according to stochastic gradient descent with consistent gradient directions, and performing node voting based on the result of the gradient fault tolerance.
3. The method for identifying abnormal data based on a block chain according to claim 2, wherein the step of uploading the block to be audited to the block chain and performing transaction verification on the block to be audited further comprises:
when the voting result of the node is not less than a preset threshold value, updating the gradient of the node and updating the machine learning model according to the gradient of the transaction data.
4. The method for identifying abnormal data based on a blockchain according to claim 2, wherein the step of dividing the transaction data to determine the gradient of the divided transaction data specifically comprises:
dividing the transaction data to obtain a plurality of training data sets, and training according to the training data sets to obtain a gradient descent model;
and obtaining the gradient of the transaction data according to the gradient descent model.
5. The method according to claim 2, wherein the step of performing gradient fault tolerance according to stochastic gradient descent with consistent gradient directions and performing node voting based on the result of the gradient fault tolerance specifically comprises:
acquiring the gradients of a plurality of the nodes and the gradients of their adjacent nodes to construct a gradient set;
obtaining the scores of the nodes in the gradient set according to the gradient set, and determining the average gradient according to the scores;
and training a binary classification model according to the average gradient, and obtaining the voting result of the node according to the binary classification model.
6. The method according to claim 5, wherein the step of training a binary classification model according to the average gradient and obtaining the voting result of the node according to the binary classification model specifically comprises:
screening out malicious data according to the average gradient, determining the node that supplied the malicious data to be a Byzantine node, and removing the Byzantine node.
7. The method for identifying abnormal data based on blockchain according to claim 4, wherein the step of dividing the transaction data into a plurality of training data sets further comprises:
and acquiring historical data, and dividing the historical data and the transaction data to obtain a plurality of training data sets.
8. An abnormal data identification system based on a block chain, characterized by comprising a data processing unit and a block chain; wherein:
the data processing unit is used for acquiring transaction data, packaging the transaction data to obtain a block to be audited, and uploading the block to be audited to the block chain;
the block chain is used for receiving the block to be audited and performing transaction verification on the block to be audited, wherein the transaction verification determines whether to accept the transaction data of the block to be audited according to the voting result of the machine learning model; and for broadcasting the transaction verification result on the block chain, packaging the transaction data that has completed transaction verification to obtain a second block, and connecting the second block to the block chain.
9. An apparatus for identifying abnormal data based on block chains, comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for identifying abnormal data based on a block chain according to any one of claims 1 to 7.
10. A storage medium having stored therein a program executable by a processor, characterized in that: the processor-executable program when executed by a processor is for implementing the method of blockchain based anomaly data identification according to any one of claims 1 to 7.
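Illustrative note (not part of the claims): the direction-based voting of claims 1 to 3 and the score-based screening of claims 5 and 6 can be sketched roughly as follows. The claims do not specify a scoring function, a reference direction, or a threshold value, so everything concrete here is an assumption made for illustration: the score is the cosine similarity between a node's gradient and the coordinate-wise median of all gradients, a node votes to accept when its score is positive, and the default threshold of 2/3 and the function name `verify_block` are hypothetical. The binary classification model of claim 5 is replaced by this simple sign test.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def verify_block(
    gradients: List[Sequence[float]],
    threshold: float = 2 / 3,  # assumed value; the claims only say "preset threshold"
) -> Tuple[bool, List[int], List[float]]:
    """Vote on a block from per-node gradients.

    Returns (accepted, suspected_node_indices, average_gradient_of_kept_nodes).
    """
    if not gradients:
        raise ValueError("at least one gradient is required")

    # Assumed reference direction: coordinate-wise median, which a single
    # extreme gradient cannot drag as far as it would drag the mean.
    reference = [median(column) for column in zip(*gradients)]

    # Score each node by how well its gradient direction agrees with the reference.
    scores = [cosine(g, reference) for g in gradients]
    votes = [score > 0.0 for score in scores]

    # Nodes whose direction disagrees are treated as suspected Byzantine nodes.
    suspected = [i for i, ok in enumerate(votes) if not ok]
    kept = [g for g, ok in zip(gradients, votes) if ok]

    # Accept the block when the share of agreeing votes reaches the threshold.
    accepted = sum(votes) / len(gradients) >= threshold

    # Average gradient over the nodes that were kept (empty if none were kept).
    average = [sum(column) / len(kept) for column in zip(*kept)] if kept else []
    return accepted, suspected, average


if __name__ == "__main__":
    node_gradients = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [-5.0, -5.0]]
    print(verify_block(node_gradients))
```

With the sample input, the fourth node's gradient points opposite to the other three, so it is the only node flagged, and the block is still accepted because three of four votes meet the assumed 2/3 threshold. Using the plain mean as the reference would fail here, since the outlier would pull the mean into the opposite direction.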
CN202011049107.7A 2020-09-29 2020-09-29 Abnormal data identification method, system, device and medium based on block chain Active CN112329028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011049107.7A CN112329028B (en) 2020-09-29 2020-09-29 Abnormal data identification method, system, device and medium based on block chain

Publications (2)

Publication Number Publication Date
CN112329028A (en) 2021-02-05
CN112329028B (en) 2024-05-14

Family

ID=74313000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011049107.7A Active CN112329028B (en) 2020-09-29 2020-09-29 Abnormal data identification method, system, device and medium based on block chain

Country Status (1)

Country Link
CN (1) CN112329028B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182635A (en) * 2017-12-18 2018-06-19 深圳前海微众银行股份有限公司 Block chain common recognition method, system and computer readable storage medium
CN108491266A (en) * 2018-03-09 2018-09-04 联想(北京)有限公司 Data processing method, device based on block chain and electronic equipment
CN110084318A (en) * 2019-05-07 2019-08-02 哈尔滨理工大学 A kind of image-recognizing method of combination convolutional neural networks and gradient boosted tree
CN110602248A (en) * 2019-09-27 2019-12-20 腾讯科技(深圳)有限公司 Abnormal behavior information identification method, system, device, equipment and medium
CN110659901A (en) * 2019-09-03 2020-01-07 北京航空航天大学 Game model-based block chain complex transaction verification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马晓东, 朱佳 et al.: "Decentralized Applications Based on Blockchain Technology" (基于区块链技术的去中心化应用), 《网络安全空间》, vol. 10, no. 8, pages 102-109 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597240A (en) * 2021-03-01 2021-04-02 索信达(北京)数据技术有限公司 Federal learning data processing method and system based on alliance chain
CN112597240B (en) * 2021-03-01 2021-06-04 索信达(北京)数据技术有限公司 Federal learning data processing method and system based on alliance chain
CN113111124A (en) * 2021-03-24 2021-07-13 广州大学 Block chain-based federal learning data auditing system and method
CN113111124B (en) * 2021-03-24 2021-11-26 广州大学 Block chain-based federal learning data auditing system and method
CN113159953A (en) * 2021-04-30 2021-07-23 中国工商银行股份有限公司 Data persistence and parallel processing method and device based on block chain
CN113673996A (en) * 2021-08-06 2021-11-19 深圳前海微众银行股份有限公司 Block chain-based block node detection method and device
CN113868216A (en) * 2021-12-03 2021-12-31 中国信息通信研究院 Block chain monitoring method and device
CN115760388A (en) * 2022-11-07 2023-03-07 深圳市腾盟技术有限公司 Block chain-based consensus method, device, equipment and storage medium
CN115760388B (en) * 2022-11-07 2023-11-21 深圳市腾盟技术有限公司 Block chain-based consensus method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112329028B (en) 2024-05-14

Similar Documents

Publication Publication Date Title
CN112329028B (en) Abnormal data identification method, system, device and medium based on block chain
CN110163600B (en) Block chain system and method using the same
CN110647503A (en) Distributed storage method and device
Eskandari et al. Sok: Oracles from the ground truth to market manipulation
CN108985772A (en) A kind of verification method, device, equipment and the storage medium of block chain
CN111415161B (en) Block chain-based data verification method and device and computer readable storage medium
CN112307458A (en) Light node uplink method and device, Internet of things central control terminal and block chain network
CN113723962B (en) Block chain authority management method and block chain system
CN105357167A (en) Service processing method and device
CN110177079A (en) The calling system and call method of intelligent contract
CN112530537B (en) Big health management platform based on algorithm, medical image and block chain
CN112053164A (en) Block chain-based electronic commerce data processing method and system
CN112862474A (en) Supply chain management method, system, equipment and storage medium based on block chain
CN108647974A (en) A kind of Information Authentication method, apparatus and system based on block chain
CN111753987A (en) Method and device for generating machine learning model
CN114127771A (en) System and method for proof of viewing via blockchain
CN115865378A (en) Streaming media real-time evidence storing and checking method based on block chain
CN113568577A (en) Distributed packet storage method based on alliance block chain
CN114970886A (en) Clustering-based adaptive robust collaborative learning method and device
CN115022326A (en) Block chain Byzantine fault-tolerant consensus method based on collaborative filtering recommendation
KR102182750B1 (en) System and method for distributing data using block chain
CN112862469A (en) Block chain-based digital asset transaction method, system, equipment and storage medium
CN117171786A (en) Decentralizing federal learning method for resisting poisoning attack
CN116049816A (en) Federal learning method capable of verifying safety based on blockchain
CN114676195A (en) Block chain data tracing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant