CN117974147A - Block chain attack transaction identification method and system based on multivariate time sequence anomaly identification - Google Patents


Info

Publication number
CN117974147A
Authority
CN
China
Prior art keywords
transaction
data
sequence
time
blockchain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311786460.7A
Other languages
Chinese (zh)
Inventor
冯志淇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Li'an Technology Co ltd
Original Assignee
Chengdu Li'an Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Li'an Technology Co ltd filed Critical Chengdu Li'an Technology Co ltd
Priority to CN202311786460.7A priority Critical patent/CN117974147A/en
Publication of CN117974147A publication Critical patent/CN117974147A/en
Pending legal-status Critical Current


Abstract

The invention belongs to the technical field of blockchain applications and discloses a blockchain attack transaction identification method and system based on multivariate time sequence anomaly identification. The method constructs transaction time series data of a blockchain target address; constructs and divides multi-element time sequence data of the transactions; inputs the plurality of original sequence data obtained by the division into a trained prediction model based on unsupervised learning to obtain corresponding original sequence vector data and reconstructed sequence vector data; obtains an anomaly score for each transaction in the sequence from the original sequence vector data and the corresponding reconstructed sequence vector data; and then identifies the real-time transaction of the target address according to the anomaly score. The method and system can automatically analyze whether the current transaction of a specified target address differs from its historical transaction behavior, thereby identifying abnormal transactions and solving the problems that attack transactions are difficult to identify and that the coverage of anomaly categories is insufficient.

Description

Block chain attack transaction identification method and system based on multivariate time sequence anomaly identification
Technical Field
The invention belongs to the technical field of blockchain application, relates to blockchain transaction identification, and particularly relates to a blockchain attack transaction identification method and system based on multivariate time sequence anomaly identification.
Background
Blockchain is a distributed database technology that makes data transparent, tamper-resistant and traceable, and thereby addresses trust and security problems in cyberspace. However, blockchain technology may also be exploited by lawbreakers to carry out various forms of criminal activity that endanger social order and the public interest. According to studies by experts and scholars, blockchain-related criminal activity falls mainly into the following two types:
(1) Crimes such as pyramid selling, fraud and money laundering carried out using blockchain technology. For example, some lawbreakers use high returns as bait to promote and issue virtual currency or other digital assets to the public, absorbing public funds directly or in disguised form. There is no actual business activity or value support behind these offerings; instead, the digital assets paid in by members are obtained by deception, through recruiting downline members, setting up various profit schemes, and fabricating platform strength and profit prospects, and the amounts involved are huge. These activities may constitute the crimes of illegally absorbing public deposits, fund-raising fraud, organizing and leading pyramid selling activities, etc.;
(2) Crimes such as hacking, stealing user data and damaging network security carried out using blockchain technology. For example, some lawbreakers exploit vulnerabilities in blockchain technology, or use malicious code, to illegally obtain users' digital assets or other assets. These acts may constitute the crimes of damaging computer information systems, fraud, etc.
A blockchain hacking event is the act of stealing or damaging data or assets on a blockchain by exploiting vulnerabilities in blockchain technology or by using malicious code. For most hacking events on public blockchains, the attack methods are diverse and changeable. Attack methods can be classified by layer, target, means, and so on, for example network-layer attacks, consensus-layer attacks, contract-layer attacks and application-layer attacks. These attacks may exploit specific flaws or vulnerabilities of blockchain technology, or may be combined with conventional network attack methods for penetration or deception. As blockchain technology and its applications develop, attack methods keep evolving, and new or more concealed attack methods continue to appear. Faced with the massive, dynamic and anonymous transactions on a blockchain, identifying blockchain hacking attacks in real time becomes even more challenging.
Preventing and identifying these attack events is an important topic in blockchain security. Academia and industry currently have methods for preventing and identifying blockchain attack events, but each of these technologies faces its own challenges, as detailed below:
(1) Contract auditing refers to systematically inspecting and analyzing the code of a smart contract to discover errors, vulnerabilities and risks that may exist in it. Contract auditing helps improve the security and reliability of smart contracts and prevents hackers from exploiting contract vulnerabilities; see Detection of vulnerabilities in smart contracts specifications in Ethereum platforms (2020). Although a smart contract audit can discover vulnerabilities in advance to a certain extent, it cannot guarantee that all vulnerabilities are found; moreover, economic vulnerabilities can arise from the interaction between one smart contract and another. Contract auditing therefore cannot cover all smart contracts on the blockchain;
(2) Identifying blockchain addresses of specified illegal categories by supervised learning. This approach trains a model on data with known categories or labels in order to classify or predict new data; see Phishing scams detection in Ethereum transaction network (2020). However, a hacker's attack address is generally newly created and has too little history for machine learning; in addition, hacking samples are extremely scarce relative to normal samples, so supervised machine learning cannot effectively identify attack transactions.
Disclosure of Invention
To address the inability of the prior art to identify abnormal blockchain transactions effectively and in real time, the invention aims to provide a blockchain attack transaction identification method based on multivariate time sequence anomaly discrimination, which uses a deep learning algorithm based on unsupervised learning to identify abnormal blockchain transactions, so as to establish an effective, fast and accurate mechanism for detecting abnormal behavior.
It is another object of the present invention to provide a blockchain attack transaction identification system based on multivariate time series anomaly discrimination.
In order to achieve the above purpose, the present invention is realized by adopting the following technical scheme.
The invention provides a blockchain attack transaction identification method based on multivariate time sequence anomaly discrimination, which comprises the following steps:
S1, constructing transaction time series data of a blockchain target address; the method comprises the following sub-steps:
S11, acquiring blockchain full-node real-time transaction data, and filtering it to obtain the real-time transaction data of the target address;
S12, acquiring historical transaction data related to a target address from the blockchain full-node historical transaction data;
S13, arranging historical transaction data and real-time transaction data related to the target address according to the transaction execution time to construct transaction time sequence data of the blockchain target address;
S2, constructing and dividing multi-element time sequence data of the transaction; the method comprises the following sub-steps:
S21, carrying out characteristic engineering on transaction time series data of a target address to obtain multi-element time series data of transaction;
S22, dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data;
S3, inputting a plurality of original sequence data into a trained prediction model based on unsupervised learning to obtain corresponding original sequence vector data and reconstructed sequence vector data;
S4, obtaining a reconstruction error corresponding to each transaction in the sequence, namely an anomaly score, according to the original sequence vector data and the corresponding reconstructed sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
In step S1, a blockchain address that may be attacked by hackers, for example a wallet address or an exchange address on the blockchain, is used as the target address. Since a hacker transfers funds from the target address to the hacker's own wallet address, only transfer-out transactions from the target address need to be monitored.
In step S11, the blockchain full-node real-time transaction data is first obtained and then filtered to obtain the real-time transaction data of the target address. The real-time transaction data of the target address here is mainly the transfer-out transaction data of the target address.
In step S12, the historical transaction data related to the target address consists of a number of transactions preceding the real-time transaction, and may include transfer-in and/or transfer-out transactions.
In step S2, to characterize the transaction data more effectively, the invention performs feature engineering on the transaction data in the time series and extracts influential feature factors from multiple dimensions, such as general transaction type (ordinary transaction or contract transaction), transaction amount, transaction time, transaction count, currency, and opponent type (new opponent or old opponent).
In step S21, features are extracted from multiple dimensions for each transaction in the transaction time series of the target address, and standardization (for example, normalization) is applied to obtain the multi-element time sequence data of the transactions.
In step S22, the obtained multi-element time sequence data of the transactions is sliced with a sliding window of a given size to obtain a plurality of original sequence data. The window size may be set according to the transaction behavior of the target address.
In step S3, the original sequence data is learned and reconstructed by the prediction model. The prediction model comprises a first encoder, a sequence prediction module and a second encoder. The first encoder learns the original sequence data to obtain a vector representation of each transaction in it, namely the original sequence vector data; the sequence prediction module generates prediction vector data from the original sequence vector data; the second encoder decodes the prediction vector data to obtain the reconstructed sequence vector data corresponding to the original sequence vector data. The first encoder and the second encoder may be the same or different, and each may be an autoencoder (AE) or a variational autoencoder (VAE); the sequence prediction module may be an LSTM, RNN, GRU, or the like.
In a specific implementation, step S3 comprises the following sub-steps:
S31, learning the original sequence data through a first encoder to obtain vector representation of each transaction in the original sequence data, namely obtaining original sequence vector data;
S32, generating predicted vector data according to the original sequence vector data through a sequence prediction module;
S33, decoding the prediction vector data through the second encoder to obtain reconstructed sequence vector data corresponding to the original sequence vector data.
In step S4, an error calculation is performed on the original sequence vector data and the corresponding reconstructed sequence vector data to obtain the reconstruction error of each transaction in the sequence, which is defined as the anomaly score; the larger the reconstruction error, the more abnormal the corresponding transaction. The anomaly scores may be arranged in descending order; then, according to a set anomaly ratio, the current real-time transaction is determined to be an abnormal transaction when its rank falls within the anomaly ratio.
Based on the above analysis, step S4 comprises the following sub-steps:
S41, carrying out error calculation according to the original sequence vector data and the corresponding reconstructed sequence vector data to obtain a reconstruction error of each transaction in the sequence, and defining the reconstruction error as an anomaly score;
S42, arranging the anomaly scores in descending order;
S43, judging, according to the set anomaly ratio, whether the rank of the current real-time transaction falls within the anomaly ratio; if so, the current transaction is judged to be abnormal, otherwise it is judged to be normal.
In step S41, the distance or similarity between the original sequence vector data and the corresponding reconstructed sequence vector data may be calculated using methods such as Euclidean distance or cosine similarity to obtain the reconstruction error between the two. The Euclidean distance $d_t$ is calculated as:
$$d_t = \lVert W_t - \hat{W}_t \rVert_2 = \sqrt{\sum_{i=1}^{k} \left(w_{t,i} - \hat{w}_{t,i}\right)^2}$$
where $W_t$ is the original sequence vector data obtained through the first encoder; the corresponding reconstructed sequence vector data is written here as $\hat{W}_t$; $k$ is the number of elements in the reconstructed sequence vector data/original sequence vector data; and $\lVert \cdot \rVert_2$ denotes the L2 norm.
With the blockchain attack transaction identification method provided by the invention, if the current transaction is an attack transaction, the behavior of the attacked project address in the current transaction will deviate from its historical transaction behavior, so the anomaly score of the current transaction will rank within the anomaly ratio and the transaction will be identified as abnormal, thereby achieving effective, fast, accurate and stable detection of abnormal transactions.
The invention also provides a blockchain attack transaction identification system based on multivariate time sequence anomaly discrimination, which comprises:
the transaction time series data generation module is used for constructing transaction time series data of the blockchain target address; the transaction time series data generation module comprises:
the real-time transaction data acquisition unit is used for acquiring the real-time transaction data of the blockchain full-node and filtering out the real-time transaction data of the target address;
the historical transaction data acquisition unit is used for acquiring historical transaction data related to the target address from the blockchain full-node historical transaction data;
The transaction data ordering unit is used for arranging the historical transaction data and the real-time transaction data related to the target address according to the transaction execution time to construct transaction time sequence data of the blockchain target address;
the original sequence data generation module is used for constructing and dividing multi-element time sequence data of the transaction; the original sequence data generation module comprises:
the multi-element time sequence data generation unit is used for carrying out characteristic engineering on the transaction time sequence data of the target address to obtain multi-element time sequence data of the transaction;
the dividing unit is used for dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data;
the prediction model is used for processing the plurality of original sequence data to obtain corresponding original sequence vector data and reconstructed sequence vector data;
the abnormal transaction identification module is used for obtaining a reconstruction error corresponding to each transaction in the sequence, namely an abnormal score according to the original sequence vector data and the corresponding reconstruction sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
In the above blockchain attack transaction identification system, the prediction model comprises a first encoder, a sequence prediction module and a second encoder. The first encoder learns the original sequence data to obtain a vector representation of each transaction in it, namely the original sequence vector data; the sequence prediction module generates prediction vector data from the original sequence vector data; the second encoder decodes the prediction vector data to obtain the reconstructed sequence vector data corresponding to the original sequence vector data. The first encoder and the second encoder may be the same or different, and each may be an autoencoder (AE) or a variational autoencoder (VAE); the sequence prediction module may be an LSTM, RNN, GRU, or the like.
The blockchain attack transaction identification system, wherein the abnormal transaction identification module comprises:
the reconstruction error calculation unit is used for carrying out error calculation according to the original sequence vector data and the corresponding reconstruction sequence vector data to obtain a reconstruction error of each transaction in the sequence and defining the reconstruction error as an anomaly score;
An anomaly score ranking unit for ranking the anomaly scores in descending order;
And the judging unit is used for judging whether the current real-time transaction ranking is in the range of the abnormal ratio according to the set abnormal ratio, if so, judging that the current transaction is abnormal, and if not, judging that the current transaction is normal.
The block chain attack transaction identification system performs the operations according to the steps given by the method.
Compared with the prior art, the blockchain attack transaction identification method and system based on the multivariate time sequence anomaly discrimination have the following beneficial effects:
1) The invention constructs the transaction features of historical transactions into time series data and uses a sequence prediction model to predict the time series data, so that whether the current transaction differs from historical transaction behavior is identified through the reconstruction error of each transaction. This overcomes the shortcoming of traditional supervised and unsupervised machine learning methods, which ignore the temporal order of blockchain transactions; because an abnormal blockchain transaction is related to the most recent transactions, the accuracy of abnormal transaction identification can be improved;
2) The invention identifies abnormal transactions of a specified target address using unsupervised deep learning, which solves the problem that attack transactions are difficult to identify because hacking transaction data is scarce and hacking methods are diverse;
3) The invention can automatically analyze whether the current transaction of a specified target address differs from its historical transaction behavior, thereby identifying abnormal transactions; this solves the problem of insufficient coverage of anomaly categories faced by anomaly identification algorithms based on supervised machine learning.
Drawings
Fig. 1 is a flowchart of a blockchain attack transaction identification method based on multivariate time series anomaly discrimination provided in embodiment 1;
FIG. 2 is a schematic diagram showing the sub-step flow of step S1 in example 1;
FIG. 3 is a schematic diagram showing the sub-step flow of step S2 in example 1;
FIG. 4 is a schematic diagram showing the sub-step flow of step S3 in example 1;
FIG. 5 is a schematic diagram showing the process of step S4 in example 1;
Fig. 6 is a schematic block diagram of a blockchain attack transaction identification system based on multivariate time series anomaly discrimination as provided in embodiment 2.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the invention without creative effort fall within the scope of the invention.
Example 1
This embodiment is implemented on a blockchain full node, from which the full-node real-time transaction data and historical transaction data are acquired.
The invention provides a blockchain attack transaction identification method based on multivariate time sequence anomaly discrimination, which is shown in figure 1 and comprises the following steps:
S1, constructing transaction time series data of a blockchain target address;
S2, constructing and dividing multi-element time sequence data of the transaction;
S3, inputting a plurality of original sequence data into a trained prediction model based on unsupervised learning to obtain corresponding original sequence vector data and reconstructed sequence vector data;
S4, obtaining a reconstruction error corresponding to each transaction in the sequence, namely an anomaly score, according to the original sequence vector data and the corresponding reconstructed sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
In step S1 above, the main purpose of a hacker attack is to transfer, to the hacker's own wallet, the virtual currency held at a blockchain project-side wallet address (that is, the wallet address of an institutional user on a public chain that provides financial services to virtual currency investors) or at a private wallet address storing a large amount of virtual currency. Therefore, blockchain project-side wallet addresses and private wallet addresses holding large amounts of virtual currency can be taken as target addresses, and only the transfer-out transactions of a target address need to be monitored.
As shown in fig. 2, the above step S1 includes the following sub-steps:
S11, acquiring blockchain full-node real-time transaction data, and filtering it to obtain the real-time transaction data of the target address.
First, the blockchain full-node real-time transaction data is obtained and filtered to obtain the real-time transaction data of the target address. The real-time transaction data of the target address here is mainly its transfer-out transaction data, namely the transfer-out transactions of the blockchain project-side wallet address or of the private wallet address holding a large amount of virtual currency.
S12, acquiring historical transaction data related to the target address from the blockchain full-node historical transaction data.
According to the target address (namely the specified blockchain project-side wallet address or private wallet address holding a large amount of virtual currency) and the real-time transaction obtained after filtering, the N transactions preceding the real-time transaction are acquired as the historical transaction data of the target address, where N is a set threshold on the number of historical transactions.
S13, the historical transaction data and the real-time transaction data related to the target address are arranged according to the transaction execution time to construct the transaction time series data of the blockchain target address.
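Sub-steps S11 to S13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Tx` record, its field names and the `build_series` helper are hypothetical, and the full-node data source is replaced by an in-memory list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tx:
    """Hypothetical transaction record; field names are assumptions."""
    tx_hash: str
    sender: str
    receiver: str
    amount: float
    timestamp: int  # transaction execution time, e.g. Unix seconds


def build_series(history: List[Tx], realtime_tx: Tx, target: str, n: int) -> List[Tx]:
    """Return up to n historical transactions of `target`, followed by the
    real-time transfer-out transaction, ordered by execution time."""
    # S11: only transfer-out transactions of the target address are monitored.
    if realtime_tx.sender != target:
        raise ValueError("real-time transaction is not a transfer-out of the target")
    # S12: historical transactions related to the target (transfer-in or
    # transfer-out) that were executed before the real-time transaction.
    related = [
        tx for tx in history
        if target in (tx.sender, tx.receiver) and tx.timestamp < realtime_tx.timestamp
    ]
    # S13: order by execution time and keep the n most recent ones.
    related.sort(key=lambda tx: tx.timestamp)
    recent = related[max(0, len(related) - n):]
    return recent + [realtime_tx]
```

The real-time transaction is always the last element of the returned series, so later steps can locate its anomaly score by position.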
In step S2, to characterize the transaction data more effectively, the invention performs feature engineering on the transaction data in the time series and extracts influential feature factors from multiple dimensions, such as general transaction type (ordinary transaction or contract transaction), transaction amount, transaction time, transaction count, currency, and opponent type (new opponent or old opponent).
As shown in fig. 3, the above step S2 includes the following sub-steps:
S21, carrying out characteristic engineering on the transaction time series data of the target address to obtain multi-element time series data of the transaction.
For each transaction in the transaction time series of the target address, features are extracted from multiple dimensions and standardized to obtain the multi-element time sequence data of the transactions. Standardization here means normalizing each feature.
In this embodiment, the transaction features extracted for the target address (here, the specified blockchain project-side wallet address or private wallet address holding a large amount of virtual currency) include general transaction type (ordinary transaction or contract transaction), transaction amount, transaction time, transaction count, currency, opponent type (new opponent or old opponent), and the like. Because public-chain transactions are divided into ordinary transactions and contract transactions, the general transaction feature distinguishes these two kinds at the business level: an ordinary transaction may be denoted by "0" and a contract transaction by "1". Opponent type (that is, the type of the counterparty transacting with the target address) is also a relatively important feature, since a hacker typically appears as a new opponent; an old opponent may therefore be denoted by "0" and a new opponent by "1". For example, a transaction may be characterized by the multi-dimensional transaction feature [0,1,98,12,5,5], where "0" represents an ordinary transaction, "1" represents a new opponent, "98" is the total amount of the current transaction in the blockchain's native currency, "12" is the transaction count, the first "5" is the number of transaction currencies, and the second "5" is the number of currency exchanges. The same feature is then normalized across the different transactions in the transaction time series data.
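The per-feature normalization described above can be sketched as follows. The source does not say which normalization is used; min-max scaling to [0, 1] is assumed here, and the helper name `min_max_normalize` is hypothetical. Only the first row of the sample data comes from the text; the other two rows are made up for illustration.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1] across all transactions.

    Min-max scaling is an assumption; a constant column is mapped to 0.0.
    """
    if not rows:
        return []
    columns = list(zip(*rows))  # one tuple per feature
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Feature order follows the example in the text:
# [transaction type, opponent type, amount, count, currencies, exchanges]
raw = [
    [0, 1, 98, 12, 5, 5],  # example vector from the text
    [1, 0, 10, 3, 1, 0],   # hypothetical transaction
    [0, 0, 54, 6, 3, 5],   # hypothetical transaction
]
normalized = min_max_normalize(raw)
```

Normalizing per column keeps large-valued features such as the amount from dominating the reconstruction error computed later.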
S22, dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data.
The obtained multi-element time sequence data of the transactions is sliced with a sliding window of a given size to obtain a plurality of original sequence data. The window size may be set according to the transaction behavior of the target address; for example, it may be set to 200.
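A minimal sketch of the sliding slicing in S22 follows. It assumes that the window size counts transactions (the source gives 200 without a unit) and that the window advances one transaction at a time by default; the `stride` parameter and the function name are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(series: Sequence[T], window: int, stride: int = 1) -> List[List[T]]:
    """Cut `series` into overlapping windows of `window` consecutive items.

    Returns an empty list when the series is shorter than one window.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(series) - window
    return [list(series[i:i + window]) for i in range(0, last_start + 1, stride)]
```

For instance, `sliding_windows(list(range(5)), 3)` yields three windows of three consecutive items each; every window is one piece of original sequence data fed to the prediction model.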
In step S3, the original sequence data is learned and reconstructed by the prediction model. The prediction model used includes a first encoder, a sequence prediction module, and a second encoder. The first encoder learns the original sequence data to obtain a vector representation of each transaction in it, namely the original sequence vector data; the sequence prediction module generates prediction vector data from the original sequence vector data; the second encoder decodes the prediction vector data to obtain the reconstructed sequence vector data corresponding to the original sequence vector data. The first encoder and the second encoder may be the same or different, and each may be an autoencoder (AE) or a variational autoencoder (VAE). The sequence prediction module is a deep sequence model of the recurrent neural network family, and may be an LSTM, RNN, GRU, or the like.
To train the prediction model, blockchain full-node historical transaction data is processed according to steps S1-S2 above to construct the training set; the prediction model is then trained by a conventional training method known in the art, using the KL divergence (Kullback-Leibler divergence) as the loss function and the Adam optimization algorithm (AdamOptimizer) to optimize the model parameters. In the invention, attack samples generated in the past year are added to the model training samples so that most attack modes seen in practice are covered.
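For reference, the KL divergence named as the loss function has the following standard definition for two discrete distributions $P$ and $Q$ over the same support. The source does not state which distributions are compared during training; when the encoder is a VAE, a common choice (an assumption here, not something the source specifies) is the divergence between the approximate posterior and the prior over the latent vector.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \, \log \frac{P(x)}{Q(x)}
```

The divergence is non-negative and equals zero only when $P$ and $Q$ coincide, which is why minimizing it pulls the two distributions together.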
As shown in fig. 4, in a specific implementation, step S3 includes the following sub-steps:
S31, learning the original sequence data through the first encoder to obtain a vector representation of each transaction in the original sequence data, namely the original sequence vector data.
S32, generating prediction vector data from the original sequence vector data through the sequence prediction module.
S33, decoding the prediction vector data through the second encoder to obtain reconstructed sequence vector data corresponding to the original sequence vector data.
In step S4, an error calculation is performed on the original sequence vector data and the corresponding reconstructed sequence vector data to obtain the reconstruction error of each transaction in the sequence, which is defined as the anomaly score; the larger the reconstruction error, the more abnormal the corresponding transaction.
As shown in fig. 5, step S4 includes the following sub-steps:
S41, carrying out error calculation according to the original sequence vector data and the corresponding reconstructed sequence vector data to obtain a reconstruction error of each transaction in the sequence, and defining the reconstruction error as an anomaly score;
The distance or similarity between the original sequence vector data and the corresponding reconstructed sequence vector data can be calculated by Euclidean distance or cosine similarity and other methods to obtain the reconstruction error between the two.
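The Euclidean distance d_t is calculated as follows:
$$d_t = \lVert W_t - \hat{W}_t \rVert_2 = \sqrt{\sum_{i=1}^{k} \left( W_{t,i} - \hat{W}_{t,i} \right)^2}$$
where W_t is the original sequence vector data obtained through the first encoder; \hat{W}_t is the corresponding reconstructed sequence vector data; k is the number of elements in the reconstructed sequence vector data/original sequence vector data; and \lVert \cdot \rVert_2 denotes the L2 norm.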
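A minimal sketch of the per-transaction reconstruction error is given below, using the Euclidean distance and, as the alternative mentioned in the text, a cosine-based distance. The function names are illustrative, and converting cosine similarity into a distance as one minus the similarity is an assumption, since the text does not say how a similarity would be turned into an error.

```python
import math


def euclidean_distance(w, w_hat):
    """L2 distance between an original vector and its reconstruction."""
    if len(w) != len(w_hat):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_hat)))


def cosine_distance(w, w_hat):
    """1 - cosine similarity (assumed conversion from similarity to error)."""
    dot = sum(a * b for a, b in zip(w, w_hat))
    norm = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in w_hat))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return 1.0 - dot / norm


def anomaly_scores(original_vectors, reconstructed_vectors, distance=euclidean_distance):
    """One reconstruction error (anomaly score) per transaction in the sequence."""
    return [distance(w, w_hat) for w, w_hat in zip(original_vectors, reconstructed_vectors)]
```

For example, `anomaly_scores([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [0.0, 0.0]])` gives a score of zero for the perfectly reconstructed first transaction and a larger score for the second.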
S42, arranging the anomaly scores in descending order.
Here, the anomaly scores obtained for the transactions are sorted from largest to smallest, which completes the reordering of the transactions in the time window.
S43, judging whether the current real-time transaction ranking is in the range of the abnormal ratio according to the set abnormal ratio, if so, judging that the current transaction is abnormal, otherwise, judging that the current transaction is normal.
The anomaly ratio may be set to 5% or 10%; with the anomaly scores arranged in descending order, the top 5% or 10% are taken as the determined anomalous scores. If the rank of the current real-time transaction is within the range of the anomaly ratio, the current real-time transaction is considered to be an abnormal transaction; otherwise, it is judged to be a normal transaction.
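The ranking and threshold decision of sub-steps S42 and S43 can be sketched as follows. The function names are hypothetical, and two details are assumptions not fixed by the text: the cutoff is rounded down to a whole number of transactions with a minimum of one, and tied scores share the better rank.

```python
import math


def anomaly_cutoff(num_scores, anomaly_ratio):
    """Number of top-ranked scores treated as anomalous (assumed: at least one)."""
    return max(1, math.floor(num_scores * anomaly_ratio + 1e-9))


def is_abnormal(scores, current_index, anomaly_ratio=0.05):
    """Return True if the transaction at current_index ranks within the top share.

    The rank is 1 plus the number of strictly larger scores, which is the
    position the transaction would take in a descending sort.
    """
    current = scores[current_index]
    rank = 1 + sum(1 for s in scores if s > current)
    return rank <= anomaly_cutoff(len(scores), anomaly_ratio)
```

In use, `scores` would hold the anomaly scores of all transactions in the time window and `current_index` would point at the real-time transaction being checked.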
Example 2
The blockchain attack transaction identification system based on multivariate time series anomaly discrimination provided in this embodiment, as shown in fig. 6, includes:
The transaction time series data generation module is used for constructing transaction time series data of the blockchain target address;
The original sequence data generation module is used for constructing and dividing multi-element time sequence data of the transaction;
the prediction model is used for processing the plurality of original sequence data to obtain corresponding original sequence vector data and reconstructed sequence vector data;
the abnormal transaction identification module is used for obtaining a reconstruction error corresponding to each transaction in the sequence, namely an abnormal score according to the original sequence vector data and the corresponding reconstruction sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
The transaction time series data generation module includes:
the real-time transaction data acquisition unit is used for acquiring the real-time transaction data of the blockchain full-node and filtering out the real-time transaction data of the target address;
the historical transaction data acquisition unit is used for acquiring historical transaction data related to the target address from the blockchain full-node historical transaction data;
and the transaction data ordering unit is used for arranging the historical transaction data and the real-time transaction data related to the target address according to the transaction execution time to construct the transaction time sequence data of the blockchain target address.
The original sequence data generating module includes:
the multi-element time sequence data generation unit is used for carrying out characteristic engineering on the transaction time sequence data of the target address to obtain multi-element time sequence data of the transaction;
the dividing unit is used for dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data;
The prediction model comprises a first encoder, a sequence prediction module and a second encoder; the first encoder is used for learning the original sequence data to obtain a vector representation of each transaction in the original sequence data, namely the original sequence vector data; the sequence prediction module is used for generating prediction vector data according to the original sequence vector data; the second encoder is used for decoding the prediction vector data to obtain reconstructed sequence vector data corresponding to the original sequence vector data. The first encoder and the second encoder may be the same or different, and each may be an autoencoder (AE) or a variational autoencoder (VAE); the sequence prediction module may be an LSTM, RNN, GRU or the like.
The abnormal transaction identification module comprises:
the reconstruction error calculation unit is used for carrying out error calculation according to the original sequence vector data and the corresponding reconstruction sequence vector data to obtain a reconstruction error of each transaction in the sequence and defining the reconstruction error as an anomaly score;
An anomaly score ranking unit for ranking the anomaly scores in descending order;
And the judging unit is used for judging whether the current real-time transaction ranking is in the range of the abnormal ratio according to the set abnormal ratio, if so, judging that the current transaction is abnormal, and if not, judging that the current transaction is normal.
The above modules operate according to the steps of the blockchain attack transaction identification method given in the embodiment above.
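The work of the transaction data ordering unit, the multi-element time sequence data generation unit and the dividing unit can be sketched as follows. This is a simplified illustration under stated assumptions: the `timestamp` key, z-score standardization, and the `window_size` and `step` parameters are hypothetical choices, since the text only specifies ordering by execution time, feature engineering, and division into several original sequences.

```python
import math


def sort_by_time(transactions):
    """Order historical and real-time transactions by execution time."""
    return sorted(transactions, key=lambda tx: tx["timestamp"])


def standardize(rows):
    """Z-score each feature column; constant columns are mapped to 0.0."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def sliding_windows(rows, window_size, step=1):
    """Divide the multivariate sequence into overlapping fixed-length windows."""
    return [
        rows[i:i + window_size]
        for i in range(0, len(rows) - window_size + 1, step)
    ]
```

Each window returned by `sliding_windows` corresponds to one piece of original sequence data passed to the prediction model.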
Those of ordinary skill in the art will recognize that the embodiments described herein are intended to help the reader understand the principles of the invention, and it should be understood that the scope of protection of the invention is not limited to these specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations based on the teachings of the present disclosure without departing from its spirit, and such modifications and combinations remain within the scope of the present disclosure.

Claims (10)

1. A blockchain attack transaction identification method based on multivariate time series anomaly distinction is characterized by comprising the following steps:
S1, constructing transaction time series data of a blockchain target address; the method comprises the following sub-steps:
S11, acquiring real-time transaction data of all nodes of the blockchain, and filtering out the real-time transaction data of the target address;
S12, acquiring historical transaction data related to a target address from the blockchain full-node historical transaction data;
S13, arranging historical transaction data and real-time transaction data related to the target address according to the transaction execution time to construct transaction time sequence data of the blockchain target address;
S2, constructing and dividing multi-element time sequence data of the transaction; the method comprises the following sub-steps:
S21, carrying out characteristic engineering on transaction time series data of a target address to obtain multi-element time series data of transaction;
S22, dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data;
S3, inputting a plurality of original sequence data into a trained prediction model based on unsupervised learning to obtain corresponding original sequence vector data and reconstructed sequence vector data;
S4, obtaining a reconstruction error corresponding to each transaction in the sequence, namely an anomaly score, according to the original sequence vector data and the corresponding reconstructed sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
2. The blockchain attack transaction identification method based on the multivariate time series anomaly discrimination of claim 1, wherein in step S11, the real-time transaction data of the target address is the transfer-out transaction data of the target address.
3. The blockchain attack transaction identification method based on the multivariate time series anomaly discrimination of claim 1, wherein in step S12, the obtained historical transaction data related to the target address is a plurality of transaction data before the real-time transaction data, which is the transfer-in transaction data or/and the transfer-out transaction data.
4. The blockchain attack transaction identification method based on the multivariate time series anomaly discrimination of claim 1, wherein in step S21, any one of the transaction time series of the target address is subjected to feature extraction from a plurality of dimensions and standardized processing, so as to obtain multivariate time series data of the transaction.
5. The blockchain attack transaction identification method based on multivariate time series anomaly discrimination of claim 4, wherein the extracted features are general transaction features, transaction amount, transaction time, transaction count, currency, or/and counterparty type.
6. The blockchain attack transaction identification method based on the multivariate time series anomaly discrimination of any one of claims 1 to 5, wherein in step S22, sequence sliding slicing is performed on the obtained multivariate time series data of the transaction according to a given time window to obtain a plurality of original sequence data.
7. The blockchain attack transaction identification method based on multivariate time series anomaly discrimination of claim 6, wherein the prediction model comprises a first encoder, a sequence prediction module, and a second encoder;
Step S3 comprises the following sub-steps:
S31, learning the original sequence data through a first encoder to obtain vector representation of each transaction in the original sequence data, namely obtaining original sequence vector data;
S32, generating prediction vector data according to the original sequence vector data through a sequence prediction module;
S33, decoding the prediction vector data through a second encoder to obtain reconstructed sequence vector data corresponding to the original sequence vector data.
8. The blockchain attack transaction identification method based on multivariate time series anomaly discrimination of claim 7, wherein the first encoder is the same as or different from the second encoder, and is an autoencoder or a variational autoencoder; the sequence prediction module may be an LSTM, RNN or GRU.
9. The blockchain attack transaction identification method based on multivariate time series anomaly discrimination of claim 7, wherein step S4 comprises the following sub-steps:
S41, carrying out error calculation according to the original sequence vector data and the corresponding reconstructed sequence vector data to obtain a reconstruction error of each transaction in the sequence, and defining the reconstruction error as an anomaly score;
S42, arranging the anomaly scores in a descending order;
S43, judging whether the current real-time transaction ranking is in the range of the abnormal ratio according to the set abnormal ratio, if so, judging that the current transaction is abnormal, otherwise, judging that the current transaction is normal.
10. A blockchain attack transaction identification system based on multivariate time series anomaly discrimination, comprising:
the transaction time series data generation module is used for constructing transaction time series data of the blockchain target address; the transaction time series data generation module comprises:
the real-time transaction data acquisition unit is used for acquiring the real-time transaction data of the blockchain full-node and filtering out the real-time transaction data of the target address;
the historical transaction data acquisition unit is used for acquiring historical transaction data related to the target address from the blockchain full-node historical transaction data;
The transaction data ordering unit is used for arranging the historical transaction data and the real-time transaction data related to the target address according to the transaction execution time to construct transaction time sequence data of the blockchain target address;
the original sequence data generation module is used for constructing and dividing multi-element time sequence data of the transaction; the original sequence data generation module comprises:
the multi-element time sequence data generation unit is used for carrying out characteristic engineering on the transaction time sequence data of the target address to obtain multi-element time sequence data of the transaction;
the dividing unit is used for dividing the obtained multi-element time sequence data of the transaction to obtain a plurality of original sequence data;
the prediction model is used for processing the plurality of original sequence data to obtain corresponding original sequence vector data and reconstructed sequence vector data;
the abnormal transaction identification module is used for obtaining a reconstruction error corresponding to each transaction in the sequence, namely an abnormal score according to the original sequence vector data and the corresponding reconstruction sequence vector data; and identifying the real-time transaction of the target address according to the anomaly score.
CN202311786460.7A 2023-12-22 2023-12-22 Block chain attack transaction identification method and system based on multivariate time sequence anomaly identification Pending CN117974147A (en)

Publications (1)

Publication Number Publication Date
CN117974147A true CN117974147A (en) 2024-05-03

Family

ID=90860226

Legal Events

Date Code Title Description
PB01 Publication