CN114511330B - Ethereum Ponzi scheme detection method and system based on improved CNN-RF - Google Patents

Ethereum Ponzi scheme detection method and system based on improved CNN-RF

Info

Publication number
CN114511330B
Authority
CN
China
Prior art keywords
contract
data
account
adopting
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210401495.3A
Other languages
Chinese (zh)
Other versions
CN114511330A (en
Inventor
张淑慧
兰田
王连海
徐淑奖
邵蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Computer Science Center National Super Computing Center in Jinan
Application filed by Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN202210401495.3A
Publication of CN114511330A
Application granted
Publication of CN114511330B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/405 Establishing or using transaction specific rules

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of blockchain abnormal behavior detection and provides an Ethereum Ponzi scheme detection method and system based on an improved CNN-RF. The method comprises: acquiring smart contract data on Ethereum; extracting account features and opcode features from the smart contract data, and combining the account features and the opcode features to obtain mixed features; extracting, based on the mixed features, the key features for Ponzi scheme contract detection using a CNN feature extraction model; and, based on the key features, obtaining a detection result indicating whether the contract is a Ponzi scheme contract using an RF classification model. The invention uses the convolutional neural network model to screen key feature data and integrates the training and use of a random forest classifier as the prediction model, which improves detection accuracy.

Description

Ethereum Ponzi scheme detection method and system based on improved CNN-RF
Technical Field
The invention belongs to the field of blockchain abnormal behavior detection, and particularly relates to an Ethereum Ponzi scheme detection method and system based on an improved CNN-RF.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Issuing digital currency places high demands on technology, so blockchain technology, with technical advantages such as decentralized distributed databases, smart contracts, and consensus algorithms, has become a key candidate for the technical basis of bank-issued digital currency. Although research on blockchain technology continues to advance in many countries, its implementation raises legal and regulatory problems, and the centralized management that digital currency requires conflicts to some extent with the decentralized nature of blockchain technology. In particular, on the Ethereum platform, which supports smart contracts, once a smart contract has been successfully deployed it executes automatically whenever its run conditions are met, and no outside party can intervene to terminate it. This gives illegal speculators an opening to embed malicious code in contracts and collect money illicitly, the most typical example being Ponzi scheme contract fraud. Research on Ponzi scheme detection methods for Ethereum is therefore urgently needed.
At present, many scholars work on detecting abnormal behavior on blockchains, and a research hotspot is the detection of Ethereum Ponzi scheme contracts. Early researchers analyzed contract code and account transaction information manually to determine whether a contract was a Ponzi scheme contract. Only later did some researchers introduce machine learning and data mining algorithms, which simplified blockchain anomaly detection, with Ethereum Ponzi scheme detection as its focus. However, whether data mining techniques or machine learning algorithms are used, existing approaches to Ethereum Ponzi scheme detection still fall short in handling imbalanced feature data and in detection performance.
Disclosure of Invention
In order to solve the technical problems described in the background, the invention provides an Ethereum Ponzi scheme detection method and system based on an improved CNN-RF, in which a SMOTE + ENN hybrid sampling algorithm processes the sample-imbalanced feature data set, a convolutional neural network model screens the key feature data, and the training and use of a random forest classifier as the prediction model are integrated, which improves detection accuracy.
To achieve this purpose, the invention adopts the following technical solutions:
A first aspect of the invention provides an Ethereum Ponzi scheme detection method based on an improved CNN-RF.
An Ethereum Ponzi scheme detection method based on an improved CNN-RF includes:
acquiring smart contract data on Ethereum;
extracting account features and opcode features from the smart contract data, and combining the account features and the opcode features to obtain mixed features;
extracting, based on the mixed features, the key features for Ponzi scheme contract detection using a CNN feature extraction model;
and, based on the key features, obtaining a detection result indicating whether the contract is a Ponzi scheme contract using an RF classification model.
Further, acquiring the smart contract data on Ethereum specifically includes: acquiring the account transaction information and contract runtime bytecode of a normal contract according to the contract address of the normal contract; and acquiring the account transaction information and contract runtime bytecode of a Ponzi scheme contract according to the contract address of the Ponzi scheme contract.
Further, extracting the account features of the smart contract data specifically includes: extracting normal account features from the account transaction information of the normal contract and abnormal account features from the account transaction information of the Ponzi scheme contract, respectively.
Further, extracting the opcode features specifically includes: decompiling the runtime bytecode of the normal contract and the runtime bytecode of the Ponzi scheme contract with a decompilation tool to obtain normal opcode features and abnormal opcode features, respectively.
Further, an automatic parameter optimizer is built into the CNN feature extraction model of the improved CNN-RF; during training, the automatic parameter optimizer continuously tunes the model's performance, and the optimal trained model is saved.
Further, the training process comprises:
constructing a feature data set based on the account features and the opcode features of the smart contract data;
performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm to obtain a new sample set;
and training the CNN-RF with the samples in the new sample set to obtain the trained CNN-RF.
Further, the specific process of performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm includes: oversampling the feature data set with the SMOTE algorithm, then deduplicating the data with the ENN data cleaning algorithm, and finally generating a new sample set.
Further, the specific process of extracting the account features of the smart contract data comprises: processing the contract transaction times and transaction amounts to compute the number of investors, the investment amount, the ratio of investor income to investment amount, the proportion of new investors, and the maximum amount paid by the investors.
Further, the smart contract data is obtained from an Ethereum blockchain explorer website using crawler technology.
A second aspect of the invention provides an Ethereum Ponzi scheme detection system based on an improved CNN-RF.
An Ethereum Ponzi scheme detection system based on an improved CNN-RF, comprising:
a data acquisition module configured to: acquire smart contract data on Ethereum;
a feature extraction module configured to: extract account features and opcode features from the smart contract data, and combine the account features and the opcode features to obtain mixed features;
a model processing module configured to: extract, based on the mixed features, the key features for Ponzi scheme contract detection using a CNN feature extraction model;
a detection module configured to: obtain, based on the key features, a detection result indicating whether the contract is a Ponzi scheme contract using an RF classification model.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention introduces the SMOTE + ENN hybrid sampling algorithm to improve CNN-RF, which avoids the data overlap problem that follows SMOTE oversampling, realizes automatic extraction of key features, and improves detection accuracy.
2. The Ethereum Ponzi scheme detection method based on the improved CNN-RF not only solves the problem of repeated data after oversampling but also simplifies the key feature extraction process. Experiments show that the detection method of the embodiment is better suited to detecting Ethereum Ponzi schemes at any level.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.
FIG. 1 is a flow chart of the Ethereum Ponzi scheme detection method based on the improved CNN-RF of the present invention;
FIG. 2 is an architecture diagram of the CNN-RF training model of the present invention;
FIG. 3 is the confusion matrix of CNN-RF detection with SMOTE + ENN algorithm processing according to the present invention;
FIG. 4 is the confusion matrix of CNN-RF detection without SMOTE + ENN algorithm processing according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
It is noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and systems according to various embodiments of the present disclosure. It should be noted that each block in the flowchart or block diagrams may represent a module, a segment, or a portion of code, which may comprise one or more executable instructions for implementing the logical function specified in the respective embodiment. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Interpretation of terms:
CNN-RF (Convolutional Neural Network-Random Forest) refers to a detection method that fuses a convolutional neural network with a random forest model. A CNN is a feedforward neural network with a deep structure that includes convolution computations; it is one of the representative algorithms of deep learning and is used here to extract key features. RF is a classifier that trains on and predicts samples using multiple trees. Here, the features extracted by the CNN are input into the RF for training and predictive classification.
SMOTE + ENN (Synthetic Minority Oversampling Technique + Edited Nearest Neighbours) refers to an integrated hybrid sampling algorithm. SMOTE analyzes the minority-class samples and artificially synthesizes new samples from them, which are added to the data set to form a new sample set. ENN edits a data set using the nearest neighbor algorithm (KNN): among the samples to be downsampled, those whose neighbors overwhelmingly belong to a different class are removed and those whose neighbors overwhelmingly belong to the same class are retained. This cleans the newly generated sample set and ensures that the processed data do not overlap.
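The two sampling rules defined above can be written compactly. The following is a sketch in standard textbook notation rather than notation taken from the patent: $x_i$ is a minority-class sample, $x_i^{(nn)}$ is one of its $k$ nearest minority-class neighbours chosen at random, $\lambda$ is a random interpolation weight, and $N_k(x_i)$ is the set of the $k$ nearest neighbours of $x_i$ in the whole data set. The majority-vote form of the ENN rule is an assumption that matches the wording of the definition above.

```latex
% SMOTE: synthesize a new minority sample on the segment between x_i and a neighbour
x_{\text{new}} = x_i + \lambda \,\bigl(x_i^{(nn)} - x_i\bigr), \qquad \lambda \sim \mathcal{U}(0, 1)

% ENN: discard a sample whose neighbourhood mostly carries a different label
\text{remove } (x_i, y_i) \quad \text{if} \quad
\operatorname{mode}\bigl\{\, y_j : x_j \in N_k(x_i) \,\bigr\} \neq y_i
```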
Example one
As shown in FIG. 1, this embodiment provides an Ethereum Ponzi scheme detection method based on an improved CNN-RF. The embodiment is illustrated by applying the method to a server; it is understood that the method may also be applied to a terminal, or to a system that includes a terminal and a server and is implemented through interaction between the terminal and the server. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in this application. In this embodiment, the method includes the following steps:
acquiring smart contract data on Ethereum;
extracting account features and opcode features from the smart contract data, and combining the account features and the opcode features to obtain mixed features;
extracting, based on the mixed features, the key features for Ponzi scheme contract detection using a CNN feature extraction model;
and, based on the key features, obtaining a detection result indicating whether the contract is a Ponzi scheme contract using an RF classification model.
Specifically, this embodiment may be implemented by the following scheme, as shown in FIG. 2:
Step 1: Using publicly labeled contract account addresses, crawl the smart contract's internal account transaction information and the smart contract's bytecode from Etherscan. The account transaction information includes the block number, timestamp, hash value, transaction sender and transaction receiver, contract address, transaction amount, transaction balance, and the Gas value of the transaction cost. The contract bytecode is decompiled with the commonly used decompilation tool Easy Python Decompiler, with the processing programmed in Python.
Step 2: Process the acquired raw data: analyze and select features from the many account transaction features, then decompile the contract bytecode with a decompilation tool and count how often each decompiled opcode is called.
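The opcode call-frequency statistics of Step 2 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions and is not the decompilation tool named in Step 1: the function names `disassemble`, `opcode_frequencies` and `opcode_feature_vector` are hypothetical, the `OPCODE_NAMES` table covers only a small subset of EVM opcodes, and bytes outside that subset are reported as `UNKNOWN_0x..`.

```python
from collections import Counter

# Illustrative subset of EVM opcodes; a complete table would list every opcode.
OPCODE_NAMES = {
    0x00: "STOP", 0x01: "ADD", 0x15: "ISZERO", 0x34: "CALLVALUE",
    0x35: "CALLDATALOAD", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0xF1: "CALL",
    0xF3: "RETURN", 0xFD: "REVERT", 0xFF: "SELFDESTRUCT",
}


def disassemble(bytecode_hex):
    """Turn a hex bytecode string into a list of opcode mnemonics."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    ops, i = [], 0
    while i < len(code):
        byte = code[i]
        if 0x60 <= byte <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 bytes of immediate data.
            width = byte - 0x5F
            ops.append("PUSH%d" % width)
            i += width  # skip the immediate data so it is not read as opcodes
        elif 0x80 <= byte <= 0x8F:
            ops.append("DUP%d" % (byte - 0x7F))
        elif 0x90 <= byte <= 0x9F:
            ops.append("SWAP%d" % (byte - 0x8F))
        else:
            ops.append(OPCODE_NAMES.get(byte, "UNKNOWN_0x%02x" % byte))
        i += 1
    return ops


def opcode_frequencies(bytecode_hex):
    """Count how many times each opcode occurs in the bytecode."""
    return Counter(disassemble(bytecode_hex))


def opcode_feature_vector(bytecode_hex, vocabulary):
    """Fixed-order count vector, so every contract yields the same columns."""
    counts = opcode_frequencies(bytecode_hex)
    return [counts.get(name, 0) for name in vocabulary]
```

Fixing the opcode vocabulary in advance keeps the opcode feature columns aligned across contracts, which is what allows them to be combined with the account features into one mixed feature vector.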
Specifically, the number of investors, the investment amount, the ratio of investor income to investment amount, the proportion of new investors, and the maximum amount paid by the investors are obtained by processing the contract transaction times and transaction amounts. Call-frequency statistics are computed over the decompiled opcodes.
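The account features listed above could be computed along the following lines. This is a minimal sketch, not the patented procedure: the transaction record layout (`timestamp`, `sender`, `receiver`, `value`), the function name `account_features`, and the exact definitions chosen for "proportion of new investors" (share of incoming payments that come from a previously unseen address) and "maximum amount paid by the investors" (largest single incoming payment) are assumptions made for illustration.

```python
def account_features(transactions, contract):
    """Compute the five account features from a contract's transaction list.

    Each transaction is a dict with the hypothetical keys
    'timestamp', 'sender', 'receiver' and 'value'.
    """
    ordered = sorted(transactions, key=lambda t: t["timestamp"])
    seen_investors = set()
    incoming = first_time = 0
    invested = paid_out = max_payment = 0.0

    for tx in ordered:
        if tx["receiver"] == contract:
            # Money flowing into the contract counts as an investment.
            incoming += 1
            invested += tx["value"]
            max_payment = max(max_payment, tx["value"])
            if tx["sender"] not in seen_investors:
                first_time += 1
                seen_investors.add(tx["sender"])
        elif tx["sender"] == contract:
            # Money flowing out of the contract counts as investor income.
            paid_out += tx["value"]

    return {
        "num_investors": len(seen_investors),
        "total_invested": invested,
        "income_to_investment": paid_out / invested if invested else 0.0,
        "new_investor_share": first_time / incoming if incoming else 0.0,
        "max_payment": max_payment,
    }
```

Sorting by timestamp first matters only for the new-investor share, which depends on the order in which addresses first appear.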
Step 3: the account and opcode feature data is processed and stored in the features file, denoted as data set Q. The sample data ratio in Q is 12:1, there is a data imbalance problem. Aiming at the problem, the invention carries out SMOTE + ENN algorithm processing on Q and generates a new data set T.
Specifically, firstly, the SMOTE algorithm is used to perform oversampling processing on the data set Q, and there may be multiple data repetitions in the processed data. Then, the ENN data cleaning algorithm is used for carrying out deduplication operation on the data, and finally a new sample set T is generated.
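The hybrid sampling of Step 3 can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: it assumes two classes, at least two minority samples, Euclidean distance, and the majority-vote form of ENN given in the interpretation of terms; the names `smote`, `enn` and `smote_enn` and the parameters `k` and `seed` are hypothetical.

```python
import random
from collections import Counter


def _dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _neighbors(idx, points, k):
    """Indices of the k points closest to points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: _dist2(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, seed=0):
    """Oversample the minority class until both classes are the same size."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(X) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(max(deficit, 0)):
        i = rng.randrange(len(minority_pts))
        j = rng.choice(_neighbors(i, minority_pts, k))
        lam = rng.random()
        base, other = minority_pts[i], minority_pts[j]
        # Interpolate between a minority sample and one of its neighbours.
        X_new.append([a + lam * (b - a) for a, b in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new


def enn(X, y, k=3):
    """Drop every sample whose k nearest neighbours mostly disagree with it."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in _neighbors(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """SMOTE oversampling followed by ENN cleaning."""
    X_s, y_s = smote(X, y, minority, k, seed)
    return enn(X_s, y_s, k)
```

The brute-force neighbour search is quadratic in the number of samples, which is acceptable for a sketch but would normally be replaced by an indexed nearest-neighbour search on larger data sets.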
The data set T is divided in an 8:2 ratio into a training set and a test set.
Step 4: The partitioned training set is input into the CNN as the data source for key feature extraction.
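The structure of the CNN feature extractor recited in claim 1 (two convolution layers, each followed by a linear rectification function, then two fully connected layers of dimensions 16 to 8 and 8 to 2) can be summarized in formulas. The following is a sketch under stated assumptions: the symbols $x$, $h_1$, $h_2$, $u$, $z$, $W_1$, $b_1$, $W_2$, $b_2$ and $\mathcal{L}$ are notation introduced here and do not appear in the patent; kernel sizes and channel counts are not stated in the source and are left unspecified; and the loss is assumed to be the standard two-class cross-entropy over a softmax, with $y$ the one-hot label.

```latex
\begin{aligned}
h_1 &= \operatorname{ReLU}\bigl(\operatorname{Conv}_1(x)\bigr), \\
h_2 &= \operatorname{ReLU}\bigl(\operatorname{Conv}_2(h_1)\bigr), \\
u   &= W_1 \operatorname{flatten}(h_2) + b_1, & W_1 &\in \mathbb{R}^{8 \times 16}, \\
z   &= W_2\, u + b_2, & W_2 &\in \mathbb{R}^{2 \times 8}, \\
\mathcal{L} &= -\sum_{c=1}^{2} y_c \,\log \frac{e^{z_c}}{e^{z_1} + e^{z_2}} .
\end{aligned}
```

According to claim 1, the loss value is used to update the weights of the two convolution layers with the adaptive moment estimation algorithm, and the extracted features are then passed to the random forest classifier. The source does not state which layer's output is passed on.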
Step 5: The CNN feature extraction model automatically extracts the key features for Ethereum Ponzi scheme detection, so the proportion of effective features in the data set does not need to be analyzed manually, which saves time and reduces manual effort. An automatic parameter optimizer is built into the CNN feature extraction model; it continuously tunes the model's performance, and the optimal trained model is saved. The training set and the optimal CNN model are loaded into the RF model, the CNN extracts features from the training set, and the RF model is trained on the extracted features. After training, the RF model is evaluated on the test set. To make the models easier to compare, they are evaluated with three common performance metrics: accuracy, recall, and F1 score.
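The three metrics named in Step 5 follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function names `confusion_counts` and `evaluate`, and the convention that label 1 denotes a Ponzi scheme contract, are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification result."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, recall and F1 score, guarding against empty denominators."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": recall,
        "f1": f1,
    }
```

On an imbalanced data set, accuracy alone can look high even when most Ponzi scheme contracts are missed, which is why recall and F1 score are reported alongside it.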
Step 6: After model training is finished, the performance of the model is tested on the partitioned test set, and the experimental results are used to verify the advantages of the method.
Comparative analysis: To verify experimentally the effectiveness of the detection method of this embodiment, several classical machine learning models were reproduced for comparison. The experiments show that the improved CNN-RF detection method is more practical for detecting Ethereum Ponzi schemes than the other machine learning models. The experimental comparison results are shown in Table 1.
Table 1: Comparison of the detection performance of the methods
[Table 1 is provided as an image in the source; its values are not available in this text.]
A comparison experiment was also carried out between detection with and without SMOTE + ENN algorithm processing, and the confusion matrices were plotted. As shown in FIG. 3 and FIG. 4, the detection method with SMOTE + ENN processing achieves higher detection accuracy on the sample data.
The Ethereum Ponzi scheme detection method based on the improved CNN-RF not only solves the problem of repeated data after oversampling but also simplifies the key feature extraction process. Experiments show that the detection method of the embodiment is better suited to detecting Ethereum Ponzi schemes at any level.
Example two
This embodiment provides an Ethereum Ponzi scheme detection system based on an improved CNN-RF.
An Ethereum Ponzi scheme detection system based on an improved CNN-RF, comprising:
a data acquisition module configured to: acquire smart contract data on Ethereum;
a feature extraction module configured to: extract account features and opcode features from the smart contract data, and combine the account features and the opcode features to obtain mixed features;
a model processing module configured to: extract, based on the mixed features, the key features for Ponzi scheme contract detection using a CNN feature extraction model;
a detection module configured to: obtain, based on the key features, a detection result indicating whether the contract is a Ponzi scheme contract using an RF classification model.
It should be noted that the data acquisition module, the feature extraction module, the model processing module and the detection module correspond to the steps of Example one and share the same examples and application scenarios, but are not limited to what Example one discloses. It should also be noted that the modules described above, as part of a system, may be implemented in a computer system, for example as a set of computer-executable instructions.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. An Etherhouse Pompe fraudster detection method based on improved CNN-RF is characterized by comprising the following steps:
acquiring intelligent contract data on an ether workshop;
extracting account characteristics and operation code characteristics of the intelligent contract data, and combining the account characteristics and the operation code characteristics to obtain mixed characteristics;
extracting key features of the contract detection of the Pompe deception bureau by adopting a CNN feature extraction model based on the mixed features;
based on the key characteristics, adopting an RF classification model to obtain a detection result of whether the contract is a Pompe fraudster contract or not;
the convolutional neural network model is used for screening key characteristic data, and training and using of a prediction model of the random forest classifier are integrated, so that the detection accuracy rate is improved;
the acquiring of the intelligent contract data on the ether house specifically includes: acquiring account transaction information and contract running byte codes of the normal contract according to the contract address of the normal contract; acquiring account transaction information and contract running byte codes of the Pompe cheating contract according to the contract address of the Pompe cheating contract;
the extracting of the account characteristics of the intelligent contract data specifically comprises the following steps: respectively extracting normal account characteristics of the normal contract account transaction information and abnormal account characteristics of the PoE fraud contract account transaction information;
the extracting the operation code features specifically includes: decompiling the normal contract running byte codes and the Pompe fraudster contract running byte codes by adopting a decompiling tool, and carrying out calling frequency statistics on the decompiled operation codes to obtain normal operation code characteristics and abnormal operation code characteristics;
the specific process for extracting the account features of the intelligent contract data comprises the following steps: calculating and processing the number of investors, the investment amount, the income of the investors and the proportion of the investment amount, the proportion of a new investor and the maximum amount paid by the investors by adopting contract transaction time and transaction amount;
a parameter automatic optimizer is arranged in the improved CNN-RF characteristic extraction model; in the training process, the parameter automatic optimizer is used for continuously carrying out performance tuning on the model and storing the optimal training model;
the training process comprises:
constructing a feature data set based on the account features and the opcode features of the smart contract data;
performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm to obtain a new sample set;
training the CNN-RF with the samples in the new sample set to obtain the trained CNN-RF; specifically, the account features and opcode features of a smart contract are input to two sequentially connected convolutional layers to automatically extract key features, each convolutional layer being followed by a rectified linear unit (ReLU); the output of the last convolutional layer is input to two sequentially connected fully connected layers, wherein the input and output dimensions of the first fully connected layer are 16 and 8, respectively, and the input and output dimensions of the last fully connected layer are 8 and 2, respectively; after a loss value is computed from the output of the last fully connected layer using a cross-entropy loss function, the weights of the two convolutional layers are updated using the adaptive moment estimation (Adam) algorithm; and the extracted optimal features are input to a random forest classifier so that the random forest classifier carries out its own training;
performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm specifically comprises: oversampling the feature data set using the SMOTE algorithm, performing a de-duplication operation on the data using the ENN data-cleaning algorithm, and finally generating the new sample set;
introducing the SMOTE + ENN hybrid sampling algorithm improves the CNN-RF: it avoids the data overlap that arises after SMOTE oversampling, enables automatic extraction of key features, and improves detection accuracy.
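The hybrid sampling step recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the function names `smote`, `enn`, and `smote_enn`, the neighbour count `k`, and the `seed` parameter are all hypothetical, the labels are assumed to be binary, and at least two minority samples are assumed to exist. SMOTE is sketched as interpolation between a minority sample and one of its nearest minority neighbours; ENN (edited nearest neighbours) is sketched as dropping any sample whose label disagrees with the majority label of its `k` nearest neighbours, which is how samples in overlapping regions get cleaned out.

```python
import math
import random
from collections import Counter


def _nearest(idx, points, candidates, k):
    """Return the k candidate indices closest to points[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, seed=0):
    """Add interpolated minority samples until both classes have equal size."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(minority_idx)) - len(minority_idx)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(minority_idx)
        j = rng.choice(_nearest(i, X, minority_idx, k))
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Keep only samples whose label matches the majority of their k neighbours."""
    everyone = range(len(X))
    keep = []
    for i in everyone:
        votes = Counter(y[j] for j in _nearest(i, X, everyone, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_over, y_over = smote(X, y, minority, k, seed)
    return enn(X_over, y_over, k)
```

For example, `smote_enn(X, y, minority=1)` on a feature set in which label `1` marks the rarer Ponzi scheme contracts returns a balanced and cleaned sample set. In practice a maintained library implementation would likely be preferable to this hand-written version.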
2. The Ethereum Ponzi scheme detection method based on improved CNN-RF of claim 1, wherein the smart contract data is obtained from an Ethereum blockchain explorer website using a web crawler.
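Once transaction records have been collected, the account features listed in claim 1 could be derived from transaction times and amounts roughly as sketched below. The `Transfer` record, the `account_features` function, and the feature names are hypothetical. The claims do not define "new investor" precisely, so the sketch assumes it means the share of deposits made by an address that had not deposited before; the maximum amount is assumed to be the largest single deposit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    timestamp: int   # transaction time, e.g. seconds since epoch
    sender: str      # address that sent the value
    receiver: str    # address that received the value
    value: float     # transferred amount, e.g. in ether


def account_features(contract: str, transfers: list[Transfer]) -> dict[str, float]:
    """Compute simple account features for one contract address."""
    ordered = sorted(transfers, key=lambda t: t.timestamp)
    deposits = [t for t in ordered if t.receiver == contract]
    payouts = [t for t in ordered if t.sender == contract]

    # Walk deposits in time order and count those from first-time senders.
    investors: set[str] = set()
    first_time_deposits = 0
    for t in deposits:
        if t.sender not in investors:
            first_time_deposits += 1
            investors.add(t.sender)

    invested = sum(t.value for t in deposits)
    # Only payouts that go back to known investors count as investor income.
    paid_out = sum(t.value for t in payouts if t.receiver in investors)

    return {
        "investor_count": len(investors),
        "invested_total": invested,
        "payout_ratio": paid_out / invested if invested else 0.0,
        "new_investor_ratio": first_time_deposits / len(deposits) if deposits else 0.0,
        "max_deposit": max((t.value for t in deposits), default=0.0),
    }
```

Each contract then contributes one row of account features, which can be concatenated with its opcode features to form the mixed feature vector.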
3. An Ethereum Ponzi scheme detection system based on improved CNN-RF, comprising:
a data acquisition module configured to: acquire smart contract data on Ethereum;
a feature extraction module configured to: extract account features and opcode features of the smart contract data, and combine the account features and the opcode features to obtain mixed features;
a model processing module configured to: extract, based on the mixed features, key features for Ponzi scheme contract detection using a CNN feature extraction model;
a detection module configured to: obtain, based on the key features and using an RF classification model, a detection result indicating whether the contract is a Ponzi scheme contract;
the convolutional neural network model is used to screen key feature data and is fused with the training and use of the prediction model of the random forest classifier, thereby improving detection accuracy;
acquiring the smart contract data on Ethereum specifically comprises: acquiring the account transaction information and contract runtime bytecode of a normal contract according to the contract address of the normal contract; and acquiring the account transaction information and contract runtime bytecode of a Ponzi scheme contract according to the contract address of the Ponzi scheme contract;
extracting the account features of the smart contract data specifically comprises: extracting normal account features from the account transaction information of normal contracts and abnormal account features from the account transaction information of Ponzi scheme contracts, respectively;
extracting the opcode features specifically comprises: decompiling the runtime bytecode of the normal contracts and of the Ponzi scheme contracts using a decompilation tool, and counting the call frequency of the decompiled opcodes to obtain normal opcode features and abnormal opcode features;
the improved CNN-RF feature extraction model includes an automatic parameter optimizer; during training, the automatic parameter optimizer continuously tunes the performance of the model and saves the best trained model;
the training process comprises:
constructing a feature data set based on the account features and the opcode features of the smart contract data;
performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm to obtain a new sample set;
training the CNN-RF with the samples in the new sample set to obtain the trained CNN-RF; specifically, the account features and opcode features of a smart contract are input to two sequentially connected convolutional layers to automatically extract key features, each convolutional layer being followed by a rectified linear unit (ReLU); the output of the last convolutional layer is input to two sequentially connected fully connected layers, wherein the input and output dimensions of the first fully connected layer are 16 and 8, respectively, and the input and output dimensions of the last fully connected layer are 8 and 2, respectively; after a loss value is computed from the output of the last fully connected layer using a cross-entropy loss function, the weights of the two convolutional layers are updated using the adaptive moment estimation (Adam) algorithm; and the extracted optimal features are input to a random forest classifier so that the random forest classifier carries out its own training;
performing hybrid sampling on the feature data set using the SMOTE + ENN algorithm specifically comprises: oversampling the feature data set using the SMOTE algorithm, performing a de-duplication operation on the data using the ENN data-cleaning algorithm, and finally generating the new sample set;
extracting the account features of the smart contract data specifically comprises: using the contract transaction times and transaction amounts to compute the number of investors, the investment amount, the ratio of investor income to investment amount, the proportion of new investors, and the maximum amount paid by investors;
introducing the SMOTE + ENN hybrid sampling algorithm improves the CNN-RF: it avoids the data overlap that arises after SMOTE oversampling, enables automatic extraction of key features, and improves detection accuracy.
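The opcode feature extraction recited in claim 3 amounts to turning runtime bytecode into a table of opcode counts. The sketch below is a hand-written stand-in for the decompilation tool, which the claims do not name; the function names and the deliberately partial `OPCODES` table are assumptions for illustration. It walks the bytecode one instruction at a time and skips the immediate data that follows each `PUSH1` to `PUSH32`, so that pushed constants are not miscounted as instructions.

```python
from collections import Counter

# Partial EVM opcode table (illustrative only); unlisted bytes are reported
# as UNKNOWN_0x.. rather than guessed.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x03: "SUB", 0x04: "DIV",
    0x10: "LT", 0x11: "GT", 0x14: "EQ", 0x15: "ISZERO",
    0x33: "CALLER", 0x34: "CALLVALUE", 0x35: "CALLDATALOAD",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF1: "CALL", 0xF3: "RETURN", 0xFD: "REVERT", 0xFF: "SELFDESTRUCT",
}


def opcode_name(byte: int) -> str:
    """Map one bytecode byte to its mnemonic."""
    if 0x60 <= byte <= 0x7F:
        return f"PUSH{byte - 0x5F}"
    if 0x80 <= byte <= 0x8F:
        return f"DUP{byte - 0x7F}"
    if 0x90 <= byte <= 0x9F:
        return f"SWAP{byte - 0x8F}"
    return OPCODES.get(byte, f"UNKNOWN_0x{byte:02x}")


def opcode_counts(bytecode_hex: str) -> Counter:
    """Count how often each opcode occurs in hex-encoded runtime bytecode."""
    code = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    counts: Counter = Counter()
    pc = 0
    while pc < len(code):
        byte = code[pc]
        counts[opcode_name(byte)] += 1
        # PUSHn carries n bytes of immediate data that are not instructions.
        pc += 1 + (byte - 0x5F if 0x60 <= byte <= 0x7F else 0)
    return counts


def opcode_frequencies(bytecode_hex: str) -> dict[str, float]:
    """Normalise opcode counts so that they sum to 1."""
    counts = opcode_counts(bytecode_hex)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}
```

Calling `opcode_counts` on each contract's bytecode and aligning the results over a fixed opcode vocabulary would give one opcode feature vector per contract.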
CN202210401495.3A 2022-04-18 2022-04-18 Ethereum Ponzi scheme detection method and system based on improved CNN-RF Active CN114511330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210401495.3A CN114511330B (en) 2022-04-18 2022-04-18 Ethereum Ponzi scheme detection method and system based on improved CNN-RF

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210401495.3A CN114511330B (en) 2022-04-18 2022-04-18 Ethereum Ponzi scheme detection method and system based on improved CNN-RF

Publications (2)

Publication Number Publication Date
CN114511330A CN114511330A (en) 2022-05-17
CN114511330B CN114511330B (en) 2022-12-13

Family

ID=81554752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210401495.3A Active CN114511330B (en) 2022-04-18 2022-04-18 Ethereum Ponzi scheme detection method and system based on improved CNN-RF

Country Status (1)

Country Link
CN (1) CN114511330B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837512A (en) * 2020-06-23 2021-12-24 中国移动通信集团辽宁有限公司 Abnormal user identification method and device
CN117473170B (en) * 2023-12-27 2024-04-09 布比(北京)网络技术有限公司 Intelligent contract template recommendation method and device based on code characterization and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066553A (en) * 2017-03-24 2017-08-18 北京工业大学 Short text classification method based on convolutional neural networks and random forest
CN113127933A (en) * 2021-03-22 2021-07-16 西北大学 Smart contract Ponzi scheme detection method and system based on graph matching network
CN113191775A (en) * 2021-04-22 2021-07-30 深圳前海移联科技有限公司 Ponzi scheme smart contract detection method based on Ethereum transaction time-series information
CN113362071A (en) * 2021-06-21 2021-09-07 浙江工业大学 Ponzi scheme identification method and system for the Ethereum platform
CN113806746A (en) * 2021-09-24 2021-12-17 沈阳理工大学 Malicious code detection method based on improved CNN network

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901977A (en) * 2020-06-22 2022-01-07 中国电力科学研究院有限公司 Deep learning-based power consumer electricity stealing identification method and system
US11907955B2 (en) * 2020-08-28 2024-02-20 Anchain.ai Inc. System and method for blockchain automatic tracing of money flow using artificial intelligence
CN111815330A (en) * 2020-08-31 2020-10-23 支付宝(杭州)信息技术有限公司 Method for deploying intelligent contract, block chain node and storage medium
CN112631611B (en) * 2021-01-06 2022-06-03 中山大学 Ponzi scheme smart contract identification method and device
CN112967063B (en) * 2021-04-06 2024-02-02 北京工商大学 Ponzi scheme detection method for the Ethereum platform
CN113469730A (en) * 2021-06-08 2021-10-01 北京化工大学 Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene
CN113591924A (en) * 2021-07-01 2021-11-02 中山大学新华学院 Phishing number detection method, system, storage medium and terminal equipment
CN113783852B (en) * 2021-08-29 2022-12-20 复旦大学 Smart contract Ponzi scheme detection algorithm based on neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Dropout Prediction Optimization through SMOTE and Ensemble Learning"; Eko Mulyani et al.; 《2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)》; 20200316; full text *
On the identification and prevention of "Ponzi schemes" in Internet finance; 汪谦生; 《计算机产品与流通》; 20170915 (No. 09); full text *

Also Published As

Publication number Publication date
CN114511330A (en) 2022-05-17

Similar Documents

Publication Publication Date Title
CN114511330B (en) Ethereum Ponzi scheme detection method and system based on improved CNN-RF
CN107316198B (en) Account risk identification method and device
US11580560B2 (en) Identity resolution for fraud ring detection
CN106599686A (en) Malware clustering method based on TLSH character representation
CN111325619A (en) Credit card fraud detection model updating method and device based on joint learning
CN110837963A (en) Risk control platform construction method based on data, model and strategy
Li et al. Intelligent anti-money laundering solution based upon novel community detection in massive transaction networks on spark
CN105740667A (en) User behavior based information identification method and apparatus
Luo et al. A CNN-based Approach to the Detection of SQL Injection Attacks
CN113657896A (en) Block chain transaction topological graph analysis method and device based on graph neural network
CN106681980B (en) Spam message analysis method and device
CN113268370A (en) Root cause alarm analysis method, system, equipment and storage medium
CN112435033A (en) System and method for realizing financial anti-fraud rule engine
Li et al. Computer network security evaluation model based on neural network
CN111931848A (en) Data feature extraction method and device, computer equipment and storage medium
CN107871055A (en) Data analysis method and device
CN113487241A (en) Method, device, equipment and storage medium for classifying enterprise environment-friendly credit grades
CN103761298B (en) Distributed-architecture-based entity matching method
CN109977977A (en) Method and corresponding device for identifying potential users
CN105991609B (en) Risk event determination method and device
CN117035207A (en) Method for planning path of securicar, method and device for training path prediction model
US20220398264A1 (en) Systems and methods for streaming classification of distributed ledger-based activities
CN113742495B (en) Rating feature weight determining method and device based on prediction model and electronic equipment
CN110032596B (en) Method and system for identifying abnormal traffic user
CN112561538B (en) Risk model creation method, apparatus, computer device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant