CN112949778A - Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment - Google Patents
- Publication number
- CN112949778A (application CN202110414897.2A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- characteristic
- code
- transaction
- intelligent contract
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Business, Economics & Management (AREA)
- Mathematical Physics (AREA)
- Finance (AREA)
- Mathematical Analysis (AREA)
- Accounting & Taxation (AREA)
- General Engineering & Computer Science (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Algebra (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an intelligent contract classification method, system and electronic equipment based on locality sensitive hashing. A code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts. Locality sensitive hashing is applied to each matrix to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the contracts' code information and transaction information respectively. The two hash matrices are then spliced to obtain the final intelligent contract locality sensitive hash matrix. The method needs no labeled data to train a model, requires no manual data labeling, and avoids the dependence on large numbers of training samples that training a machine learning classification model entails. It also adapts better to new types of intelligent contracts.
Description
Technical Field
The invention relates to the field of intelligent contract classification methods for block chains, in particular to an intelligent contract classification method and system based on locality sensitive hashing and electronic equipment.
Background
With the growing popularity of networking and information technology, the amount of data generated on the internet is growing exponentially. On a blockchain, thousands of smart contracts are deployed each day. Investigations have found that many people want an overall picture of the intelligent contracts on a blockchain, because it affects their current work. For example, intelligent contract developers need information about all the contracts on the chain to guide which types of contract they develop, so that popular contracts can be developed more smoothly. Intelligent contract researchers also need comprehensive information to guide their research direction: they pay more attention to the security of widely used contract types and to whether new contracts have security problems, since contracts are closely tied to digital assets and must be free of vulnerabilities to avoid property loss. Learning the current state of intelligent contracts by manually reading each contract is impractical, because reading cannot keep pace with the speed of contract deployment. A technique for quickly clustering intelligent contracts is therefore urgently needed, so that people can obtain an overall picture of the intelligent contracts on the blockchain more quickly.
Existing intelligent contract classification methods include ones based on machine learning, but these need a large amount of labeled data to train a model, the labeling process consumes manpower, and they adapt poorly to new types of contract. Because intelligent contracts running on the block chain keep emerging, the set of known contract types is certainly not comprehensive. When a novel intelligent contract appears, the model can only assign it to one of the existing types and cannot create a new type, so identification and classification are inaccurate.
Disclosure of Invention
The invention provides an intelligent contract classification method, system and electronic equipment based on locality sensitive hashing, and aims to solve the problems that an intelligent contract classification method based on machine learning in the existing block chain is difficult in model training and inaccurate in classification.
According to the embodiment of the application, the intelligent contract classification method based on the locality sensitive hashing is provided, and comprises the following steps: step S1: acquiring a plurality of intelligent contracts based on a block chain, and extracting code information and transaction information in each intelligent contract; step S2: constructing a code characteristic matrix based on the code information, and constructing a transaction characteristic matrix based on the transaction information; step S3: performing local sensitive hashing on the code characteristic matrix to obtain a code characteristic local sensitive hashing matrix, and performing local sensitive hashing on the transaction characteristic matrix to obtain a transaction characteristic local sensitive hashing matrix; step S4: splicing the code characteristic local sensitive hash matrix and the transaction characteristic local sensitive hash matrix to obtain an intelligent contract local sensitive hash matrix; and step S5: and classifying the vectors of each line based on the intelligent contract local sensitive hash matrix to obtain various intelligent contracts of different types.
Preferably, the step S1 specifically includes: step S11: acquiring a plurality of intelligent contracts of a block chain, and generating an abstract syntax tree by all the intelligent contracts; and step S12: and traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and respectively acquiring corresponding code information and transaction information.
Preferably, the step S2 specifically includes: step S21: obtaining corresponding code feature matrixes in all intelligent contracts based on nodes of the abstract syntax tree and corresponding code information; and step S22: and obtaining corresponding transaction characteristic matrixes in all intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
Preferably, the step S21 specifically includes: step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and step S212: and counting the types of the residual nodes and constructing a corresponding code characteristic matrix in the intelligent contract.
Preferably, the step S3 specifically includes: step S31: generating a random matrix, wherein the row number of the random matrix is equal to the column number of the code characteristic matrix, and the column number is a user preset value; step S32: multiplying the code characteristic matrix by the random matrix to obtain a code characteristic random matrix, and carrying out normalization processing on the code characteristic random matrix to obtain a code characteristic local sensitive hash matrix; and step S33: multiplying the transaction characteristic matrix by a random matrix to obtain a transaction characteristic random matrix, and carrying out normalization processing on the transaction characteristic random matrix to obtain a transaction characteristic local sensitive hash matrix.
The invention also provides an intelligent contract classification system based on locality sensitive hashing, which comprises: the intelligent contract acquisition unit is used for acquiring a plurality of intelligent contracts based on the block chain and extracting code information and transaction information in each intelligent contract; the characteristic matrix constructing unit is used for constructing a code characteristic matrix based on the code information and constructing a transaction characteristic matrix based on the transaction information; the local sensitive hash unit is used for carrying out local sensitive hash on the code characteristic matrix to obtain a code characteristic local sensitive hash matrix and carrying out local sensitive hash on the transaction characteristic matrix to obtain a transaction characteristic local sensitive hash matrix; the matrix splicing unit is used for splicing the code characteristic local sensitive hash matrix and the transaction characteristic local sensitive hash matrix to obtain an intelligent contract local sensitive hash matrix; and the contract classification unit is used for classifying the vectors in each row based on the intelligent contract local sensitive hash matrix to obtain various intelligent contracts of different types.
Preferably, the intelligent contract obtaining unit further includes: the syntax tree construction unit is used for acquiring a plurality of intelligent contracts of the block chain and generating an abstract syntax tree from all the intelligent contracts; and the node traversing unit is used for traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and respectively acquiring corresponding code information and transaction information.
Preferably, the locality-sensitive hashing unit further comprises: a random matrix generating unit for generating a random matrix, the row number of which is equal to the column number of the code characteristic matrix and the column number of which is a user preset value; a code matrix dot multiplication unit for multiplying the code characteristic matrix by the random matrix to obtain a code characteristic random matrix, and carrying out normalization processing on the code characteristic random matrix to obtain a code characteristic local sensitive hash matrix; and a transaction matrix dot multiplication unit for multiplying the transaction characteristic matrix by a random matrix to obtain a transaction characteristic random matrix, and carrying out normalization processing on the transaction characteristic random matrix to obtain a transaction characteristic local sensitive hash matrix.
The invention also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the computer program is set to execute the intelligent contract classification method based on locality sensitive hashing in any item when running; the processor is configured to execute, by the computer program, the smart contract classification method based on locality-sensitive hashing according to any one of the above.
The intelligent contract classification method, system and electronic equipment based on locality sensitive hashing have the following beneficial effects:
1. A code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts, and locality sensitive hashing is applied to each to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the contracts' code information and transaction information respectively. The two hash matrices are then spliced to obtain the final intelligent contract locality sensitive hash matrix. Compared with clustering intelligent contracts by machine learning, the method needs no labeled data to train a model, requires no manual data labeling, and avoids the dependence on large numbers of training samples that training a machine learning classification model entails. The method also adapts better to new types of intelligent contracts: a machine learning method must learn from new samples again to refine its features, whereas this method only needs to compute the locality sensitive hash value of the new contract by the same rule. Because that value differs from every existing locality sensitive hash value, the contract falls into a class of its own, and the user can then inspect it to determine the new type. Furthermore, because locality sensitive hashing computes intelligent contract similarity linearly and classification operates on matrices, classification is more efficient and better suited to large-scale, frequent intelligent contract clustering. In addition, the method combines code information with transaction information, so the features it uses are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
2. Removing repeated node types, so that repeated nodes are counted as a single type, reduces subsequent calculation and improves classification efficiency.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a locality sensitive hash-based intelligent contract classification method according to a first embodiment of the present invention.
Fig. 2 is a flowchart of step S1 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 3 is a flowchart of step S2 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 4 is a flowchart of step S21 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 5 is a flowchart of step S3 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 6 is a block diagram of an intelligent contract classification system based on locality sensitive hashing according to a second embodiment of the present invention.
Fig. 7 is a block diagram of the intelligent contract acquisition unit in the locality sensitive hash-based intelligent contract classification system according to the second embodiment of the present invention.
Fig. 8 is a block diagram of a locality sensitive hash unit in the locality sensitive hash-based intelligent contract classification system according to the second embodiment of the present invention.
Fig. 9 is a block diagram of an electronic device according to a third embodiment of the present invention.
Description of reference numerals:
1. an intelligent contract acquisition unit; 2. a feature matrix construction unit; 3. a locality sensitive hash unit; 4. a matrix splicing unit; 5. a contract classification unit;
11. a syntax tree construction unit; 12. a node traversing unit; 31. a random matrix generating unit; 32. a code matrix dot multiplication unit; 33. a transaction matrix dot multiplication unit;
10. a memory; 20. a processor.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
Referring to fig. 1, a first embodiment of the present invention discloses an intelligent contract classification method based on locality sensitive hashing, which includes the following steps:
Step S1: acquiring a plurality of intelligent contracts based on the block chain, and extracting code information and transaction information in each intelligent contract.
Step S2: constructing a code characteristic matrix based on the code information, and constructing a transaction characteristic matrix based on the transaction information.
Step S3: carrying out local sensitive hashing on the code characteristic matrix to obtain a code characteristic local sensitive hashing matrix, and carrying out local sensitive hashing on the transaction characteristic matrix to obtain a transaction characteristic local sensitive hashing matrix.
Step S4: splicing the code characteristic local sensitive hash matrix and the transaction characteristic local sensitive hash matrix to obtain an intelligent contract local sensitive hash matrix; and
Step S5: classifying the vectors of each row based on the intelligent contract local sensitive hash matrix to obtain intelligent contracts of various different types.
It is understood that in step S1, an intelligent contract is an event-driven, stateful program that runs on the blockchain and can hold assets on the ledger. A user may write asset transaction logic into an intelligent contract to effect value transfer. The transaction information refers to information stored in transactions, such as the transfer amount; the code information refers to the programming-language constructs used by the intelligent contract, such as conditional branches, loops and transfers.
It is understood that, in step S2, the code feature matrix records, in matrix form, whether each code feature is present in each of the intelligent contracts collected so far; the transaction feature matrix does the same for transaction features.
It is understood that in step S3, locality sensitive hashing is a method for fast clustering of large-scale data. Its basic idea is that if two data points are similar in the original data space, they remain highly similar after each is transformed by the hash function. A locality sensitive hashing algorithm is adopted here to make the comparison of massive numbers of intelligent contracts tractable: the contracts are transformed by the hash function while their original similarity is preserved.
It can be understood that, in step S4, the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix are spliced so that intelligent contracts are classified on the combination of code information and transaction information, which improves classification accuracy.
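The splicing in step S4 can be pictured with a small sketch. It assumes that "splicing" means row-wise horizontal concatenation, so that each row still describes one contract; the function name `splice` and the sample values are hypothetical and not taken from the patent.

```python
from typing import List

Matrix = List[List[int]]

def splice(h_code: Matrix, h_tx: Matrix) -> Matrix:
    """Concatenate two hash matrices row by row (assumed meaning of 'splicing')."""
    if len(h_code) != len(h_tx):
        raise ValueError("both matrices need one row per contract")
    # Row i of the result is row i of h_code followed by row i of h_tx.
    return [code_row + tx_row for code_row, tx_row in zip(h_code, h_tx)]

# Illustrative values only: 3 contracts, 2 code hash bits, 3 transaction hash bits.
H_code = [[1, 0], [1, 0], [0, 0]]
H_tx = [[0, 1, 1], [1, 1, 0], [0, 1, 1]]
H = splice(H_code, H_tx)
for row in H:
    print(row)
```

With this layout, two contracts only share a spliced row when both their code hash and their transaction hash agree.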
It is to be understood that, in step S5, the vectors in each row of the smart contract locality-sensitive hash matrix represent locality-sensitive hash values of one smart contract, and different types of smart contracts can be accurately classified based on similarity of the vectors.
It is to be understood that, in step S5, the ith row of the locality-sensitive hash matrix H' represents the locality-sensitive hash value of the ith intelligent contract, and intelligent contracts (highly-similar intelligent contracts) having the same locality-sensitive hash value are grouped into one class. The user may randomly pick an intelligent contract from each cluster to deduce the type of intelligent contract in the cluster. If a new locality sensitive hash value is found, it means that a new intelligent contract class appears on the blockchain.
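The grouping rule described for step S5 (contracts with the same locality sensitive hash value form one class, and an unseen value signals a new class) might be sketched as follows. The contract identifiers, the sample hash rows and the helper `group_by_hash` are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def group_by_hash(
    contract_ids: List[str], hash_matrix: List[List[int]]
) -> Dict[Tuple[int, ...], List[str]]:
    """Group contracts whose hash rows are identical."""
    clusters: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for contract_id, row in zip(contract_ids, hash_matrix):
        # A tuple is hashable, so the whole row can serve as the cluster key.
        clusters[tuple(row)].append(contract_id)
    return dict(clusters)

# Hypothetical contracts and hash rows.
ids = ["contract_a", "contract_b", "contract_c"]
H = [[1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]
clusters = group_by_hash(ids, H)

# A hash value that matches no existing cluster indicates a new contract class.
new_hash = (1, 1, 1, 1)
is_new_class = new_hash not in clusters
print(clusters, is_new_class)
```

Because grouping is a single pass with dictionary lookups, its cost grows linearly with the number of contracts.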
It will be appreciated that a code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts, and that locality sensitive hashing is applied to each to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the contracts' code information and transaction information respectively. The two hash matrices are then spliced to obtain the final intelligent contract locality sensitive hash matrix. Compared with clustering intelligent contracts by machine learning, the method needs no labeled data to train a model, requires no manual data labeling, and avoids the dependence on large numbers of training samples that training a machine learning classification model entails. The method also adapts better to new types of intelligent contracts: a machine learning method must learn from new samples again to refine its features, whereas this method only needs to compute the locality sensitive hash value of the new contract by the same rule. Because that value differs from every existing locality sensitive hash value, the contract falls into a class of its own, and the user can then inspect it to determine the new type.
Furthermore, because locality sensitive hashing computes intelligent contract similarity linearly and classification operates on matrices, classification is more efficient and better suited to large-scale, frequent intelligent contract clustering. In addition, the method combines code information with transaction information, so the features it uses are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
Referring to fig. 2, the step S1 specifically includes:
Step S11: acquiring a plurality of intelligent contracts of a block chain, and generating an abstract syntax tree from all the intelligent contracts; and
Step S12: traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and respectively acquiring corresponding code information and transaction information.
It is to be understood that in step S11, specifically, an abstract syntax tree is generated for all intelligent contracts using the solid-Antlr 4 tool, where the abstract syntax tree has a plurality of nodes, and each node represents a syntax structure in the contract.
It is understood that in step S12, the abstract syntax tree is traversed, and the information in each node in the syntax tree is parsed and recorded, including querycoctrbalance representing a query contract balance operation, Transfer representing a Transfer operation, IfStatement representing a conditional statement, and so on.
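As a rough illustration of the traversal in step S12, the sketch below walks a tree and records the distinct node types it meets. The tree shape (a dict with a "type" string and an optional "children" list) is a hypothetical stand-in for real parser output, and the node names other than IfStatement and Transfer are assumptions chosen for the example.

```python
from typing import Any, Dict, Iterator, Set

Node = Dict[str, Any]  # hypothetical shape: {"type": str, "children": [Node, ...]}

def iter_nodes(root: Node) -> Iterator[Node]:
    """Yield every node of the tree in depth-first, pre-order."""
    stack = [root]
    while stack:
        current = stack.pop()
        yield current
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(current.get("children", [])))

def node_types(root: Node) -> Set[str]:
    """Collect the distinct node types of one contract; duplicates collapse."""
    return {node["type"] for node in iter_nodes(root)}

example_ast: Node = {
    "type": "ContractDefinition",
    "children": [
        {"type": "FunctionDefinition", "children": [
            {"type": "IfStatement", "children": [{"type": "Transfer"}]},
            {"type": "IfStatement"},
        ]},
    ],
}

types_found = node_types(example_ast)
print(sorted(types_found))
```

Using a set means a node type that occurs many times in one contract is recorded once, which matches the removal of repeated nodes described for step S211.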
It is understood that the transaction information is also obtained according to the above steps S11-S12, and will not be described herein.
It is understood that steps S11-S12 are only one embodiment of this example, and the embodiment is not limited to steps S11-S12.
Referring to fig. 3, the step S2 specifically includes:
Step S21: obtaining corresponding code feature matrixes in all intelligent contracts based on nodes of the abstract syntax tree and corresponding code information; and
Step S22: obtaining corresponding transaction characteristic matrixes in all intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
It can be understood that in step S21, the node types present in each contract are counted, giving a 0/1 code feature matrix M_c of size z × m, where z is the number of contracts and m is the number of node types. The (i, j)-th element of M_c indicates whether the i-th intelligent contract has the j-th node type: it equals 1 if so and 0 otherwise.
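A minimal sketch of building such a 0/1 matrix is given below. It assumes the node types of each contract are already available as sets; the contract names, the sample node types and the choice to sort rows and columns for reproducibility are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

def build_feature_matrix(
    contract_types: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return (contract ids, node types, z x m matrix of 0/1 entries)."""
    contracts = sorted(contract_types)
    # Union of all node types; sorting fixes the column order.
    columns = sorted(set().union(*contract_types.values()))
    matrix = [
        [1 if node_type in contract_types[c] else 0 for node_type in columns]
        for c in contracts
    ]
    return contracts, columns, matrix

# Hypothetical input: node types found in three contracts.
contract_types = {
    "contract_a": {"IfStatement", "Transfer"},
    "contract_b": {"IfStatement"},
    "contract_c": {"Transfer", "WhileStatement"},
}
contracts, columns, M_c = build_feature_matrix(contract_types)
print(columns)
for name, row in zip(contracts, M_c):
    print(name, row)
```

Here z = 3 and m = 3, and entry (i, j) is 1 exactly when contract i contains node type j.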
It is understood that steps S21-S22 are only one embodiment of this example, and the embodiment is not limited to steps S21-S22.
Optionally, referring to fig. 4, as an embodiment, the step S21 specifically includes:
Step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and
Step S212: counting the types of the remaining nodes and constructing a corresponding code characteristic matrix in the intelligent contract.
It can be understood that in step S211, the repeated node types are removed, and the repeated nodes are classified into the same class, so that the subsequent calculation is reduced, and the classification efficiency is improved.
Referring to fig. 5, the step S3 specifically includes:
Step S31: generating a random matrix, wherein the row number of the random matrix is equal to the column number of the code characteristic matrix, and the column number is a user preset value.
Step S32: multiplying the code characteristic matrix by the random matrix to obtain a code characteristic random matrix, and carrying out normalization processing on the code characteristic random matrix to obtain a code characteristic locality sensitive hash matrix; and
Step S33: multiplying the transaction characteristic matrix by a random matrix to obtain a transaction characteristic random matrix, and carrying out normalization processing on the transaction characteristic random matrix to obtain a transaction characteristic local sensitive hash matrix.
It is understood that in step S31, a 0/1 random matrix V_c of size m × r is generated, where m is the number of columns of the code feature matrix M_c (the number of node types, as the z × r product in step S32 requires) and r sets how strict the clustering is: a larger r gives stricter clustering and a smaller r gives looser clustering.
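One way to generate such a matrix is sketched below. The text does not state how the 0/1 entries are drawn, so independent uniform draws and a fixed seed are assumptions, and `random_binary_matrix` is a hypothetical helper name.

```python
import random
from typing import List

def random_binary_matrix(m: int, r: int, seed: int = 0) -> List[List[int]]:
    """Return an m x r matrix of independent 0/1 entries (assumed uniform)."""
    rng = random.Random(seed)  # private generator, so the result is reproducible
    return [[rng.randint(0, 1) for _ in range(r)] for _ in range(m)]

# m = number of node types, r = user-chosen number of hash bits.
V_c = random_binary_matrix(m=3, r=4, seed=42)
for row in V_c:
    print(row)
```

Hash values are only comparable when they come from the same random matrix, so the matrix (or its seed) would need to be kept and reused when contracts are hashed later.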
It is to be understood that in step S32, the code feature matrix M_c is multiplied by the random matrix V_c (a matrix product) to obtain the code feature random matrix H_c of size z × r. H_c is then normalized: each element is set to 1 if it is greater than the threshold t and to 0 otherwise, giving the final code feature locality sensitive hash matrix H'_c. The i-th row vector of H'_c is the locality sensitive hash value of the code features of the i-th intelligent contract. Normalizing H_c into H'_c speeds up clustering.
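The multiplication and thresholding described for step S32 can be worked through on a tiny example. The matrices below are hand-picked rather than random so the arithmetic is easy to follow, and the threshold t = 1 is an arbitrary illustrative choice; the helper names `matmul` and `binarize` are hypothetical.

```python
from typing import List

Matrix = List[List[int]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product: (z x m) times (m x r) gives (z x r)."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("column count of a must equal row count of b")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def binarize(h: Matrix, t: int) -> Matrix:
    """Set each element to 1 if it is greater than t, otherwise to 0."""
    return [[1 if value > t else 0 for value in row] for row in h]

M_c = [[1, 1, 0],   # contract 0
       [1, 1, 0],   # contract 1, same node types as contract 0
       [0, 0, 1]]   # contract 2
V_c = [[1, 0],
       [1, 1],
       [0, 1]]      # m = 3 rows, r = 2 columns

H_c = matmul(M_c, V_c)           # [[2, 1], [2, 1], [0, 1]]
H_c_prime = binarize(H_c, t=1)   # [[1, 0], [1, 0], [0, 0]]
print(H_c, H_c_prime)
```

Contracts 0 and 1 have identical feature rows and therefore identical hash rows, while contract 2 receives a different one.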
It is understood that steps S31-S33 are only one implementation of this embodiment, and this embodiment is not limited to steps S31-S33.
Referring to Fig. 6, a second embodiment of the present invention provides an intelligent contract classification system based on locality sensitive hashing, which uses the intelligent contract classification method based on locality sensitive hashing provided in the first embodiment and includes:
The intelligent contract acquisition unit 1, used for acquiring a plurality of blockchain-based intelligent contracts and extracting the code information and transaction information in each intelligent contract.
The feature matrix construction unit 2, used for constructing a code feature matrix based on the code information and constructing a transaction feature matrix based on the transaction information.
The locality sensitive hashing unit 3, used for performing locality sensitive hashing on the code feature matrix to obtain a code feature locality sensitive hash matrix, and performing locality sensitive hashing on the transaction feature matrix to obtain a transaction feature locality sensitive hash matrix.
The matrix splicing unit 4, used for splicing the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix to obtain an intelligent contract locality sensitive hash matrix.
The contract classification unit 5, used for classifying the vector of each row based on the intelligent contract locality sensitive hash matrix to obtain intelligent contracts of a plurality of different types.
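The work of the matrix splicing unit and the contract classification unit can be illustrated with a short sketch. The function names and toy hash rows are hypothetical, and grouping contracts by exactly equal row vectors is an assumption: the text only says that the row vectors are classified, without fixing the rule.

```python
from collections import defaultdict

def splice(code_hashes, transaction_hashes):
    # Concatenate each contract's code hash row with its transaction hash row.
    return [c + t for c, t in zip(code_hashes, transaction_hashes)]

def classify(contract_hashes):
    # Assumed rule: contracts whose spliced row vectors are identical
    # fall into the same class.
    groups = defaultdict(list)
    for index, row in enumerate(contract_hashes):
        groups[tuple(row)].append(index)
    return list(groups.values())

# Toy hash matrices for three contracts (values are illustrative only).
H_code = [[1, 0], [0, 1], [1, 0]]
H_tx = [[1, 1], [0, 0], [1, 1]]
classes = classify(splice(H_code, H_tx))  # [[0, 2], [1]]
```

Grouping by a hashable key takes one pass over the rows, which is in line with the linear-time behavior that the description attributes to the method.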
Referring to Fig. 7, the intelligent contract acquisition unit 1 further includes:
A syntax tree construction unit 11, configured to acquire a plurality of intelligent contracts of a blockchain and generate an abstract syntax tree for all the intelligent contracts.
A node traversing unit 12, configured to traverse the abstract syntax tree, analyze and record the information of each node on the abstract syntax tree, and obtain the corresponding code information and transaction information respectively.
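A traversal of the kind performed by the node traversing unit might look like the sketch below. The text does not name a parser or an AST format, so the JSON-like layout used here (nested dictionaries carrying a `nodeType` key) and the function name `collect_node_types` are assumptions chosen for illustration.

```python
def collect_node_types(node):
    """Depth-first walk over a JSON-like AST, returning every node type seen."""
    found = []
    if isinstance(node, dict):
        if "nodeType" in node:
            found.append(node["nodeType"])
        for child in node.values():
            found.extend(collect_node_types(child))
    elif isinstance(node, list):
        for child in node:
            found.extend(collect_node_types(child))
    # Strings, numbers and None are leaves and contribute nothing.
    return found

# Hypothetical AST of a contract with a single function.
toy_ast = {
    "nodeType": "SourceUnit",
    "nodes": [{
        "nodeType": "ContractDefinition",
        "name": "Token",
        "nodes": [{
            "nodeType": "FunctionDefinition",
            "name": "transfer",
            "body": {"nodeType": "Block", "statements": []},
        }],
    }],
}
node_types = collect_node_types(toy_ast)
```

The recorded node types are the raw material for the code feature matrix; other node attributes could be recorded in the same pass.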
Referring to Fig. 8, the locality sensitive hashing unit 3 further includes:
A random matrix generating unit 31, configured to generate a random matrix, where the number of rows of the random matrix is equal to the number of rows of the code feature matrix, and the number of columns is a user preset value.
A code matrix dot multiplication unit 32, configured to dot-multiply the code feature matrix by the random matrix to obtain a code feature random matrix, and normalize the code feature random matrix to obtain a code feature locality sensitive hash matrix.
A transaction matrix dot multiplication unit 33, configured to dot-multiply the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and normalize the transaction feature random matrix to obtain a transaction feature locality sensitive hash matrix.
Referring to Fig. 9, a third embodiment of the present invention provides an electronic device. The electronic device includes a memory 10 and a processor 20, and the memory 10 stores a computer program configured to execute, when running, the steps in any one of the embodiments of the intelligent contract classification method based on locality sensitive hashing. The processor 20 is configured to execute, by means of the computer program, the steps in any one of the embodiments of the intelligent contract classification method based on locality sensitive hashing.
Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.
The intelligent contract classification method, system and electronic equipment based on locality sensitive hashing have the following beneficial effects:
1. A code feature matrix and a transaction feature matrix are constructed from the code information and transaction information of the plurality of intelligent contracts, and locality sensitive hashing is applied to each to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the code information and of the transaction information in the intelligent contracts respectively. The two hash matrices are finally spliced to obtain the intelligent contract locality sensitive hash matrix. Compared with techniques that cluster intelligent contracts by machine learning, the method needs no labeled data to train a model, requires no manual data labeling, and reduces the dependence on the large number of training samples needed to train a machine learning classification model. The method also adapts better to new types of intelligent contracts: a machine-learning-based method must relearn from samples of the new type to refine its features, whereas the present method only needs to compute the locality sensitive hash value of the new contract according to the same rules. Because that value differs from the existing locality sensitive hash values, the contract is placed in a class of its own, and the new type of intelligent contract can then be classified after user inspection. Furthermore, because locality sensitive hashing computes the similarity of intelligent contracts linearly and the classification operates on matrices, the classification is more efficient and better suited to large-scale, frequent intelligent contract clustering scenarios.
In addition, the method combines code information and transaction information, so the features it uses are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
2. Repeated node types are removed and repeated nodes are grouped into the same class, which reduces subsequent computation and improves classification efficiency.
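The handling of a new type of contract mentioned in the first beneficial effect can be sketched as a lookup against the hash values already seen. The function name `assign_class`, the dictionary of known classes, and the exact-match rule are hypothetical; the text only states that a hash value differing from the existing ones forms a class of its own, which the user then checks.

```python
def assign_class(known_classes, hash_row):
    """Return (class_id, is_new) for one contract's hash row.

    known_classes maps hash tuples to class ids and is updated in place.
    """
    key = tuple(hash_row)
    is_new = key not in known_classes
    if is_new:
        # No retraining: an unseen hash value simply opens a new class,
        # which a user could then inspect and name.
        known_classes[key] = len(known_classes)
    return known_classes[key], is_new

known = {(1, 0, 1, 1): 0}
first = assign_class(known, [0, 1, 0, 0])   # unseen hash, new class
second = assign_class(known, [1, 0, 1, 1])  # matches an existing class
```

In this sketch the cost of handling a new contract is one hash computation plus one dictionary lookup, regardless of how many contracts were classified before.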
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (9)
1. An intelligent contract classification method based on locality sensitive hashing, characterized by comprising the following steps:
step S1: acquiring a plurality of blockchain-based intelligent contracts, and extracting code information and transaction information in each intelligent contract;
step S2: constructing a code feature matrix based on the code information, and constructing a transaction feature matrix based on the transaction information;
step S3: performing locality sensitive hashing on the code feature matrix to obtain a code feature locality sensitive hash matrix, and performing locality sensitive hashing on the transaction feature matrix to obtain a transaction feature locality sensitive hash matrix;
step S4: splicing the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix to obtain an intelligent contract locality sensitive hash matrix; and
step S5: classifying the vector of each row based on the intelligent contract locality sensitive hash matrix to obtain intelligent contracts of a plurality of different types.
2. The intelligent contract classification method based on locality sensitive hashing according to claim 1, wherein step S1 specifically includes:
step S11: acquiring a plurality of intelligent contracts of a blockchain, and generating an abstract syntax tree for all the intelligent contracts; and
step S12: traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and obtaining the corresponding code information and transaction information respectively.
3. The intelligent contract classification method based on locality sensitive hashing according to claim 2, wherein step S2 specifically includes:
step S21: obtaining the corresponding code feature matrix of all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding code information; and
step S22: obtaining the corresponding transaction feature matrix of all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
4. The intelligent contract classification method based on locality sensitive hashing according to claim 1, wherein step S21 specifically includes:
step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and
step S212: counting the types of the remaining nodes and constructing the corresponding code feature matrix for the intelligent contracts.
5. The intelligent contract classification method based on locality sensitive hashing according to claim 1, wherein step S3 specifically includes:
step S31: generating a random matrix, wherein the number of rows of the random matrix is equal to the number of rows of the code feature matrix, and the number of columns is a user preset value;
step S32: dot-multiplying the code feature matrix by the random matrix to obtain a code feature random matrix, and normalizing the code feature random matrix to obtain a code feature locality sensitive hash matrix; and
step S33: dot-multiplying the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and normalizing the transaction feature random matrix to obtain a transaction feature locality sensitive hash matrix.
6. An intelligent contract classification system based on locality sensitive hashing, characterized by comprising:
an intelligent contract acquisition unit, used for acquiring a plurality of blockchain-based intelligent contracts and extracting code information and transaction information in each intelligent contract;
a feature matrix construction unit, used for constructing a code feature matrix based on the code information and constructing a transaction feature matrix based on the transaction information;
a locality sensitive hashing unit, used for performing locality sensitive hashing on the code feature matrix to obtain a code feature locality sensitive hash matrix, and performing locality sensitive hashing on the transaction feature matrix to obtain a transaction feature locality sensitive hash matrix;
a matrix splicing unit, used for splicing the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix to obtain an intelligent contract locality sensitive hash matrix; and
a contract classification unit, used for classifying the vector of each row based on the intelligent contract locality sensitive hash matrix to obtain intelligent contracts of a plurality of different types.
7. The intelligent contract classification system based on locality sensitive hashing according to claim 6, wherein the intelligent contract acquisition unit further includes:
a syntax tree construction unit, used for acquiring a plurality of intelligent contracts of a blockchain and generating an abstract syntax tree for all the intelligent contracts; and
a node traversing unit, used for traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and obtaining the corresponding code information and transaction information respectively.
8. The intelligent contract classification system based on locality sensitive hashing according to claim 7, wherein the locality sensitive hashing unit further includes:
a random matrix generating unit, used for generating a random matrix, wherein the number of rows of the random matrix is equal to the number of rows of the code feature matrix, and the number of columns is a user preset value;
a code matrix dot multiplication unit, used for dot-multiplying the code feature matrix by the random matrix to obtain a code feature random matrix, and normalizing the code feature random matrix to obtain a code feature locality sensitive hash matrix; and
a transaction matrix dot multiplication unit, used for dot-multiplying the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and normalizing the transaction feature random matrix to obtain a transaction feature locality sensitive hash matrix.
9. An electronic device comprising a memory and a processor, characterized in that: the memory has stored therein a computer program arranged to execute, when running, the locality sensitive hash-based intelligent contract classification method of any of claims 1 to 5;
the processor is configured to execute the smart contract classification method based on locality-sensitive hashing according to any one of claims 1 to 5 by the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110414897.2A CN112949778A (en) | 2021-04-17 | 2021-04-17 | Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112949778A true CN112949778A (en) | 2021-06-11 |
Family
ID=76232938
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110414897.2A Pending CN112949778A (en) | 2021-04-17 | 2021-04-17 | Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112949778A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114153496A (en) * | 2021-09-08 | 2022-03-08 | 北京天德科技有限公司 | Block chain-based high-speed parallelizable code similarity comparison method and system |
CN117170677A (en) * | 2023-09-01 | 2023-12-05 | 佛山市康颐福城市服务科技有限公司 | Similarity detection method, device and equipment for intelligent contracts and readable storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106709704A (en) * | 2016-11-23 | 2017-05-24 | 杭州秘猿科技有限公司 | Intelligent contract upgrading method based on permission chain |
CN109445834A (en) * | 2018-10-30 | 2019-03-08 | 北京计算机技术及应用研究所 | The quick comparative approach of program code similitude based on abstract syntax tree |
CN110288307A (en) * | 2019-05-13 | 2019-09-27 | 西安电子科技大学 | Intelligent contract co-development system and data processing method based on Fabric block chain |
CN110569033A (en) * | 2019-09-12 | 2019-12-13 | 北京工商大学 | method for generating basic code of digital transaction type intelligent contract |
CN110795432A (en) * | 2019-10-29 | 2020-02-14 | 腾讯云计算(北京)有限责任公司 | Characteristic data retrieval method and device and storage medium |
CN110796546A (en) * | 2019-10-25 | 2020-02-14 | 上海有倕信息科技有限公司 | Distributed clustering algorithm based on block chain |
CN111061996A (en) * | 2019-12-09 | 2020-04-24 | 昆明理工大学 | Recommendation algorithm combining Word2vec Word vector and LSH locality sensitive hashing |
CN112084520A (en) * | 2020-09-18 | 2020-12-15 | 支付宝(杭州)信息技术有限公司 | Method and device for protecting business prediction model of data privacy through joint training of two parties |
CN112416912A (en) * | 2020-10-14 | 2021-02-26 | 深圳前海微众银行股份有限公司 | Method, device, terminal equipment and medium for removing duplicate of longitudinal federal data statistics |
CN112613043A (en) * | 2020-12-30 | 2021-04-06 | 杭州趣链科技有限公司 | Intelligent contract vulnerability detection method based on intelligent contract calling network |
Non-Patent Citations (2)
Title |
---|
HUAKUN LIU等: "OPRCP: approximate nearest neighbor binary search algorithm for hybrid data over WMSN blockchain", 《EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING》, pages 1 - 14 * |
黄步添等: "基于语义嵌入模型与交易信息的智能合约自动分类系统", 《自动化学报》, vol. 43, no. 09, pages 2 - 4 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114153496A (en) * | 2021-09-08 | 2022-03-08 | 北京天德科技有限公司 | Block chain-based high-speed parallelizable code similarity comparison method and system |
CN114153496B (en) * | 2021-09-08 | 2023-09-12 | 北京天德科技有限公司 | High-speed parallelizable code similarity comparison method and system based on blockchain |
CN117170677A (en) * | 2023-09-01 | 2023-12-05 | 佛山市康颐福城市服务科技有限公司 | Similarity detection method, device and equipment for intelligent contracts and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||