CN112949778A - Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment - Google Patents

Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment

Info

Publication number
CN112949778A
Authority
CN
China
Prior art keywords
matrix
characteristic
code
transaction
intelligent contract
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110414897.2A
Other languages
Chinese (zh)
Inventor
郑子彬
罗少龙
连松彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Mobile Technology Co ltd
Original Assignee
Shenzhen Qianhai Mobile Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Mobile Technology Co ltd filed Critical Shenzhen Qianhai Mobile Technology Co ltd
Priority to CN202110414897.2A priority Critical patent/CN112949778A/en
Publication of CN112949778A publication Critical patent/CN112949778A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Finance (AREA)
  • Mathematical Analysis (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent contract classification method, system and electronic equipment based on locality sensitive hashing. A code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts. Locality sensitive hashing is applied to each matrix to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the code information and of the transaction information across the intelligent contracts, respectively. The two hash matrices are then spliced to obtain the final intelligent contract locality sensitive hash matrix. The method does not need labeled data to train a model, requires no manual data labeling, and avoids the dependence on a large number of training samples that comes with training a machine learning classification model. The method also adapts better to new types of intelligent contracts.

Description

Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment
Technical Field
The invention relates to the field of intelligent contract classification methods for block chains, in particular to an intelligent contract classification method and system based on locality sensitive hashing and electronic equipment.
Background
With the growing adoption of networking and information technology, the amount of data generated on the internet is increasing exponentially. On a blockchain, thousands of smart contracts are deployed each day. Investigations have found that many people are very interested in the overall picture of intelligent contracts on the blockchain, because it affects the work they are doing. For example, intelligent contract developers need information about all the intelligent contracts on the chain to guide which types of contract to develop, so that popular contracts can be developed more smoothly. Intelligent contract researchers also need comprehensive information about intelligent contracts to guide their research direction; they pay closer attention to the security of contract types that are widely used and to whether newly appearing contracts have security problems, since contracts are closely tied to digital assets and must be free of vulnerabilities to avoid property loss. Learning the current state of intelligent contracts by manually reading each contract is impractical, because reading cannot keep pace with the rate at which contracts are deployed. A technique for quickly clustering intelligent contracts is therefore urgently needed, so that people can obtain an overall view of the intelligent contracts on the blockchain more quickly.
Existing methods for classifying intelligent contracts include classification based on machine learning, but machine learning needs a large amount of labeled data to train a model, the labeling process consumes manpower, and machine learning classifiers adapt poorly to new types of contracts. Since intelligent contracts running on blockchains first appeared, the known types of intelligent contract have certainly not been comprehensive. When a novel intelligent contract appears, the model can only assign it to one of the existing types and cannot create a new type, so identification and classification are inaccurate.
Disclosure of Invention
The invention provides an intelligent contract classification method, system and electronic equipment based on locality sensitive hashing, and aims to solve the problems that an intelligent contract classification method based on machine learning in the existing block chain is difficult in model training and inaccurate in classification.
According to an embodiment of the application, an intelligent contract classification method based on locality sensitive hashing is provided, comprising the following steps: step S1: acquiring a plurality of intelligent contracts based on a block chain, and extracting code information and transaction information from each intelligent contract; step S2: constructing a code feature matrix based on the code information, and constructing a transaction feature matrix based on the transaction information; step S3: performing locality sensitive hashing on the code feature matrix to obtain a code feature locality sensitive hash matrix, and performing locality sensitive hashing on the transaction feature matrix to obtain a transaction feature locality sensitive hash matrix; step S4: splicing the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix to obtain an intelligent contract locality sensitive hash matrix; and step S5: classifying the row vectors of the intelligent contract locality sensitive hash matrix to obtain intelligent contracts of different types.
Preferably, step S1 specifically includes: step S11: acquiring a plurality of intelligent contracts of a block chain, and generating an abstract syntax tree for all the intelligent contracts; and step S12: traversing the abstract syntax tree, parsing and recording the information of each node on the abstract syntax tree, and acquiring the corresponding code information and transaction information respectively.
Preferably, step S2 specifically includes: step S21: obtaining the code feature matrix for all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding code information; and step S22: obtaining the transaction feature matrix for all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
Preferably, step S21 specifically includes: step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and step S212: counting the types of the remaining nodes and constructing the corresponding code feature matrix for the intelligent contracts.
Preferably, step S3 specifically includes: step S31: generating a random matrix, wherein the number of rows of the random matrix equals the number of columns of the code feature matrix, and the number of columns is a value preset by the user; step S32: multiplying the code feature matrix by the random matrix to obtain a code feature random matrix, and normalizing the code feature random matrix to obtain the code feature locality sensitive hash matrix; and step S33: multiplying the transaction feature matrix by a random matrix to obtain a transaction feature random matrix, and normalizing the transaction feature random matrix to obtain the transaction feature locality sensitive hash matrix.
The invention also provides an intelligent contract classification system based on locality sensitive hashing, which comprises: the intelligent contract acquisition unit is used for acquiring a plurality of intelligent contracts based on the block chain and extracting code information and transaction information in each intelligent contract; the characteristic matrix constructing unit is used for constructing a code characteristic matrix based on the code information and constructing a transaction characteristic matrix based on the transaction information; the local sensitive hash unit is used for carrying out local sensitive hash on the code characteristic matrix to obtain a code characteristic local sensitive hash matrix and carrying out local sensitive hash on the transaction characteristic matrix to obtain a transaction characteristic local sensitive hash matrix; the matrix splicing unit is used for splicing the code characteristic local sensitive hash matrix and the transaction characteristic local sensitive hash matrix to obtain an intelligent contract local sensitive hash matrix; and the contract classification unit is used for classifying the vectors in each row based on the intelligent contract local sensitive hash matrix to obtain various intelligent contracts of different types.
Preferably, the intelligent contract obtaining unit further includes: the syntax tree construction unit is used for acquiring a plurality of intelligent contracts of the block chain and generating an abstract syntax tree from all the intelligent contracts; and the node traversing unit is used for traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and respectively acquiring corresponding code information and transaction information.
Preferably, the locality sensitive hashing unit further comprises: a random matrix generating unit, configured to generate a random matrix, wherein the number of rows of the random matrix equals the number of columns of the code feature matrix, and the number of columns is a value preset by the user; a code matrix multiplication unit, configured to multiply the code feature matrix by the random matrix to obtain a code feature random matrix, and to normalize the code feature random matrix to obtain the code feature locality sensitive hash matrix; and a transaction matrix multiplication unit, configured to multiply the transaction feature matrix by a random matrix to obtain a transaction feature random matrix, and to normalize the transaction feature random matrix to obtain the transaction feature locality sensitive hash matrix.
The invention also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the computer program is set to execute the intelligent contract classification method based on locality sensitive hashing in any item when running; the processor is configured to execute, by the computer program, the smart contract classification method based on locality-sensitive hashing according to any one of the above.
The intelligent contract classification method, system and electronic equipment based on locality sensitive hashing have the following beneficial effects:
1. A code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts, and locality sensitive hashing is applied to each to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the code information and of the transaction information across the intelligent contracts, respectively. The two hash matrices are finally spliced to obtain the intelligent contract locality sensitive hash matrix. Compared with techniques that cluster intelligent contracts using machine learning, the method does not need labeled data to train a model, requires no manual data labeling, and avoids the dependence on a large number of training samples that comes with training a machine learning classification model. The method also adapts better to new types of intelligent contracts: a machine learning method must learn from samples again to refine its features whenever a new type appears, whereas the present method only needs to compute the locality sensitive hash value of the new contract according to the same rule. Because that value differs from the existing hash values, the contract forms a class of its own, and the new type can then be identified once a user inspects it. Furthermore, the method classifies on the basis of matrices through locality sensitive hashing, that is, a technique for computing the similarity of intelligent contracts linearly, so classification is more efficient and the method is better suited to large-scale, frequent intelligent contract clustering.
In addition, the method combines code information with transaction information, so the features it uses are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
2. Repeated node types are removed so that repeated nodes are treated as a single type, which reduces subsequent computation and improves classification efficiency.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a locality sensitive hash-based intelligent contract classification method according to a first embodiment of the present invention.
Fig. 2 is a flowchart of step S1 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 3 is a flowchart of step S2 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 4 is a flowchart of step S21 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 5 is a flowchart of step S3 in the locality sensitive hash-based intelligent contract classification method according to the first embodiment of the present invention.
Fig. 6 is a block diagram of an intelligent contract classification system based on locality sensitive hashing according to a second embodiment of the present invention.
Fig. 7 is a block diagram of the intelligent contract acquisition unit in the locality sensitive hash-based intelligent contract classification system according to the second embodiment of the present invention.
Fig. 8 is a block diagram of a locality sensitive hash unit in the locality sensitive hash-based intelligent contract classification system according to the second embodiment of the present invention.
Fig. 9 is a block diagram of an electronic device according to a third embodiment of the present invention.
Description of reference numerals:
1. an intelligent contract acquisition unit; 2. a feature matrix construction unit; 3. a locality sensitive hash unit; 4. a matrix splicing unit; 5. a contract classification unit;
11. a syntax tree construction unit; 12. a node traversing unit; 31. a random matrix generating unit; 32. a code matrix dot multiplication unit; 33. a transaction matrix dot multiplication unit;
10. a memory; 20. a processor.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
Referring to fig. 1, a first embodiment of the present invention discloses an intelligent contract classification method based on locality sensitive hashing, which includes the following steps:
Step S1: acquiring a plurality of intelligent contracts based on the block chain, and extracting code information and transaction information from each intelligent contract.
Step S2: constructing a code feature matrix based on the code information, and constructing a transaction feature matrix based on the transaction information.
Step S3: performing locality sensitive hashing on the code feature matrix to obtain a code feature locality sensitive hash matrix, and performing locality sensitive hashing on the transaction feature matrix to obtain a transaction feature locality sensitive hash matrix.
Step S4: splicing the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix to obtain an intelligent contract locality sensitive hash matrix.
Step S5: classifying the row vectors of the intelligent contract locality sensitive hash matrix to obtain intelligent contracts of different types.
It is understood that in step S1, the smart contract is a program running on the blockchain that is event driven, has state, and can hold assets on the ledger. A user may write asset transaction logic into the intelligent contract to effect value transfer. The transaction information refers to information stored in transactions, such as the transfer amount, and the code information refers to constructs of the intelligent contract's programming language, such as conditional branches, loops and transfers.
It is understood that, in step S2, the code feature matrix records, in matrix form, which code features are present in which of the intelligent contracts collected so far, and the transaction feature matrix is constructed in the same manner.
It is understood that in step S3, locality sensitive hashing is a method for fast clustering of large-scale data. Its basic idea is that if two data points are similar in the original data space, they remain highly similar after each has been converted by the hash function. The locality sensitive hashing algorithm is used here to handle the comparison of massive numbers of intelligent contracts, because the original similarity is preserved after conversion by the hash function.
It can be understood that, in step S4, the code feature locality sensitive hash matrix and the transaction feature locality sensitive hash matrix are concatenated so that the intelligent contracts are classified on the combination of code information and transaction information, which improves classification accuracy.
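The splicing in step S4 can be pictured as a row-wise concatenation. The following is a minimal sketch, assuming both hash matrices are stored as Python lists of rows with one row per contract; the function name `splice_rows` and the sample values are hypothetical and do not come from the patent.

```python
def splice_rows(code_hash, tx_hash):
    """Concatenate row i of the code hash matrix with row i of the transaction hash matrix."""
    if len(code_hash) != len(tx_hash):
        raise ValueError("both matrices need one row per contract")
    return [c_row + t_row for c_row, t_row in zip(code_hash, tx_hash)]


# Hypothetical hash matrices for three contracts.
code_hash = [[1, 0], [0, 1], [1, 0]]
tx_hash = [[1], [1], [0]]
contract_hash = splice_rows(code_hash, tx_hash)
```

Each resulting row carries the code bits followed by the transaction bits, so two contracts share a row only when they agree on both parts.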
It is to be understood that, in step S5, the vectors in each row of the smart contract locality-sensitive hash matrix represent locality-sensitive hash values of one smart contract, and different types of smart contracts can be accurately classified based on similarity of the vectors.
It is to be understood that, in step S5, the ith row of the locality-sensitive hash matrix H' represents the locality-sensitive hash value of the ith intelligent contract, and intelligent contracts (highly-similar intelligent contracts) having the same locality-sensitive hash value are grouped into one class. The user may randomly pick an intelligent contract from each cluster to deduce the type of intelligent contract in the cluster. If a new locality sensitive hash value is found, it means that a new intelligent contract class appears on the blockchain.
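Grouping contracts that share a hash value, as described for step S5, can be sketched as follows. This is an illustrative sketch only: the helper names `group_by_hash` and `is_new_class` and the sample matrix are assumptions, not part of the patent.

```python
from collections import defaultdict


def group_by_hash(hash_matrix):
    """Map each distinct row (as a tuple) to the indices of the contracts that share it."""
    clusters = defaultdict(list)
    for index, row in enumerate(hash_matrix):
        clusters[tuple(row)].append(index)
    return dict(clusters)


def is_new_class(row, clusters):
    """A hash value not seen before signals a contract class that is not yet known."""
    return tuple(row) not in clusters


# Hypothetical spliced hash matrix: contracts 0 and 2 have the same hash value.
hash_matrix = [[1, 0, 1], [0, 1, 1], [1, 0, 1]]
clusters = group_by_hash(hash_matrix)
```

A user could then inspect one contract index from each cluster to infer the type of the whole cluster.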
It will be appreciated that a code feature matrix and a transaction feature matrix are constructed from the code information and the transaction information of a plurality of intelligent contracts, and locality sensitive hashing is applied to each to obtain a code feature locality sensitive hash matrix and a transaction feature locality sensitive hash matrix, which represent the similarity of the code information and of the transaction information across the intelligent contracts, respectively. The two hash matrices are finally spliced to obtain the intelligent contract locality sensitive hash matrix. Compared with techniques that cluster intelligent contracts using machine learning, the method does not need labeled data to train a model, requires no manual data labeling, and avoids the dependence on a large number of training samples that comes with training a machine learning classification model. The method also adapts better to new types of intelligent contracts: a machine learning method must learn from samples again to refine its features whenever a new type appears, whereas the present method only needs to compute the locality sensitive hash value of the new contract according to the same rule. Because that value differs from the existing hash values, the contract forms a class of its own, and the new type can then be identified once a user inspects it.
Furthermore, the method classifies on the basis of matrices through locality sensitive hashing, that is, a technique for computing the similarity of intelligent contracts linearly, so classification is more efficient and the method is better suited to large-scale, frequent intelligent contract clustering. In addition, the method combines code information with transaction information, so the features it uses are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
Referring to fig. 2, the step S1 specifically includes:
Step S11: acquiring a plurality of intelligent contracts of a block chain, and generating an abstract syntax tree for all the intelligent contracts; and
Step S12: traversing the abstract syntax tree, parsing and recording the information of each node on the abstract syntax tree, and acquiring the corresponding code information and transaction information respectively.
It is to be understood that in step S11, specifically, an abstract syntax tree is generated for all the intelligent contracts using the solidity-antlr4 tool, where the abstract syntax tree has a plurality of nodes, and each node represents a syntactic structure in the contract.
It is understood that in step S12, the abstract syntax tree is traversed, and the information in each node in the syntax tree is parsed and recorded, including querycoctrbalance representing a query contract balance operation, Transfer representing a Transfer operation, IfStatement representing a conditional statement, and so on.
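The traversal in step S12 can be illustrated with a small sketch. It assumes, purely for illustration, that the parser output has been converted into nested dictionaries with a "type" key and an optional "children" list; that layout, the function name `collect_node_types`, and the node types ContractDefinition and FunctionDefinition are assumptions, while IfStatement and Transfer are the node names mentioned above.

```python
def collect_node_types(node):
    """Return the node types of a dict-based AST in depth-first, left-to-right order."""
    types = []
    stack = [node]
    while stack:
        current = stack.pop()
        types.append(current["type"])
        # Push children in reverse so that they are popped left to right.
        stack.extend(reversed(current.get("children", [])))
    return types


# Hypothetical AST for a contract with one function containing a conditional transfer.
ast = {
    "type": "ContractDefinition",
    "children": [
        {
            "type": "FunctionDefinition",
            "children": [
                {"type": "IfStatement", "children": [{"type": "Transfer"}]},
            ],
        },
    ],
}
visited = collect_node_types(ast)
```

An explicit stack is used instead of recursion so that deeply nested contracts do not hit Python's recursion limit.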
It is understood that the transaction information is also obtained according to the above steps S11-S12, and will not be described herein.
It is understood that steps S11-S12 are only one embodiment of this example, and the embodiment is not limited to steps S11-S12.
Referring to fig. 3, the step S2 specifically includes:
Step S21: obtaining the code feature matrix for all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding code information; and
Step S22: obtaining the transaction feature matrix for all the intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
It can be understood that in step S21, the node types present in each contract are counted to obtain a 0/1 code feature matrix M_c of size z × m, where z is the number of contracts and m is the number of node types. The element in row i and column j of M_c indicates whether the i-th intelligent contract contains the j-th node type: it equals 1 if so and 0 otherwise.
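As an illustration of this 0/1 matrix, the sketch below builds M_c from the set of node types found in each contract. The function name `build_feature_matrix`, the fixed column order `vocabulary`, and the node type WhileStatement are hypothetical choices made for the example.

```python
def build_feature_matrix(contract_types, vocabulary):
    """Row i, column j is 1 if contract i contains node type vocabulary[j], else 0."""
    return [
        [1 if node_type in types else 0 for node_type in vocabulary]
        for types in contract_types
    ]


# m = 3 node types (columns) and z = 2 contracts (rows).
vocabulary = ["IfStatement", "Transfer", "WhileStatement"]
contract_types = [{"IfStatement", "Transfer"}, {"WhileStatement"}]
M_c = build_feature_matrix(contract_types, vocabulary)
```

Here the first contract contains a conditional and a transfer, so its row is 1, 1, 0, and the second contract contains only a loop, so its row is 0, 0, 1.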
It is understood that steps S21-S22 are only one embodiment of this example, and the embodiment is not limited to steps S21-S22.
Optionally, referring to fig. 4, as an embodiment, the step S21 specifically includes:
Step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and
Step S212: counting the types of the remaining nodes and constructing the corresponding code feature matrix for the intelligent contracts.
It can be understood that in step S211, the repeated node types are removed, and the repeated nodes are classified into the same class, so that the subsequent calculation is reduced, and the classification efficiency is improved.
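The de-duplication in steps S211 and S212 can be sketched as follows, assuming the node types of each contract are available as a list of strings. The helper names `unique_node_types` and `build_vocabulary` are hypothetical, and sorting the vocabulary is a design choice of this sketch (it gives a stable column order), not something the patent specifies.

```python
def unique_node_types(visited_types):
    """Drop repeated node types while keeping the order in which they were first seen."""
    return list(dict.fromkeys(visited_types))


def build_vocabulary(per_contract_types):
    """Sorted list of every distinct node type seen across all contracts."""
    return sorted({t for types in per_contract_types for t in types})


single = unique_node_types(["IfStatement", "Transfer", "IfStatement"])
vocabulary = build_vocabulary([["Transfer", "IfStatement"], ["IfStatement"]])
```

Because each node type is kept once, the feature matrix has one column per type regardless of how often a construct occurs in a contract.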
Referring to fig. 5, the step S3 specifically includes:
Step S31: generating a random matrix, wherein the number of rows of the random matrix equals the number of columns of the code feature matrix, and the number of columns is a value preset by the user.
Step S32: multiplying the code feature matrix by the random matrix to obtain a code feature random matrix, and normalizing the code feature random matrix to obtain the code feature locality sensitive hash matrix. And
Step S33: multiplying the transaction feature matrix by a random matrix to obtain a transaction feature random matrix, and normalizing the transaction feature random matrix to obtain the transaction feature locality sensitive hash matrix.
It is understood that in step S31, a random 0/1 matrix V_c of dimension m × r is generated, where m is the number of columns of the code feature matrix M_c (the number of node types), so that the product of M_c and V_c is defined, and r is the user-preset value that controls how strict the clustering is: the larger r is, the stricter the clustering, and the smaller r is, the looser the clustering.
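A random 0/1 matrix of this shape can be generated as in the sketch below. The function name `random_binary_matrix` and the `seed` parameter are assumptions added for reproducibility, and drawing each entry uniformly from 0 and 1 is also an assumption, since the text does not state the distribution.

```python
import random


def random_binary_matrix(m, r, seed=None):
    """Return an m x r matrix whose entries are drawn independently from {0, 1}."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(r)] for _ in range(m)]


# m = 3 node types, r = 4 hash bits per contract.
V_c = random_binary_matrix(m=3, r=4, seed=0)
```

Increasing r yields more bits per hash value, so two contracts must agree on more bits to land in the same class, which matches the statement that a larger r gives stricter clustering.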
It is to be understood that in step S32, the code feature matrix M_c is multiplied by the random matrix V_c to obtain the code feature random matrix H_c, of size z × r. Each element of H_c is then adjusted: an element greater than the threshold t is set to 1, and any other element is set to 0, which yields the final locality sensitive hash matrix H'_c. The i-th row vector of H'_c is the locality sensitive hash value of the code features of the i-th intelligent contract. H'_c is the normalized form of H_c, which speeds up clustering.
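The multiplication and thresholding of step S32 can be worked through on a small example. The sketch below uses a fixed V_c instead of a random one so that the result is reproducible; the helper names `matmul` and `binarize`, the sample matrices, and the threshold t = 1 are all hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of a (z x m) and b (m x r), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def binarize(matrix, t):
    """Set each element to 1 if it is greater than the threshold t, otherwise to 0."""
    return [[1 if value > t else 0 for value in row] for row in matrix]


M_c = [[1, 1, 0], [0, 0, 1], [1, 1, 0]]  # z = 3 contracts, m = 3 node types
V_c = [[1, 0], [1, 1], [0, 1]]           # m = 3 rows, r = 2 columns
H_c = matmul(M_c, V_c)                   # [[2, 1], [0, 1], [2, 1]]
H_prime_c = binarize(H_c, t=1)           # [[1, 0], [0, 0], [1, 0]]
```

Contracts 0 and 2 have identical feature rows and therefore identical hash rows, while contract 1 receives a different one.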
It is understood that steps S31-S33 are only one embodiment of this example, and the embodiment is not limited to steps S31-S33.
Referring to fig. 6, a second embodiment of the present invention provides an intelligent contract classification system based on locality-sensitive hashing, which uses the intelligent contract classification method provided in the first embodiment and includes:
an intelligent contract acquisition unit 1, configured to acquire a plurality of blockchain-based intelligent contracts and extract the code information and transaction information in each intelligent contract;
a feature matrix constructing unit 2, configured to construct a code feature matrix based on the code information and a transaction feature matrix based on the transaction information;
a locality-sensitive hashing unit 3, configured to perform locality-sensitive hashing on the code feature matrix to obtain a code feature locality-sensitive hash matrix, and on the transaction feature matrix to obtain a transaction feature locality-sensitive hash matrix;
a matrix splicing unit 4, configured to splice the code feature locality-sensitive hash matrix and the transaction feature locality-sensitive hash matrix to obtain an intelligent contract locality-sensitive hash matrix; and
a contract classification unit 5, configured to classify the row vectors of the intelligent contract locality-sensitive hash matrix to obtain a plurality of intelligent contracts of different types.
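As a rough sketch of what the matrix splicing unit and the contract classification unit do, the following Python joins the two hash matrices row by row and then groups contracts by their row vectors. Grouping by exact equality of the spliced rows is an assumption made for illustration, since the text only says that the row vectors are classified; the function names and sample matrices are likewise hypothetical.

```python
from collections import defaultdict


def splice(h_code, h_tx):
    """Concatenate the two hash matrices row by row.

    Row i of the result is row i of h_code followed by row i of h_tx.
    """
    if len(h_code) != len(h_tx):
        raise ValueError("both matrices need one row per contract")
    return [list(a) + list(b) for a, b in zip(h_code, h_tx)]


def group_by_hash(h):
    """Map each distinct row vector to the indices of the contracts sharing it."""
    groups = defaultdict(list)
    for index, row in enumerate(h):
        groups[tuple(row)].append(index)
    return dict(groups)


# Assumed sample hash matrices for three contracts.
code_hashes = [[1, 0], [1, 0], [0, 1]]
tx_hashes = [[1], [1], [0]]

contract_hashes = splice(code_hashes, tx_hashes)
classes = group_by_hash(contract_hashes)
```

With these sample matrices, contracts 0 and 1 share a spliced row and fall into one class, while contract 2 forms a class of its own.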
Referring to fig. 7, the intelligent contract acquisition unit 1 further includes:
a syntax tree construction unit 11, configured to acquire a plurality of intelligent contracts of a blockchain and generate an abstract syntax tree for each of the intelligent contracts; and
a node traversing unit 12, configured to traverse the abstract syntax tree, analyze and record the information of each node on the abstract syntax tree, and obtain the corresponding code information and transaction information, respectively.
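The traversal carried out by the node traversing unit can be pictured with the sketch below, which walks a tree, records the type of every node, and turns the per-contract counts into rows of a code feature matrix. It assumes that each abstract syntax tree is available as nested dicts and lists and that a node stores its type under a hypothetical `nodeType` key; the actual tree format, the helper names and the sample trees are not taken from the source.

```python
from collections import Counter


def count_node_types(node, counts=None):
    """Recursively count node types in a tree of nested dicts and lists."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        node_type = node.get("nodeType")  # assumed key name
        if isinstance(node_type, str):
            counts[node_type] += 1
        for child in node.values():
            count_node_types(child, counts)
    elif isinstance(node, list):
        for child in node:
            count_node_types(child, counts)
    return counts


def build_feature_matrix(asts):
    """Return (node types without repeats, one row of counts per contract)."""
    counters = [count_node_types(ast) for ast in asts]
    vocabulary = sorted(set().union(*counters))
    matrix = [[counter[t] for t in vocabulary] for counter in counters]
    return vocabulary, matrix


# Two tiny, made-up trees standing in for parsed contracts.
sample_asts = [
    {"nodeType": "ContractDefinition",
     "nodes": [{"nodeType": "FunctionDefinition"},
               {"nodeType": "FunctionDefinition"}]},
    {"nodeType": "ContractDefinition",
     "nodes": [{"nodeType": "VariableDeclaration"}]},
]
node_types, code_features = build_feature_matrix(sample_asts)
```

Sorting the deduplicated node types fixes a column order, so every contract's row is comparable with every other row.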
Referring to fig. 8, the locality-sensitive hashing unit 3 further includes:
a random matrix generating unit 31, configured to generate a random matrix, wherein the number of rows of the random matrix equals the number of rows of the code feature matrix, and the number of columns is a user-preset value;
a code matrix multiplication unit 32, configured to multiply the code feature matrix by the random matrix to obtain a code feature random matrix, and to normalize the code feature random matrix to obtain the code feature locality-sensitive hash matrix; and
a transaction matrix multiplication unit 33, configured to multiply the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and to normalize the transaction feature random matrix to obtain the transaction feature locality-sensitive hash matrix.
Referring to fig. 9, a third embodiment of the present invention provides an electronic device that includes a memory 10 and a processor 20. The memory 10 stores a computer program which, when run, executes the steps of any embodiment of the intelligent contract classification method based on locality-sensitive hashing. The processor 20 is configured to execute those steps by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.
The intelligent contract classification method, system and electronic equipment based on locality sensitive hashing have the following beneficial effects:
1. A code feature matrix and a transaction feature matrix are constructed from the code information and transaction information of the plurality of intelligent contracts, and locality-sensitive hashing is applied to each to obtain the code feature locality-sensitive hash matrix and the transaction feature locality-sensitive hash matrix, which represent the similarity of the code information and of the transaction information in the intelligent contracts, respectively. The two hash matrices are then spliced into the final intelligent contract locality-sensitive hash matrix. Compared with techniques that cluster intelligent contracts using machine learning, the method does not need labeled data to train a model, requires no manual data labeling, and avoids the dependence on a large number of training samples that training a machine learning classification model entails. The method also adapts better to new types of intelligent contracts: a machine-learning-based method must learn from samples again to refine features for a new type, whereas this method only computes the locality-sensitive hash value of the new contract according to the same rule. If that value differs from the existing locality-sensitive hash values, the contract is placed in a class of its own, and the new type can then be confirmed through user review. Furthermore, because locality-sensitive hashing computes the similarity of intelligent contracts linearly and classification operates on matrices, classification is more efficient, and the method is better suited to large-scale, frequent intelligent contract clustering.
In addition, the method combines code information with transaction information, so the features used are richer and more comprehensive than those of existing non-machine-learning intelligent contract clustering techniques.
2. Repeated node types are removed and repeated nodes are merged into a single type, which reduces subsequent computation and improves classification efficiency.
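The handling of a new contract described in the first effect can be illustrated as follows: the contract's hash row is computed by the same rule as before, looked up among the existing classes, and given a class of its own if no match exists, with no model retraining involved. The dictionary layout, the function name and the contract identifiers are assumptions made for this sketch, and flagging a new class for user review is only hinted at by the return value.

```python
def assign_contract(contract_id, hash_row, classes):
    """Add a contract to the class matching its hash row.

    `classes` maps a hash row (as a tuple) to the list of contract ids
    in that class. Returns True when the contract opened a new class,
    which a user could then review and confirm.
    """
    key = tuple(hash_row)
    is_new_class = key not in classes
    classes.setdefault(key, []).append(contract_id)
    return is_new_class


# Assumed starting state: one known class holding contract "a".
known_classes = {(1, 0, 1): ["a"]}

matched_existing = not assign_contract("b", [1, 0, 1], known_classes)
opened_new_class = assign_contract("c", [0, 0, 0], known_classes)
```

Each assignment is a single dictionary lookup, which fits the frequent, large-scale setting the text has in mind.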
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. An intelligent contract classification method based on locality-sensitive hashing, characterized by comprising the following steps:
step S1: acquiring a plurality of blockchain-based intelligent contracts, and extracting the code information and transaction information in each intelligent contract;
step S2: constructing a code feature matrix based on the code information, and constructing a transaction feature matrix based on the transaction information;
step S3: performing locality-sensitive hashing on the code feature matrix to obtain a code feature locality-sensitive hash matrix, and performing locality-sensitive hashing on the transaction feature matrix to obtain a transaction feature locality-sensitive hash matrix;
step S4: splicing the code feature locality-sensitive hash matrix and the transaction feature locality-sensitive hash matrix to obtain an intelligent contract locality-sensitive hash matrix; and
step S5: classifying the row vectors of the intelligent contract locality-sensitive hash matrix to obtain a plurality of intelligent contracts of different types.
2. The intelligent contract classification method based on locality-sensitive hashing according to claim 1, wherein the step S1 specifically includes:
step S11: acquiring a plurality of intelligent contracts of a blockchain, and generating an abstract syntax tree for each of the intelligent contracts; and
step S12: traversing the abstract syntax tree, analyzing and recording the information of each node on the abstract syntax tree, and obtaining the corresponding code information and transaction information, respectively.
3. The intelligent contract classification method based on locality-sensitive hashing according to claim 2, wherein the step S2 specifically includes:
step S21: obtaining the corresponding code feature matrix for all intelligent contracts based on the nodes of the abstract syntax tree and the corresponding code information; and
step S22: obtaining the corresponding transaction feature matrix for all intelligent contracts based on the nodes of the abstract syntax tree and the corresponding transaction information.
4. The intelligent contract classification method based on locality-sensitive hashing according to claim 1, wherein the step S21 specifically includes:
step S211: acquiring all nodes of the abstract syntax tree, extracting the type of each node, and removing repeated nodes; and
step S212: counting the types of the remaining nodes and constructing the corresponding code feature matrix of the intelligent contract.
5. The intelligent contract classification method based on locality-sensitive hashing according to claim 1, wherein the step S3 specifically includes:
step S31: generating a random matrix, wherein the number of rows of the random matrix equals the number of rows of the code feature matrix, and the number of columns is a user-preset value;
step S32: multiplying the code feature matrix by the random matrix to obtain a code feature random matrix, and normalizing the code feature random matrix to obtain the code feature locality-sensitive hash matrix; and
step S33: multiplying the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and normalizing the transaction feature random matrix to obtain the transaction feature locality-sensitive hash matrix.
6. An intelligent contract classification system based on locality-sensitive hashing, characterized by comprising:
an intelligent contract acquisition unit, configured to acquire a plurality of blockchain-based intelligent contracts and extract the code information and transaction information in each intelligent contract;
a feature matrix constructing unit, configured to construct a code feature matrix based on the code information and a transaction feature matrix based on the transaction information;
a locality-sensitive hashing unit, configured to perform locality-sensitive hashing on the code feature matrix to obtain a code feature locality-sensitive hash matrix, and on the transaction feature matrix to obtain a transaction feature locality-sensitive hash matrix;
a matrix splicing unit, configured to splice the code feature locality-sensitive hash matrix and the transaction feature locality-sensitive hash matrix to obtain an intelligent contract locality-sensitive hash matrix; and
a contract classification unit, configured to classify the row vectors of the intelligent contract locality-sensitive hash matrix to obtain a plurality of intelligent contracts of different types.
7. The intelligent contract classification system based on locality-sensitive hashing according to claim 6, wherein the intelligent contract acquisition unit further includes:
a syntax tree construction unit, configured to acquire a plurality of intelligent contracts of a blockchain and generate an abstract syntax tree for each of the intelligent contracts; and
a node traversing unit, configured to traverse the abstract syntax tree, analyze and record the information of each node on the abstract syntax tree, and obtain the corresponding code information and transaction information, respectively.
8. The intelligent contract classification system based on locality-sensitive hashing according to claim 7, wherein the locality-sensitive hashing unit further includes:
a random matrix generating unit, configured to generate a random matrix, wherein the number of rows of the random matrix equals the number of rows of the code feature matrix, and the number of columns is a user-preset value;
a code matrix multiplication unit, configured to multiply the code feature matrix by the random matrix to obtain a code feature random matrix, and to normalize the code feature random matrix to obtain the code feature locality-sensitive hash matrix; and
a transaction matrix multiplication unit, configured to multiply the transaction feature matrix by the random matrix to obtain a transaction feature random matrix, and to normalize the transaction feature random matrix to obtain the transaction feature locality-sensitive hash matrix.
9. An electronic device comprising a memory and a processor, characterized in that: the memory stores a computer program configured to execute, when run, the intelligent contract classification method based on locality-sensitive hashing according to any one of claims 1 to 5;
and the processor is configured to execute, by means of the computer program, the intelligent contract classification method based on locality-sensitive hashing according to any one of claims 1 to 5.
CN202110414897.2A 2021-04-17 2021-04-17 Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment Pending CN112949778A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110414897.2A CN112949778A (en) 2021-04-17 2021-04-17 Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment

Publications (1)

Publication Number Publication Date
CN112949778A true CN112949778A (en) 2021-06-11

Family

ID=76232938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110414897.2A Pending CN112949778A (en) 2021-04-17 2021-04-17 Intelligent contract classification method and system based on locality sensitive hashing and electronic equipment

Country Status (1)

Country Link
CN (1) CN112949778A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709704A (en) * 2016-11-23 2017-05-24 杭州秘猿科技有限公司 Intelligent contract upgrading method based on permission chain
CN109445834A (en) * 2018-10-30 2019-03-08 北京计算机技术及应用研究所 The quick comparative approach of program code similitude based on abstract syntax tree
CN110288307A (en) * 2019-05-13 2019-09-27 西安电子科技大学 Intelligent contract co-development system and data processing method based on Fabric block chain
CN110569033A (en) * 2019-09-12 2019-12-13 北京工商大学 method for generating basic code of digital transaction type intelligent contract
CN110795432A (en) * 2019-10-29 2020-02-14 腾讯云计算(北京)有限责任公司 Characteristic data retrieval method and device and storage medium
CN110796546A (en) * 2019-10-25 2020-02-14 上海有倕信息科技有限公司 Distributed clustering algorithm based on block chain
CN111061996A (en) * 2019-12-09 2020-04-24 昆明理工大学 Recommendation algorithm combining Word2vec Word vector and LSH locality sensitive hashing
CN112084520A (en) * 2020-09-18 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy through joint training of two parties
CN112416912A (en) * 2020-10-14 2021-02-26 深圳前海微众银行股份有限公司 Method, device, terminal equipment and medium for removing duplicate of longitudinal federal data statistics
CN112613043A (en) * 2020-12-30 2021-04-06 杭州趣链科技有限公司 Intelligent contract vulnerability detection method based on intelligent contract calling network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUAKUN LIU等: "OPRCP: approximate nearest neighbor binary search algorithm for hybrid data over WMSN blockchain", 《EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING》, pages 1 - 14 *
黄步添等: "基于语义嵌入模型与交易信息的智能合约自动分类系统", 《自动化学报》, vol. 43, no. 09, pages 2 - 4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114153496A (en) * 2021-09-08 2022-03-08 北京天德科技有限公司 Block chain-based high-speed parallelizable code similarity comparison method and system
CN114153496B (en) * 2021-09-08 2023-09-12 北京天德科技有限公司 High-speed parallelizable code similarity comparison method and system based on blockchain
CN117170677A (en) * 2023-09-01 2023-12-05 佛山市康颐福城市服务科技有限公司 Similarity detection method, device and equipment for intelligent contracts and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination