CN111158692A - Method, system and storage medium for ordering similarity of intelligent contract functions - Google Patents

Method, system and storage medium for ordering similarity of intelligent contract functions

Info

Publication number
CN111158692A
CN111158692A (application CN201911249429.3A)
Authority
CN
China
Prior art keywords
intelligent contract
vector
grammar
preset
obtaining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911249429.3A
Other languages
Chinese (zh)
Other versions
CN111158692B (en)
Inventor
赵淦森
王锡亮
王欣明
罗浩宇
刘学枫
何嘉浩
谢智健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University filed Critical South China Normal University
Priority to CN201911249429.3A priority Critical patent/CN111158692B/en
Publication of CN111158692A publication Critical patent/CN111158692A/en
Application granted granted Critical
Publication of CN111158692B publication Critical patent/CN111158692B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/43Checking; Contextual analysis
    • G06F8/436Semantic checking

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method, a system and a storage medium for ranking the similarity of intelligent contract functions, wherein the method comprises the following steps: acquiring an input intelligent contract function; obtaining a first grammar vector according to the intelligent contract function, and obtaining a grammar vector set according to a preset function library; obtaining a first semantic vector according to the intelligent contract function, and obtaining a semantic vector set according to the preset function library; determining a similarity score result between the intelligent contract function and each preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set; and ranking the preset intelligent contract functions according to the similarity score result. The method and the device can improve the accuracy of similarity analysis and the coding efficiency of programmers, and can be widely applied to the technical field of information processing.

Description

Method, system and storage medium for ordering similarity of intelligent contract functions
Technical Field
The invention relates to the technical field of information processing, in particular to a method, a system and a storage medium for ordering similarity of intelligent contract functions.
Background
With the development of Internet technology, blockchains are applied more and more widely. Blockchains carry intelligent contracts, which are programs stored on the blockchain that can assist with and verify the negotiation and execution of agreements and are run by every node; users who need to run these programs pay a commission fee to the miners or the authorized parties of the nodes. Intelligent contracts have the characteristics of data transparency, tamper resistance and permanent operation, so their application has an important influence on the application of blockchains.
In order to lower the high threshold of intelligent contract development, improve the efficiency of programmers in writing intelligent contract functions, and help programmers find and refactor code, code recommendation is usually provided to programmers: for example, when the code a programmer intends to implement already exists, or when a bug appears in written code, the programmer can learn from or refactor code according to the recommendation result. At present, code recommendation is usually performed according to the similarity of intelligent contract functions, but current methods for ranking functions by similarity mainly have two problems. First, they are inefficient and often consider only one aspect, either grammar or semantics, without comprehensive consideration. Second, they only rank functions in common languages and frameworks and do not address the intelligent contract scenario. In fact, if only the grammatical structure is considered, two contract functions whose grammatical structures are very similar may express completely different information and apply to completely different scenarios. As a result, the recommendation results are inaccurate and of no practical reference value to programmers, who still have to write the code themselves and find the causes of code bugs themselves, so coding efficiency is not readily improved.
Disclosure of Invention
In view of the above, in order to solve the above technical problems, an object of the present invention is to provide an accurate and efficient method, system and storage medium for ranking the similarity of intelligent contract functions.
The technical scheme adopted by the invention is as follows: a method for ranking the similarity of intelligent contract functions comprises the following steps:
acquiring an input intelligent contract function;
obtaining a first grammar vector according to an intelligent contract function, and obtaining a grammar vector set according to a preset function library;
obtaining a first semantic vector according to an intelligent contract function, and obtaining a semantic vector set according to a preset function library;
determining a similarity score result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
ranking the preset intelligent contract functions according to the similarity score result;
the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, and the semantic vector set comprises the second semantic vectors of the plurality of preset intelligent contract functions.
Further, the step of obtaining a first grammar vector according to the intelligent contract function and obtaining a grammar vector set according to a preset function library comprises the following steps:
obtaining a first abstract syntax tree according to a preset lexical rule, a preset syntax rule and an intelligent contract function;
obtaining a second abstract syntax tree of each preset intelligent contract function according to the preset lexical rule, the preset syntax rule and the preset function library;
and obtaining a grammar vector set according to the second abstract grammar tree, and obtaining a first grammar vector according to the first abstract grammar tree and the second abstract grammar tree.
Further, the step of obtaining a syntax vector set according to the second abstract syntax tree and obtaining the first syntax vector according to the first abstract syntax tree and the second abstract syntax tree includes the following steps:
performing feature extraction on the non-keyword nodes of the first abstract syntax tree to obtain a first feature extraction result;
performing feature extraction on the non-keyword nodes of each second abstract syntax tree to obtain a second feature extraction result;
performing one-hot coding on the features extracted from each second abstract syntax tree according to the second feature extraction result to obtain the grammar vector set;
performing one-hot coding according to the first feature extraction result and the second feature extraction result to obtain the first grammar vector;
wherein the feature extraction includes extraction of at least one of a token feature, a parent feature, a sibling feature, and a variable usage feature, and the second feature extraction result includes the first feature extraction result.
Further, the steps of obtaining a first semantic vector according to the intelligent contract function and obtaining a semantic vector set according to a preset function library include the following steps:
preprocessing a preset function library to obtain a sentence set, wherein the sentence set comprises first sentences of each preset intelligent contract function, and each first sentence consists of at least one first word;
preprocessing the intelligent contract function to obtain a second sentence, wherein the second sentence is composed of at least one second word;
training and learning are carried out according to the first sentences to obtain a first word vector of each first word;
obtaining a second word vector of each second word according to the first word vector;
obtaining a semantic vector set according to the first word vector, and obtaining a first semantic vector according to the second word vector;
wherein the preprocessing does not filter the preset special meaning keywords.
Further, the steps of obtaining a semantic vector set according to the first word vector and obtaining the first semantic vector according to the second word vector include the following steps:
obtaining the weight of each first word according to the sentence set, and obtaining the semantic vector of each first sentence according to the weight and the first word vector;
obtaining a semantic vector of a second sentence according to the second word vector and the weight;
the semantic vector of the first sentence is a second semantic vector of a preset intelligent contract function, the semantic vector of the second sentence is a first semantic vector, and the first word vector comprises a second word vector.
Further, the step of determining a similarity score result between the intelligent contract function and the preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set includes the following steps:
according to the result of cosine similarity calculation between the first grammar vector and each second grammar vector, grammar similarity scores of the first grammar vector and each second grammar vector are obtained;
according to the result of cosine similarity calculation between the first semantic vector and each second semantic vector, semantic similarity scores of the first semantic vector and each second semantic vector are obtained;
and obtaining a similarity score result between the intelligent contract function and each preset intelligent contract function according to the grammar similarity score, the semantic similarity score and the preset weight.
Further, the step of ranking the preset intelligent contract functions according to the similarity score result includes the following steps:
sorting the preset intelligent contract functions in descending order according to the similarity score result;
and returning the descending-order result as an index.
The invention also provides a system for ordering similarity of intelligent contract functions, comprising:
the acquisition module is used for acquiring an input intelligent contract function;
the first processing module is used for obtaining a first grammar vector according to the intelligent contract function and obtaining a grammar vector set according to a preset function library;
the second processing module is used for obtaining a first semantic vector according to the intelligent contract function and obtaining a semantic vector set according to a preset function library;
the scoring module is used for determining a similarity scoring result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
the sorting processing module is used for sorting the preset intelligent contract function according to the similarity score result;
the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, and the semantic vector set comprises the second semantic vectors of the plurality of preset intelligent contract functions.
The invention also provides a system for ordering similarity of intelligent contract functions, comprising:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the method for ordering intelligent contract function similarities.
The present invention also provides a storage medium storing instructions executable by a processor, wherein: the processor executes the processor-executable instructions to perform the method for ordering intelligent contract function similarity.
The invention has the following beneficial effects: a first grammar vector is obtained according to the intelligent contract function, and a grammar vector set is obtained according to a preset function library; a first semantic vector is obtained according to the intelligent contract function, and a semantic vector set is obtained according to the preset function library; a similarity score result between the intelligent contract function and each preset intelligent contract function is determined according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set; and the preset intelligent contract functions are ranked according to the similarity score result. The method and the device fully consider both the grammar and the semantics of the intelligent contract function and the preset intelligent contract functions, and perform the ranking according to the similarity score result obtained from the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set, which ensures an accurate similarity measure between the intelligent contract function and the preset intelligent contract functions. Programmers who need recommendation or retrieval of preset intelligent contract functions can thus quickly find the required preset intelligent contract function through the ranking, which improves their coding efficiency.
Drawings
FIG. 1 is a schematic flow chart of the steps of the method of the present invention;
FIG. 2 is a diagram illustrating an abstract syntax tree structure according to an embodiment of the present invention.
Detailed Description
The invention will be further explained and described below with reference to the drawings and the embodiments in the description. The step numbers in the embodiments of the present invention are set for convenience of illustration only; the order between the steps is not limited in any way, and the execution order of the steps in the embodiments can be adaptively adjusted according to the understanding of those skilled in the art.
As shown in FIG. 1, the method for ordering similarity of intelligent contract functions includes the following steps:
acquiring an input intelligent contract function;
obtaining a first grammar vector according to an intelligent contract function, and obtaining a grammar vector set according to a preset function library;
obtaining a first semantic vector according to an intelligent contract function, and obtaining a semantic vector set according to a preset function library;
determining a similarity score result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
ranking the preset intelligent contract functions according to the similarity score result;
wherein the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, the semantic vector set comprises second semantic vectors of the plurality of preset intelligent contract functions, and the input intelligent contract function refers to a function contained in the input code.
In this embodiment, specifically, the following steps are included:
1) setting a preset lexical rule and a preset grammatical rule;
Taking the Solidity programming language as an example, a preset lexical rule (namely, a lexical rule capable of identifying the various tokens) is designed according to the Solidity grammar based on the ANTLR framework, and a context-free grammar is then adopted as the preset grammatical rule of the language. Whether a token sequence conforms to a certain preset grammatical rule is judged according to the preset grammatical rule; if it conforms, the grammar rule is combined with the corresponding input token string to generate a sentence.
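As a non-limiting illustration of step 1), the following Python sketch parses a contract function with the ANTLR runtime. It assumes that a Solidity ANTLR grammar has already been compiled into generated classes named SolidityLexer and SolidityParser with a sourceUnit start rule; these class and rule names are assumptions of the sketch and are not prescribed by this embodiment.

```python
# A minimal sketch, assuming a Solidity ANTLR grammar has been compiled into
# Python classes SolidityLexer / SolidityParser with a `sourceUnit` start rule
# (these generated names are assumptions of this sketch, not of the patent).
from antlr4 import InputStream, CommonTokenStream
from SolidityLexer import SolidityLexer      # hypothetical generated lexer
from SolidityParser import SolidityParser    # hypothetical generated parser

def parse_contract(source_code: str):
    """Apply the preset lexical rule (source -> tokens) and the preset
    context-free grammatical rule (tokens -> parse tree)."""
    lexer = SolidityLexer(InputStream(source_code))
    tokens = CommonTokenStream(lexer)
    parser = SolidityParser(tokens)
    # Syntax errors are reported by ANTLR's default error listener if the
    # token sequence does not conform to the grammar.
    return parser.sourceUnit(), parser

tree, parser = parse_contract("contract C { function a() public payable { } }")
print(tree.toStringTree(recog=parser))
```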
2) Constructing the abstract syntax trees and extracting features;
S11: obtaining a first abstract syntax tree according to the preset lexical rule, the preset grammatical rule and the intelligent contract function, and obtaining a second abstract syntax tree of each preset intelligent contract function according to the preset lexical rule, the preset grammatical rule and the preset function library, wherein each preset intelligent contract function corresponds to one second abstract syntax tree, and the preset function library is a library composed of all the collected Solidity functions (the preset intelligent contract functions);
S12: respectively performing feature extraction on the non-keyword nodes of the first abstract syntax tree and of each second abstract syntax tree, wherein the non-keyword nodes are the nodes whose values are not Solidity keywords, such as variable names and method names in the code, and the extracted features include token features, parent features, sibling features and variable-use features, specifically:
Token features: the value of each non-keyword node, denoted by n; if the value is a local variable, it is replaced by #VAR;
Parent features: of the form (n, i1, L(t1)), (n, i2, L(t2)), (n, i3, L(t3)), where n is the i1-th child of t1, t1 is the i2-th child of t2, t2 is the i3-th child of t3, and L(t) denotes the value of abstract syntax tree node t; n denotes the value of the current non-keyword node, i.e. its token feature, and if n is a local variable it is likewise replaced by #VAR;
Sibling features: of the form (n, Next(n)) and (Prev(n), n), where Prev(n) denotes the value of the first non-keyword node before the current non-keyword node and Next(n) denotes the value of the first non-keyword node after it; if n, Prev(n) or Next(n) is a local variable, it is likewise replaced with #VAR;
Variable-use features: if n is a local variable, variable-use features are added, of the form (C(PrevUse(n)), C(n)) and (C(n), C(NextUse(n))), where PrevUse(n) refers to the previous occurrence of the same variable before the current one, NextUse(n) refers to the next occurrence of the same variable after it, and C(n) refers to a context feature of the current non-keyword node whose value is n, of the form (i, L(t)), where t is the parent node of the current node and i indicates that the current node is the i-th child of that parent node.
For example, as shown in FIG. 2, an abstract syntax tree is constructed for the following input code segment:
(The example Solidity code segment is shown as images in the original publication.)
Taking the non-keyword node j numbered a in FIG. 2 as an example, its token feature is #VAR; its parent features are (#VAR, 1, ...), (#VAR, 1, uint...), (#VAR, 1, ...); its sibling feature is (#VAR, #VAR), where the first #VAR represents j, which is replaced with #VAR because it is a local variable, and the second #VAR represents 10, which is also replaced with #VAR because it is a specific number; its variable-use feature is formed from the previous use of the variable, while there is no NextUse feature because the variable does not appear again after this occurrence.
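To make the four feature types concrete, the following Python sketch extracts them from a toy tree of AST nodes. The Node class, its fields and the tiny tree at the end are hypothetical stand-ins for whatever node representation the parser produces; they are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    value: str                       # value (token) of this AST node
    is_local_var: bool = False       # local variable or literal -> replaced by #VAR
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

def norm(n: Node) -> str:
    return "#VAR" if n.is_local_var else n.value

def child_index(n: Node) -> int:
    return n.parent.children.index(n) + 1 if n.parent else 0

def token_feature(n: Node) -> str:
    return norm(n)

def parent_features(n: Node, depth: int = 3):
    """(n, i_k, L(t_k)) for up to `depth` nearest ancestors t_1..t_depth."""
    feats, cur = [], n
    for _ in range(depth):
        if cur.parent is None:
            break
        feats.append((norm(n), child_index(cur), cur.parent.value))
        cur = cur.parent
    return feats

def sibling_features(n: Node, prev_nk: Optional[Node], next_nk: Optional[Node]):
    """(n, Next(n)) and (Prev(n), n) over neighbouring non-keyword nodes."""
    feats = []
    if next_nk is not None:
        feats.append((norm(n), norm(next_nk)))
    if prev_nk is not None:
        feats.append((norm(prev_nk), norm(n)))
    return feats

def context(n: Node):
    """C(n) = (i, L(t)): n is the i-th child of its parent t."""
    return (child_index(n), n.parent.value if n.parent else None)

def variable_use_features(n: Node, prev_use: Optional[Node], next_use: Optional[Node]):
    feats = []
    if n.is_local_var:
        if prev_use is not None:
            feats.append((context(prev_use), context(n)))
        if next_use is not None:
            feats.append((context(n), context(next_use)))
    return feats

# Tiny usage on a fragment standing in for "uint j = 10" (structure is illustrative only).
decl = Node("variableDeclaration")
j = Node("j", is_local_var=True, parent=decl)
ten = Node("10", is_local_var=True, parent=decl)
decl.children = [j, ten]
print(token_feature(j), parent_features(j), sibling_features(j, None, ten))
```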
S13: constructing a feature library (namely, the equivalent of the second feature extraction result) from the features extracted from each second abstract syntax tree;
S14: according to the second feature extraction result, performing one-hot coding on the features extracted from each second abstract syntax tree to obtain the grammar vector set;
S15: after the features of the first abstract syntax tree are extracted, a first feature extraction result is obtained, and one-hot coding is performed according to the first feature extraction result and the second feature extraction result to obtain the first grammar vector, wherein the second feature extraction result comprises the first feature extraction result.
For example, suppose the preset function library contains a first, a second and a third preset intelligent contract function. Extracting the features of the second abstract syntax tree of the first preset intelligent contract function yields features f1 and f2; extracting the features of the second abstract syntax tree of the second preset intelligent contract function yields features f3 and f4; and extracting the features of the second abstract syntax tree of the third preset intelligent contract function yields features f4 and f5. The feature library is therefore f1, f2, f3, f4, f5. After one-hot coding, the second grammar vector of the first preset intelligent contract function is (1,1,0,0,0), the second grammar vector of the second preset intelligent contract function is (0,0,1,1,0), and the second grammar vector of the third preset intelligent contract function is (0,0,0,1,1); these three second grammar vectors form the grammar vector set. If the features extracted from the first abstract syntax tree of the input intelligent contract function are f2 and f3, the first grammar vector obtained after one-hot coding against the feature library is (0,1,1,0,0).
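A minimal Python sketch of S13 to S15 on the example above follows; the feature names f1 to f5 are the placeholders from the text, and the helper names are the sketch's own.

```python
def build_feature_library(preset_feature_lists):
    """S13: collect every distinct feature of the second abstract syntax trees,
    in first-seen order, to form the feature library."""
    library = []
    for feats in preset_feature_lists:
        for f in feats:
            if f not in library:
                library.append(f)
    return library

def one_hot(features, library):
    """S14/S15: one-hot code a feature set against the feature library."""
    return [1 if f in features else 0 for f in library]

# The worked example from the text.
preset_features = [["f1", "f2"], ["f3", "f4"], ["f4", "f5"]]
library = build_feature_library(preset_features)             # ['f1', 'f2', 'f3', 'f4', 'f5']
grammar_vector_set = [one_hot(f, library) for f in preset_features]
# -> [1,1,0,0,0], [0,0,1,1,0], [0,0,0,1,1]
first_grammar_vector = one_hot(["f2", "f3"], library)         # -> [0,1,1,0,0]
print(grammar_vector_set, first_grammar_vector)
```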
3) Preprocessing;
All the preset intelligent contract functions and the input intelligent contract function are preprocessed, wherein the preprocessing comprises the following steps:
splitting camel-case variable names into multiple words;
filtering out Solidity keywords, but retaining preset special-meaning keywords (such as keywords that strongly affect or restrict a function's behaviour: payable, pure, constant, view, memory, storage). For example, the functions function a() public payable { … } and function a() public { … } differ only in payable, yet in an Ethereum intelligent contract balance-related operations can be performed only when payable is present and cannot be performed without it, so the application scenarios of the two functions are completely different;
filtering out variable strings that carry no meaning;
filtering out separators, such as ";", and so on.
Therefore, after the preset function library is preprocessed, a sentence set is obtained, wherein the sentence set comprises first sentences of each preset intelligent contract function, and each first sentence is composed of at least one first word;
preprocessing the intelligent contract function to obtain a second sentence, wherein the second sentence is composed of at least one second word;
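A minimal Python sketch of this preprocessing, under stated assumptions, is given below: the keyword lists are abbreviated and illustrative, the regular expressions are the sketch's own choice, and the filter for meaningless variable strings is omitted.

```python
import re

# Special-meaning keywords that must be kept (list taken from the text above).
SPECIAL_KEYWORDS = {"payable", "pure", "constant", "view", "memory", "storage"}
# Ordinary Solidity keywords to filter out (abbreviated, illustrative only).
ORDINARY_KEYWORDS = {"function", "public", "private", "internal", "external",
                     "returns", "return", "contract", "if", "else", "uint"}

def split_camel_case(identifier: str):
    """transferOwnership -> ['transfer', 'ownership']"""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", identifier)
    return [w.lower() for w in re.findall(r"[A-Za-z][a-z0-9]*", spaced)]

def preprocess(function_source: str):
    """Turn one intelligent contract function into one sentence (a list of words);
    separators such as ';' are dropped by the tokenizing regular expression."""
    raw_tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", function_source)
    words = []
    for tok in raw_tokens:
        if tok in SPECIAL_KEYWORDS:
            words.append(tok)                   # keep special-meaning keywords
        elif tok in ORDINARY_KEYWORDS:
            continue                            # filter ordinary keywords
        else:
            words.extend(split_camel_case(tok)) # split camel-case identifiers
    return words

print(preprocess("function transferOwnership(address newOwner) public payable { owner = newOwner; }"))
# -> ['transfer', 'ownership', 'address', 'new', 'owner', 'payable', 'owner', 'new', 'owner']
```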
4) Word vectorization;
S41: training on all the first sentences with the skip-gram model of word2vec to obtain a first word vector for each first word, i.e. each first word corresponds to one first word vector;
S42: matching against the first word vectors to obtain a second word vector for each second word;
wherein the first word comprises a second word and the first word vector comprises a second word vector.
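The skip-gram training of S41 and S42 can be sketched as follows; the choice of the gensim library (4.x API), the toy sentences and all hyper-parameter values are assumptions of the sketch, since the embodiment only specifies the skip-gram model of word2vec.

```python
from gensim.models import Word2Vec

# Toy sentence set standing in for the preprocessed preset function library.
first_sentences = [
    ["transfer", "ownership", "payable", "owner"],
    ["get", "balance", "view", "owner"],
    ["set", "price", "storage", "price"],
]

# S41: skip-gram (sg=1) training over all first sentences.
model = Word2Vec(sentences=first_sentences, vector_size=100, window=5,
                 min_count=1, sg=1, epochs=50, seed=1)
first_word_vectors = {w: model.wv[w] for w in model.wv.index_to_key}

# S42: match each second word against the trained vocabulary; how out-of-vocabulary
# words are handled is not specified by the patent, so they are skipped here.
second_sentence = ["transfer", "owner", "payable"]
second_word_vectors = [first_word_vectors[w] for w in second_sentence if w in first_word_vectors]
print(len(second_word_vectors), second_word_vectors[0].shape)   # 3 (100,)
```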
5) Obtaining the semantic vector set and the first semantic vector;
S51: calculating the weight of each first word with TF-IDF over the sentence set, and performing a weighted summation of the first word vectors of the first words in each first sentence according to these weights to obtain the semantic vector of each first sentence, wherein the semantic vector of a first sentence is the second semantic vector of the corresponding preset intelligent contract function, i.e. the semantic vectors of all the first sentences form the semantic vector set;
S52: performing a weighted summation of the second word vectors according to the weights to obtain the semantic vector of the second sentence, i.e. the first semantic vector.
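A self-contained Python sketch of S51 and S52 follows. The exact TF-IDF variant (here raw term frequency times a smoothed logarithmic inverse document frequency) is an assumption of the sketch, as are the toy sentences and the random stand-in word vectors.

```python
import math
from collections import Counter
import numpy as np

# Toy data standing in for the outputs of the previous steps (names are illustrative).
first_sentences = [["transfer", "ownership", "payable", "owner"],
                   ["get", "balance", "view", "owner"],
                   ["set", "price", "storage", "price"]]
second_sentence = ["transfer", "owner", "payable"]
dim = 8
rng = np.random.default_rng(0)
word_vectors = {w: rng.standard_normal(dim)          # stand-in for the word2vec word vectors
                for s in first_sentences for w in s}

def idf_weights(sentences):
    """Smoothed inverse document frequency of each first word over the sentence set."""
    n = len(sentences)
    df = Counter()
    for s in sentences:
        df.update(set(s))
    return {w: math.log(n / df[w]) + 1.0 for w in df}

def sentence_vector(sentence, word_vectors, idf, dim):
    """TF-IDF weighted sum of the word vectors of one sentence."""
    tf = Counter(sentence)
    vec = np.zeros(dim)
    for w, count in tf.items():
        if w in word_vectors and w in idf:
            vec += (count / len(sentence)) * idf[w] * word_vectors[w]
    return vec

idf = idf_weights(first_sentences)
# S51: the second semantic vectors of all preset functions form the semantic vector set.
semantic_vector_set = [sentence_vector(s, word_vectors, idf, dim) for s in first_sentences]
# S52: the first semantic vector of the input intelligent contract function.
first_semantic_vector = sentence_vector(second_sentence, word_vectors, idf, dim)
print(first_semantic_vector.shape)   # (8,)
```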
6) Calculating a similarity score;
s61: and (3) calculating the grammar similarity score, wherein the concrete formula is as follows:
SynSim = cos(SynVec1, SynVec2), where SynVec1 is the first grammar vector and SynVec2 is one of the second grammar vectors;
respectively carrying out cosine similarity calculation on the first grammar vector and each second grammar vector through the formula to obtain grammar similarity scores of the first grammar vector and each second grammar vector, namely obtaining grammar similarity scores with the number equal to the number of the second grammar vectors;
s62: calculating the semantic similarity score by a specific formula:
SemSim = cos(SemVec1, SemVec2), where SemVec1 is the first semantic vector and SemVec2 is one of the second semantic vectors;
by the formula, cosine similarity calculation is carried out on the first semantic vector and each second semantic vector respectively to obtain semantic similarity scores of the first semantic vector and each second semantic vector, namely the number of the semantic similarity scores is equal to the number of the second semantic vectors;
S63: determining the similarity score result between the intelligent contract function and the preset intelligent contract functions, with the calculation formula:
Sim = a × SemSim + b × SynSim, where a and b are the preset weights and a + b = 1. Applying the formula repeatedly finally yields the similarity score result between the intelligent contract function and each preset intelligent contract function;
7) Ranking;
S71: sorting the preset intelligent contract functions in descending order according to the similarity score result;
S72: returning the descending-order result as an index for the programmer to consult.
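Steps 6) and 7) can be sketched together in Python as below; the toy vectors and the weight values a = 0.6, b = 0.4 are illustrative assumptions, since the embodiment only requires a + b = 1.

```python
import numpy as np

def cos(u, v):
    """Cosine similarity; returns 0.0 for zero vectors to avoid division by zero."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    return float(np.dot(u, v) / (nu * nv)) if nu and nv else 0.0

def rank_presets(syn_vec1, grammar_vector_set, sem_vec1, semantic_vector_set, a=0.5, b=0.5):
    """S61-S63 and S71-S72: Sim = a*SemSim + b*SynSim (with a + b = 1), then return
    the preset-function indices sorted by Sim in descending order."""
    scores = [a * cos(sem_vec1, sem2) + b * cos(syn_vec1, syn2)
              for syn2, sem2 in zip(grammar_vector_set, semantic_vector_set)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Toy vectors standing in for the outputs of the previous steps (values are illustrative).
first_grammar_vector = [0, 1, 1, 0, 0]
grammar_vector_set = [[1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 0, 1, 1]]
first_semantic_vector = [0.2, 0.4, 0.1]
semantic_vector_set = [[0.3, 0.5, 0.0], [0.0, 0.1, 0.9], [0.4, 0.4, 0.2]]

order, scores = rank_presets(first_grammar_vector, grammar_vector_set,
                             first_semantic_vector, semantic_vector_set, a=0.6, b=0.4)
print(order, [round(s, 3) for s in scores])   # indices of the preset functions, best match first
```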
The invention also provides a system for ordering the similarity of the intelligent contract functions, which comprises:
the acquisition module is used for acquiring an input intelligent contract function;
the first processing module is used for obtaining a first grammar vector according to the intelligent contract function and obtaining a grammar vector set according to a preset function library;
the second processing module is used for obtaining a first semantic vector according to the intelligent contract function and obtaining a semantic vector set according to a preset function library;
the scoring module is used for determining a similarity scoring result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
the sorting processing module is used for sorting the preset intelligent contract function according to the similarity score result;
the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, and the semantic vector set comprises the second semantic vectors of the plurality of preset intelligent contract functions.
The embodiment of the invention also provides a system for ranking the similarity of intelligent contract functions, comprising:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the method for ordering intelligent contract function similarities.
The contents in the above method embodiments are all applicable to the present system embodiment, the functions specifically implemented by the present system embodiment are the same as those in the above method embodiment, and the beneficial effects achieved by the present system embodiment are also the same as those achieved by the above method embodiment.
In summary, compared with the prior art, the invention has the following advantages:
1) The grammatical features and the semantic features of the intelligent contract function are considered comprehensively and at the same time, so the similarity between the intelligent contract function and the preset intelligent contract functions can be measured better, and the accuracy of code search and recommendation is improved;
2) Programmers who need recommendation or retrieval of preset intelligent contract functions can quickly find the required preset intelligent contract function through the ranking, which improves their coding efficiency;
3) The code recommendation of the method can be implemented as a plug-in of programming software.
The embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The embodiment of the invention also provides a storage medium storing processor-executable instructions, and when a processor executes the processor-executable instructions, it performs the above method for ranking the similarity of intelligent contract functions.
It can also be seen that the contents in the above method embodiments are all applicable to the present storage medium embodiment, and the realized functions and advantageous effects are the same as those in the method embodiments.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in particular embodiments or otherwise described herein, e.g., as a sequential list of executable instructions that may be thought of as being useful to implement logical functions, may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that may fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
In the description herein, references to the description of the term "one embodiment," "the present embodiment," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for ranking the similarity of intelligent contract functions, characterized by comprising the following steps:
acquiring an input intelligent contract function;
obtaining a first grammar vector according to an intelligent contract function, and obtaining a grammar vector set according to a preset function library;
obtaining a first semantic vector according to an intelligent contract function, and obtaining a semantic vector set according to a preset function library;
determining a similarity score result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
ranking the preset intelligent contract functions according to the similarity score result;
the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, and the semantic vector set comprises the second semantic vectors of the plurality of preset intelligent contract functions.
2. The method of claim 1, wherein the method comprises: the method comprises the following steps of obtaining a first grammar vector according to an intelligent contract function and obtaining a grammar vector set according to a preset function library:
obtaining a first abstract syntax tree according to a preset lexical rule, a preset syntax rule and an intelligent contract function;
obtaining a second abstract syntax tree of each preset intelligent contract function according to the preset lexical rule, the preset syntax rule and the preset function library;
and obtaining a grammar vector set according to the second abstract grammar tree, and obtaining a first grammar vector according to the first abstract grammar tree and the second abstract grammar tree.
3. The method of claim 2, wherein the method comprises: the step of obtaining a grammar vector set according to the second abstract grammar tree and obtaining the first grammar vector according to the first abstract grammar tree and the second abstract grammar tree comprises the following steps:
performing feature extraction on the non-keyword nodes of the first abstract syntax tree to obtain a first feature extraction result;
performing feature extraction on the non-keyword nodes of each second abstract syntax tree to obtain a second feature extraction result;
according to the second feature extraction result, performing one-hot coding on the features extracted from each second abstract syntax tree to obtain a syntax vector set;
performing one-hot coding according to the first feature extraction result and the second feature extraction result to obtain a first grammar vector;
wherein the extracted features include at least one of a token feature, a parent feature, a sibling feature, and a variable use feature, and the second feature extraction result includes the first feature extraction result.
4. The method of claim 1, wherein the method comprises: the steps of obtaining a first semantic vector according to the intelligent contract function and obtaining a semantic vector set according to a preset function library comprise the following steps:
preprocessing a preset function library to obtain a sentence set, wherein the sentence set comprises first sentences of each preset intelligent contract function, and each first sentence consists of at least one first word;
preprocessing the intelligent contract function to obtain a second sentence, wherein the second sentence is composed of at least one second word;
training and learning are carried out according to the first sentences to obtain a first word vector of each first word;
obtaining a second word vector of each second word according to the first word vector;
obtaining a semantic vector set according to the first word vector, and obtaining a first semantic vector according to the second word vector;
wherein the preprocessing does not filter the preset special meaning keywords.
5. The method of claim 4, wherein the method comprises: the steps of obtaining a semantic vector set according to the first word vector and obtaining the first semantic vector according to the second word vector comprise the following steps:
obtaining the weight of each first word according to the sentence set, and obtaining the semantic vector of each first sentence according to the weight and the first word vector;
obtaining a semantic vector of a second sentence according to the second word vector and the weight;
the semantic vector of the first sentence is a second semantic vector of a preset intelligent contract function, the semantic vector of the second sentence is a first semantic vector, and the first word vector comprises a second word vector.
6. The method of claim 1, wherein the method comprises: the step of determining a similarity score result between the intelligent contract function and the preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set comprises the following steps:
according to the result of cosine similarity calculation between the first grammar vector and each second grammar vector, grammar similarity scores of the first grammar vector and each second grammar vector are obtained;
according to the result of cosine similarity calculation between the first semantic vector and each second semantic vector, semantic similarity scores of the first semantic vector and each second semantic vector are obtained;
and obtaining a similarity score result between the intelligent contract function and each preset intelligent contract function according to the grammar similarity score, the semantic similarity score and the preset weight.
7. The method of claim 1, wherein the method comprises: the step of ranking the preset intelligent contract functions according to the similarity score result comprises the following steps:
sorting the preset intelligent contract functions in descending order according to the similarity score result;
and returning the descending-order result as an index.
8. An intelligent contract function similarity ranking system, comprising:
the acquisition module is used for acquiring an input intelligent contract function;
the first processing module is used for obtaining a first grammar vector according to the intelligent contract function and obtaining a grammar vector set according to a preset function library;
the second processing module is used for obtaining a first semantic vector according to the intelligent contract function and obtaining a semantic vector set according to a preset function library;
the scoring module is used for determining a similarity scoring result between the intelligent contract function and a preset intelligent contract function according to the first grammar vector, the grammar vector set, the first semantic vector and the semantic vector set;
the sorting processing module is used for sorting the preset intelligent contract function according to the similarity score result;
the preset function library is provided with a plurality of preset intelligent contract functions, the grammar vector set comprises second grammar vectors of the plurality of preset intelligent contract functions, and the semantic vector set comprises the second semantic vectors of the plurality of preset intelligent contract functions.
9. An intelligent contract function similarity ranking system, comprising:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the method for ranking the similarity of intelligent contract functions according to any one of claims 1-7.
10. A storage medium storing processor-executable instructions, wherein a processor, when executing the processor-executable instructions, performs the method for ranking the similarity of intelligent contract functions according to any one of claims 1-7.
CN201911249429.3A 2019-12-09 2019-12-09 Ordering method, ordering system and storage medium for intelligent contract function similarity Active CN111158692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911249429.3A CN111158692B (en) 2019-12-09 2019-12-09 Ordering method, ordering system and storage medium for intelligent contract function similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911249429.3A CN111158692B (en) 2019-12-09 2019-12-09 Ordering method, ordering system and storage medium for intelligent contract function similarity

Publications (2)

Publication Number Publication Date
CN111158692A true CN111158692A (en) 2020-05-15
CN111158692B CN111158692B (en) 2023-05-02

Family

ID=70555799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911249429.3A Active CN111158692B (en) 2019-12-09 2019-12-09 Ordering method, ordering system and storage medium for intelligent contract function similarity

Country Status (1)

Country Link
CN (1) CN111158692B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030004716A1 (en) * 2001-06-29 2003-01-02 Haigh Karen Z. Method and apparatus for determining a measure of similarity between natural language sentences
CN108459860A (en) * 2018-03-28 2018-08-28 成都链安科技有限公司 Block chain intelligence forms of contract chemical examination card code converter and conversion method
CN110109675A (en) * 2019-04-30 2019-08-09 翟红鹰 Intelligent contract processing method, device and computer readable storage medium
CN110489973A (en) * 2019-08-06 2019-11-22 广州大学 A kind of intelligent contract leak detection method, device and storage medium based on Fuzz

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051144A (en) * 2021-03-26 2021-06-29 中山大学 Intelligent contract recommendation method and device
CN113127042A (en) * 2021-05-08 2021-07-16 中山大学 Intelligent contract recommendation method, equipment and storage medium
CN113760941A (en) * 2021-09-10 2021-12-07 北京航空航天大学 Intelligent contract program function retrieval method
CN113760941B (en) * 2021-09-10 2024-01-05 北京航空航天大学 Intelligent contract program function retrieval method

Also Published As

Publication number Publication date
CN111158692B (en) 2023-05-02


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant