CN111857728A - Code abstract generation method and device - Google Patents

Code abstract generation method and device

Info

Publication number
CN111857728A
CN111857728A (application CN202010710215.8A)
Authority
CN
China
Prior art keywords
output
vector
code
vectors
sequence
Prior art date
Legal status
Granted
Application number
CN202010710215.8A
Other languages
Chinese (zh)
Other versions
CN111857728B (en)
Inventor
陈湘萍
黄少豪
周晓聪
郑子彬
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202010710215.8A
Publication of CN111857728A
Application granted
Publication of CN111857728B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 - Arrangements for software engineering
    • G06F 8/40 - Transformation of program code
    • G06F 8/41 - Compilation
    • G06F 8/42 - Syntactic analysis
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/34 - Browsing; Visualisation therefor
    • G06F 16/345 - Summarisation for human users
    • G06F 8/44 - Encoding
    • G06F 8/443 - Optimisation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a code abstract generation method and device. The method comprises the following steps: encoding the extracted code features to obtain a plurality of state vectors; aggregating the plurality of state vectors into one aggregated vector using an attention mechanism; decoding the aggregated vector together with the previous output vector to obtain the current output vector, and thereby all output vectors; relating all output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors; and obtaining output words in sequence from the sequentially optimized output vectors, then combining all output words in order to obtain the code abstract. Because a decoder based on a bidirectional model converts all output vectors into sequentially optimized output vectors, the output words can be combined directly in order when the code abstract is generated. This improves both the accuracy of the generated code abstract and the effectiveness of the code abstract generation model.

Description

Code abstract generation method and device
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a code abstract generating method and device.
Background
Code summary generation aims to use natural language processing techniques from the field of artificial intelligence so that a computer can understand the function of code and generate a summary describing that function. This technique helps programmers read and understand code more efficiently, and therefore maintain and modify programs more efficiently.
The mainstream model structure currently used for code abstract generation is the encoder-decoder architecture with attention: an encoder encodes the extracted code features into state vectors, an attention mechanism aggregates the plurality of state vectors output by the encoder into one state vector, and a decoder decodes the aggregated state vector into the words to be output, finally yielding a one-sentence summary that describes the function of the input code.
The features of the code mainly include three types: the first is the plain-text feature of the code, the second is the abstract syntax tree (AST) feature of the code, and the third is the logic execution feature of the code.
The text feature of the code, as the name implies, uses the text of the code directly as the feature. In the translation task of natural language processing, when translating English into Chinese, the English text is used directly as the feature; likewise, in the code abstract generation task, the text of the code is used directly as the feature.
Using the abstract syntax tree as the feature of the code preserves both the structural information of the code (internal nodes) and information such as variable names, values and attributes (leaf nodes).
When the code is compiled into assembly language or byte code, the computer executes the instructions sequentially, line by line; when a jump instruction such as "goto" or "jump" is encountered, the computer jumps to a certain line and continues to execute sequentially from there. A logic execution graph can be constructed from every possible order of instruction execution, and this graph is the logic execution feature extracted from the code.
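For illustration, the following is a minimal sketch (not taken from the patent) of how such a logic execution graph could be built from a linear instruction list; the toy instruction format ("goto <n>", "if_goto <n>") is an assumption made only for this example.

```python
def build_logic_execution_graph(instructions):
    """Return a set of (src, dst) edges over instruction indices."""
    edges = set()
    for i, ins in enumerate(instructions):
        parts = ins.split()
        if parts[0] == "goto":                 # unconditional jump: only the jump target follows
            edges.add((i, int(parts[1])))
        elif parts[0] == "if_goto":            # conditional jump: fall-through or jump target
            edges.add((i, int(parts[1])))
            if i + 1 < len(instructions):
                edges.add((i, i + 1))
        else:                                  # ordinary instruction: fall through to the next line
            if i + 1 < len(instructions):
                edges.add((i, i + 1))
    return edges

# Example: 0: load, 1: if_goto 3, 2: goto 0, 3: return
print(sorted(build_logic_execution_graph(["load", "if_goto 3", "goto 0", "return"])))
# [(0, 1), (1, 2), (1, 3), (2, 0)]
```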
In summary, the three kinds of code features have three corresponding representations: the linear sequence, the tree, and the graph.
The most mainstream and best-performing methods at present are all based on the abstract-syntax-tree features of the code.
Deep Code Comment Generation proposes the DeepCom model, which converts the abstract syntax tree into its traversal sequence using a traversal method known as structure-based traversal (SBT), and then converts the code into its summary using a Seq2Seq model with attention.
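The structure-based traversal itself can be sketched as follows; the Node class and the exact token format are illustrative assumptions rather than the paper's or the patent's implementation.

```python
# Each subtree is wrapped in brackets labelled with its root node, so the linear
# token sequence still encodes the tree structure.
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def sbt(node):
    """Return the SBT token sequence of the subtree rooted at `node`."""
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(sbt(child))
    tokens.extend([")", node.label])
    return tokens

# Example: a tiny AST for `return x`
tree = Node("Return", [Node("Name_x")])
print(" ".join(sbt(tree)))   # ( Return ( Name_x ) Name_x ) Return
```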
Automatic Source Code Summarization with Extended Tree-LSTM proposes a tree-based LSTM model: encoding proceeds bottom-up from the leaf nodes, and for each intermediate node the output vector of that node is obtained from the output vectors of its child nodes together with the node's own input; finally every node of the abstract syntax tree has an output vector, which completes the encoder's work.
The code2seq model randomly selects multiple pairs of leaf nodes on the abstract syntax tree; each pair of leaf nodes forms a path, so multiple paths are obtained from the leaf nodes, and each path is encoded with an RNN model. Obtaining the encodings of the paths completes the encoder's work.
The above methods differ only in how their encoders work; their decoders all operate as follows: (1) aggregate the multiple vectors output by the encoder into one vector using the attention mechanism; (2) input "< START >" into the decoder at the first step, and at every later step input the word output by the decoder at the previous step; (3) the decoder obtains the output word from the combined action of the input word and the aggregated vector; (4) repeat the above process until the decoder outputs "< END >". From this description it can be seen that such a decoder generates unidirectionally, i.e. the words of a sentence are output one after another in order.
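As a rough illustration of steps (1)-(4), the following Python sketch shows such a unidirectional greedy decoding loop; the callables attention, decoder_step, vocab_project and embed, as well as the greedy word choice, are assumptions standing in for whatever concrete models a given method uses.

```python
def greedy_decode(encoder_states, attention, decoder_step, vocab_project,
                  embed, start_id, end_id, init_hidden, max_len=50):
    """Conventional unidirectional decoding: one word after another, in order."""
    hidden, word_id = init_hidden, start_id
    output_ids = []
    for _ in range(max_len):
        context = attention(encoder_states, hidden)     # (1) aggregate encoder outputs
        y_prev = embed(word_id)                         # (2) previous output word ("<START>" at first)
        hidden = decoder_step(y_prev, context, hidden)  # (3) word + aggregated vector -> new state
        word_id = vocab_project(hidden).argmax()        # pick the most probable next word
        if word_id == end_id:                           # (4) stop once "<END>" is produced
            break
        output_ids.append(word_id)
    return output_ids
```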
Therefore, the prior art has mostly focused on improving the quality of the information obtained from the code by improving the way the encoder works, and has paid little attention to improving the decoder.
Disclosure of Invention
The invention aims to provide a code abstract generation method and device that improve the model effect and the accuracy of the generated code abstract through a bidirectional decoder.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present invention provides a code summary generating method, including:
coding the extracted code features to obtain a plurality of state vectors;
aggregating the plurality of state vectors into one aggregated vector using an attention mechanism;
decoding the aggregated vector and the previous output vector to obtain the current output vector, and thereby all output vectors;
relating all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors;
and obtaining output words in sequence from the sequentially optimized output vectors, and combining all the output words in order to obtain the code abstract.
As an optional technical solution of the present invention, before encoding the extracted code features to obtain a plurality of state vectors, the method further includes:
extracting the features of the code, wherein the features are text features, abstract syntax tree features, or logic execution features of the code.
As an optional technical solution of the present invention, when the feature of the code is a text feature, the encoding the extracted code feature to obtain a plurality of state vectors includes:
encoding the features of the code using a model capable of processing sequences, satisfying the following formula: z_1, z_2, ..., z_m = Encoder(x), where Encoder is a coding model that can process a sequence, x represents the feature of the code serving as the input of the encoder, and z_1, z_2, ..., z_m are the m state vectors output by the model.
As an optional technical solution of the present invention, the aggregating the plurality of state vectors into one aggregated vector using an attention mechanism includes:
aggregating the m state vectors using the attention mechanism to obtain the aggregated vector context_t, which satisfies the following formula:

context_t = sum_{i=1}^{m} a(z_i, h_{t-1}) · v(z_i)
where the output of the v function is a vector, h_{t-1} denotes the (t-1)-th output vector, and the output of the a function is a constant. The a function satisfies the following formula:

a(z_i, h_{t-1}) = exp(z_i · h_{t-1}) / sum_{j=1}^{m} exp(z_j · h_{t-1})

The v function and the a function belong to the attention mechanism.
As an optional technical solution of the present invention, the decoding the aggregation vector and the previous output vector to obtain a current output vector, and thus obtaining all output vectors includes:
the context_t and h_{t-1} are decoded to obtain the t-th output vector h_t, satisfying the following formula: h_t = f(h_{t-1}, context_t), where f is a decoding function.
As an optional technical solution of the present invention, the decoding the aggregation vector and the previous output vector to obtain a current output vector, and thus obtaining all output vectors includes:
the context_t and h_{t-1} are decoded to obtain the t-th output vector h_t, satisfying the following formula: h_t = f(h_{t-1}, context_t, u(h_{t-1})), where u is a transformation function that transforms the (t-1)-th output vector h_{t-1} into another vector.
As an optional technical solution of the present invention, the relating of all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors includes:
judging whether the current output vector is the last one;
if yes, all the output vectors h are related to one another using the bidirectional model to obtain the sequentially optimized output vectors o.
As an optional technical solution of the present invention, the sequentially obtaining output words according to the sequentially optimized output vectors, and combining all the output words according to the sequence of the output words to obtain a code abstract includes:
the sequentially optimized output vectors o are transformed to obtain the word distributions d of the output words, satisfying the following expression: d_t = g(o_t), where o_t is the t-th sequentially optimized output vector, g is the transformation function model, and d_t is the word distribution of the t-th output word;
from the word distribution d_t, the word with the highest probability is selected as the t-th output word, and all the output words are obtained accordingly;
and combining all the output words according to the sequence of the output words to obtain the code abstract.
In a second aspect, the present invention provides a code summary generating apparatus, including:
an extractor for extracting features of the code;
the encoder is used for encoding the extracted code features to obtain a plurality of state vectors;
the attention mechanism module is used for aggregating the plurality of state vectors into one aggregated vector;
the decoder is used for decoding the aggregated vector and the previous output vector to obtain the current output vector, and thereby all output vectors, and for relating all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors;
and the generator is used for obtaining output words in sequence from the sequentially optimized output vectors, and combining all the output words in order to obtain the code abstract.
As an optional technical solution of the present invention, the generator includes:
the transformation unit is used for transforming the sequentially optimized output vectors to obtain the word distributions of the output words;
the screening unit is used for selecting the word with the highest probability from each word distribution as the corresponding output word, thereby obtaining all the output words;
and the combination unit is used for combining all the output words according to the sequence of the output words to obtain the code abstract.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
According to the code abstract generation method and device provided by the embodiments of the invention, after the code features are encoded and aggregated in sequence, the decoding stage uses a bidirectional-model-based decoding scheme to convert all output vectors into sequentially optimized output vectors. When the code abstract is generated, all output words can therefore be combined directly in order to obtain the code abstract, which improves both the accuracy of the code abstract and the effectiveness of the code abstract generation model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
The structures, proportions, sizes and the like shown in this specification are only intended to match the content disclosed in the specification so that those skilled in the art can understand and read the invention; they do not limit the conditions under which the invention can be implemented and therefore have no essential technical significance. Any structural modification, change of proportional relationship or adjustment of size that does not affect the functions and purposes of the invention still falls within the scope covered by the disclosure.
Fig. 1 is a schematic diagram of a code digest generation method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to Fig. 1, the present embodiment provides a code digest generation method based on an encoder-decoder + attention model structure.
Specifically, the input is the features of the code. There are three kinds of code features (text features, abstract syntax tree features, and logic execution features), and different features correspond to different encoding schemes; the sequence feature (text feature), denoted by x, is taken as an example here. The encoder is responsible for encoding the code features into state vectors, satisfying the following formula: z_1, z_2, ..., z_m = Encoder(x), where Encoder is a coding model that can process a sequence, x represents the feature of the code serving as the input of the encoder, and z_1, z_2, ..., z_m are the m state vectors output by the model.
Specifically, the encoder model may be any model for processing sequences, such as an RNN, LSTM, GRU or Transformer.
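As a concrete illustration, the following is a minimal PyTorch sketch of such a sequence encoder producing m state vectors z_1, ..., z_m from the code feature sequence x; the choice of an LSTM and the layer sizes are assumptions made only for this example.

```python
import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, x):                      # x: (batch, m) code token ids
        z, _ = self.lstm(self.embed(x))        # z: (batch, m, hidden_dim) = z_1, ..., z_m
        return z

encoder = SeqEncoder(vocab_size=10000)
z = encoder(torch.randint(0, 10000, (1, 20)))  # 20 code tokens -> 20 state vectors
print(z.shape)                                  # torch.Size([1, 20, 256])
```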
In Fig. 1, the attention mechanism is responsible for the input to the decoder at each step. The attention mechanism aggregates the multiple output vectors of the encoder to obtain the aggregated vector context_t, which satisfies the following expression:

context_t = sum_{i=1}^{m} a(z_i, h_{t-1}) · v(z_i)

where the output of the v function is a vector and h_{t-1} denotes the (t-1)-th output vector.
The output of the a function is a constant and can be regarded as a weighting coefficient. The a function satisfies the following formula:

a(z_i, h_{t-1}) = exp(z_i · h_{t-1}) / sum_{j=1}^{m} exp(z_j · h_{t-1})

The above-mentioned v function and a function belong to the attention mechanism.
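To make the aggregation step concrete, the following PyTorch sketch computes the weighting coefficients with a softmax over dot products and, purely as an illustrative assumption, takes the v function to be the identity mapping on the state vectors.

```python
import torch
import torch.nn.functional as F

def attend(z, h_prev):
    """z: (m, d) encoder state vectors z_1..z_m; h_prev: (d,) previous output vector h_{t-1}."""
    a = F.softmax(z @ h_prev, dim=0)            # a function: one weighting coefficient per z_i
    v = z                                        # v function: identity mapping (illustrative choice)
    return (a.unsqueeze(1) * v).sum(dim=0)       # context_t = sum_i a(z_i, h_{t-1}) * v(z_i)

context_t = attend(torch.randn(20, 256), torch.randn(256))
print(context_t.shape)    # torch.Size([256])
```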
Further, in a unidirectional decoder, the output vector of the decoder satisfies the following formula: h_t = f(h_{t-1}, context_t, y_{t-1}).
The output of the f function is a vector, h_{t-1} is the (t-1)-th output vector of the decoder, context_t is the context vector from the attention mechanism, and y_{t-1} is the word vector of the (t-1)-th output word of the decoder.
Since the present embodiment is based on a bidirectional decoder, the final output vectors are obtained only after the order has been optimized. Therefore, the word vector of the (t-1)-th output word of the decoder cannot be obtained directly.
To this end, the present embodiment provides two innovative approaches applied to the code summarization method based on a bidirectional decoder, specifically:
First, h_t = f(h_{t-1}, context_t), i.e. the parameter y_{t-1} is removed from the f function directly, so that the decoder does not need to know the previous word vector when generating the next output vector;
Second, h_t = f(h_{t-1}, context_t, u(h_{t-1})), where u is a transformation function that transforms the (t-1)-th output vector h_{t-1} into another vector, i.e. u(h_{t-1}) is used in place of y_{t-1}; u(h_{t-1}) may be optimized by a machine learning mechanism.
Therefore, by using either of these two innovative methods, the output vectors can be obtained in sequence, and thereby all the output vectors can be obtained.
It should be noted that when the first output vector h_1 is to be obtained, a specific vector is used in place of the first parameter h_{t-1} of the f function. That is, "<START>" (which may also be represented by 0 in Fig. 1) is initially input to the decoder, and according to this instruction the decoder substitutes the specific vector.
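The following sketch shows one decoder step for each of the two variants described above; the use of a GRUCell as the decoding function f, the zero vector for "<START>", and the layer sizes are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

hidden_dim = 256
f_variant1 = nn.GRUCell(hidden_dim, hidden_dim)        # input: context_t only
f_variant2 = nn.GRUCell(2 * hidden_dim, hidden_dim)    # input: context_t and u(h_{t-1})
u = nn.Linear(hidden_dim, hidden_dim)                   # transformation function u

h_prev = torch.zeros(1, hidden_dim)                     # "<START>": a specific (zero) vector
context_t = torch.randn(1, hidden_dim)                  # aggregated vector from the attention step

h_t_v1 = f_variant1(context_t, h_prev)                                   # h_t = f(h_{t-1}, context_t)
h_t_v2 = f_variant2(torch.cat([context_t, u(h_prev)], dim=1), h_prev)    # h_t = f(h_{t-1}, context_t, u(h_{t-1}))
print(h_t_v1.shape, h_t_v2.shape)                       # torch.Size([1, 256]) torch.Size([1, 256])
```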
Whether the current output vector is the last one is determined in real time.
If so, i.e. when the last output vector (h_n in Fig. 1) has been obtained, all output vectors can be related to one another using a bidirectional model to obtain the sequentially optimized output vectors.
Specifically, bidirectional models include, but are not limited to, the bidirectional RNN, bidirectional LSTM, bidirectional GRU and Transformer. All the output vectors h are related to one another to obtain the sequentially optimized output vectors o. The output order of the sequentially optimized output vectors o is identical to the word order of the finally generated sentence, i.e. the t-th sequentially optimized output vector o_t corresponds to the t-th word of the sentence; therefore the vector o_t can be used to obtain the word distribution of the t-th output word, and the t-th output word of the decoder is obtained from that word distribution.
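A minimal sketch of this re-ordering step, assuming a bidirectional LSTM as the bidirectional model and a linear projection back to the decoder's hidden size, could look as follows.

```python
import torch
import torch.nn as nn

hidden_dim = 256
bi_model = nn.LSTM(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
project = nn.Linear(2 * hidden_dim, hidden_dim)

h = torch.randn(1, 12, hidden_dim)       # h_1, ..., h_n from the decoder (n = 12 here)
o_bi, _ = bi_model(h)                      # every h_t now interacts with all other output vectors
o = project(o_bi)                          # o_1, ..., o_n: sequentially optimized output vectors
print(o.shape)                             # torch.Size([1, 12, 256])
```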
Specifically, the word distributions d of the output words are obtained by transforming the sequentially optimized output vectors o, with the expression: d_t = g(o_t), where o_t is the t-th sequentially optimized output vector, g is the transformation function model, and d_t is the word distribution of the t-th output word.
Then, from the word distribution d_t, the word with the highest probability is selected as the t-th output word, and all the output words are obtained accordingly.
Finally, all the output words are combined according to their order to obtain the code abstract.
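The generation step can be sketched as follows, under the assumption that g is a linear layer followed by a softmax and that a toy vocabulary list is available; both choices are illustrative only.

```python
import torch
import torch.nn as nn

vocab = ["<END>", "return", "the", "sum", "of", "two", "numbers"]
g = nn.Sequential(nn.Linear(256, len(vocab)), nn.Softmax(dim=-1))

o = torch.randn(1, 12, 256)                  # sequentially optimized output vectors o_1..o_n
d = g(o)                                      # d_t = g(o_t): a word distribution per position
word_ids = d.argmax(dim=-1)[0]                # highest-probability word at each position
words = [vocab[i] for i in word_ids.tolist()]
summary = " ".join(w for w in words if w != "<END>")   # combine the words in order
print(summary)
```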
In summary, with the code abstract generation method provided by the embodiments of the invention, after the code features are encoded and aggregated in sequence, the decoding stage uses a bidirectional-model-based decoding scheme to convert all output vectors into sequentially optimized output vectors. When the code abstract is generated, all output words can therefore be combined directly in order to obtain the code abstract, which improves both the accuracy of the code abstract and the effectiveness of the code abstract generation model.
For example, as a specific application scenario of the embodiment:
According to human thinking, when we want to say a sentence, it is often not generated in order in the brain: we first think of several keywords, and then obtain the sentence we want to say by adding conjunctions and rearranging the word order. For example, for the sentence "I feel that apples are better than bananas", the words "apple", "banana", "better", "than", "feel", "I" may come to mind first, and the sentence is then obtained by recombination. That is, the more critical words have a higher probability of being thought of first.
Therefore, compared with the prior art, the code abstract generation method provided by the embodiments of the invention generates code abstracts with better effect and higher accuracy.
In another embodiment of the present application, a code summary generating apparatus is further provided, which is used to implement the code summary generating method. Specifically, the code digest generation apparatus includes:
an extractor for extracting features of the code;
the encoder is used for encoding the extracted code features to obtain a plurality of state vectors;
the attention mechanism module is used for aggregating the plurality of state vectors into one aggregated vector;
the decoder is used for decoding the aggregated vector and the previous output vector to obtain the current output vector, and thereby all output vectors, and for relating all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors;
and the generator is used for obtaining output words in sequence from the sequentially optimized output vectors, and combining all the output words in order to obtain the code abstract.
Further, the generator includes:
the transformation unit is used for transforming the sequentially optimized output vectors to obtain the word distributions of the output words;
the screening unit is used for selecting the word with the highest probability from each word distribution as the corresponding output word, thereby obtaining all the output words;
and the combination unit is used for combining all the output words according to the sequence of the output words to obtain the code abstract.
It should be noted that the specific implementation principle of the code summary generation apparatus has been explained in the above method embodiment, and is not described herein again.
According to the code abstract generation device provided by the embodiments of the invention, after the code features are encoded and aggregated in sequence, the decoding stage uses a decoder based on a bidirectional model to convert all output vectors into sequentially optimized output vectors. When the code abstract is generated, all output words can therefore be combined directly in order to obtain the code abstract, which improves both the accuracy of the code abstract and the effectiveness of the code abstract generation model.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for generating a code digest, comprising:
coding the extracted code features to obtain a plurality of state vectors;
aggregating the plurality of state vectors into one aggregated vector using an attention mechanism;
decoding the aggregated vector and the previous output vector to obtain the current output vector, and thereby all output vectors;
relating all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors;
and obtaining output words in sequence from the sequentially optimized output vectors, and combining all the output words in order to obtain the code abstract.
2. The method of claim 1, wherein before the encoding the extracted code features to obtain a plurality of state vectors, the method further comprises:
extracting the features of the code, wherein the features are text features, abstract syntax tree features, or logic execution features of the code.
3. The method of claim 1, wherein when the feature of the code is a text feature, the encoding the extracted feature of the code to obtain a plurality of state vectors comprises:
encoding the features of the code using a model capable of processing sequences, satisfying the following formula: z_1, z_2, ..., z_m = Encoder(x), where Encoder is a coding model that can process a sequence, x represents the feature of the code serving as the input of the encoder, and z_1, z_2, ..., z_m are the m state vectors output by the model.
4. The method of generating a code summary according to claim 3, wherein the aggregating the plurality of state vectors into one aggregated vector using an attention mechanism comprises:
aggregating the m state vectors using the attention mechanism to obtain the aggregated vector context_t, which satisfies the following formula:

context_t = sum_{i=1}^{m} a(z_i, h_{t-1}) · v(z_i)
where the output of the v function is a vector, h_{t-1} denotes the (t-1)-th output vector, and the output of the a function is a constant. The a function satisfies the following formula:

a(z_i, h_{t-1}) = exp(z_i · h_{t-1}) / sum_{j=1}^{m} exp(z_j · h_{t-1})

The v function and the a function belong to the attention mechanism.
5. The method of claim 4, wherein the decoding the aggregate vector and the last output vector to obtain a current output vector and all output vectors therefrom comprises:
the context_t and h_{t-1} are decoded to obtain the t-th output vector h_t, satisfying the following formula: h_t = f(h_{t-1}, context_t), where f is a decoding function.
6. The method of claim 4, wherein the decoding the aggregate vector and the last output vector to obtain a current output vector and all output vectors therefrom comprises:
the context_t and h_{t-1} are decoded to obtain the t-th output vector h_t, satisfying the following formula: h_t = f(h_{t-1}, context_t, u(h_{t-1})), where u is a transformation function that transforms the (t-1)-th output vector h_{t-1} into another vector.
7. The method according to claim 5 or 6, wherein the using a bi-directional model to relate all the output vectors to each other to obtain a sequentially optimized output vector comprises:
judging whether the current output vector is the last one;
if yes, all the output vectors h are related to one another using the bidirectional model to obtain the sequentially optimized output vectors o.
8. The method of claim 7, wherein the obtaining output words in sequence according to the output vectors optimized in sequence, and combining all the output words according to the sequence of the output words to obtain the code abstract comprises:
the sequentially optimized output vectors o are transformed to obtain the word distributions d of the output words, satisfying the following expression: d_t = g(o_t), where o_t is the t-th sequentially optimized output vector, g is the transformation function model, and d_t is the word distribution of the t-th output word;
from the word distribution d_t, the word with the highest probability is selected as the t-th output word, and all the output words are obtained accordingly;
and combining all the output words according to the sequence of the output words to obtain the code abstract.
9. A code digest generation apparatus, comprising:
an extractor for extracting features of the code;
the encoder is used for encoding the extracted code features to obtain a plurality of state vectors;
an attention mechanism module for aggregating the plurality of state vectors into one aggregated vector;
the decoder is used for decoding the aggregated vector and the previous output vector to obtain the current output vector, and thereby all output vectors, and for relating all the output vectors to one another using a bidirectional model to obtain sequentially optimized output vectors;
and the generator is used for obtaining output words in sequence from the sequentially optimized output vectors, and combining all the output words in order to obtain the code abstract.
10. The code digest generation apparatus of claim 9, wherein the generator includes:
the transformation unit is used for transforming the sequentially optimized output vectors to obtain the word distributions of the output words;
the screening unit is used for selecting the word with the highest probability from each word distribution as the corresponding output word, thereby obtaining all the output words;
and the combination unit is used for combining all the output words according to the sequence of the output words to obtain the code abstract.
CN202010710215.8A 2020-07-22 2020-07-22 Code abstract generation method and device Active CN111857728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010710215.8A CN111857728B (en) 2020-07-22 2020-07-22 Code abstract generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010710215.8A CN111857728B (en) 2020-07-22 2020-07-22 Code abstract generation method and device

Publications (2)

Publication Number Publication Date
CN111857728A true CN111857728A (en) 2020-10-30
CN111857728B CN111857728B (en) 2021-08-31

Family

ID=73000942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010710215.8A Active CN111857728B (en) 2020-07-22 2020-07-22 Code abstract generation method and device

Country Status (1)

Country Link
CN (1) CN111857728B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113238798A (en) * 2021-04-19 2021-08-10 山东师范大学 Code abstract generation method, system, equipment and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090323820A1 (en) * 2008-06-30 2009-12-31 Microsoft Corporation Error detection, protection and recovery for video decoding
CN106980683A (en) * 2017-03-30 2017-07-25 中国科学技术大学苏州研究院 Blog text snippet generation method based on deep learning
CN108280112A (en) * 2017-06-22 2018-07-13 腾讯科技(深圳)有限公司 Abstraction generating method, device and computer equipment
CN108459874A (en) * 2018-03-05 2018-08-28 中国人民解放军国防科技大学 Code automatic summarization method integrating deep learning and natural language processing
US20200026760A1 (en) * 2018-07-23 2020-01-23 Google Llc Enhanced attention mechanisms
CN109145105A (en) * 2018-07-26 2019-01-04 福州大学 A kind of text snippet model generation algorithm of fuse information selection and semantic association
CN111290756A (en) * 2020-02-10 2020-06-16 大连海事大学 Code-annotation conversion method based on dual reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yao Wan et al.: "Improving Automatic Source Code Summarization via Deep Reinforcement Learning", 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE) *

Also Published As

Publication number Publication date
CN111857728B (en) 2021-08-31

Similar Documents

Publication Publication Date Title
CN110489102B (en) Method for automatically generating Python code from natural language
Peng et al. Incrementally learning the hierarchical softmax function for neural language models
CN112560456B (en) Method and system for generating generated abstract based on improved neural network
CN109933602A (en) A kind of conversion method and device of natural language and structured query language
CN111382574A (en) Semantic parsing system combining syntax under virtual reality and augmented reality scenes
CN111309896B (en) Deep learning text abstract generation method based on secondary attention
CN112764738A (en) Code automatic generation method and system based on multi-view program characteristics
CN113901847A (en) Neural machine translation method based on source language syntax enhanced decoding
CN112835585A (en) Program understanding method and system based on abstract syntax tree
CN111857728B (en) Code abstract generation method and device
CN109857458B (en) ANTLR-based AltaRica3.0 flattening transformation method
CN112417089A (en) High-parallelism reading understanding method based on deep learning
CN115543437A (en) Code annotation generation method and system
CN112464673B (en) Language meaning understanding method for fusing meaning original information
CN113887249A (en) Mongolian Chinese neural machine translation method based on dependency syntax information and Transformer model
CN113486647A (en) Semantic parsing method and device, electronic equipment and storage medium
CN113486180A (en) Remote supervision relation extraction method and system based on relation hierarchy interaction
JP2017182277A (en) Coding device, decoding device, discrete series conversion device, method and program
da Costa et al. Janus-faced physics: on Hilbert's 6th Problem
Neumann Paranatural category theory
CN112528667B (en) Domain migration method and device on semantic analysis
He et al. Comparative analysis of problem representation learning in math word problem solving
CN117033847B (en) Mathematical application problem solving method and system based on hierarchical recursive tree decoding model
Zhou et al. RWKV-based Encoder-Decoder Model for Code Completion
CN117573084B (en) Code complement method based on layer-by-layer fusion abstract syntax tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant