CN116661797A - Code completion method based on enhanced Transformer under word element granularity - Google Patents
- Publication number: CN116661797A
- Application number: CN202310543114.XA
- Authority
- CN
- China
- Prior art keywords: word, code, model, completion, java
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F8/436 — Semantic checking (G06F8/41 Compilation; G06F8/43 Checking; Contextual analysis)
- G06F8/315 — Object-oriented languages (G06F8/30 Creation or generation of source code; G06F8/31 Programming languages or programming paradigms)
- G06F8/425 — Lexical analysis (G06F8/42 Syntactic analysis)
- G06N3/04 — Neural networks: architecture, e.g. interconnection topology
- G06N3/08 — Neural networks: learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention belongs to the technical field of code completion, and in particular relates to a code completion method based on an enhanced Transformer at token (word-element) granularity. The invention comprises the following steps. S1: collect Java code snippets, construct a code corpus, and flatten Java source code into token-sequence form. S2: apply the BPE word-segmentation algorithm to the data and encode it with subwords to obtain the word vectors required by the model. S3: in the model architecture, use a Transformer Encoder to encode the word-vector information to be learned, and obtain the completion result through a Transformer Decoder. S4: improve the Multi-Head Attention used by the conventional Transformer model with Talking-Heads Attention. S5: in the inference decoding stage, use beam search to generate the top-5 recommended completions, and avoid repeated tokens in the recommendation list during code completion. The invention makes better use of the semantic information of the source code to perform token-granularity code completion, and the method can effectively improve the accuracy of code completion.
Description
Technical Field
The invention belongs to the technical field of code completion, and in particular relates to a code completion method based on an enhanced Transformer at token granularity.
Background
Code completion is an important part of intelligent software engineering. As a branch of automated software development, it can promptly suggest class names, method names, and the like to programmers during development, reducing the developers' typing burden and spelling errors and directly improving software-development efficiency.
Early code completion techniques performed completion and prediction through manually defined heuristic rules based on the entered code and grammar rules. Such completion schemes gradually lose effectiveness as versions change, and a great deal of manpower and material resources must be spent defining and revising new rounds of rules. Nowadays, learning models are trained on open-source corpora, so that the completed code follows the grammar rules of the programming language and achieves higher accuracy.
Disclosure of Invention
The technical problem the invention aims to solve is to further improve the accuracy of code completion, effectively assist software developers in using the model to improve efficiency during software development, and save development time. Based on deep-learning technology, this patent provides a code completion method based on an enhanced Transformer at token granularity, which helps improve the accuracy of code completion.
In order to achieve the aim of the invention, the technical scheme adopted by the invention is as follows:
A code completion method based on an enhanced Transformer at token granularity comprises the following steps:
S1: collecting Java code snippets, constructing a code corpus, and flattening Java source code into token-sequence form;
S2: applying the BPE word-segmentation algorithm to the data and encoding it with subwords to obtain the word vectors required by the model;
S3: in the model architecture, using a Transformer Encoder to encode the word-vector information to be learned, and obtaining the completion result through a Transformer Decoder;
S4: improving the Multi-Head Attention used by the conventional Transformer model with Talking-Heads Attention;
S5: in the inference decoding stage, using beam search to generate the top-5 recommended completions, and avoiding repeated tokens in the recommendation list during code completion.
Further, as a preferred technical solution of the present invention, the step S1 includes the following steps:
S1.1: searching GitHub for open-source projects with more than 10 stars, downloading them, and collecting Java methods;
S1.2: deleting the comments of each Java method, filtering out code with fewer than 5 lines, and deleting duplicated code snippets;
S1.3: flattening each segment of Java code into a single-line sequence and writing it into a file, the file serving as the corpus; dividing the corpus into training, validation, and test sets in a 4:4:2 ratio.
Further, as a preferred technical solution of the present invention, the step S2 includes the following steps:
S2.1: adopting the BPE algorithm to encode the data as subwords;
S2.2: when constructing the vocabulary with BPE, first splitting all tokens into character sequences and building an initial vocabulary from all the characters; then counting the frequency of each contiguous byte pair in the training corpus, merging the most frequent byte pair into a new subword, updating the vocabulary, and repeating the previous step until the highest frequency among the remaining byte pairs is 1;
S2.3: during corpus encoding, sorting all subwords in the table from longest to shortest; for each given token, traversing the sorted vocabulary and checking whether a subword matches a substring of the token; if the match succeeds, outputting the subword and continuing to match the token's remaining characters; finally, if remaining substrings are still unmatched after the whole traversal, replacing them with the special token <UNK> and ending the encoding process.
Further, as a preferred embodiment of the present invention, in step S3 a neural network is constructed using a Transformer Encoder and a Transformer Decoder; the input is a partial program of the original token sequence, and the output is the token at the next position of the partial program; the neural network model is formulated as:
o = Trans((e_t)_{t ∈ src_seq})
where o is the distribution over all possible tokens and e_t is the embedded representation of each individual token t in the original token sequence.
Further, as a preferred technical scheme of the invention, for the model parameters the word-vector dimension is 128, the number of attention blocks is 6, the number of heads in the multi-head attention layer is 8, and the dropout probability is 0.5.
Further, as a preferred technical solution of the present invention, in step S4 Talking-Heads Attention shares parameters among the different heads so that information from different positions is fused into the same head. For each of the different heads: each Q_i K_i^T is computed, and a parameter matrix λ superimposes the low-rank results of the individual heads, so that the previously isolated attention heads are connected and the model gains a new learnable parameter λ; this mixing occurs before the softmax calculation in the attention mechanism, yielding the final Talking-Heads Attention.
further, as a preferred technical solution of the present invention, in step S5, the generated top5 word probabilities are converted once, the numerical value thereof is amplified, and the value thereof is finally decoded and output.
Compared with the prior art, the code completion method based on an enhanced Transformer at token granularity has the following technical effects:
(1) The invention makes better use of the semantic information of the source code to perform token-granularity code completion, and the method can effectively improve the accuracy of code completion.
(2) The invention can effectively assist software developers in using the model to improve efficiency during software development and save development time.
Drawings
FIG. 1 is a schematic flow diagram of the method of the present invention;
FIG. 2 is an explanatory diagram of Talking-Heads Attention;
FIG. 3 is a diagram of a model overall framework;
fig. 4 is an example diagram of a user obtaining code completions.
Detailed Description
The invention is further explained in the following detailed description with reference to the drawings so that those skilled in the art can more fully understand the invention and can practice it, but the invention is explained below by way of example only and not by way of limitation.
As shown in FIG. 1, a code completion method based on an enhanced Transformer at token granularity is mainly used to help a user perform code completion, and comprises the following steps:
s1, collecting Java code segments, constructing a code corpus, flattening Java source codes and converting the Java source codes into a word element sequence form;
s2, using BPE (Byte Pair Encoding) word segmentation algorithm to the data, and coding the data by utilizing the subwords to obtain word vectors required by the model;
s3, in the aspect of a model framework, transformer Encoder is used for encoding word vector information to be learned, and a result to be complemented is obtained through a transformerler decoder;
s4, improving Multi-HeadAttention used by a traditional transducer model by using a linking-HeadAttention, breaking the original bottleneck of modeling, and obtaining a better effect;
s5, in the reasoning decoding stage, generating top5 recommended completion codes by using a Beam Search (Beam Search) method, and avoiding repeated word elements in a code completion stage recommendation list.
Step S1, collecting Java code snippets, constructing a code corpus, and flattening Java source code into token-sequence form, comprises the following specific steps:
Search GitHub for open-source projects with more than 10 stars, download them, and collect the Java methods. Delete the comments of each Java method, filter out code with fewer than 5 lines, and delete duplicated code snippets. Flatten each segment of Java code into a single-line sequence and write it into a file, which serves as the corpus; divide the corpus into training, validation, and test sets in a 4:4:2 ratio.
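The preprocessing steps above can be sketched as follows. This is a minimal Python sketch, not the patent's implementation; the regular expressions, function names, and the random shuffle before the 4:4:2 split are illustrative assumptions.

```python
import re
import random

def clean_java_method(src: str) -> str:
    """Strip /* */ and // comments, then flatten the method into one
    whitespace-separated line (the token-sequence form of step S1)."""
    no_block = re.sub(r"/\*.*?\*/", " ", src, flags=re.DOTALL)
    no_line = re.sub(r"//[^\n]*", " ", no_block)
    return " ".join(no_line.split())

def build_corpus(methods):
    """Drop methods shorter than 5 lines, flatten each, and deduplicate."""
    seen, corpus = set(), []
    for m in methods:
        if len(m.strip().splitlines()) < 5:
            continue  # filter out code with fewer than 5 lines
        flat = clean_java_method(m)
        if flat and flat not in seen:
            seen.add(flat)
            corpus.append(flat)
    return corpus

def split_corpus(corpus, seed=0):
    """4:4:2 split into training / validation / test sets (step S1.3)."""
    rng = random.Random(seed)
    shuffled = corpus[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    a, b = int(n * 0.4), int(n * 0.8)
    return shuffled[:a], shuffled[a:b], shuffled[b:]
```

Deduplication is done after comment removal so that two copies of a method differing only in comments collapse to one corpus entry.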
Step S2, applying the BPE (Byte Pair Encoding) word-segmentation algorithm to the data and encoding it with subwords to obtain the word vectors required by the model, comprises the following specific steps:
Because of Java's camel-case naming convention, many tokens such as class names and method names are unique to the current Java method or class. If a dictionary were built from the traditional space-separated sequence, it would be too large and would waste training resources, and at test time the model would struggle with rare words or words unseen during training (the OOV problem). Therefore, the BPE algorithm is employed to encode the data as subwords.
When constructing the vocabulary with BPE, all tokens are first split into character sequences and an initial vocabulary is built from all the characters; then the frequency of each contiguous byte pair in the training corpus is counted, the most frequent byte pair is merged into a new subword, the vocabulary is updated, and the previous step is repeated until the highest frequency among the remaining byte pairs is 1. As an example, suppose the corpus contains the tokens 'cooked' and 'cooking', so the initial vocabulary is 'c', 'o', 'k', 'e', 'd', 'i', 'n', 'g'. The character pairs forming 'cook' occur most frequently in the original word stock, so 'cook' is included in the table as a new subword; the next byte pairs, in 'ed' and 'ing', occur at most once, so vocabulary construction ends. The two tokens 'cooked' and 'cooking' are thus divided into 'cook', 'ed', 'ing', so that the semantic information of the words can be learned while reducing the vocabulary size.
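The vocabulary-construction loop can be sketched as follows. This is a minimal Python sketch; the function names are illustrative, and the stopping rule follows the text: stop when the most frequent remaining byte pair occurs only once.

```python
from collections import Counter

def bpe_merges(tokens, max_merges=100):
    """Learn BPE merges (step S2.2): repeatedly fuse the most frequent
    adjacent symbol pair until the best remaining pair occurs only once."""
    words = Counter(tuple(t) for t in tokens)  # each token as a character tuple
    merges = []
    for _ in range(max_merges):
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        (a, b), freq = pairs.most_common(1)[0]
        if freq <= 1:  # highest remaining frequency is 1 -> stop
            break
        merges.append((a, b))
        new_words = Counter()
        for word, f in words.items():  # apply the merge to every word
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == (a, b):
                    out.append(word[i] + word[i + 1]); i += 2
                else:
                    out.append(word[i]); i += 1
            new_words[tuple(out)] += f
        words = new_words
    return merges, words
```

Running it on the 'cooked'/'cooking' example from the text yields the subword 'cook' after three merges ('c'+'o', 'co'+'o', 'coo'+'k').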
During corpus encoding, all subwords in the table are sorted from longest to shortest. For each given token, the sorted vocabulary is traversed to check whether a subword matches a substring of the token; if the match succeeds, the subword is output and matching continues on the token's remaining characters. Finally, if remaining substrings are still unmatched after the whole traversal, they are replaced with the special token <UNK> and the encoding process ends.
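The encoding procedure, longest-subword-first matching with an <UNK> fallback, can be sketched as below. This is an illustrative sketch: the text speaks of matching substrings of the token, which is read here as greedy longest-prefix matching against the remaining string.

```python
def encode_token(token, subwords):
    """Greedy longest-match subword encoding (step S2.3): try subwords from
    longest to shortest against the remaining string; any leftover that no
    subword matches is replaced by the special token <UNK>."""
    ordered = sorted(subwords, key=len, reverse=True)
    out, rest = [], token
    while rest:
        for sw in ordered:
            if rest.startswith(sw):
                out.append(sw)
                rest = rest[len(sw):]
                break
        else:  # no subword matched the remaining substring
            return out + ["<UNK>"]
    return out
```

For example, with the vocabulary learned above, 'cooking' encodes to ['cook', 'ing'] and an out-of-vocabulary token falls back to ['<UNK>'].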
Step S3, in the model architecture, using a Transformer Encoder to encode the word-vector information to be learned and a Transformer Decoder to decode it into the completion result, comprises the following specific steps:
Because the structured code was flattened into sequence form as described above, which naturally matches the Transformer's strength at processing long text sequences, the neural network is constructed using a Transformer Encoder and a Transformer Decoder. The input here is a partial program of the original token sequence, and the output is the token at the next position of the partial program. The model can be written as:
o = Trans((e_t)_{t ∈ src_seq})
where o is the distribution over all possible tokens and e_t is the embedded representation of each individual token t in the original token sequence.
For the model parameters, the word-vector dimension (embedding_size) is 128, the number of attention blocks (block_num) is 6, the number of heads (num_heads) of the multi-head attention layer is 8, and the dropout probability is 0.5.
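The formula o = Trans((e_t)) can be illustrated with a minimal input/output sketch. Everything below is an assumption for illustration only: the vocabulary size is invented, and the Transformer stack is replaced by a mean-pooling placeholder purely to show the contract — token ids in, a probability distribution over the vocabulary out — with the stated embedding size of 128.

```python
import numpy as np

EMBEDDING_SIZE = 128   # word-vector dimension from the text
BLOCK_NUM = 6          # number of attention blocks (not used by the placeholder)
NUM_HEADS = 8          # heads per multi-head attention layer
DROPOUT = 0.5

rng = np.random.default_rng(42)
VOCAB = 1000                                              # illustrative size
E = rng.standard_normal((VOCAB, EMBEDDING_SIZE)) * 0.02   # embedding table
W_out = rng.standard_normal((EMBEDDING_SIZE, VOCAB)) * 0.02

def trans(src_seq):
    """o = Trans((e_t)_{t in src_seq}): embed the partial program and return a
    distribution over all possible next tokens. The encoder/decoder stack is
    stubbed out with mean pooling purely to show the interface."""
    e = E[np.array(src_seq)]            # (len, 128) embedded representations e_t
    h = e.mean(axis=0)                  # placeholder for the Transformer stack
    logits = h @ W_out
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()              # o: probabilities over the vocabulary

o = trans([5, 17, 230])
```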
Step S4, improving the Multi-Head Attention used by the conventional Transformer model with Talking-Heads Attention to break the original modeling bottleneck and obtain a better effect, comprises the following specific steps:
In this step, the Multi-Head Attention used by the conventional Transformer model is modified using Talking-Heads Attention. In Multi-Head Attention, the input sequences are projected into different heads, attention is computed separately per head, and the heads' attention outputs are finally combined by a weighted sum to obtain the final context representation; as a result, each head computes its Q_i K_i^T in isolation, which limits expressive power. Talking-Heads Attention shares parameters among the different heads and fuses information from different positions into the same head, making the model more attentive to changes in sequence position and improving its final expressive power. Specifically, for each different head:
Each Q_i K_i^T is computed, and a parameter matrix λ superimposes the low-rank results of the individual heads, so that the previously isolated attention heads are connected and the model gains a new learnable parameter λ, further improving the performance of the attention. This operation occurs before the softmax calculation of the attention mechanism.
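The head-mixing step can be sketched as follows. The patent's own formula appears only in FIG. 2 and is not reproduced here; the form below follows the published Talking-Heads Attention formulation — roughly softmax(λ · QKᵀ/√d) with a learnable (heads × heads) matrix λ mixing the per-head logits before softmax — and is an assumption rather than the patent's exact expression.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def talking_heads_attention(Q, K, V, lam):
    """Q, K, V: (heads, seq, d_head); lam: (heads, heads) learnable mixing
    matrix. The per-head logits Q_i K_i^T are linearly combined across heads
    by lam BEFORE softmax, connecting the otherwise isolated attention heads."""
    d = Q.shape[-1]
    logits = Q @ K.transpose(0, 2, 1) / np.sqrt(d)      # (heads, seq, seq)
    mixed = np.einsum("hj,jst->hst", lam, logits)       # talking-heads mixing
    weights = softmax(mixed, axis=-1)                   # per-row attention
    return weights @ V                                  # (heads, seq, d_head)

# shape check with the text's head count (8) and d_head = 128 // 8 = 16
rng = np.random.default_rng(0)
h, s, dh = 8, 4, 16
Q, K, V = (rng.standard_normal((h, s, dh)) for _ in range(3))
lam = np.eye(h)  # identity mixing recovers ordinary multi-head attention
out = talking_heads_attention(Q, K, V, lam)
```

Setting λ to the identity matrix recovers standard multi-head attention, which makes the variant a strict generalization.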
as shown in the explanatory diagram of the talker-Heads Attention in fig. 2.
Step S5, in the inference decoding stage, using the Beam Search method to generate the top-5 recommended completions while avoiding repeated tokens in the recommendation list, comprises the following specific steps:
the model performs code completion based on the word elements, and when recommending codes, li composed of 5 different word elements is returned once s t. The use of a beam search algorithm to predict 5 next possible tokens from the given above avoids the model from repeating the decoding 5 times to produce repeated tokens, and as a greedy algorithm, the resulting solution may be referred to as the optimal solution under the task conditions described herein. Because the generated lemmas are 5 possible options which are ranked highest according to the probability value, the numerical value generated by the proportion of single lemmas to the whole word stock is extremely small, the probability is probably very similar, therefore, the probability of the generated top5 lemmas is converted once, the numerical value is amplified, and finally the value is decoded and output.
The invention provides a code completion method based on an enhanced Transformer at token granularity: a corpus is built, a deep neural network is constructed and trained to obtain the model, and by inputting the Java code snippet to be completed into the model, the user finally obtains a completion list for the next token. FIG. 3 is a diagram of the overall model framework.
A limitation of neural code completion models is that traditional text-embedding schemes capture the rich semantic information of code insufficiently, and Multi-Head Attention has its own modeling bottleneck: each head's Q_i K_i^T is computed in isolation, limiting expressive power. The model proposed in this patent improves on these two points, and the method is considered able to improve the accuracy of code completion. FIG. 4 is an example diagram of a user obtaining completion code.
The invention provides a code completion method based on an enhanced Transformer at token granularity, which makes better use of the semantic information of source code to perform token-granularity code completion.
While the foregoing is directed to embodiments of the present invention, other and further details of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (7)
1. A code completion method based on an enhanced Transformer at token granularity, characterized by comprising the following steps:
S1: collecting Java code snippets, constructing a code corpus, and flattening Java source code into token-sequence form;
S2: applying the BPE word-segmentation algorithm to the data and encoding it with subwords to obtain the word vectors required by the model;
S3: in the model architecture, using a Transformer Encoder to encode the word-vector information to be learned, and a Transformer Decoder to decode it into the completion result;
S4: improving the Multi-Head Attention used by the conventional Transformer model with Talking-Heads Attention;
S5: in the inference decoding stage, using beam search to generate the top-5 recommended completions, and avoiding repeated tokens in the recommendation list during code completion.
2. The code completion method based on an enhanced Transformer at token granularity according to claim 1, characterized in that step S1 comprises the following steps:
S1.1: searching GitHub for open-source projects with more than 10 stars, downloading them, and collecting Java methods;
S1.2: deleting the comments of each Java method, filtering out code with fewer than 5 lines, and deleting duplicated code snippets;
S1.3: flattening each segment of Java code into a single-line sequence and writing it into a file, the file serving as the corpus; dividing the corpus into training, validation, and test sets in a 4:4:2 ratio.
3. The code completion method based on an enhanced Transformer at token granularity according to claim 2, characterized in that step S2 comprises the following steps:
S2.1: adopting the BPE algorithm to encode the data as subwords;
S2.2: when constructing the vocabulary with BPE, first splitting all tokens into character sequences and building an initial vocabulary from all the characters; then counting the frequency of each contiguous byte pair in the training corpus, merging the most frequent byte pair into a new subword, updating the vocabulary, and repeating the previous step until the highest frequency among the remaining byte pairs is 1;
S2.3: during corpus encoding, sorting all subwords in the table from longest to shortest; for each given token, traversing the sorted vocabulary and checking whether a subword matches a substring of the token; if the match succeeds, outputting the subword and continuing to match the token's remaining characters; finally, if remaining substrings are still unmatched after the whole traversal, replacing them with the special token <UNK> and ending the encoding process.
4. The code completion method based on an enhanced Transformer at token granularity according to claim 3, characterized in that in step S3 a neural network is constructed using a Transformer Encoder and a Transformer Decoder; the input is a partial program of the original token sequence, and the output is the token at the next position of the partial program; the neural network model is formulated as:
o = Trans((e_t)_{t ∈ src_seq})
where o is the distribution over all possible tokens and e_t is the embedded representation of each individual token t in the original token sequence.
5. The method of claim 4, characterized in that for the model parameters the word-vector dimension is 128, the number of attention blocks is 6, the number of heads of the multi-head attention layer is 8, and the dropout probability is 0.5.
6. The method of claim 4, characterized in that in step S4 parameters are shared among the different heads so that information from different positions is fused into the same head, wherein for each of the different heads:
each Q_i K_i^T is computed, and a parameter matrix λ superimposes the low-rank results of the individual heads, so that the previously isolated attention heads are connected and the model gains a new learnable parameter λ; this mixing occurs before the softmax calculation of the attention mechanism, yielding the final Talking-Heads Attention.
7. The method of claim 6, characterized in that in step S5 the generated top-5 token probabilities are transformed once to amplify their values, and the values are finally decoded and output.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202310543114.XA | 2023-05-15 | 2023-05-15 | Code completion method based on enhanced Transformer under word element granularity |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202310543114.XA | 2023-05-15 | 2023-05-15 | Code completion method based on enhanced Transformer under word element granularity |
Publications (1)
| Publication Number | Publication Date |
| --- | --- |
| CN116661797A | 2023-08-29 |
Family
ID=87723363
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202310543114.XA | Code completion method based on enhanced Transformer under word element granularity | 2023-05-15 | 2023-05-15 |
Country Status (1)
| Country | Link |
| --- | --- |
| CN | CN116661797A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN116992888A | 2023-09-25 | 2023-11-03 | 天津华来科技股份有限公司 | Data analysis method and system based on natural semantics |
- 2023-05-15: CN application CN202310543114.XA filed; patent CN116661797A active, Pending
Similar Documents
- CN110018820B — Method for automatically generating Java code annotation based on Graph2Seq of deep reinforcement learning
- WO2021000362A1 — Deep neural network model-based address information feature extraction method
- Shi et al. — IncSQL: training incremental text-to-SQL parsers with non-deterministic oracles
- CN113064586B — Code completion method based on abstract syntax tree augmented graph model
- CN112069199B — Multi-round natural language SQL conversion method based on intermediate syntax tree
- CN112559556A — Language model pre-training method and system for table mode analysis and sequence mask
- CN112463424B — Graph-based end-to-end program repairing method
- CN113342318B — Fine-grained code automatic generation method and system based on multi-view code characteristics
- CN116151132B — Intelligent code completion method, system and storage medium for programming learning scene
- CN113254616B — Intelligent question-answering system-oriented sentence vector generation method and system
- CN117153294B — Molecular generation method of single system
- CN116661797A — Code completion method based on enhanced Transformer under word element granularity
- CN115935957B — Sentence grammar error correction method and system based on syntactic analysis
- CN114924741A — Code completion method based on structural features and sequence features
- CN116700780A — Code completion method based on abstract syntax tree code representation
- CN115438709A — Code similarity detection method based on code attribute graph
- CN115543437A — Code annotation generation method and system
- CN111309896A — Deep learning text abstract generation method based on secondary attention
- Li et al. — Toward less hidden cost of code completion with acceptance and ranking models
- CN112287641B — Synonym sentence generating method, system, terminal and storage medium
- CN112380882B — Mongolian Chinese neural machine translation method with error correction function
- CN113342343A — Code abstract generation method and system based on multi-hop inference mechanism
- CN117763363A — Cross-network academic community resource recommendation method based on knowledge graph and prompt learning
- CN116010621B — Rule-guided self-adaptive path generation method
- CN117238436A — Model pre-training method and device for drug molecular analysis design
Legal Events
- PB01 — Publication
- SE01 — Entry into force of request for substantive examination