CN112905186A - High signal-to-noise ratio code classification method and device suitable for open-source software supply chain - Google Patents
High signal-to-noise ratio code classification method and device suitable for open-source software supply chain
- Publication number: CN112905186A (application CN202110168454.XA)
- Authority
- CN
- China
- Prior art keywords: node, path, ast, code, syntax tree
- Prior art date: 2021-02-07
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F8/427—Parsing (software engineering; compilation; syntactic analysis)
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F8/437—Type checking (compilation; semantic checking)
- G06F8/44—Encoding (compilation)
- G06N3/047—Probabilistic or stochastic networks (neural networks)
- G06N3/08—Learning methods (neural networks)
Abstract
The invention discloses a high signal-to-noise ratio code classification method and device suitable for the open-source software supply chain. The method comprises the following steps: converting the code to be predicted into a PE-AST, digitizing each node, extracting the PE-AST paths, converting each PE-AST path into an operable tuple, computing the correlation coefficient WS, updating the path representations, and predicting from the PE-AST feature vector. The invention improves the signal-to-noise ratio of the code representation process and thereby the accuracy of machine code classification; the resulting classification in turn improves programmers' efficiency in code understanding and code maintenance.
Description
Technical Field
The invention belongs to the technical field of computers, and relates to a high signal-to-noise ratio code classification method and device suitable for an open source software supply chain.
Background
Over the past decade, a large amount of open-source software has emerged; it forms the core of the open-source software supply chain. For programmers, correctly classifying the large amount of source code contained in open-source software helps improve work efficiency. First, grouping applications with similar functionality makes it easier for programmers to find the functions they need to implement in applications belonging to the same group or category. Second, the same vulnerabilities are often widespread across code with the same type of function; once a programmer finds a vulnerability in one piece of code, other places where similar errors may occur can be located quickly, improving maintenance efficiency.
Current code classification methods are generally based on neural networks: by learning from a large number of samples, the model discovers rules in the data and then classifies code according to those rules in practical use. However, when the samples and the program under test are complex, that is, when the number of tokens (words obtained by segmenting the program's statements) is large, the noise contained in the code increases significantly, and such methods cannot fully capture the rules that are effective for code classification, reducing classification accuracy. The signal-to-noise ratio should therefore be increased so that the semantic representation of the code retains the program's key information, avoiding this loss of accuracy.
In summary, the prior art suffers from insufficient code classification accuracy for the current open-source software supply chain.
Disclosure of Invention
The invention aims to provide a high signal-to-noise ratio code classification method and device suitable for an open source software supply chain.
In order to achieve the purpose, the invention adopts the following technical scheme:
a high signal-to-noise ratio code classification method suitable for an open source software supply chain comprises the following steps:
1) Parsing the syntax tree of the program to be predicted to generate an abstract syntax tree T_AST of the code of the program to be predicted, and constructing the tuple &lt;T_AST, pos&gt;, wherein T_AST = (N, T, X, s, δ, φ): N is the set of non-end nodes, T is the set of end nodes, X is the actual value of each node in the abstract syntax tree, s is the root node, δ is the correspondence between parent and child nodes in the abstract syntax tree, φ is the correspondence between each node in the abstract syntax tree and its actual value, and pos is the position coordinate of each node in the abstract syntax tree;
2) Inputting the tuple &lt;T_AST, pos&gt; into a code classification model to obtain the classification prediction result of the program to be predicted;
wherein the code classification model is trained by a deep learning method from the classification indexes of a plurality of sample programs and their corresponding tuples &lt;T′_AST, pos′&gt;; the code classification model parses the tuple &lt;T_AST, pos&gt; as follows:
a) Encoding the correspondence φ(n) of each node, mapping the encoded result to a vector space, and obtaining the final vector representation v(n) of each node from the resulting node vector representation and the distance from the corresponding position coordinate pos to the root node s;
b) Extracting paths from the abstract syntax tree T_AST according to the non-end node set N, the end node set T, the root node s and the correspondence δ, and constructing, in combination with the final vector representations v(n), a vector representation emb(L_i) for each path L_i, wherein i is the end node number;
c) Updating the vector representations emb(L_i) by computing the correlation coefficient WS between each path and the other paths to obtain path representations z_i, and performing max pooling on the path representations z_i to obtain the final vector representation e_code of the program code to be predicted;
d) Obtaining a classification index from the final vector representation e_code to obtain the classification prediction result of the program to be predicted.
Further, the syntax tree parsing method comprises: using the javalang package in Python.
Further, the position coordinates pos are obtained by:
1) Computing the coordinate x of node n from n_depth, the depth of node n in the abstract syntax tree T_AST, and T_depth, the depth of T_AST;
2) Computing the coordinate y of node n from the x value of the parent of node n, the number of siblings of node n, and the position n_i of node n among its siblings;
3) Obtaining the position coordinate pos of node n from the coordinates x and y.
Further, the framework for training the code classification model comprises: the PyTorch framework.
Further, the encoded result is E(φ(n)) = W_E · φ(n), wherein W_E ∈ R^{N′×E}, N′ is the number of different node types, and E is the embedding dimension.
Further, the final vector representation v(n) is obtained by weighting the encoded node vector according to the node's distance from the origin, wherein (x, y) is the coordinate of node n in the coordinate system with the root node s as the origin.
Further, the correlation coefficient WS is calculated by:
1) For any two paths L_i and L_j, computing the end-node semantic similarity WS_token, wherein v(L_i1) denotes the end node of path L_i, v(L_j1) denotes the end node of path L_j, j is the serial number of the path, and j ≠ i;
2) For the two paths L_i and L_j, computing the path semantic similarity WS_path = sigmoid(W_path · [emb(L_i), emb(L_j)] + b_path), wherein W_path ∈ R^{6E×1}, E is the embedding dimension used when encoding the correspondence φ(n), and b_path is an offset;
3) The correlation coefficient WS_{i,j} = α·WS_token + β·WS_path, wherein α is the first coefficient and β is the second coefficient.
Further, the path representation z_i is computed from the correlation coefficients and the path vector representations, wherein W_v is a linear transformation and N_L is the set of paths other than path L_i.
Further, the classification index is obtained by:
1) Applying linear and nonlinear transformations to the final vector representation e_code;
2) Classifying the transformed result through a Softmax() function to obtain the probability distribution P_d of the prediction result;
3) Selecting the index corresponding to the maximum value in the probability distribution P_d to obtain the classification index of the program to be predicted.
A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-mentioned method when executed.
An electronic device comprising a memory and a processor, wherein the memory has a computer program stored therein and the processor is arranged to run the computer program to perform the method described above.
Compared with the prior art, the invention has the following advantages:
1) the signal-to-noise ratio of the code representation process can be improved, thereby improving the accuracy of machine code classification;
2) according to the classification of the codes, the working efficiency of programmers in the aspects of code understanding and code maintenance is improved.
Drawings
FIG. 1 is a flow chart of a high signal-to-noise ratio code classification method suitable for use in an open source software supply chain.
Fig. 2 is a diagram illustrating a PE-AST path corresponding to a code.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the objects, features, and advantages of the present invention more comprehensible, the technical core of the present invention is described in further detail below with reference to the accompanying drawings and examples.
The general flow of the high snr code classification method of the present embodiment is shown in fig. 1, and mainly includes the following steps:
1. Convert the code to be predicted into a PE-AST, as follows:
1a) Use the javalang package in Python to perform AST analysis on the Java program and generate the abstract syntax tree of the code segment, denoted T_AST = (N, T, X, s, δ, φ), wherein N is the set of non-end nodes, T is the set of end nodes, X is the actual value of each node, s is the root node, δ is the correspondence between parent and child nodes, and φ is the correspondence between nodes and actual values. Go to 1b).
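The patent parses Java with the javalang package; as a self-contained, runnable stand-in (javalang and a Java corpus are not assumed available here), Python's built-in ast module illustrates the same source-to-AST step:

```python
import ast

# Parse a small code fragment into an abstract syntax tree and list the node
# types encountered -- the analogue of javalang's parse step for Java source.
tree = ast.parse("def f(a, b):\n    return a + b")
node_types = [type(n).__name__ for n in ast.walk(tree)]
print(node_types)
```

In the patent's setting, the same traversal over a javalang tree would yield the non-end nodes (e.g. declarations) and end nodes (e.g. identifiers) of T_AST.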
1b) Traverse T_AST and generate the position coordinates pos = (x, y) of each node. For a node n, the coordinate x is computed from n_depth, the depth of node n in T_AST, and T_depth, the depth of T_AST (i.e., the depth of the deepest node in T_AST); the coordinate y is computed from x_p, the x value of the parent of node n, n_num, the number of siblings of node n, and n_i, the position of node n among its siblings, counted from the left starting at 1. Go to 1c).
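The exact coordinate formulas are not reproduced in this text (the original equations were images), so the sketch below uses one plausible reading: x as the depth ratio n_depth/T_depth, and y as the parent's x plus the sibling ratio n_i/n_num. Both forms are assumptions consistent with the surrounding description, not the patent's verbatim formulas.

```python
def tree_depth(node):
    """Depth of the deepest node, counting the root as depth 1."""
    label, children = node
    return 1 + max((tree_depth(c) for c in children), default=0)

def assign_pos(root):
    """Map each node (keyed by its path of child indices) to (x, y).
    x = n_depth / T_depth and y = x_p + n_i / n_num are assumed forms."""
    T = tree_depth(root)
    pos = {}
    def walk(node, key, depth, parent_x, idx, n_sib):
        x = depth / T
        y = parent_x + idx / n_sib
        pos[key] = (x, y)
        label, children = node
        for i, c in enumerate(children, start=1):
            walk(c, key + (i,), depth + 1, x, i, len(children))
    walk(root, (), 1, 0.0, 1, 1)
    return pos

tree = ("root", [("A", []), ("B", [])])
print(assign_pos(tree))
```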
1c) Construct the tuple &lt;T_AST, pos&gt; as the PE-AST of the code segment.
2. Digitize each node, as follows:
2a) Use a matrix W_E ∈ R^{N×E} to encode φ(n):

E(φ(n)) = W_E · φ(n)

wherein N is the number of different node types (the types include initialization, declaration, identification and many others), and E is the embedding dimension. Go to 2b).
2b) Compute an embedding weight coefficient from the distance of node n to the origin, and multiply it by the result of 2a) to obtain the final node vector representation v(n), where the origin is the root node of the AST. Go to 3.
3. Extract the PE-AST paths in the PE-AST. A PE-AST path is a sequence n_1 … n_{k-1} s of length k, wherein n_1 is an end node in the PE-AST and s is the root node; for i ∈ [2, k-1], the n_i are all non-end nodes. A PE-AST path is written in the form &lt;n_1, p, s&gt;, where p denotes the section of the sequence with n_1 and s removed. A PE-AST path thus starts at an end node, traverses a series of non-end nodes, and ends at the root node. For example, the portion enclosed by the dashed box in Fig. 2 is an instance of a PE-AST path.
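A minimal sketch of the path extraction, with the tree given as nested (label, children) tuples, a representation chosen here purely for illustration:

```python
def extract_paths(tree):
    """Enumerate PE-AST paths: each starts at an end (leaf) node, walks up
    through the non-end nodes, and ends at the root."""
    paths = []
    def walk(node, ancestors):
        label, children = node
        if not children:                        # end node: emit leaf-to-root path
            paths.append([label] + list(reversed(ancestors)))
        for c in children:
            walk(c, ancestors + [label])
    walk(tree, [])
    return paths

tree = ("MethodDecl", [("Body", [("x", []), ("y", [])]), ("name", [])])
print(extract_paths(tree))
```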
4. Convert each PE-AST path into an operable tuple of the form &lt;v(n_1), v(p), v(s)&gt;, as follows:
4a) v(p) is computed by summing the vectors of the nodes on the PE-AST path other than the start and end points, i.e., v(p) = Σ_{i=2}^{k-1} v(n_i). Go to 4b).
4b) Construct the triple &lt;v(n_1), v(p), v(s)&gt; as the vector representation of the PE-AST path for subsequent computation.
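Step 4 in code: v(p) sums the vectors of the intermediate nodes (everything except the end node n_1 and the root s), matching the description above; vectors are plain Python lists here.

```python
def path_to_triple(node_vecs):
    """node_vecs holds v(n1), the intermediate node vectors, and v(s), in
    leaf-to-root order; returns the triple <v(n1), v(p), v(s)>."""
    v_n1, *mid, v_s = node_vecs
    v_p = [sum(col) for col in zip(*mid)] if mid else [0.0] * len(v_n1)
    return v_n1, v_p, v_s

triple = path_to_triple([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]])
print(triple[1])  # v(p) sums the two intermediate vectors
```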
5. Select any PE-AST path and compute the correlation coefficient WS between each of the other PE-AST paths and that path. The correlation coefficient indicates the degree to which different PE-AST paths in the same code segment act together. The steps are:
5a) For two PE-AST paths L_1 and L_2, compute the end-node semantic similarity WS_token, wherein v(L_11) denotes the end node of L_1 and v(L_21) denotes the end node of L_2. Go to 5b).
5b) For the two PE-AST paths L_1 and L_2, compute the path semantic similarity

WS_path = sigmoid(W_path · [emb(L_1), emb(L_2)] + b_path)

wherein W_path ∈ R^{6E×1} and b_path is the offset. Go to 5c).
5c) Average the end-node semantic similarity WS_token and the path semantic similarity WS_path to obtain the path correlation coefficient WS:

WS = 0.5·WS_token + 0.5·WS_path
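The WS_token formula is not preserved in this text, so the sketch below treats it as cosine similarity of the two end-node vectors (an assumption); WS_path follows the sigmoid form given above, and the two are averaged with α = β = 0.5.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def correlation(path_i, path_j, w_path, b_path, alpha=0.5, beta=0.5):
    """WS = alpha*WS_token + beta*WS_path for two <v(n1), v(p), v(s)> triples.
    WS_token as cosine similarity is an assumed stand-in."""
    ws_token = cosine(path_i[0], path_j[0])
    concat = [x for part in path_i + path_j for x in part]   # 6E-dim input
    z = sum(w * x for w, x in zip(w_path, concat)) + b_path
    ws_path = 1.0 / (1.0 + math.exp(-z))                     # sigmoid
    return alpha * ws_token + beta * ws_path

L1 = ([1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
L2 = ([1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
print(correlation(L1, L2, [0.0] * 12, 0.0))  # 0.5*1 + 0.5*sigmoid(0)
```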
6. Update the path representations. Take the PE-AST path embeddings emb(L) as input, where N_L is the number of PE-AST paths; with emb(L) as input and the updated path representation z as output, z is computed from the correlation coefficients, wherein W_v is a linear transformation implemented with a 1×1 convolution kernel, i denotes a given PE-AST path, and j ranges over the remaining PE-AST paths. Go to 7.
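One plausible reading of the update in code: softmax-normalized correlation weights over the other paths, with the 1×1 W_v transform folded into the identity for brevity. Both simplifications are assumptions, since the update formula itself is not preserved in this text.

```python
import math

def update_paths(emb, WS):
    """z_i = sum over j != i of softmax_j(WS[i][j]) * emb[j]."""
    n, dim = len(emb), len(emb[0])
    z = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        exps = [math.exp(WS[i][j]) for j in others]
        total = sum(exps)
        z.append([sum((e / total) * emb[j][d] for e, j in zip(exps, others))
                  for d in range(dim)])
    return z

emb = [[1.0], [3.0], [5.0]]
WS = [[0.0] * 3 for _ in range(3)]  # uniform correlations -> plain averaging
print(update_paths(emb, WS))
```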
7. Predict from the PE-AST feature vector, as follows:
7a) Apply max pooling over all updated PE-AST path representations to obtain the final vector representation of the whole code segment, e_code = [max(z_{i,1}), max(z_{i,2}), ..., max(z_{i,E})], wherein i ∈ [1, N_p]. Go to 7b).
7b) Take e_code as the feature vector of the whole code segment; after linear and nonlinear transformations, use the Softmax() function to obtain the probability distribution P_d of the prediction result, and select the index corresponding to the maximum value as the final classification prediction, i.e. P_d = Softmax(ReLU(W_code · e_code + b_code)), wherein N_r is the number of possible answers and b_code is the offset.
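Step 7 end to end: max-pool the updated path representations dimension-wise into e_code, apply a linear layer plus ReLU, then Softmax. The toy weights W_code and b_code below are illustrative values, not trained parameters.

```python
import math

def predict(z, W_code, b_code):
    """Return (class index, P_d) from path representations z:
    e_code = dimension-wise max over paths;
    P_d = Softmax(ReLU(W_code . e_code + b_code))."""
    dim = len(z[0])
    e_code = [max(row[d] for row in z) for d in range(dim)]
    logits = [max(0.0, sum(w * e for w, e in zip(row, e_code)) + b)
              for row, b in zip(W_code, b_code)]
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    P_d = [v / total for v in exps]
    return P_d.index(max(P_d)), P_d

z = [[1.0, 0.0], [0.0, 2.0]]       # two updated paths, E = 2
W_code = [[1.0, 0.0], [0.0, 1.0]]  # N_r = 2 toy classes
idx, P_d = predict(z, W_code, [0.0, 0.0])
print(idx)
```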
8. Train the model described in the above steps on a data set to obtain a trained deep learning model; the PyTorch framework is used in the training process.
The inventors trained the model described above on the java14m data set. The data in java14m are drawn from 10,072 GitHub projects and comprise 12,636,998 training samples, 371,362 validation samples and 368,445 test samples in total, with cloned code removed, which gives the data set strong specificity and rigor. An Adam optimizer was used with an initial learning rate of 0.01, and the trained deep learning model was obtained after 10 passes over the whole data set.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A high signal-to-noise ratio code classification method suitable for an open source software supply chain comprises the following steps:
1) Parsing the syntax tree of the program to be predicted to generate an abstract syntax tree T_AST of the code of the program to be predicted, and constructing the tuple &lt;T_AST, pos&gt;, wherein T_AST = (N, T, X, s, δ, φ): N is the set of non-end nodes, T is the set of end nodes, X is the actual value of each node in the abstract syntax tree, s is the root node, δ is the correspondence between parent and child nodes in the abstract syntax tree, φ is the correspondence between each node in the abstract syntax tree and its actual value, and pos is the position coordinate of each node in the abstract syntax tree;
2) Inputting the tuple &lt;T_AST, pos&gt; into a code classification model to obtain the classification prediction result of the program to be predicted;
wherein the code classification model is trained by a deep learning method from the classification indexes of a plurality of sample programs and their corresponding tuples &lt;T′_AST, pos′&gt;; the code classification model parses the tuple &lt;T_AST, pos&gt; as follows:
a) Encoding the correspondence φ(n) of each node, mapping the encoded result to a vector space, and obtaining the final vector representation v(n) of each node from the resulting node vector representation and the distance from the corresponding position coordinate pos to the root node s;
b) Extracting paths from the abstract syntax tree T_AST according to the non-end node set N, the end node set T, the root node s and the correspondence δ, and constructing, in combination with the final vector representations v(n), a vector representation emb(L_i) for each path L_i, wherein i is the end node number;
c) Updating the vector representations emb(L_i) by computing the correlation coefficient WS between each path and the other paths to obtain path representations z_i, and performing max pooling on the path representations z_i to obtain the final vector representation e_code of the program code to be predicted;
d) Obtaining a classification index from the final vector representation e_code to obtain the classification prediction result of the program to be predicted.
2. The method of claim 1, wherein the syntax tree parsing method comprises: using the javalang package in Python.
3. The method of claim 1, wherein the position coordinates pos are obtained by:
1) Computing the coordinate x of node n from n_depth, the depth of node n in the abstract syntax tree T_AST, and T_depth, the depth of T_AST;
2) Computing the coordinate y of node n from the x value of the parent of node n, the number of siblings of node n, and the position n_i of node n among its siblings;
3) Obtaining the position coordinate pos of node n from the coordinates x and y.
4. The method of claim 1, wherein the framework for training the code classification model comprises: the PyTorch framework.
6. The method of claim 1, wherein the correlation coefficient WS is calculated by:
1) For any two paths L_i and L_j, computing the end-node semantic similarity WS_token, wherein v(L_i1) denotes the end node of path L_i, v(L_j1) denotes the end node of path L_j, j is the serial number of the path, and j ≠ i;
2) For the two paths L_i and L_j, computing the path semantic similarity WS_path = sigmoid(W_path · [emb(L_i), emb(L_j)] + b_path), wherein W_path ∈ R^{6E×1}, E is the embedding dimension used when encoding the correspondence φ(n), and b_path is an offset;
3) The correlation coefficient WS_{i,j} = α·WS_token + β·WS_path, wherein α is the first coefficient and β is the second coefficient.
8. The method of claim 1, wherein the classification index is obtained by:
1) Applying linear and nonlinear transformations to the final vector representation e_code;
2) Classifying the transformed result through a Softmax() function to obtain the probability distribution P_d of the prediction result;
3) Selecting the index corresponding to the maximum value in the probability distribution P_d to obtain the classification index of the program to be predicted.
9. A storage medium having a computer program stored thereon, wherein the computer program is arranged to, when run, perform the method of any of claims 1-8.
10. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method according to any of claims 1-8.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202110168454.XA (CN112905186B) | 2021-02-07 | 2021-02-07 | High signal-to-noise ratio code classification method and device suitable for open-source software supply chain |
Publications (2)

| Publication Number | Publication Date |
| --- | --- |
| CN112905186A | 2021-06-04 |
| CN112905186B | 2023-04-07 |
Family

ID=76123652

Family Applications (1)

| Application Number | Publication | Priority Date | Filing Date | Status |
| --- | --- | --- | --- | --- |
| CN202110168454.XA | CN112905186B | 2021-02-07 | 2021-02-07 | Active (CN) |
Citations (8)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| US20090222419A1 | 2005-12-06 | 2009-09-03 | National ICT Australia Limited | Succinct index structure for XML |
| CN101697121A | 2009-10-26 | 2010-04-21 | | Method for detecting code similarity based on semantic analysis of program source code |
| US9262406B1 | 2014-05-07 | 2016-02-16 | Google Inc. | Semantic frame identification with distributed word representations |
| CN107729925A | 2017-09-26 | 2018-02-23 | | Method for automatically classifying and scoring program-competition source code according to solution approach |
| US20190005163A1 | 2017-06-29 | 2019-01-03 | International Business Machines Corporation | Extracting a knowledge graph from program source code |
| CN109445834A | 2018-10-30 | 2019-03-08 | | Fast comparison method for program code similarity based on abstract syntax tree |
| CN110597735A | 2019-09-25 | 2019-12-20 | | Software defect prediction method based on deep learning of open-source software defect features |
| CN112181428A | 2020-09-28 | 2021-01-05 | | Abstract-syntax-tree-based open-source software defect data classification method and system |
Legal Events

| Code | Title |
| --- | --- |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |