CN114429197B - Neural network architecture searching method, system, equipment and readable storage medium - Google Patents

Neural network architecture searching method, system, equipment and readable storage medium

Info

Publication number
CN114429197B
CN114429197B CN202210085746.1A
Authority
CN
China
Prior art keywords
network
calculating
darts
architecture
significance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210085746.1A
Other languages
Chinese (zh)
Other versions
CN114429197A (en)
Inventor
徐亦飞
王正洋
朱利
尉萍萍
王超勇
张越皖
张扬
徐明杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202210085746.1A priority Critical patent/CN114429197B/en
Publication of CN114429197A publication Critical patent/CN114429197A/en
Application granted granted Critical
Publication of CN114429197B publication Critical patent/CN114429197B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a neural network architecture search method, system, device, and readable storage medium. Relevant parameters of a DARTS network are initialized, an image training set is input into the initialized DARTS network, a loss value is calculated according to an objective function, the change in network loss is calculated from gradient information using a second-order Taylor expansion, and operation saliency is calculated using a scoring index based on synaptic saliency. A connection-sensitivity index is applied to neural network architecture search to indicate the importance of operations, differentiable architecture search is formulated as network pruning at initialization, and a measure called operation saliency is adopted for pruning at initialization. Experimental results show that the framework is a promising and reliable differentiable neural architecture search solution with good performance on different benchmark datasets and DARTS search spaces. The method of the invention is highly efficient and can complete the architecture search within a few seconds.

Description

Neural network architecture searching method, system, equipment and readable storage medium
Technical Field
The invention belongs to the technical field of artificial intelligence and relates to a neural network architecture search method, system, device, and readable storage medium.
Background
The success of deep learning in computer vision is largely due to the deep prior knowledge of human experts; however, such manual design is costly and becomes more difficult as networks grow larger and more complex. Neural architecture search (NAS) automates the neural network design process and has therefore attracted great interest. However, this approach demands very high computational power, and early NAS methods required thousands of GPUs to find efficient network architectures. To improve efficiency, many recent studies have turned to reducing the search cost, and one of the most popular paradigms is the differentiable architecture search (DARTS) framework. DARTS uses continuous relaxation to convert the operation-selection problem into the optimization of continuous magnitudes over a set of candidate operations, which is then cast as a bilevel optimization problem that alternately optimizes architecture parameters and model weights by gradient descent.
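For reference, the continuous relaxation and bilevel objective used by DARTS (as described in the original DARTS literature; the notation below is not taken from this patent) can be written as:

$$\bar{o}^{(i,j)}(x)=\sum_{o\in\mathcal{O}}\frac{\exp\big(\alpha_o^{(i,j)}\big)}{\sum_{o'\in\mathcal{O}}\exp\big(\alpha_{o'}^{(i,j)}\big)}\,o(x)$$

$$\min_{\alpha}\ \mathcal{L}_{\mathrm{val}}\big(w^{*}(\alpha),\alpha\big)\quad\text{s.t.}\quad w^{*}(\alpha)=\arg\min_{w}\ \mathcal{L}_{\mathrm{train}}(w,\alpha)$$

Here O is the set of candidate operations on edge (i, j), alpha denotes the architecture parameters, and w the model weights; the searched architecture is then read off from the magnitudes of alpha, which is precisely the step the present invention argues is unreliable.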
Although differentiable architecture search (DARTS) has become the dominant paradigm for neural architecture search (NAS) due to its simplicity and efficiency, recent research has found that as DARTS optimization progresses the performance of the searched architectures barely improves, because the parameter values of the corresponding architectures are simply used as the indicator for architecture selection. As a result, the network architecture selected from the search space typically falls into a suboptimal state, which indicates that the final values of the network architecture parameters obtained by DARTS hardly indicate the importance of operations. These observations suggest that the supervisory signal in DARTS may be a poor or unreliable indicator for network architecture search, and a recent work by Wang et al. shows that the magnitudes of the architecture parameters obtained after super-network training by DARTS are fundamentally flawed and hardly indicate operation importance. More interestingly, several studies use a simple early-stopping strategy to interrupt super-network training during the search, which can significantly improve DARTS performance. These empirical observations indicate that super-network training degrades performance as the search progresses.
Network pruning is an efficient method for compressing over-parameterized neural networks that removes parameters while minimizing the degradation of network performance. The final discretization stage of DARTS, i.e., selecting a discrete architecture from the over-parameterized super network according to operation magnitudes, can be regarded as operation-level network pruning. Based on this motivation, "Architecture search, anneal and prune" (International Conference on Artificial Intelligence and Statistics, pages 493-503, PMLR, 2020) proposes a scalable search space that progressively prunes inferior operations during the search; the search is also accelerated because the number of candidate operations decreases as the search proceeds. Similarly, "Progressive differentiable architecture search: Bridging the depth gap between search and evaluation" (Proceedings of the IEEE International Conference on Computer Vision) prunes candidate operations in cells during the search and gradually increases the network depth to alleviate the depth gap between architecture search and evaluation. These methods accelerate the network search process to some extent, but the still-excessive search cost limits the application scenarios of network architecture search.
Disclosure of Invention
The invention aims to provide a neural network architecture search method, system, device, and readable storage medium that solve the problems that the search cost of the existing DARTS algorithm is excessive and its search indicator fails to reflect the importance of operations.
A neural network architecture search method, comprising the steps of:
S1, initializing related parameters of a DARTS network;
S2, inputting an image training set into the initialized DARTS network, calculating a loss value according to an objective function, calculating the change in network loss from gradient information using a second-order Taylor expansion, and calculating operation saliency using a scoring index based on synaptic saliency;
and S3, analyzing and calculating an optimal Cell network structure according to the operation saliency index, the network loss change, and the loss value, and stacking the obtained optimal Cell network structures to form the searched model structure.
Further, the initialized relevant parameters include weight parameters, architecture parameters, learning rate and batch size.
Further, R is adopted as the change in network loss caused by removing an operation:
R = L(D, W, α, S_p) − L(D, W | (1 − α_k^T) S_p)
where D, W, α, S_p, and k are the dataset, network parameters, architecture parameters, search space, and the operation to be removed, respectively.
Further, the scoring index based on synaptic saliency is shown in formula (1), where α is the architecture parameter.
Further, the network loss change R is designed using a second-order Taylor series expansion.
Further, the CIFAR-10 dataset is employed as the training set.
Further, the DARTS network structure includes a Normal Cell structure and a Reduction Cell structure.
A neural network architecture search system comprises an initialization module, an optimization training module and a search module;
The initialization module is used for initializing related parameters of the DARTS network;
The optimization training module is used for processing the image training set with the initialized DARTS network, calculating a loss value according to an objective function, calculating the change in network loss from gradient information using a second-order Taylor expansion, calculating operation saliency using a scoring index based on synaptic saliency, and calculating an optimal Cell network structure according to the operation saliency index, the network loss change, and the loss value;
And the search module stacks the obtained optimal Cell network structures to form the searched model structure.
A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the neural network architecture search method when executing the computer program.
A computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the neural network architecture search method.
Compared with the prior art, the invention has the following beneficial technical effects:
The invention relates to a neural network architecture search method in which relevant parameters of a DARTS network are initialized, an image training set is input into the initialized DARTS network, a loss value is calculated according to an objective function, the change in network loss is calculated from gradient information using a second-order Taylor expansion, and operation saliency is calculated using a scoring index based on synaptic saliency. A connection-sensitivity index is applied to neural network architecture search to indicate the importance of operations, differentiable architecture search is formulated as network pruning at initialization, and a measure called operation saliency is adopted for pruning at initialization. Experimental results show that the framework is a promising and reliable differentiable neural architecture search solution with good performance on different benchmark datasets and DARTS search spaces. The method of the invention is highly efficient and can complete the architecture search within a few seconds.
Furthermore, owing to its memory and computational efficiency, the method can be applied to a more flexible search space in which the network depth during architecture search and evaluation can be the same, and it can perform architecture search directly on a large dataset and then be evaluated on that dataset, which demonstrates the flexibility of the method.
Drawings
Fig. 1 is a schematic diagram of an internal structure of a DARTS network Cell according to an embodiment of the invention.
FIG. 2 is a schematic diagram of a DARTS model network according to an embodiment of the present invention.
FIG. 3 is a flow chart of an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Let the size of the input image X be H×W×C, where H, W, and C are the height, width, and number of channels of the input image, respectively; the DARTS network is S(X), the number of classification classes of the classification network is N, the architecture parameter of the DARTS network is α, and the weight parameter is W.
The invention discloses a saliency-based neural network architecture search method, which comprises the following steps:
S1, collecting input images and dividing them into a training set T_train and a test set T_test;
S2, initializing the weight parameters W, the architecture parameters α, the learning rate, and the batch size in the DARTS network S(X);
S3, inputting image data X from the training set T_train into the initialized DARTS network S(X), calculating a loss value according to the objective function, calculating the saliency index M using the scoring index based on synaptic saliency, and calculating the network loss change R from the gradient information using a second-order Taylor expansion;
The scoring index based on synaptic saliency is shown in formula (1).
The synaptic-saliency scoring index is used to prune the network at initialization. The present application does not use a saliency index to score the weight parameters of the super network; instead, the operations in the super network are scored, treating architecture selection as pruning of the network architecture and applying the synaptic-saliency scoring index accordingly. Because the synaptic-saliency scoring index requires no training, the super network can be pruned at initialization, i.e., operations can be removed, without any training.
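As a minimal sketch of this scoring step, the fragment below assumes a PyTorch-style super network exposing its architecture parameters through a hypothetical arch_parameters() method, and assumes the score takes the |α · ∂L/∂α| form common in the synaptic-saliency pruning literature, since formula (1) itself is not reproduced in this text:

import torch
import torch.nn.functional as F

def operation_saliency(supernet, data_loader, device="cuda"):
    # Score every candidate operation at initialization, without any training.
    supernet.to(device).train()
    images, labels = next(iter(data_loader))           # a single mini-batch is enough
    images, labels = images.to(device), labels.to(device)
    loss = F.cross_entropy(supernet(images), labels)   # loss value from the objective function
    loss.backward()                                    # one backward pass yields dL/dalpha
    # Assumed synaptic-saliency-style score: |alpha * dL/dalpha| per candidate operation.
    return [(alpha.grad * alpha).abs() for alpha in supernet.arch_parameters()]

Operations with the lowest scores would then be the candidates for removal from the search space.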
R is used as the change in network loss caused by removing an operation:
R = L(D, W, α, S_p) − L(D, W | (1 − α_k^T) S_p)   (2)
where D, W, α, S_p, and k are the dataset, network parameters, architecture parameters, search space, and the operation to be removed, respectively. The application uses R as the criterion for optimal architecture selection. Specifically, network architecture saliency is defined as the change in network loss caused by removing the architecture from the neural network search space; it reflects the contribution of the candidate architecture to network performance and effectively eliminates the bias in architecture selection.
The network loss change R is approximated using a second-order Taylor series expansion (formula (3)), in which the first-order term can be computed with one back-propagation pass over the validation dataset and the second-order term with two back-propagation passes.
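Formula (3) is not reproduced in this text; a plausible form of the expansion, consistent with the statement that the first-order term needs one back-propagation pass and the second-order term two (via a Hessian-vector product), is:

$$R \;\approx\; g^{\mathsf T}\alpha_k \;+\; \tfrac{1}{2}\,\alpha_k^{\mathsf T} H\,\alpha_k, \qquad g=\frac{\partial L}{\partial \alpha},\quad H=\frac{\partial^{2} L}{\partial \alpha^{2}},$$

where α_k denotes the architecture-parameter vector with only the entries of the removed operation k retained. The Hessian-vector product Hα_k can be obtained with a second back-propagation pass, so the full Hessian never needs to be formed explicitly.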
S4, analyzing and calculating the optimal Cell network structure according to the operation saliency index M and the network loss change R, and stacking the obtained optimal Cell network structures to form the searched model structure.
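A minimal sketch of this discretization step is given below. How M and R are combined is not spelled out in the text, so the simple sum is only an illustrative assumption, and the per-edge score dictionaries are hypothetical structures:

import torch

def select_cell(edge_scores_M, edge_scores_R):
    # edge_scores_*: dict mapping an edge (input_node, output_node) to a tensor of
    # per-operation scores. Keep the highest-scoring operation on every edge.
    genotype = {}
    for edge, m in edge_scores_M.items():
        combined = m + edge_scores_R[edge]      # illustrative combination of M and R
        genotype[edge] = int(torch.argmax(combined))
    return genotype

The resulting genotype describes one cell; the same procedure is applied to the Normal Cell and the Reduction Cell before stacking.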
In one embodiment of the present invention, a terminal device is provided, including a processor and a memory, the memory being used to store a computer program comprising program instructions and the processor being used to execute the program instructions stored in the computer storage medium. The processor may be a central processing unit (CPU), or another general-purpose processor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, or the like; it is the computation and control core of the terminal and is adapted to load and execute one or more instructions so as to implement the corresponding method flow or function. The processor provided by the embodiment of the invention can be used to perform the operations of the neural network architecture search method.
A neural network architecture search system, comprising:
the system comprises an initialization module, an optimization training module and a search module;
The initialization module is used for initializing related parameters of the DARTS network;
The optimization training module is used for processing the image training set with the initialized DARTS network, calculating a loss value according to an objective function, calculating the change in network loss from gradient information using a second-order Taylor expansion, calculating operation saliency using a scoring index based on synaptic saliency, and calculating an optimal Cell network structure according to the operation saliency index, the network loss change, and the loss value;
The search module stacks the obtained optimal Cell network structures to form the searched model structure.
In still another embodiment of the present invention, a storage medium, specifically a computer-readable storage medium (memory), is a memory device in a terminal device for storing programs and data. The computer-readable storage medium includes a built-in storage medium in the terminal device, which provides storage space and stores the operating system of the terminal, and may also include an extended storage medium supported by the terminal device. The storage space also stores one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. The computer-readable storage medium may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory. The one or more instructions stored in the computer-readable storage medium may be loaded and executed by the processor to implement the corresponding steps of the neural network architecture search method in the above embodiments.
As shown in the implementation flowchart of FIG. 3, the present invention proposes a saliency-based neural network architecture search method; the invention is described in detail below with reference to the accompanying drawings.
Description of the data:
We use CIFAR-10 and NAS-Bench-201 for training and evaluation, with the DARTS and NAS-Bench-201 search spaces. The CIFAR-10 dataset consists of 60000 32×32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The dataset is divided into five training batches and one test batch, each containing 10000 images. The test batch contains exactly 1000 randomly selected images from each class; the training batches contain the remaining images in random order. Overall, the five training batches together contain exactly 5000 images from each class.
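For illustration, the CIFAR-10 training and test sets can be loaded with torchvision as follows; the normalization statistics and batch size are assumptions, since the patent does not specify the preprocessing:

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    # commonly used CIFAR-10 channel statistics (assumed, not taken from the patent)
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=64, shuffle=False)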
The search space defined in NAS-Bench-201 includes all possible cell structures generated by 4 nodes and 5 candidate operations, yielding a total of 5^6 = 15625 cell candidates. Training logs and performance obtained with the same settings are provided for each candidate structure on three datasets (CIFAR-10, CIFAR-100, and ImageNet downsampled to 16×16 with 120 classes selected).
Training a network:
The DARTS network model adopted in this example is shown in FIG. 2; the network is composed of two types of cells, Normal Cells and Reduction Cells, 20 cells in total.
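A simplified sketch of the stacking pattern is given below. Placing the two Reduction Cells at one third and two thirds of the depth follows the common DARTS convention and is an assumption here; PlaceholderCell merely stands in for a real cell built from the searched structure, and the sequential layout omits the two-input cell connectivity of FIG. 1:

import torch.nn as nn

class PlaceholderCell(nn.Module):
    # Stand-in for a searched Normal/Reduction Cell; a Reduction Cell halves the
    # spatial resolution and doubles the channel count.
    def __init__(self, channels, reduction):
        super().__init__()
        stride = 2 if reduction else 1
        out_channels = channels * 2 if reduction else channels
        self.op = nn.Conv2d(channels, out_channels, kernel_size=3, stride=stride, padding=1)
    def forward(self, x):
        return self.op(x)

def build_network(num_cells=20, channels=36):
    cells, c = [], channels
    for i in range(num_cells):
        reduction = i in (num_cells // 3, 2 * num_cells // 3)
        cells.append(PlaceholderCell(c, reduction))
        if reduction:
            c *= 2
    return nn.Sequential(*cells)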
Evaluation of the searched model:
The trained model is evaluated on the CIFAR-10 dataset. As shown in formula (4), the cross-entropy loss is an important evaluation index in the field of image classification; the lower the index, the better the effect. By computing the prediction scores of 500 samples over 600 epochs, the cross-entropy loss is reduced to 0.12. Compared with other DARTS methods, the present invention dramatically reduces the search time.
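Formula (4) is not reproduced in this text; the standard cross-entropy loss for N-class classification, which it presumably denotes, is:

$$\mathcal{L}_{\mathrm{CE}}=-\frac{1}{|\mathcal{D}|}\sum_{x\in\mathcal{D}}\sum_{i=1}^{N} y_i(x)\,\log p_i(x),$$

where y(x) is the one-hot label of sample x and p(x) is the predicted class-probability vector.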
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the concept of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (5)

1. A neural network architecture search method, comprising the steps of:
S1, initializing relevant parameters of a DARTS network, the initialized relevant parameters comprising weight parameters, architecture parameters, learning rate, and batch size;
S2, inputting an image training set into the initialized DARTS network, calculating a loss value according to an objective function, calculating the change in network loss from gradient information using a second-order Taylor expansion, and calculating operation saliency using a scoring index based on synaptic saliency;
S3, analyzing and calculating an optimal Cell network structure according to the operation saliency index, the network loss change, and the loss value, and stacking the obtained optimal Cell network structures to form a searched model structure;
wherein R is used as the change in network loss caused by removing an operation:
R = L(D, W, α, S_p) − L(D, W | (1 − α_k^T) S_p)
where D, W, α, S_p, and k are the dataset, network parameters, architecture parameters, search space, and the operation to be removed, respectively; the scoring index based on synaptic saliency is shown in formula (1),
where α is the architecture parameter;
the network loss change R is designed using a second-order Taylor series expansion;
and the CIFAR-10 dataset is used as the training set.
2. The neural network architecture search method of claim 1, wherein the DARTS network structure comprises a Normal Cell structure and a Reduction Cell structure.
3. A neural network architecture search system for use in the method of claim 1, comprising an initialization module, an optimization training module, and a search module;
The initialization module is used for initializing related parameters of the DARTS network;
The optimization training module is used for processing the image training set with the initialized DARTS network, calculating a loss value according to an objective function, calculating the change in network loss from gradient information using a second-order Taylor expansion, calculating operation saliency using a scoring index based on synaptic saliency, and calculating an optimal Cell network structure according to the operation saliency index, the network loss change, and the loss value;
And the search module stacks the obtained optimal Cell network structures to form the searched model structure.
4. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 2 when executing the computer program.
5. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 2.
CN202210085746.1A 2022-01-25 2022-01-25 Neural network architecture searching method, system, equipment and readable storage medium Active CN114429197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210085746.1A CN114429197B (en) 2022-01-25 2022-01-25 Neural network architecture searching method, system, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210085746.1A CN114429197B (en) 2022-01-25 2022-01-25 Neural network architecture searching method, system, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN114429197A CN114429197A (en) 2022-05-03
CN114429197B true CN114429197B (en) 2024-05-28

Family

ID=81313355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210085746.1A Active CN114429197B (en) 2022-01-25 2022-01-25 Neural network architecture searching method, system, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114429197B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998648A (en) * 2022-05-16 2022-09-02 电子科技大学 Performance prediction compression method based on gradient architecture search

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396179A (en) * 2020-11-20 2021-02-23 浙江工业大学 Flexible deep learning network model compression method based on channel gradient pruning
WO2021057056A1 (en) * 2019-09-25 2021-04-01 华为技术有限公司 Neural architecture search method, image processing method and device, and storage medium
CN113344174A (en) * 2021-04-20 2021-09-03 湖南大学 Efficient neural network structure searching method based on probability distribution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210089832A1 (en) * 2019-09-19 2021-03-25 Cognizant Technology Solutions U.S. Corporation Loss Function Optimization Using Taylor Series Expansion

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021057056A1 (en) * 2019-09-25 2021-04-01 华为技术有限公司 Neural architecture search method, image processing method and device, and storage medium
CN112396179A (en) * 2020-11-20 2021-02-23 浙江工业大学 Flexible deep learning network model compression method based on channel gradient pruning
CN113344174A (en) * 2021-04-20 2021-09-03 湖南大学 Efficient neural network structure searching method based on probability distribution

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
赖叶静; 郝珊锋; 黄定江. Methods and progress in deep neural network model compression. Journal of East China Normal University (Natural Science Edition). 2020, (Issue 05), full text. *
闵锐. Survey of efficient deep neural networks. Telecommunications Science. 2020, (Issue 04), full text. *

Also Published As

Publication number Publication date
CN114429197A (en) 2022-05-03

Similar Documents

Publication Publication Date Title
CN108985335B (en) Integrated learning prediction method for irradiation swelling of nuclear reactor cladding material
CN113076938B (en) Deep learning target detection method combining embedded hardware information
CN108051660A (en) A kind of transformer fault combined diagnosis method for establishing model and diagnostic method
CN108491226B (en) Spark configuration parameter automatic tuning method based on cluster scaling
CN106203534A (en) A kind of cost-sensitive Software Defects Predict Methods based on Boosting
CN106126589B (en) Resume search method and device
CN111445008A (en) Knowledge distillation-based neural network searching method and system
Nugroho et al. Hyper-parameter tuning based on random search for densenet optimization
CN106681305A (en) Online fault diagnosing method for Fast RVM (relevance vector machine) sewage treatment
CN112287656B (en) Text comparison method, device, equipment and storage medium
CN114429197B (en) Neural network architecture searching method, system, equipment and readable storage medium
CN114609994A (en) Fault diagnosis method and device based on multi-granularity regularization rebalance incremental learning
CN112699957B (en) Image classification optimization method based on DARTS
CN114021425A (en) Power system operation data modeling and feature selection method and device, electronic equipment and storage medium
CN111400964B (en) Fault occurrence time prediction method and device
CN112651499A (en) Structural model pruning method based on ant colony optimization algorithm and interlayer information
Li et al. Pruner to predictor: An efficient pruning method for neural networks compression
Mo et al. Simulated annealing for neural architecture search
CN112434729B (en) Intelligent fault diagnosis method based on layer regeneration network under unbalanced sample
CN111026661B (en) Comprehensive testing method and system for software usability
CN112527996A (en) Sample screening method and system, electronic equipment and storage medium
Zhang et al. Hardware-aware one-shot neural architecture search in coordinate ascent framework
Zhang et al. Design automation for fast, lightweight, and effective deep learning models: A survey
Nong Construction and Simulation of Financial Risk Prediction Model Based on LSTM
Sarah et al. LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant