WO2024012306A1 - Method, apparatus, device, medium and product for determining neural network model structure - Google Patents

Method, apparatus, device, medium and product for determining neural network model structure

Info

Publication number
WO2024012306A1
WO2024012306A1 · PCT/CN2023/105495 · CN2023105495W
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
network model
preset
model
cpu
Prior art date
2022-07-14
Application number
PCT/CN2023/105495
Other languages
English (en)
French (fr)
Inventor
王星
李卫
夏鑫
肖学锋
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2023-07-03
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2024012306A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/06 Multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA]

Definitions

  • The present disclosure relates to the field of artificial intelligence technology, for example, to methods, apparatuses, devices, media and products for determining a neural network model structure.
  • The present disclosure provides methods, apparatuses, devices, media and products for determining a neural network model structure, which can obtain a neural network model structure whose runtime CPU occupancy meets preset requirements and reduce the resource consumption of the neural network during operation.
  • The present disclosure provides a method for determining a neural network model structure, the method including: determining at least one candidate neural network model according to a preset neural network model architecture search algorithm; predicting, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • the present disclosure also provides a device for determining a neural network model structure, which device includes:
  • a candidate model determination module, configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm;
  • a CPU utilization prediction module, configured to predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
  • a target model structure determination module, configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • the present disclosure also provides an electronic device, which includes:
  • one or more processors;
  • a storage device configured to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method for determining a neural network model structure according to any one of the embodiments of the present disclosure.
  • the present disclosure also provides a storage medium containing computer-executable instructions, which when executed by a computer processor are used to perform the above-mentioned neural network model structure determination method.
  • the present disclosure also provides a computer program product, including a computer program that implements the above-mentioned neural network model structure determination method when executed by a processor.
  • Figure 1 is a schematic flowchart of a method for determining the structure of a neural network model provided by an embodiment of the present disclosure
  • Figure 2 is a schematic flowchart of yet another method for determining the structure of a neural network model provided by an embodiment of the present disclosure
  • Figure 3 is a schematic structural diagram of a neural network model structure determination device provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • the term "include" and its variants denote open-ended inclusion, that is, "including, but not limited to."
  • the term “based on” means “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; and the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • For example, in response to receiving an active request from a user, a prompt message is sent to the user to clearly remind the user that the requested operation will require the acquisition and use of the user's personal information. The user can thus autonomously choose, based on the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application, server or storage medium, that performs the operations of the technical solution of the present disclosure.
  • The prompt information may be sent to the user, for example, by way of a pop-up window, in which the prompt information may be presented as text.
  • The pop-up window may also contain a selection control allowing the user to choose "agree" or "disagree" to providing personal information to the electronic device.
  • The data involved in this technical solution shall comply with the requirements of corresponding laws, regulations and relevant provisions.
  • Figure 1 is a schematic flowchart of a method for determining a neural network model structure provided by an embodiment of the present disclosure.
  • the embodiment of the present disclosure is suitable for scenarios in which the model structure is determined through neural network architecture search.
  • This method can be performed by a neural network model structure determination device.
  • the device can be implemented in the form of software and/or hardware, and implemented through electronic equipment.
  • the electronic equipment can be a mobile terminal, a personal computer (Personal Computer, PC) or a server.
  • the method for determining the structure of the neural network model includes:
  • The performance of machine learning algorithms depends largely on a variety of hyperparameters.
  • There are three main categories of hyperparameters.
  • The first category is optimization parameters, such as the learning rate, training batch size and weight decay.
  • The second category is parameters that define the network structure, such as how many layers the network has, which operators each layer includes, and the filter size in convolutions.
  • The third category is regularization coefficients.
  • Neural network model architecture search is the process of automatically tuning the parameters of the network structure, which solves the problem of searching for optimal parameters in a high-dimensional space.
  • The process of determining at least one candidate neural network model based on the preset neural network model architecture search algorithm is to search, in a preset search space and using the search strategy corresponding to the preset neural network model architecture search algorithm, for candidate neural network models that meet the requirements of the search strategy.
  • the preset neural network model architecture search algorithm may be one or more of an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm.
  • The model architecture search strategy can be optimized through an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm, and candidate neural network models can then be searched for based on the optimized search strategy.
  • This step is the process of evaluating the performance of the candidate neural network models.
  • Priority is given to the runtime CPU occupancy of each candidate neural network model; that is, the computing-resource consumption of the model during deployment is taken into account, so that the finally trained model is not limited in practical application by excessive runtime resource consumption.
  • a preset CPU usage prediction model is used to predict the runtime CPU usage of each candidate neural network model.
  • The preset CPU occupancy prediction model is a pre-trained learning model that takes an encoded model architecture as input and outputs a corresponding CPU occupancy prediction.
  • The CPU occupancy of a candidate neural network model can be predicted each time one is found by the search; alternatively, after multiple candidate neural network models have been found in one search, their CPU occupancies can be predicted separately.
  • the target neural network model that meets the requirements can be screened out based on the preset CPU usage requirements.
  • the number of candidate neural network models may be one or more.
  • the number of candidate neural network models that meet the preset CPU usage requirements may also be one or more.
  • The target neural network model structure can be screened out according to parameters of the neural network models such as computation amount, latency, model size or network performance. The searched-out target neural network model architecture can then be trained to obtain the final application model, which can be tested in practice and put into use.
  • A preset model structure optimization algorithm, such as a pruning algorithm or a quantization algorithm, can also be used to optimize the target neural network model structure, so as to achieve model optimization and improve the learning efficiency of the model.
  • In the technical solution of the embodiments of the present disclosure, at least one candidate neural network model is first determined based on a preset neural network model architecture search algorithm; then a preset CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; and according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and a neural network model that meets the CPU occupancy requirement is selected as the target neural network model.
  • This solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use: the CPU occupancy of the model structure obtained through neural architecture search can be made to meet requirements, the resource consumption of the neural network during operation is reduced, and practical deployment and application are facilitated.
  • Figure 2 is a schematic flowchart of another method for determining the structure of a neural network model provided by an embodiment of the present disclosure.
  • The method can be performed by a model training device, which can be implemented in the form of software and/or hardware and by an electronic device; the electronic device can be a mobile terminal, a PC, a server, etc.
  • the method for determining the neural network model structure includes:
  • S210: Sample network models in a preset network search space to obtain a subnetwork sample set, and run the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork.
  • The preset network search space can be a supernetwork space containing model subnetworks of multiple structural types. For example, a certain number of subnetwork structures, such as 1,000 to 4,000, can be sampled from the preset network search space to obtain a subnetwork sample set.
  • One-hot encoding can be performed on the multiple subnetworks in the subnetwork sample set, so that the subnetwork architecture information is represented in encoded form and the multiple subnetworks can be distinguished by their encoding results. Each subnetwork in the subnetwork sample set is then run in the actual environment to test its runtime CPU occupancy. Based on the CPU occupancy test results, the encoding result of each subnetwork and its corresponding CPU occupancy test result can be combined into a sample pair, thereby constructing a dataset for training the CPU occupancy prediction model.
  • The model is trained on the sample data obtained in the previous step.
  • When the number of training rounds reaches a preset number and the loss function of the model converges, the corresponding training result, that is, the target CPU occupancy prediction model, is obtained.
  • Multiple candidate neural network models can usually be obtained through the neural network model architecture search algorithm, and the number of candidate neural network models meeting the preset CPU occupancy threshold requirement may also be greater than 1.
  • A target neural network model structure can then be selected preferentially through constraints on computation amount, parameter count and other indicator items. For example, based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, a Pareto optimal strategy or another optimal-solution strategy can be adopted to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • In one application example, on a target-object image segmentation task, the method for determining a neural network model structure of this embodiment was used to search for a target network model.
  • At the same performance level, compared with a manually designed network for the task, the searched-out target network model, after optimization with a pruning algorithm, reduced floating point operations (FLOPs), i.e., computation amount, by 20% to 25%, and reduced CPU occupancy by 1.5% to 2%.
  • In the technical solution of the embodiments of the present disclosure, a subnetwork set is first constructed, the runtime CPU occupancy of each subnetwork in the set is tested, sample pairs of model structure and CPU occupancy are formed from the test results, and the CPU occupancy prediction model is trained on these sample pairs.
  • Then, in the process of determining a neural network model based on the preset neural network model architecture search algorithm, the trained CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and the structure of a neural network model that meets the CPU occupancy requirement is selected as the target neural network model structure.
  • This solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use: the CPU occupancy of the model structure obtained through neural architecture search can be made to meet requirements, the resource consumption of the neural network during operation is reduced, and practical deployment and application are facilitated.
  • Figure 3 is a schematic structural diagram of a neural network model structure determination device provided by an embodiment of the present disclosure.
  • The device is suitable for scenarios in which a model structure is determined through neural architecture search.
  • The device for determining a neural network model structure can be implemented in the form of software and/or hardware and can be configured in an electronic device; the electronic device can be a mobile terminal, a PC or a server.
  • the neural network model structure determination device includes: a candidate model determination module 310, a CPU utilization prediction module 320, and a target model structure determination module 330.
  • The candidate model determination module 310 is configured to determine at least one candidate neural network model according to the preset neural network model architecture search algorithm; the CPU utilization prediction module 320 is configured to predict, based on the preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and the target model structure determination module 330 is configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • In the technical solution of the embodiments of the present disclosure, at least one candidate neural network model is first determined based on a preset neural network model architecture search algorithm; then a preset CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; and according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and a neural network model that meets the CPU occupancy requirement is selected as the target neural network model structure.
  • This solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use: the CPU occupancy of the model structure obtained through neural architecture search can be made to meet requirements, the resource consumption of the neural network during operation is reduced, and practical deployment and application are facilitated.
  • The target model structure determination module 330 is configured to: compare each CPU occupancy prediction value with a preset CPU occupancy threshold; and when a CPU occupancy prediction value is less than the preset CPU occupancy threshold, determine the structure of the candidate neural network model corresponding to that CPU occupancy prediction value as the target neural network model structure.
  • When the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, the target model structure determination module 330 can also be configured to: based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • The target model structure determination module 330 can also be configured to: based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • the neural network model structure determination device also includes a model training module, which is configured to train a preset CPU occupancy prediction model.
  • the training process includes:
  • The corresponding structural encodings of the multiple subnetworks are used as model input data, and the runtime CPU occupancies of the multiple subnetworks are used as the expected model output.
  • Model training is performed to obtain the preset CPU occupancy prediction model.
  • the neural network model structure determination device further includes a model optimization module, which is configured as:
  • a preset model structure optimization algorithm is used to optimize the structure of the target neural network model.
  • the preset neural network model architecture search algorithm includes:
  • one or more of an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm.
  • the above-mentioned device provided by the embodiment of the present disclosure can execute the method provided by any embodiment of the present disclosure, and has corresponding functional modules and effects for executing the method.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), tablet computers (Portable Android Device, PAD), portable multimedia players (Portable Media Player, PMP) and vehicle-mounted terminals (such as vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (Television, TV), desktop computers, and the like.
  • the electronic device 400 shown in FIG. 4 is only an example and should not bring any limitations to the functions and usage scope of the embodiments of the present disclosure.
  • The electronic device 400 may include a processing device (such as a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 402 or a program loaded from a storage device 408 into a random access memory (Random Access Memory, RAM) 403. Various programs and data required for the operation of the electronic device 400 are also stored in the RAM 403.
  • the processing device 401, ROM 402 and RAM 403 are connected to each other via a bus 404.
  • An input/output (I/O) interface 405 is also connected to the bus 404.
  • The following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; output devices 407 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker and a vibrator; storage devices 408 including, for example, a magnetic tape and a hard disk; and a communication device 409.
  • the communication device 409 may allow the electronic device 400 to communicate wirelessly or wiredly with other devices to exchange data.
  • Although FIG. 4 illustrates the electronic device 400 with various devices, it is not required to implement or have all of the illustrated devices; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication device 409, or from storage device 408, or from ROM 402.
  • When the computer program is executed by the processing device 401, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • The electronic device provided by the embodiments of the present disclosure and the method for determining a neural network model structure provided by the above embodiments belong to the same concept.
  • Technical details that are not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same effects as the above embodiments.
  • Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored.
  • When the program is executed by a processor, the method for determining a neural network model structure provided by the above embodiments is implemented.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device or device, or any combination thereof.
  • Examples of computer-readable storage media may include: an electrical connection having one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including electromagnetic signals, optical signals, or any suitable combination of the above.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium can be transmitted using any appropriate medium, including: wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • The client and server can communicate using any currently known or future-developed network protocol, such as the HyperText Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (for example, a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (for example, the Internet) and peer-to-peer networks (for example, ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; it may also exist independently without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
  • determine at least one candidate neural network model according to a preset neural network model architecture search algorithm; predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (eg, through the Internet using an Internet service provider).
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure can be implemented in software or hardware.
  • the name of the unit does not constitute a limitation on the unit itself.
  • the first acquisition unit can also be described as "the unit that acquires at least two Internet Protocol addresses.”
  • Exemplary types of hardware logic components that can be used include: field programmable gate arrays (Field Programmable Gate Array, FPGA), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), application specific standard products (Application Specific Standard Parts, ASSP), systems on chip (System on Chip, SOC), complex programmable logic devices (Complex Programming Logic Device, CPLD), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • A machine-readable medium may include an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any suitable combination of the foregoing. Examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, RAM, ROM, EPROM or flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • An embodiment of the present disclosure also provides a computer program product, including a computer program that, when executed by a processor, implements the method for determining the structure of a neural network model as provided in any embodiment of the present disclosure.
  • Computer program code for performing the operations of the present disclosure can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages, such as Java, Smalltalk and C++, and also include conventional procedural programming languages, such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (eg, through the Internet using an Internet service provider).
  • Example 1 provides a method for determining a neural network model structure, the method including:
  • determining at least one candidate neural network model according to a preset neural network model architecture search algorithm; predicting, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • Example 2 provides a method for determining the structure of a neural network model, further including:
  • determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model includes:
  • comparing each CPU occupancy prediction value with a preset CPU occupancy threshold; and when the CPU occupancy prediction value is less than the preset CPU occupancy threshold, determining the structure of the candidate neural network model corresponding to the CPU occupancy prediction value as the target neural network model structure.
  • Example 3 provides a method for determining a neural network model structure, including:
  • when the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model further includes: based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • Example 4 provides a method for determining a neural network model structure, further including:
  • determining a target neural network model structure among the structures of the multiple candidate neural network models, based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold and by adopting a preset model selection strategy, includes:
  • based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • Example 5 provides a method for determining the structure of a neural network model, further including:
  • the training process of the preset CPU occupancy prediction model includes:
  • sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
  • Example 6 provides a method for determining the structure of a neural network model, further including:
  • a preset model structure optimization algorithm is used to perform structural optimization on the target neural network model structure.
  • Example 7 provides a method for determining the structure of a neural network model, which also includes:
  • the preset neural network model architecture search algorithm includes:
  • one or more of a neural architecture search algorithm, a random search algorithm, a reinforcement learning algorithm, or a Bayesian search algorithm.
  • Example 8 provides a neural network model structure determination device, including:
  • a candidate model determination module, configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm;
  • a CPU utilization prediction module, configured to predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
  • a target model structure determination module, configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  • Example 9 provides a neural network model structure determination device, further including:
  • the target model structure determination module is configured to:
  • compare each CPU occupancy prediction value with a preset CPU occupancy threshold; and when the CPU occupancy prediction value is less than the preset CPU occupancy threshold, determine the structure of the candidate neural network model corresponding to the CPU occupancy prediction value as the target neural network model structure.
  • Example 10 provides a device for determining a neural network model structure, further including:
  • when the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, the target model structure determination module can also be configured to: based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • Example 11 provides a neural network model structure determination device, further comprising:
  • the target model structure determination module can also be configured to:
  • based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  • Example 12 provides a neural network model structure determination device, further comprising:
  • the neural network model structure determination device further includes a model training module configured to train a preset CPU occupancy prediction model.
  • the training process includes:
  • sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
  • Example 13 provides a neural network model structure determination device, further including:
  • the neural network model structure determination device further includes a model optimization module, which is configured to:
  • a preset model structure optimization algorithm is used to optimize the structure of the target neural network model.
  • Example 14 provides a neural network model structure determination device, further comprising:
  • the preset neural network model architecture search algorithm includes:
  • one or more of a neural architecture search algorithm, a random search algorithm, a reinforcement learning algorithm, or a Bayesian search algorithm.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides methods, apparatuses, devices, media and products for determining a neural network model structure. The method for determining a neural network model structure includes: determining at least one candidate neural network model according to a preset neural network model architecture search algorithm; predicting, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.

Description

Method, apparatus, device, medium and product for determining a neural network model structure
This application claims priority to Chinese Patent Application No. 202210832510.X, filed with the Chinese Patent Office on July 14, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of artificial intelligence technology, for example, to methods, apparatuses, devices, media and products for determining a neural network model structure.
Background
In the field of deep learning, many neural network structures have been determined by automatic search algorithms. However, because most of these neural network models are large, or because of their network architectures, the deployed models suffer from a series of resource-consumption problems, such as high central processing unit (Central Processing Unit, CPU) occupancy, which limits the practical application of the models. Therefore, when designing a neural network model structure, attention must be paid not only to the final performance of the network but also to multiple metrics of the final model after actual deployment.
Although parameters such as neural network model size, computation amount and latency, considered together with network performance, have already been taken into account in automatic neural architecture search, with certain success, the CPU occupancy of the model during actual operation has not been considered in the model search process. As a result, some of the designed models exhibit high CPU occupancy and high computing-resource consumption during actual operation.
Summary
The present disclosure provides methods, apparatuses, devices, media and products for determining a neural network model structure, which can obtain a neural network model structure whose runtime CPU occupancy meets preset requirements and reduce the resource consumption of the neural network during operation.
In a first aspect, the present disclosure provides a method for determining a neural network model structure, the method including:
determining at least one candidate neural network model according to a preset neural network model architecture search algorithm;
predicting, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
In a second aspect, the present disclosure further provides an apparatus for determining a neural network model structure, the apparatus including:
a candidate model determination module, configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm;
a CPU utilization prediction module, configured to predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
a target model structure determination module, configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
In a third aspect, the present disclosure further provides an electronic device, including:
one or more processors; and
a storage apparatus, configured to store one or more programs,
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining a neural network model structure according to any embodiment of the present disclosure.
In a fourth aspect, the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the above method for determining a neural network model structure.
In a fifth aspect, the present disclosure further provides a computer program product including a computer program which, when executed by a processor, implements the above method for determining a neural network model structure.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a method for determining a neural network model structure provided by an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of another method for determining a neural network model structure provided by an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an apparatus for determining a neural network model structure provided by an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described below with reference to the drawings. Although some embodiments of the present disclosure are shown in the drawings, the present disclosure may be implemented in many forms; these embodiments are provided for the understanding of the present disclosure. The drawings and embodiments of the present disclosure are for exemplary purposes only.
The multiple steps recited in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, the method embodiments may include additional steps and/or omit performing the steps shown. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variants denote open-ended inclusion, that is, "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
Concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules or units, and are not used to limit the order of, or the interdependence between, the functions performed by these apparatuses, modules or units.
The modifiers "a/an" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand them as "one or more" unless the context clearly indicates otherwise.
Before the technical solutions disclosed in the embodiments of the present disclosure are used, the user shall be informed, in an appropriate manner in accordance with relevant laws and regulations, of the type, scope of use and usage scenarios of the personal information involved in the present disclosure, and the user's authorization shall be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly remind the user that the operation the user has requested will require acquiring and using the user's personal information. The user can thus autonomously choose, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application, server or storage medium, that performs the operations of the technical solution of the present disclosure.
As one implementation, in response to receiving an active request from the user, the prompt information may be sent to the user, for example, by way of a pop-up window, in which the prompt information may be presented as text. In addition, the pop-up window may also carry a selection control for the user to choose to "agree" or "disagree" to provide personal information to the electronic device.
The above process of notifying the user and obtaining the user's authorization is only illustrative and does not limit the implementations of the present disclosure; other manners that satisfy relevant laws and regulations may also be applied to the implementations of the present disclosure.
The data involved in this technical solution (including the data itself and the acquisition or use of the data) shall comply with the requirements of corresponding laws, regulations and relevant provisions.
Fig. 1 is a schematic flowchart of a method for determining a neural network model structure provided by an embodiment of the present disclosure. The embodiment of the present disclosure is applicable to scenarios in which a model structure is determined through neural architecture search. The method may be performed by an apparatus for determining a neural network model structure; the apparatus may be implemented in the form of software and/or hardware and by an electronic device, and the electronic device may be a mobile terminal, a personal computer (Personal Computer, PC) or a server.
As shown in Fig. 1, the method for determining a neural network model structure includes:
S110. Determine at least one candidate neural network model according to a preset neural network model architecture search algorithm.
In the field of machine learning, the performance of a machine learning algorithm depends largely on a variety of hyperparameters. There are three main categories of hyperparameters. The first category is optimization parameters, such as the learning rate, training batch size and weight decay. The second category is parameters that define the network structure, such as how many layers the network has, which operators each layer includes, and the filter size in convolutions. The third category is regularization coefficients. Neural architecture search (Neural Architecture Search, NAS) is the process of automatically tuning the parameters of the network structure, which solves the problem of searching for optimal parameters in a high-dimensional space.
The process of determining at least one candidate neural network model based on the preset neural network model architecture search algorithm is to search, in a preset search space and using the search strategy corresponding to the preset neural network model architecture search algorithm, for candidate neural network models that meet the requirements of the search strategy.
The preset neural network model architecture search algorithm may be one or more of an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm.
In one implementation, the model architecture search strategy may be optimized through an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm, and candidate neural network models may then be searched for based on the optimized search strategy.
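The patent does not prescribe a concrete search procedure; as a minimal Python sketch of how one of the listed strategies, an evolutionary search, might propose candidate architectures, the following is illustrative only. The toy search space of per-layer operator choices, the population sizes and all names here are assumptions, not part of the disclosure.

```python
import random

# Hypothetical toy search space: each of 8 layers picks one operator.
OPS = ["conv3x3", "conv5x5", "dwconv3x3", "skip"]
NUM_LAYERS = 8

def random_arch():
    """Sample a random architecture: one operator per layer."""
    return [random.choice(OPS) for _ in range(NUM_LAYERS)]

def mutate(arch, p=0.2):
    """Re-draw each layer's operator with probability p."""
    return [random.choice(OPS) if random.random() < p else op for op in arch]

def evolutionary_search(fitness_fn, population=20, generations=10, keep=5):
    """Simple (mu + lambda) evolutionary loop; returns candidate models."""
    pop = [random_arch() for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=fitness_fn, reverse=True)   # higher fitness is better
        parents = pop[:keep]                     # survivors of this generation
        children = [mutate(random.choice(parents))
                    for _ in range(population - keep)]
        pop = parents + children
    return pop[:keep]                            # candidate neural network models
```

Here `fitness_fn` would score an architecture, for example by validation accuracy, optionally penalized by the CPU occupancy prediction described next.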
S120. Predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values.
After the candidate neural network models are determined, this step is the process of evaluating the performance of the candidate neural network models. In this embodiment, priority is given to the runtime CPU occupancy of the candidate neural network models; that is, the computing-resource consumption of the model during deployment and application is taken into account, so that the finally trained model is not limited in practical application by excessive runtime resource consumption.
In this implementation, a preset CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model. The preset CPU occupancy prediction model is a pre-trained learning model that takes an encoded model architecture as input and outputs a corresponding CPU occupancy prediction. The CPU occupancy of a candidate neural network model may be predicted each time one is found by the search; alternatively, after multiple candidate neural network models have been found in one search, their CPU occupancies may be predicted separately.
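The disclosure requires only that the predictor map an encoded architecture to a CPU occupancy prediction; its internal form is not specified. A small feed-forward regressor in PyTorch is one plausible realization, sketched below under that assumption (the layer sizes are arbitrary choices).

```python
import torch
import torch.nn as nn

class CpuOccupancyPredictor(nn.Module):
    """Regress runtime CPU occupancy (a fraction in [0, 1]) from an
    architecture encoding, e.g. a one-hot vector of length
    NUM_LAYERS * len(OPS) for the toy space sketched earlier."""

    def __init__(self, encoding_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(encoding_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # keep the prediction in [0, 1]
        )

    def forward(self, encoding: torch.Tensor) -> torch.Tensor:
        return self.net(encoding).squeeze(-1)
```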
S130. Determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
After the CPU occupancy of each candidate neural network model has been determined, the target neural network models that meet the requirements can be screened out according to the preset CPU occupancy requirements.
The CPU occupancy prediction value obtained in the previous step may be compared with a preset CPU occupancy threshold; when a CPU occupancy prediction value is less than the preset CPU occupancy threshold, the structure of the candidate neural network model corresponding to that CPU occupancy prediction value is determined as the target neural network model structure.
The number of candidate neural network models may be one or more, and accordingly the number of candidate neural network models meeting the preset CPU occupancy requirement may also be one or more. When more than one candidate neural network model meets the preset CPU occupancy requirement, the target neural network model structure can be screened out according to parameters of the neural network models such as computation amount, latency, model size or network performance. The searched-out target neural network model architecture can then be trained to obtain the final application model, which can be tested in practice and put into use.
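Screening candidates against the preset CPU occupancy threshold can be a plain filter over the predictions. A sketch follows, reusing the predictor above; the 30% threshold and the `encode` helper are illustrative assumptions, not values from the patent.

```python
import torch

def filter_by_cpu_threshold(candidates, predictor, encode, threshold=0.30):
    """Keep candidates whose predicted CPU occupancy is below the threshold.

    `encode` turns an architecture into the predictor's input tensor;
    returns (architecture, predicted occupancy) pairs.
    """
    kept = []
    for arch in candidates:
        with torch.no_grad():                     # inference only
            predicted = predictor(encode(arch)).item()
        if predicted < threshold:
            kept.append((arch, predicted))
    return kept
```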
In one implementation, on the basis of the screened-out target neural network model structure, a preset model structure optimization algorithm, such as a pruning algorithm or a quantization algorithm, may further be used to optimize the target neural network model structure, so as to achieve model optimization and improve the learning efficiency of the model.
In the technical solution of the embodiments of the present disclosure, at least one candidate neural network model is first determined based on a preset neural network model architecture search algorithm; then a preset CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; and according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and a neural network model that meets the CPU occupancy requirement is selected as the target neural network model. The technical solution of the embodiments of the present disclosure solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use; it enables the CPU occupancy of the model structure obtained through neural architecture search to meet requirements, reduces the resource consumption of the neural network during operation, and facilitates practical deployment and application.
Fig. 2 is a schematic flowchart of another method for determining a neural network model structure provided by an embodiment of the present disclosure. This method flow covers the process from training the CPU occupancy prediction model to performing model architecture search based on CPU occupancy. The method may be performed by a model training apparatus, which may be implemented in the form of software and/or hardware and by an electronic device; the electronic device may be a mobile terminal, a PC, a server, etc.
As shown in Fig. 2, the method for determining a neural network model structure includes:
S210. Sample network models in a preset network search space to obtain a subnetwork sample set, and run the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork.
The preset network search space may be a supernetwork space containing model subnetworks of multiple structural types. For example, a certain number of subnetwork architectures, such as 1,000 to 4,000, may be sampled from the preset network search space to obtain a subnetwork sample set.
One-hot encoding may be performed on the multiple subnetworks in the subnetwork sample set, so that the subnetwork architecture information is represented in encoded form and the multiple subnetworks can be distinguished by their encoding results. Each subnetwork in the subnetwork sample set is then run in the actual environment to test its runtime CPU occupancy. Based on the CPU occupancy test results, the encoding result of each subnetwork and its corresponding CPU occupancy test result can be combined into a sample pair, thereby constructing a dataset for training the CPU occupancy prediction model.
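As an illustration of the sample-pair construction described above, the sketch below one-hot encodes subnetworks from the toy space used earlier (reusing `OPS` and `NUM_LAYERS`) and pairs each encoding with a measured occupancy. `measure_cpu_occupancy` stands in for actually running the subnetwork in the target environment; it is an assumed hook, not something the patent provides.

```python
import numpy as np

def one_hot_encode(arch):
    """One-hot encode a subnetwork: one block of len(OPS) entries per layer."""
    vec = np.zeros(NUM_LAYERS * len(OPS), dtype=np.float32)
    for layer, op in enumerate(arch):
        vec[layer * len(OPS) + OPS.index(op)] = 1.0
    return vec

def build_dataset(subnets, measure_cpu_occupancy):
    """Pair each subnetwork's encoding with its measured runtime CPU occupancy."""
    encodings = np.stack([one_hot_encode(a) for a in subnets])
    occupancies = np.array([measure_cpu_occupancy(a) for a in subnets],
                           dtype=np.float32)
    return encodings, occupancies   # training inputs and expected outputs
```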
S220. Use the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, and perform model training to obtain a target CPU occupancy prediction model.
In this step, the model is trained on the sample data obtained in the previous step. When the number of training rounds reaches a preset number and the loss function of the model converges, the corresponding training result, that is, the target CPU occupancy prediction model, is obtained.
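A minimal training loop matching this description, stopping after a preset number of rounds or once the loss stops changing, might look as follows. It reuses the `CpuOccupancyPredictor` sketch from above; the optimizer, learning rate and convergence tolerance are assumptions.

```python
import torch
import torch.nn as nn

def train_predictor(encodings, occupancies, encoding_dim,
                    epochs=200, lr=1e-3, tol=1e-5):
    """Train the occupancy predictor on (encoding, occupancy) sample pairs."""
    model = CpuOccupancyPredictor(encoding_dim)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    x = torch.from_numpy(encodings)
    y = torch.from_numpy(occupancies)
    previous = float("inf")
    for _ in range(epochs):                 # preset number of training rounds
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        if abs(previous - loss.item()) < tol:   # loss has converged
            break
        previous = loss.item()
    return model                            # the target CPU occupancy predictor
```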
S230. Determine at least one candidate neural network model according to a preset neural network model architecture search algorithm.
S240. Predict, based on the target CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values.
S250. Compare each CPU occupancy prediction value with a preset CPU occupancy threshold.
S260. Based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
Usually, multiple candidate neural network models can be obtained through the preset neural network model architecture search algorithm, and the number of candidate neural network models meeting the preset CPU occupancy threshold requirement is also greater than 1. On the basis of satisfying the CPU occupancy constraint, a target neural network model structure can then be selected preferentially through constraints on computation amount, parameter count and other indicator items. For example, based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, a Pareto optimal strategy or another optimal-solution strategy may be adopted to determine a target neural network model structure among the structures of the multiple candidate neural network models.
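The Pareto optimal strategy mentioned here can be sketched as computing the non-dominated set over the chosen indicators (for example computation amount, model size and model latency, all to be minimized) among the candidates that passed the CPU occupancy threshold. This generic implementation is an illustration, not the patent's specific procedure.

```python
def pareto_front(candidates):
    """Return candidates not dominated on their indicator tuples.

    `candidates` is a list of (arch, metrics) pairs, where metrics is a
    tuple such as (flops, model_size, latency), all to be minimized.
    A candidate is dominated if some other candidate is no worse on
    every indicator and strictly better on at least one.
    """
    front = []
    for arch, m in candidates:
        dominated = any(
            all(o <= v for o, v in zip(other, m)) and
            any(o < v for o, v in zip(other, m))
            for _, other in candidates
        )
        if not dominated:
            front.append((arch, m))
    return front

# One target structure can then be picked from the front, for example
# the candidate with the lowest latency among the non-dominated ones.
```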
In one application example, on a target-object image segmentation task, a target network model was searched out using the method for determining a neural network model structure of this embodiment. At the same performance level, compared with a manually designed network for the target-object image segmentation task, the searched-out target network model, after being optimized with a pruning algorithm, reduced floating point operations (floating point operations, FLOPs), i.e., computation amount, by 20% to 25%, and reduced CPU occupancy by 1.5% to 2%.
In the technical solution of the embodiments of the present disclosure, a subnetwork set is first constructed, the runtime CPU occupancy of each subnetwork in the set is tested, sample pairs of model structure and CPU occupancy are formed from the test results, and a CPU occupancy prediction model is trained on these sample pairs. Then, in the process of determining a neural network model based on the preset neural network model architecture search algorithm, the trained CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; and according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and the structure of a neural network model that meets the CPU occupancy requirement is selected as the target neural network model structure. The technical solution of the embodiments of the present disclosure solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use; it enables the CPU occupancy of the model structure obtained through neural architecture search to meet requirements, reduces the resource consumption of the neural network during operation, and facilitates practical deployment and application.
Fig. 3 is a schematic structural diagram of an apparatus for determining a neural network model structure provided by an embodiment of the present disclosure. The apparatus is suitable for scenarios in which a model structure is determined through neural architecture search. The apparatus for determining a neural network model structure may be implemented in the form of software and/or hardware and may be configured in an electronic device; the electronic device may be a mobile terminal, a PC, a server, etc.
As shown in Fig. 3, the apparatus for determining a neural network model structure includes: a candidate model determination module 310, a CPU utilization prediction module 320 and a target model structure determination module 330.
The candidate model determination module 310 is configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm; the CPU utilization prediction module 320 is configured to predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and the target model structure determination module 330 is configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
In the technical solution of the embodiments of the present disclosure, at least one candidate neural network model is first determined based on a preset neural network model architecture search algorithm; then a preset CPU occupancy prediction model is used to predict the runtime CPU occupancy of each candidate neural network model, yielding corresponding CPU occupancy prediction values; and according to at least one CPU occupancy prediction value, a target neural network model structure is determined among the structures of the at least one candidate neural network model. That is, CPU occupancy is treated as a hard constraint, and a neural network model that meets the CPU occupancy requirement is selected as the target neural network model structure. The technical solution of the embodiments of the present disclosure solves the problem that a model structure obtained through neural architecture search has high CPU occupancy at runtime and is therefore limited in use; it enables the CPU occupancy of the model structure obtained through neural architecture search to meet requirements, reduces the resource consumption of the neural network during operation, and facilitates practical deployment and application.
On the basis of any technical solution in the embodiments of the present disclosure, the target model structure determination module 330 is configured to:
compare each CPU occupancy prediction value with a preset CPU occupancy threshold; and when a CPU occupancy prediction value is less than the preset CPU occupancy threshold, determine the structure of the candidate neural network model corresponding to that CPU occupancy prediction value as the target neural network model structure.
On the basis of any technical solution in the embodiments of the present disclosure, when the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, the target model structure determination module 330 may also be configured to:
based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
On the basis of any technical solution in the embodiments of the present disclosure, the target model structure determination module 330 may also be configured to:
based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
On the basis of any technical solution in the embodiments of the present disclosure, the apparatus for determining a neural network model structure further includes a model training module, configured to train the preset CPU occupancy prediction model; the training process includes:
sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
On the basis of any technical solution in the embodiments of the present disclosure, the apparatus for determining a neural network model structure further includes a model optimization module, configured to:
use a preset model structure optimization algorithm to perform structural optimization on the target neural network model structure.
On the basis of any technical solution in the embodiments of the present disclosure, the preset neural network model architecture search algorithm includes:
one or more of an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm.
The above apparatus provided by the embodiments of the present disclosure can execute the method provided by any embodiment of the present disclosure, and has functional modules and effects corresponding to executing the method.
The multiple units and modules included in the above apparatus are only divided according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized; in addition, the names of the multiple functional units are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the embodiments of the present disclosure.
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure. Reference is now made to Fig. 4, which shows a schematic structural diagram of an electronic device (e.g., the terminal device or server in Fig. 4) 400 suitable for implementing embodiments of the present disclosure. Terminal devices in the embodiments of the present disclosure may include mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), tablet computers (Portable Android Device, PAD), portable multimedia players (Portable Media Player, PMP) and vehicle-mounted terminals (e.g., vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (Television, TV), desktop computers, and the like. The electronic device 400 shown in Fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 4, the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 402 or a program loaded from a storage device 408 into a random access memory (Random Access Memory, RAM) 403. Various programs and data required for the operation of the electronic device 400 are also stored in the RAM 403. The processing device 401, the ROM 402 and the RAM 403 are connected to one another via a bus 404. An input/output (Input/Output, I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; output devices 407 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker and a vibrator; storage devices 408 including, for example, a magnetic tape and a hard disk; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 4 shows the electronic device 400 with various devices, it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
According to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 409, or installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing device 401, the above functions defined in the methods of the embodiments of the present disclosure are performed.
The names of the messages or information exchanged between multiple devices in the implementations of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
The electronic device provided by the embodiments of the present disclosure and the method for determining a neural network model structure provided by the above embodiments belong to the same concept; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same effects as the above embodiments.
An embodiment of the present disclosure provides a computer storage medium on which a computer program is stored; when the program is executed by a processor, the method for determining a neural network model structure provided by the above embodiments is implemented.
The above computer-readable medium of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. Examples of the computer-readable storage medium may include: an electrical connection having one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program which can be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take a variety of forms, including an electromagnetic signal, an optical signal or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including an electric wire, an optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
In some implementations, the client and the server may communicate using any currently known or future-developed network protocol such as the HyperText Transfer Protocol (HyperText Transfer Protocol, HTTP), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), an internetwork (e.g., the Internet) and a peer-to-peer network (e.g., an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
determine at least one candidate neural network model according to a preset neural network model architecture search algorithm; predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, where the programming languages include object-oriented programming languages, such as Java, Smalltalk and C++, and also include conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment or portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware, where the name of a unit does not, in one case, constitute a limitation on the unit itself; for example, a first acquisition unit may also be described as "a unit that acquires at least two Internet Protocol addresses".
The functions described herein above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (Field Programmable Gate Array, FPGA), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), application specific standard products (Application Specific Standard Parts, ASSP), systems on chip (System on Chip, SOC), complex programmable logic devices (Complex Programming Logic Device, CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the above. Examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an EPROM or flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
An embodiment of the present disclosure further provides a computer program product, including a computer program which, when executed by a processor, implements the method for determining a neural network model structure as provided in any embodiment of the present disclosure.
In implementing the computer program product, the computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof; the programming languages include object-oriented programming languages, such as Java, Smalltalk and C++, and also include conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).
According to one or more embodiments of the present disclosure, [Example 1] provides a method for determining a neural network model structure, the method including:
determining at least one candidate neural network model according to a preset neural network model architecture search algorithm;
predicting, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
According to one or more embodiments of the present disclosure, [Example 2] provides a method for determining a neural network model structure, further including:
In some implementations, determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model includes:
comparing each CPU occupancy prediction value with a preset CPU occupancy threshold; and
when the CPU occupancy prediction value is less than the preset CPU occupancy threshold, determining the structure of the candidate neural network model corresponding to the CPU occupancy prediction value as the target neural network model structure.
According to one or more embodiments of the present disclosure, [Example 3] provides a method for determining a neural network model structure, including:
In some implementations, when the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model further includes:
based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
According to one or more embodiments of the present disclosure, [Example 4] provides a method for determining a neural network model structure, further including:
In some implementations, determining a target neural network model structure among the structures of the multiple candidate neural network models, based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold and by adopting a preset model selection strategy, includes:
based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
According to one or more embodiments of the present disclosure, [Example 5] provides a method for determining a neural network model structure, further including:
In some implementations, the training process of the preset CPU occupancy prediction model includes:
sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and
using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
According to one or more embodiments of the present disclosure, [Example 6] provides a method for determining a neural network model structure, further including:
In some implementations, a preset model structure optimization algorithm is used to perform structural optimization on the target neural network model structure.
According to one or more embodiments of the present disclosure, [Example 7] provides a method for determining a neural network model structure, further including:
In some implementations, the preset neural network model architecture search algorithm includes:
one or more of a neural architecture search algorithm, a random search algorithm, a reinforcement learning algorithm or a Bayesian search algorithm.
According to one or more embodiments of the present disclosure, [Example 8] provides an apparatus for determining a neural network model structure, including:
a candidate model determination module, configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm;
a CPU utilization prediction module, configured to predict, based on a preset CPU occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
a target model structure determination module, configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
According to one or more embodiments of the present disclosure, [Example 9] provides an apparatus for determining a neural network model structure, further including:
In one implementation, the target model structure determination module is configured to:
compare each CPU occupancy prediction value with a preset CPU occupancy threshold; and
when the CPU occupancy prediction value is less than the preset CPU occupancy threshold, determine the structure of the candidate neural network model corresponding to the CPU occupancy prediction value as the target neural network model structure.
According to one or more embodiments of the present disclosure, [Example 10] provides an apparatus for determining a neural network model structure, further including:
In one implementation, when the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, the target model structure determination module may also be configured to:
based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
According to one or more embodiments of the present disclosure, [Example 11] provides an apparatus for determining a neural network model structure, further including:
In one implementation, the target model structure determination module may also be configured to:
based on one or more indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopt a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
According to one or more embodiments of the present disclosure, [Example 12] provides an apparatus for determining a neural network model structure, further including:
In one implementation, the apparatus for determining a neural network model structure further includes a model training module, configured to train the preset CPU occupancy prediction model; the training process includes:
sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and
using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
According to one or more embodiments of the present disclosure, [Example 13] provides an apparatus for determining a neural network model structure, further including:
In one implementation, the apparatus for determining a neural network model structure further includes a model optimization module, configured to:
use a preset model structure optimization algorithm to perform structural optimization on the target neural network model structure.
According to one or more embodiments of the present disclosure, [Example 14] provides an apparatus for determining a neural network model structure, further including:
In one implementation, the preset neural network model architecture search algorithm includes:
one or more of a neural architecture search algorithm, a random search algorithm, a reinforcement learning algorithm or a Bayesian search algorithm.
In addition, although multiple operations are depicted in a particular order, this should not be understood as requiring that these operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although multiple implementation details are included in the above discussion, these should not be interpreted as limiting the scope of the present disclosure. Some features described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments individually or in any suitable sub-combination.

Claims (11)

  1. A method for determining a neural network model structure, comprising:
    determining at least one candidate neural network model according to a preset neural network model architecture search algorithm;
    predicting, based on a preset central processing unit (CPU) occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
    determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  2. The method according to claim 1, wherein determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model comprises:
    comparing each CPU occupancy prediction value with a preset CPU occupancy threshold; and
    in a case where the CPU occupancy prediction value is less than the preset CPU occupancy threshold, determining the structure of the candidate neural network model corresponding to the CPU occupancy prediction value as the target neural network model structure.
  3. The method according to claim 2, wherein, in a case where the number of candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold is greater than or equal to 2, determining, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model further comprises:
    based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  4. The method according to claim 3, wherein, based on at least one indicator item data of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a preset model selection strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models comprises:
    based on at least one indicator data among the computation amount, model size and model latency of the multiple candidate neural network models whose CPU occupancy prediction values are less than the preset CPU occupancy threshold, adopting a Pareto optimal strategy to determine a target neural network model structure among the structures of the multiple candidate neural network models.
  5. The method according to claim 1, wherein the training process of the preset CPU occupancy prediction model comprises:
    sampling network models in a preset network search space to obtain a subnetwork sample set, and running the multiple subnetworks in the subnetwork sample set respectively to determine the runtime CPU occupancy of each subnetwork; and
    using the corresponding structural encodings of the multiple subnetworks as model input data and the runtime CPU occupancies of the multiple subnetworks as the expected model output, performing model training to obtain the preset CPU occupancy prediction model.
  6. The method according to claim 1, further comprising:
    using a preset model structure optimization algorithm to perform structural optimization on the target neural network model structure.
  7. The method according to any one of claims 1-6, wherein the preset neural network model architecture search algorithm comprises:
    at least one of an evolutionary search algorithm, a random search algorithm, a reinforcement learning algorithm, a gradient optimization algorithm or a Bayesian search algorithm.
  8. An apparatus for determining a neural network model structure, comprising:
    a candidate model determination module, configured to determine at least one candidate neural network model according to a preset neural network model architecture search algorithm;
    a CPU utilization prediction module, configured to predict, based on a preset central processing unit (CPU) occupancy prediction model, the runtime CPU occupancy of each candidate neural network model to obtain CPU occupancy prediction values; and
    a target model structure determination module, configured to determine, according to at least one CPU occupancy prediction value, a target neural network model structure among the structures of the at least one candidate neural network model.
  9. An electronic device, comprising:
    at least one processor; and
    a storage apparatus, configured to store at least one program,
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for determining a neural network model structure according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method for determining a neural network model structure according to any one of claims 1-7.
  11. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the method for determining a neural network model structure according to any one of claims 1-7.
PCT/CN2023/105495 2022-07-14 2023-07-03 Method, apparatus, device, medium and product for determining neural network model structure WO2024012306A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210832510.X 2022-07-14
CN202210832510.XA CN117454959A (zh) Method, apparatus, device, medium and product for determining neural network model structure

Publications (1)

Publication Number Publication Date
WO2024012306A1 true WO2024012306A1 (zh) 2024-01-18

Family

ID: 89535526

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/105495 WO2024012306A1 (zh) Method, apparatus, device, medium and product for determining neural network model structure

Country Status (2)

Country Link
CN (1) CN117454959A (zh)
WO (1) WO2024012306A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382868A (zh) * 2020-02-21 2020-07-07 华为技术有限公司 Neural network structure search method and neural network structure search apparatus
CN112949842A (zh) * 2021-05-13 2021-06-11 北京市商汤科技开发有限公司 Neural network structure search method and apparatus, computer device, and storage medium
CN113033784A (zh) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network structures for CPU and GPU devices
CN113407806A (zh) * 2020-10-12 2021-09-17 腾讯科技(深圳)有限公司 Network structure search method, apparatus, device, and computer-readable storage medium
US20220147680A1 (en) * 2020-11-12 2022-05-12 Samsung Electronics Co., Ltd. Method for co-design of hardware and neural network architectures using coarse-to-fine search, two-phased block distillation and neural hardware predictor

Also Published As

Publication number Publication date
CN117454959A (zh) 2024-01-26

Similar Documents

Publication Publication Date Title
CN114091617A (zh) Federated learning modeling optimization method, electronic device, storage medium and program product
CN110765354A (zh) Information push method and apparatus, electronic device and storage medium
CN116703131B (zh) Power resource allocation method and apparatus, electronic device and computer-readable medium
CN110781373A (zh) List update method and apparatus, readable medium and electronic device
CN113392018B (zh) Traffic distribution method and apparatus, storage medium and electronic device
CN116388112B (zh) Abnormal supply-side power-off method and apparatus, electronic device and computer-readable medium
CN117236805B (zh) Power equipment control method and apparatus, electronic device and computer-readable medium
CN117241092A (zh) Video processing method and apparatus, storage medium and electronic device
WO2024012306A1 (zh) Method, apparatus, device, medium and product for determining neural network model structure
CN115907136B (zh) Electric vehicle scheduling method, apparatus, device and computer-readable medium
CN108770014B (zh) Computing evaluation method, system and apparatus for a network server, and readable storage medium
CN116483891A (zh) Information prediction method, apparatus, device and storage medium
CN115759444A (zh) Power equipment allocation method and apparatus, electronic device and computer-readable medium
CN111898061B (zh) Network search method and apparatus, electronic device and computer-readable medium
CN113435528B (zh) Object classification method and apparatus, readable medium and electronic device
CN114859935A (zh) Path planning method and apparatus applied to multi-node networking, product and medium
CN111680754B (zh) Image classification method and apparatus, electronic device and computer-readable storage medium
CN114692898A (zh) MEC federated learning method and apparatus, and computer-readable storage medium
CN116800834B (zh) Virtual gift merging method and apparatus, electronic device and computer-readable medium
CN117235535B (zh) Abnormal supply-side power-off method and apparatus, electronic device and medium
CN114978794B (зh) Network access method and apparatus, storage medium and electronic device
WO2024007938A1 (zh) Multi-task prediction method and apparatus, electronic device and storage medium
CN111582482B (zh) Method, apparatus, device and medium for generating network model information
CN116700956B (зh) Request processing method and apparatus, electronic device and computer-readable medium
CN117978612B (зh) Network fault detection method, storage medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23838798

Country of ref document: EP

Kind code of ref document: A1