WO2021068247A1 - Neural network scheduling method and apparatus, computer device, and readable storage medium - Google Patents

Neural network scheduling method and apparatus, computer device, and readable storage medium

Info

Publication number
WO2021068247A1
WO2021068247A1, PCT/CN2019/110823, CN2019110823W
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
data
network model
memory
base address
Prior art date
Application number
PCT/CN2019/110823
Other languages
English (en)
French (fr)
Other versions
WO2021068247A8 (zh)
Inventor
黄炯凯
蔡权雄
牛昕宇
Original Assignee
深圳鲲云信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司
Priority to CN201980066984.4A priority Critical patent/CN113196232A/zh
Priority to PCT/CN2019/110823 priority patent/WO2021068247A1/zh
Priority to US17/768,241 priority patent/US20230273826A1/en
Publication of WO2021068247A1 publication Critical patent/WO2021068247A1/zh
Publication of WO2021068247A8 publication Critical patent/WO2021068247A8/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a neural network scheduling method and apparatus, a computer device, and a readable storage medium.
  • In certain specific application scenarios of artificial intelligence (autonomous driving, face recognition, etc.), multiple neural network models need to be run to obtain the desired result.
  • For example, in a face recognition scenario, one neural network model is first called to detect whether an image contains a human face; if it does, another neural network model is then dispatched to recognize the face in that image, finally obtaining the desired result.
  • The current solution in the prior art is to use multiple hardware devices, each running a different neural network model, which increases equipment cost and reduces the utilization of hardware resources.
  • the purpose of the embodiments of the present application is to propose a neural network scheduling method, device, computer equipment, and readable storage medium, so as to reduce additional equipment costs and improve the utilization of hardware resources.
  • the embodiments of the present application provide a neural network scheduling method, which adopts the following technical solutions:
  • the neural network scheduling method includes: loading at least one pre-trained neural network model into a model storage area in a memory, and obtaining a base address of the neural network model, where the memory also includes a public data storage area; obtaining the base address of the neural network model corresponding to the task type, and reading the data in the public data storage area; and calling the corresponding neural network model based on the base address to compute on the data, obtaining and outputting the calculation result.
  • model storage area is used to store the network structure and parameters of the neural network model.
  • the base address is the initial storage address of a neural network model in the memory.
  • the step of invoking the corresponding neural network model based on the base address to calculate the data specifically includes: preprocessing the data; and inputting the preprocessed data into the called neural network for calculation.
  • the step of inputting the preprocessed data into the called neural network for calculation specifically includes: configuring corresponding hardware resources according to the network structure of the neural network model; and calculating the preprocessed data based on the hardware resources.
  • the training of the pre-trained neural network model includes constructing a neural network, selecting a training data set, performing neural network training, and verifying the neural network.
  • an embodiment of the present application also provides a neural network scheduling device, which adopts the following technical solutions:
  • the neural network scheduling device includes:
  • a loading module configured to load at least one pre-trained neural network model into a model storage area in a memory, and obtain a base address of the neural network model, the memory also includes a public data storage area;
  • An obtaining module configured to obtain the base address of the neural network model corresponding to the task type, and read the data in the public data storage area;
  • the calculation module is used to call the corresponding neural network model to calculate the data based on the base address, obtain the calculation result and output it.
  • the calculation module includes: a preprocessing sub-module, used to preprocess the data; and
  • a calculation sub-module, used to input the preprocessed data into the called neural network for calculation.
  • the embodiments of the present application also provide a computer device, which adopts the following technical solutions:
  • the computer device includes a memory and a processor, a computer program is stored in the memory, and when the processor executes the computer program, the steps of any one of the neural network scheduling methods proposed in the embodiments of the present application are implemented.
  • the embodiments of the present application also provide a computer-readable storage medium, which adopts the following technical solutions:
  • a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of any one of the neural network scheduling methods proposed in the embodiments of the present application are implemented.
  • Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects: at least one pre-trained neural network model is loaded into the model storage area in the memory and the base address of the neural network model is obtained, the memory also including a public data storage area; the base address of the corresponding neural network model is obtained according to the task type, and the data in the public data storage area is read; and the corresponding neural network model is called based on the base address to compute on the data, and the calculation result is obtained and output.
  • The trained neural networks are preloaded into memory and the base address of each network is obtained; the multiple neural networks corresponding to those base addresses are then called in turn according to the task type, with intermediate results placed in one public data storage area. That is, the multiple neural network calculations are performed on the same computing device, which can reduce the cost of additional neural network computing devices and improve the utilization of hardware resources.
  • Fig. 1 shows a flowchart of an embodiment of a neural network scheduling method according to the present application;
  • Fig. 2 shows a flowchart of an embodiment of step 103 in Fig. 1;
  • Fig. 3 shows a flowchart of an embodiment of step 1032 in Fig. 2;
  • Fig. 4 is a schematic structural diagram of a neural network scheduling device provided by an embodiment of the present application;
  • Fig. 5 is a schematic structural diagram of an embodiment of the calculation module 203 in Fig. 4;
  • Fig. 6 is a schematic structural diagram of an embodiment of a computer device provided according to an embodiment of the present application.
  • Fig. 1 shows a flowchart of an embodiment of the neural network scheduling method according to the present application.
  • the neural network scheduling method includes:
  • Step 101: Load at least one pre-trained neural network model into a model storage area in a memory, and obtain a base address of the neural network model.
  • the memory also includes a public data storage area.
  • the above-mentioned neural network models include the neural networks involved in different task types, such as the feature detection networks (CNN, etc.) and recognition networks used in face recognition tasks, and the recurrent neural network (RNN) and long short-term memory network (LSTM) used in speech recognition tasks.
  • storage space of the corresponding size is first requested in memory for these neural networks; the network structures and parameters of the models are stored in the requested space, and the base address (i.e., the starting address) of each model is returned, through which the corresponding network can be found as needed.
  • further, a public data storage area can be requested for the aforementioned neural networks to store the initial input data, intermediate calculation results, and the like, which can speed up neural network calculation and save computing resources; a minimal sketch of this step follows below.
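  • As an illustration of step 101, here is a minimal Python sketch. It is not part of the patent; names such as `model_storage`, `base_address`, and `PUBLIC_DATA` are hypothetical, and `id(net)` merely stands in for a real memory address:

```python
# Hypothetical sketch of step 101: load pre-trained models into a model
# storage area and record, per model, a handle standing in for its base
# address; one public data storage area is shared by all networks.

model_storage = {}   # model storage area: name -> loaded network
base_address = {}    # name -> handle standing in for a memory base address

def load_models(pretrained: dict) -> None:
    """pretrained maps a model name to its network (structure + parameters)."""
    for name, net in pretrained.items():
        model_storage[name] = net
        base_address[name] = id(net)  # stand-in for a real starting address

# Public data storage area: holds the initial input data and each
# intermediate result so that the next network can consume it.
PUBLIC_DATA = {"input": None, "intermediate": None}
```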
  • Step 102: Obtain the base address of the corresponding neural network model according to the task type, and read the data in the public data storage area.
  • the task types include the aforementioned face recognition and speech recognition, as well as application scenarios where neural networks are used in tasks such as text recognition, object segmentation, and automatic driving.
  • the types and numbers of the neural networks used in the various application scenarios are not the same; therefore, the corresponding neural networks must be selected and combined according to the task type to complete the corresponding task and realize its function.
  • specifically, the base address in memory of each neural network required by the task is obtained, the neural network stored at that base address is loaded into the processor, and the data in the public data storage area is then read and fed into the neural network loaded in the processor for execution.
  • the task may require multiple neural networks, and these can be dynamically switched through the base address of each neural network described above, so that the networks execute in a certain calling order.
  • Step 103: Call the corresponding neural network model based on the base address to compute on the data, and obtain and output the calculation result.
  • Through the above step 103, the at least one neural network required by the task can be obtained according to the above base addresses, and the obtained neural networks are loaded in turn into the same processor to compute on the data read from the public data storage area.
  • That is, each neural network is called in turn to calculate on the data, and the intermediate calculation results are saved to the public data storage area for use by the next neural network; during calculation the networks can be switched according to the above base addresses, and the public data storage area can be reused cyclically until the last neural network finishes and the final result is output, which improves the utilization of hardware computing resources (see the sketch after this paragraph).
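  • As a sketch of how steps 102 and 103 might fit together (hypothetical, reusing the names from the sketch above; the patent does not prescribe this structure), a task type maps to an ordered list of networks, which are called in turn on the same device, with intermediate results written back to the public data storage area:

```python
# Hypothetical sketch of steps 102-103: look up the networks a task needs,
# switch between them via their base addresses, and reuse one shared area.

TASK_PLANS = {
    "face_recognition": ["face_detector", "face_recognizer"],
    "speech_recognition": ["rnn_acoustic", "lstm_language"],
}

def run_task(task_type: str, input_data):
    PUBLIC_DATA["input"] = input_data
    data = PUBLIC_DATA["input"]
    for name in TASK_PLANS[task_type]:
        addr = base_address[name]           # step 102: obtain the base address
        net = model_storage[name]           # step 103: the network found "at" addr
        data = net(data)                    # compute on the shared data
        PUBLIC_DATA["intermediate"] = data  # saved for the next network to use
    return data                             # final result is output
```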
  • This embodiment provides a neural network scheduling method that includes: loading at least one pre-trained neural network model into a model storage area in a memory and obtaining the base address of the neural network model, the memory also including a public data storage area; obtaining the base address of the corresponding neural network model according to the task type and reading the data in the public data storage area; and calling the corresponding neural network model based on the base address to compute on the data, obtaining and outputting the calculation result.
  • The trained neural networks are preloaded into memory and the base address of each is obtained; the multiple networks corresponding to those base addresses are called in turn according to the task type, with intermediate results placed in one public data storage area. That is, the multiple neural network calculations are performed on the same computing device, which reduces the cost of additional neural network computing devices and improves the utilization of hardware resources.
  • model storage area is used to store the network structure and parameters of the neural network model.
  • the above-mentioned neural network model is a pre-trained neural network; that is, its network structure is optimal and its parameters are those that minimize the network error.
  • the network structure of a neural network takes the layer as its unit of computation, including but not limited to convolutional layers, pooling layers, ReLU (activation function) layers, fully connected layers, etc.
  • in addition to receiving the data stream output by the previous layer, each layer in the network structure also has a large number of parameters, including but not limited to weights and biases.
  • the base address is the initial storage address of a neural network model in the memory.
  • a section of memory space is requested from the operating system to store the above neural network models.
  • the memory space can be contiguous and hold multiple neural networks, or it can be non-contiguous with each section of memory storing only one neural network; the base address of each neural network, i.e., its starting address in memory, can then be obtained from the system, the corresponding network can be found through this address, and neural networks can thus be loaded and switched.
  • step 103 specifically includes the following steps:
  • Step 1031: Preprocess the data;
  • Step 1032: Input the preprocessed data into the called neural network for calculation.
  • the preprocessing of the data includes:
  • Data cleaning: can be used to remove noise from the data and correct inconsistencies.
  • Data integration: merges data from multiple data sources into one consistent data store, such as a data warehouse.
  • Data reduction: can reduce the scale of the data through, for example, aggregation, deletion of redundant features, or clustering.
  • Data transformation: includes normalization, regularization, etc.; for example, it can be used to compress data into a smaller interval, such as 0.0 to 1.0.
  • Through the above preprocessing methods, the data can be processed into the format required for neural network calculation and then fed into the called neural network for the corresponding calculation, which improves the efficiency of neural network calculation; a small example follows below.
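  • A small runnable example of the cleaning and transformation steps (illustrative only; the patent does not prescribe a formula, and min-max normalization is just one common way to compress data into the interval 0.0 to 1.0):

```python
def preprocess(samples):
    """Toy preprocessing: drop missing values (data cleaning), then min-max
    normalize the rest into the interval [0.0, 1.0] (data transformation)."""
    cleaned = [x for x in samples if x is not None]   # data cleaning
    lo, hi = min(cleaned), max(cleaned)
    if hi == lo:                                      # guard against a constant column
        return [0.0 for _ in cleaned]
    return [(x - lo) / (hi - lo) for x in cleaned]    # data transformation

print(preprocess([4.0, None, 8.0, 6.0]))  # -> [0.0, 1.0, 0.5]
```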
  • step 1032 specifically includes the following steps:
  • Step 10321: Configure corresponding hardware resources according to the network structure of the neural network model;
  • Step 10322: Calculate the preprocessed data based on the hardware resources.
  • different neural network models can be loaded according to different application scenarios and different task types.
  • for speech recognition scenarios, pre-trained neural network models for speech processing can be loaded, such as RNN and LSTM; for object detection scenarios, pre-trained neural network models for image processing can be loaded, such as fast-rcnn (comprising multiple specific sub-networks).
  • corresponding hardware resources are then configured according to the loaded model, i.e., computing units, storage units, pipeline acceleration units, and other hardware resources are allocated according to the model's network structure and parameters, and the corresponding operations, such as convolution operations and pooling operations, are finally performed on the preprocessed data based on the configured resources (see the sketch below).
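  • A schematic sketch of steps 10321 and 10322 (hypothetical: real resource allocation is device-specific, and the unit names below are invented for illustration); it derives a resource plan from the network structure and then runs the preprocessed data through it:

```python
# Hypothetical sketch of steps 10321-10322: map each layer of the network
# structure to a kind of hardware unit, then compute layer by layer.

def configure_hardware(layers):
    """layers: list of dicts like {"type": "conv", "op": some_callable}."""
    plan = []
    for layer in layers:
        if layer["type"] == "conv":
            plan.append(("compute_unit", layer))    # e.g. a multiply-accumulate array
        elif layer["type"] == "pool":
            plan.append(("pipeline_unit", layer))   # streamed reduction
        else:
            plan.append(("storage_unit", layer))    # parameters kept resident
    return plan

def compute(plan, data):
    for unit, layer in plan:
        data = layer["op"](data)  # the operation runs on its assigned unit
    return data
```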
  • the training of the above-mentioned pre-trained neural network model includes constructing a neural network, selecting a training data set, performing neural network training, and verifying the neural network.
  • FIG. 4 is a schematic structural diagram of a neural network scheduling device provided by an embodiment of the present application. As shown in FIG. 4, the device 200 includes:
  • the loading module 201 is configured to load at least one pre-trained neural network model into a model storage area in a memory, and obtain a base address of the neural network model, the memory also includes a public data storage area;
  • the obtaining module 202 is configured to obtain the corresponding base address of the neural network model according to the task type, and read the data in the public data storage area;
  • the calculation module 203 is configured to call the corresponding neural network model based on the base address to calculate the data, obtain and output the calculation result.
  • the foregoing calculation module 203 includes:
  • the preprocessing submodule 2031 is used to preprocess the data
  • the calculation sub-module 2032 is used to input the preprocessed data into the called neural network for calculation.
  • an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the computer program, the steps in the neural network scheduling method provided in the embodiments of the present application are implemented.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps in the neural network scheduling method provided by the embodiments of the present application are implemented. That is, in the specific embodiments of the present invention, when the computer program on the computer-readable storage medium is executed by the processor, the steps of the neural network scheduling method described above are implemented, which can reduce additional equipment costs and improve the utilization of hardware resources.
  • the computer program on the computer-readable storage medium includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • an embodiment of the present application also provides a basic structural block diagram of the above computer device, as shown in FIG. 6.
  • the computer device 3 includes a memory 31, a processor 33, and a network interface 33 that are communicatively connected to each other via a system bus. It should be pointed out that the figure only shows the computer device 3 with components 31-33, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions.
  • Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • the memory 31 includes at least one type of readable storage medium, the readable storage medium including flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, etc.
  • the memory 31 may be an internal storage unit of the computer device 3, such as a hard disk or a memory of the computer device 3.
  • the memory 31 may also be an external storage device of the computer device 3, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the computer device 3.
  • the memory 31 may also include both the internal storage unit of the computer device 3 and its external storage device.
  • the memory 31 is generally used to store an operating system and various application software installed in the computer device 3, such as the program code of the neural network scheduling method described above.
  • the memory 31 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 33 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 33 is generally used to control the overall operation of the computer device 3.
  • the processor 33 is configured to run the program code or process data stored in the memory 31, for example, run the program code of the neural network scheduling method.
  • the network interface 33 may include a wireless network interface or a wired network interface.
  • the network interface 33 is usually used to establish a communication connection between the computer device 3 and other electronic devices, and then transmit data and the like.
  • This application also provides another implementation, namely a computer-readable storage medium that stores a neural network scheduling method program, where the neural network scheduling method program can be executed by at least one processor, so that the at least one processor executes the steps of the program of the neural network scheduling method described above and realizes the corresponding functions.
  • the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to enable a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A neural network scheduling method and apparatus, a computer device, and a storage medium. The method comprises: loading at least one pre-trained neural network model into a model storage area in a memory and obtaining a base address of the neural network model, the memory further comprising a public data storage area (101); obtaining, according to a task type, the base address of the corresponding neural network model, and reading the data in the public data storage area (102); and invoking, based on the base address, the corresponding neural network model to compute on the data, and obtaining and outputting the calculation result (103). The method preloads neural networks into memory and obtains the corresponding base addresses, invokes the multiple neural networks corresponding to those base addresses to compute on the data, and places intermediate results in a public data storage area, which can reduce the cost of additional neural network computing devices and improve the utilization of hardware resources.

Description

Neural network scheduling method and apparatus, computer device, and readable storage medium
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a neural network scheduling method and apparatus, a computer device, and a readable storage medium.
Background
In certain specific application scenarios of artificial intelligence (autonomous driving, face recognition, etc.), multiple neural network models need to be run to obtain the desired result. For example, in a face recognition scenario, one neural network model must first be called to detect whether an image contains a human face; if it does, another neural network model is then dispatched to recognize the face in that image, finally obtaining the desired result. However, the current solution in the prior art is to use multiple hardware devices, each running a different neural network model, which adds extra equipment cost and reduces the utilization of hardware resources.
Summary
The purpose of the embodiments of this application is to propose a neural network scheduling method and apparatus, a computer device, and a readable storage medium, so as to reduce additional equipment cost and improve the utilization of hardware resources.
To solve the above technical problem, an embodiment of this application provides a neural network scheduling method that adopts the following technical solution:
The neural network scheduling method includes:
loading at least one pre-trained neural network model into a model storage area in a memory, and obtaining a base address of the neural network model, the memory further including a public data storage area;
obtaining, according to a task type, the base address of the corresponding neural network model, and reading the data in the public data storage area; and
invoking, based on the base address, the corresponding neural network model to compute on the data, and obtaining and outputting the calculation result.
Further, the model storage area is used to store the network structure and parameters of the neural network model.
Further, the base address is the starting storage address of a neural network model in the memory.
Further, the step of invoking, based on the base address, the corresponding neural network model to compute on the data specifically includes:
preprocessing the data; and
inputting the preprocessed data into the invoked neural network for calculation.
Further, the step of inputting the preprocessed data into the invoked neural network for calculation specifically includes:
configuring corresponding hardware resources according to the network structure of the neural network model; and
calculating the preprocessed data based on the hardware resources.
Further, the training of the pre-trained neural network model includes constructing a neural network, selecting a training data set and performing neural network training, and verifying the neural network.
To solve the above technical problem, an embodiment of this application further provides a neural network scheduling apparatus that adopts the following technical solution:
The neural network scheduling apparatus includes:
a loading module, configured to load at least one pre-trained neural network model into a model storage area in a memory and obtain a base address of the neural network model, the memory further including a public data storage area;
an obtaining module, configured to obtain, according to a task type, the base address of the corresponding neural network model and read the data in the public data storage area; and
a calculation module, configured to invoke, based on the base address, the corresponding neural network model to compute on the data, and obtain and output the calculation result.
Further, the calculation module includes:
a preprocessing submodule, configured to preprocess the data; and
a calculation submodule, configured to input the preprocessed data into the invoked neural network for calculation.
To solve the above technical problem, an embodiment of this application further provides a computer device that adopts the following technical solution:
The computer device includes a memory and a processor, a computer program is stored in the memory, and when the processor executes the computer program, the steps of any one of the neural network scheduling methods proposed in the embodiments of this application are implemented.
To solve the above technical problem, an embodiment of this application further provides a computer-readable storage medium that adopts the following technical solution:
A computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of any one of the neural network scheduling methods proposed in the embodiments of this application are implemented.
Compared with the prior art, the embodiments of this application mainly have the following beneficial effects: at least one pre-trained neural network model is loaded into a model storage area in a memory and the base address of the neural network model is obtained, the memory further including a public data storage area; the base address of the corresponding neural network model is obtained according to the task type, and the data in the public data storage area is read; and based on the base address, the corresponding neural network model is invoked to compute on the data, and the calculation result is obtained and output. The trained neural networks are preloaded into memory and the base address of each network is obtained; then, according to the task type, the multiple neural networks corresponding to those base addresses are invoked in turn to compute on the data, and the intermediate results are placed in one public data storage area. That is, the multiple neural network calculations are performed on the same computing device, which can reduce the cost of additional neural network computing devices and improve the utilization of hardware resources.
Brief Description of the Drawings
To explain the solutions in this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows a flowchart of an embodiment of a neural network scheduling method according to this application;
Fig. 2 shows a flowchart of an embodiment of step 103 in Fig. 1;
Fig. 3 shows a flowchart of an embodiment of step 1032 in Fig. 2;
Fig. 4 is a schematic structural diagram of a neural network scheduling apparatus provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of an embodiment of the calculation module 203 in Fig. 4;
Fig. 6 is a schematic structural diagram of an embodiment of a computer device provided according to an embodiment of this application.
Detailed Description
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used in the specification are only for describing specific embodiments and are not intended to limit this application. The terms "including" and "having", and any variations thereof, in the specification, claims, and drawing descriptions of this application are intended to cover a non-exclusive inclusion. The terms "first", "second", and the like in the specification, claims, or drawings of this application are used to distinguish different objects, not to describe a particular order.
Reference herein to an "embodiment" means that a specific feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
To enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings.
In a first aspect, as shown in Fig. 1, Fig. 1 shows a flowchart of an embodiment of a neural network scheduling method according to this application. The neural network scheduling method includes:
Step 101: Load at least one pre-trained neural network model into a model storage area in a memory, and obtain a base address of the neural network model, the memory further including a public data storage area.
In this embodiment, the above neural network models include the neural networks involved in different task types, such as the feature detection networks (CNN, etc.) and recognition networks used in face recognition tasks, and the recurrent neural network (RNN) and long short-term memory network (LSTM) used in speech recognition tasks. First, storage space of a corresponding size is requested in memory for these neural networks; then the network structures and parameters of the neural network models are stored in the requested storage space, and the base address (i.e., the starting address) of each neural network model is returned, through which the corresponding neural network can be found as needed. Further, a public data storage area can also be requested for the above neural networks to store the initial input data, intermediate calculation results, and the like, which can speed up neural network calculation and save computing resources.
Step 102: Obtain, according to the task type, the base address of the corresponding neural network model, and read the data in the public data storage area.
In this embodiment, task types include the aforementioned face recognition and speech recognition, and may also be application scenarios in which neural networks are used for tasks such as text recognition, object segmentation, and autonomous driving. The types and numbers of the neural networks used in the various application scenarios are not the same; therefore, the corresponding neural networks must be selected and combined according to the task type to complete the corresponding task and realize its function. Specifically, the base address in memory of each neural network required by the task is obtained, the neural network stored at that base address is loaded into the processor, and the data in the public data storage area is then read and fed into the neural network loaded in the processor for execution. The task may require multiple neural networks, and these can be dynamically switched by means of their respective base addresses so that the networks execute in a certain calling order.
Step 103: Invoke, based on the base address, the corresponding neural network model to compute on the data, and obtain and output the calculation result.
In this embodiment, through the above step 103, the at least one neural network required by the task can be obtained according to the above base addresses, and the obtained neural networks are then loaded in turn into the same processor to compute on the data read from the public data storage area. That is, each neural network is invoked in turn to compute on the data, and the intermediate calculation results are saved to the public data storage area for use by the next neural network. In other words, during the calculation the neural networks can be switched according to the above base addresses, and the public data storage area can be reused cyclically until the last neural network finishes its calculation and the final result is output, which can improve the utilization of hardware computing resources.
In this embodiment of the present invention, a neural network scheduling method is provided, including: loading at least one pre-trained neural network model into a model storage area in a memory, and obtaining a base address of the neural network model, the memory further including a public data storage area; obtaining, according to the task type, the base address of the corresponding neural network model, and reading the data in the public data storage area; and invoking, based on the base address, the corresponding neural network model to compute on the data, and obtaining and outputting the calculation result. The trained neural networks are preloaded into memory and the base address of each network is obtained; then, according to the task type, the multiple neural networks corresponding to those base addresses are invoked in turn to compute on the data, and the intermediate results are placed in one public data storage area. That is, the multiple neural network calculations are performed on the same computing device, which can reduce the cost of additional neural network computing devices and improve the utilization of hardware resources.
Further, the model storage area is used to store the network structure and parameters of the neural network model.
In this embodiment, the above neural network model is a pre-trained neural network; that is, its network structure is optimal and its parameters are those that minimize the network error. The network structure of a neural network takes the layer as its unit of computation, including but not limited to convolutional layers, pooling layers, ReLU (activation function) layers, fully connected layers, and the like. In addition to receiving the data stream output by the previous layer, each layer in the network structure also has a large number of parameters, including but not limited to weights, biases, and so on.
Further, the base address is the starting storage address of a neural network model in the memory.
In this embodiment, a section of memory space is requested from the operating system to store the above neural network models. This memory space may be contiguous and hold multiple neural networks, or it may be non-contiguous with each section of memory storing only one neural network. The base address of each neural network, i.e., the starting address of that neural network in memory, can then be obtained from the system; the corresponding network can be found through this address, and neural networks can thus be loaded and switched, as illustrated in the sketch below.
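To make the base address concrete, the following is an illustrative sketch only (the patent does not specify this layout): one contiguous arena holds several serialized models, and each model's base address is simply its starting offset in that arena.

```python
# Illustrative only: a contiguous memory space holding several models,
# where a model's "base address" is its starting offset in the arena.

arena = bytearray(1 << 20)   # memory space requested from the system
next_free = 0
base_of = {}                 # model name -> (base address, size)

def store_model(name: str, blob: bytes) -> int:
    """Copy a serialized model into the arena and return its base address."""
    global next_free
    base = next_free
    arena[base:base + len(blob)] = blob
    base_of[name] = (base, len(blob))
    next_free += len(blob)
    return base

def fetch_model(name: str) -> bytes:
    """Switching networks amounts to reading from a different base address."""
    base, size = base_of[name]
    return bytes(arena[base:base + size])
```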
Further, as shown in Fig. 2, the above step 103 specifically includes the following steps:
Step 1031: Preprocess the data;
Step 1032: Input the preprocessed data into the invoked neural network for calculation.
The preprocessing of the data includes:
Data cleaning: can be used to remove noise from the data and correct inconsistencies.
Data integration: merges data from multiple data sources into one consistent data store, such as a data warehouse.
Data reduction: can reduce the scale of the data through, for example, aggregation, deletion of redundant features, or clustering.
Data transformation: includes normalization, regularization, and the like; for example, it can be used to compress data into a smaller interval, such as 0.0 to 1.0.
Through the above data preprocessing methods, the data can be processed into the format required for neural network calculation and then fed into the invoked neural network for the corresponding calculation, which can improve the efficiency of neural network calculation.
Further, as shown in Fig. 3, the above step 1032 specifically includes the following steps:
Step 10321: Configure corresponding hardware resources according to the network structure of the neural network model;
Step 10322: Calculate the preprocessed data based on the hardware resources.
In this embodiment, different neural network models can be loaded according to different application scenarios and different task types. For example, for a speech recognition application scenario, pre-trained neural network models for speech processing, such as RNN and LSTM, can be loaded; for an object detection scenario, pre-trained neural network models for image processing, such as fast-rcnn (comprising multiple specific sub-networks), can be loaded. Corresponding hardware resources are then configured according to the loaded neural network model; that is, hardware resources such as computing units, storage units, and pipeline acceleration units are allocated according to the network structure and parameters of the above neural network model. Finally, based on the configured hardware resources, the corresponding operations, such as convolution operations and pooling operations, are performed on the preprocessed data.
Further, the training of the above pre-trained neural network model includes constructing a neural network, selecting a training data set and performing neural network training, and verifying the neural network.
Specifically, different neural networks are constructed according to the task type or application scenario, including the structural division, the number of layers, and the connection pattern of the network; a corresponding data set is then selected to train the constructed neural network, for which publicly available annotated data sets can be used, such as the MNIST data set for image recognition or the VoxCeleb data set for voice recognition; finally, the trained neural network is cross-validated on a validation data set, yielding the above pre-trained neural network model. An illustrative sketch of this pipeline follows.
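As a hedged illustration of that training pipeline (the one-weight model below is only a stand-in for a neural network, and none of these function names come from the patent): construct a model, train it on a chosen data set, then verify it on held-out data.

```python
import random

def construct():                 # construct the "network" (here: y = w * x)
    return {"w": random.random()}

def train(net, data, lr=0.01, epochs=200):
    """Plain gradient descent on squared error: the training step."""
    for _ in range(epochs):
        for x, y in data:
            err = net["w"] * x - y
            net["w"] -= lr * err * x
    return net

def verify(net, data):           # verify on a held-out validation set
    return sum((net["w"] * x - y) ** 2 for x, y in data) / len(data)

train_set = [(x, 2.0 * x) for x in range(1, 9)]   # toy data: y = 2x
valid_set = [(9.0, 18.0), (10.0, 20.0)]
net = train(construct(), train_set)
print("validation MSE:", verify(net, valid_set))  # should be near 0
```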
In a second aspect, referring to Fig. 4, Fig. 4 is a schematic structural diagram of a neural network scheduling apparatus provided by an embodiment of this application. As shown in Fig. 4, the apparatus 200 includes:
a loading module 201, configured to load at least one pre-trained neural network model into a model storage area in a memory and obtain a base address of the neural network model, the memory further including a public data storage area;
an obtaining module 202, configured to obtain, according to a task type, the base address of the corresponding neural network model and read the data in the public data storage area; and
a calculation module 203, configured to invoke, based on the base address, the corresponding neural network model to compute on the data, and obtain and output the calculation result.
Further, as shown in Fig. 5, the above calculation module 203 includes:
a preprocessing submodule 2031, configured to preprocess the data; and
a calculation submodule 2032, configured to input the preprocessed data into the invoked neural network for calculation.
In a third aspect, an embodiment of this application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the computer program, the steps in the neural network scheduling method provided by the embodiments of this application are implemented.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps in the neural network scheduling method provided by the embodiments of this application are implemented. That is, in the specific embodiments of the present invention, when the computer program on the computer-readable storage medium is executed by the processor, the steps of the neural network scheduling method described above are implemented, which can reduce additional equipment cost and improve the utilization of hardware resources.
Illustratively, the computer program on the computer-readable storage medium includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.
It should be noted that, since the computer program on the computer-readable storage medium implements the steps of the above neural network scheduling method when executed by a processor, all embodiments of the above neural network scheduling method are applicable to this computer-readable storage medium, and all can achieve the same or similar beneficial effects.
Those of ordinary skill in the art can understand that all or part of the processes of the methods, and all or part of the subsystems of the systems, in the above embodiments can be implemented by instructing the relevant hardware through a computer program; the computer program can be stored in a computer-readable storage medium, and when executed, it can realize the functions of the embodiments of the above subsystems. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
It should be understood that although the subsystems in the structural diagrams of the drawings are displayed in sequence as indicated by the arrows, these subsystems are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these subsystems, and they can be executed in other orders. Moreover, at least some of the subsystems in the structural diagrams may include multiple sub-steps or stages during execution; these sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential; they may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
Continuing to refer to Fig. 6, to solve the above technical problem, an embodiment of this application further provides a basic structural block diagram of the above computer device, as shown in Fig. 6.
The computer device 3 includes a memory 31, a processor 33, and a network interface 33 that are communicatively connected to each other via a system bus. It should be pointed out that the figure only shows the computer device 3 with components 31-33, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
The memory 31 includes at least one type of readable storage medium, the readable storage medium including flash memory, hard disks, multimedia cards, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, etc. In some embodiments, the memory 31 may be an internal storage unit of the computer device 3, such as a hard disk or memory of the computer device 3. In other embodiments, the memory 31 may also be an external storage device of the computer device 3, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the computer device 3. Of course, the memory 31 may also include both the internal storage unit of the computer device 3 and its external storage device. In this embodiment, the memory 31 is generally used to store the operating system and various application software installed on the computer device 3, such as the program code of the above neural network scheduling method. In addition, the memory 31 can also be used to temporarily store various data that have been output or are to be output.
In some embodiments, the processor 33 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 33 is generally used to control the overall operation of the computer device 3. In this embodiment, the processor 33 is configured to run the program code stored in the memory 31 or to process data, for example, to run the program code of the neural network scheduling method.
The network interface 33 may include a wireless network interface or a wired network interface. The network interface 33 is usually used to establish a communication connection between the computer device 3 and other electronic devices, and then to transmit data and the like.
This application also provides another implementation, namely a computer-readable storage medium that stores a program of the neural network scheduling method, where the program of the neural network scheduling method can be executed by at least one processor, so that the at least one processor executes the steps of the program of the neural network scheduling method described above and realizes the corresponding functions.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to enable a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in the embodiments of this application.
Obviously, the embodiments described above are only some of the embodiments of this application, not all of them. The drawings show preferred embodiments of this application, but they do not limit the patent scope of this application. This application can be implemented in many different forms; rather, these embodiments are provided so that the understanding of the disclosure of this application will be more thorough and comprehensive. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions recorded in the foregoing specific embodiments or make equivalent replacements for some of the technical features therein. Any equivalent structure made using the contents of the specification and drawings of this application, and applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims (10)

  1. A neural network scheduling method, characterized by comprising:
    loading at least one pre-trained neural network model into a model storage area in a memory, and obtaining a base address of the neural network model, the memory further comprising a public data storage area;
    obtaining, according to a task type, the base address of the corresponding neural network model, and reading the data in the public data storage area; and
    invoking, based on the base address, the corresponding neural network model to compute on the data, and obtaining and outputting a calculation result.
  2. The method according to claim 1, characterized in that the model storage area is used to store the network structure and parameters of the neural network model.
  3. The method according to claim 1, characterized in that the base address is the starting storage address of a neural network model in the memory.
  4. The method according to claim 3, characterized in that the step of invoking, based on the base address, the corresponding neural network model to compute on the data specifically comprises:
    preprocessing the data; and
    inputting the preprocessed data into the invoked neural network for calculation.
  5. The method according to claim 4, characterized in that the step of inputting the preprocessed data into the invoked neural network for calculation specifically comprises:
    configuring corresponding hardware resources according to the network structure of the neural network model; and
    calculating the preprocessed data based on the hardware resources.
  6. The method according to claim 1, characterized in that the training of the pre-trained neural network model comprises constructing a neural network, selecting a training data set and performing neural network training, and verifying the neural network.
  7. A neural network scheduling apparatus, characterized by comprising:
    a loading module, configured to load at least one pre-trained neural network model into a model storage area in a memory and obtain a base address of the neural network model, the memory further comprising a public data storage area;
    an obtaining module, configured to obtain, according to a task type, the base address of the corresponding neural network model and read the data in the public data storage area; and
    a calculation module, configured to invoke, based on the base address, the corresponding neural network model to compute on the data, and obtain and output a calculation result.
  8. The apparatus according to claim 7, characterized in that the calculation module comprises:
    a preprocessing submodule, configured to preprocess the data; and
    a calculation submodule, configured to input the preprocessed data into the invoked neural network for calculation.
  9. A computer device, characterized by comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, the steps of the neural network scheduling method according to any one of claims 1 to 6 are implemented.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the neural network scheduling method according to any one of claims 1 to 6 are implemented.
PCT/CN2019/110823 2019-10-12 2019-10-12 Neural network scheduling method and apparatus, computer device, and readable storage medium WO2021068247A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201980066984.4A 2019-10-12 2019-10-12 Neural network scheduling method and apparatus, computer device, and readable storage medium
PCT/CN2019/110823 2019-10-12 2019-10-12 Neural network scheduling method and apparatus, computer device, and readable storage medium
US17/768,241 2019-10-12 2019-10-12 Neural network scheduling method and apparatus, computer device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/110823 WO2021068247A1 (zh) 2019-10-12 2019-10-12 神经网络调度方法、装置、计算机设备及可读存储介质

Publications (2)

Publication Number Publication Date
WO2021068247A1 true WO2021068247A1 (zh) 2021-04-15
WO2021068247A8 WO2021068247A8 (zh) 2021-05-14

Family

ID=75437683

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/110823 WO2021068247A1 (zh) Neural network scheduling method and apparatus, computer device, and readable storage medium

Country Status (3)

Country Link
US (1) US20230273826A1 (zh)
CN (1) CN113196232A (zh)
WO (1) WO2021068247A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115860049B (zh) * 2023-03-02 2023-05-05 瀚博半导体(上海)有限公司 A data scheduling method and device
CN115982110B (zh) * 2023-03-21 2023-08-29 北京探境科技有限公司 File running method and apparatus, computer device, and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355244A (zh) * 2016-08-30 2017-01-25 深圳市诺比邻科技有限公司 Method and system for constructing a convolutional neural network
CN106548179A (zh) * 2016-09-29 2017-03-29 北京市商汤科技开发有限公司 Method, apparatus, and electronic device for detecting key points of objects and clothing
CN107885762A (zh) * 2017-09-19 2018-04-06 北京百度网讯科技有限公司 Intelligent big data system, and method and device for providing intelligent big data services
CN110222752A (zh) * 2019-05-28 2019-09-10 北京金山数字娱乐科技有限公司 Image processing method and system, computer device, storage medium, and chip
CN110276320A (zh) * 2019-06-26 2019-09-24 杭州创匠信息科技有限公司 Access control method, apparatus, device, and storage medium based on face recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363279B (zh) * 2018-03-26 2021-09-21 华为技术有限公司 Image processing method and apparatus based on a convolutional neural network model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355244A (zh) * 2016-08-30 2017-01-25 深圳市诺比邻科技有限公司 Method and system for constructing a convolutional neural network
CN106548179A (zh) * 2016-09-29 2017-03-29 北京市商汤科技开发有限公司 Method, apparatus, and electronic device for detecting key points of objects and clothing
CN107885762A (zh) * 2017-09-19 2018-04-06 北京百度网讯科技有限公司 Intelligent big data system, and method and device for providing intelligent big data services
CN110222752A (zh) * 2019-05-28 2019-09-10 北京金山数字娱乐科技有限公司 Image processing method and system, computer device, storage medium, and chip
CN110276320A (zh) * 2019-06-26 2019-09-24 杭州创匠信息科技有限公司 Access control method, apparatus, device, and storage medium based on face recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU, YUEBIN ET AL.: "Assembly Language Programming", 31 July 2000 (2000-07-31), pages 5-9 *

Also Published As

Publication number Publication date
US20230273826A1 (en) 2023-08-31
WO2021068247A8 (zh) 2021-05-14
CN113196232A (zh) 2021-07-30

Similar Documents

Publication Publication Date Title
CN112699991B Method, electronic device, and computer-readable medium for information processing to accelerate neural network training
CN111046150B Human-computer interaction processing system and method, storage medium, and electronic device
CN109542399B Software development method, apparatus, terminal device, and computer-readable storage medium
US11488064B2 Machine learning model for micro-service compliance requirements
US11087763B2 Voice recognition method, apparatus, device and storage medium
CN111523640B Training method and apparatus for neural network models
CN102223363B System and method for generating persistent sessions in a graphical interface for managing communication sessions
US11429434B2 Elastic execution of machine learning workloads using application based profiling
CN110825807B Artificial-intelligence-based data interaction conversion method, apparatus, device, and medium
CN109616097A Voice data processing method, apparatus, device, and storage medium
US20210304010A1 Neural network training under memory restraint
CN112887371B Edge computing method, apparatus, computer device, and storage medium
WO2021068247A1 Neural network scheduling method and apparatus, computer device, and readable storage medium
CN105096235A Graphics processing method and graphics processing apparatus
WO2021000411A1 Neural-network-based document classification method, apparatus, device, and storage medium
CN112287950A Feature extraction module compression method, image processing method, apparatus, and medium
US20220138574A1 Method of training models in ai and electronic device
CN112906554B Model training optimization method and apparatus based on visual images, and related devices
CN110874343B Method for speech processing based on a deep learning chip, and deep learning chip
CN116661936A Page data processing method, apparatus, computer device, and storage medium
CN110309848A Method for real-time fused computation of offline data and streaming data
CN111352360A Robot control method, apparatus, robot, and computer storage medium
US7908143B2 Dialog call-flow optimization
CN110891120B Interface content display method, apparatus, and storage medium
CN110837596B Intelligent recommendation method, apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19948243

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 19/09/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19948243

Country of ref document: EP

Kind code of ref document: A1