WO2021254498A1 - Image prediction method, device and storage medium - Google Patents

Image prediction method, device and storage medium

Info

Publication number
WO2021254498A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
neural network
network model
image
gradient
Prior art date
Application number
PCT/CN2021/100993
Other languages
English (en)
French (fr)
Inventor
栗伟清
韩炳涛
屠要峰
王永成
高洪
Original Assignee
南京中兴软件有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京中兴软件有限责任公司
Publication of WO2021254498A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Definitions

  • This application relates to the field of deep learning technology, and in particular to an image prediction method, device and storage medium.
  • Deep learning models, and neural network models in particular, are now widely used. Training a neural network model is the most time-consuming step in building a practically usable network.
  • To improve training efficiency, artificial intelligence (AI) platforms provide users with multi-graphics-processing-unit (GPU) parallel training; however, during multi-GPU parallel training the batch size on each GPU is increased accordingly in order to improve resource utilization.
  • A relatively large batch size harms model accuracy, and the existing neural network training process usually uses a single global learning rate to determine the weight of each layer, which also harms accuracy. A neural network model obtained with the existing training method therefore significantly reduces prediction accuracy when used for image prediction.
  • An embodiment of the application provides an image prediction method. The method includes: acquiring an image to be tested; and inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested, wherein the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
  • An embodiment of the application also provides an image prediction device. The device includes a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connection and communication between the processor and the memory; when the program is executed by the processor, the aforementioned method is implemented.
  • This application further provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the foregoing method.
  • FIG. 1 is a flowchart of an image prediction method provided in Embodiment 1 of the present application.
  • FIG. 2 is a flowchart of an image prediction method provided in Embodiment 2 of the present application.
  • FIG. 3 is an interaction diagram of the training process provided in Embodiment 2 of the present application.
  • FIG. 4 is a flowchart of step S220 of the image prediction method provided in Embodiment 2 of the present application.
  • FIG. 5 is a schematic diagram of the dynamic change of the weight attenuation parameter provided in the second embodiment of the present application.
  • FIG. 6 is a flowchart of an image prediction method provided in Embodiment 3 of the present application.
  • Fig. 7 is a structural block diagram of an image prediction device provided in the fourth embodiment of the present application.
  • The main purpose of the embodiments of the present application is to propose an image prediction method, device, and storage medium, aiming to achieve accurate image prediction with a preset neural network model obtained through layer-wise adaptive learning rate training.
  • In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the description of the present application and have no special meaning in themselves; therefore, "module", "component", and "unit" may be used interchangeably.
  • As shown in FIG. 1, this embodiment provides an image prediction method, which includes:
  • Step S110 Obtain an image to be tested.
  • The image to be tested may be captured by a camera or selected from a database; this embodiment does not limit how the image to be tested is obtained.
  • The purpose of image prediction may be to determine the category of the image to be tested, for example, whether it shows an animal, a landscape, a building, or a person; these are only examples, and the specific categories are not limited here.
  • After the image to be tested is obtained, it may be preprocessed before being input into the preset model so that the prediction result is more accurate.
  • The preprocessing may include image denoising, image enhancement, or image padding, which removes interfering factors from the image to be tested and makes the prediction result more accurate.
  • Step S120 Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
  • The weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
  • Layer-wise adaptive learning rate training means that, while the neural network is trained on samples, a matching learning rate is set for each layer separately, and the weight of each layer is computed with the learning rate of that layer. Compared with the related-art approach of determining the weights of all layers with a single global learning rate, the weights are determined more accurately, so the resulting preset neural network model is more accurate.
  • Inputting the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested may include: inputting the image to be tested into the preset neural network model to obtain a category probability set, where the category probability set contains the correspondence between each category and its probability value; determining the category corresponding to the largest probability value in the probability set; and taking the category corresponding to the largest probability value as the predicted category of the image to be tested.
  • For example, image 1 to be tested, which contains a cat, is input into the preset neural network model, and the model predicts the category probability set {animal 98%, person 1%, scenery 1%}. From this set, the largest probability value is 98% and the corresponding category is animal, so the predicted category of the image to be tested is determined to be animal.
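  • A minimal sketch of this selection step, assuming the model output has already been converted into a category-to-probability mapping (the category names and probabilities below are illustrative, not from a real model):

```python
# Pick the category with the largest probability from a category probability set.
probs = {"animal": 0.98, "person": 0.01, "scenery": 0.01}   # illustrative values
predicted_category = max(probs, key=probs.get)               # category with the largest probability
print(predicted_category)                                    # -> "animal"
```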
  • Before the image to be tested is input into the preset neural network model, the method may further include: using multiple graphics processing units (GPUs) to train on sample images in parallel to obtain the preset neural network model.
  • Using multiple GPUs to train on sample images in parallel to obtain the preset neural network model may include: each GPU determining the initial gradient of each layer of the initial neural network model from the sample images; obtaining the aggregate gradient and the layer learning rate of each layer of the initial neural network model from the initial gradients of the layers; determining the preset weight of each layer from the current weight of each layer, the layer learning rate, the weight decay parameter, and the global learning rate of the initial neural network model; and obtaining the preset neural network model from the preset weight of each layer.
  • In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately.
  • FIG. 2 is a flowchart of an image prediction method provided in Embodiment 2 of the application. This embodiment is based on the embodiment above.
  • In this embodiment, before the image to be tested is input into the preset neural network model, the method further includes: using multiple GPUs to train on sample images in parallel to obtain the preset neural network model.
  • the method of this embodiment specifically includes the following operations:
  • Step S210 Obtain an image to be tested.
  • Step S220 Use multiple GPUs to perform parallel training on the sample images to obtain a preset neural network model.
  • The application manager runs on a high-performance computing (HPC) cluster; when it receives a model-training job submitted by a user, it sends a resource request to the resource manager.
  • The resource manager allocates the corresponding resources on the HPC cluster, such as GPUs, for the training job submitted by the user and sends an allocation-success instruction to the application manager; on receiving the allocation-success instruction, the application manager sends a start-job instruction to the job scheduler.
  • Each training task corresponds to one job scheduler; after receiving the start-job instruction, the job scheduler sends a start instruction to the executor, and the executor is responsible for executing the training task of the neural network model assigned to each node in the cluster.
  • When training is complete, the executor sends a training-job-completion instruction to the application manager through the job scheduler.
  • When the application manager receives the training-job-completion instruction, it sends a resource-release instruction to the resource manager so that the resource manager reclaims the allocated resources.
  • The executor's data-parallel training process corresponds to step S220 of this embodiment, so the process of step S220 is described in detail below.
  • step S220 specifically includes the following steps:
  • Step S221 Each GPU determines the initial gradient of each layer of the initial neural network model according to the sample image.
  • Before each GPU determines the initial gradient of each layer of the initial neural network model from the sample images, the method may further include: obtaining a model-construction instruction and generating an original neural network model on each GPU according to the model-construction instruction; determining the parameters of the original neural network model contained on a designated GPU and taking the original neural network model contained on the designated GPU as the initial neural network model; and broadcasting the parameters of the original neural network model contained on the designated GPU to the remaining GPUs, so that the remaining GPUs update the parameters of their own original neural network models according to the broadcast parameters to obtain the initial neural network model.
  • For example, the allocated resources are three GPUs, that is, three GPUs are used for parallel training.
  • When the model-construction instruction is received, an original neural network model is generated on each GPU according to the instruction.
  • Because the parameters of the original neural network model are randomly generated on each GPU, the parameters of the generated original neural network models are necessarily different.
  • To keep them consistent, one GPU can be designated: for example, each GPU has a number, the original neural network model contained on GPU number 0 is taken as the initial neural network model, and the parameters of the original neural network model on GPU number 0 are broadcast to the remaining two GPUs.
  • The remaining two GPUs update their own original neural network model parameters according to the broadcast parameters, so that every GPU contains an initial neural network model with the same parameters.
  • Each GPU reads the sample images and runs the initial neural network model to obtain the initial gradient of each layer of the initial neural network model on that GPU.
  • For example, for the first layer of the initial neural network model, the current weight of the layer is w_l, where l is the layer index and here l is 1; the first GPU computes the initial gradient D1(w_1), the second GPU computes D2(w_1), and the third GPU computes D3(w_1).
  • Step S222 Obtain the aggregate gradient and the layer learning rate of each layer of the initial neural network model according to the initial gradient of each layer.
  • Obtaining the aggregate gradient and the layer learning rate of each layer of the initial neural network model from the initial gradients of the layers may include: adding the initial gradients of each layer to obtain the gradient sum of that layer; taking the ratio of each layer's gradient sum to the number of initial gradients of that layer as the layer's aggregate gradient; and obtaining the layer learning rate of each layer from its aggregate gradient.
  • Obtaining the layer learning rate of each layer from its aggregate gradient includes: determining the upper and lower boundary values of the learning rate, and the ratio of each layer's current weight to its aggregate gradient; when the ratio of the current weight to the aggregate gradient lies between the lower boundary value and the upper boundary value, using the ratio as the layer learning rate of the layer; when the ratio is greater than the upper boundary value, using the upper boundary value as the layer learning rate of the layer; and when the ratio is less than the lower boundary value, using the lower boundary value as the layer learning rate of the layer.
  • For example, when the first GPU computes the initial gradient D1(w_1), the second GPU computes D2(w_1), and the third GPU computes D3(w_1), the ratio of the layer's gradient sum to the number of initial gradients is taken as the layer's aggregate gradient, so the aggregate gradient of the first layer is ḡ_1 = ( D1(w_1) + D2(w_1) + D3(w_1) ) / 3.
  • Of course, only the first layer is used here as an example; the aggregate gradient of the other layers is determined in roughly the same way, so it is not repeated in this embodiment.
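  • A concrete single-process sketch of this averaging step, with made-up gradient values standing in for the results computed on the three GPUs:

```python
import numpy as np

# Average the per-GPU initial gradients of one layer into that layer's
# aggregate gradient: gradient sum divided by the number of initial gradients.
d1 = np.array([0.10, -0.20, 0.05])   # D1(w_1) computed on the first GPU (illustrative)
d2 = np.array([0.12, -0.18, 0.07])   # D2(w_1) computed on the second GPU (illustrative)
d3 = np.array([0.08, -0.22, 0.03])   # D3(w_1) computed on the third GPU (illustrative)

aggregate_gradient = (d1 + d2 + d3) / 3
print(aggregate_gradient)
```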
  • After the aggregate gradient of each layer is obtained, the layer learning rate of each layer can be obtained with the following formula (1): λ_l = clip( η · w_l / (ḡ_l + ε), T_n, T_m ), where λ_l is the layer learning rate of layer l of the initial neural network model, l is the layer index, η is the expansion coefficient of the weight-to-gradient ratio, ε is an optional term, w_l is the current weight of layer l, ḡ_l is the aggregate gradient of layer l, T_m is the upper boundary value of the learning rate, and T_n is the lower boundary value of the learning rate. The clip operation keeps the layer learning rate between the lower boundary value T_n and the upper boundary value T_m.
  • η, ε, T_m, and T_n need to be set before training; their specific values can be chosen by the user according to the actual situation and are not limited in this embodiment.
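  • A minimal sketch of formula (1), assuming the per-layer weight-to-gradient ratio is taken over the norms of the weight and aggregate-gradient tensors (an assumption made here for vector-valued layers; the hyperparameter values are illustrative):

```python
import numpy as np

def layer_learning_rate(w_l, g_l, eta=0.001, eps=1e-8, t_n=1e-4, t_m=1.0):
    """Layer learning rate per formula (1): the weight-to-gradient ratio scaled
    by eta and clipped to [t_n, t_m] (lower/upper boundary values)."""
    ratio = eta * np.linalg.norm(w_l) / (np.linalg.norm(g_l) + eps)
    return float(np.clip(ratio, t_n, t_m))

w1 = np.array([0.5, -0.3, 0.8])       # current weight of layer 1 (illustrative)
g1 = np.array([0.10, -0.20, 0.05])    # aggregate gradient of layer 1 (illustrative)
print(layer_learning_rate(w1, g1))
```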
  • Step S223 Determine the preset weight of each layer according to the current weight of each layer, the layer learning rate, the weight decay parameter, and the global learning rate of the initial neural network model.
  • After the layer learning rate of each layer is obtained, the preset weight of each layer can be determined with the following formula (2): w_l^(t+1) = w_l^t − γ · λ_l · ( ḡ_l + β · w_l^t ), where w_l^(t+1) is the preset weight of layer l, w_l^t is the current weight of layer l at iteration t, λ_l is the layer learning rate of layer l, γ is the global learning rate, ḡ_l is the aggregate gradient of layer l, and β is the weight decay parameter. In formula (2) the learning rate and the weight decay parameter can be adjusted independently, which decouples the two.
  • The weight decay parameter β in this embodiment changes dynamically according to the 1Cycle adjustment strategy. FIG. 5 is a schematic diagram of the dynamic change of the weight decay parameter determined by the 1Cycle strategy: β increases linearly from 0.0005 to 0.01 over the first 13 training iterations, then decreases linearly from 0.01 back to 0.0005 over the next 14 iterations, and remains constant at 0.0005 in the last iteration.
  • This is only an example for description; the iteration counts at which the schedule changes and the total number of iterations are not limited and can be set by the user according to the actual situation.
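  • A sketch of this 1Cycle-style schedule, using the iteration counts and endpoint values from FIG. 5 (13 iterations up, 14 down, then constant); these numbers are the example values of this embodiment, not fixed requirements:

```python
def weight_decay_1cycle(t, up_iters=13, down_iters=14, beta_min=0.0005, beta_max=0.01):
    """Weight decay parameter beta at iteration t (0-based): linear ramp from
    beta_min to beta_max, linear decay back down, then constant at beta_min."""
    if t < up_iters:                               # linear increase phase
        return beta_min + (beta_max - beta_min) * t / up_iters
    if t < up_iters + down_iters:                  # linear decrease phase
        return beta_max - (beta_max - beta_min) * (t - up_iters) / down_iters
    return beta_min                                # final constant phase

print([round(weight_decay_1cycle(t), 4) for t in range(0, 30, 5)])
```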
  • Step S224 Obtain a preset neural network model according to the preset weight of each layer.
  • Steps S221 to S223 above are executed in a loop until the set number of iterations is reached.
  • At the final iteration, once the preset weight of every layer has been determined, the parameters of every layer of the preset neural network model are known, so the preset neural network model is obtained from the determined parameters.
  • Step S230 Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
  • In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately. Moreover, during neural network training a learning rate can be determined for each layer and the learning rate is decoupled from the weight decay parameter, so training remains efficient with very large batch sizes; because each GPU processes a larger batch of samples, overall resource utilization is further improved.
  • FIG. 6 is a flowchart of an image prediction method provided in Embodiment 3 of the application. This embodiment is based on the embodiments above.
  • In this embodiment, after the image to be tested is input into the preset neural network model and its predicted category is obtained, the method further includes: detecting the prediction result, and issuing an alarm prompt when the prediction result is determined to be abnormal.
  • Step S310 Obtain an image to be tested.
  • Step S320 Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
  • Step S330 Detect the prediction result.
  • Detecting the prediction result may specifically mean checking whether it contains an obvious error, for example, examining the prediction result to determine whether it is garbled or whether its content is empty.
  • Step S340 When it is determined that the prediction result is abnormal, an alarm is issued.
  • If the prediction result is garbled or its content is empty, the prediction result is determined to be abnormal, and an alarm prompt is issued.
  • The alarm prompt may be a text prompt, a voice prompt, or a light prompt; for example, if the prediction result is determined to be garbled, a voice prompt such as "The prediction result is wrong, please check it" is played.
  • Because the cause of the failure may be an equipment fault, a communication interruption, or a misconfiguration of the neural network model's parameters, issuing an alarm prompt notifies the user to take corresponding measures in time, such as replacing the equipment or, if the equipment is confirmed to be normal, adjusting the parameter configuration and running the prediction again.
  • In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately. By detecting the prediction result and issuing an alarm prompt when it is abnormal, the user is prompted to perform equipment maintenance in time, which further improves the accuracy of the prediction results.
  • As shown in FIG. 7, Embodiment 4 of the present application provides an image prediction device.
  • The device includes a memory 720, a processor 710, a program stored in the memory and runnable on the processor, and a data bus for connection and communication between the processor 710 and the memory 720; when the program is executed by the processor, the image prediction method described above is implemented.
  • The processor 710 and the memory 720 in the terminal may be connected by a bus or in other ways; in FIG. 7, connection by a bus is taken as an example.
  • The memory 720, as a computer-readable storage medium, can be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the image prediction method in the embodiments of the present application.
  • the memory 720 may include a program storage area and a data storage area.
  • the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the device, and the like.
  • the memory 720 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • In some examples, the memory 720 may include memory located remotely relative to the processor 710, and such remote memory may be connected to the device via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • Embodiment 5 of the present application provides a readable storage medium. The readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the image prediction method described above.
  • In hardware implementations, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components.
  • Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An image prediction method, device, and storage medium, belonging to the field of deep learning technology. The method includes: acquiring an image to be tested (S110); and inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested (S120), wherein the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.

Description

Image prediction method, device and storage medium
Cross-Reference
This application is based on and claims priority to Chinese patent application No. 202010568970.7, filed on June 19, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of deep learning technology, and in particular to an image prediction method, device, and storage medium.
Background
Deep learning models, and neural networks in particular, are now widely used, and training a neural network model is the most time-consuming step in building a practically usable network. To improve training efficiency and shorten training time, current artificial intelligence (AI) platforms provide users with multi-graphics-processing-unit (GPU) parallel training. During multi-GPU parallel training, however, the batch size (the number of samples processed per batch) on each GPU is increased accordingly in order to improve resource utilization; a relatively large batch size harms model accuracy, and the existing neural network training process usually uses a single global learning rate to determine the weight of each layer, which also harms accuracy. As a result, a neural network model obtained with the existing training method significantly reduces prediction accuracy when used for image prediction.
Summary
An embodiment of this application provides an image prediction method. The method includes: acquiring an image to be tested; and inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested, wherein the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
An embodiment of this application also provides an image prediction device. The device includes a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connection and communication between the processor and the memory; when the program is executed by the processor, the foregoing method is implemented.
This application further provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the foregoing method.
Brief Description of the Drawings
FIG. 1 is a flowchart of an image prediction method provided in Embodiment 1 of the present application;
FIG. 2 is a flowchart of an image prediction method provided in Embodiment 2 of the present application;
FIG. 3 is an interaction diagram of the training process provided in Embodiment 2 of the present application;
FIG. 4 is a flowchart of step S220 of the image prediction method provided in Embodiment 2 of the present application;
FIG. 5 is a schematic diagram of the dynamic change of the weight decay parameter provided in Embodiment 2 of the present application;
FIG. 6 is a flowchart of an image prediction method provided in Embodiment 3 of the present application;
FIG. 7 is a structural block diagram of an image prediction device provided in Embodiment 4 of the present application.
Detailed Description
The main purpose of the embodiments of this application is to propose an image prediction method, device, and storage medium, aiming to achieve accurate image prediction with a preset neural network model obtained through layer-wise adaptive learning rate training.
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the description of this application and have no special meaning in themselves; therefore, "module", "component", and "unit" may be used interchangeably.
Embodiment 1
As shown in FIG. 1, this embodiment provides an image prediction method, which includes:
Step S110: Acquire an image to be tested.
The image to be tested may be captured by a camera or selected from a database; this embodiment does not limit how the image to be tested is obtained. The purpose of image prediction may be to determine the category of the image to be tested, for example, whether it shows an animal, a landscape, a building, or a person; of course, these are only examples, and the specific categories are not limited here.
It should be noted that, after the image to be tested is acquired, it may be preprocessed before being input into the preset model so that the prediction result is more accurate. The preprocessing may include image denoising, image enhancement, or image padding, which removes interfering factors from the image to be tested and thus makes the prediction result more accurate.
Step S120: Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
The weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training. Layer-wise adaptive learning rate training means that, while the neural network is trained on samples, a matching learning rate is set for each layer separately, and the weight of each layer is computed with the learning rate of that layer. Compared with the related-art approach of determining the weights of all layers with a single global learning rate, the weights are determined more accurately, so the resulting preset neural network model is more accurate.
In one example, inputting the image to be tested into the preset neural network model to obtain its predicted category may include: inputting the image to be tested into the preset neural network model to obtain a category probability set, where the category probability set contains the correspondence between each category and its probability value; determining the category corresponding to the largest probability value in the probability set; and taking the category corresponding to the largest probability value as the predicted category of the image to be tested.
For example, image 1 to be tested, which contains a cat, is input into the preset neural network model, and the model predicts the category probability set {animal 98%, person 1%, scenery 1%}. From this set, the largest probability value is 98% and the corresponding category is animal, so the predicted category of the image to be tested is determined to be animal.
In one example, before the image to be tested is input into the preset neural network model, the method may further include: using multiple graphics processing units (GPUs) to train on sample images in parallel to obtain the preset neural network model.
In one example, using multiple GPUs to train on sample images in parallel to obtain the preset neural network model may include: each GPU determining the initial gradient of each layer of the initial neural network model from the sample images; obtaining the aggregate gradient and the layer learning rate of each layer of the initial neural network model from the initial gradients of the layers; determining the preset weight of each layer from the current weight of each layer, the layer learning rate, the weight decay parameter, and the global learning rate of the initial neural network model; and obtaining the preset neural network model from the preset weight of each layer.
In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately.
Embodiment 2
FIG. 2 is a flowchart of an image prediction method provided in Embodiment 2 of the present application. This embodiment is based on the embodiment above. In this embodiment, before the image to be tested is input into the preset neural network model, the method further includes: using multiple graphics processing units (GPUs) to train on sample images in parallel to obtain the preset neural network model. Accordingly, the method of this embodiment specifically includes the following operations:
Step S210: Acquire an image to be tested.
Step S220: Use multiple graphics processing units (GPUs) to train on sample images in parallel to obtain a preset neural network model.
It should be noted that, in this embodiment, before the image to be tested is input into the preset neural network model, the preset neural network model needs to be obtained by training on sample images. The artificial intelligence (AI) platform involves four applications, namely an application manager, a resource manager, a job scheduler, and an executor, and training of the preset neural network model is achieved through the runtime interaction of these four applications, as shown in the training interaction diagram of FIG. 3:
The application manager runs on a high-performance computing (HPC) cluster; when it receives a model-training job submitted by a user, it sends a resource request to the resource management module. The resource manager allocates the corresponding resources on the HPC cluster, such as GPUs, for the training job submitted by the user according to the resource request, and sends an allocation-success instruction to the application manager. On receiving the allocation-success instruction, the application manager sends a start-job instruction to the job scheduler, where each training task corresponds to one job scheduler. After receiving the start-job instruction, the job scheduler sends a start instruction to the executor, and the executor is responsible for executing the training task of the neural network model assigned to each node in the cluster; when training is complete, it sends a training-job-completion instruction to the application manager through the job scheduler. On receiving the training-job-completion instruction, the application manager sends a resource-release instruction to the resource manager so that the resource manager reclaims the allocated resources. The executor's data-parallel training process corresponds to step S220 of this embodiment, so the process of step S220 is described in detail below.
As shown in FIG. 4, step S220 specifically includes the following steps:
Step S221: Each GPU determines the initial gradient of each layer of the initial neural network model from the sample images.
In one example, before each GPU determines the initial gradient of each layer of the initial neural network model from the sample images, the method may further include: obtaining a model-construction instruction and generating an original neural network model on each GPU according to the model-construction instruction; determining the parameters of the original neural network model contained on a designated GPU, and taking the original neural network model contained on the designated GPU as the initial neural network model; and broadcasting the parameters of the original neural network model contained on the designated GPU to the remaining GPUs, so that the remaining GPUs update the parameters of their own original neural network models according to the broadcast parameters and obtain the initial neural network model.
In one specific implementation, the allocated resources are three GPUs, that is, three GPUs are used for parallel training. When the model-construction instruction is received, an original neural network model is generated on each GPU according to the instruction. Because the parameters of the original neural network model are randomly generated on each GPU, the parameters of the generated original neural network models are necessarily different. To keep them consistent, one GPU can be designated: for example, each GPU has a number, the original neural network model contained on GPU number 0 is taken as the initial neural network model, and the parameters of the original neural network model on GPU number 0 are broadcast to the remaining two GPUs, which update their own original neural network model parameters according to the broadcast parameters, so that every GPU contains an initial neural network model with the same parameters.
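A minimal sketch of this synchronization step, assuming one process per GPU launched with a tool such as torchrun and using PyTorch's torch.distributed package; the model architecture below is a stand-in, not the model described in this application:

```python
import torch
import torch.distributed as dist
import torch.nn as nn

# Synchronize randomly initialized parameters from GPU/rank 0 to the other ranks.
dist.init_process_group(backend="nccl")     # reads rank/world size from the launcher's environment
rank = dist.get_rank()
torch.cuda.set_device(rank)                 # assumes a single node with one process per GPU

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()  # stand-in model

for param in model.parameters():
    dist.broadcast(param.data, src=0)       # every rank now holds GPU 0's parameters
```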
In this embodiment, each GPU reads the sample images and runs the initial neural network model to obtain the initial gradient of each layer of the initial neural network model on that GPU. For example, for the first layer of the initial neural network model, the current weight of the layer is w_l, where l is the layer index and here l is 1; the first GPU computes the initial gradient D1(w_1), the second GPU computes D2(w_1), and the third GPU computes D3(w_1).
Step S222: Obtain the aggregate gradient and the layer learning rate of each layer of the initial neural network model from the initial gradients of the layers.
In one example, obtaining the aggregate gradient and the layer learning rate of each layer of the initial neural network model from the initial gradients of the layers may include: adding the initial gradients of each layer to obtain the gradient sum of that layer; taking the ratio of each layer's gradient sum to the number of initial gradients of that layer as the layer's aggregate gradient; and obtaining the layer learning rate of each layer from its aggregate gradient.
In one example, obtaining the layer learning rate of each layer from its aggregate gradient includes: determining the upper boundary value and the lower boundary value of the learning rate, and the ratio of each layer's current weight to its aggregate gradient; when the ratio of the current weight to the aggregate gradient lies between the lower boundary value and the upper boundary value, using the ratio as the layer learning rate of the layer; when the ratio of the current weight to the aggregate gradient is greater than the upper boundary value, using the upper boundary value as the layer learning rate of the layer; and when the ratio of the current weight to the aggregate gradient is less than the lower boundary value, using the lower boundary value as the layer learning rate of the layer.
Specifically, in this embodiment, for the first layer of the initial neural network model, when the first GPU computes the initial gradient D1(w_1), the second GPU computes D2(w_1), and the third GPU computes D3(w_1), the ratio of the layer's gradient sum to the number of initial gradients can be taken as the layer's aggregate gradient, so the aggregate gradient of the first layer is
ḡ_1 = ( D1(w_1) + D2(w_1) + D3(w_1) ) / 3
Of course, this embodiment only takes the first layer as an example; the aggregate gradient of the other layers is determined in roughly the same way, so it is not repeated here.
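Across real GPUs this per-layer averaging is typically implemented with an all-reduce. A minimal sketch, assuming a torch.distributed process group has already been initialized as in the previous snippet:

```python
import torch
import torch.distributed as dist

def aggregate_gradient(local_grad: torch.Tensor) -> torch.Tensor:
    """Turn one GPU's initial gradient for a layer into the aggregate (averaged)
    gradient shared by all GPUs: sum over ranks, divide by the number of ranks."""
    agg = local_grad.clone()
    dist.all_reduce(agg, op=dist.ReduceOp.SUM)   # sum of D1, D2, D3, ...
    agg /= dist.get_world_size()                 # divide by the number of initial gradients
    return agg
```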
After the aggregate gradient of each layer is obtained, the layer learning rate of each layer can be obtained with the following formula (1):
λ_l = clip( η · w_l / (ḡ_l + ε), T_n, T_m )  (1)
where λ_l is the layer learning rate of layer l of the initial neural network model, l is the layer index, η is the expansion coefficient of the weight-to-gradient ratio, ε is an optional term, w_l is the current weight of layer l, ḡ_l is the aggregate gradient of layer l, T_m is the upper boundary value of the learning rate, and T_n is the lower boundary value of the learning rate.
It should be noted that η, ε, T_m, and T_n need to be set before training; their specific values can be chosen by the user according to the actual situation and are not limited in this embodiment.
The meaning of the clip operation is as follows: with q denoting the weight-to-gradient ratio in formula (1), if T_n < q < T_m, the layer learning rate λ_l = q; if q > T_m, then λ_l = T_m; and if q < T_n, then λ_l = T_n. The clip operation therefore always keeps the layer learning rate between the lower and upper boundary values, so it never becomes too large. Defining a layer learning rate for every layer separately makes the parameter updates of the initial neural network model more efficient and reasonable and accelerates the training process.
Step S223: Determine the preset weight of each layer from the current weight of each layer, the layer learning rate, the weight decay parameter, and the global learning rate of the initial neural network model.
After the layer learning rate of each layer is obtained, the preset weight of each layer can be determined with the following formula (2):
w_l^(t+1) = w_l^t − γ · λ_l · ( ḡ_l + β · w_l^t )  (2)
where w_l^(t+1) is the preset weight of layer l, λ_l is the layer learning rate of layer l of the initial neural network model, t is the current iteration number, w_l^t is the current weight of layer l, γ is the global learning rate, ḡ_l is the aggregate gradient of layer l, and β is the weight decay parameter. Formula (2) also shows that the learning rate and the weight decay parameter can be adjusted independently, which decouples the two.
It should be noted that the weight decay parameter β in this embodiment changes dynamically according to the 1Cycle adjustment strategy. FIG. 5 is a schematic diagram of the dynamic change of the weight decay parameter determined by the 1Cycle strategy: β increases linearly from 0.0005 to 0.01 over the first 13 training iterations, then decreases linearly from 0.01 back to 0.0005 over the next 14 iterations, and remains constant at 0.0005 in the last iteration. Of course, this is only an example; the iteration counts at which the schedule changes and the total number of iterations are not limited and can be set by the user according to the actual situation.
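Putting formula (2) into code for one layer: a sketch in which the update w ← w − γ·λ_l·(ḡ_l + β·w) is a reconstruction consistent with the symbol definitions above (the original formula image is not reproduced in this text), with illustrative values for the inputs:

```python
import numpy as np

def update_layer_weight(w_l, g_l, layer_lr, global_lr=0.1, beta=0.0005):
    """One update of formula (2): aggregate gradient plus decoupled weight decay
    beta*w, scaled by the layer learning rate and the global learning rate."""
    return w_l - global_lr * layer_lr * (g_l + beta * w_l)

w1 = np.array([0.5, -0.3, 0.8])       # current weight of layer 1 at iteration t (illustrative)
g1 = np.array([0.10, -0.20, 0.05])    # aggregate gradient of layer 1 (illustrative)
lam1 = 0.02                           # layer learning rate from formula (1) (illustrative)
print(update_layer_weight(w1, g1, lam1))
```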
Step S224: Obtain the preset neural network model from the preset weight of each layer.
Steps S221 to S223 above are executed in a loop until the set number of iterations is reached. At the final iteration, once the preset weight of every layer has been determined, the parameters of every layer of the preset neural network model are known, so the preset neural network model is obtained from the determined parameters.
Step S230: Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately. Moreover, during neural network training a layer learning rate can be determined for each layer and the learning rate is decoupled from the weight decay parameter, so training remains efficient with very large batch sizes; because each GPU processes a larger batch of samples, overall resource utilization is further improved.
Embodiment 3
FIG. 6 is a flowchart of an image prediction method provided in Embodiment 3 of the present application. This embodiment is based on the embodiments above. In this embodiment, after the image to be tested is input into the preset neural network model and its predicted category is obtained, the method further includes: detecting the prediction result, and issuing an alarm prompt when the prediction result is determined to be abnormal.
Step S310: Acquire an image to be tested.
Step S320: Input the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested.
Step S330: Detect the prediction result.
Specifically, detecting the prediction result may mean checking whether it contains an obvious error, for example, examining the prediction result to determine whether it is garbled or whether its content is empty.
Step S340: Issue an alarm prompt when the prediction result is determined to be abnormal.
If the prediction result is garbled or its content is empty, the prediction result is determined to be abnormal, and an alarm prompt is issued. The alarm prompt may be a text prompt, a voice prompt, or a light prompt; for example, if the prediction result is determined to be garbled, a voice prompt such as "The prediction result is wrong, please check it" is played. Because the cause of the failure may be an equipment fault, a communication interruption, or a misconfiguration of the neural network model's own parameters, issuing an alarm prompt notifies the user to take corresponding measures in time, such as replacing the equipment or, if the equipment is confirmed to be normal, adjusting the parameter configuration and running the prediction again.
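A minimal sketch of such a check, in which the notion of a "garbled" result is reduced to a simple printable-text test and the alarm prompt is a stand-in print statement:

```python
def check_prediction(result: str) -> None:
    """Issue an alarm prompt when the prediction result is empty or garbled."""
    garbled = any(not ch.isprintable() for ch in result)
    if not result or garbled:
        print("Alarm: the prediction result is wrong, please check it.")  # stand-in for a text/voice/light prompt
    else:
        print(f"Predicted category: {result}")

check_prediction("animal")   # normal result
check_prediction("")         # empty result triggers the alarm prompt
```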
In the image prediction method provided by this embodiment of the application, the acquired image to be tested is input into a preset neural network model. Because the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training, the weight of each layer is more accurate and the resulting preset neural network model is more precise; therefore, when the image to be tested is input into the preset neural network, its predicted category can be obtained accurately. By detecting the prediction result and issuing an alarm prompt when it is abnormal, the user is prompted to perform equipment maintenance in time, which further improves the accuracy of the prediction results.
Embodiment 4
As shown in FIG. 7, Embodiment 4 of the present application provides an image prediction device. The device includes a memory 720, a processor 710, a program stored in the memory and runnable on the processor, and a data bus for connection and communication between the processor 710 and the memory 720; when the program is executed by the processor, the image prediction method of the embodiments of this application is implemented:
acquiring an image to be tested; and inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested, wherein the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
The processor 710 and the memory 720 in the terminal may be connected by a bus or in other ways; in FIG. 7, connection by a bus is taken as an example.
The memory 720, as a computer-readable storage medium, can be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the image prediction method in the embodiments of this application. The memory 720 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the device, and the like. In addition, the memory 720 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 720 may include memory located remotely relative to the processor 710, and such remote memory may be connected to the device via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiment 5
Embodiment 5 of the present application provides a readable storage medium. The readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the image prediction method of the embodiments of this application:
acquiring an image to be tested; and inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested, wherein the weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
Those of ordinary skill in the art can understand that all or some of the steps of the methods disclosed above and the functional modules/units of the systems and devices can be implemented as software, firmware, hardware, and appropriate combinations thereof.
In hardware implementations, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.
The preferred embodiments of this application have been described above with reference to the accompanying drawings, but this does not limit the scope of rights of this application. Any modification, equivalent replacement, or improvement made by those skilled in the art without departing from the scope and essence of this application shall fall within the scope of rights of this application.
A detailed description of exemplary embodiments of this application has been provided above by way of illustrative and non-limiting examples. Considered together with the drawings and the claims, various modifications and adaptations of the above embodiments will be apparent to those skilled in the art without departing from the scope of this application; the proper scope of this application is therefore to be determined according to the claims.

Claims (10)

  1. An image prediction method, comprising:
    acquiring an image to be tested; and
    inputting the image to be tested into a preset neural network model to obtain a predicted category of the image to be tested, wherein a weight of each layer of the preset neural network model is obtained through layer-wise adaptive learning rate training.
  2. The method according to claim 1, wherein before the inputting of the image to be tested into the preset neural network model, the method further comprises:
    using a plurality of graphics processing units (GPUs) to train on sample images in parallel to obtain the preset neural network model.
  3. The method according to claim 2, wherein the using of the plurality of GPUs to train on the sample images in parallel to obtain the preset neural network model comprises:
    determining, by each GPU, an initial gradient of each layer of an initial neural network model according to the sample images;
    obtaining an aggregate gradient and a layer learning rate of each layer of the initial neural network model according to the initial gradients of the layers;
    determining a preset weight of each layer according to a current weight of each layer, the layer learning rate, a weight decay parameter, and a global learning rate of the initial neural network model; and
    obtaining the preset neural network model according to the preset weight of each layer.
  4. The method according to claim 3, wherein before each GPU determines the initial gradient of each layer of the initial neural network model according to the sample images, the method further comprises:
    obtaining a model-construction instruction, and generating an original neural network model on each of the GPUs according to the model-construction instruction;
    determining parameters of the original neural network model contained on a designated GPU, and taking the original neural network model contained on the designated GPU as the initial neural network model; and
    broadcasting the parameters of the original neural network model contained on the designated GPU to the remaining GPUs, so that the remaining GPUs update the parameters of their own original neural network models according to the broadcast parameters to obtain the initial neural network model.
  5. The method according to claim 3, wherein the obtaining of the aggregate gradient and the layer learning rate of each layer of the initial neural network model according to the initial gradients of the layers comprises:
    adding the initial gradients of each layer to obtain a gradient sum of each layer;
    taking a ratio of the gradient sum of each layer to the number of initial gradients of each layer as the aggregate gradient of each layer; and
    obtaining the layer learning rate of each layer according to the aggregate gradient of each layer.
  6. The method according to claim 5, wherein the obtaining of the layer learning rate of each layer according to the aggregate gradient of each layer comprises:
    determining an upper boundary value and a lower boundary value of the learning rate, and a ratio of the current weight of each layer to the aggregate gradient;
    when it is determined that the ratio of the current weight to the aggregate gradient lies between the lower boundary value and the upper boundary value, taking the ratio of the current weight to the aggregate gradient as the layer learning rate of each layer;
    when it is determined that the ratio of the current weight to the aggregate gradient is greater than the upper boundary value, taking the upper boundary value as the layer learning rate of each layer; and
    when it is determined that the ratio of the current weight to the aggregate gradient is less than the lower boundary value, taking the lower boundary value as the layer learning rate of each layer.
  7. The method according to any one of claims 1 to 6, wherein the inputting of the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested comprises:
    inputting the image to be tested into the preset neural network model to obtain a category probability set, wherein the category probability set contains a correspondence between each category and a probability value;
    determining the category corresponding to the largest probability value in the probability set; and
    taking the category corresponding to the largest probability value as the predicted category of the image to be tested.
  8. The method according to any one of claims 1 to 7, wherein after the inputting of the image to be tested into the preset neural network model to obtain the predicted category of the image to be tested, the method further comprises:
    detecting a prediction result; and
    issuing an alarm prompt when it is determined that the prediction result is abnormal.
  9. An image prediction device, comprising a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connection and communication between the processor and the memory, wherein the program, when executed by the processor, implements the steps of the image prediction method according to any one of claims 1 to 8.
  10. A storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the steps of the image prediction method according to any one of claims 1 to 8.
PCT/CN2021/100993 2020-06-19 2021-06-18 Image prediction method, device and storage medium WO2021254498A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010568970.7A CN113822307A (zh) 2020-06-19 2020-06-19 Image prediction method, device and storage medium
CN202010568970.7 2020-06-19

Publications (1)

Publication Number Publication Date
WO2021254498A1 true WO2021254498A1 (zh) 2021-12-23

Family

ID=78924664

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100993 WO2021254498A1 (zh) 2020-06-19 2021-06-18 Image prediction method, device and storage medium

Country Status (2)

Country Link
CN (1) CN113822307A (zh)
WO (1) WO2021254498A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358284A (zh) * 2022-01-12 2022-04-15 厦门市美亚柏科信息股份有限公司 Method, apparatus and medium for step-by-step training of a neural network based on category information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101425152A (zh) * 2008-12-12 2009-05-06 湖南大学 Design method of an FIR filter based on a variable-learning-rate neural network
US9129190B1 (en) * 2013-12-04 2015-09-08 Google Inc. Identifying objects in images
CN108960410A (zh) * 2018-06-13 2018-12-07 华为技术有限公司 Neural-network-based parameter updating method, related platform and computer storage medium
CN110781724A (zh) * 2018-09-11 2020-02-11 开放智能机器(上海)有限公司 Face recognition neural network, method, apparatus, device and storage medium

Also Published As

Publication number Publication date
CN113822307A (zh) 2021-12-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21826146

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21826146

Country of ref document: EP

Kind code of ref document: A1
