WO2024020773A1 - Model generation method, image classification method, controller and electronic device - Google Patents

Model generation method, image classification method, controller and electronic device

Info

Publication number
WO2024020773A1
WO2024020773A1 PCT/CN2022/107857
Authority
WO
WIPO (PCT)
Prior art keywords
module
neural network
modules
convolutional neural
network model
Prior art date
Application number
PCT/CN2022/107857
Other languages
English (en)
French (fr)
Inventor
董学章
于春生
Original Assignee
江苏树实科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 江苏树实科技有限公司 filed Critical 江苏树实科技有限公司
Priority to PCT/CN2022/107857 priority Critical patent/WO2024020773A1/zh
Priority to CN202280005481.8A priority patent/CN115968477A/zh
Publication of WO2024020773A1 publication Critical patent/WO2024020773A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The present invention relates to the field of image processing technology, and in particular to a model generation method, an image classification method, a controller, and an electronic device.
  • With the advancement of computer hardware technology, deep learning models can run on the latest 32-bit microcontrollers.
  • The power consumption of commonly used microcontrollers (MCUs) is only a few milliwatts; thanks to this low power consumption, devices using microcontrollers can be powered by button cells or small solar cells.
  • Microcontrollers are an important part of the development of the Internet of Things.
  • Real-time operating systems (RTOS) have been widely used on the STMicroelectronics STM32 platform, the Espressif Systems ESP32 platform, and the Arduino platform; a real-time operating system enables a microcontroller to support multi-processor (CPU), multi-threaded applications.
  • Image classification is an image processing method that distinguishes targets of different categories based on the different features they present in image information; that is, given an image, it determines what categories of targets the image contains.
  • The deep-learning-based image classification convolutional neural network (CNN) is a feed-forward neural network; its artificial neurons respond to surrounding units within a limited receptive field, and it performs excellently on large-scale image processing.
  • The convolutional neural network model has a multi-layer architecture: after the first input layer, several convolutional layers, batch normalization layers, and downsampling layers are arranged in various orders, and finally an output layer outputs the category of the image.
  • The purpose of the present invention is to provide a model generation method, an image classification method, a controller, and an electronic device that can obtain a high-accuracy convolutional neural network model without requiring a large amount of labeled training data, while saving the manpower and time required for annotating training data.
  • The present invention provides a model generation method, which includes: constructing a convolutional neural network model for image classification, and dividing the convolutional neural network model sequentially into N modules, each module including multiple adjacent layers of the neural network model, N being an integer greater than 1; training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules; and cascading the trained 1st through (N-1)th modules with the Nth module and training the cascaded N modules on labeled training data to obtain the parameters and model of each module.
  • The present invention also provides an image classification method: obtaining a convolutional neural network model for classifying an image to be classified, the convolutional neural network model being generated by the above model generation method; and performing image classification on the image to be classified using the obtained convolutional neural network model.
  • The present invention also provides a controller for executing the above model generation method and/or the above image classification method.
  • The invention also provides an electronic device, including the above controller and a memory communicatively connected to the controller.
  • This embodiment provides a model generation method.
  • First, a convolutional neural network model for image classification is constructed, and the multi-layer structure of the constructed model is divided sequentially into N modules, each module including multiple adjacent layers of the neural network model; the 1st through (N-1)th modules are then trained on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules.
  • That is, the first N-1 modules are pre-trained with unlabeled training data, so that the first N-1 modules learn the features of the unlabeled training data in advance.
  • The trained 1st through (N-1)th modules are then cascaded with the Nth module, and the cascaded N modules are trained on labeled training data to obtain the parameters and model of each module.
  • Since the first N-1 modules have already learned the features of the unlabeled training data, only a small amount of labeled training data is needed for supervised training of the cascaded convolutional neural network model; a high-accuracy convolutional neural network model is thus obtained without a large amount of labeled training data, while the manpower and time required for annotating training data are saved.
  • In one embodiment, training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and model of each target module includes: for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, where the target module is one of the 1st through (N-1)th modules.
  • In one embodiment, using the target module as the encoding module of the autoencoder to design the decoding module of the autoencoder and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module includes: for the 1st module, training the 1st module on unlabeled training data to obtain its parameters and model; for the Mth module, training the Mth module on the output data of the (M-1)th module to obtain its parameters and model, where 1 < M ≤ N-1 and M is an integer.
  • In one embodiment, for each module, the memory occupied by the parameters of the module's multi-layer structure is smaller than the on-chip storage of the controller running the convolutional neural network model.
  • In one embodiment, after the trained 1st through (N-1)th modules are cascaded with the Nth module and the cascaded N modules are trained on labeled training data to obtain the parameters and model of each module, the method further includes: converting the parameters and model of each module into a format for running on the controller.
  • In one embodiment, constructing a convolutional neural network model for image classification includes: generating a convolutional neural network model for classifying the image to be classified based on the attributes of the image to be classified and the system parameters of the controller.
  • In one embodiment, the memory occupied by the parameters of each module's multi-layer structure is smaller than the on-chip storage of the controller; performing image classification on the image to be classified with the convolutional neural network model includes: running the multiple modules of the obtained convolutional neural network model in parallel in multiple threads or processors of the controller to classify the image to be classified.
  • Figure 1 is a flow chart of a model generation method according to the first embodiment of the present invention;
  • Figure 2 is a schematic diagram of a convolutional neural network model according to the first embodiment of the present invention;
  • Figure 3 is a flow chart of step 102 of the model generation method in Figure 1;
  • Figure 4 is a flow chart of an image classification method according to the second embodiment of the present invention.
  • The first embodiment of the present invention relates to a model generation method for training a convolutional neural network model; the trained convolutional neural network can be used for image classification.
  • Step 101: Construct a convolutional neural network model for image classification, and divide the convolutional neural network model sequentially into N modules; each module includes multiple adjacent layers of the neural network model, and N is an integer greater than 1.
  • Specifically, the convolutional neural network model is used for image classification and can be constructed based on the attributes of the image to be classified and the parameters of the controller running the convolutional neural network model.
  • After construction, the multi-layer structure of the convolutional neural network model is divided sequentially into N modules (N is an integer greater than 1); each module includes multiple layers of the convolutional neural network model, and connecting the modules in sequence yields the complete convolutional neural network model.
  • The controller can be an MCU microcontroller.
  • In one example, for each module, the memory occupied by the parameters of the module's multi-layer structure is smaller than the on-chip storage of the controller running the convolutional neural network model. That is, when dividing the convolutional neural network model, it must be ensured that the parameters of each resulting module occupy less storage than the controller's on-chip storage, so that a single module can run on the controller; moreover, multiple modules can later be selected to run in parallel in multiple threads of the controller, or, for a controller that includes multiple processors, in parallel on multiple processors, which speeds up the controller's computation and improves the speed of classifying images to be classified.
  • The first layer of the convolutional neural network model is the input layer, which receives the input image. After the input layer, several convolutional layers, batch normalization layers, and downsampling layers are arranged in sequence for feature extraction. The extracted features are connected through a fully connected layer to the final output layer, which outputs the category of the content in the image.
  • When dividing the convolutional neural network model of Figure 2, the input layer is cascaded with several groups (two groups are taken as an example in Figure 2) of convolutional, batch normalization, and downsampling layers to form module 1, and the next several groups (again two in Figure 2) of convolutional, batch normalization, and downsampling layers are cascaded to form module 2; repeating this process yields modules 3 through N-1 in turn, and finally the fully connected layer and the output layer are divided into module N.
  • Step 102: Train the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules.
  • Specifically, after the division of the convolutional neural network model in step 101 is completed, the 1st through (N-1)th modules are trained in sequence, and the parameters and model of each of these modules are obtained and saved; the parameters of each module include the connection weights between the layers within the module.
  • In one example, training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and model of each target module includes: for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, where the target module is one of the 1st through (N-1)th modules.
  • Referring to Figure 3, step 102 (for each target module, use the target module as the encoding module of the autoencoder to design the decoding module of the autoencoder, and train the autoencoder on unlabeled training data to obtain the parameters and model of the target module) includes the following sub-steps:
  • Sub-step 1021: For the 1st module, train the 1st module on unlabeled training data to obtain the parameters and model of the 1st module.
  • Sub-step 1022: For the Mth module, train the Mth module on the output data of the (M-1)th module to obtain the parameters and model of the Mth module, where 1 < M ≤ N-1 and M is an integer.
  • Taking the convolutional neural network model of Figure 2 as an example, the 1st module (module 1) through the (N-1)th module (module N-1) are trained in sequence.
  • Taking module 1 as an example, module 1 is first used as the encoding module 11 of an autoencoder to design the decoding module 12 of the autoencoder, so that the encoding module 11 (module 1) and the decoding module 12 form an autoencoder. Since an autoencoder performs unsupervised learning and does not rely on labels of the training data, it can automatically discover relationships within the training data by mining the data's intrinsic features, so the autoencoder can be trained using unlabeled training data.
  • The encoding module 11 (module 1) and the decoding module 12 are optimized to minimize the reconstruction error, yielding the required encoding module 11 (module 1); its parameters and model are saved, and the encoding module 11 (module 1) learns an abstract feature representation of the training data input.
  • For the 2nd module (module 2) through the (N-1)th module (module N-1), the training method adopted is similar to that of module 1; the main difference is that the input of each module is the output of the previous module. For example, when training module 2, the input data used are the output data of module 1.
  • The specific training process of the 2nd module (module 2) through the (N-1)th module (module N-1) is not repeated here; after training, the parameters and models of module 2 through module N-1 are obtained and saved.
  • Based on the above process, unlabeled training data can be used to perform unsupervised learning training on module 1 through module N-1, so that the convolutional neural network model learns the features of the training data.
  • Step 103: Cascade the trained 1st through (N-1)th modules with the Nth module, and train the cascaded N modules on labeled training data to obtain the parameters and model of each module.
  • Specifically, after the above pre-training, module 1 through module N are cascaded in sequence; that is, the modules are combined in their division order to obtain the complete convolutional neural network model, which is then trained with supervised learning on the labeled training data. Since module 1 through module N-1 have already learned the features of the training data in step 102, only a small amount of labeled training data is needed in this step for supervised training of the convolutional neural network model. After the combined convolutional neural network model is trained, the final convolutional neural network model is obtained, and the parameters and models of module 1 through module N are saved separately.
  • In one example, after step 103 the method also includes:
  • Step 104: Convert the parameters and model of each module into a format for running on the controller.
  • Specifically, the parameters and models of module 1 through module N are converted separately, so that module 1 through module N can run on the controller.
  • For example, the parameters and models of the modules are converted into code form, so that the modules can be compiled directly into the controller, which reduces the modules' memory usage on the controller and improves the running speed.
  • This embodiment provides a model generation method.
  • First, a convolutional neural network model for image classification is constructed, and the multi-layer structure of the constructed model is divided sequentially into N modules, each including multiple adjacent layers of the neural network model; the 1st through (N-1)th modules are then trained on unlabeled training data to obtain their parameters and models.
  • That is, the first N-1 modules are pre-trained with unlabeled training data so that they learn the features of the unlabeled training data in advance; the trained 1st through (N-1)th modules are then cascaded with the Nth module, and the cascaded N modules are trained on labeled training data to obtain the parameters and model of each module. Since the first N-1 modules have already learned the features of the unlabeled training data, only a small amount of labeled training data is needed for supervised training of the cascaded model, yielding a high-accuracy convolutional neural network model without a large amount of labeled training data while saving the manpower and time required for annotating the training data.
  • The second embodiment of the present invention discloses an image classification method, which is applied to a controller (which can be an MCU microcontroller); a convolutional neural network model for image classification runs on the controller, so that input images to be classified can be classified.
  • Step 201: Obtain a convolutional neural network model for classifying the image to be classified.
  • The convolutional neural network model is generated based on the model generation method in the first embodiment.
  • Specifically, the convolutional neural network model used for image classification is generated based on the model generation method in the first embodiment; once generated, it can run on the controller.
  • Step 202: Use the obtained convolutional neural network model to perform image classification on the image to be classified.
  • In one example, in the obtained convolutional neural network model, the memory occupied by the parameters of each module's multi-layer structure is smaller than the on-chip storage of the controller running it; performing image classification on the image to be classified with the obtained convolutional neural network model includes: running the multiple modules of the obtained convolutional neural network model in parallel in multiple threads or processors of the controller, and performing image classification on the image to be classified. That is, in the convolutional neural network model generated in the first embodiment, the memory each module needs in order to run is smaller than the controller's on-chip storage, so every module can run on the controller; multiple modules can then be selected to run in parallel in multiple threads of the controller, or, for controllers that include multiple processors, in parallel on multiple processors, which speeds up the controller's computation, improves the speed of classifying images to be classified, and suits low-power microprocessors.
  • For example, the modules run on different processors. The processor running the 1st module acquires the current image to be classified and, after completing its processing, sends the resulting data to the processor running the 2nd module, which performs the next processing step, and so on. After sending the current data to the processor running the 2nd module, the processor running the 1st module proceeds to acquire and process the next image.
  • The third embodiment of the present invention discloses a controller, such as an MCU controller, which is used to execute the model generation method in the first embodiment and/or the image classification method in the second embodiment; that is, the controller can run both the model generation method and the image classification method, or the two methods can be implemented by different controllers.
  • For example, the model generation method involves a model training process that demands high computing power, which can be handed over to a controller with stronger processing capability.
  • That controller then sends the generated convolutional neural network model to a low-power microcontroller, and the low-power microcontroller performs image classification based on the convolutional neural network.
  • The fourth embodiment of the present invention discloses an electronic device.
  • The electronic device includes the controller in the third embodiment and a memory communicatively connected to the controller.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present invention provides a model generation method, an image classification method, a controller, and an electronic device. The model generation method includes: constructing a convolutional neural network model for image classification, and dividing the convolutional neural network model sequentially into N modules, each module including multiple adjacent layers of the neural network model, N being an integer greater than 1; training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules; and cascading the trained 1st through (N-1)th modules with the Nth module and training the cascaded N modules on labeled training data to obtain the parameters and model of each module. A high-accuracy convolutional neural network model can thus be obtained without a large amount of labeled training data, while the manpower and time required for annotating training data are saved.

Description

Model generation method, image classification method, controller and electronic device
Technical Field
The present invention relates to the field of image processing technology, and in particular to a model generation method, an image classification method, a controller, and an electronic device.
Background Art
With advances in computer hardware technology, deep learning models can run on the latest 32-bit microcontrollers. The power consumption of commonly used microcontrollers (MCUs) is only a few milliwatts; thanks to this low power consumption, devices built around microcontrollers can be powered by button cells or small solar cells. Microcontrollers are an important part of the development of the Internet of Things, and real-time operating systems (RTOS) are already widely used on the STMicroelectronics STM32 platform, the Espressif Systems ESP32 platform, and the Arduino platform; a real-time operating system enables a microcontroller to support multi-processor (CPU), multi-threaded applications.
Image classification is an image processing method that distinguishes targets of different categories according to the different features each category presents in image information; that is, given an image, it determines what categories of targets the image contains. The deep-learning-based image classification convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to surrounding units within a limited receptive field; it performs excellently on large-scale image processing. The convolutional neural network model has a multi-layer architecture: after the first input layer, several convolutional layers, batch normalization layers, and downsampling layers are arranged in various orders, and finally an output layer outputs the category of the image.
The more convolutional layers a convolutional neural network model has, the greater its representational capacity; but the more layers the model has, the more parameters are involved. For example, MobileNetV2, an image classification model that can be used on mobile phones, has roughly 3.5M parameters, whereas current microcontrollers have only about 256KB to 512KB of on-chip memory, so such models cannot be used on a microcontroller; a microcontroller can therefore only run image classification convolutional neural networks with few layers.
Summary of the Invention
The purpose of the present invention is to provide a model generation method, an image classification method, a controller, and an electronic device that can obtain a high-accuracy convolutional neural network model without requiring a large amount of labeled training data, while saving the manpower and time required for annotating training data.
To achieve the above purpose, the present invention provides a model generation method, comprising: constructing a convolutional neural network model for image classification, and dividing the convolutional neural network model sequentially into N modules, each module comprising multiple adjacent layers of the neural network model, N being an integer greater than 1; training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules; and cascading the trained 1st through (N-1)th modules with the Nth module, and training the cascaded N modules on labeled training data to obtain the parameters and model of each module.
The present invention also provides an image classification method: obtaining a convolutional neural network model for classifying an image to be classified, the convolutional neural network model being generated by the above model generation method; and performing image classification on the image to be classified using the obtained convolutional neural network model.
The present invention also provides a controller for executing the above model generation method and/or the above image classification method.
The present invention also provides an electronic device, comprising: the above controller and a memory communicatively connected to the controller.
This embodiment provides a model generation method. A convolutional neural network model for image classification is first constructed, and its multi-layer structure is divided sequentially into N modules, each comprising multiple adjacent layers of the neural network model. The 1st through (N-1)th modules are then trained on unlabeled training data to obtain their parameters and models; that is, the first N-1 modules are pre-trained with unlabeled data so that they learn the features of that data in advance. The trained 1st through (N-1)th modules are then cascaded with the Nth module, and the cascaded N modules are trained on labeled training data to obtain the parameters and model of each module. Because the first N-1 modules have already learned the features of the unlabeled training data, only a small amount of labeled training data is needed for supervised training of the cascaded convolutional neural network model, which yields the final convolutional neural network model. A high-accuracy convolutional neural network model is thus obtained without a large amount of labeled training data, while the manpower and time required for annotating training data are saved.
In one embodiment, training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and model of each target module comprises: for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, wherein the target module is one of the 1st through (N-1)th modules.
In one embodiment, for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, comprises: for the 1st module, training the 1st module on unlabeled training data to obtain the parameters and model of the 1st module; for the Mth module, training the Mth module on the output data of the (M-1)th module to obtain the parameters and model of the Mth module, wherein 1 < M ≤ N-1 and M is an integer.
In one embodiment, for each module, the memory occupied by the parameters of the module's multi-layer structure is smaller than the on-chip storage of the controller running the convolutional neural network model.
In one embodiment, after cascading the trained 1st through (N-1)th modules with the Nth module and training the cascaded N modules on labeled training data to obtain the parameters and model of each module, the method further comprises: converting the parameters and model of each module into a format for running on a controller.
In one embodiment, constructing a convolutional neural network model for image classification comprises: generating, based on the attributes of the image to be classified and the system parameters of the controller, a convolutional neural network model for classifying the image to be classified.
In one embodiment, in the obtained convolutional neural network model, the memory occupied by the parameters of each module's multi-layer structure is smaller than the on-chip storage of the controller; performing image classification on the image to be classified using the obtained convolutional neural network model comprises: running the multiple modules of the obtained convolutional neural network model in parallel in multiple threads or processors of the controller to perform image classification on the image to be classified.
Brief Description of the Drawings
Figure 1 is a flow chart of the model generation method according to the first embodiment of the present invention;
Figure 2 is a schematic diagram of a convolutional neural network model according to the first embodiment of the present invention;
Figure 3 is a flow chart of step 102 of the model generation method in Figure 1;
Figure 4 is a flow chart of the image classification method according to the second embodiment of the present invention.
Detailed Description of the Embodiments
The embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the purposes, features, and advantages of the invention can be understood more clearly. It should be understood that the embodiments shown in the drawings do not limit the scope of the invention but merely illustrate the essential spirit of its technical solutions.
In the following description, certain specific details are set forth for the purpose of explaining the various disclosed embodiments and providing a thorough understanding of them. However, those skilled in the relevant art will recognize that the embodiments may be practiced without one or more of these specific details. In other instances, well-known devices, mechanisms, and techniques associated with this application may not be shown or described in detail, to avoid unnecessarily obscuring the description of the embodiments.
Unless the context requires otherwise, throughout the specification and claims the word "comprise" and its variants, such as "include" and "have", are to be understood in an open, inclusive sense, that is, as "including, but not limited to".
Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of "in one embodiment" or "in an embodiment" in various places throughout this specification do not necessarily all refer to the same embodiment. Furthermore, particular features, structures, or characteristics may be combined in any manner in one or more embodiments.
As used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. It should also be noted that the term "or" is generally employed in a sense including "and/or" unless the context clearly dictates otherwise.
In the following description, to clearly present the structure and operation of the invention, many directional terms are used; however, words such as "front", "rear", "left", "right", "outer", "inner", "outward", "inward", "upper", and "lower" should be understood as terms of convenience rather than as limiting terms.
The first embodiment of the present invention relates to a model generation method for training a convolutional neural network model; the trained convolutional neural network can be used for image classification.
The specific flow of the model generation method in this embodiment is shown in Figure 1.
Step 101: construct a convolutional neural network model for image classification, and divide the convolutional neural network model sequentially into N modules, each module comprising multiple adjacent layers of the neural network model, N being an integer greater than 1.
Specifically, the convolutional neural network model is used for image classification and can be constructed based on the attributes of the image to be classified and the parameters of the controller that runs the convolutional neural network model. After the multi-layer convolutional neural network model is constructed, its multi-layer structure is divided sequentially into N modules (N is an integer greater than 1), each module comprising multiple layers of the convolutional neural network model; connecting the modules in sequence yields the complete convolutional neural network model. The controller may be an MCU microcontroller.
In one example, for each module, the memory occupied by the parameters of the module's multi-layer structure is smaller than the on-chip storage of the controller running the convolutional neural network model. That is, when dividing the convolutional neural network model, it must be ensured that the parameters of each resulting module occupy less storage than the controller's on-chip storage, so that a single module can run on the controller; furthermore, multiple modules can later be selected to run in parallel in multiple threads of the controller or, for a controller with multiple processors, in parallel on multiple processors, which speeds up the controller's computation and improves the speed of classifying images to be classified.
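As a rough, purely illustrative check of this constraint (the MCU capacity and module layer sizes below are assumptions for illustration, not values from the patent), each module's parameter memory can be compared against the controller's on-chip storage:

    def module_param_bytes(layer_param_counts, bytes_per_param=4):
        """Memory occupied by a module's parameters (float32 assumed)."""
        return sum(layer_param_counts) * bytes_per_param

    # Hypothetical division: each entry lists the parameter counts of the
    # adjacent layers grouped into that module.
    modules = [
        [432, 32, 4608],   # module 1: conv + batch norm + conv
        [9216, 64],        # module 2
        [16384, 1290],     # module N: fully connected + output layer
    ]

    MCU_ON_CHIP_BYTES = 256 * 1024  # e.g. 256KB, as cited in the background

    for i, layers in enumerate(modules, start=1):
        size = module_param_bytes(layers)
        assert size < MCU_ON_CHIP_BYTES, f"module {i} does not fit on-chip"
        print(f"module {i}: {size} bytes fits on-chip")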
Taking the convolutional neural network model of Figure 2 as an example, the first layer of the model is the input layer, which receives the input image; after the input layer, several convolutional layers, batch normalization layers, and downsampling layers are arranged in sequence for feature extraction; the extracted features are connected through a fully connected layer to the final output layer, which outputs the category of the content in the image.
When dividing the convolutional neural network model of Figure 2, the input layer is cascaded with several groups (two groups are taken as an example in Figure 2) of convolutional, batch normalization, and downsampling layers to form module 1; the next several groups (again two in Figure 2) of convolutional, batch normalization, and downsampling layers are cascaded to form module 2; repeating this process yields modules 3 through N-1 in turn, and finally the fully connected layer and the output layer are divided into module N.
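A minimal sketch of such a division, written here in PyTorch purely as an assumption (the patent does not name any framework), groups adjacent layers into sequential modules whose concatenation restores the complete model:

    import torch.nn as nn

    # Module 1: two groups of convolution + batch normalization + downsampling
    module1 = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.MaxPool2d(2),
    )

    # Module 2: the next two groups
    module2 = nn.Sequential(
        nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.MaxPool2d(2),
        nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.MaxPool2d(2),
    )

    # Module N: fully connected layer plus output layer
    moduleN = nn.Sequential(
        nn.Flatten(),
        nn.Linear(64 * 4 * 4, 10),  # assumes 64x64 inputs and 10 classes
    )

    # Connecting the modules in sequence yields the complete model
    model = nn.Sequential(module1, module2, moduleN)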
Step 102: train the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules.
Specifically, after the division of the convolutional neural network model in step 101 is completed, the 1st through (N-1)th modules are trained in sequence, and the parameters and model of each of these modules are obtained and saved; the parameters of each module include the connection weights between the layers within the module.
In one example, training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and model of each target module comprises: for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, where the target module is one of the 1st through (N-1)th modules.
Referring to Figure 3, step 102 (for each target module, use the target module as the encoding module of the autoencoder to design the decoding module of the autoencoder, and train the autoencoder on unlabeled training data to obtain the parameters and model of the target module) comprises the following sub-steps:
Sub-step 1021: for the 1st module, train the 1st module on unlabeled training data to obtain the parameters and model of the 1st module.
Sub-step 1022: for the Mth module, train the Mth module on the output data of the (M-1)th module to obtain the parameters and model of the Mth module, where 1 < M ≤ N-1 and M is an integer.
Taking the convolutional neural network model of Figure 2 as an example, during training the 1st module (module 1) through the (N-1)th module (module N-1) are trained in sequence. Taking module 1 as an example, module 1 is first used as the encoding module 11 of an autoencoder to design the decoding module 12 of the autoencoder, so that the encoding module 11 (module 1) and the decoding module 12 form an autoencoder. Since an autoencoder performs unsupervised learning and does not rely on labels of the training data, it can automatically discover relationships within the training data by mining the data's intrinsic features, so the autoencoder can be trained with unlabeled training data. The unlabeled training data are input to the encoding module 11 (module 1), which maps the training data into a feature space; the decoding module 12 then maps the features obtained by the encoding module 11 (module 1) back to the original space to produce reconstructed data, and the reconstructed data are compared with the training data to obtain a reconstruction error. The encoding module 11 (module 1) and the decoding module 12 are optimized with minimization of this reconstruction error as the objective, yielding the final required encoding module 11 (module 1); the parameters and model of the encoding module 11 (module 1) are saved, and the encoding module 11 (module 1) has learned an abstract feature representation of the training data input.
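A compact sketch of this autoencoder pre-training, again assuming PyTorch and a mirror-image decoder (both are illustrative choices the patent leaves open), could look like this:

    import torch
    import torch.nn as nn

    def pretrain_module(encoder, decoder, unlabeled_data, epochs=10):
        """Train encoder + decoder as an autoencoder by minimizing the
        reconstruction error; only the encoder (the module) is kept."""
        autoencoder = nn.Sequential(encoder, decoder)
        optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            for x in unlabeled_data:      # no labels are needed
                x_hat = autoencoder(x)    # encode, then reconstruct
                loss = loss_fn(x_hat, x)  # reconstruction error
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return encoder  # the decoder is discarded after pre-training

    # One possible decoder mirroring module1 above (an assumed design)
    decoder1 = nn.Sequential(
        nn.Upsample(scale_factor=2), nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
        nn.Upsample(scale_factor=2), nn.Conv2d(16, 3, 3, padding=1),
    )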
For the 2nd module (module 2) through the (N-1)th module (module N-1), the training method adopted is similar to that of module 1; the main difference is that the input of each module is the output of the previous module. For example, when module 2 is trained, the input data used are the output data of module 1. The specific training process of the 2nd module (module 2) through the (N-1)th module (module N-1) is not repeated here; after training, the parameters and models of module 2 through module N-1 are obtained and saved.
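Chaining the two sub-steps gives a greedy module-by-module loop; the sketch below reuses pretrain_module from the previous sketch, and make_decoder_for is a hypothetical helper (not from the patent) that builds a matching decoder for each module:

    import torch

    def greedy_pretrain(modules, unlabeled_data, make_decoder_for):
        """Pre-train modules 1..N-1 in order; module M is trained on the
        outputs of the already-trained modules 1..M-1."""
        data = list(unlabeled_data)
        for module in modules[:-1]:            # the Nth module is left out
            decoder = make_decoder_for(module) # hypothetical helper
            pretrain_module(module, decoder, data)
            with torch.no_grad():              # the next module's input is
                data = [module(x) for x in data]  # this module's output
        return modules[:-1]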
Based on the above process, unlabeled training data can be used to perform unsupervised learning training on module 1 through module N-1, so that the convolutional neural network model learns the features of the training data.
Step 103: cascade the trained 1st through (N-1)th modules with the Nth module, and train the cascaded N modules on labeled training data to obtain the parameters and model of each module.
Specifically, after the above pre-training of module 1 through module N-1, module 1 through module N are cascaded in sequence; that is, the modules are combined in their division order to obtain the complete convolutional neural network model, which is then trained with supervised learning on the labeled training data. Since module 1 through module N-1 have already learned the features of the training data in step 102, only a small amount of labeled training data is needed in this step for supervised training of the convolutional neural network model. After the combined convolutional neural network model is trained, the final convolutional neural network model is obtained, and the parameters and models of module 1 through module N are saved separately.
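The supervised fine-tuning of the cascaded model might then be sketched as follows (illustrative PyTorch; the small labeled set stands in for the "small amount of labeled training data"):

    import torch
    import torch.nn as nn

    def finetune_cascade(modules, labeled_data, epochs=5):
        """Cascade all N modules in division order and fine-tune them
        with labeled data."""
        model = nn.Sequential(*modules)
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for x, y in labeled_data:  # a small labeled set suffices, since
                logits = model(x)      # modules 1..N-1 are already pre-trained
                loss = loss_fn(logits, y)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return model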
In one example, after step 103 the method further comprises:
Step 104: convert the parameters and model of each module into a format for running on the controller.
Specifically, after the final parameters and models of module 1 through module N are saved in step 103, the parameters and models of module 1 through module N are converted separately so that module 1 through module N can run on the controller. For example, the parameters and models of the modules are converted into code form, so that the modules can be compiled directly into the controller, which reduces the modules' memory footprint on the controller and increases running speed.
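One way to realize this "code form" conversion (the patent fixes no particular format; this is a minimal sketch) is to emit each module's weights as a C array that the controller firmware can compile in directly:

    import itertools

    def export_module_as_c_array(module, name, path):
        """Write a module's flattened parameters as a C source file so
        the controller firmware can compile them in directly."""
        params = [p.detach().flatten().tolist() for p in module.parameters()]
        flat = list(itertools.chain.from_iterable(params))
        with open(path, "w") as f:
            f.write(f"const float {name}_weights[{len(flat)}] = {{\n")
            f.write(",\n".join(f"  {v:.8e}f" for v in flat))
            f.write("\n};\n")

    # export_module_as_c_array(module1, "module1", "module1_weights.c")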
This embodiment provides a model generation method. A convolutional neural network model for image classification is first constructed, and its multi-layer structure is divided sequentially into N modules, each comprising multiple adjacent layers of the neural network model. The 1st through (N-1)th modules are then trained on unlabeled training data to obtain their parameters and models; that is, the first N-1 modules are pre-trained with unlabeled data so that they learn the features of that data in advance. The trained 1st through (N-1)th modules are then cascaded with the Nth module, and the cascaded N modules are trained on labeled training data to obtain the parameters and model of each module. Because the first N-1 modules have already learned the features of the unlabeled training data, only a small amount of labeled training data is needed for supervised training of the cascaded model, yielding the final convolutional neural network model. A high-accuracy convolutional neural network model is thus obtained without a large amount of labeled training data, while the manpower and time required for annotating training data are saved.
The second embodiment of the present invention discloses an image classification method, which is applied to a controller (which may be an MCU microcontroller); a convolutional neural network model for image classification runs on the controller, so that input images to be classified can be classified.
The specific flow of the image classification method in this example is shown in Figure 4.
Step 201: obtain a convolutional neural network model for classifying the image to be classified, the convolutional neural network model being generated by the model generation method of the first embodiment.
Specifically, the convolutional neural network model used for image classification is generated by the model generation method of the first embodiment; once generated, it can run on the controller.
Step 202: perform image classification on the image to be classified using the obtained convolutional neural network model.
In one example, in the obtained convolutional neural network model, the memory occupied by the parameters of each module's multi-layer structure is smaller than the on-chip storage of the controller running it; performing image classification on the image to be classified with the obtained convolutional neural network model comprises: running the multiple modules of the obtained convolutional neural network model in parallel in multiple threads or processors of the controller to classify the image to be classified. That is, in the convolutional neural network model generated in the first embodiment, the memory each module needs in order to run is smaller than the controller's on-chip storage, so every module can run on the controller; multiple modules can then be selected to run in parallel in multiple threads of the controller or, for a controller with multiple processors, in parallel on multiple processors, which speeds up the controller's computation, improves the speed of classifying images to be classified, and suits low-power microprocessors. For example, with the modules running on different processors, the processor running the 1st module acquires the current image to be classified and, after completing its processing, sends the resulting data to the processor running the 2nd module, which performs the next processing step, and so on; after sending the current data to the processor running the 2nd module, the processor running the 1st module proceeds to acquire and process the next image.
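This pipelined arrangement can be illustrated with threads and queues; the host-side Python sketch below only models the dataflow (on a real RTOS the same pattern would use tasks and message queues, which is an assumption here):

    import queue
    import threading

    def stage(module, inbox, outbox):
        """Run one module as a pipeline stage: receive data from the
        previous stage, process it, and pass the result downstream."""
        while True:
            x = inbox.get()
            if x is None:        # sentinel: shut the pipeline down
                outbox.put(None)
                break
            outbox.put(module(x))

    def run_pipeline(modules, images):
        queues = [queue.Queue() for _ in range(len(modules) + 1)]
        for m, q_in, q_out in zip(modules, queues, queues[1:]):
            threading.Thread(target=stage, args=(m, q_in, q_out),
                             daemon=True).start()
        for img in images:       # stage 1 moves on to the next image as
            queues[0].put(img)   # soon as it hands the current one off
        queues[0].put(None)
        results = []
        while (r := queues[-1].get()) is not None:
            results.append(r)
        return results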
The third embodiment of the present invention discloses a controller, for example an MCU controller, which is used to execute the model generation method of the first embodiment and/or the image classification method of the second embodiment; that is, the controller can run both the model generation method and the image classification method, or the two methods can be implemented by different controllers. For example, the model generation method involves a model training process that demands high computing power and can be handed over to a controller with stronger processing capability; that controller then sends the generated convolutional neural network model to a low-power microcontroller, and the low-power microcontroller performs image classification based on the convolutional neural network.
The fourth embodiment of the present invention discloses an electronic device, which comprises the controller of the third embodiment and a memory communicatively connected to that controller.
The preferred embodiments of the present invention have been described in detail above, but it should be understood that, if desired, aspects of the embodiments can be modified to employ aspects, features, and concepts of various patents, applications, and publications to provide further embodiments.
In view of the above detailed description, these and other changes can be made to the embodiments. In general, the terms used in the claims should not be construed as limited to the specific embodiments disclosed in the specification and claims, but should be understood to include all possible embodiments together with the full scope of equivalents to which such claims are entitled.

Claims (10)

  1. A model generation method, characterized by comprising:
    constructing a convolutional neural network model for image classification, and dividing the convolutional neural network model sequentially into N modules, each of the modules comprising multiple adjacent layers of the neural network model, N being an integer greater than 1;
    training the 1st through (N-1)th of the modules on unlabeled training data to obtain the parameters and models of the 1st through (N-1)th modules;
    cascading the trained 1st through (N-1)th modules with the Nth module, and training the cascaded N modules on labeled training data to obtain the parameters and model of each of the modules.
  2. The model generation method according to claim 1, characterized in that training the 1st through (N-1)th modules on unlabeled training data to obtain the parameters and model of each target module comprises:
    for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, wherein the target module is one of the 1st through (N-1)th modules.
  3. The model generation method according to claim 2, characterized in that, for each target module, using the target module as the encoding module of an autoencoder to design the decoding module of the autoencoder, and training the autoencoder on unlabeled training data to obtain the parameters and model of the target module, comprises:
    for the 1st module, training the 1st module on unlabeled training data to obtain the parameters and model of the 1st module;
    for the Mth module, training the Mth module on the output data of the (M-1)th module to obtain the parameters and model of the Mth module, wherein 1 < M ≤ N-1 and M is an integer.
  4. The model generation method according to claim 1, characterized in that, for each of the modules, the memory occupied by the parameters of the module's multi-layer structure is smaller than the on-chip storage of the controller running the convolutional neural network model.
  5. The model generation method according to claim 1, characterized in that, after cascading the trained 1st through (N-1)th modules with the Nth module and training the cascaded N modules on labeled training data to obtain the parameters and model of each of the modules, the method further comprises:
    converting the parameters and model of each of the modules into a format for running on a controller.
  6. The model generation method according to claim 1, characterized in that constructing a convolutional neural network model for image classification comprises:
    generating, based on the attributes of the image to be classified and the system parameters of the controller, a convolutional neural network model for classifying the image to be classified.
  7. An image classification method, characterized in that it is applied to a controller and comprises:
    obtaining a convolutional neural network model for classifying an image to be classified, the convolutional neural network model being generated by the model generation method according to any one of claims 1 to 6;
    performing image classification on the image to be classified using the obtained convolutional neural network model.
  8. The image classification method according to claim 7, characterized in that, in the obtained convolutional neural network model, the memory occupied by the parameters of each module's multi-layer structure is smaller than the on-chip storage of the controller; performing image classification on the image to be classified using the obtained convolutional neural network model comprises:
    running the multiple modules of the obtained convolutional neural network model in parallel in multiple threads or processors of the controller to perform image classification on the image to be classified.
  9. A controller, characterized in that it is configured to execute the model generation method according to any one of claims 1 to 6 and/or the image classification method according to claim 7 or 8.
  10. An electronic device, characterized by comprising: the controller according to claim 9 and a memory communicatively connected to the controller.
PCT/CN2022/107857 2022-07-26 2022-07-26 Model generation method, image classification method, controller and electronic device WO2024020773A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/107857 WO2024020773A1 (zh) Model generation method, image classification method, controller and electronic device
CN202280005481.8A CN115968477A (zh) Model generation method, image classification method, controller and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/107857 WO2024020773A1 (zh) Model generation method, image classification method, controller and electronic device

Publications (1)

Publication Number Publication Date
WO2024020773A1 true WO2024020773A1 (zh) 2024-02-01

Family

ID=87363706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/107857 WO2024020773A1 (zh) Model generation method, image classification method, controller and electronic device

Country Status (2)

Country Link
CN (1) CN115968477A (zh)
WO (1) WO2024020773A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109544190A (zh) * 2018-11-28 2019-03-29 北京芯盾时代科技有限公司 一种欺诈识别模型训练方法、欺诈识别方法及装置
CN111126481A (zh) * 2019-12-20 2020-05-08 湖南千视通信息科技有限公司 一种神经网络模型的训练方法及装置
CN111401524A (zh) * 2020-03-17 2020-07-10 深圳市物语智联科技有限公司 卷积神经网络处理方法、装置、设备、存储介质及模型
US20210150710A1 (en) * 2019-11-15 2021-05-20 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing a self-supervised chest x-ray image analysis machine-learning model utilizing transferable visual words
CN114492723A (zh) * 2020-11-13 2022-05-13 华为技术有限公司 神经网络模型的训练方法、图像处理方法及装置

Also Published As

Publication number Publication date
CN115968477A (zh) 2023-04-14

Similar Documents

Publication Publication Date Title
WO2019007406A1 (zh) Data processing apparatus and method
CN112288075B (zh) Data processing method and related device
Liu et al. Time series prediction based on temporal convolutional network
WO2023160472A1 (zh) Model training method and related device
CN116415654A (zh) Data processing method and related device
CN112069804B (zh) Implicit discourse relation recognition method based on an interactive capsule network with dynamic routing
Ding et al. Slimyolov4: lightweight object detector based on yolov4
WO2024020774A1 (zh) Model generation method, object detection method, controller and electronic device
Kong et al. Real‐time facial expression recognition based on iterative transfer learning and efficient attention network
Dong et al. Lambo: Large language model empowered edge intelligence
CN114359656A (zh) Melanoma image recognition method based on self-supervised contrastive learning, and storage device
WO2024020773A1 (zh) Model generation method, image classification method, controller and electronic device
Zhang et al. NAS4FBP: Facial beauty prediction based on neural architecture search
Wang et al. Fundamentals of artificial intelligence
US20230024803A1 (en) Semi-supervised video temporal action recognition and segmentation
Fang et al. A method of license plate location and character recognition based on CNN
Xu et al. NWP feature selection and GCN-based ultra-short-term wind farm cluster power forecasting method
Zhang et al. A Hybrid Neural Network-Based Intelligent Forecasting Approach for Capacity of Photovoltaic Electricity Generation
CN111597814B (zh) Human-computer interaction named entity recognition method, apparatus, device and storage medium
Wu et al. NLP Research Based on Transformer Model
He et al. Healthcare entity recognition based on deep learning
Olaofe Assessment of LSTM, Conv2D and ConvLSTM2D Prediction Models for Long-Term Wind Speed and Direction Regression Analysis
Qian et al. ConShuffleNet: An Efficient Convolutional Neural Network Based on ShuffleNetV2
US20230010230A1 (en) Auxiliary middle frame prediction loss for robust video action segmentation
CN116526582B (zh) Power unit commitment scheduling method and system jointly driven by artificial intelligence