CN111027635A - Image processing model construction method and device, terminal and readable storage medium - Google Patents


Info

Publication number
CN111027635A
CN111027635A · Application CN201911298501.1A · Granted publication CN111027635B
Authority
CN
China
Prior art keywords
image
model
channels
input
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911298501.1A
Other languages
Chinese (zh)
Other versions
CN111027635B (en)
Inventor
邹冲
李世行
殷磊
吴海山
汪飙
张元梵
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN201911298501.1A priority Critical patent/CN111027635B/en
Publication of CN111027635A publication Critical patent/CN111027635A/en
Application granted granted Critical
Publication of CN111027635B publication Critical patent/CN111027635B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/29Graphical models, e.g. Bayesian networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method, an apparatus, a terminal and a readable storage medium for constructing an image processing model. The method comprises the following steps: acquiring a pre-training model based on a convolutional neural network; performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer; acquiring the number of channels of the current image to be processed, and configuring the number of image channels in the input data of the input layer as the number of channels of the image to be processed; taking the convolutional layer closest to the input layer as the target convolutional layer, and inputting the configured input data of the input layer to the target convolutional layer; and configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed. A target model can therefore be constructed based on the pre-training model while the channel data of the input image to be processed is kept complete, and the target model can be adaptively adjusted to process images with different numbers of channels.

Description

Image processing model construction method and device, terminal and readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for constructing an image processing model, a terminal, and a readable storage medium.
Background
With the development of deep learning, convolutional neural networks can be applied to image processing tasks in natural scenes. To improve image processing performance, such as the speed and accuracy of image recognition, extensive research and technical development has been carried out in academia and industry; for example, high classification accuracy has been achieved on large datasets such as ImageNet.
However, existing image processing models are not well targeted when handling complex image information: it is difficult for them to adapt flexibly, and an ideal processing effect is hard to obtain. For example, the VGG16 model pre-trained on ImageNet can process three-channel image information; when faced with eight-channel image information, the input data must first be reduced in dimension by techniques such as PCA (Principal Component Analysis) or band selection before transfer learning is performed on VGG16. This dimension reduction may lose image information, leading to problems such as inflexible image processing and a poor processing effect.
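The band-reduction workaround criticized above can be sketched as follows. This example is not part of the patent: it applies a toy PCA (via SVD) to random 8-band pixel data and shows that reconstructing the original bands from 3 components is lossy, which is exactly the feature loss the invention seeks to avoid.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for an 8-band image flattened to (num_pixels, num_bands).
pixels = rng.normal(size=(1000, 8))

# PCA via SVD on mean-centered data: keep the top 3 components,
# mimicking the band reduction needed before a 3-channel model.
centered = pixels - pixels.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:3].T          # (1000, 3): fits a 3-channel input

# Reconstructing the 8 bands from 3 components is lossy: the discarded
# components carry information the downstream model never sees.
reconstructed = reduced @ vt[:3]
loss = float(np.linalg.norm(centered - reconstructed))
print(reduced.shape, loss > 0)
```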
Disclosure of Invention
The main purpose of the present invention is to provide a method, an apparatus, a terminal and a readable storage medium for constructing an image processing model, so as to solve the technical problem in the prior art that processing methods for an image to be processed lack flexibility and therefore struggle to achieve an ideal processing effect.
In order to achieve the above object, the present invention provides a method for constructing an image processing model, the method comprising:
acquiring a pre-training model based on a convolutional neural network;
performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer;
acquiring the number of channels of a current image to be processed, and configuring the number of image channels in input data of the input layer as the number of channels of the image to be processed;
taking the convolutional layer closest to the input layer as the target convolutional layer, and inputting the configured input data of the input layer to the target convolutional layer;
and configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
Further, the step of configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed includes:
acquiring the second weight corresponding to the target convolution layer in the pre-training model;
mapping a first matrix according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of row vectors, and splicing the plurality of row vectors into a second matrix, wherein the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed;
or mapping a first matrix according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of column vectors, and splicing the plurality of column vectors into a second matrix, wherein the number of column vectors of the second matrix corresponds to the number of channels of the image to be processed;
mapping the constructed second matrix to the first weight.
Further, the step of mapping the constructed second matrix to the first weight includes:
scaling the second matrix according to a preset scaling value, and mapping the scaled second matrix to the first weight.
Further, the preset scaling value is a ratio of the number of input channels in the second weight to the number of input channels in the first weight.
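A minimal sketch of the row-vector variant above, under the assumption that the "first matrix" is the pretrained second weight viewed with one row per input channel; all names and sizes are illustrative, not the patent's verbatim procedure:

```python
import numpy as np

old_channels, new_channels = 3, 8                 # pretrained vs. image to be processed
rng = np.random.default_rng(1)
# "First matrix": one row vector per input channel of the second weight,
# each row flattening (kernel h x kernel w x kernel count) = 3*3*64 values.
first_matrix = rng.normal(size=(old_channels, 3 * 3 * 64))

# Cut into row vectors and splice them, cycling, until the row count
# matches the channel count of the image to be processed.
rows = [first_matrix[i % old_channels] for i in range(new_channels)]
second_matrix = np.stack(rows)

# Scale by the preset scaling value (old channels / new channels) so the
# convolution's expected response magnitude is roughly preserved.
second_matrix = second_matrix * (old_channels / new_channels)
print(second_matrix.shape)   # (8, 576)
```

The scaling step matters because summing over 8 input channels instead of 3 would otherwise inflate the layer's activations by roughly 8/3.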
Further, the step of performing transfer learning on the pre-training model to construct a target model, wherein the target model includes an input layer and at least one convolutional layer, includes:
and constructing the target model according to the pre-training model, wherein the weights corresponding to other convolutional layers except the target convolutional layer in the pre-training model are correspondingly configured to the weights corresponding to the convolutional layers of the target model.
Further, the step of performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer, is preceded by the step of:
and acquiring the image to be processed, and processing and analyzing the image to be processed to obtain the number of channels of the image to be processed.
Further, after the step of configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed, the method includes:
acquiring the image to be processed;
and inputting the image to be processed into the target model as a training image for verification and adjustment, and outputting the verified and adjusted target model.
The present invention also provides an apparatus for constructing an image processing model, the apparatus comprising:
the model acquisition module is used for acquiring a pre-training model based on a convolutional neural network;
the building module is used for carrying out transfer learning on the pre-training model to build a target model and configuring the number of image channels of an input layer of the target model as the number of channels of an image to be processed;
the channel number acquisition module is used for acquiring the channel number of the current image to be processed and configuring the image channel number in the input data of the input layer as the channel number of the image to be processed;
an input module, configured to input the input data after configuration of the input layer to a target convolutional layer, with a convolutional layer closest to the input layer as the target convolutional layer;
and the configuration module is used for configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
The present invention also provides a terminal, including: a memory, a processor, and a construction program of an image processing model stored on the memory and executable on the processor, wherein the construction program, when executed by the processor, implements the steps of the method for constructing an image processing model described above.
The present invention also provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of constructing an image processing model as described above.
The method for constructing the image processing model comprises: acquiring a pre-training model based on a convolutional neural network; performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer; acquiring the number of channels of the current image to be processed, and configuring the number of image channels in the input data of the input layer as the number of channels of the image to be processed; taking the convolutional layer closest to the input layer as the target convolutional layer, and inputting the configured input data of the input layer to the target convolutional layer; and configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed. In this way, a target model capable of processing the image to be processed can be constructed from the pre-training model, and the number of image channels in the input data of the input layer and the number of input channels in the first weight can be flexibly configured according to the number of channels of the image to be processed. This avoids the feature loss caused by first reducing the dimension of the image channels and then performing transfer learning on the pre-training model, ensures the completeness of the channel data of the input image, and allows the target model to adaptively adjust to images with different numbers of channels, making image processing more flexible and more targeted and the processing results more accurate.
Drawings
Fig. 1 is a schematic structural diagram of a terminal in the hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a method for constructing an image processing model according to the present invention;
fig. 3 is a schematic structural diagram of a framework of an embodiment of an apparatus for constructing an image processing model according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
The terminal of the embodiment of the invention can be a server.
As shown in fig. 1, the terminal may include: a processor 1001 (e.g., a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory); the memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a construction program of an image processing model.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call a building program of the image processing model stored in the memory 1005, and perform the following operations:
acquiring a pre-training model based on a convolutional neural network;
performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer;
acquiring the number of channels of a current image to be processed, and configuring the number of image channels in input data of the input layer as the number of channels of the image to be processed;
taking the convolutional layer closest to the input layer as the target convolutional layer, and inputting the configured input data of the input layer to the target convolutional layer;
and configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
Further, acquiring the second weight corresponding to the target convolution layer in the pre-training model;
mapping a first matrix according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of row vectors, and splicing the plurality of row vectors into a second matrix, wherein the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed;
or mapping a first matrix according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of column vectors, and splicing the plurality of column vectors into a second matrix, wherein the number of column vectors of the second matrix corresponds to the number of channels of the image to be processed;
mapping the constructed second matrix to the first weight.
Further, the second matrix is scaled according to the preset scaling value, and the scaled second matrix is mapped to the first weight.
Further, the preset scaling value is a ratio of the number of input channels in the second weight to the number of input channels in the first weight.
Further, the target model is constructed according to the pre-training model, wherein the weights corresponding to the convolutional layers other than the target convolutional layer in the pre-training model are correspondingly configured as the weights of the corresponding convolutional layers of the target model.
Further, the processor 1001 may call the construction program of the image processing model stored in the memory 1005, and also perform the following operations:
and acquiring the image to be processed, and processing and analyzing the image to be processed to obtain the number of channels of the image to be processed.
Further, the processor 1001 may call the construction program of the image processing model stored in the memory 1005, and also perform the following operations:
acquiring the image to be processed;
and inputting the image to be processed into the target model as a training image for verification and adjustment, and outputting the verified and adjusted target model.
Referring to fig. 2, the present invention provides various embodiments of the method of the present invention based on the above hardware structure.
The invention provides a method for constructing an image processing model, which is applied to a terminal, and in a first embodiment of the method for constructing the image processing model, referring to FIG. 2, the method comprises the following steps:
step S10, acquiring a pre-training model based on a convolutional neural network;
The terminal acquires a pre-training model based on a convolutional neural network. In this embodiment, the convolutional neural network is a standard neural network that can be extended, and it can recognize and process images through its internal convolutional layers. The pre-training model is constructed based on a convolutional neural network; it may be, for example, the VGG16 model pre-trained on ImageNet, which can process three-channel image information.
In this embodiment, the pre-training model includes an input layer and at least one convolutional layer. Specifically, the pre-training model may include an input layer, a first convolutional layer receiving the output data of the input layer, a second convolutional layer receiving the output data of the first convolutional layer, ..., and a pooling layer. The convolutional layer closest to the input layer is taken as the target convolutional layer, i.e., the first convolutional layer is the target convolutional layer. The format of the input layer is (number of samples, image height, image width, number of image channels). A convolutional layer can be expressed mathematically by its weights; for example, the format of the second weight corresponding to the target convolutional layer is (convolution kernel height, convolution kernel width, number of input channels, number of convolution kernels), and the size of a convolution kernel may be 1x1, 3x3, 5x5, etc., which is not limited here. The pooling layer is used to reduce the size of the model, increase the computation speed and improve the robustness of the extracted features. In this embodiment, if the size of the image to be processed is 224x224 (pixels) and it includes three RGB bands, then in the pre-training model the format of the input layer is (None,224,224,3), the format of the second weight corresponding to the target convolutional layer is (3,3,3,64) (it may also be expressed as (1,1,3,64)), and the format of the pooling layer output is (None,112,112,64), where the value of None corresponds to the number of samples of the image to be processed.
For example, the pre-training model VGG16 on ImageNet is acquired, where the image size is 224x224 (pixels) and the image includes three RGB bands:
Layer type                 | Input data        | Output data       | Second weight
Input layer                | (None,224,224,3)  | (None,224,224,3)  | -
Target convolutional layer | (None,224,224,3)  | (None,224,224,64) | (3,3,3,64)
Second convolutional layer | (None,224,224,64) | (None,224,224,64) | (3,3,64,64)
Pooling layer              | (None,224,224,64) | (None,112,112,64) | -
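The shape bookkeeping in the table above can be checked with a small sketch (Keras-style NHWC shapes; None stands for the batch size; the helper names are illustrative, not from the patent):

```python
def conv2d_shape(input_shape, weight_shape):
    """Output shape of a stride-1, 'same'-padded convolution:
    spatial size is kept, the channel count becomes the kernel count."""
    n, h, w, c_in = input_shape
    kh, kw, w_in, n_kernels = weight_shape
    assert c_in == w_in, "input channels must match the weight's input channels"
    return (n, h, w, n_kernels)

def pool_shape(input_shape, pool=2):
    """2x2 pooling halves the spatial dimensions."""
    n, h, w, c = input_shape
    return (n, h // pool, w // pool, c)

x = (None, 224, 224, 3)                  # input layer
x = conv2d_shape(x, (3, 3, 3, 64))       # target convolutional layer
x = conv2d_shape(x, (3, 3, 64, 64))      # second convolutional layer
x = pool_shape(x)                        # pooling layer
print(x)
```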
Step S20, performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer;
the terminal performs transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer. Wherein, the transfer learning refers to the learning task that the knowledge learned from one environment is used to help the new environment. In this embodiment, a pre-trained model is subjected to transfer learning, and the pre-trained model may be an image subjected to recognition processing, and the pre-trained model is subjected to transfer learning, that is, the target model is constructed based on the pre-trained model, where the target model includes an input layer and at least one convolutional layer. The target model includes, but is not limited to, an input layer, a target convolutional layer receiving output data of the input layer, a second convolutional layer receiving output data of the target convolutional layer, … …, and a pooling layer, wherein the format of the input layer is (number of samples, image height, image width, number of image channels), the format of the weights corresponding to the convolutional layers is (length of convolution kernel, width of convolution kernel, number of input channels, number of convolution kernels), and the size of convolution kernel may be 1x1, 3x3, 5x5, and the like, which is not limited herein. The pooling layer is used for reducing the size of the model, improving the calculation speed and improving the robustness of the extracted features. For example, the pre-training model VGG16 is constructed based on a convolutional neural network to be able to process the image, i.e., the pre-training model VGG16 includes an input layer and at least one convolutional layer, and the pre-training model VGG16 is subjected to transfer learning to construct the target model, i.e., the target model is able to process the image and also includes the input layer and the at least one convolutional layer. 
For example, if the image to be processed has a size of 224x224 (pixels) and 3 bands and is input to the pre-training model VGG16, the format of the input layer of VGG16 is (None,224,224,3); correspondingly, the format of the input layer of the target model is (None,224,224,3). Likewise, the formats of the other layers of the target model correspond one-to-one with those of the pre-training model VGG16.
Step S30, acquiring the channel number of the current image to be processed, and configuring the channel number of the image in the input data of the input layer as the channel number of the image to be processed.
The terminal obtains the channel number of the current image to be processed, and configures the channel number of the image in the input data of the input layer as the channel number of the image to be processed. In this embodiment, after the target model is built according to the pre-training model, the number of image channels in the input data of the input layer of the target model is configured to be the number of channels of the image to be processed according to the number of channels of the current image to be processed.
For example, if the number of channels of the image to be processed is greater than the number of channels of the image in the input data of the input layer, the number of channels of the image in the input data of the input layer needs to be configured as the number of channels of the image to be processed, so that the input layer can process the image to be processed.
In some embodiments, if the number of channels of the image to be processed is smaller than the number of channels of the image in the input data of the input layer, the number of channels of the image in the input data of the input layer needs to be configured as the number of channels of the image to be processed, so that the input layer can process the image to be processed.
In other embodiments, if the number of channels of the image to be processed is equal to the number of image channels in the input data of the input layer, the number of image channels in the input data of the input layer is not modified.
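The three cases above reduce to one rule: the configured value is always the channel count of the image to be processed. A trivial sketch (the function name is illustrative):

```python
def configure_input_channels(layer_channels, image_channels):
    # Greater, smaller, or equal: the result equals the image's channel
    # count (when they are already equal, nothing needs to change).
    return image_channels if image_channels != layer_channels else layer_channels

print(configure_input_channels(3, 8))   # 8-band image into a 3-channel input layer
```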
Step S40, taking the convolutional layer closest to the input layer as the target convolutional layer, and inputting the configured input data of the input layer to the target convolutional layer;
The terminal takes the convolutional layer closest to the input layer as the target convolutional layer and inputs the configured input data of the input layer to the target convolutional layer. In this embodiment, the target model includes an input layer and at least one convolutional layer; the input layer receives and processes the image to be processed and passes the result to a convolutional layer, which in turn receives and processes it.
In the present embodiment, the convolutional layer closest to the input layer is taken as the target convolutional layer, and the target convolutional layer in the target model is expressed mathematically by the first weight. The target convolutional layer is modified mathematically so that images with multiple channels can be processed.
For example, if the size of the image to be processed is 224x224 (pixels) and it has 8 bands, the format of the input data is (None,224,224,8); the input data (None,224,224,8) is input to the target convolutional layer, which receives it and outputs data in the format (None,224,224,64).
Step S50, configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
The terminal configures, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer in the target model as the number of channels of the image to be processed. In this embodiment, the format of the first weight corresponding to the target convolutional layer is (convolution kernel height, convolution kernel width, number of input channels, number of convolution kernels); for example, if the image to be processed has 8 bands, the format of the first weight is (3,3,8,64).
When the target model is applied to multi-channel image processing, the input layer and the weights of the target model are reconfigured based on the number of channels of the image to be processed. For example, with an image of size 224x224 (pixels) and 8 bands:
Layer type                 | Input data        | Output data       | First weight
Input layer                | (None,224,224,8)  | (None,224,224,8)  | -
Target convolutional layer | (None,224,224,8)  | (None,224,224,64) | (3,3,8,64)
Second convolutional layer | (None,224,224,64) | (None,224,224,64) | (3,3,64,64)
Pooling layer              | (None,224,224,64) | (None,112,112,64) | -
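Steps S30 to S50 can be sketched directly on the 4-D weight tensor, under the assumption that the cut-and-splice of the claims amounts to repeating the pretrained weight along its input-channel axis and applying the preset scaling value (sizes as in the tables above; this is an illustrative reading, not the patent's verbatim procedure):

```python
import numpy as np

rng = np.random.default_rng(2)
second_weight = rng.normal(size=(3, 3, 3, 64))   # pretrained target-layer weight
new_channels = 8                                 # channels of the image to be processed
old_channels = second_weight.shape[2]

# Repeat slices along the input-channel axis until it matches the new
# channel count, then scale by old/new (the preset scaling value).
idx = [i % old_channels for i in range(new_channels)]
first_weight = second_weight[:, :, idx, :] * (old_channels / new_channels)
print(first_weight.shape)   # (3, 3, 8, 64)
```

The resulting tensor plugs into the reconfigured target convolutional layer of the table above, accepting (None,224,224,8) input and producing (None,224,224,64) output.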
In this embodiment, a pre-training model based on a convolutional neural network is acquired; transfer learning is performed on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer; the number of channels of the current image to be processed is acquired, and the number of image channels in the input data of the input layer is configured as the number of channels of the image to be processed; the convolutional layer closest to the input layer is taken as the target convolutional layer, and the configured input data of the input layer is input to the target convolutional layer; and the number of input channels in the first weight corresponding to the target convolutional layer is configured, according to the pre-training model, as the number of channels of the image to be processed. A target model capable of processing the image to be processed can thus be constructed from the pre-training model through transfer learning, with the input data and the number of input channels flexibly configured according to the number of channels of the image to be processed. This avoids the feature loss caused by first reducing the dimension of the image channels and then performing transfer learning on the pre-training model, ensures the completeness of the channel data of the input image, and allows the target model to adaptively adjust to and process images with different numbers of channels, making image processing more flexible and more targeted and the processing results more accurate.
Further, in step S20 of the first embodiment, the step of performing transfer learning on the pre-training model to construct a target model, wherein the target model includes an input layer and at least one convolutional layer, includes:
Step S21, constructing the target model according to the pre-training model, wherein the weights corresponding to the convolutional layers other than the target convolutional layer in the pre-training model are correspondingly configured as the weights of the corresponding convolutional layers of the target model.
In this embodiment, the weights corresponding to the convolutional layers other than the target convolutional layer in the pre-training model are correspondingly configured as the weights of the corresponding convolutional layers of the target model. For example, if the format of the weight corresponding to the second convolutional layer in the pre-training model is (3,3,64,64), the format of the weight corresponding to the second convolutional layer in the target model is likewise (3,3,64,64).
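The layer-wise weight copying of step S21 can be sketched as follows. This is an illustrative reading of the embodiment, not code from the patent; the dictionary representation, the layer name "conv1" for the target convolutional layer, and the use of NumPy are all assumptions.

```python
import numpy as np

def build_target_weights(pretrained, target_layer="conv1"):
    """Copy every convolutional weight of the pre-training model to the
    target model, except the target convolutional layer, whose input-channel
    dimension is reconfigured separately (steps S51-S54)."""
    return {name: w.copy() for name, w in pretrained.items() if name != target_layer}

# Pre-trained weights in (kH, kW, in_channels, out_channels) format,
# matching the patent's (3,3,64,64) example for the second convolutional layer.
pretrained = {
    "conv1": np.zeros((3, 3, 3, 64)),
    "conv2": np.zeros((3, 3, 64, 64)),
}
target = build_target_weights(pretrained)
print(sorted(target), target["conv2"].shape)
# ['conv2'] (3, 3, 64, 64)
```

All non-target layers keep their pre-trained weight formats unchanged, which is why only the first convolutional layer needs the channel reconfiguration described below.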
Further, in step S50 of the first embodiment, configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed includes:
step S51, acquiring a second weight corresponding to the target convolution layer in the pre-training model;
Step S52, constructing a first matrix according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of row vectors, and splicing the plurality of row vectors into a second matrix, wherein the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed;
Step S53, or constructing a first matrix according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of column vectors, and splicing the plurality of column vectors into a second matrix, wherein the number of column vectors of the second matrix corresponds to the number of channels of the image to be processed;
Step S54, mapping the constructed second matrix to the first weight.
In this embodiment, the second weight corresponding to the target convolutional layer in the pre-training model is acquired, and a first matrix is constructed according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; the first matrix is cut into a plurality of row vectors, and the row vectors are spliced into a second matrix, wherein the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed. Alternatively, a first matrix is constructed according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; the first matrix is cut into a plurality of column vectors, and the column vectors are spliced into a second matrix, wherein the number of column vectors of the second matrix corresponds to the number of channels of the image to be processed. The constructed second matrix is then mapped to the first weight.
Specifically, if the format of the second weight corresponding to the target convolutional layer in the pre-training model is (3,3,3,64), the first matrix is constructed according to the second weight: since the number of input channels of the second weight is 3, the first matrix has 3 row vectors; for example, the first matrix is expressed as: [(3,3,1,64),(3,3,2,64),(3,3,3,64)]^T.
In this embodiment, if the number of channels of the image to be processed is 8 and the number of row vectors of the first matrix in the pre-training model is 3, the row vectors of the first matrix are spliced cyclically into a second matrix having 8 row vectors, where the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed; for example, the second matrix is expressed as: [(3,3,1,64),(3,3,2,64),(3,3,3,64),(3,3,1,64),(3,3,2,64),(3,3,3,64),(3,3,1,64),(3,3,2,64)]^T. The constructed second matrix is mapped to the first weight, i.e., the second matrix corresponds to the first weight.
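The cutting and cyclic splicing of the input-channel dimension in steps S51-S54 can be expressed with NumPy as a minimal sketch. The function name and array layout are assumptions for illustration; the patent describes the operation only at the matrix level.

```python
import numpy as np

def expand_input_channels(weight, new_in_channels):
    """Tile the per-input-channel slices of a (kH, kW, in, out) convolution
    weight cyclically until the new channel count is reached, as in splicing
    the row vectors [(3,3,1,64),(3,3,2,64),(3,3,3,64)]^T into 8 rows."""
    k_h, k_w, old_in, out = weight.shape
    # Cut the first matrix into per-channel row vectors.
    rows = [weight[:, :, c, :] for c in range(old_in)]
    # Splice the row vectors cyclically into the second matrix.
    spliced = [rows[c % old_in] for c in range(new_in_channels)]
    return np.stack(spliced, axis=2)

second_weight = np.random.rand(3, 3, 3, 64)   # format (3,3,3,64), as in the example
first_weight = expand_input_channels(second_weight, 8)
print(first_weight.shape)  # (3, 3, 8, 64)
```

Channel 4 of the expanded weight repeats channel 1 of the pre-trained weight, channel 5 repeats channel 2, and so on, so no pre-trained feature is discarded.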
Similarly, since the row vectors and the column vectors of a matrix obey the same operation rules and are equivalent in expressive effect, the first matrix may instead be constructed according to the second weight such that the number of its column vectors corresponds to the number of input channels in the second weight; the first matrix is then cut into a plurality of column vectors, which are spliced into a second matrix whose number of column vectors corresponds to the number of channels of the image to be processed.
In this embodiment, if the number of channels of the image to be processed is 3 and the number of row vectors of the first matrix in the pre-training model is 8, the first matrix is cut into row vectors and a second matrix having 3 row vectors is constructed from them, where the number of row vectors of the second matrix corresponds to the number of channels of the image to be processed; for example, the second matrix is expressed as: [(3,3,1,64),(3,3,2,64),(3,3,3,64)]^T. The constructed second matrix is mapped to the first weight, i.e., the second matrix corresponds to the first weight.
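The reduction case — fewer image channels than pre-trained input channels — can be sketched in the same style. Keeping the first slices is one illustrative reading of the embodiment; the function name is an assumption.

```python
import numpy as np

def reduce_input_channels(weight, new_in_channels):
    """Keep only the first new_in_channels input-channel slices of a
    (kH, kW, in, out) convolution weight: the reduction counterpart of
    the cyclic splicing used for channel expansion."""
    return weight[:, :, :new_in_channels, :]

second_weight = np.random.rand(3, 3, 8, 64)   # pre-trained on 8-channel input
first_weight = reduce_input_channels(second_weight, 3)
print(first_weight.shape)  # (3, 3, 3, 64)
```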
Further, in step S54, the step of mapping the constructed second matrix to the first weight includes:
step S541, scale the second matrix according to a preset scaling value, and map the scaled second matrix to the first weight.
The terminal scales the second matrix according to a preset scaling value and maps the scaled second matrix to the first weight. Optionally, the preset scaling value is the ratio of the number of input channels in the second weight to the number of input channels in the first weight. For example, if the number of input channels in the first weight is 8 and the number of input channels in the second weight is 3, the preset scaling value is 3/8, and the second matrix is multiplied by 3/8, thereby scaling the second matrix.
The scaled second matrix is used to correspond to the first weight.
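The scaling of step S541 can be sketched as follows. The function name is an assumption; the ratio follows the example above (3 pre-trained input channels, 8 new input channels), so that the convolution response keeps roughly the same magnitude after channel expansion.

```python
import numpy as np

def scale_expanded_weight(second_weight, first_weight):
    """Scale the spliced matrix by the preset scaling value: the ratio of
    the input-channel count of the second weight to that of the first."""
    old_in = second_weight.shape[2]   # e.g. 3 (pre-trained input channels)
    new_in = first_weight.shape[2]    # e.g. 8 (channels of the image to be processed)
    scale = old_in / new_in           # preset scaling value 3/8
    return first_weight * scale

second = np.ones((3, 3, 3, 64))
first = np.ones((3, 3, 8, 64))       # spliced second matrix, mapped to the first weight
scaled = scale_expanded_weight(second, first)
print(scaled[0, 0, 0, 0])  # 0.375
```

Without this scaling, summing over 8 tiled input channels instead of 3 would inflate the layer's output by roughly 8/3.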
Further, the step of performing transfer learning on the pre-training model to construct an object model, wherein the object model comprises an input layer and at least one convolutional layer is preceded by the steps of:
step A, acquiring an image to be processed, and processing and analyzing the image to be processed to obtain the number of channels of the image to be processed.
The terminal may acquire a multi-channel image (e.g., an image having 8 channels), and process and analyze the multi-channel image to obtain its number of channels. For example, the Gaofen-6 (high-resolution No. 6) satellite can acquire images of 8 bands, i.e., images of 8 channels; an RGB image has 3 bands, i.e., 3 channels; and a grayscale map has 1 band, i.e., 1 channel.
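Once the image is loaded, the channel-count analysis of step A amounts to a simple shape inspection; a sketch follows (the NumPy array layout is an assumption, not part of the patent):

```python
import numpy as np

def channel_count(image):
    """Return the number of bands/channels of an image array:
    a (H, W) grayscale map has 1 channel, a (H, W, C) image has C."""
    return 1 if image.ndim == 2 else image.shape[-1]

grayscale = np.zeros((64, 64))        # 1-band grayscale map
rgb = np.zeros((64, 64, 3))           # 3-band RGB image
satellite = np.zeros((64, 64, 8))     # 8-band satellite image
print(channel_count(grayscale), channel_count(rgb), channel_count(satellite))
# 1 3 8
```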
In a third embodiment of the method for constructing an image processing model according to the present invention, after the step of configuring, according to the pre-training model, the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed, the method further includes:
step S55, acquiring an image to be processed;
and step S56, inputting the image to be processed as a training image into the target model for inspection and adjustment, and outputting the inspection and adjusted target model.
The terminal can acquire a plurality of images to be processed, input them in turn as training images into the target model for checking and adjustment, and output the checked and adjusted target model. In this embodiment, the accuracy of the newly constructed target model is checked by inputting training images, and the model keeps learning and updating, thereby completing the checking and adjustment. A target model for processing the image to be processed is thus constructed on the basis of the pre-training model, completing the transfer learning process, so that the target model can process the image to be processed in a targeted manner, making the image processing flow more flexible and the processing results more accurate.
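The checking and adjustment of steps S55-S56 amounts to ordinary fine-tuning of the reconfigured model on the multi-channel training images. A minimal framework-free sketch follows; every name is an assumption, and the toy mistake-driven update stands in for whatever optimizer the implementation actually uses.

```python
def check_and_adjust(model_forward, model_update, images, labels, epochs=3):
    """Feed training images to the target model, check its accuracy, and
    keep updating the weights; return the per-epoch accuracy history."""
    history = []
    for _ in range(epochs):
        correct = 0
        for x, y in zip(images, labels):
            pred = model_forward(x)
            if pred == y:
                correct += 1
            else:
                model_update(x, y)   # adjust weights on mistakes
        history.append(correct / len(images))
    return history

# Toy stand-in for the target model: a threshold classifier on channel sums
# of 8-channel pixel vectors (purely illustrative).
state = {"threshold": 10.0}
forward = lambda x: int(sum(x) > state["threshold"])
update = lambda x, y: state.update(threshold=state["threshold"] + (-1 if y else 1))
acc = check_and_adjust(forward, update, [[1] * 8, [3] * 8], [0, 1])
print(len(acc))  # one accuracy value per epoch → 3
```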
In an embodiment, as shown in fig. 3, fig. 3 is a schematic diagram of a framework structure of an embodiment of an apparatus for constructing an image processing model according to the present invention, including: the device comprises a model acquisition module, a construction module, a channel number acquisition module, an input module and a configuration module, wherein:
the model acquisition module is used for acquiring a pre-training model based on a convolutional neural network;
the building module is used for carrying out transfer learning on the pre-training model to build a target model and configuring the number of image channels of an input layer of the target model as the number of channels of an image to be processed;
the channel number acquisition module is used for acquiring the channel number of the current image to be processed and configuring the image channel number in the input data of the input layer as the channel number of the image to be processed;
an input module, configured to input the input data after configuration of the input layer to a target convolutional layer, with a convolutional layer closest to the input layer as the target convolutional layer;
and the configuration module is used for configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
For the specific definition of the image processing model construction apparatus, reference may be made to the above definition of the image processing model construction method, which is not repeated here. The modules in the image processing model construction apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. Each module may be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
Furthermore, an embodiment of the present invention further provides a readable storage medium (i.e., a computer-readable memory) having a construction program of an image processing model stored thereon, where the construction program of the image processing model, when executed by a processor, implements the following operations:
acquiring a pre-training model based on a convolutional neural network;
performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer;
acquiring the number of channels of a current image to be processed, and configuring the number of image channels in input data of the input layer as the number of channels of the image to be processed;
taking the convolution layer closest to the input layer as a target convolution layer, and inputting the input data after the input layer is configured to the target convolution layer;
and configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
Further, acquiring the second weight corresponding to the target convolution layer in the pre-training model;
mapping a first matrix according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of row vectors, and splicing the row vectors into a second matrix according to the row vectors, wherein the number of the row vectors of the second matrix corresponds to the number of channels of the image to be processed;
or mapping a first matrix according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of column vectors, and splicing the column vectors into a second matrix according to the plurality of column vectors, wherein the number of the column vectors of the second matrix corresponds to the number of channels of the image to be processed;
mapping the constructed second matrix to the first weight.
Further, the second matrix is scaled according to the preset scaling value, and the scaled second matrix is mapped to the first weight.
Further, the preset scaling value is a ratio of the number of input channels in the second weight to the number of input channels in the first weight.
Further, the target model is constructed according to the pre-training model, wherein the weights corresponding to the other convolutional layers except the target convolutional layer in the pre-training model are correspondingly configured to the weights corresponding to the convolutional layers of the target model.
Further, the image processing model building program when executed by the processor further implements the following operations:
and acquiring the image to be processed, and processing and analyzing the image to be processed to obtain the number of channels of the image to be processed.
Further, the image processing model building program when executed by the processor further implements the following operations:
acquiring the image to be processed;
and inputting the image to be processed into the target model as a training image to perform inspection adjustment, and outputting the inspected and adjusted target model.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method of constructing an image processing model, the method comprising:
acquiring a pre-training model based on a convolutional neural network;
performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer;
acquiring the number of channels of a current image to be processed, and configuring the number of image channels in input data of the input layer as the number of channels of the image to be processed;
taking the convolution layer closest to the input layer as a target convolution layer, and inputting the input data configured by the input layer into the target convolution layer;
and configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
2. The method for constructing an image processing model according to claim 1, wherein the step of configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-trained model comprises:
acquiring a second weight corresponding to a target convolution layer in the pre-training model;
mapping a first matrix according to the second weight, wherein the number of row vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of row vectors, and splicing the row vectors into a second matrix according to the row vectors, wherein the number of the row vectors of the second matrix corresponds to the number of channels of the image to be processed;
or mapping a first matrix according to the second weight, wherein the number of column vectors of the first matrix corresponds to the number of input channels in the second weight; cutting the first matrix into a plurality of column vectors, and splicing the column vectors into a second matrix according to the plurality of column vectors, wherein the number of the column vectors of the second matrix corresponds to the number of channels of the image to be processed;
mapping the constructed second matrix to the first weight.
3. The method of constructing an image processing model of claim 2, wherein the step of mapping the constructed second matrix to the first weight comprises:
scaling the second matrix according to a preset scaling value, and mapping the scaled second matrix to the first weight.
4. The method of claim 3, wherein the predetermined scaling value is a ratio of the number of input channels in the second weight to the number of input channels in the first weight.
5. The method for constructing an image processing model according to claim 1, wherein the step of performing transfer learning on the pre-training model to construct a target model, wherein the target model includes an input layer and at least one convolutional layer, comprises:
and constructing the target model according to the pre-training model, wherein the weights corresponding to other convolutional layers except the target convolutional layer in the pre-training model are correspondingly configured to the weights corresponding to the convolutional layers of the target model.
6. The method for constructing an image processing model according to claim 1, wherein the step of performing transfer learning on the pre-training model to construct a target model, wherein the target model comprises an input layer and at least one convolutional layer, is preceded by:
and acquiring the image to be processed, and processing and analyzing the image to be processed to obtain the number of channels of the image to be processed.
7. The method for constructing an image processing model according to claim 1, wherein after the step of configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-trained model, the method comprises:
acquiring the image to be processed;
and inputting the image to be processed into the target model as a training image to perform inspection adjustment, and outputting the inspected and adjusted target model.
8. An apparatus for constructing an image processing model, the apparatus comprising:
the model acquisition module is used for acquiring a pre-training model based on a convolutional neural network;
the building module is used for carrying out transfer learning on the pre-training model to build a target model and configuring the number of image channels of an input layer of the target model as the number of channels of an image to be processed;
the channel number acquisition module is used for acquiring the channel number of the current image to be processed and configuring the image channel number in the input data of the input layer as the channel number of the image to be processed;
an input module, configured to input the input data after configuration of the input layer to a target convolutional layer, with a convolutional layer closest to the input layer as the target convolutional layer;
and the configuration module is used for configuring the number of input channels in the first weight corresponding to the target convolutional layer as the number of channels of the image to be processed according to the pre-training model.
9. A terminal, characterized in that the terminal comprises: a memory, a processor, and a construction program of an image processing model stored on the memory and executable on the processor, wherein the construction program of the image processing model, when executed by the processor, implements the steps of the method of constructing an image processing model according to any one of claims 1 to 7.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of constructing an image processing model according to any one of claims 1 to 7.
CN201911298501.1A 2019-12-12 2019-12-12 Image processing model construction method, device, terminal and readable storage medium Active CN111027635B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911298501.1A CN111027635B (en) 2019-12-12 2019-12-12 Image processing model construction method, device, terminal and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911298501.1A CN111027635B (en) 2019-12-12 2019-12-12 Image processing model construction method, device, terminal and readable storage medium

Publications (2)

Publication Number Publication Date
CN111027635A true CN111027635A (en) 2020-04-17
CN111027635B CN111027635B (en) 2023-10-31

Family

ID=70209281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911298501.1A Active CN111027635B (en) 2019-12-12 2019-12-12 Image processing model construction method, device, terminal and readable storage medium

Country Status (1)

Country Link
CN (1) CN111027635B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112766277A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN112766276A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN113408571A (en) * 2021-05-08 2021-09-17 浙江智慧视频安防创新中心有限公司 Image classification method and device based on model distillation, storage medium and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239802A (en) * 2017-06-28 2017-10-10 广东工业大学 A kind of image classification method and device
CN108259997A (en) * 2018-04-02 2018-07-06 腾讯科技(深圳)有限公司 Image correlation process method and device, intelligent terminal, server, storage medium
CN109241817A (en) * 2018-07-02 2019-01-18 广东工业大学 A kind of crops image-recognizing method of unmanned plane shooting
CN109740534A (en) * 2018-12-29 2019-05-10 北京旷视科技有限公司 Image processing method, device and processing equipment
CN110222816A (en) * 2019-04-29 2019-09-10 北京迈格威科技有限公司 Method for building up, image processing method and the device of deep learning model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239802A (en) * 2017-06-28 2017-10-10 广东工业大学 A kind of image classification method and device
CN108259997A (en) * 2018-04-02 2018-07-06 腾讯科技(深圳)有限公司 Image correlation process method and device, intelligent terminal, server, storage medium
CN109241817A (en) * 2018-07-02 2019-01-18 广东工业大学 A kind of crops image-recognizing method of unmanned plane shooting
CN109740534A (en) * 2018-12-29 2019-05-10 北京旷视科技有限公司 Image processing method, device and processing equipment
CN110222816A (en) * 2019-04-29 2019-09-10 北京迈格威科技有限公司 Method for building up, image processing method and the device of deep learning model

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112766277A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN112766276A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN113408571A (en) * 2021-05-08 2021-09-17 浙江智慧视频安防创新中心有限公司 Image classification method and device based on model distillation, storage medium and terminal
CN113408571B (en) * 2021-05-08 2022-07-19 浙江智慧视频安防创新中心有限公司 Image classification method and device based on model distillation, storage medium and terminal

Also Published As

Publication number Publication date
CN111027635B (en) 2023-10-31

Similar Documents

Publication Publication Date Title
CN111179253B (en) Product defect detection method, device and system
US20210034909A1 (en) Spatial transformer modules
US11710293B2 (en) Target detection method and apparatus, computer-readable storage medium, and computer device
CN110555795B (en) High resolution style migration
CN111027635A (en) Image processing model construction method and device, terminal and readable storage medium
EP3526765B1 (en) Iterative multiscale image generation using neural networks
CN110826632B (en) Image change detection method, device, equipment and computer readable storage medium
CN109902763B (en) Method and device for generating feature map
CN109948699B (en) Method and device for generating feature map
CN110245747B (en) Image processing method and device based on full convolution neural network
CN112560980A (en) Training method and device of target detection model and terminal equipment
CN110738235A (en) Pulmonary tuberculosis determination method, pulmonary tuberculosis determination device, computer device, and storage medium
CN110827301B (en) Method and apparatus for processing image
CN112528318A (en) Image desensitization method and device and electronic equipment
CN113239925A (en) Text detection model training method, text detection method, device and equipment
CN110852385A (en) Image processing method, device, equipment and storage medium
CN113516697A (en) Image registration method and device, electronic equipment and computer-readable storage medium
CN110210314B (en) Face detection method, device, computer equipment and storage medium
CN115984856A (en) Training method of document image correction model and document image correction method
CN115471703A (en) Two-dimensional code detection method, model training method, device, equipment and storage medium
CN111815628B (en) Display panel defect detection method, device, equipment and readable storage medium
CN115375715A (en) Target extraction method and device, electronic equipment and storage medium
US20230016455A1 (en) Decomposing a deconvolution into multiple convolutions
WO2023220891A1 (en) Resolution-switchable segmentation networks
JP2019125128A (en) Information processing device, control method and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant