WO2023272431A1 - Image processing method and apparatus - Google Patents

Image processing method and apparatus

Info

Publication number
WO2023272431A1
WO2023272431A1 · PCT/CN2021/102739
Authority
WO
WIPO (PCT)
Prior art keywords
image processing
module
task model
image
visual task
Prior art date
Application number
PCT/CN2021/102739
Other languages
French (fr)
Chinese (zh)
Inventor
伍玮翔
伍文龙
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2021/102739 priority Critical patent/WO2023272431A1/en
Priority to CN202180099442.4A priority patent/CN117529725A/en
Publication of WO2023272431A1 publication Critical patent/WO2023272431A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • the present application relates to the field of computer vision, and more specifically, to an image processing method and device.
  • Computer vision is an integral part of various intelligent/autonomous systems in application fields such as manufacturing, inspection, document analysis, medical diagnosis, and military. It studies how to obtain the data and information of the photographed subject that we need. Figuratively speaking, it means installing eyes (cameras/video cameras) and a brain (algorithms) on a computer so that the computer, in place of human eyes, can identify, track, and measure targets and thereby perceive the environment.
  • Computer vision can be seen as the science of how to make artificial systems "perceive" from images or multidimensional data. In general, computer vision uses various imaging systems in place of the visual organs to obtain input information, and then uses the computer in place of the brain to complete the processing and interpretation of this input information.
  • Computer vision tasks include tasks such as image classification, object detection, object tracking, and object segmentation.
  • A raw image acquired by a sensor typically undergoes a series of image signal processing (ISP) operations to produce a visualized image; this visualized image can be used as an input image for computer vision tasks.
  • the purpose of ISP is usually to meet human visual needs.
  • the image obtained after a series of image signal processing can meet human visual needs, but performing visual tasks based on the image may not necessarily obtain ideal processing results.
  • the present application provides an image processing method and device, which can obtain an image processing flow suitable for a visual task and improve the performance of a visual task model.
  • an image processing method comprising: acquiring a first image; processing the first image through at least one image processing module to obtain a second image; inputting the second image into a visual task model for processing; and adjusting the at least one image processing module according to the processing results of the visual task model.
  • the image processing flow is adjusted according to the processing result of the visual task model, which is beneficial to obtain an image suitable for the visual task, so as to ensure the performance of the visual task model.
  • the solutions of the embodiments of the present application can adjust the image processing flow according to the requirements of different application scenarios, so as to adapt to different application scenarios.
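  • For illustration only, a minimal sketch of this adjustment loop is given below; the `modules`, `vision_model`, and `adjust` interfaces are hypothetical placeholders, not the claimed implementation.

```python
# Hedged sketch of the first-aspect flow: ISP processing -> vision task inference -> adjustment.
def adjustment_step(first_image, modules, vision_model, adjust):
    second_image = first_image
    for module in modules:                   # at least one image processing module
        second_image = module(second_image)  # e.g. black level compensation, demosaic, ...
    result = vision_model(second_image)      # processing result, e.g. loss value or accuracy
    adjust(modules, result)                  # adjust weights/parameters/combination/order
    return result
```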
  • the first image may be a raw image acquired by a sensor.
  • the image processing module is used for image signal processing on the input image.
  • the second image may be an RGB image.
  • processing the first image through at least one image processing module to obtain the second image includes: processing the first image through the at least one image processing module and the weight of the at least one image processing module to obtain the second image.
  • the processing result of the at least one image processing module is adjusted according to the weight of the at least one image processing module to obtain the second image.
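  • As a hedged illustration of this weight-based processing (assuming, as described in the detailed embodiments, that the weight scales the pixel-value change introduced by a module):

```python
import numpy as np

def apply_weighted_module(image: np.ndarray, module, weight: float) -> np.ndarray:
    delta = module(image) - image   # pixel-value change produced by the image processing module
    return image + weight * delta   # a weight of 0 means the module does not participate
```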
  • the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
  • the visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
  • the vision task model can be a trained model.
  • the processing results of the vision task model may include performance indicators of the vision task model.
  • the performance index of the vision task model includes the accuracy of reasoning or the value of the loss function.
  • the loss function can be set as needed.
  • the loss function is used to indicate the difference between the inference result of the vision task model and the true value corresponding to the first image. It should be noted that the loss function here may be the loss function in the training process of the vision task model, or other forms of loss functions may also be used.
  • the processing result of the vision task model may include detection accuracy.
  • the processing result of the vision task model may include segmentation accuracy.
  • the visual task model may use a neural network model, or may also use a non-neural network model.
  • the at least one image processing module is adjusted according to the processing result of the visual task model, so that the processing result of the visual task model is as close to expectation as possible.
  • the at least one image adjustment module may be adjusted by means of a Bayesian optimization method, an RNN model, or a reinforcement learning algorithm.
  • adjusting at least one image processing module according to the processing result of the visual task model includes: adjusting the at least one image processing module according to the time of image processing and the processing result of the visual task model.
  • the image processing time may be the processing time of the visual task model, or the processing time of the at least one image processing module, or the sum of the processing time of the visual task model and the processing time of the at least one image processing module.
  • the processing speed can be improved and the time delay can be reduced under the premise of ensuring the performance of the visual task model.
  • At least one image processing module includes multiple image processing modules, and adjusting the at least one image processing module according to the processing results of the visual task model includes: changing the at least one image processing module.
  • Changing the at least one image processing module may include: deleting some image processing modules in the at least one image processing module or/and adding other image processing modules.
  • the combination of image processing modules is changed according to the processing results of the visual task model, so that a combination of image processing modules more suitable for the visual task model can be obtained, which is conducive to improving the performance of the visual task model.
  • At least one image processing module includes multiple image processing modules, and adjusting the at least one image processing module according to the processing results of the visual task model includes: deleting some image processing modules among the plurality of image processing modules according to the processing results of the visual task model.
  • some image processing modules are deleted according to the processing results of the visual task model, which can reduce the time required for image processing, increase the processing speed, and reduce the requirement for computing power.
  • deleting some of the image processing modules among the multiple image processing modules according to the processing results of the visual task model includes: adjusting the weights of the multiple image processing modules according to the processing results of the visual task model, where the weights of the multiple image processing modules are used to process the processing results of the multiple image processing modules to obtain the second image; and deleting some of the multiple image processing modules according to the adjusted weights of the multiple image processing modules.
  • the image processing modules to be deleted are determined according to the weight of each image processing module; deleting the image processing modules with relatively small weight values has little impact on the processing result of the visual task model, so the performance of the visual task model is only slightly affected after deletion. That is to say, the solutions of the embodiments of the present application can reduce unnecessary operations, reduce computing overhead, and improve processing speed on the premise of ensuring the performance of the visual task model.
  • the multiple image processing modules are m image processing modules.
  • m is an integer greater than 1.
  • the n image processing modules with the smallest adjusted weights are deleted from the m image processing modules.
  • n is an integer greater than 1 and less than m.
  • an image processing module whose adjusted weight is less than or equal to a weight threshold is deleted from the m image processing modules.
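  • A hedged sketch of both deletion rules described above (the n smallest weights, or weights at or below a threshold); the parallel module/weight lists are an assumed data layout for illustration.

```python
def prune_smallest_n(modules, weights, n):
    keep = sorted(range(len(modules)), key=lambda i: weights[i])[n:]  # drop the n smallest weights
    keep.sort()                                                       # preserve the original order
    return [modules[i] for i in keep], [weights[i] for i in keep]

def prune_below_threshold(modules, weights, weight_threshold):
    kept = [(m, w) for m, w in zip(modules, weights) if w > weight_threshold]
    return [m for m, _ in kept], [w for _, w in kept]
```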
  • adjusting at least one image processing module according to a processing result of the visual task model includes: adjusting parameters in at least one image processing module according to a processing result of the visual task model.
  • adjusting at least one image processing module according to the processing result of the visual task model includes: deleting part of the image processing modules from the multiple image processing modules according to the processing result of the visual task model; processing a fifth image through the undeleted image processing modules among the plurality of image processing modules to obtain a sixth image, and inputting the sixth image into the visual task model for processing; and adjusting the parameters of the undeleted image processing modules according to the processing result of the visual task model.
  • the performance indicators obtained from the visual task model are used to adjust the weights of the multiple image processing modules, so as to keep the image processing modules that have a relatively large impact on the performance indicators of the visual task model, or in other words, the image processing modules that maintain or improve the performance indicators of the vision task model.
  • In this way, the image processing modules suitable for the visual task model, that is, the image processing modules required by the visual task model, can be obtained, which reduces the time required for image processing, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
  • using the performance indicators obtained from the visual task model to adjust the parameters in the retained image processing modules, for example, using these performance indicators to search the design space of the image processing modules, is conducive to obtaining the optimal parameter configuration of each image processing module and improving the performance of the vision task model, as sketched below.
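  • A hedged two-stage sketch of this prune-then-tune idea; `evaluate` is a hypothetical helper that runs the retained modules plus the visual task model and returns a performance indicator.

```python
def prune_then_tune(modules, weights, weight_threshold, parameter_candidates, evaluate):
    kept = [m for m, w in zip(modules, weights) if w > weight_threshold]  # delete low-weight modules
    best_params, best_score = None, float("-inf")
    for params in parameter_candidates:       # e.g. a grid over the design space of each module
        score = evaluate(kept, params)        # performance indicator of the vision task model
        if score > best_score:
            best_params, best_score = params, score
    return kept, best_params
```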
  • adjusting the at least one image processing module according to the processing result of the visual task model includes: adjusting the processing sequence of the at least one image processing module according to the processing result of the visual task model.
  • the processing sequence of the image processing module is adjusted according to the processing result of the visual task model, so that an image processing flow more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
  • At least one image processing module includes: a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, color correction module, gamma correction module or noise reduction and sharpening module.
  • Any image processing module in the at least one image processing module may be implemented by a neural network algorithm, or may also be implemented by a non-neural network algorithm.
  • an image processing method comprising: acquiring a third image; determining at least one target image processing module according to a visual task model; processing the third image through the at least one target image processing module to obtain a fourth image; and processing the fourth image through the visual task model to obtain a processing result of the fourth image.
  • different visual task models correspond to different image processing module configurations.
  • the image processing modules can adaptively match the visual task model, making the image processing flow more suitable for the visual task model, which is beneficial for improving the performance of the vision task model.
  • the third image may be a raw image acquired by the sensor.
  • the processing result of the fourth image can also be understood as the processing result of the third image.
  • the processing result of the fourth image is the reasoning result of the visual task model.
  • the at least one target image processing module is one or more image processing modules corresponding to the visual task model.
  • the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
  • the visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
  • the vision task model can be a trained model.
  • different visual task models may be used, and accordingly, at least one target image processing module matching the visual task model may be determined according to different visual task models. In this way, different image processing modules can be selected according to different application scenarios.
  • the configuration of the image processing module matching the current visual task model can be determined.
  • the configuration of the image processing modules includes at least one of the following: a combination of image processing modules, a weight of the image processing modules, a processing order of the image processing modules, or parameters in the image processing modules.
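  • For illustration, a per-model configuration lookup might look as follows; the model names, module names, and values are assumptions for demonstration, not defined by the embodiments.

```python
# Hypothetical mapping from vision task model to an image processing module configuration.
ISP_CONFIGS = {
    "object_detection_model": {
        "modules": ["black_level_compensation", "demosaic", "auto_white_balance"],
        "weights": [1.0, 0.8, 0.6],
        "order":   ["black_level_compensation", "demosaic", "auto_white_balance"],
        "params":  {"demosaic": {"method": "bilinear"}},
    },
    "image_classification_model": {
        "modules": ["demosaic", "gamma_correction"],
        "weights": [1.0, 0.5],
        "order":   ["demosaic", "gamma_correction"],
        "params":  {"gamma_correction": {"gamma": 2.2}},
    },
}

def select_target_modules(vision_task_model_name: str) -> dict:
    return ISP_CONFIGS[vision_task_model_name]
```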
  • determining at least one target image processing module according to the visual task model includes: determining at least one target image processing module from multiple candidate image processing modules according to the visual task model.
  • different visual task models correspond to different combinations of image processing modules.
  • the combination of image processing modules can adaptively match the visual task model, so that the current combination of image processing modules is more suitable for the current visual task model, which is beneficial for improving the performance of the visual task model.
  • the combination of image processing modules corresponding to the current visual task model can be determined, or in other words, the image processing module required for the visual task model can be determined according to the corresponding relationship, that is, the at least one target image processing module .
  • determining at least one target image processing module according to the visual task model includes: determining the weight of at least one target image processing module according to the visual task model, and at least one target image processing module The weights of are used to process the processing result of at least one target image processing module to obtain a fourth image.
  • different visual task models correspond to the weights of different image processing modules.
  • the weights of the image processing modules can adaptively match the visual task model, so that the current weights of the image processing modules are more suitable for the current visual task model, which is beneficial for improving the performance of the visual task model.
  • determining at least one target image processing module according to the visual task model includes: determining parameters in the at least one target image processing module according to the visual task model.
  • different visual task models correspond to different parameters in the image processing module.
  • the parameters in the image processing modules can adaptively match the visual task model, so that the current parameters in the modules are more suitable for the current vision task model, which is beneficial for improving the performance of the vision task model.
  • parameters in the image processing module corresponding to the visual task model can be determined, that is, parameters in the at least one target image processing module.
  • determining at least one target image processing module according to the visual task model includes: determining a processing order of the at least one target image processing module according to the visual task model.
  • different visual task models correspond to different processing sequences of image processing modules.
  • the processing sequence of the image processing modules can adaptively match the visual task model, so that the current processing order of the modules is more suitable for the current vision task model, which is beneficial for improving the performance of the vision task model.
  • the processing order of the image processing modules corresponding to the current visual task model can be determined, that is, the processing order of the at least one target image processing module.
  • At least one target image processing module includes: a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  • an image processing apparatus includes a module or unit for executing the method in any one of the above-mentioned first aspect and the first aspect.
  • an image processing device includes a module or unit for executing the method in any one of the above-mentioned second aspect and the second aspect.
  • an image processing device comprising: a memory for storing a program; and a processor for executing the program stored in the memory, where when the program stored in the memory is executed, the processor is used to execute the method in the first aspect and any one of the implementation manners of the first aspect.
  • the processor in the above fifth aspect can be either a central processing unit (CPU), or a combination of a CPU and a neural network computing processor, where the neural network computing processor can include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), etc.
  • TPU is an artificial intelligence accelerator ASIC fully customized by Google for machine learning.
  • an image processing device comprising: a memory for storing programs; and a processor for executing the programs stored in the memory, where when the programs stored in the memory are executed, the processor is configured to execute the method in the second aspect and any one of the implementation manners of the second aspect.
  • the processor in the above sixth aspect can be a central processing unit, or a combination of a CPU and a neural network computing processor, where the neural network computing processor can include a graphics processor, a neural network processor, a tensor processor, etc.
  • TPU is Google's fully customized artificial intelligence accelerator ASIC for machine learning.
  • a computer-readable storage medium, where the computer-readable medium stores program code for execution by a device, and the program code includes instructions for executing the method in any one of the implementation manners of the first aspect or the second aspect.
  • a computer program product containing instructions is provided, and when the computer program product is run on a computer, the computer is made to execute the method in any one of the above-mentioned first aspect or the second aspect.
  • the chip includes a processor and a data interface, and the processor reads the instructions stored in a memory through the data interface and executes the method in any one of the implementation manners of the above-mentioned first aspect or second aspect.
  • the chip may further include a memory, the memory stores instructions, the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
  • the aforementioned chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 1 is a schematic structural diagram of a system architecture provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an image processing flow provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image processing method provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of another image processing flow provided by the embodiment of the present application.
  • FIG. 5 is a schematic diagram of another image processing flow provided by the embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another image processing method provided by the embodiment of the present application.
  • FIG. 7 is a schematic block diagram of an image processing device provided by an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of another image processing apparatus provided by an embodiment of the present application.
  • the embodiments of the present application can be applied in fields such as automatic driving, image classification, image retrieval, image semantic segmentation, image quality enhancement, image super-resolution, monitoring, object tracking, object detection, etc. that need to perform visual tasks.
  • the method in the embodiment of the present application can be applied in picture classification and monitoring scenarios, and the following two application scenarios are briefly introduced respectively.
  • When a user stores a large number of pictures on a terminal device (for example, a mobile phone) or a cloud disk, identifying the images in the album makes it convenient for the user or the system to classify and manage the album, thereby improving user experience.
  • an image suitable for performing a classification task can be obtained, and the accuracy of classification can be improved.
  • it can reduce the image processing process, reduce hardware overhead, be more friendly to terminal equipment, increase the speed of classifying pictures, and help to label pictures of different categories in real time, which is convenient for users to view and find.
  • the classification tags of these pictures can also be provided to the album management system for classification management, which saves management time for users, improves the efficiency of album management, and improves user experience.
  • Surveillance scenarios include: smart city, field surveillance, indoor surveillance, outdoor surveillance, in-vehicle surveillance, etc.
  • multiple attribute recognition is required, such as pedestrian attribute recognition and riding attribute recognition.
  • Deep neural networks play an important role in multiple attribute recognition by virtue of their powerful capabilities.
  • an image suitable for performing an attribute recognition task can be obtained, and the accuracy of recognition can be improved.
  • the image processing flow can be reduced, the hardware overhead can be reduced, and the processing efficiency can be improved, which is conducive to real-time processing of the input road picture and faster recognition of different attribute information in the road picture.
  • a neural network can be composed of neural units, and a neural unit can refer to an operation unit that takes x_s and an intercept 1 as input; the output of the operation unit can be h_{W,b}(x) = f(W^T x) = f(Σ_{s=1}^{n} W_s·x_s + b), where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, b is the bias of the neural unit, and f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to transform the input signal in the neural unit into an output signal.
  • the output signal of this activation function can be used as the input of the next layer.
  • the activation function can be a ReLU, tanh or sigmoid function.
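  • A small worked example of the single-neuron operation above (illustrative only; the default ReLU activation is one of the options listed):

```python
import numpy as np

def neuron_output(x: np.ndarray, W: np.ndarray, b: float, f=lambda z: np.maximum(z, 0.0)):
    return f(np.dot(W, x) + b)   # f(sum_s W_s * x_s + b), default f is ReLU

# e.g. neuron_output(np.array([1.0, 2.0]), np.array([0.5, -0.25]), b=0.1) == 0.1
```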
  • a neural network is a network formed by connecting multiple above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
  • A deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers.
  • According to the positions of different layers, the layers inside a DNN can be divided into three categories: input layer, hidden layers, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in the middle are all hidden layers.
  • the layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
  • Although a DNN looks complicated, the work of each layer is actually not complicated.
  • In simple terms, each layer performs the following linear relationship expression: y = α(Wx + b), where x is the input vector, y is the output vector, b is the offset vector, W is the weight matrix (also called coefficients), and α() is the activation function.
  • Each layer simply performs this operation on an input vector x to obtain an output vector y; because a DNN has many layers, the number of coefficient matrices W and offset vectors b is also large.
  • These parameters are defined in the DNN as follows, taking the coefficient W as an example: in a three-layer DNN, the linear coefficient from the fourth neuron of the second layer to the second neuron of the third layer is defined as W^3_{24}, where the superscript 3 represents the layer number of the coefficient W, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4.
  • In summary, the coefficient from the kth neuron of layer L-1 to the jth neuron of layer L is defined as W^L_{jk}.
  • the input layer has no W parameter.
  • more hidden layers make the network more capable of describing complex situations in the real world. Theoretically speaking, a model with more parameters has a higher complexity and a greater "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vector W of many layers).
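  • A hedged sketch of stacking the per-layer operation y = α(Wx + b) described above (tanh is used here only as an example activation):

```python
import numpy as np

def dnn_forward(x: np.ndarray, layers, alpha=np.tanh) -> np.ndarray:
    # layers: list of (W, b) pairs, one per layer after the input layer
    y = x
    for W, b in layers:
        y = alpha(W @ y + b)   # each layer: weight matrix, offset vector, activation
    return y
```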
  • Convolutional neural network is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a subsampling layer, which can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • In the convolutional layer of a convolutional neural network, a neuron can be connected to only some of the neurons in adjacent layers.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units of the same feature plane share weights, and the shared weights here are convolution kernels.
  • Shared weights can be understood as a way to extract image information that is independent of location.
  • the convolution kernel can be formalized as a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network.
  • the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
  • Recurrent neural networks are used to process sequence data.
  • In a traditional neural network model, the layers from the input layer through the hidden layers to the output layer are fully connected, while the nodes within each layer are unconnected.
  • Although this ordinary neural network solves many problems, it is still powerless for many other problems. For example, to predict the next word in a sentence, the previous words are generally needed, because the preceding and following words in a sentence are not independent. An RNN is called a recurrent neural network because the current output of a sequence is also related to the previous outputs.
  • RNN can process sequence data of any length.
  • the training of RNN is the same as that of traditional CNN or DNN.
  • the error backpropagation algorithm is also used, but there is a difference: if the RNN is unfolded, the parameters, such as W, are shared across steps, which is not the case in the traditional neural network described above.
  • the output of each step depends not only on the network of the current step, but also on the states of the previous several steps of the network. This learning algorithm is called back propagation through time (BPTT).
  • The loss function, also referred to as the objective function, is used to measure the difference between the predicted value of the deep neural network and the desired target value.
  • the training of the deep neural network becomes a process of reducing the loss as much as possible.
  • the smaller the loss, the higher the training quality of the deep neural network; the larger the loss, the lower the training quality of the deep neural network.
  • the smaller the loss fluctuation, the more stable the training; the larger the loss fluctuation, the more unstable the training.
  • the embodiment of the present application provides a system architecture 100 .
  • the data collection device 170 is used to collect training data.
  • the training data may include training images and ground truths corresponding to the training images.
  • the vision task is an image classification task
  • the ground truth value corresponding to the training image may be the classification result corresponding to the training image
  • the classification result of the training image may be the result of manual pre-labeling.
  • After collecting the training data, the data collection device 170 stores the training data in the database 130, and the training device 120 obtains the target model/rule 101 based on the training data maintained in the database 130.
  • the target model/rule 101 is the model used by the vision task. For example, if the vision task is an image classification task, the target model/rule 101 may be a network model for image classification.
  • the training device 120 obtains the target model/rule 101 based on the training data.
  • the training device 120 processes the input raw data and compares the output value with the target value until the difference between the value output by the training device 120 and the target value is less than a certain threshold, thereby completing the training of the target model/rule 101.
  • the target model/rule 101 in the embodiment of the present application may specifically be a neural network model.
  • For example, the neural network model may be a convolutional neural network or a residual network.
  • the training data maintained in the database 130 may not all be collected by the data collection device 170, but may also be received from other devices.
  • the training device 120 does not necessarily perform the training of the target model/rule 101 entirely based on the training data maintained by the database 130; it is also possible to obtain training data from the cloud or other places for model training. The above description should not be construed as a limitation on the embodiments of the present application.
  • the target model/rule 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1, which can be a terminal such as a laptop, an augmented reality (AR)/virtual reality (VR) device, or a vehicle terminal, and can also be a server or a cloud, etc.
  • the execution device 110 is configured with an input/output (I/O) interface 112 for data interaction with external devices; the user can input data to the I/O interface 112 through the client device 140, and the input data may include: data to be processed input by the client device.
  • the input data may include a raw image in this embodiment of the application.
  • the preprocessing module 113 is used to perform preprocessing according to the input image received by the I/O interface 112. In the embodiment of the present application, the preprocessing module 113 may be used to perform a series of image signal processing on the input image.
  • the preprocessing module 113 may include one or more image processing modules.
  • When the execution device 110 preprocesses the input data, or when the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call the data, codes, etc. in the data storage system 150 for corresponding processing, and the correspondingly processed data and instructions may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the processing result of the data obtained above, to the client device 140, thereby providing it to the user.
  • the training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or different tasks, and the corresponding target models/rules 101 can be used to achieve the above-mentioned goals or complete the above-mentioned task to provide the user with the desired result.
  • the user can manually specify the input data, and the manual specification can be operated through the interface provided by the I/O interface 112 .
  • the client device 140 can automatically send the input data to the I/O interface 112 . If the client device 140 is required to automatically send the input data to obtain the user's authorization, the user can set the corresponding authority in the client device 140 .
  • the user can view the results output by the execution device 110 on the client device 140, and the specific presentation form may be specific ways such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal, collecting the input data input to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, as new sample data and storing them in the database 130.
  • Certainly, the client device 140 may not be used for collection; instead, the I/O interface 112 may directly store the input data input to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, as new sample data in the database 130.
  • FIG. 1 is only a schematic diagram of a system architecture provided by the embodiment of the present application, and the positional relationship among the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed in the execution device 110.
  • the target model/rule 101 is obtained according to the training device 120.
  • the target model/rule 101 in the embodiment of the present application may be the neural network model in the present application; specifically, the neural network model in the embodiment of the present application can be a CNN or a residual network, etc.
  • the image signal processor outputs a visualized image after a series of processing on the raw image acquired by the sensor. These images can be used as input images for vision tasks. Specifically, a neural network algorithm or a non-neural network algorithm may be used to process an input image in a visual task to obtain relevant results of the visual task.
  • FIG. 2 shows a schematic diagram of an overall processing flow of a vision task.
  • the raw image is used as an input image, a series of image signal processing is performed on the input image, and an 8-bit visualized red-green-blue (RGB) image is output.
  • the RGB image is used as the input image of the vision task, and the processing result of the vision task is obtained.
  • the image signal processing modules include a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer denoise module, an auto white balance module, a color correction module, a gamma correction module, a denoise and sharpen module, etc.
  • the image signal processing module can adopt non-neural network algorithm or neural network algorithm.
  • the input images of vision tasks are usually RGB images after image signal processing.
  • the purpose of traditional image signal processing is usually to meet human visual needs, and the results obtained by performing visual tasks based on the images are not necessarily optimal.
  • the embodiment of the present application provides an image processing method, which adjusts the image processing flow before the vision task according to the processing result of the vision task, so as to obtain an image processing flow that meets requirements.
  • FIG. 3 shows an image processing method 300 provided by an embodiment of the present application.
  • the method shown in Figure 3 can be executed by a computing device, which can be a cloud service device or a terminal device, such as a computer, server, mobile phone, camera, vehicle, drone or robot, or a system composed of cloud service devices and terminal devices.
  • the method 300 may be executed by a training device or an inference device, for example, the method 300 may be executed by an accelerator such as a CPU, a GPU, or an NPU.
  • the accelerator chip may be located on an FPGA, a chip emulator (Emulator) or a development board (evaluation board, EVB).
  • the method 300 may be executed by a tuning tool or a calibration tool of an ISP pipeline of a hardware device (e.g., a camera or a video camera).
  • the method 300 includes step S301 to step S304. Step S301 to step S304 will be described in detail below.
  • the first image may be a raw image acquired by a sensor.
  • the training data set includes multiple images, and the first image is any image in the training data set.
  • the method 300 may be executed multiple times based on multiple images in the training data set until the required image processing modules are obtained.
  • the training data set may use an open source data set.
  • the training data set can also be a self-made data set.
  • the training data set may be pre-stored.
  • the training data set may be the training data maintained in the database 130 shown in FIG. 1 .
  • the training dataset can also be user-input data.
  • S302. Process the first image by at least one image processing module to obtain a second image.
  • the image processing module is used for image signal processing on the input image.
  • the at least one image processing module may be located on an image signal processor. That is to say, step S302 is executed by the image processing modules in the image signal processor.
  • Any image processing module in the at least one image processing module may be implemented by a neural network algorithm, or may also be implemented by a non-neural network algorithm.
  • the embodiment of the present application does not limit the specific implementation manner of the image processing module.
  • the at least one image processing module may include: a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  • the raw image is used as the first image
  • the at least one image processing module includes 9 image processing modules, which are respectively a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, Bayer noise reduction module, automatic white balance module, color correction module, gamma correction module, and noise reduction and sharpening module.
  • the nine image processing modules sequentially perform black level compensation, green balance processing, dead pixel correction, demosaicing, Bayer noise reduction, automatic white balance processing, color correction, gamma correction, and noise reduction and sharpening.
  • the black level compensation module, the green balance module, and the bad pixel correction module can be used to process the raw data.
  • a demosaic module and a Bayer denoising module may be used to perform demosaic processing.
  • An automatic white balance module, a color correction module, a gamma correction module, and a noise reduction and sharpening module can be used to perform image enhancement processing.
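  • For illustration, the nine-stage order described above can be written as a simple sequential pipeline; each name stands in for an image processing module (which, as noted, may be a neural-network or non-neural-network algorithm).

```python
ISP_PIPELINE = [
    "black_level_compensation", "green_balance", "bad_pixel_correction",   # raw processing
    "demosaic", "bayer_denoise",                                           # demosaic processing
    "auto_white_balance", "color_correction", "gamma_correction",          # image enhancement
    "denoise_and_sharpen",
]

def run_pipeline(raw_image, modules_by_name):
    image = raw_image
    for name in ISP_PIPELINE:
        image = modules_by_name[name](image)
    return image
```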
  • the second image may be an RGB image.
  • the second image may be an 8-bit RGB image. This is only an example, and the type of the second image may also be set according to the input requirements of the visual task model.
  • step S302 includes: processing the first image by using at least one image processing module and the weight of the at least one image processing module to obtain the second image.
  • the processing result of the at least one image processing module is adjusted according to the weight of the at least one image processing module to obtain the second image.
  • the image processing module processes the image input to the module, which may be to adjust the pixel values of all or part of the pixels of the image input to the module, that is, to change the pixel values of all or part of the pixels.
  • the variation of the pixel values of all or some pixels may be adjusted according to the weight of the image processing module.
  • the weight of the image processing module is multiplied by the variation of the pixel value to obtain the adjusted variation of the pixel, and then the output image of the module is obtained. If the weight of the image processing module is 0, it means that the image processing module does not participate in the image processing process.
  • the specific value of the weight can be set as required, for example, the weight can be a real number greater than or equal to 0 and less than or equal to 1.
  • the weight of the at least one image processing module can be normalized, that is, the sum of the weights of the at least one image processing module is 1, or the sum of the weights of the at least one image processing module close to 1.
  • the weights of the nine image processing modules are w1, w2, w3, w4, w5, w6, w7, w8 and w9, respectively.
  • the value range of the weight is a real number greater than or equal to 0 and less than or equal to 1.
  • the largest possible sum of w1, w2, w3, w4, w5, w6, w7, w8 and w9 is nine.
  • the nine weights can also be normalized so that the sum of the nine weights can be 1.
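  • An illustrative normalization of the nine weights so that they sum to 1 (the normalization itself is an example, as noted above):

```python
def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else list(weights)

w = [0.9, 0.1, 0.7, 1.0, 0.3, 0.8, 0.6, 0.5, 0.4]  # example values of w1..w9 in [0, 1]
w_normalized = normalize(w)                         # now sums to 1.0
```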
  • the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
  • the visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
  • the vision task model can be a trained model.
  • the type of output of the visual task model is related to the type of visual task.
  • the output of the visual task model is the inference result of the visual task model.
  • the output of the vision task model may be a target frame on the second image and the category of the object in the target frame.
  • the output of the visual task model may be the category of the second image.
  • the processing results of the vision task model may include performance indicators of the vision task model.
  • the performance index of the vision task model includes the accuracy of reasoning or the value of the loss function.
  • the loss function can be set as needed.
  • the loss function is used to indicate the difference between the inference result of the vision task model and the true value corresponding to the first image. It should be noted that the loss function here may be the loss function in the training process of the vision task model, or other forms of loss functions may also be used.
  • the processing result of the vision task model may include detection accuracy.
  • the processing result of the vision task model may include segmentation accuracy.
  • the visual task model may use a neural network model, or may also use a non-neural network model.
  • the neural network model may be an existing neural network model, for example, a residual network.
  • the neural network model may also be a neural network model of other structures constructed by itself. This embodiment of the present application does not limit it.
  • the visual task model employed may or may not be the same in overexposed and underexposed situations.
  • For example, if the current scene is recognized as overexposed, the first object detection model may be used, and if the current scene is recognized as underexposed, the second object detection model may be used.
  • the first target detection model and the second target detection model are different target detection models.
  • the processing of the visual task model can be executed by the calculation module 111 in FIG. 1 .
  • the vision task model can be deployed on the execution device of the method 300, or can be deployed on other devices. That is to say, the processing of the visual task model can be executed by the executing device of the method 300 or by other devices, and the processing result can be fed back to the executing device of the method 300 .
  • the at least one image processing module is adjusted according to the processing result of the visual task model, so that the processing result of the visual task model is as close to expectation as possible.
  • the at least one image processing module is adjusted according to the performance index of the visual task model, so as to improve the performance of the visual task model.
  • the at least one image processing module is adjusted to improve the accuracy of inference of the model.
  • the at least one image processing module is adjusted to reduce the value of the loss function of the visual task model.
  • the method 300 may be executed based on multiple images in the training data set until a preset condition is met. That is to say, in practical applications, the image processing module can be adjusted iteratively based on multiple images.
  • the image processing module used in each iteration is the image processing module obtained after the previous iteration.
  • the preset conditions can be set as required, and examples will be given in Mode 1, Mode 2, Mode 3, and Mode 4 below.
  • the at least one image processing module can also be adjusted according to the image processing time and the processing result of the visual task model.
  • the image processing time may be the processing time of the visual task model, or the processing time of the at least one image processing module, or the sum of the processing time of the visual task model and the processing time of the at least one image processing module.
  • the processing speed can be improved and the time delay can be reduced under the premise of ensuring the performance of the visual task model.
  • the image processing flow is adjusted according to the processing result of the visual task model, which is beneficial to obtain an image suitable for the visual task, so as to ensure the performance of the visual task model.
  • the solutions of the embodiments of the present application can adjust the image processing flow according to the requirements of different application scenarios, so as to adapt to different application scenarios.
  • the visual task model employed may or may not be the same in overexposed and underexposed situations.
  • For example, if the current scene is recognized as overexposed, the first object detection model may be used as the visual task model; if the current scene is recognized as underexposed, the second object detection model may be used as the vision task model.
  • the solution of the embodiment of the present application can adjust the image processing flow according to the processing results of the first object detection model and the second object detection model respectively, so as to obtain an image processing flow suitable for the first object detection model and an image processing flow suitable for the second object detection model.
  • Step S304 can be implemented in various ways, and the following four ways (mode 1, mode 2, mode 3 and mode 4) are taken as examples for illustration.
  • the at least one image processing module includes a plurality of image processing modules
  • step S304 includes: adjusting weights of the plurality of image processing modules according to a processing result of the visual task model.
  • the weights of the plurality of image processing modules are adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
  • the method 300 can be executed based on multiple images in the training data set to implement iterative adjustment of the weights of the multiple image processing modules until the preset conditions are met. Stop adjusting the weights of the plurality of image processing modules after the preset condition is met, or stop refreshing the weights of the plurality of image processing modules.
  • the preset condition may be that the weights of the plurality of image processing modules converge.
  • the method 300 is not executed any more, that is, the adjustment of the weights of the plurality of image processing modules is stopped.
  • Weight convergence can also be understood as meaning that the weight gradients obtained after performing the method 300 multiple times in succession change little. For example, when the change in the weight gradient obtained after performing the method 300 multiple times is less than or equal to the first threshold, the adjustment of the weights of the multiple image processing modules is stopped.
  • the preset condition may be that the accuracy of the visual task model is greater than or equal to the second threshold.
  • the method 300 is not executed, that is, the adjustment of the weights of the plurality of image processing modules is stopped.
  • the second threshold may be a preset value.
  • the second threshold may be the inference accuracy of the visual task model obtained without setting the weight of the image processing module.
  • the second threshold may be the inference accuracy of the visual task model when no weight is set for the nine image processing modules.
  • the second threshold may be the inference accuracy of the visual task model when the weight of the nine image processing modules is 1.
  • In other words, the image is input into the original (unadjusted) image processing modules for processing, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and this accuracy is used as the second threshold.
  • When executing the method 300, that is, when the image is input into the currently adjusted image processing modules for processing and the processed image is input into the visual task model for processing, the inference accuracy is calculated and compared with the second threshold; if the currently obtained inference accuracy is greater than or equal to the second threshold, the method 300 is no longer executed. In this way, using the adjusted image processing modules to process the image can ensure, or even improve, the performance of the visual task model.
  • the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
  • the method 300 is not executed any more.
  • the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
  • the method 300 is not executed any more.
  • the preset condition may be that the accuracy of the visual task model is greater than or equal to the second threshold, and the number of iterations is greater than or equal to the fourth threshold.
  • the preset condition may be that the weights of the plurality of image processing modules converge, and the accuracy of the visual task model is greater than or equal to the second threshold.
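  • A hedged sketch combining the preset conditions listed above; the threshold values are placeholders chosen for illustration.

```python
def should_stop(weight_grad_change, accuracy, loss_change, iterations,
                first_threshold=1e-4, second_threshold=0.75,
                third_threshold=1e-3, fourth_threshold=100):
    return (weight_grad_change <= first_threshold   # weights have converged
            or accuracy >= second_threshold         # accuracy of the visual task model high enough
            or loss_change <= third_threshold       # loss function value no longer changes much
            or iterations >= fourth_threshold)      # iteration budget reached
```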
  • the weights of the plurality of image processing modules may be adjusted by means of Bayesian optimization method, RNN model, or reinforcement learning algorithm.
  • the Bayesian optimization method is taken as an example to illustrate below.
  • the vision task model is a target detection model
  • the performance index of the vision task model may be mean average precision (mAP).
  • the weights of the multiple image processing modules are adjusted by a Bayesian optimization method to improve the mAP of the object detection model. In other words, the weights of the multiple image processing modules are adjusted with the goal of maximizing the mAP of the target detection model.
  • the average accuracy refers to the average of the detection accuracies for all target objects.
  • the detection accuracy of the image is input into the Bayesian optimization model, and the Bayesian optimization model adjusts the weight of each image processing module.
  • the detection accuracy of the image can be preserved in the Bayesian optimization model. That is to say, when other images in the training data set are input into the target detection model, the detection accuracy of other images is obtained.
  • the Bayesian optimization model can adjust the weight of each image processing module according to the detection accuracy of other images and the detection accuracy of previous images.
  • the training data set in the embodiment of the present application is used to train each image processing module, which may be the same as or different from the training data set of the vision task model.
  • the training data set in the embodiment of the present application may use a verification data set or a test data set of the vision task model.
  • The weights of the image processing modules are evaluated according to the processing results of the visual task model, and then the weights of the image processing modules are adjusted so as to increase the weights of the image processing modules that are strongly correlated with the performance of the visual task model and to reduce the weights of the image processing modules that are weakly correlated with the performance of the visual task model. In this way, an image processing flow more suitable for the visual task can be obtained, which is conducive to improving the performance of the visual task model.
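  • A minimal sketch of such a weight-adjustment loop, assuming scikit-optimize's gp_minimize as the Bayesian optimizer and treating the image processing modules, the training data set, and the mAP evaluation as placeholders:

```python
import numpy as np
from skopt import gp_minimize          # one possible Bayesian optimizer
from skopt.space import Real

def apply_weighted_pipeline(raw, modules, weights):
    """Blend each module's output with its input according to the module's weight."""
    img = raw
    for module, w in zip(modules, weights):
        img = w * module(img) + (1.0 - w) * img
    return np.clip(img, 0.0, 1.0)

def tune_module_weights(modules, dataset, detection_map, n_calls=30):
    """dataset: iterable of (raw_image, labels); detection_map returns the mAP of
    the (already trained) target detection model on the processed images."""
    space = [Real(0.0, 1.0, name=f"w{i + 1}") for i in range(len(modules))]

    def objective(weights):
        processed = [(apply_weighted_pipeline(raw, modules, weights), labels)
                     for raw, labels in dataset]
        return -detection_map(processed)   # gp_minimize minimizes, so negate the mAP

    result = gp_minimize(objective, space, n_calls=n_calls, random_state=0)
    weights = np.asarray(result.x, dtype=float)
    return weights / weights.sum()         # normalize so the weights sum to 1
```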
  • step S304 includes: modifying the at least one image processing module according to a processing result of the visual task model.
  • Changing the at least one image processing module may include: deleting some image processing modules from the at least one image processing module and/or adding other image processing modules.
  • step S304 may be to select a combination of image processing modules from multiple candidate image processing modules according to the processing results of the visual task model, and use that combination of image processing modules to replace the at least one image processing module.
  • the at least one image processing module can be changed by means of Bayesian optimization method or reinforcement learning algorithm.
  • the method 300 can be executed based on multiple images in the training data set to realize iterative adjustment of the combination of the multiple image processing modules until the preset conditions are met. Stop adjusting the combination of the multiple image processing modules after the preset condition is met, or stop refreshing the combination of the multiple image processing modules.
  • the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
  • the execution of the method 300 is stopped, that is, the adjustment of the combination of the image processing modules is stopped.
  • the combination of image processing modules is changed according to the processing results of the visual task model, so that a combination of image processing modules more suitable for the visual task model can be obtained, which is conducive to improving the performance of the visual task model.
  • the at least one image processing module includes a plurality of image processing modules, and step S304 includes: deleting part of the image processing modules from the plurality of image processing modules according to the processing result of the visual task model.
  • the processing result of mode 1 may be used in mode 2.
  • step S304 includes: adjusting the weights of the multiple image processing modules according to the processing results of the visual task model; deleting part of the image processing modules from the multiple image processing modules according to the adjusted weights of the multiple image processing modules .
  • the multiple image processing modules are m image processing modules.
  • m is an integer greater than 1.
  • the n image processing modules with the smallest adjusted weights are deleted from the m image processing modules.
  • n is an integer greater than 1 and less than m.
  • an image processing module whose adjusted weight is less than or equal to a weight threshold is deleted from the m image processing modules.
  • some image processing modules are deleted according to the processing results of the visual task model, which can reduce the time required for image processing, increase the processing speed, and reduce the requirement for computing power.
  • image processing modules with higher weights have a stronger correlation with the vision task model, or in other words, image processing modules with higher weights have a greater impact on the performance of the vision task model.
  • The image processing modules to be deleted are determined according to the weight of each image processing module, and the image processing modules with relatively small weight values are deleted, so that the impact on the processing results of the visual task model is small and the performance of the visual task model is less affected after the deletion. That is to say, the solutions of the embodiments of the present application can reduce unnecessary operations, reduce computing overhead, and improve processing speed on the premise of ensuring the performance of the visual task model.
  • step S304 includes: deleting part of the image processing modules from the plurality of image processing modules according to the processing result of the visual task model and the processing speed of the plurality of image processing modules.
  • some image processing modules are deleted from the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules and the processing speeds of the plurality of image processing modules.
  • an image processing module whose adjusted weight is less than or equal to a weight threshold and whose processing speed is less than or equal to a speed threshold is deleted from the plurality of image processing modules. That is, image processing modules that have slower processing speed and have less impact on the vision task model are deleted. In this way, the speed of image processing can be further increased.
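  • A minimal sketch of this pruning rule, with hypothetical module records holding an adjusted weight and a measured processing speed:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class IspModule:
    name: str
    fn: Callable                 # image -> image
    weight: float                # adjusted weight from the weight-tuning stage
    images_per_second: float     # measured processing speed

def prune_modules(modules: List[IspModule],
                  weight_threshold: float,
                  speed_threshold: float) -> List[IspModule]:
    """Delete modules whose adjusted weight is at or below the weight threshold
    and whose processing speed is at or below the speed threshold."""
    return [m for m in modules
            if not (m.weight <= weight_threshold
                    and m.images_per_second <= speed_threshold)]
```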
  • the at least one image processing module includes a plurality of image processing modules, and step S304 includes: adjusting the processing order of the plurality of image processing modules according to the processing results of the visual task model.
  • the processing order of the plurality of image processing modules is adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
  • the processing order of the plurality of image processing modules may be adjusted by means of Bayesian optimization method, RNN model, or reinforcement learning algorithm.
  • the method 300 can be executed based on multiple images in the training data set until the preset conditions are met. Stop adjusting the processing sequence of the multiple image processing modules after the preset condition is satisfied, or stop refreshing the processing sequence of the multiple image processing modules.
  • the preset condition may be that the variation of the processing sequence of the plurality of image processing modules is less than or equal to the fifth threshold.
  • the amount of change in the processing order of the plurality of image processing modules may be the number of image processing modules whose processing order changes after the method 300 is executed.
  • the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the sixth threshold.
  • the method 300 is not executed again, that is, the adjustment of the processing sequence of the plurality of image processing modules is stopped.
  • the sixth threshold may be a preset value.
  • the sixth threshold may be the inference accuracy of the visual task model without adjusting the processing sequence of the image processing module.
  • the sixth threshold may be the inference accuracy of the visual task model when images are processed according to the processing order of the image processing module shown in FIG. 4 .
  • For example, the image is input into the original image processing modules and processed in the original processing order of the image processing modules, the processed image is input into the visual task model for processing, the inference accuracy is calculated, and that accuracy is used as the sixth threshold.
  • the method 300 is not executed any more. In this way, the images are processed according to the adjusted processing sequence of the image processing module, so that the performance of the visual task model can be guaranteed, or the performance of the visual task model can be improved.
  • the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
  • the method 300 is not executed any more.
  • the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
  • the method 300 is not executed any more.
  • the above preset conditions may be used in combination.
  • the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the sixth threshold, and the number of iterations is greater than or equal to the fourth threshold.
  • the preset condition may be that the variation of the processing order of the plurality of image processing modules is less than or equal to the fifth threshold, and the accuracy of the visual task model is greater than or equal to the sixth threshold.
  • the processing sequence of the image processing module is adjusted according to the processing result of the visual task model, so that an image processing flow more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
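  • As an illustration, the following sketch uses random search over candidate orderings as a stand-in for the Bayesian optimization method, RNN model, or reinforcement learning algorithm mentioned above; the modules, data set, and accuracy evaluation are placeholders:

```python
import random

def apply_in_order(raw, modules, order):
    img = raw
    for idx in order:
        img = modules[idx](img)
    return img

def search_processing_order(modules, dataset, inference_accuracy,
                            n_trials=50, seed=0):
    """Keep the ordering that gives the best downstream inference accuracy."""
    rng = random.Random(seed)
    best_order = list(range(len(modules)))
    best_acc = inference_accuracy(
        [(apply_in_order(raw, modules, best_order), labels) for raw, labels in dataset])
    for _ in range(n_trials):
        order = list(range(len(modules)))
        rng.shuffle(order)
        acc = inference_accuracy(
            [(apply_in_order(raw, modules, order), labels) for raw, labels in dataset])
        if acc > best_acc:
            best_order, best_acc = order, acc
    return best_order, best_acc
```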
  • step S304 includes: adjusting parameters in the at least one image processing module according to a processing result of the visual task model.
  • the parameters in the at least one image processing module are adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
  • for example, if an image processing module is a neural network model, the parameters in the image processing module are the parameters of the neural network model.
  • the parameters in the at least one image processing module may be adjusted by means of a Bayesian optimization method, an RNN model, a reinforcement learning algorithm, and the like.
  • The input image is processed based on the current parameter combination in the image processing modules, and the processed result is input into the vision task model for processing; the vision task may be performed by, for example, a CPU or a GPU.
  • the parameter combination in the image processing module is optimized and updated according to the feedback of the performance of the visual task model, that is, the optimal parameter combination in the image processing module is found in the search space, so as to improve the performance of the visual task model.
  • the method 300 can be executed based on multiple images in the training data set until a preset condition is met. Stop adjusting the parameters in the at least one image processing module after the preset condition is met, or stop refreshing the parameters in the at least one image processing module.
  • the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the seventh threshold.
  • the method 300 is not executed again, that is, the adjustment of the parameters in the at least one image processing module is stopped.
  • the seventh threshold may be a preset value.
  • the seventh threshold may be the processing accuracy of the visual task model obtained without adjusting the parameters in the at least one image processing module.
  • the seventh threshold may be the inference accuracy of the vision task model when the nine image processing modules do not adjust parameters.
  • For example, the image is input into the original image processing modules, that is, the image processing modules whose parameters have not been adjusted, for processing; the processed image is input into the vision task model for processing, the inference accuracy is calculated, and that accuracy is used as the seventh threshold.
  • the method 300 is not executed any more. In this way, using the adjusted image processing module to process the image can ensure the performance of the visual task model, or can improve the performance of the visual task model.
  • the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
  • the method 300 is not executed any more.
  • the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
  • the method 300 is not executed any more.
  • the above preset conditions may be used in combination.
  • the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the seventh threshold, and the number of iterations is greater than or equal to the fourth threshold.
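  • A minimal sketch of searching the parameter space of the image processing modules, with hypothetical parameter names and ranges, and with random search standing in for the optimization algorithms named above:

```python
import random

# Hypothetical search space: (module name, parameter name) -> (low, high).
SEARCH_SPACE = {
    ("black_level_compensation", "offset"):    (0.0, 64.0),
    ("auto_white_balance",       "red_gain"):  (0.5, 2.5),
    ("auto_white_balance",       "blue_gain"): (0.5, 2.5),
    ("gamma_correction",         "gamma"):     (1.0, 3.0),
}

def tune_parameters(run_pipeline, inference_accuracy, n_trials=100, seed=0):
    """run_pipeline(params) applies the image processing modules with the given
    parameters; inference_accuracy(...) is the accuracy fed back by the vision task model."""
    rng = random.Random(seed)
    best_params, best_acc = None, float("-inf")
    for _ in range(n_trials):
        params = {key: rng.uniform(lo, hi) for key, (lo, hi) in SEARCH_SPACE.items()}
        acc = inference_accuracy(run_pipeline(params))
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc
```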
  • any two or more of the above modes 1, 2, 3 and 4 may be used in combination.
  • each method can be executed at the same time, or each method can also be executed separately.
  • step S304 includes: deleting part of the image processing modules from the plurality of image processing modules according to the processing results of the visual task model; processing the fifth image through the image processing modules that have not been deleted from the plurality of image processing modules to obtain the sixth image; inputting the sixth image into the visual task model for processing; and adjusting the parameters of the image processing modules that have not been deleted according to the processing results of the visual task model.
  • the fifth image may be an image in the training data set.
  • the fifth image and the first image may be the same image or different images.
  • the sixth image may be an RGB image.
  • For the description of the sixth image, refer to the description of the second image above.
  • The performance indicators obtained from the visual task model are used to adjust the weights of the multiple image processing modules, so as to retain the image processing modules that have a relatively large impact on the performance indicators of the visual task model, or in other words, the image processing modules that maintain or improve the performance indicators of the vision task model.
  • In this way, the image processing modules suitable for the visual task model, or in other words the image processing modules required by the visual task model, can be obtained, which reduces the time required for the image processing process, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
  • Using the performance index obtained from the visual task model to adjust the parameters in the retained image processing modules, for example, using the performance index obtained from the visual task model to search the design space of the image processing modules, is conducive to obtaining the optimal parameter configuration of each image processing module, thereby improving the performance of the vision task model.
  • For example, step S304 includes: adjusting the parameters of the multiple image processing modules and the weights of the multiple image processing modules according to the processing results of the visual task model, and deleting some image processing modules from the multiple image processing modules according to the adjusted weights of the multiple image processing modules.
  • Alternatively, step S304 includes: adjusting the parameters of the multiple image processing modules, the weights of the multiple image processing modules, and the processing order of the multiple image processing modules according to the processing results of the visual task model, and deleting some image processing modules from the multiple image processing modules according to the adjusted weights of the multiple image processing modules.
  • the embodiment of the present application provides an image processing method 400.
  • the method 400 can be regarded as a specific implementation of the method 300.
  • some descriptions are appropriately omitted when introducing the method 400 below.
  • the method 400 adopts a combination of mode 1, mode 2 and mode 4.
  • the method 400 includes step S401 to step S410. Steps S401 to S410 will be described below.
  • the method 400 can be regarded as two stages, the first stage includes steps S401 to S406, and the second stage includes steps S407 to S410.
  • the plurality of image processing modules may include nine image processing modules as shown in FIG. 5 .
  • the weights of the respective image processing modules are denoted as w1, w2, w3, w4, w5, w6, w7, w8 and w9.
  • the sum of the 9 weights is 1.
  • the input image is processed based on the weights of the plurality of image processing modules.
  • the processing results of the multiple image processing modules are adjusted based on the weights of the multiple image processing modules.
  • the input image is processed according to the image processing module and its corresponding weight shown in FIG. 5 .
  • the processing result may be an RGB image.
  • the processing result may be an 8-bit RGB image.
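  • For example, assuming the pipeline operates on floating-point data in [0, 1] internally, the processed result could be quantized to an 8-bit RGB image as follows:

```python
import numpy as np

def to_8bit_rgb(img: np.ndarray) -> np.ndarray:
    """Clip a floating-point image in [0, 1] and quantize it to 8-bit RGB."""
    return (np.clip(img, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)
```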
  • Step S402 corresponds to step S302, and for a specific description, refer to the description in step S302.
  • the vision task model can be a trained model.
  • the comparison result is fed back to the optimization algorithm, and the optimization algorithm is used to adjust the weights of the plurality of image processing modules.
  • the optimization algorithm includes a Bayesian optimization method, an RNN model, and a reinforcement learning algorithm.
  • Step S405: the adjusted weights of the image processing modules are used as the weights of the image processing modules in step S402, and steps S402 to S404 are repeated until the first preset condition is met.
  • step S405 may also be to perform normalization processing on the adjusted weights of the image processing modules, and use the normalized weights as the weights of the image processing modules in step S402.
  • step S402 to step S404 are terminated.
  • the currently obtained weight of the image processing module may be regarded as the weight of the image processing module obtained after satisfying the first preset condition.
  • step S402 to step S404 are terminated.
  • Steps S403 to S405 can be regarded as a specific implementation of mode 1.
  • Step S406 corresponds to step S304 in mode 2.
  • the image in step S407 and the image in step S402 may be the same image or different images.
  • the parameters in the image processing module that have not been deleted are used as tuning objects.
  • the parameters in the reserved image processing module are used as tuning objects.
  • normalization processing may also be performed on the weights of the image processing modules that have not been deleted.
  • the images in the training data set are input to the black level compensation module, the demosaic module, the automatic white balance module and the gamma correction module for processing. Further, before performing step S407, the weights of the four image processing modules may be normalized.
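  • A minimal sketch of this normalization step, with hypothetical weight values for the four retained modules:

```python
def renormalize(weights):
    """Rescale the retained modules' weights so that they again sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical weights left over after pruning the nine-module pipeline:
retained = {"black_level_compensation": 0.18, "demosaic": 0.34,
            "auto_white_balance": 0.22, "gamma_correction": 0.11}
retained = renormalize(retained)   # weights now sum to 1
```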
  • the comparison result is fed back to the optimization algorithm, and the parameters in the image processing module are adjusted using the optimization algorithm.
  • the optimization algorithm includes Bayesian optimization method, RNN model or reinforcement learning algorithm.
  • The optimization algorithm used in step S409 may be the same as or different from the optimization algorithm used in step S404.
  • step S407 to step S410 are terminated.
  • the currently obtained parameters in the image processing module may be regarded as parameters in the image processing module obtained after satisfying the second preset condition.
  • step S407 to step S410 are terminated.
  • Steps S407 to S410 can be regarded as a specific implementation of mode 4; for a specific description, refer to the description in mode 4, which is not repeated here.
  • For the setting of the second preset condition, reference may be made to the preset conditions in mode 4.
  • The performance indicators obtained from the visual task model are used to adjust the weights of the multiple image processing modules, so as to retain the image processing modules that have a relatively large impact on the performance indicators of the visual task model, or in other words, the image processing modules that maintain or improve the performance indicators of the vision task model.
  • In this way, the image processing modules suitable for the visual task model, or in other words the image processing modules required by the visual task model, can be obtained, which reduces the time required for the image processing process, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
  • Using the performance index obtained from the visual task model to adjust the parameters in the retained image processing modules, for example, using the performance index obtained from the visual task model to search the design space of the image processing modules, is beneficial to obtaining the optimal parameter configuration of each image processing module, thereby improving the performance of the vision task model.
  • the first stage and the second stage in the method 400 may be executed simultaneously. That is to say, the weight of the image processing module and the parameters in the image processing module are adjusted at the same time.
  • the manner in which the first phase and the second phase of the method 400 are executed simultaneously will be described below.
  • Method 400 may include the following steps. For the following steps, reference may be made to the description of the first stage and the second stage of the aforementioned method 400. For the sake of brevity, part of the description is appropriately omitted when describing the following steps.
  • the comparison result is fed back to the optimization algorithm, and the optimization algorithm is used to adjust the weights of the plurality of image processing modules.
  • An optimization algorithm is used to adjust parameters in the multiple image processing modules.
  • the optimization algorithm includes a Bayesian optimization method, an RNN model, and a reinforcement learning algorithm.
  • the optimization algorithm for adjusting the weights of the multiple image processing modules and the optimization algorithm for adjusting the parameters in the multiple image processing modules may be the same or different.
  • Step 5): the adjusted weights of the image processing modules are used as the weights of the image processing modules in step 2), and the adjusted parameters in the image processing modules are used as the parameters in the image processing modules in step 2); steps 2) to 4) are repeated until the training is completed.
  • the adjusted weights of the image processing modules are normalized, and the normalized weights are used as the weights of the image processing modules in step 5).
  • the training is completed.
  • When the accuracy of the current visual task model is greater than or equal to the inference accuracy of the visual task model before the method 400 is executed, the training is complete.
  • Step 6): delete part of the image processing modules according to the weights of the image processing modules after training. Step 6) corresponds to step S304 in mode 2. For a specific description, please refer to the description in mode 2, which is not repeated here.
  • Executing the first stage and the second stage at the same time can prevent an image processing module from being deleted because of an unreasonable parameter configuration, so that each image processing module processes the image under a better parameter configuration; the contribution of each image processing module to the performance index of the vision task model is then judged under this better parameter configuration, so as to retain the image processing modules required by the vision task model, and the performance index of the vision task model can be further improved.
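  • By way of illustration, the simultaneous adjustment of weights and parameters followed by pruning could be organized as in the following sketch; the optimizer interface (suggest/observe/best) and the evaluation function are hypothetical:

```python
def train_jointly(optimizer, modules, dataset, evaluate, n_steps, weight_threshold):
    """Steps 2)-5): weights and parameters are proposed and evaluated together;
    only after training does step 6) prune modules by their final weights."""
    for _ in range(n_steps):
        weights, params = optimizer.suggest()              # candidate weights and parameters
        accuracy = evaluate(weights, params, modules, dataset)
        optimizer.observe(weights, params, accuracy)       # feedback to the optimizer
    final_weights, final_params = optimizer.best()
    kept = [m for m, w in zip(modules, final_weights) if w > weight_threshold]
    return kept, final_params
```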
  • Method 400 is only an example of combining mode 1, mode 2 and mode 4.
  • Mode 1, mode 2, mode 3 and mode 4 can also be combined in other implementations.
  • mode 1, mode 2 and mode 3 are combined.
  • step S304 may include: adjusting the weights of the multiple image processing modules and the processing order of the multiple image processing modules according to the processing results of the visual task model, and deleting some image processing modules from the multiple image processing modules according to the adjusted weights of the image processing modules.
  • Alternatively, step S304 may include: adjusting the weights of the multiple image processing modules according to the processing results of the visual task model, and deleting part of the image processing modules from the multiple image processing modules according to the adjusted weights of the image processing modules; then adjusting the processing order of the image processing modules that have not been deleted according to the processing results of the visual task model. That is, step S304 is divided into two stages: in the first stage, some image processing modules are deleted, and in the second stage, the processing order of the image processing modules that have not been deleted is adjusted.
  • the adjusted image processing module is an image processing module required by the visual task model. There is a corresponding relationship between the adjusted image processing module and the vision task model. Different vision task models can correspond to different image processing modules. In this way, an appropriate image processing flow can be selected according to the application scenario.
  • Figure 6 shows an image processing method 700 provided by the embodiment of the present application.
  • the method shown in Figure 6 can be executed by an image processing device, which can be a cloud service device or a terminal device, such as a computer, server, etc.
  • The apparatus capable of performing image processing may also be a system composed of a cloud service device and a terminal device.
  • the method 700 may be executed by the preprocessing module in FIG. 1 .
  • the target image processing module in method 700 is obtained by method 300 or method 400 .
  • repeated descriptions are appropriately omitted when introducing the method 700 below.
  • the method 700 includes steps S701 to S704. Steps S701 to S704 will be described in detail below.
  • the third image is an image to be processed.
  • the third image may be a raw image acquired by the sensor.
  • The third image may be an image captured by a terminal device (or another device such as a computer or a server) through a camera, or the third image may be an image obtained from inside the terminal device (or another device such as a computer or a server), for example, an image stored in the photo album of the terminal device or an image obtained by the terminal device from the cloud, which is not limited in this embodiment of the present application.
  • S702. Determine at least one target image processing module according to the visual task model.
  • the at least one target image processing module is one or more image processing modules corresponding to the visual task model.
  • the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
  • the visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
  • the vision task model can be a trained model.
  • different visual task models may be used, and accordingly, at least one target image processing module matching the visual task model may be determined according to different visual task models. In this way, different image processing modules can be selected according to different application scenarios.
  • the visual task model employed may or may not be the same in overexposed and underexposed situations.
  • the first target detection model may be used as the visual task model, and at least one target image processing module corresponding to the first target detection model may be determined according to the first target detection model.
  • the second target detection model may be used as the visual task model, and at least one target image processing module corresponding to the second target detection model is determined according to the second target detection model.
  • the first target detection model and the second target detection model are different target detection models. In this way, different image processing processes can be selected according to different application scenarios to improve the performance of the vision task model.
  • one or more image processing modules corresponding to the visual task model are used to process the input third image to obtain the fourth image.
  • the fourth image may be an RGB image.
  • the fourth image may be an 8-bit RGB image. This is only an example, and the type of the fourth image can be set according to the input requirements of the visual task model.
  • the processing result of the fourth image can also be understood as the processing result of the third image.
  • the processing result of the fourth image is the reasoning result of the visual task model.
  • the inference results of the visual task model are related to the type of visual task.
  • the inference result of the vision task model may be the target frame on the fourth image and the category of the object in the target frame.
  • the reasoning result of the vision task model may be the category of the fourth image.
  • the configuration of the image processing module matching the current visual task model can be determined.
  • the configuration of the image processing modules includes at least one of the following: a combination of image processing modules, a weight of the image processing modules, a processing order of the image processing modules, or parameters in the image processing modules.
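  • One possible way to represent such a configuration and its correspondence with a visual task model is sketched below; the model names, module names, and parameter values are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PipelineConfig:
    """One possible representation of an image-processing-module configuration."""
    modules: List[str]                                   # combination of modules
    weights: Dict[str, float] = field(default_factory=dict)
    order: List[str] = field(default_factory=list)       # processing order
    params: Dict[str, Dict[str, float]] = field(default_factory=dict)

# Hypothetical correspondence, e.g. obtained in advance via method 300 / method 400,
# between vision task models and their matching configurations.
CONFIG_BY_MODEL: Dict[str, PipelineConfig] = {
    "first_target_detection_model": PipelineConfig(
        modules=["black_level_compensation", "demosaic",
                 "auto_white_balance", "gamma_correction"],
        params={"gamma_correction": {"gamma": 2.2}}),
}

def determine_target_modules(model_name: str) -> PipelineConfig:
    """Step S702: look up the configuration matching the given vision task model."""
    return CONFIG_BY_MODEL[model_name]
```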
  • step S702 includes: determining at least one target image processing module from multiple candidate image processing modules according to the visual task model.
  • a combination of image processing modules is determined from multiple candidate image processing modules according to the visual task model, and an image processing module in the combination of image processing modules is the at least one target image processing module.
  • the combination of image processing modules may also change accordingly.
  • the combination of image processing modules corresponding to the current visual task model can be determined, or in other words, the image processing module required for the visual task model can be determined according to the corresponding relationship, that is, the at least one target image processing module .
  • the at least one target image processing module may be obtained through the method 300 or the method 400 .
  • the correspondence between the combination of the visual task model and the image processing module is obtained through the method 300 or the method 400 .
  • the at least one target image processing module includes: a black level compensation module, a demosaic module, an automatic white balance module and a gamma correction module.
  • The combination of image processing modules can adaptively match the visual task model, making the current combination of image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the vision task model.
  • step S702 includes: determining the weight of at least one target image processing module according to the visual task model.
  • the weight of the at least one target image processing module is used to process the processing result of the at least one target image processing module to obtain a fourth image.
  • the combinations of image processing modules corresponding to different visual task models are the same.
  • the weight of the image processing module may change accordingly.
  • the combination of image processing modules corresponding to different visual task models is the same, which may be understood to mean that the functions implemented by the image processing modules adopted by different visual task models are the same.
  • The weight of the image processing module corresponding to the current visual task model, that is, the weight of the at least one target image processing module, can be determined.
  • the at least one target image processing module may be the nine image processing modules in FIG. 4, and the weights of the image processing modules may be the weights obtained in step S405.
  • The weights of the image processing modules can adaptively match the visual task model, making the weights of the current image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the vision task model.
  • the weight of the image processing module may also change, and other configurations of the image processing module may also change.
  • the combination of image processing modules may change.
  • the visual task model has a corresponding relationship with the weight of the image processing module and other configuration conditions of the image processing module.
  • the weight of the image processing module corresponding to the visual task model and other configurations of the image processing module can be determined according to the visual task model.
  • step S702 a combination of image processing modules corresponding to the visual task model and weights of image processing modules in the combination of image processing modules may be determined.
  • the at least one target image processing module corresponding to the visual task model may be obtained in step S406.
  • the at least one target image processing module includes a black level compensation module, a demosaic module, an automatic white balance module and a gamma correction module.
  • the weight of the at least one target image processing module may be the weight obtained in step S405.
  • step S702 includes: determining a processing sequence of at least one target image processing module according to the visual task model.
  • the combinations of image processing modules corresponding to different visual task models are the same.
  • the processing sequence of the image processing module may also change accordingly.
  • the processing order of the image processing modules corresponding to the current visual task model can be determined, that is, the processing order of the at least one target image processing module.
  • different visual task models correspond to the processing order of different image processing modules.
  • The processing order of the image processing modules can adaptively match the visual task model, making the processing order of the current image processing modules more suitable for the current vision task model, which is beneficial to improving the performance of the vision task model.
  • the processing order of the image processing module may change, and other configurations of the image processing module may also change.
  • the combination of image processing modules may change.
  • the visual task model has a corresponding relationship with the processing order of the image processing module and other configurations of the image processing module.
  • the processing sequence of the image processing module corresponding to the visual task model and other configurations of the image processing module can be determined according to the corresponding relationship.
  • the combination of the visual task model and the image processing module there is a corresponding relationship between the combination of the visual task model and the image processing module, and the processing sequence of the image processing module.
  • the combination of image processing modules corresponding to the vision task model and the processing order of the image processing modules in the combination of image processing modules can be determined.
  • the combinations of image processing modules corresponding to different visual task models may be the same or different.
  • the combinations of image processing modules corresponding to the two visual task models are the same, but the processing orders of the image processing modules in the combination of image processing modules are different.
  • In step S702, the combination of image processing modules corresponding to the visual task model, the weights of the image processing modules, and the processing order of the image processing modules can be determined; that is, the target image processing modules, the weights of the target image processing modules, and the processing order of the target image processing modules are determined from multiple candidate image processing modules.
  • the combinations of image processing modules corresponding to different visual task models may be the same or different.
  • the weights of the image processing modules in the combination of image processing modules may be the same or different.
  • the processing order of the image processing modules in the combination of image processing modules may be the same or different.
  • step S702 includes: determining parameters in the at least one target image processing module according to the visual task model.
  • the combinations of image processing modules corresponding to different visual task models are the same.
  • the parameters in the image processing module may change accordingly.
  • the image processing module corresponding to the first visual task model includes: a black level compensation module and a demosaic module.
  • the parameters of the black level compensation module include parameter A1
  • the parameters of the demosaic module include parameter B1.
  • the image processing module corresponding to the second visual task model includes: a black level compensation module and a demosaic module.
  • the parameters of the black level compensation module include parameter A2, and the parameters of the demosaic module include parameter B2.
  • The parameters used in the black level compensation processing and demosaic processing before the first visual task model are different from those used in the black level compensation processing and demosaic processing before the second visual task model.
  • parameters in the image processing module corresponding to the visual task model can be determined, that is, parameters in the at least one target image processing module.
  • different visual task models correspond to different parameters in the image processing module.
  • The parameters in the image processing modules can adaptively match the visual task model, making the parameters in the current image processing modules more suitable for the current vision task model, which is beneficial to improving the performance of the vision task model.
  • the visual task model has a corresponding relationship with parameters in the image processing module and other configurations of the image processing module. In this way, parameters in the image processing module corresponding to the current visual task model and other configurations of the image processing module can be determined according to the corresponding relationship.
  • the combination of the visual task model and the image processing module there is a corresponding relationship between the combination of the visual task model and the image processing module, and the parameters in the image processing module. According to the corresponding relationship, the combination of image processing modules corresponding to the current visual task model and the parameters of the image processing modules in the combination of image processing modules can be determined.
  • the combinations of image processing modules corresponding to different visual task models may be the same or different.
  • the combinations of image processing modules corresponding to the two vision task models are the same, but the parameters of the image processing modules in the combination of image processing modules are different.
  • the combination of the visual task model and the image processing module there is a corresponding relationship between the combination of the visual task model and the image processing module, the weight of the image processing module, and the parameters in the image processing module.
  • the combination of the image processing modules corresponding to the visual task model, the weight of the image processing modules, and the parameters in the image processing modules can be determined.
  • the combinations of image processing modules corresponding to different visual task models may be the same or different.
  • the weights of the image processing modules in the combination of image processing modules may be the same or different.
  • the parameters of the image processing modules in the combination of image processing modules may be the same or different.
  • different visual task models correspond to different image processing module configurations.
  • The image processing modules can adaptively match the visual task model, making the image processing flow more suitable for the visual task model, which is beneficial to improving the performance of the vision task model.
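  • Putting the pieces together, steps S702 to S704 could be sketched as follows, reusing the PipelineConfig representation from the earlier sketch; the module functions and the vision task model are placeholders:

```python
import numpy as np

def run_vision_task(third_image, config, modules_by_name, vision_task_model):
    """Steps S702-S704 end to end: apply the target modules in their configured
    order with their configured parameters, convert to 8-bit RGB, then infer."""
    img = third_image
    for name in (config.order or config.modules):          # fall back to the listed order
        img = modules_by_name[name](img, **config.params.get(name, {}))
    fourth_image = (np.clip(img, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)
    return vision_task_model(fourth_image)                 # e.g. boxes + classes, or a label
```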
  • the device of the embodiment of the present application will be described below with reference to FIG. 7 to FIG. 8 . It should be understood that the device described below can execute the method of the aforementioned embodiment of the present application. In order to avoid unnecessary repetition, repeated descriptions are appropriately omitted when introducing the device of the embodiment of the present application below.
  • FIG. 7 is a schematic block diagram of an image processing device according to an embodiment of the present application.
  • the image processing device 4000 shown in FIG. 7 includes an acquisition unit 4010 and a processing unit 4020 .
  • the acquisition unit 4010 and the processing unit 4020 may be used to execute the image processing method of the embodiment of the present application.
  • the apparatus 4000 may be used to execute the method 300 or the method 400 .
  • the acquiring unit 4010 is configured to acquire the first image.
  • the processing unit 4020 is used to: process the first image through at least one image processing module to obtain a second image; input the second image into the visual task model for processing; adjust at least one image processing module according to the processing result of the visual task model .
  • At least one image processing module includes multiple image processing modules, and the processing unit 4020 is specifically configured to:
  • Part of the image processing modules in the plurality of image processing modules are deleted according to the processing results of the visual task model.
  • the processing unit 4020 is specifically configured to: adjust the weights of multiple image processing modules according to the processing results of the visual task model, and the weights of the multiple image processing modules are used to process the processing results of the multiple image processing modules Perform processing to obtain a second image; delete part of the image processing modules in the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
  • the processing unit 4020 is specifically configured to: adjust parameters in at least one image processing module according to a processing result of the visual task model.
  • the processing unit 4020 is specifically configured to: adjust a processing sequence of at least one image processing module according to a processing result of the visual task model.
  • At least one image processing module includes: a black level compensation module, a green balance module, a dead point correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  • the apparatus 4000 may be used to execute the method 700 .
  • the acquiring unit 4010 is configured to acquire a third image.
  • the processing unit 4020 is configured to: determine at least one target image processing module according to the visual task model; process the third image through at least one target image processing module to obtain the fourth image; process the fourth image through the visual task model to obtain the fourth image Four image processing results.
  • the processing unit 4020 is specifically configured to: determine at least one target image processing module from multiple candidate image processing modules according to the visual task model.
  • the processing unit 4020 is specifically configured to: determine parameters in at least one target image processing module according to the visual task model.
  • the processing unit 4020 is specifically configured to: determine a processing sequence of at least one target image processing module according to the visual task model.
  • At least one target image processing module includes: a black level compensation module, a green balance module, a dead point correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  • unit here may be implemented in the form of software and/or hardware, which is not specifically limited.
  • a "unit” may be a software program, a hardware circuit or a combination of both to realize the above functions.
  • The hardware circuitry may include application specific integrated circuits (ASICs), electronic circuits, processors (such as shared processors, dedicated processors, or group processors) and memory for executing one or more software or firmware programs, a merged logic circuit, and/or other suitable components that support the described functionality.
  • the units of each example described in the embodiments of the present application can be realized by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • FIG. 8 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
  • the image processing apparatus 6000 shown in FIG. 8 (the apparatus 6000 may specifically be a computer device) includes a memory 6001 , a processor 6002 , a communication interface 6003 and a bus 6004 .
  • the memory 6001 , the processor 6002 , and the communication interface 6003 are connected to each other through a bus 6004 .
  • the memory 6001 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 6001 may store programs, and when the programs stored in the memory 6001 are executed by the processor 6002, the processor 6002 is configured to execute various steps of the image processing method of the embodiment of the present application. Specifically, the processor 6002 may execute the method 300, the method 400 or the method 700 above.
  • The processor 6002 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, and is configured to execute related programs to implement the image processing method of the method embodiments of the present application.
  • the processor 6002 may also be an integrated circuit chip with signal processing capabilities. During implementation, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 6002 or instructions in the form of software.
  • The above-mentioned processor 6002 may also be a general-purpose processor, a digital signal processor (digital signal processing, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
  • The storage medium is located in the memory 6001; the processor 6002 reads the information in the memory 6001 and, in combination with its hardware, completes the functions required by the units included in the apparatus shown in FIG. 7, or executes the image processing method of the method embodiments of the present application.
  • the communication interface 6003 implements communication between the apparatus 6000 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 6003 .
  • the bus 6004 may include pathways for transferring information between various components of the device 6000 (eg, memory 6001 , processor 6002 , communication interface 6003 ).
  • Although the above apparatus 6000 only shows a memory, a processor, and a communication interface, those skilled in the art should understand that in specific implementation the apparatus 6000 may also include other devices necessary for normal operation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 6000 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the apparatus 6000 may also include only the components necessary to realize the embodiments of the present application, and does not necessarily include all the components shown in FIG. 8.
  • The embodiment of the present application also provides a computer-readable storage medium. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for executing the image processing method of the embodiments of the present application.
  • the embodiment of the present application further provides a computer program product including instructions, and when the computer program product is run on a computer, the computer is made to execute the image processing method in the embodiment of the present application.
  • the embodiment of the present application also provides a chip, the chip includes a processor and a data interface, and the processor reads the instructions stored in the memory through the data interface, and executes the image processing method in the embodiment of the present application.
  • the chip may further include a memory, the memory stores instructions, the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the The processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
  • the aforementioned chip may specifically be an FPGA or an ASIC.
  • the processor in the embodiment of the present application may be a central processing unit (central processing unit, CPU), and the processor may also be other general-purpose processors, digital signal processors (digital signal processor, DSP), application specific integrated circuits (application specific integrated circuit, ASIC), off-the-shelf programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories.
  • The non-volatile memory can be a read-only memory (read-only memory, ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM) or a flash memory.
  • Volatile memory can be random access memory (RAM), which acts as external cache memory.
  • static random access memory (static RAM, SRAM)
  • dynamic random access memory (dynamic RAM, DRAM)
  • synchronous dynamic random access memory (synchronous DRAM, SDRAM)
  • double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM)
  • enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM)
  • synchlink dynamic random access memory (synchlink DRAM, SLDRAM)
  • direct rambus random access memory (direct rambus RAM, DR RAM)
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or other arbitrary combinations.
  • the above-described embodiments may be implemented in whole or in part in the form of computer program products.
  • the computer program product comprises one or more computer instructions or computer programs.
  • When the computer instructions or computer programs are loaded and executed on the computer, the processes or functions according to the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired or wireless (such as infrared, radio, microwave, etc.) manner.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that includes one or more sets of available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media.
  • the semiconductor medium may be a solid state drive.
  • "At least one" means one or more, and "multiple" means two or more.
  • "At least one of the following" or similar expressions refers to any combination of these items, including any combination of single items or plural items.
  • For example, at least one item (piece) of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can be single or multiple.
  • the sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a division by logical function; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions described above are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to the field of artificial intelligence, and specifically relates to the field of computer vision. Provided are an image processing method and apparatus. The method comprises: processing an input image by means of at least one image processing module, taking the processing result as an input of a visual task model, and adjusting the at least one image processing module according to a processing result of the visual task model. According to the solution of the present application, an image processing flow suitable for a visual task model can be obtained, thereby helping to improve the performance of the visual task model.

Description

图像处理方法及装置 Image processing method and device

技术领域 Technical field
本申请涉及计算机视觉领域,并且更具体地,涉及一种图像处理方法及装置。The present application relates to the field of computer vision, and more specifically, to an image processing method and device.
背景技术Background technique
计算机视觉是各个应用领域,如制造业、检验、文档分析、医疗诊断,和军事等领域中各种智能/自主系统中不可分割的一部分,它是一门关于如何运用照相机/摄像机和计算机来获取我们所需的,被拍摄对象的数据与信息的学问。形象地说,就是给计算机安装上眼睛(照相机/摄像机)和大脑(算法)用来代替人眼对目标进行识别、跟踪和测量等,从而使计算机能够感知环境。计算机视觉可以看作是研究如何使人工系统从图像或多维数据中“感知”的科学。总的来说,计算机视觉就是用各种成象系统代替视觉器官获取输入信息,再由计算机来代替大脑对这些输入信息完成处理和解释。Computer vision is an integral part of various intelligent/autonomous systems in various application fields such as manufacturing, inspection, document analysis, medical diagnosis, and military. What we need is the knowledge of the data and information of the subject being photographed. To put it figuratively, it is to install eyes (cameras/video cameras) and brains (algorithms) on computers to replace human eyes to identify, track and measure targets, so that computers can perceive the environment. Computer vision can be seen as the science of how to make artificial systems "perceive" from images or multidimensional data. In general, computer vision is to use various imaging systems to replace the visual organs to obtain input information, and then use the computer to replace the brain to complete the processing and interpretation of these input information.
计算机视觉任务包括图像分类、目标检测、目标跟踪以及目标分割等任务。在实际应用中,通常先对生(raw)图进行一系列的图像信号处理(image signal processing,ISP),输出可视化的图像。该可视化的图像可以作为计算机视觉任务的输入图像。然而,ISP的目的通常是为了满足人的视觉需求。实际上,经过一系列的图像信号处理后得到的图像能够满足人的视觉需求,但基于该图像执行视觉任务不一定能得到理想的处理结果。Computer vision tasks include tasks such as image classification, object detection, object tracking, and object segmentation. In practical applications, a series of image signal processing (image signal processing, ISP) is usually performed on the raw (raw) image to output a visualized image. This visualized image can be used as an input image for computer vision tasks. However, the purpose of ISP is usually to meet human visual needs. In fact, the image obtained after a series of image signal processing can meet human visual needs, but performing visual tasks based on the image may not necessarily obtain ideal processing results.
发明内容Contents of the invention
本申请提供一种图像处理方法及装置,能够获得适合视觉任务的图像处理流程,提高视觉任务模型的性能。The present application provides an image processing method and device, which can obtain an image processing flow suitable for a visual task and improve the performance of a visual task model.
第一方面,提供了一种图像处理方法,该方法包括:获取第一图像;通过至少一个图像处理模块对第一图像进行处理,得到第二图像;将第二图像输入至视觉任务模型中进行处理;根据视觉任务模型的处理结果调整至少一个图像处理模块。In a first aspect, an image processing method is provided, the method comprising: acquiring a first image; processing the first image through at least one image processing module to obtain a second image; inputting the second image into a visual task model for processing Processing; adjusting at least one image processing module according to the processing results of the visual task model.
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理流程,有利于得到适合视觉任务的图像,以保证视觉任务模型的性能。本申请实施例的方案能够根据不同的应用场景的需求调整图像处理流程,以适应不同的应用场景。In the solution of the embodiment of the present application, the image processing flow is adjusted according to the processing result of the visual task model, which is beneficial to obtain an image suitable for the visual task, so as to ensure the performance of the visual task model. The solutions of the embodiments of the present application can adjust the image processing flow according to the requirements of different application scenarios, so as to adapt to different application scenarios.
示例性地,第一图像可以为传感器获取的raw图。Exemplarily, the first image may be a raw image acquired by a sensor.
图像处理模块用于对输入图像进行图像信号处理。The image processing module is used for image signal processing on the input image.
示例性地,第二图像可以为RGB图像。Exemplarily, the second image may be an RGB image.
可选地,通过至少一个图像处理模块对第一图像进行处理,得到第二图像,包括:通过至少一个图像处理模块和该至少一个图像处理模块的权重对第一图像进行处理,得到第二图像。Optionally, processing the first image through at least one image processing module to obtain the second image includes: processing the first image through at least one image processing module and the weight of the at least one image processing module to obtain the second image .
具体地,根据该至少一个图像处理模块的权重对该至少一个图像处理模块的处理结果进行调整,得到第二图像。Specifically, the processing result of the at least one image processing module is adjusted according to the weight of the at least one image processing module to obtain the second image.
示例性地,视觉任务包括:目标检测、图像分类、目标分割、目标跟踪或图像识别等。Exemplarily, the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
视觉任务模型用于执行视觉任务。例如,视觉任务为目标检测,则视觉任务模型为目标检测模型。再如,视觉任务为图像识别,则视觉任务模型为图像识别模型。The visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
视觉任务模型可以为训练好的模型。The vision task model can be a trained model.
视觉任务模型的处理结果可以包括视觉任务模型的性能指标。The processing results of the vision task model may include performance indicators of the vision task model.
示例性地,视觉任务模型的性能指标包括推理的准确度或损失函数的值等。损失函数可以根据需要设置。损失函数用于指示视觉任务模型的推理结果与第一图像对应的真值之间的差异。需要说明的是,此处的损失函数可以采用视觉任务模型训练过程中的损失函数,或者,也可以采用其他形式的损失函数。Exemplarily, the performance index of the vision task model includes the accuracy of reasoning or the value of the loss function. The loss function can be set as needed. The loss function is used to indicate the difference between the inference result of the vision task model and the true value corresponding to the first image. It should be noted that the loss function here may be the loss function in the training process of the vision task model, or other forms of loss functions may also be used.
例如,视觉任务为目标检测,则视觉任务模型的处理结果可以包括检测准确度。For example, if the vision task is target detection, the processing result of the vision task model may include detection accuracy.
再如,视觉任务为目标分割,则视觉任务模型的处理结果可以包括分割准确度。For another example, if the vision task is target segmentation, the processing result of the vision task model may include segmentation accuracy.
视觉任务模型可以采用神经网络模型,或者,也可以采用非神经网络模型。The visual task model may use a neural network model, or may also use a non-neural network model.
根据视觉任务模型的处理结果调整该至少一个图像处理模块,以使视觉任务模型的处理结果尽可能接近预期。The at least one image processing module is adjusted according to the processing result of the visual task model, so that the processing result of the visual task model is as close to expectation as possible.
示例性地,可以采用贝叶斯优化方法、RNN模型或强化学习算法等方式调整至少一个图像调整模块。Exemplarily, the at least one image adjustment module may be adjusted by means of a Bayesian optimization method, an RNN model, or a reinforcement learning algorithm.
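Purely as an illustrative sketch of this adjustment loop (the module names, parameter ranges, random-search strategy and helper objects below are assumptions rather than part of the embodiment; a Bayesian optimizer, an RNN controller or a reinforcement learning agent could equally drive the search), the idea can be written in Python as follows:

```python
import random

# Hypothetical search space for two ISP module parameters (assumed names/ranges).
SEARCH_SPACE = {
    "gamma_correction.gamma": [1.8, 2.0, 2.2, 2.4],
    "denoise_sharpen.strength": [0.0, 0.25, 0.5, 0.75, 1.0],
}

def evaluate_accuracy(params, raw_images, labels, isp_pipeline, task_model):
    """Run the ISP pipeline with the given parameters, then measure the vision
    task model's accuracy on the processed images (isp_pipeline and task_model
    are assumed helper objects supplied by the caller)."""
    isp_pipeline.set_params(params)
    processed = [isp_pipeline(img) for img in raw_images]
    return task_model.accuracy(processed, labels)

def random_search(raw_images, labels, isp_pipeline, task_model, n_trials=50):
    """Simple random search over the parameter space; a Bayesian optimizer or
    RL agent could replace this loop without changing the overall idea."""
    best_params, best_acc = None, -1.0
    for _ in range(n_trials):
        params = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        acc = evaluate_accuracy(params, raw_images, labels, isp_pipeline, task_model)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc
```

The only feedback signal used by the loop is the processing result of the visual task model, which is exactly the adjustment criterion described above.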
With reference to the first aspect, in some implementations of the first aspect, adjusting the at least one image processing module according to the processing result of the visual task model includes: adjusting the at least one image processing module according to the image processing time and the processing result of the visual task model.
The image processing time may be the processing time of the visual task model, or the processing time of the at least one image processing module, or the sum of the processing time of the visual task model and the processing time of the at least one image processing module.
这样,能够在保证视觉任务模型的性能的前提下,提高处理速度,降低时延。In this way, the processing speed can be improved and the time delay can be reduced under the premise of ensuring the performance of the visual task model.
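A minimal sketch of one possible way to balance the two factors is a scalar objective that rewards the accuracy reported by the visual task model and penalizes the total processing time; the function name and the weighting value below are assumptions for illustration only:

```python
def joint_score(accuracy, isp_time_ms, model_time_ms, latency_weight=0.001):
    """Hypothetical joint objective: reward accuracy, penalize the total
    processing time (ISP time plus vision-model time). The latency weight
    controls the trade-off and is an assumed value."""
    total_time_ms = isp_time_ms + model_time_ms
    return accuracy - latency_weight * total_time_ms
```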
With reference to the first aspect, in some implementations of the first aspect, the at least one image processing module includes a plurality of image processing modules, and adjusting the at least one image processing module according to the processing result of the visual task model includes: changing the at least one image processing module.
更改该至少一个图像处理模块,可以包括:删除该至少一个图像处理模块中的部分图像处理模块或/和增加其他图像处理模块。Changing the at least one image processing module may include: deleting some image processing modules in the at least one image processing module or/and adding other image processing modules.
在本申请实施例的方案中,根据视觉任务模型的处理结果更改图像处理模块的组合,能够获得更适合视觉任务模型的图像处理模块的组合,有利于提高视觉任务模型的性能。In the solution of the embodiment of the present application, the combination of image processing modules is changed according to the processing results of the visual task model, so that a combination of image processing modules more suitable for the visual task model can be obtained, which is conducive to improving the performance of the visual task model.
With reference to the first aspect, in some implementations of the first aspect, the at least one image processing module includes a plurality of image processing modules, and adjusting the at least one image processing module according to the processing result of the visual task model includes: deleting some of the plurality of image processing modules according to the processing result of the visual task model.
在本申请实施例的方案中,根据视觉任务模型的处理结果删除部分图像处理模块,能够减少图像处理所需的时间,提高处理速度,减少对计算力的要求。In the solution of the embodiment of the present application, some image processing modules are deleted according to the processing results of the visual task model, which can reduce the time required for image processing, increase the processing speed, and reduce the requirement for computing power.
With reference to the first aspect, in some implementations of the first aspect, deleting some of the plurality of image processing modules according to the processing result of the visual task model includes: adjusting weights of the plurality of image processing modules according to the processing result of the visual task model, where the weights of the plurality of image processing modules are used to process the processing results of the plurality of image processing modules to obtain the second image; and deleting some of the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
In the solution of the embodiment of the present application, the image processing modules to be deleted are determined according to the weight of each image processing module, and image processing modules with relatively small weight values are deleted. This has little impact on the processing result of the visual task model, so the performance of the visual task model is barely affected after the deletion. In other words, the solution of the embodiment of the present application can reduce unnecessary operations, reduce computing overhead, and increase processing speed while ensuring the performance of the visual task model.
示例性地,该多个图像处理模块为m个图像处理模块。m为大于1的整数。从该m个图像处理模块中删除调整后的权重最小的n个图像处理模块。n为大于1且小于m的整数。Exemplarily, the multiple image processing modules are m image processing modules. m is an integer greater than 1. The n image processing modules with the smallest adjusted weights are deleted from the m image processing modules. n is an integer greater than 1 and less than m.
可替换地,从该m个图像处理模块中删除调整后的权重小于或等于权重阈值的图像处理模块。Alternatively, an image processing module whose adjusted weight is less than or equal to a weight threshold is deleted from the m image processing modules.
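The following Python sketch illustrates this weight-based pruning rule; the threshold value and the fallback of always keeping the highest-weight module are assumptions for illustration:

```python
def prune_modules(modules, weights, threshold=0.05):
    """Drop image processing modules whose adjusted weight is at or below the
    threshold; alternatively, the n smallest-weight modules could be dropped."""
    kept = [(m, w) for m, w in zip(modules, weights) if w > threshold]
    if not kept:
        # Keep at least the highest-weight module so the pipeline is not empty.
        kept = [max(zip(modules, weights), key=lambda mw: mw[1])]
    kept_modules = [m for m, _ in kept]
    kept_weights = [w for _, w in kept]
    return kept_modules, kept_weights
```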
结合第一方面,在第一方面的某些实现方式中,根据视觉任务模型的处理结果调整至少一个图像处理模块,包括:根据视觉任务模型的处理结果调整至少一个图像处理模块中的参数。With reference to the first aspect, in some implementation manners of the first aspect, adjusting at least one image processing module according to a processing result of the visual task model includes: adjusting parameters in at least one image processing module according to a processing result of the visual task model.
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理模块中的参数,能够获得更适合视觉任务的图像处理模块,有利于提高视觉任务的准确度。In the solutions of the embodiments of the present application, by adjusting the parameters in the image processing module according to the processing results of the visual task model, an image processing module more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
With reference to the first aspect, in some implementations of the first aspect, adjusting the at least one image processing module according to the processing result of the visual task model includes: deleting some image processing modules from the plurality of image processing modules according to the processing result of the visual task model; processing a fifth image through the image processing modules that were not deleted to obtain a sixth image, and inputting the sixth image into the visual task model for processing; and adjusting parameters of the image processing modules that were not deleted according to the processing result of the visual task model.
According to the solution of the embodiment of the present application, the performance indicators obtained by the visual task model, such as target detection accuracy or target segmentation accuracy, are used to adjust the weights of the plurality of image processing modules, and the image processing modules that have a relatively large impact on the performance indicators of the visual task model are retained, that is, the image processing modules that can maintain or improve the performance indicators of the visual task model are kept. In this way, image processing modules suitable for the visual task model, or the image processing modules actually required by the visual task model, can be obtained, which reduces the time required for the image processing flow, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
Moreover, the performance indicators obtained by the visual task model are used to adjust the parameters in the retained image processing modules, for example, by searching the design space of the image processing modules, which helps to obtain the optimal parameter configuration of each image processing module and thereby improve the performance of the visual task model.
结合第一方面,在第一方面的某些实现方式中,根据视觉任务模型的处理结果调整至少一个图像处理模块,包括:根据视觉任务模型的处理结果调整至少一个图像处理模块的处理顺序。With reference to the first aspect, in some implementation manners of the first aspect, adjusting the at least one image processing module according to the processing result of the visual task model includes: adjusting the processing sequence of the at least one image processing module according to the processing result of the visual task model.
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理模块的处理顺序,能够获得更适合视觉任务的图像处理流程,有利于提高视觉任务的准确度。In the solution of the embodiment of the present application, the processing sequence of the image processing module is adjusted according to the processing result of the visual task model, so that an image processing flow more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
结合第一方面,在第一方面的某些实现方式中,至少一个图像处理模块包括:黑电平补偿模块、绿平衡模块、坏点修正模块、去马赛克模块、拜耳降噪模块、自动白平衡模块、色彩校正模块、伽马校正模块或降噪及锐化模块。In combination with the first aspect, in some implementations of the first aspect, at least one image processing module includes: a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, color correction module, gamma correction module or noise reduction and sharpening module.
该至少一个图像处理模块中的任一图像处理模块可以采用神经网络算法实现,或者,也可以采用非神经网络算法实现。Any image processing module in the at least one image processing module may be implemented by a neural network algorithm, or may also be implemented by a non-neural network algorithm.
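As a rough sketch of how such a configurable processing flow might be wired together (the class name, the per-module weighting scheme and the blending rule are assumptions; they are only one possible realization of per-module weights, not the embodiment itself), consider:

```python
import numpy as np

class WeightedISPPipeline:
    """Applies a sequence of image processing modules; each module's output is
    blended with its input according to a per-module weight in [0, 1], so a
    weight near 0 effectively disables the module (an assumed design)."""

    def __init__(self, modules, weights):
        assert len(modules) == len(weights)
        self.modules = modules        # e.g. [black_level, demosaic, awb, gamma]
        self.weights = list(weights)  # one weight per module

    def __call__(self, image: np.ndarray) -> np.ndarray:
        out = image
        for module, w in zip(self.modules, self.weights):
            if w <= 0.0:
                continue              # a zero weight effectively removes the module
            processed = module(out)
            if processed.shape != out.shape:
                out = processed       # shape-changing steps (e.g. demosaicing) are taken as-is
            else:
                out = (1.0 - w) * out + w * processed
        return out
```

Setting a module's weight to zero in this sketch corresponds to removing it from the flow, which connects naturally to the weight-based pruning discussed above.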
In a second aspect, an image processing method is provided. The method includes: acquiring a third image; determining at least one target image processing module according to a visual task model; processing the third image through the at least one target image processing module to obtain a fourth image; and processing the fourth image through the visual task model to obtain a processing result of the fourth image.
According to the solution of the embodiment of the present application, different visual task models correspond to different configurations of image processing modules. When the visual task model changes, the image processing modules can adaptively match the visual task model, making the image processing flow more suitable for the visual task model, which helps to improve the performance of the visual task model.
示例性地,第三图像可以为传感器获取的raw图。Exemplarily, the third image may be a raw image acquired by the sensor.
第四图像的处理结果也可以理解为第三图像的处理结果。The processing result of the fourth image can also be understood as the processing result of the third image.
第四图像的处理结果即为视觉任务模型的推理结果。The processing result of the fourth image is the reasoning result of the visual task model.
该至少一个目标图像处理模块是与视觉任务模型对应的一个或多个图像处理模块。The at least one target image processing module is one or more image processing modules corresponding to the visual task model.
示例性地,视觉任务包括:目标检测、图像分类、目标分割、目标跟踪或图像识别等。Exemplarily, the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
视觉任务模型用于执行视觉任务。例如,视觉任务为目标检测,则视觉任务模型为目标检测模型。再如,视觉任务为图像识别,则视觉任务模型为图像识别模型。The visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
视觉任务模型可以为训练好的模型。The vision task model can be a trained model.
在不同的应用场景中,可以采用不同的视觉任务模型,相应地,根据不同的视觉任务模型即可确定与该视觉任务模型匹配的至少一个目标图像处理模块。这样,可以根据不同的应用场景选用不同的图像处理模块。In different application scenarios, different visual task models may be used, and accordingly, at least one target image processing module matching the visual task model may be determined according to different visual task models. In this way, different image processing modules can be selected according to different application scenarios.
视觉任务模型和图像处理模块的配置之间具有对应关系。根据视觉任务模型和图像处理模块的配置之间的对应关系可以确定与当前的视觉任务模型匹配的图像处理模块的配置。There is a corresponding relationship between the vision task model and the configuration of the image processing module. According to the corresponding relationship between the visual task model and the configuration of the image processing module, the configuration of the image processing module matching the current visual task model can be determined.
示例性地,图像处理模块的配置包括以下至少一项:图像处理模块的组合、图像处理模块的权重、图像处理模块的处理顺序或者图像处理模块中的参数。Exemplarily, the configuration of the image processing modules includes at least one of the following: a combination of image processing modules, a weight of the image processing modules, a processing order of the image processing modules, or parameters in the image processing modules.
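One simple way to picture such a correspondence is a lookup table from a vision task model to its image processing configuration; every identifier and value in the sketch below is an illustrative assumption rather than data from the embodiment:

```python
# Hypothetical mapping from a vision task model identifier to an image
# processing configuration (module combination, weights and parameters).
ISP_CONFIGS = {
    "detection_model_v1": {
        "modules": ["black_level", "demosaic", "awb", "gamma"],
        "weights": [1.0, 1.0, 0.8, 0.6],
        "params":  {"gamma": {"gamma": 2.2}},
    },
    "segmentation_model_v1": {
        "modules": ["black_level", "demosaic", "denoise_sharpen"],
        "weights": [1.0, 1.0, 0.5],
        "params":  {"denoise_sharpen": {"strength": 0.4}},
    },
}

def select_isp_config(task_model_name: str) -> dict:
    """Return the image processing configuration matched to the given vision
    task model, i.e. the 'at least one target image processing module'."""
    return ISP_CONFIGS[task_model_name]
```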
结合第二方面,在第二方面的某些实现方式中,根据视觉任务模型确定至少一个目标图像处理模块,包括:根据视觉任务模型从多个候选图像处理模块中确定至少一个目标图像处理模块。With reference to the second aspect, in some implementations of the second aspect, determining at least one target image processing module according to the visual task model includes: determining at least one target image processing module from multiple candidate image processing modules according to the visual task model.
According to the solution of the embodiment of the present application, different visual task models correspond to different combinations of image processing modules. When the visual task model changes, the combination of image processing modules can adaptively match the visual task model, so that the current combination of image processing modules is more suitable for the current visual task model, which helps to improve the performance of the visual task model.
而且,根据视觉任务模型从多个候选图像处理模块中选择适合的图像处理模块,无需使用所有的候选图像处理模块对图像进行处理,减少了处理流程,降低了对计算力的要求。Moreover, by selecting a suitable image processing module from multiple candidate image processing modules according to the visual task model, it is not necessary to use all the candidate image processing modules to process images, which reduces the processing flow and reduces the requirement for computing power.
视觉任务模型和图像处理模块的组合之间具有对应关系。根据该对应关系即可确定当前视觉任务模型对应的图像处理模块的组合,或者说,根据该对应关系即可确定用于该视觉任务模型所需的图像处理模块,即该至少一个目标图像处理模块。There is a correspondence between the combination of the visual task model and the image processing module. According to the corresponding relationship, the combination of image processing modules corresponding to the current visual task model can be determined, or in other words, the image processing module required for the visual task model can be determined according to the corresponding relationship, that is, the at least one target image processing module .
结合第二方面,在第二方面的某些实现方式中,根据视觉任务模型确定至少一个目标图像处理模块,包括:根据视觉任务模型确定至少一个目标图像处理模块的权重,至少一个目标图像处理模块的权重用于对至少一个目标图像处理模块的处理结果进行处理,得到第四图像。With reference to the second aspect, in some implementations of the second aspect, determining at least one target image processing module according to the visual task model includes: determining the weight of at least one target image processing module according to the visual task model, and at least one target image processing module The weights of are used to process the processing result of at least one target image processing module to obtain a fourth image.
According to the solution of the embodiment of the present application, different visual task models correspond to different weights of the image processing modules. When the visual task model changes, the weights of the image processing modules can adaptively match the visual task model, so that the current weights of the image processing modules are more suitable for the current visual task model, which helps to improve the performance of the visual task model.
结合第二方面,在第二方面的某些实现方式中,根据视觉任务模型确定至少一个目标图像处理模块,包括:根据视觉任务模型确定至少一个目标图像处理模块中的参数。With reference to the second aspect, in some implementation manners of the second aspect, determining at least one target image processing module according to the visual task model includes: determining parameters in the at least one target image processing module according to the visual task model.
According to the solution of the embodiment of the present application, different visual task models correspond to different parameters in the image processing modules. When the visual task model changes, the parameters in the image processing modules can adaptively match the visual task model, so that the current parameters in the image processing modules are more suitable for the current visual task model, which helps to improve the performance of the visual task model.
视觉任务模型和图像处理模块中的参数之间具有对应关系。根据视觉任务模型可以确定视觉任务模型对应的图像处理模块中的参数,即该至少一个目标图像处理模块中的参数。There is a corresponding relationship between the visual task model and the parameters in the image processing module. According to the visual task model, parameters in the image processing module corresponding to the visual task model can be determined, that is, parameters in the at least one target image processing module.
结合第二方面,在第二方面的某些实现方式中,其特征在于,根据视觉任务模型确定至少一个目标图像处理模块,包括:根据视觉任务模型确定至少一个目标图像处理模块的处理顺序。With reference to the second aspect, in some implementations of the second aspect, it is characterized in that determining at least one target image processing module according to the visual task model includes: determining a processing order of the at least one target image processing module according to the visual task model.
According to the solution of the embodiment of the present application, different visual task models correspond to different processing orders of the image processing modules. When the visual task model changes, the processing order of the image processing modules can adaptively match the visual task model, so that the current processing order of the image processing modules is more suitable for the current visual task model, which helps to improve the performance of the visual task model.
视觉任务模型和图像处理模块的处理顺序之间具有对应关系。根据该对应关系可以确定当前视觉任务模型对应的图像处理模块的处理顺序,即该至少一个目标图像处理模块的处理顺序。There is a corresponding relationship between the visual task model and the processing sequence of the image processing module. According to the corresponding relationship, the processing order of the image processing modules corresponding to the current visual task model can be determined, that is, the processing order of the at least one target image processing module.
结合第二方面,在第二方面的某些实现方式中,至少一个目标图像处理模块包括:黑电平补偿模块、绿平衡模块、坏点修正模块、去马赛克模块、拜耳降噪模块、自动白平衡模块、色彩校正模块、伽马校正模块或降噪及锐化模块。In combination with the second aspect, in some implementations of the second aspect, at least one target image processing module includes: a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white Balance Module, Color Correction Module, Gamma Correction Module or Noise Reduction and Sharpening Module.
第三方面,提供了一种图像处理装置,该装置包括用于执行上述第一方面以及第一方面中的任意一种实现方式中的方法的模块或单元。In a third aspect, an image processing apparatus is provided, and the apparatus includes a module or unit for executing the method in any one of the above-mentioned first aspect and the first aspect.
第四方面,提供了一种图像处理装置,该装置包括用于执行上述第二方面以及第二方面中的任意一种实现方式中的方法的模块或单元。According to a fourth aspect, an image processing device is provided, and the device includes a module or unit for executing the method in any one of the above-mentioned second aspect and the second aspect.
应理解,在上述第一方面中对相关内容的扩展、限定、解释和说明也适用于第二方面、第三方面和第四方面中相同的内容。It should be understood that the expansion, limitation, explanation and illustration of the related content in the above first aspect are also applicable to the same content in the second aspect, the third aspect and the fourth aspect.
In a fifth aspect, an image processing apparatus is provided. The apparatus includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to execute the method in the first aspect or any implementation of the first aspect.
The processor in the fifth aspect above may be a central processing unit (CPU), or a combination of a CPU and a neural network computing processor, where the neural network computing processor may include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), and the like. The TPU is an artificial intelligence accelerator application-specific integrated circuit fully customized by Google for machine learning.
In a sixth aspect, an image processing apparatus is provided. The apparatus includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to execute the method in the second aspect or any implementation of the second aspect.
The processor in the sixth aspect above may be a central processing unit, or a combination of a CPU and a neural network computing processor, where the neural network computing processor may include a graphics processing unit, a neural-network processing unit, a tensor processing unit, and the like. The TPU is Google's fully customized artificial intelligence accelerator application-specific integrated circuit for machine learning.
In a seventh aspect, a computer-readable storage medium is provided. The computer-readable medium stores program code for execution by a device, and the program code includes instructions for executing the method in any implementation of the first aspect or the second aspect.
第八方面,提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述第一方面或第二方面中的任意一种实现方式中的方法。In an eighth aspect, a computer program product containing instructions is provided, and when the computer program product is run on a computer, the computer is made to execute the method in any one of the above-mentioned first aspect or the second aspect.
In a ninth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to execute the method in any implementation of the first aspect or the second aspect.
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行第一方面或第二方面中的任意一种实现方式中的方法。Optionally, as an implementation manner, the chip may further include a memory, the memory stores instructions, the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the The processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
上述芯片具体可以是现场可编程门阵列(field-programmable gate array,FPGA)或者专用集成电路(application-specific integrated circuit,ASIC)。The aforementioned chip may specifically be a field-programmable gate array (field-programmable gate array, FPGA) or an application-specific integrated circuit (application-specific integrated circuit, ASIC).
附图说明Description of drawings
图1为本申请实施例提供的一种系统架构的结构示意图;FIG. 1 is a schematic structural diagram of a system architecture provided in an embodiment of the present application;
图2为本申请实施例提供的一种图像处理流程的示意图;FIG. 2 is a schematic diagram of an image processing flow provided by an embodiment of the present application;
图3为本申请实施例提供的一种图像处理方法的示意性流程图;FIG. 3 is a schematic flowchart of an image processing method provided in an embodiment of the present application;
图4为本申请实施例提供的另一种图像处理流程的示意图;FIG. 4 is a schematic diagram of another image processing flow provided by the embodiment of the present application;
图5为本申请实施例提供的又一种图像处理流程的示意图;FIG. 5 is a schematic diagram of another image processing flow provided by the embodiment of the present application;
图6为本申请实施例提供的另一种图像处理方法的示意性流程图;FIG. 6 is a schematic flowchart of another image processing method provided by the embodiment of the present application;
图7是本申请实施例提供的一种图像处理装置的示意性框图;Fig. 7 is a schematic block diagram of an image processing device provided by an embodiment of the present application;
图8是本申请实施例提供的另一种图像处理装置的示意性框图。Fig. 8 is a schematic block diagram of another image processing apparatus provided by an embodiment of the present application.
具体实施方式detailed description
下面将结合附图,对本申请中的技术方案进行描述。The technical solution in this application will be described below with reference to the accompanying drawings.
本申请实施例可以应用在自动驾驶、图像分类、图像检索、图像语义分割、图像质量增强、图像超分辨率、监控、目标跟踪、目标检测等需要执行视觉任务的领域。The embodiments of the present application can be applied in fields such as automatic driving, image classification, image retrieval, image semantic segmentation, image quality enhancement, image super-resolution, monitoring, object tracking, object detection, etc. that need to perform visual tasks.
具体而言,本申请实施例的方法能够应用在图片分类和监控场景中,下面分别对这两种应用场景进行简单的介绍。Specifically, the method in the embodiment of the present application can be applied in picture classification and monitoring scenarios, and the following two application scenarios are briefly introduced respectively.
图片分类:Image classification:
当用户在终端设备(例如,手机)或者云盘上存储了大量的图片时,通过对相册中图像进行识别可以方便用户或者系统对相册进行分类管理,提升用户体验。When a user stores a large number of pictures on a terminal device (for example, a mobile phone) or a cloud disk, it is convenient for the user or the system to classify and manage the album by identifying the images in the album, thereby improving user experience.
利用本申请实施例的图像处理方法,能够获得适合执行分类任务的图像,提高分类的准确率。此外,能够减少图像处理流程,降低硬件开销,对终端设备更友好,提高对图片进行分类的速度,有利于实时为不同的类别的图片打上标签,便于用户查看和查找。另外, 这些图片的分类标签也可以提供给相册管理系统进行分类管理,节省用户的管理时间,提高相册管理的效率,提升用户体验。Using the image processing method of the embodiment of the present application, an image suitable for performing a classification task can be obtained, and the accuracy of classification can be improved. In addition, it can reduce the image processing process, reduce hardware overhead, be more friendly to terminal equipment, increase the speed of classifying pictures, and help to label pictures of different categories in real time, which is convenient for users to view and find. In addition, the classification tags of these pictures can also be provided to the album management system for classification management, which saves management time for users, improves the efficiency of album management, and improves user experience.
监控：Monitoring:
监控场景包括:智慧城市、野外监控、室内监控、室外监控、车内监控等。其中,智慧城市场景下,需要进行多种属性识别,例如行人属性识别和骑行属性识别,深度神经网络凭借着其强大的能力在多种属性识别中发挥着重要的作用。Surveillance scenarios include: smart city, field surveillance, indoor surveillance, outdoor surveillance, in-vehicle surveillance, etc. Among them, in the smart city scene, multiple attribute recognition is required, such as pedestrian attribute recognition and riding attribute recognition. Deep neural networks play an important role in multiple attribute recognition by virtue of their powerful capabilities.
通过采用本申请实施例的图像处理方法,能够获得适合执行属性识别任务的图像,提高识别的准确率。此外,能够减少图像处理流程,降低硬件开销,提高处理效率,有利于对输入的道路画面进行实时处理,更快地识别出道路画面中的不同的属性信息。By adopting the image processing method of the embodiment of the present application, an image suitable for performing an attribute recognition task can be obtained, and the accuracy of recognition can be improved. In addition, the image processing flow can be reduced, the hardware overhead can be reduced, and the processing efficiency can be improved, which is conducive to real-time processing of the input road picture and faster recognition of different attribute information in the road picture.
由于本申请实施例涉及大量神经网络的应用,为了便于理解,下面先对本申请实施例可能涉及的神经网络的相关术语和概念进行介绍。Since the embodiment of the present application involves the application of a large number of neural networks, for ease of understanding, the following first introduces the related terms and concepts of the neural network that may be involved in the embodiment of the present application.
(1)神经网络(1) neural network
A neural network may be composed of neural units. A neural unit may be an operation unit that takes $x_{s}$ and an intercept of 1 as inputs, and the output of the operation unit may be:

$h_{W,b}(x)=f\left(W^{T}x\right)=f\left(\sum_{s=1}^{n}W_{s}x_{s}+b\right)$

where $s=1,2,\ldots,n$, $n$ is a natural number greater than 1, $W_{s}$ is the weight of $x_{s}$, and $b$ is the bias of the neural unit.
f为神经单元的激活函数(activation functions),用于将非线性特性引入神经网络中,来将神经单元中的输入信号变换为输出信号。该激活函数的输出信号可以作为下一层的输入。例如,激活函数可以是ReLU,tanh或sigmoid函数。f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to transform the input signal in the neural unit into an output signal. The output signal of this activation function can be used as the input of the next layer. For example, the activation function can be a ReLU, tanh or sigmoid function.
神经网络是将多个上述单一的神经单元联结在一起形成的网络,即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连,来提取局部接受域的特征,局部接受域可以是由若干个神经单元组成的区域。A neural network is a network formed by connecting multiple above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field. The local receptive field can be an area composed of several neural units.
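The computation of a single neural unit described by the formula above can be sketched in a few lines of Python (sigmoid is chosen here only as an example activation function):

```python
import numpy as np

def neural_unit(x: np.ndarray, w: np.ndarray, b: float) -> float:
    """Single neural unit: weighted sum of the inputs plus a bias, passed
    through a sigmoid activation (ReLU or tanh could be used instead)."""
    s = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-s))  # sigmoid activation f
```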
(2)深度神经网络(2) Deep Neural Network
深度神经网络(deep neural network,DNN),也称多层神经网络,可以理解为具有多层隐含层的神经网络。按照不同层的位置对DNN进行划分,DNN内部的神经网络可以分为三类:输入层,隐含层,输出层。一般来说第一层是输入层,最后一层是输出层,中间的层数都是隐含层。层与层之间是全连接的,也就是说,第i层的任意一个神经元一定与第i+1层的任意一个神经元相连。Deep neural network (DNN), also known as multi-layer neural network, can be understood as a neural network with multiple hidden layers. DNN is divided according to the position of different layers, and the neural network inside DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally speaking, the first layer is the input layer, the last layer is the output layer, and the layers in the middle are all hidden layers. The layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
Although a DNN looks complicated, the work of each layer is actually not complicated. In simple terms, each layer computes the following linear relationship:

$\vec{y}=\alpha\left(W\vec{x}+\vec{b}\right)$

where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the bias (offset) vector, $W$ is the weight matrix (also called coefficients), and $\alpha(\cdot)$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, there are also many coefficients $W$ and bias vectors $\vec{b}$. These parameters are defined in the DNN as follows, taking the coefficient $W$ as an example: assume that in a three-layer DNN, the linear coefficient from the fourth neuron of the second layer to the second neuron of the third layer is defined as $W_{24}^{3}$. The superscript 3 represents the layer in which the coefficient $W$ is located, and the subscripts correspond to the output index 2 in the third layer and the input index 4 in the second layer.

In summary, the coefficient from the k-th neuron of layer L-1 to the j-th neuron of layer L is defined as $W_{jk}^{L}$.

It should be noted that the input layer has no $W$ parameters. In a deep neural network, more hidden layers enable the network to better characterize complex situations in the real world. In theory, a model with more parameters has higher complexity and a larger "capacity", which means that it can complete more complex learning tasks. Training a deep neural network is the process of learning the weight matrices, and its ultimate goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors $W$ of many layers).
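A per-layer forward pass of this form can be sketched as follows (ReLU is used here only as an example of the activation α; the function names and data layout are illustrative assumptions):

```python
import numpy as np

def dense_layer(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """One fully connected layer: y = alpha(W x + b), with ReLU as alpha."""
    return np.maximum(0.0, W @ x + b)

def dnn_forward(x: np.ndarray, layers) -> np.ndarray:
    """Stack of layers; `layers` is a list of (W, b) pairs, one per layer."""
    out = x
    for W, b in layers:
        out = dense_layer(out, W, b)
    return out
```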
(3)卷积神经网络(3) Convolutional neural network
卷积神经网络(convolutional neuron network,CNN)是一种带有卷积结构的深度神经网络。卷积神经网络包含了一个由卷积层和子采样层构成的特征抽取器,该特征抽取器可以看作是滤波器。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元可以只与部分邻层神经元连接。一个卷积层中,通常包含若干个特征平面,每个特征平面可以由一些矩形排列的神经单元组成。同一特征平面的神经单元共享权重,这里共享的权重就是卷积核。共享权重可以理解为提取图像信息的方式与位置无关。卷积核可以以随机大小的矩阵的形式化,在卷积神经网络的训练过程中卷积核可以通过学习得到合理的权重。另外,共享权重带来的直接好处是减少卷积神经网络各层之间的连接,同时又降低了过拟合的风险。Convolutional neural network (CNN) is a deep neural network with a convolutional structure. The convolutional neural network contains a feature extractor composed of a convolutional layer and a subsampling layer, which can be regarded as a filter. The convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network. In the convolutional layer of a convolutional neural network, a neuron can only be connected to some adjacent neurons. A convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units of the same feature plane share weights, and the shared weights here are convolution kernels. Shared weights can be understood as a way to extract image information that is independent of location. The convolution kernel can be formalized as a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network. In addition, the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
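The weight-sharing idea can be sketched as a plain 2-D convolution in which one small kernel is slid over every position of the input (a naive, single-channel illustration rather than an optimized implementation):

```python
import numpy as np

def conv2d_single_channel(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid 2-D convolution (cross-correlation) with one shared kernel: the
    same weights are applied at every spatial position of the input."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```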
(4)循环神经网络(4) Recurrent neural network
循环神经网络(recurrent neural networks,RNN)是用来处理序列数据的。在传统的神经网络模型中,是从输入层到隐含层再到输出层,层与层之间是全连接的,而对于每一层层内之间的各个节点是无连接的。这种普通的神经网络虽然解决了很多难题,但是却仍然对很多问题却无能无力。例如,你要预测句子的下一个单词是什么,一般需要用到前面的单词,因为一个句子中前后单词并不是独立的。RNN之所以称为循环神经网路,即一个序列当前的输出与前面的输出也有关。具体的表现形式为网络会对前面的信息进行记忆并应用于当前输出的计算中,即隐含层本层之间的节点不再无连接而是有连接的,并且隐含层的输入不仅包括输入层的输出还包括上一时刻隐含层的输出。理论上,RNN能够对任何长度的序列数据进行处理。对于RNN的训练和对传统的CNN或DNN的训练一样。同样使用误差反向传播算法,不过有一点区别:即,如果将RNN进行网络展开,那么其中的参数,如W,是共享的;而如上举例上述的传统神经网络却不是这样。并且在使用梯度下降算法中,每一步的输出不仅依赖当前步的网络,还依赖前面若干步网络的状态。该学习算法称为基于时间的反向传播算法(back propagation through time,BPTT)。Recurrent neural networks (RNN) are used to process sequence data. In the traditional neural network model, from the input layer to the hidden layer to the output layer, the layers are fully connected, and each node in each layer is unconnected. Although this ordinary neural network solves many problems, it is still powerless to many problems. For example, if you want to predict what the next word in a sentence is, you generally need to use the previous words, because the preceding and following words in a sentence are not independent. The reason why RNN is called a recurrent neural network is that the current output of a sequence is also related to the previous output. The specific manifestation is that the network will remember the previous information and apply it to the calculation of the current output, that is, the nodes between the hidden layer and the current layer are no longer connected but connected, and the input of the hidden layer not only includes The output of the input layer also includes the output of the hidden layer at the previous moment. In theory, RNN can process sequence data of any length. The training of RNN is the same as that of traditional CNN or DNN. The error backpropagation algorithm is also used, but there is a difference: that is, if the RNN is expanded to the network, then the parameters, such as W, are shared; while the above-mentioned traditional neural network is not the case. And in the gradient descent algorithm, the output of each step depends not only on the network of the current step, but also depends on the state of the previous several steps of the network. This learning algorithm is called back propagation through time (BPTT) based on time.
既然已经有了卷积神经网络,为什么还要循环神经网络?原因很简单,在卷积神经网络中,有一个前提假设是:元素之间是相互独立的,输入与输出也是独立的,比如猫和狗。但现实世界中,很多元素都是相互连接的,比如股票随时间的变化,再比如一个人说了:我喜欢旅游,其中最喜欢的地方是云南,以后有机会一定要去。这里填空,人类应该都知道是填“云南”。因为人类会根据上下文的内容进行推断,但如何让机器做到这一步?RNN就应运而生了。RNN旨在让机器像人一样拥有记忆的能力。因此,RNN的输出就需要依赖当前的输入信息和历史的记忆信息。Since there are already convolutional neural networks, why do we need recurrent neural networks? The reason is simple. In the convolutional neural network, there is a premise that the elements are independent of each other, and the input and output are also independent, such as cats and dogs. But in the real world, many elements are interconnected, such as the change of stocks over time, or a person said: I like to travel, and my favorite place is Yunnan, and I must go there in the future. Fill in the blank here, humans should know that it is to fill in "Yunnan". Because humans will infer based on the content of the context, but how to make the machine do this? RNN came into being. RNN is designed to allow machines to have the ability to remember like humans. Therefore, the output of RNN needs to depend on the current input information and historical memory information.
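The dependence of the current output on both the current input and the remembered hidden state can be sketched as follows (tanh and the parameter names are illustrative assumptions; the same parameters are reused at every time step, which is the weight sharing mentioned above):

```python
import numpy as np

def rnn_step(x_t: np.ndarray, h_prev: np.ndarray, W_x, W_h, b) -> np.ndarray:
    """One recurrent step: the new hidden state depends on the current input
    and on the memory carried in the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

def rnn_forward(inputs, h0: np.ndarray, W_x, W_h, b) -> np.ndarray:
    """Unroll over a sequence, reusing the same (shared) parameters each step."""
    h = h0
    for x_t in inputs:
        h = rnn_step(x_t, h, W_x, W_h, b)
    return h
```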
(5)损失函数(5) Loss function
在训练深度神经网络的过程中,因为希望深度神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有化的过程,即为深度神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断地调整,直到深度神经网络能够预测出真正想要的目标值或与 真正想要的目标值非常接近的值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。通常地,loss越小,该深度神经网络的训练质量越高,loss越大,深度神经网络的训练质量越低。类似的,loss波动越小,训练越稳定;loss波动越大,训练越不稳定。In the process of training the deep neural network, because it is hoped that the output of the deep neural network is as close as possible to the value you really want to predict, you can compare the predicted value of the current network with the target value you really want, and then according to the difference between the two to update the weight vector of each layer of neural network (of course, there is usually a process of optimization before the first update, which is to pre-configure parameters for each layer in the deep neural network), for example, if the predicted value of the network If it is high, adjust the weight vector to make it predict lower, and keep adjusting until the deep neural network can predict the real desired target value or a value very close to the real desired target value. Therefore, it is necessary to pre-define "how to compare the difference between the predicted value and the target value", which is the loss function (loss function) or objective function (objective function), which is used to measure the difference between the predicted value and the target value important equation. Among them, taking the loss function as an example, the higher the output value (loss) of the loss function, the greater the difference. Then the training of the deep neural network becomes a process of reducing the loss as much as possible. Generally, the smaller the loss, the higher the training quality of the deep neural network, and the larger the loss, the lower the training quality of the deep neural network. Similarly, the smaller the loss fluctuation, the more stable the training; the larger the loss fluctuation, the more unstable the training.
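As a simple example of such a loss function, the mean squared error between the prediction and the target can be written as follows (one possible choice among many; cross-entropy is common for classification):

```python
import numpy as np

def mse_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean squared error: the smaller the value, the closer the network's
    prediction is to the desired target value."""
    return float(np.mean((pred - target) ** 2))
```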
如图1所示,本申请实施例提供了一种系统架构100。在图1中,数据采集设备170用于采集训练数据。例如,针对本申请实施例的图像处理方法来说,训练数据可以包括训练图像以及训练图像对应的真值(ground truth)。例如,若视觉任务为图像分类任务,则训练图像对应的真值可以为训练图像对应的分类结果,训练图像的分类结果可以是人工预先标注的结果。As shown in FIG. 1 , the embodiment of the present application provides a system architecture 100 . In FIG. 1 , the data collection device 170 is used to collect training data. For example, for the image processing method of the embodiment of the present application, the training data may include training images and ground truths corresponding to the training images. For example, if the vision task is an image classification task, the ground truth value corresponding to the training image may be the classification result corresponding to the training image, and the classification result of the training image may be the result of manual pre-labeling.
在采集到训练数据之后,数据采集设备170将这些训练数据存入数据库130,训练设备120基于数据库130中维护的训练数据训练得到目标模型/规则101。该目标模型/规则101即为视觉任务所使用的模型。例如,视觉任务为图像分类任务,则该目标模型/规则101可以为用于图像分类的网络模型。After collecting the training data, the data collection device 170 stores the training data in the database 130 , and the training device 120 obtains the target model/rule 101 based on training data maintained in the database 130 . The target model/rule 101 is the model used by the vision task. For example, if the vision task is an image classification task, the target model/rule 101 may be a network model for image classification.
The following describes how the training device 120 obtains the target model/rule 101 based on the training data. The training device 120 processes the input raw data and compares the output value with the target value until the difference between the value output by the training device 120 and the target value is less than a certain threshold, thereby completing the training of the target model/rule 101.
The target model/rule 101 in the embodiment of the present application may specifically be a neural network model, for example, a convolutional neural network or a residual network. It should be noted that, in practical applications, the training data maintained in the database 130 is not necessarily all collected by the data collection device 170, and may also be received from other devices. In addition, it should be noted that the training device 120 does not necessarily train the target model/rule 101 entirely based on the training data maintained by the database 130, and may also obtain training data from the cloud or elsewhere for model training. The above description should not be construed as a limitation on the embodiments of the present application.
The target model/rule 101 obtained by training with the training device 120 may be applied to different systems or devices, for example, to the execution device 110 shown in FIG. 1. The execution device 110 may be a terminal, such as a mobile phone terminal, a tablet computer, a laptop computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or may be a server or a cloud. In FIG. 1, the execution device 110 is configured with an input/output (I/O) interface 112 for data interaction with external devices. A user may input data to the I/O interface 112 through the client device 140. In this embodiment of the application, the input data may include data to be processed that is input by the client device. Exemplarily, the input data may include a raw image in this embodiment of the application.
预处理模块113用于根据I/O接口112接收到的输入图像进行预处理,在本申请实施例中,预处理模块113可以用于对输入图像进行一系列的图像信号处理。预处理模块113中可以包括一个或多个图像处理模块。The preprocessing module 113 is used to perform preprocessing according to the input image received by the I/O interface 112. In the embodiment of the present application, the preprocessing module 113 may be used to perform a series of image signal processing on the input image. The preprocessing module 113 may include one or more image processing modules.
When the execution device 110 preprocesses the input data, or when the calculation module 111 of the execution device 110 performs calculation or other related processing, the execution device 110 may call data, code, and the like in the data storage system 150 for the corresponding processing, and may also store the data, instructions, and the like obtained from the corresponding processing in the data storage system 150.
最后,I/O接口112将处理结果,如上述得到的数据的处理结果返回给客户设备140,从而提供给用户。Finally, the I/O interface 112 returns the processing result, such as the processing result of the data obtained above, to the client device 140, thereby providing it to the user.
值得说明的是,训练设备120可以针对不同的目标或不同的任务,基于不同的训练数据生成相应的目标模型/规则101,该相应的目标模型/规则101即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。It is worth noting that the training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or different tasks, and the corresponding target models/rules 101 can be used to achieve the above-mentioned goals or complete the above-mentioned task to provide the user with the desired result.
In the situation shown in FIG. 1, the user may manually specify the input data, and this manual specification may be performed through an interface provided by the I/O interface 112. In another case, the client device 140 may automatically send the input data to the I/O interface 112; if the user's authorization is required for the client device 140 to automatically send the input data, the user may set the corresponding permission in the client device 140. The user may view, on the client device 140, the result output by the execution device 110, and the specific presentation form may be display, sound, action, or another specific manner. The client device 140 may also serve as a data collection terminal that collects the input data fed into the I/O interface 112 and the output result of the I/O interface 112 shown in the figure as new sample data and stores them in the database 130. Certainly, the collection may also be performed without the client device 140; instead, the I/O interface 112 may directly store the input data fed into the I/O interface 112 and the output result of the I/O interface 112 shown in the figure as new sample data in the database 130.
It should be noted that FIG. 1 is merely a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationships between the devices, components, modules, and the like shown in the figure do not constitute any limitation. For example, in FIG. 1, the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed in the execution device 110.
As shown in FIG. 1, the target model/rule 101 is obtained through training by the training device 120. In this embodiment of the present application, the target model/rule 101 may be the neural network model in the present application; specifically, the neural network model in this embodiment of the present application may be a CNN, a residual network, or the like.
图像信号处理器对传感器获取的raw图像经过一系列的处理之后,输出可视化的图像。这些图像可以作为视觉任务的输入图像。具体地,在视觉任务中可以利用神经网络算法或者非神经网络算法对输入图像进行处理,得到视觉任务的相关结果。The image signal processor outputs a visualized image after a series of processing on the raw image acquired by the sensor. These images can be used as input images for vision tasks. Specifically, a neural network algorithm or a non-neural network algorithm may be used to process an input image in a visual task to obtain relevant results of the visual task.
FIG. 2 shows a schematic diagram of an overall processing flow of a vision task. A raw image is used as the input image, a series of image signal processing operations is performed on the input image, and an 8-bit visualized red green blue (RGB) image is output. The RGB image is then used as the input image of the vision task to obtain the processing result of the vision task. For example, as shown in FIG. 2, the image signal processing modules include a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer denoise module, an auto white balance module, a color correction module, a gamma correction module, a denoise and sharpness module, and the like. An image signal processing module may use a non-neural-network algorithm or a neural network algorithm.
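For illustration only, the following Python sketch shows one way such a modular processing flow could be composed in software: a raw frame is passed through a chain of module functions in order and an 8-bit RGB image is produced. The module implementations (the black level value, the toy demosaic, the gamma value) are simplified assumptions and do not correspond to the actual algorithms of the modules named above.

```python
import numpy as np

def black_level_compensation(raw, black_level=64.0):
    # Subtract an assumed sensor black level and clip negative values.
    return np.clip(raw - black_level, 0.0, None)

def toy_demosaic(raw):
    # Simplified stand-in for demosaicing: replicate the raw plane to R, G, B.
    return np.stack([raw, raw, raw], axis=-1)

def gamma_correction(rgb, gamma=2.2):
    # Normalize to [0, 1] and apply gamma encoding.
    norm = rgb / max(float(rgb.max()), 1e-6)
    return norm ** (1.0 / gamma)

def run_pipeline(raw, modules):
    # Apply the image processing modules one after another, as in Fig. 2.
    image = raw
    for module in modules:
        image = module(image)
    return image

raw = np.random.randint(0, 1024, size=(8, 8)).astype(np.float32)   # toy 10-bit raw frame
rgb8 = (run_pipeline(raw, [black_level_compensation, toy_demosaic, gamma_correction]) * 255).astype(np.uint8)
print(rgb8.shape, rgb8.dtype)   # (8, 8, 3) uint8, i.e. an 8-bit RGB image
```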
视觉任务的输入图像通常为经过图像信号处理的RGB图像。传统的图像信号处理的目的通常是为了满足人的视觉需求,基于该图像执行视觉任务得到的结果不一定是最优的结果。The input images of vision tasks are usually RGB images after image signal processing. The purpose of traditional image signal processing is usually to meet human visual needs, and the results obtained by performing visual tasks based on the images are not necessarily optimal.
本申请实施例提供了一种图像处理方法,根据视觉任务的处理结果调整视觉任务之前的图像处理流程,以得到满足需要的图像处理流程。The embodiment of the present application provides an image processing method, which adjusts the image processing flow before the vision task according to the processing result of the vision task, so as to obtain an image processing flow that meets requirements.
下面结合图3至图6对本申请实施例中的图像处理方法进行详细的描述。The image processing method in the embodiment of the present application will be described in detail below with reference to FIG. 3 to FIG. 6 .
FIG. 3 shows an image processing method 300 provided by an embodiment of the present application. The method shown in FIG. 3 may be executed by a computing apparatus, which may be a cloud service device or a terminal device, for example, a computer, a server, a mobile phone, a camera, a vehicle, a drone, or a robot, or may be a system composed of a cloud service device and a terminal device.
示例性地,方法300可以由训练设备或推理设备执行,例如,方法300可以由CPU、GPU或NPU等加速器执行。进一步地,加速器芯片可以位于FPGA、芯片仿真器(Emulator)或开发板(evaluation board,EVB)上。Exemplarily, the method 300 may be executed by a training device or an inference device, for example, the method 300 may be executed by an accelerator such as a CPU, a GPU, or an NPU. Further, the accelerator chip may be located on an FPGA, a chip emulator (Emulator) or a development board (evaluation board, EVB).
或者,方法300可以由硬件装置(例如,摄像头或相机)的ISP流水线(pipeline)的调优工具或校准工具执行。Alternatively, the method 300 may be executed by a tuning tool or a calibration tool of an ISP pipeline (pipeline) of a hardware device (eg, a camera or a camera).
方法300包括步骤S301至步骤S304。下面对步骤S301至步骤S304进行详细介绍。The method 300 includes step S301 to step S304. Step S301 to step S304 will be described in detail below.
S301,获取第一图像。S301. Acquire a first image.
示例性地,第一图像可以为传感器获取的raw图。Exemplarily, the first image may be a raw image acquired by a sensor.
训练数据集中包括多个图像,第一图像为训练数据集中的任一图像。在实际应用中,可以基于训练数据集中的多个图像多次执行方法300,直至得到需要的图像处理模块。The training data set includes multiple images, and the first image is any image in the training data set. In practical applications, the method 300 may be executed multiple times based on multiple images in the training data set until the required image processing modules are obtained.
示例性地,训练数据集可以采用开源数据集。或者,训练数据集也可以是自行制作的数据集。Exemplarily, the training data set may use an open source data set. Alternatively, the training data set can also be a self-made data set.
Exemplarily, the training data set may be pre-stored. For example, the training data set may be the training data maintained in the database 130 shown in FIG. 1. Alternatively, the training data set may also be data input by a user.
S302,通过至少一个图像处理模块对第一图像进行处理,得到第二图像。S302. Process the first image by at least one image processing module to obtain a second image.
图像处理模块用于对输入图像进行图像信号处理。The image processing module is used for image signal processing on the input image.
Exemplarily, the at least one image processing module may be located on an image signal processor. That is to say, step S302 is executed by the image processing modules in the image signal processor.
该至少一个图像处理模块中的任一图像处理模块可以采用神经网络算法实现,或者,也可以采用非神经网络算法实现。本申请实施例对图像处理模块的具体实现方式不做限定。Any image processing module in the at least one image processing module may be implemented by a neural network algorithm, or may also be implemented by a non-neural network algorithm. The embodiment of the present application does not limit the specific implementation manner of the image processing module.
Optionally, the at least one image processing module may include: a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
例如,如图4所示,将raw图作为第一图像,该至少一个图像处理模块包括9个图像处理模块,分别为黑电平补偿模块、绿平衡模块、坏点修正模块、去马赛克模块、Bayer降噪模块、自动白平衡模块、色彩校正模块、伽马校正模块以及降噪和锐化模块。该9个图像处理模块依次执行黑电平补偿、绿平衡处理、坏点修正、去马赛克、Bayer降噪、自动白平衡处理、色彩校正、伽马校正以及降噪和锐化。For example, as shown in Figure 4, the raw image is used as the first image, and the at least one image processing module includes 9 image processing modules, which are respectively a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, Bayer noise reduction module, automatic white balance module, color correction module, gamma correction module, and noise reduction and sharpening module. The nine image processing modules sequentially perform black level compensation, green balance processing, dead pixel correction, demosaicing, Bayer noise reduction, automatic white balance processing, color correction, gamma correction, and noise reduction and sharpening.
示例性地,黑电平模块、绿平衡模块和坏点修正模块可以用于对raw数据进行处理。去马赛克模块和Bayer降噪模块可以用于执行去马赛克处理。自动白平衡模块、色彩校正模块、伽马校正模块以及降噪和锐化模块可以用于执行图像增强处理。Exemplarily, the black level module, the green balance module and the bad pixel correction module can be used to process the raw data. A demosaic module and a Bayer denoising module may be used to perform demosaic processing. An automatic white balance module, a color correction module, a gamma correction module, and a noise reduction and sharpening module can be used to perform image enhancement processing.
例如,如图4所示,第二图像可以为RGB图像。进一步地,第二图像可以为8bit的RGB图像。此处仅为示例,第二图像的类型也可以根据视觉任务模型的输入需要设置。For example, as shown in FIG. 4, the second image may be an RGB image. Further, the second image may be an 8-bit RGB image. This is only an example, and the type of the second image may also be set according to the input requirements of the visual task model.
可选地,步骤S302包括:通过至少一个图像处理模块和该至少一个图像处理模块的权重对第一图像进行处理,得到第二图像。Optionally, step S302 includes: processing the first image by using at least one image processing module and the weight of the at least one image processing module to obtain the second image.
具体地,根据该至少一个图像处理模块的权重对该至少一个图像处理模块的处理结果进行调整,得到第二图像。Specifically, the processing result of the at least one image processing module is adjusted according to the weight of the at least one image processing module to obtain the second image.
示例性地,图像处理模块对输入该模块的图像进行处理,可以为调整输入该模块的图 像的全部或部分像素的像素值,也就是使全部或部分像素的像素值发生变化。在该情况下,可以根据该图像处理模块的权重对全部或部分像素的像素值的变化量进行调整。Exemplarily, the image processing module processes the image input to the module, which may be to adjust the pixel values of all or part of the pixels of the image input to the module, that is, to change the pixel values of all or part of the pixels. In this case, the variation of the pixel values of all or some pixels may be adjusted according to the weight of the image processing module.
例如,将该图像处理模块的权重与像素值的变化量相乘,得到调整后的像素的变化量,进而得到该模块的输出图像。若该图像处理模块的权重为0,则相当于该图像处理模块没有参与图像处理流程。For example, the weight of the image processing module is multiplied by the variation of the pixel value to obtain the adjusted variation of the pixel, and then the output image of the module is obtained. If the weight of the image processing module is 0, it means that the image processing module does not participate in the image processing process.
权重的具体取值可以根据需要设定,例如,权重可以为大于或等于0,且小于等于1的实数。The specific value of the weight can be set as required, for example, the weight can be a real number greater than or equal to 0 and less than or equal to 1.
Further, when the weights are set, the weights of the at least one image processing module may be normalized, that is, the sum of the weights of the at least one image processing module is made equal to 1, or the sum of the weights of the at least one image processing module is made close to 1.
如图4所示,9个图像处理模块的权重分别为w1、w2、w3、w4、w5、w6、w7、w8和w9。权重的取值范围为大于或等于0,且小于等于1的实数。这样,w1、w2、w3、w4、w5、w6、w7、w8和w9的最大的可能的总和为9。或者,也可以对该9个权重进行归一化处理,这样可以使该9个权重的总和为1。As shown in Figure 4, the weights of the nine image processing modules are w1, w2, w3, w4, w5, w6, w7, w8 and w9, respectively. The value range of the weight is a real number greater than or equal to 0 and less than or equal to 1. Thus, the largest possible sum of w1, w2, w3, w4, w5, w6, w7, w8 and w9 is nine. Alternatively, the nine weights can also be normalized so that the sum of the nine weights can be 1.
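A minimal sketch of how such a per-module weight could act on the pixel-value change produced by a module, and of the optional normalization of the nine weights. The blending formula and the assumption that a module preserves the image shape are illustrative choices, not a definitive implementation.

```python
import numpy as np

def apply_weighted_module(image, module, weight):
    # The module proposes new pixel values; the weight scales the change it makes.
    # weight = 0 bypasses the module entirely, weight = 1 applies it fully.
    # This sketch assumes the module keeps the image shape unchanged.
    processed = module(image)
    return image + weight * (processed - image)

def normalize_weights(weights):
    # Rescale the weights so that their sum is (close to) 1.
    total = float(np.sum(weights))
    return [w / total for w in weights] if total > 0 else list(weights)

w = normalize_weights([0.9, 0.1, 0.8, 0.2, 0.3, 0.7, 0.5, 0.6, 0.4])  # w1..w9
print(round(sum(w), 6))  # 1.0
```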
S303,将第二图像输入至视觉任务模型中进行处理。S303. Input the second image into the visual task model for processing.
示例性地,视觉任务包括:目标检测、图像分类、目标分割、目标跟踪或图像识别等。Exemplarily, the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
视觉任务模型用于执行视觉任务。例如,视觉任务为目标检测,则视觉任务模型为目标检测模型。再如,视觉任务为图像识别,则视觉任务模型为图像识别模型。The visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
视觉任务模型可以为训练好的模型。The vision task model can be a trained model.
视觉任务模型的输出的类型与视觉任务的类型有关。视觉任务模型的输出即为视觉任务模型的推理结果。The type of output of the visual task model is related to the type of visual task. The output of the visual task model is the inference result of the visual task model.
例如,视觉任务为目标检测,则视觉任务模型的输出可以为第二图像上的目标框以及该目标框中的物体的类别。再如,视觉任务为图像分类,则视觉任务模型的输出可以为第二图像的类别。For example, if the vision task is target detection, the output of the vision task model may be a target frame on the second image and the category of the object in the target frame. For another example, if the visual task is image classification, the output of the visual task model may be the category of the second image.
视觉任务模型的处理结果可以包括视觉任务模型的性能指标。The processing results of the vision task model may include performance indicators of the vision task model.
示例性地,视觉任务模型的性能指标包括推理的准确度或损失函数的值等。损失函数可以根据需要设置。损失函数用于指示视觉任务模型的推理结果与第一图像对应的真值之间的差异。需要说明的是,此处的损失函数可以采用视觉任务模型训练过程中的损失函数,或者,也可以采用其他形式的损失函数。Exemplarily, the performance index of the vision task model includes the accuracy of reasoning or the value of the loss function. The loss function can be set as needed. The loss function is used to indicate the difference between the inference result of the vision task model and the true value corresponding to the first image. It should be noted that the loss function here may be the loss function in the training process of the vision task model, or other forms of loss functions may also be used.
例如,视觉任务为目标检测,则视觉任务模型的处理结果可以包括检测准确度。For example, if the vision task is target detection, the processing result of the vision task model may include detection accuracy.
将第二图像输入视觉任务模型中进行处理,将得到的检测结果与第一图像对应的真值比较,得到两者之间的误差,根据两者之间的误差确定检测准确度。Input the second image into the visual task model for processing, compare the obtained detection result with the corresponding true value of the first image, obtain the error between the two, and determine the detection accuracy according to the error between the two.
再如,视觉任务为目标分割,则视觉任务模型的处理结果可以包括分割准确度。For another example, if the vision task is target segmentation, the processing result of the vision task model may include segmentation accuracy.
将第二图像输入视觉任务模型中进行处理,将得到的分割结果与第一图像对应的真值比较,得到两者之间的误差,根据两者之间的误差确定分割准确度。Input the second image into the visual task model for processing, compare the obtained segmentation result with the corresponding true value of the first image, obtain the error between the two, and determine the segmentation accuracy according to the error between the two.
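As a concrete illustration of comparing the model output with the ground truth of the first image, the sketch below computes a toy classification accuracy and an intersection-over-union score for segmentation. The specific metrics are assumptions chosen for illustration and are not mandated by the application.

```python
import numpy as np

def classification_accuracy(predictions, ground_truth):
    # Fraction of samples whose predicted class equals the labelled class.
    predictions = np.asarray(predictions)
    ground_truth = np.asarray(ground_truth)
    return float((predictions == ground_truth).mean())

def segmentation_iou(pred_mask, gt_mask):
    # Intersection-over-union between the predicted and labelled masks,
    # used here as a simple proxy for segmentation accuracy.
    pred_mask = np.asarray(pred_mask, dtype=bool)
    gt_mask = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred_mask, gt_mask).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred_mask, gt_mask).sum() / union)

print(classification_accuracy([1, 0, 2, 1], [1, 0, 1, 1]))  # 0.75
```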
视觉任务模型可以采用神经网络模型,或者,也可以采用非神经网络模型。神经网络模型可以是现有的神经网络模型,例如,残差网络。或者,该神经网络模型也可以是自行构建的其他结构的神经网络模型。本申请实施例对此不作限定。The visual task model may use a neural network model, or may also use a non-neural network model. The neural network model may be an existing neural network model, for example, a residual network. Alternatively, the neural network model may also be a neural network model of other structures constructed by itself. This embodiment of the present application does not limit it.
需要说明的是,对于相同的视觉任务,在不同的应用场景下,可能采用不同的视觉任 务模型。例如,对于驾驶场景中的目标检测任务,在曝光过度和曝光不足的情况下采用的视觉任务模型可能是相同的,也可能是不同的。在驾驶的过程中,若当前场景被识别为曝光过度,可以采用第一目标检测模型,若当前场景被识别为曝光不足,可以采用第二目标检测模型。第一目标检测模型和第二目标检测模型为不同的目标检测模型。It should be noted that for the same visual task, different visual task models may be used in different application scenarios. For example, for an object detection task in a driving scene, the visual task model employed may or may not be the same in overexposed and underexposed situations. During driving, if the current scene is recognized as overexposed, the first object detection model may be used, and if the current scene is recognized as underexposed, the second object detection model may be used. The first target detection model and the second target detection model are different target detection models.
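The following sketch illustrates one possible way to switch between such models at run time based on a crude exposure estimate; the mean-brightness heuristic and the threshold values are assumptions made purely for illustration.

```python
def select_detection_model(image, first_model, second_model, default_model,
                           over_threshold=200.0, under_threshold=60.0):
    # Pick the detection model according to an assumed exposure estimate:
    # very bright frames go to the model for overexposure, very dark frames
    # to the model for underexposure, everything else to a default model.
    mean_level = float(image.mean())
    if mean_level >= over_threshold:
        return first_model      # scene identified as overexposed
    if mean_level <= under_threshold:
        return second_model     # scene identified as underexposed
    return default_model
```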
示例性地,视觉任务模型的处理过程可以由图1中的计算模块111执行。Exemplarily, the processing of the visual task model can be executed by the calculation module 111 in FIG. 1 .
视觉任务模型可以部署于方法300的执行设备上,也可以部署于其他设备上。也就是说,视觉任务模型的处理过程可以由方法300的执行设备执行,也可以由其他设备执行,并将处理结果反馈至方法300的执行设备上。The vision task model can be deployed on the execution device of the method 300, or can be deployed on other devices. That is to say, the processing of the visual task model can be executed by the executing device of the method 300 or by other devices, and the processing result can be fed back to the executing device of the method 300 .
S304,根据视觉任务模型的处理结果调整该至少一个图像处理模块。S304. Adjust the at least one image processing module according to the processing result of the visual task model.
根据视觉任务模型的处理结果调整该至少一个图像处理模块,以使视觉任务模型的处理结果尽可能接近预期。The at least one image processing module is adjusted according to the processing result of the visual task model, so that the processing result of the visual task model is as close to expectation as possible.
或者说,根据视觉任务模型的性能指标调整该至少一个图像处理模块,以提高视觉任务模型的性能。In other words, the at least one image processing module is adjusted according to the performance index of the visual task model, so as to improve the performance of the visual task model.
例如,视觉任务模型的性能指标为视觉任务模型的推理的准确度,则调整该至少一个图像处理模块,以提高模型的推理的准确度。For example, if the performance index of the visual task model is the accuracy of inference of the visual task model, the at least one image processing module is adjusted to improve the accuracy of inference of the model.
再如,视觉任务模型的性能指标为视觉任务模型的损失函数的值,则调整该至少一个图像处理模块,以减少视觉任务模型的损失函数的值。For another example, if the performance index of the visual task model is the value of the loss function of the visual task model, the at least one image processing module is adjusted to reduce the value of the loss function of the visual task model.
在实际应用中可以基于训练数据集中多张图像执行方法300,直至满足预设条件。也就是说在实际应用中可以基于多张图像不断迭代调整图像处理模块。每一次迭代过程中采用的图像处理模块为上一次迭代后得到的图像处理模块。In practical applications, the method 300 may be executed based on multiple images in the training data set until a preset condition is met. That is to say, in practical applications, the image processing module can be adjusted iteratively based on multiple images. The image processing module used in each iteration is the image processing module obtained after the previous iteration.
预设条件可以根据需要设置,后文中会在方式1、方式2、方式3和方式4中举例说明。The preset conditions can be set as required, and examples will be given in Mode 1, Mode 2, Mode 3, and Mode 4 below.
进一步地,还可以根据图像处理的时间和视觉任务模型的处理结果调整该至少一个图像处理模块。Further, the at least one image processing module can also be adjusted according to the image processing time and the processing result of the visual task model.
The image processing time may be the processing time of the visual task model, or may be the processing time of the at least one image processing module, or may be the sum of the processing time of the visual task model and the processing time of the at least one image processing module.
这样,能够在保证视觉任务模型的性能的前提下,提高处理速度,降低时延。In this way, the processing speed can be improved and the time delay can be reduced under the premise of ensuring the performance of the visual task model.
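A minimal sketch of how the processing time and the processing result could be combined into a single score for adjusting the modules. The linear accuracy-minus-latency trade-off and the placeholder pipeline and model callables are assumptions, not the method required by the application.

```python
import time

def score_pipeline(pipeline, vision_model, image, ground_truth, time_penalty=0.1):
    # Score a candidate image processing flow by the vision-task result minus a
    # penalty on the total processing time (pipeline plus model inference).
    start = time.perf_counter()
    processed = pipeline(image)            # placeholder: the at least one image processing module
    prediction = vision_model(processed)   # placeholder: the vision task model
    elapsed = time.perf_counter() - start
    accuracy = 1.0 if prediction == ground_truth else 0.0   # toy per-image accuracy
    return accuracy - time_penalty * elapsed
```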
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理流程,有利于得到适合视觉任务的图像,以保证视觉任务模型的性能。In the solution of the embodiment of the present application, the image processing flow is adjusted according to the processing result of the visual task model, which is beneficial to obtain an image suitable for the visual task, so as to ensure the performance of the visual task model.
本申请实施例的方案能够根据不同的应用场景的需求调整图像处理流程,以适应不同的应用场景。The solutions of the embodiments of the present application can adjust the image processing flow according to the requirements of different application scenarios, so as to adapt to different application scenarios.
The same vision task may use different vision task models in different application scenarios. For example, for an object detection task in a driving scenario, the vision task models used in overexposed and underexposed conditions may be the same or may be different. During driving, if the current scene is identified as overexposed, the first object detection model may be used as the vision task model; if the current scene is identified as underexposed, the second object detection model may be used as the vision task model. The solution of the embodiment of the present application can adjust the image processing flow separately according to the processing results of the first object detection model and the second object detection model, so as to obtain an image processing flow suitable for the first object detection model and an image processing flow suitable for the second object detection model.
步骤S304可以采用多种方式实现,下面以其中四种方式(方式1、方式2、方式3和方式4)为例进行说明。Step S304 can be implemented in various ways, and the following four ways (mode 1, mode 2, mode 3 and mode 4) are taken as examples for illustration.
Mode 1
可选地,该至少一个图像处理模块包括多个图像处理模块,步骤S304包括:根据视觉任务模型的处理结果调整该多个图像处理模块的权重。Optionally, the at least one image processing module includes a plurality of image processing modules, and step S304 includes: adjusting weights of the plurality of image processing modules according to a processing result of the visual task model.
根据视觉任务模型的处理结果调整该多个图像处理模块的权重,以提高视觉任务模型的性能。The weights of the plurality of image processing modules are adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
如前所述,实际应用中可以基于训练数据集中多张图像执行方法300以实现对该多个图像处理模块的权重进行迭代调整,直至满足预设条件。满足预设条件后停止调整该多个图像处理模块的权重,或者说,停止刷新该多个图像处理模块的权重。As mentioned above, in practical applications, the method 300 can be executed based on multiple images in the training data set to implement iterative adjustment of the weights of the multiple image processing modules until the preset conditions are met. Stop adjusting the weights of the plurality of image processing modules after the preset condition is met, or stop refreshing the weights of the plurality of image processing modules.
示例性地,预设条件可以为该多个图像处理模块的权重收敛。Exemplarily, the preset condition may be that the weights of the plurality of image processing modules converge.
在该多个图像处理模块的权重收敛的情况下,不再执行方法300,即停止调整该多个图像处理模块的权重。权重收敛也可以理解为在连续执行多次方法300后得到的权重梯度变化较小。例如,连续执行多次方法300后得到的权重梯度的变化量小于或等于第一阈值的情况下,停止调整该多个图像处理模块的权重。When the weights of the plurality of image processing modules converge, the method 300 is not executed any more, that is, the adjustment of the weights of the plurality of image processing modules is stopped. The weight convergence can also be understood as the weight gradient obtained after performing the method 300 multiple times continuously has a small change. For example, when the change amount of the weight gradient obtained after performing the method 300 for multiple times is less than or equal to the first threshold, stop adjusting the weights of the multiple image processing modules.
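A small sketch of one possible convergence test: stop refreshing the weights once the largest change between two consecutive iterations falls below a threshold. Using the raw weight change instead of a weight gradient, and the threshold value itself, are assumptions for illustration.

```python
def weights_converged(previous_weights, current_weights, first_threshold=1e-3):
    # Compare the weights from two consecutive executions of method 300 and
    # report convergence when no module's weight changed by more than the threshold.
    deltas = [abs(c - p) for p, c in zip(previous_weights, current_weights)]
    return max(deltas) <= first_threshold
```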
可替换地,预设条件可以为视觉任务模型的准确度大于或等于第二阈值。Alternatively, the preset condition may be that the accuracy of the visual task model is greater than or equal to the second threshold.
在视觉任务模型的准确度大于或等于第二阈值的情况下,不再执行方法300,即停止调整该多个图像处理模块的权重。In the case that the accuracy of the visual task model is greater than or equal to the second threshold, the method 300 is not executed, that is, the adjustment of the weights of the plurality of image processing modules is stopped.
第二阈值可以为预设值。或者,第二阈值可以是在没有设置图像处理模块的权重的情况下得到的视觉任务模型的推理的准确度。例如,如图4所示,第二阈值可以在该9个图像处理模块没有设置权重的情况下的视觉任务模型的推理的准确度。或者可以理解为,第二阈值可以为在该9个图像处理模块的权重为1的情况下的视觉任务模型的推理的准确度。The second threshold may be a preset value. Alternatively, the second threshold may be the inference accuracy of the visual task model obtained without setting the weight of the image processing module. For example, as shown in FIG. 4 , the second threshold may be the inference accuracy of the visual task model when no weight is set for the nine image processing modules. Or it can be understood that the second threshold may be the inference accuracy of the visual task model when the weight of the nine image processing modules is 1.
That is to say, an image is input into the original image processing modules for processing, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and that accuracy is used as the second threshold. When the method 300 is executed, an image is input into the image processing modules whose weights have currently been adjusted, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and the currently obtained inference accuracy is compared with the second threshold; when the currently obtained inference accuracy is greater than or equal to the second threshold, the method 300 is no longer executed. In this way, processing images with the adjusted image processing modules can ensure the performance of the vision task model, or can improve the performance of the vision task model.
可替换地,预设条件可以为连续执行多次方法300后得到的视觉任务模型的损失函数值的变化量小于或等于第三阈值。Alternatively, the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
也就是说,在视觉任务模型的损失函数值的变化趋于稳定的情况下,不再执行方法300。That is to say, when the change of the loss function value of the vision task model tends to be stable, the method 300 is not executed any more.
可替换地,预设条件可以为迭代次数大于或等于第四阈值。Alternatively, the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
也就是说,在执行方法300的次数大于或等于第四阈值的情况下,不再执行方法300。That is to say, in the case that the number of times the method 300 is executed is greater than or equal to the fourth threshold, the method 300 is not executed any more.
应理解,上述预设条件可以结合使用。例如,预设条件可以为视觉任务模型的准确度大于或等于第二阈值,且迭代次数大于或等于第四阈值。再如,预设条件可以为该多个图像处理模块的权重收敛,且视觉任务模型的准确度大于或等于第二阈值。It should be understood that the above preset conditions may be used in combination. For example, the preset condition may be that the accuracy of the visual task model is greater than or equal to the second threshold, and the number of iterations is greater than or equal to the fourth threshold. For another example, the preset condition may be that the weights of the plurality of image processing modules converge, and the accuracy of the visual task model is greater than or equal to the second threshold.
应理解,以上仅为示例,预设条件还可以为其他形式的条件,本申请对此不做限定。It should be understood that the above are examples only, and the preset conditions may also be conditions in other forms, which are not limited in the present application.
示例性地,可以采用贝叶斯优化方法、RNN模型或强化学习算法等方式调整该多个图像处理模块的权重。Exemplarily, the weights of the plurality of image processing modules may be adjusted by means of Bayesian optimization method, RNN model, or reinforcement learning algorithm.
下面以贝叶斯优化方法为例进行说明。The Bayesian optimization method is taken as an example to illustrate below.
例如,视觉任务模型为目标检测模型,视觉任务模型的性能指标可以为平均准确度(mean average precision,mAP)。通过贝叶斯优化方法调整该多个图像处理模块的权重,以提高目标检测模型的mAP。或者说,以目标检测模型的mAP最大化为目标调整该多个图像处理模块的权重。For example, the vision task model is a target detection model, and the performance index of the vision task model may be mean average precision (mAP). The weights of the multiple image processing modules are adjusted by a Bayesian optimization method to improve the mAP of the object detection model. In other words, the weights of the multiple image processing modules are adjusted with the goal of maximizing the mAP of the target detection model.
平均准确度指的是对于所有目标物体的检测准确度的平均值。The average accuracy refers to the average of the detection accuracies for all target objects.
将训练数据集中的图像输入目标检测模型中,得到该图像的检测准确度。将该图像的检测准确度输入贝叶斯优化模型中,贝叶斯优化模型调整各个图像处理模块的权重。Input the image in the training data set into the target detection model to obtain the detection accuracy of the image. The detection accuracy of the image is input into the Bayesian optimization model, and the Bayesian optimization model adjusts the weight of each image processing module.
进一步地,该图像的检测准确度可以保留在贝叶斯优化模型中。也就是说,当训练数据集中的其他图像输入至目标检测模型中,得到其他图像的检测准确度。贝叶斯优化模型可以根据其他图像的检测准确度以及之前的图像的检测准确度调整各个图像处理模块的权重。Further, the detection accuracy of the image can be preserved in the Bayesian optimization model. That is to say, when other images in the training data set are input into the target detection model, the detection accuracy of other images is obtained. The Bayesian optimization model can adjust the weight of each image processing module according to the detection accuracy of other images and the detection accuracy of previous images.
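A minimal sketch of such a weight search, assuming the open-source scikit-optimize package for Bayesian optimisation; the run_pipeline and detection_map functions below are toy stand-ins for the weighted processing flow of Fig. 4 and for the detector's mAP evaluation, not real implementations.

```python
import numpy as np
from skopt import gp_minimize          # assumes scikit-optimize is installed
from skopt.space import Real

def run_pipeline(raw, weights):
    # Toy stand-in for processing a raw image with the nine weighted modules.
    return raw * (0.5 + 0.5 * float(np.mean(weights)))

def detection_map(images, ground_truths):
    # Toy stand-in for the target detection model's mean average precision (mAP).
    return float(np.mean([1.0 - abs(float(img.mean()) - gt)
                          for img, gt in zip(images, ground_truths)]))

training_raw_images = [np.random.rand(8, 8) for _ in range(4)]
ground_truths = [0.5, 0.4, 0.6, 0.5]               # toy labels

def objective(weights):
    # gp_minimize minimises, so return the negative mAP of the detector.
    images = [run_pipeline(raw, weights) for raw in training_raw_images]
    return -detection_map(images, ground_truths)

search_space = [Real(0.0, 1.0, name=f"w{i}") for i in range(1, 10)]   # weights w1..w9
result = gp_minimize(objective, search_space, n_calls=20, random_state=0)
print("best weights:", [round(w, 2) for w in result.x])
```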
需要说明的是本申请实施例中的训练数据集用于训练各个图像处理模块,与视觉任务模型的训练数据集可以相同也可以不同。例如,本申请实施例中的训练数据集可以采用视觉任务模型的验证数据集或测试数据集等。It should be noted that the training data set in the embodiment of the present application is used to train each image processing module, which may be the same as or different from the training data set of the vision task model. For example, the training data set in the embodiment of the present application may use a verification data set or a test data set of the vision task model.
在本申请实施例的方案中,根据视觉任务模型的处理结果评估图像处理模块的权重,进而调整图像处理模块的权重,以增加与视觉任务模型的性能相关性较强的图像处理模块的权重,减少与视觉任务模型的性能相关性较弱的图像处理模块的权重,这样能够获得更适合视觉任务的图像处理流程,有利于提高视觉任务模型的性能。In the solution of the embodiment of the present application, the weight of the image processing module is evaluated according to the processing results of the visual task model, and then the weight of the image processing module is adjusted to increase the weight of the image processing module that has a strong correlation with the performance of the visual task model. Reducing the weight of image processing modules that are less correlated with the performance of the vision task model can obtain an image processing flow that is more suitable for the vision task, and is conducive to improving the performance of the vision task model.
Mode 2
可选地,步骤S304包括:根据视觉任务模型的处理结果更改该至少一个图像处理模块。Optionally, step S304 includes: modifying the at least one image processing module according to a processing result of the visual task model.
更改该至少一个图像处理模块,可以包括:删除该至少一个图像处理模块中的部分图像处理模块或/和增加其他图像处理模块。Changing the at least one image processing module may include: deleting some image processing modules in the at least one image processing module or/and adding other image processing modules.
In a possible implementation, step S304 may be: selecting a combination of image processing modules from a plurality of candidate image processing modules according to the processing result of the vision task model, and replacing the at least one image processing module with the selected combination of image processing modules.
示例性地,可以采用贝叶斯优化方法或强化学习算法等方式更改该至少一个图像处理模块。Exemplarily, the at least one image processing module can be changed by means of Bayesian optimization method or reinforcement learning algorithm.
如前所述,实际应用中可以基于训练数据集中多张图像执行方法300以实现对该多个图像处理模块的组合进行迭代调整,直至满足预设条件。满足预设条件后停止调整该多个图像处理模块的组合,或者说,停止刷新该多个图像处理模块的组合。As mentioned above, in practical applications, the method 300 can be executed based on multiple images in the training data set to realize iterative adjustment of the combination of the multiple image processing modules until the preset conditions are met. Stop adjusting the combination of the multiple image processing modules after the preset condition is met, or stop refreshing the combination of the multiple image processing modules.
示例性地,预设条件可以为迭代次数大于或等于第四阈值。Exemplarily, the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
在执行方法300的次数大于或等于第四阈值的情况下,不再执行方法300,即停止调整该图像处理模块的组合。In a case where the number of executions of the method 300 is greater than or equal to the fourth threshold, the execution of the method 300 is stopped, that is, the adjustment of the combination of the image processing modules is stopped.
应理解,此处仅为示例,其他的预设条件的设置可以参考方式1,此处不再赘述。It should be understood that this is only an example, and other preset conditions can be set with reference to mode 1, which will not be repeated here.
在本申请实施例的方案中,根据视觉任务模型的处理结果更改图像处理模块的组合,能够获得更适合视觉任务模型的图像处理模块的组合,有利于提高视觉任务模型的性能。In the solution of the embodiment of the present application, the combination of image processing modules is changed according to the processing results of the visual task model, so that a combination of image processing modules more suitable for the visual task model can be obtained, which is conducive to improving the performance of the visual task model.
该至少一个图像处理模块包括多个图像处理模块,步骤S304包括:根据视觉任务模型的处理结果从该多个图像处理模块中删除部分图像处理模块。The at least one image processing module includes a plurality of image processing modules, and step S304 includes: deleting part of the image processing modules from the plurality of image processing modules according to the processing result of the visual task model.
In a possible implementation, Mode 2 may use the processing result of Mode 1.
可选地,步骤S304包括:根据视觉任务模型的处理结果调整该多个图像处理模块的权重;根据调整后的该多个图像处理模块的权重从该多个图像处理模块中删除部分图像处理模块。Optionally, step S304 includes: adjusting the weights of the multiple image processing modules according to the processing results of the visual task model; deleting part of the image processing modules from the multiple image processing modules according to the adjusted weights of the multiple image processing modules .
示例性地,该多个图像处理模块为m个图像处理模块。m为大于1的整数。从该m个图像处理模块中删除调整后的权重最小的n个图像处理模块。n为大于1且小于m的整数。Exemplarily, the multiple image processing modules are m image processing modules. m is an integer greater than 1. The n image processing modules with the smallest adjusted weights are deleted from the m image processing modules. n is an integer greater than 1 and less than m.
可替换地,从该m个图像处理模块中删除调整后的权重小于或等于权重阈值的图像处理模块。Alternatively, an image processing module whose adjusted weight is less than or equal to a weight threshold is deleted from the m image processing modules.
For example, among the nine image processing modules shown in FIG. 4, if the weights corresponding to the green balance module, the bad pixel correction module, the Bayer noise reduction module, the color correction module, and the noise reduction and sharpening module are less than or equal to the weight threshold, these five modules are deleted.
在本申请实施例的方案中,根据视觉任务模型的处理结果删除部分图像处理模块,能够减少图像处理所需的时间,提高处理速度,减少对计算力的要求。In the solution of the embodiment of the present application, some image processing modules are deleted according to the processing results of the visual task model, which can reduce the time required for image processing, increase the processing speed, and reduce the requirement for computing power.
In addition, an image processing module with a higher weight has a stronger correlation with the vision task model; in other words, an image processing module with a higher weight has a greater influence on the performance of the vision task model. In the solution of the embodiment of the present application, the image processing modules to be deleted are determined according to the weights of the image processing modules, and the image processing modules with relatively small weights are deleted. This has little impact on the processing result of the vision task model, and the performance of the vision task model is affected only slightly after the deletion. That is to say, the solution of the embodiment of the present application can reduce unnecessary operations, reduce computing overhead, and increase the processing speed while ensuring the performance of the vision task model.
可选地,步骤S304包括:根据视觉任务模型的处理结果和该多个图像处理模块的处理速度从该多个图像处理模块中删除部分图像处理模块。Optionally, step S304 includes: deleting part of the image processing modules from the plurality of image processing modules according to the processing result of the visual task model and the processing speed of the plurality of image processing modules.
示例性地,根据调整后的该多个图像处理模块的权重和该多个图像处理模块的处理速度从该多个图像处理模块中删除部分图像处理模块。Exemplarily, some image processing modules are deleted from the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules and the processing speeds of the plurality of image processing modules.
例如,从该多个图像处理模块中删除调整后的权重小于或等于权重阈值、且处理速度小于或等于速度阈值的图像处理模块。也就是说,删除处理速度较慢,且对视觉任务模型的影响较小的图像处理模块。这样,能进一步提高图像处理的速度。For example, an image processing module whose adjusted weight is less than or equal to a weight threshold and whose processing speed is less than or equal to a speed threshold is deleted from the plurality of image processing modules. That is, image processing modules that have slower processing speed and have less impact on the vision task model are deleted. In this way, the speed of image processing can be further increased.
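The following sketch illustrates such pruning: modules whose adjusted weight falls at or below a weight threshold are removed, and, when a speed threshold is also supplied, only modules that are additionally slow are removed. The threshold values and the convention that a lower speed value means slower processing are assumptions for illustration.

```python
def prune_modules(modules, weights, speeds, weight_threshold=0.05, speed_threshold=None):
    # Return the image processing modules to keep in the flow.
    kept = []
    for module, weight, speed in zip(modules, weights, speeds):
        low_weight = weight <= weight_threshold
        slow = speed_threshold is not None and speed <= speed_threshold
        if low_weight and (speed_threshold is None or slow):
            continue   # delete: low weight (and, if requested, also slow)
        kept.append(module)
    return kept

# Example: nine module names with assumed adjusted weights and relative speeds.
names = ["blc", "green_balance", "bad_pixel", "demosaic", "bayer_denoise",
         "awb", "color_correction", "gamma", "denoise_sharpen"]
weights = [0.20, 0.02, 0.03, 0.25, 0.04, 0.18, 0.03, 0.15, 0.04]
speeds = [0.9, 0.8, 0.3, 0.2, 0.1, 0.9, 0.4, 0.9, 0.1]
print(prune_modules(names, weights, speeds))                       # prune by weight only
print(prune_modules(names, weights, speeds, speed_threshold=0.5))  # prune by weight and speed
```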
Mode 3
该至少一个图像处理模块包括多个图像处理模块,步骤S304包括:根据视觉任务模型的处理结果调整该多个图像处理模块的处理顺序。The at least one image processing module includes a plurality of image processing modules, and step S304 includes: adjusting the processing order of the plurality of image processing modules according to the processing results of the visual task model.
根据视觉任务模型的处理结果调整该多个图像处理模块的处理顺序,以提高视觉任务模型的性能。The processing order of the plurality of image processing modules is adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
示例性地,可以采用贝叶斯优化方法、RNN模型或强化学习算法等方式调整该多个图像处理模块的处理顺序。Exemplarily, the processing order of the plurality of image processing modules may be adjusted by means of Bayesian optimization method, RNN model, or reinforcement learning algorithm.
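As an illustration, the sketch below searches over candidate processing orders by random sampling and keeps the order with the best vision-task score; random shuffling is used here only as a simple stand-in for the Bayesian optimisation, RNN or reinforcement learning methods mentioned above, and the scoring callable is an assumption.

```python
import random

def search_processing_order(modules, evaluate_order, n_candidates=20, seed=0):
    # Try random orderings of the image processing modules and keep the one
    # that gives the highest score from the vision task model.
    rng = random.Random(seed)
    best_order, best_score = list(modules), evaluate_order(list(modules))
    for _ in range(n_candidates):
        candidate = list(modules)
        rng.shuffle(candidate)
        score = evaluate_order(candidate)
        if score > best_score:
            best_order, best_score = candidate, score
    return best_order, best_score
```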
如前所述,实际应用中可以基于训练数据集中多张图像执行方法300,直至满足预设 条件。满足预设条件后停止调整该多个图像处理模块的处理顺序,或者说,停止刷新该多个图像处理模块的处理顺序。As mentioned above, in practical applications, the method 300 can be executed based on multiple images in the training data set until the preset conditions are met. Stop adjusting the processing sequence of the multiple image processing modules after the preset condition is satisfied, or stop refreshing the processing sequence of the multiple image processing modules.
示例性地,预设条件可以为该多个图像处理模块的处理顺序的变化量小于或等于第五阈值。Exemplarily, the preset condition may be that the variation of the processing sequence of the plurality of image processing modules is less than or equal to the fifth threshold.
例如,该多个图像处理模块的处理顺序的变化量可以为,在执行方法300后处理顺序发生变化的图像处理模块的数量。For example, the amount of change in the processing order of the plurality of image processing modules may be the number of image processing modules whose processing order changes after the method 300 is executed.
可替换地,预设条件可以为视觉任务模型的推理的准确度大于或等于第六阈值。Alternatively, the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the sixth threshold.
在视觉任务模型的推理的准确度大于或等于第六阈值的情况下,不再执行方法300,即停止调整该多个图像处理模块的处理顺序。In a case where the inference accuracy of the visual task model is greater than or equal to the sixth threshold, the method 300 is not executed again, that is, the adjustment of the processing sequence of the plurality of image processing modules is stopped.
第六阈值可以为预设值。或者,第六阈值可以是在没有调整图像处理模块的处理顺序的情况下,视觉任务模型的推理的准确度。例如,如图4所示,第六阈值可以为按照如图4所示的图像处理模块的处理顺序对图像进行处理的情况下,视觉任务模型的推理的准确度。The sixth threshold may be a preset value. Alternatively, the sixth threshold may be the inference accuracy of the visual task model without adjusting the processing sequence of the image processing module. For example, as shown in FIG. 4 , the sixth threshold may be the inference accuracy of the visual task model when images are processed according to the processing order of the image processing module shown in FIG. 4 .
That is to say, an image is input into the original image processing modules and processed in the original processing order, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and that accuracy is used as the sixth threshold. An image is then input into the image processing modules whose processing order has currently been adjusted, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and the currently obtained inference accuracy is compared with the sixth threshold; when the currently obtained inference accuracy is greater than or equal to the sixth threshold, the method 300 is no longer executed. In this way, processing images in the adjusted processing order of the image processing modules can ensure the performance of the vision task model, or can improve the performance of the vision task model.
可替换地,预设条件可以为连续执行多次方法300后得到的视觉任务模型的损失函数值的变化量小于或等于第三阈值。Alternatively, the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
也就是说,在视觉任务模型的损失函数值的变化趋于稳定的情况下,不再执行方法300。That is to say, when the change of the loss function value of the vision task model tends to be stable, the method 300 is not executed any more.
可替换地,预设条件可以为迭代次数大于或等于第四阈值。Alternatively, the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
也就是说,在执行方法300的次数大于或等于第四阈值的情况下,不再执行方法300。That is to say, in the case that the number of times the method 300 is executed is greater than or equal to the fourth threshold, the method 300 is not executed any more.
可替换地,上述预设条件可以结合使用。例如,预设条件可以为视觉任务模型的推理的准确度大于或等于第六阈值,且迭代次数大于或等于第四阈值。再如,预设条件可以为该多个图像处理模块的处理顺序的变化量小于或等于第五阈值,且视觉任务模型的准确度大于或等于第六阈值。Alternatively, the above preset conditions may be used in combination. For example, the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the sixth threshold, and the number of iterations is greater than or equal to the fourth threshold. For another example, the preset condition may be that the variation of the processing order of the plurality of image processing modules is less than or equal to the fifth threshold, and the accuracy of the visual task model is greater than or equal to the sixth threshold.
应理解,以上仅为示例,预设条件还可以为其他形式的条件,本申请对此不做限定。It should be understood that the above are examples only, and the preset conditions may also be conditions in other forms, which are not limited in the present application.
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理模块的处理顺序,能够获得更适合视觉任务的图像处理流程,有利于提高视觉任务的准确度。In the solution of the embodiment of the present application, the processing sequence of the image processing module is adjusted according to the processing result of the visual task model, so that an image processing flow more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
Mode 4
可选地,步骤S304包括:根据视觉任务模型的处理结果调整该至少一个图像处理模块中的参数。Optionally, step S304 includes: adjusting parameters in the at least one image processing module according to a processing result of the visual task model.
根据视觉任务模型的处理结果调整该至少一个图像处理模块中的参数,以提高视觉任务模型的性能。The parameters in the at least one image processing module are adjusted according to the processing results of the visual task model, so as to improve the performance of the visual task model.
示例性地,若图像处理模块采用神经网络模型,则该图像处理模块中的参数即为该神 经网络模型的参数。Exemplarily, if the image processing module adopts a neural network model, the parameters in the image processing module are the parameters of the neural network model.
示例性地,可以采用贝叶斯优化方法、RNN模型、强化学习算法等方式调整该至少一个图像处理模块中的参数。Exemplarily, the parameters in the at least one image processing module may be adjusted by means of a Bayesian optimization method, an RNN model, a reinforcement learning algorithm, and the like.
基于当前的图像处理模块中的参数组合对输入图像进行处理,并将处理后的结果输入视觉任务模型中进行处理,例如由CPU或GPU执行视觉任务。根据视觉任务模型的性能的反馈优化更新该图像处理模块中的参数组合,即在搜索空间中寻找最优的图像处理模块中的参数组合,以提高视觉任务模型的性能。The input image is processed based on the parameter combination in the current image processing module, and the processed result is input into the vision task model for processing, for example, the vision task is performed by CPU or GPU. The parameter combination in the image processing module is optimized and updated according to the feedback of the performance of the visual task model, that is, the optimal parameter combination in the image processing module is found in the search space, so as to improve the performance of the visual task model.
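A minimal sketch of searching the parameter space of the image processing modules: candidate parameter combinations are scored by a vision-task metric and the best combination is kept. The grid search, the parameter names (gamma, denoise_strength) and the scoring callable are illustrative assumptions; the application itself names Bayesian optimisation, RNN models and reinforcement learning as possible search methods.

```python
import itertools

def search_module_parameters(param_grid, evaluate_params):
    # Score every parameter combination with the vision task model and keep the best.
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate_params(params)    # e.g. detection accuracy with these parameters
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space for two module parameters.
grid = {"gamma": [1.8, 2.0, 2.2, 2.4], "denoise_strength": [0.1, 0.3, 0.5]}
```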
如前所述,实际应用中可以基于训练数据集中多张图像执行方法300,直至满足预设条件。满足预设条件后停止调整该至少一个图像处理模块中的参数,或者说,停止刷新该至少一个图像处理模块中的参数。As mentioned above, in practical applications, the method 300 can be executed based on multiple images in the training data set until a preset condition is met. Stop adjusting the parameters in the at least one image processing module after the preset condition is met, or stop refreshing the parameters in the at least one image processing module.
示例性地,预设条件可以为视觉任务模型的推理的准确度大于或等于第七阈值。Exemplarily, the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the seventh threshold.
在视觉任务模型的推理的准确度大于或等于第七阈值的情况下,不再执行方法300,即停止调整该至少一个图像处理模块中的参数。In a case where the inference accuracy of the visual task model is greater than or equal to the seventh threshold, the method 300 is not executed again, that is, the adjustment of the parameters in the at least one image processing module is stopped.
第七阈值可以为预设值。或者,第七阈值可以是在没有调整该至少一个图像处理模块中的参数的情况下得到的视觉任务模型的处理的准确度。例如,如图4所示,第七阈值可以在该9个图像处理模块没有调整参数的情况下,视觉任务模型的推理的准确度。The seventh threshold may be a preset value. Alternatively, the seventh threshold may be the processing accuracy of the visual task model obtained without adjusting the parameters in the at least one image processing module. For example, as shown in FIG. 4 , the seventh threshold may be the inference accuracy of the vision task model when the nine image processing modules do not adjust parameters.
That is to say, an image is input into the original image processing modules, that is, the image processing modules whose parameters have not been adjusted, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and that accuracy is used as the seventh threshold. An image is then input into the image processing modules whose parameters have currently been adjusted, the processed image is input into the vision task model for processing, the inference accuracy is calculated, and the currently obtained inference accuracy is compared with the seventh threshold; when the currently obtained inference accuracy is greater than or equal to the seventh threshold, the method 300 is no longer executed. In this way, processing images with the adjusted image processing modules can ensure the performance of the vision task model, or can improve the performance of the vision task model.
可替换地,预设条件可以为连续执行多次方法300后得到的视觉任务模型的损失函数值的变化量小于或等于第三阈值。Alternatively, the preset condition may be that the change amount of the loss function value of the visual task model obtained after performing the method 300 for multiple times is less than or equal to the third threshold.
也就是说,在视觉任务模型的损失函数值的变化趋于稳定的情况下,不再执行方法300。That is to say, when the change of the loss function value of the vision task model tends to be stable, the method 300 is not executed any more.
可替换地,预设条件可以为迭代次数大于或等于第四阈值。Alternatively, the preset condition may be that the number of iterations is greater than or equal to the fourth threshold.
也就是说,在执行方法300的次数大于或等于第四阈值的情况下,不再执行方法300。That is to say, in the case that the number of times the method 300 is executed is greater than or equal to the fourth threshold, the method 300 is not executed any more.
可替换地,上述预设条件可以结合使用。例如,预设条件可以为视觉任务模型的推理的准确度大于或等于第七阈值,且迭代次数大于或等于第四阈值。Alternatively, the above preset conditions may be used in combination. For example, the preset condition may be that the inference accuracy of the visual task model is greater than or equal to the seventh threshold, and the number of iterations is greater than or equal to the fourth threshold.
应理解,以上仅为示例,预设条件还可以为其他形式的条件,本申请对此不做限定。It should be understood that the above are examples only, and the preset conditions may also be conditions in other forms, which are not limited in the present application.
在本申请实施例的方案中,根据视觉任务模型的处理结果调整图像处理模块中的参数,能够获得更适合视觉任务的图像处理模块,有利于提高视觉任务的准确度。In the solutions of the embodiments of the present application, by adjusting the parameters in the image processing module according to the processing results of the visual task model, an image processing module more suitable for the visual task can be obtained, which is conducive to improving the accuracy of the visual task.
需要说明的是,上述方式1、方式2、方式3和方式4中任两种或两种以上的方式可以结合使用。在结合使用时,各个方式可以同时执行,或者,各个方式也可以分别执行。It should be noted that any two or more of the above modes 1, 2, 3 and 4 may be used in combination. When used in combination, each method can be executed at the same time, or each method can also be executed separately.
Optionally, step S304 includes: deleting some image processing modules from the plurality of image processing modules according to the processing result of the vision task model; processing a fifth image through the image processing modules that are not deleted among the plurality of image processing modules to obtain a sixth image, and inputting the sixth image into the vision task model for processing; and adjusting parameters of the image processing modules that are not deleted according to the processing result of the vision task model.
示例性地,第五图像可以为训练数据集中的图像。第五图像的其他描述可以参考前文中的第一图像。第五图像和第一图像可以为相同的图像,也可以为不同的图像。Exemplarily, the fifth image may be an image in the training data set. For other descriptions of the fifth image, reference may be made to the first image above. The fifth image and the first image may be the same image or different images.
示例性地,第六图像可以为RGB图像。第六图像的描述可以参考前文中的第二图像。Exemplarily, the sixth image may be an RGB image. For the description of the sixth image, refer to the second image above.
According to the solution of the embodiment of the present application, performance indicators obtained from the vision task model, for example, object detection accuracy or object segmentation accuracy, are used to adjust the weights of the plurality of image processing modules, and the image processing modules that have a large influence on the performance indicators of the vision task model are retained; in other words, the image processing modules that can maintain or improve the performance indicators of the vision task model are retained. In this way, image processing modules suitable for the vision task model, that is, the image processing modules required by the vision task model, can be obtained, which reduces the time required by the image processing flow, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
Moreover, the performance indicators obtained from the vision task model are used to adjust the parameters of the retained image processing modules; for example, the performance indicators obtained from the vision task model are used to search the design space of the image processing modules, which is conducive to obtaining the optimal parameter configuration of each image processing module, so as to improve the performance of the vision task model.
Optionally, step S304 includes: adjusting parameters of a plurality of image processing modules and weights of the plurality of image processing modules according to the processing result of the vision task model, and deleting some image processing modules from the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
Optionally, step S304 includes: adjusting parameters of a plurality of image processing modules, weights of the plurality of image processing modules, and the processing order of the plurality of image processing modules according to the processing result of the vision task model, and deleting some image processing modules from the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
本申请实施例提供了一种图像处理方法400,方法400可以视为方法300的一种具体实现方式,具体描述参考前述方法300,为了描述简洁,下面在介绍方法400时适当省略部分描述。具体地,方法400采用方式1、方式2和方式4的组合的方式。The embodiment of the present application provides an image processing method 400. The method 400 can be regarded as a specific implementation of the method 300. For the specific description, refer to the aforementioned method 300. For the sake of brevity, some descriptions are appropriately omitted when introducing the method 400 below. Specifically, the method 400 adopts a combination of mode 1, mode 2 and mode 4.
方法400包括步骤S401至步骤S410。下面对步骤S401至步骤S410进行说明。方法400可以视为两个阶段,第一阶段包括步骤S401至步骤S406,第二阶段包括步骤S407至步骤S410。The method 400 includes step S401 to step S410. Steps S401 to S410 will be described below. The method 400 can be regarded as two stages, the first stage includes steps S401 to S406, and the second stage includes steps S407 to S410.
S401,为多个图像处理模块设置初始的权重。S401. Set initial weights for multiple image processing modules.
例如,该多个图像处理模块可以包括如图5所示的9个图像处理模块。将各个图像处理模块的权重表示为w1、w2、w3、w4、w5、w6、w7、w8和w9。该9个权重的总和为1。For example, the plurality of image processing modules may include nine image processing modules as shown in FIG. 5 . The weights of the respective image processing modules are denoted as w1, w2, w3, w4, w5, w6, w7, w8 and w9. The sum of the 9 weights is 1.
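As a minimal sketch (not part of the claimed method), the uniform initialization of step S401 could be written as follows in Python; the nine module names are assumptions standing in for the modules of FIG. 5:

    # Hypothetical module names; the actual pipeline of FIG. 5 may differ.
    module_names = [
        "black_level_compensation", "green_balance", "bad_pixel_repair",
        "demosaic", "bayer_denoise", "auto_white_balance",
        "color_correction", "gamma_correction", "denoise_and_sharpen",
    ]

    # Uniform initialization: each weight is 1/9, so the 9 weights sum to 1.
    weights = {name: 1.0 / len(module_names) for name in module_names}
    assert abs(sum(weights.values()) - 1.0) < 1e-9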
以上仅为示例,其他权重设置方法可以参考步骤S302中的描述。The above is only an example, and other weight setting methods can refer to the description in step S302.
S402,将训练数据集中的图像输入至该多个图像处理模块中进行处理。S402. Input the images in the training data set to the plurality of image processing modules for processing.
也就是基于该多个图像处理模块的权重对输入的图像进行处理。或者说,基于该多个图像处理模块的权重对该多个图像处理模块的处理结果进行调整。That is, the input image is processed based on the weights of the plurality of image processing modules. In other words, the processing results of the multiple image processing modules are adjusted based on the weights of the multiple image processing modules.
例如，按照图5所示的图像处理模块及其对应的权重对输入图像进行处理。For example, the input image is processed according to the image processing modules and their corresponding weights shown in FIG. 5.
示例性地,处理结果可以为RGB图像。Exemplarily, the processing result may be an RGB image.
进一步地,处理结果可以为8bit的RGB图像。Further, the processing result may be an 8-bit RGB image.
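The text does not fix the weighted processing of step S402 to one formula; the sketch below assumes, purely for illustration, that each module's output is blended with its input in proportion to the module's weight and that the final result is quantized to an 8-bit RGB image. The function names and the blending rule are assumptions, not the claimed implementation.

    import numpy as np

    def run_weighted_pipeline(image, modules, weights):
        """modules is an ordered list of (name, callable); each callable maps an
        image array to a processed image array. The module output is blended with
        its input in proportion to the module weight (one possible reading of
        adjusting the processing results by the weights)."""
        x = image.astype(np.float32)
        for name, module in modules:
            w = weights[name]
            x = w * module(x) + (1.0 - w) * x
        # Clip and quantize to an 8-bit RGB result, as described above.
        return np.clip(x, 0.0, 255.0).astype(np.uint8)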
步骤S402与步骤S302对应,具体描述可以参见步骤S302中的描述。Step S402 corresponds to step S302, and for a specific description, refer to the description in step S302.
S403,该多个图像处理模块处理后的结果输入至视觉任务模型中进行推理,得到视觉任务模型的推理结果。S403, input the processed results of the plurality of image processing modules into the visual task model for reasoning, and obtain the reasoning result of the visual task model.
视觉任务模型可以为已经训练好的模型。The vision task model can be a trained model.
S404,将视觉任务模型的推理结果与训练数据集中的图像对应的真值进行对比,根据对比结果调整该多个图像处理模块的权重。S404. Compare the inference result of the visual task model with the true value corresponding to the image in the training data set, and adjust the weights of the plurality of image processing modules according to the comparison result.
或者说,将对比结果反馈至优化算法中,利用优化算法调整该多个图像处理模块的权重。In other words, the comparison result is fed back to the optimization algorithm, and the optimization algorithm is used to adjust the weights of the plurality of image processing modules.
示例性地,优化算法包括贝叶斯优化方法、RNN模型、强化学习算法。Exemplarily, the optimization algorithm includes a Bayesian optimization method, an RNN model, and a reinforcement learning algorithm.
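A minimal sketch of the loop formed by steps S402 to S404, assuming a frozen vision task model that returns one label per image and using a simple random-search update as a stand-in for the Bayesian optimization method, RNN model or reinforcement learning algorithm mentioned above; all names below (pipeline, vision_model, and so on) are placeholders:

    import numpy as np

    def accuracy(weights, pipeline, vision_model, images, labels):
        """Process each image with the weighted pipeline, run the frozen vision
        task model, and compare its inference result with the ground truth."""
        correct = sum(int(vision_model(pipeline(img, weights)) == label)
                      for img, label in zip(images, labels))
        return correct / len(images)

    def tune_weights(init_weights, pipeline, vision_model, images, labels,
                     iterations=100, step=0.05, seed=0):
        rng = np.random.default_rng(seed)
        best_w = dict(init_weights)
        best_acc = accuracy(best_w, pipeline, vision_model, images, labels)
        for _ in range(iterations):
            # Perturb the current weights; a Bayesian optimizer, an RNN controller
            # or an RL agent would propose candidates here instead of random noise.
            cand = {k: max(0.0, v + rng.normal(0.0, step)) for k, v in best_w.items()}
            total = sum(cand.values()) or 1.0
            cand = {k: v / total for k, v in cand.items()}   # keep the sum at 1
            acc = accuracy(cand, pipeline, vision_model, images, labels)
            if acc >= best_acc:        # keep candidates that do not reduce accuracy
                best_w, best_acc = cand, acc
        return best_w, best_acc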
S405,将调整后的图像处理模块的权重作为步骤S402中的图像处理模块的权重,重复步骤S402至步骤S404,直至满足第一预设条件。S405, using the adjusted weight of the image processing module as the weight of the image processing module in step S402, and repeating steps S402 to S404 until the first preset condition is met.
或者,步骤S405也可以为,将调整后的图像处理模块的权重进行归一化处理,将归一化处理后的权重作为步骤S402中的图像处理模块的权重。Alternatively, step S405 may also be to perform normalization processing on the adjusted weights of the image processing modules, and use the normalized weights as the weights of the image processing modules in step S402.
也就是说，在每次调整图像处理模块的权重之后，对调整后的权重进行归一化处理，使得归一化后的权重的总和为1，或者总和接近1。That is to say, after the weights of the image processing modules are adjusted each time, the adjusted weights are normalized, so that the sum of the normalized weights is 1, or the sum is close to 1.
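The normalization described here can be expressed as a one-line rescaling; a sketch:

    def normalize(weights):
        total = sum(weights.values())
        # After rescaling, the weights sum to 1 (up to floating-point error).
        return {name: w / total for name, w in weights.items()}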
满足第一预设条件后,终止步骤S402至步骤S404。示例性地,当前得到的图像处理模块的权重可以视为满足第一预设条件后得到的图像处理模块的权重。After the first preset condition is satisfied, step S402 to step S404 are terminated. Exemplarily, the currently obtained weight of the image processing module may be regarded as the weight of the image processing module obtained after satisfying the first preset condition.
例如,若当前视觉任务模型的准确度大于或等于没有设置图像处理模块的权重的情况下视觉任务模型的准确度,则终止步骤S402至步骤S404。For example, if the accuracy of the current visual task model is greater than or equal to the accuracy of the visual task model when the weight of the image processing module is not set, then step S402 to step S404 are terminated.
步骤S403至步骤S405可以视为方式1的具体实现方式,具体描述可以参考方式1中的描述,第一预设条件的设置方式可以参考方式1中的预设条件,此处不再赘述。Steps S403 to S405 can be regarded as a specific implementation of method 1. For specific description, refer to the description in method 1. For the setting method of the first preset condition, refer to the preset condition in method 1, which will not be repeated here.
S406,根据满足第一预设条件后得到的图像处理模块的权重删除部分图像处理模块。S406. Delete part of the image processing modules according to the weights of the image processing modules obtained after satisfying the first preset condition.
步骤S406与方法2中的步骤S304对应，具体描述可以参考方式2中的相关描述，此处不再赘述。Step S406 corresponds to step S304 in method 2. For a specific description, refer to the relevant description in mode 2, which will not be repeated here.
例如,如图5所示,删除调整后的权重值较小的绿平衡模块、坏点修复模块、bayer降噪模块、色彩校正模块、伽马校正模块以及降噪和锐化模块。For example, as shown in FIG. 5 , delete the green balance module, bad pixel repair module, bayer noise reduction module, color correction module, gamma correction module, and noise reduction and sharpening module with smaller adjusted weight values.
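A sketch of step S406 under the assumption that "smaller weight" means a weight below a chosen threshold; the threshold value is hypothetical, and keeping a fixed number of top-weighted modules would be an equally valid reading of the text:

    def prune_modules(weights, threshold=0.05):
        """Keep only the modules whose converged weight reaches the threshold."""
        kept = {name: w for name, w in weights.items() if w >= threshold}
        removed = [name for name in weights if name not in kept]
        return kept, removed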
S407,将训练数据集中的图像输入至该未被删除的图像处理模块中进行处理。S407. Input the images in the training data set into the undeleted image processing module for processing.
步骤S407中的图像与步骤S402中的图像可以为相同的图像,也可以为不同的图像。The image in step S407 and the image in step S402 may be the same image or different images.
也就是说,将未被删除的图像处理模块中的参数作为调优对象。或者说,将被保留的图像处理模块中的参数作为调优对象。That is to say, the parameters in the image processing module that have not been deleted are used as tuning objects. In other words, the parameters in the reserved image processing module are used as tuning objects.
进一步地,在步骤S407之前,还可以对未被删除的图像处理模块的权重进行归一化处理。Further, before step S407, normalization processing may also be performed on the weights of the image processing modules that have not been deleted.
例如,如图5所示,将训练数据集中的图像输入至黑电平补偿模块、去马赛克模块、自动白平衡模块和伽马校正模块中进行处理。进一步地,在执行步骤S407之前,可以对该4个图像处理模块的权重进行归一化处理。For example, as shown in Figure 5, the images in the training data set are input to the black level compensation module, the demosaic module, the automatic white balance module and the gamma correction module for processing. Further, before performing step S407, the weights of the four image processing modules may be normalized.
S408,将未被删除的图像处理模块处理后的结果输入至视觉任务模型中进行推理,得到视觉任务模型的推理结果。S408. Input the processed results of the undeleted image processing modules into the visual task model for inference, and obtain an inference result of the visual task model.
S409,将视觉任务模型的推理结果与训练数据集中的图像对应的真值进行对比,根据对比结果调整未被删除的图像处理模块中的参数。S409, comparing the inference result of the visual task model with the true value corresponding to the image in the training data set, and adjusting the parameters in the image processing module that have not been deleted according to the comparison result.
或者说,将对比结果反馈至优化算法中,利用优化算法调整该图像处理模块中的参数。In other words, the comparison result is fed back to the optimization algorithm, and the parameters in the image processing module are adjusted using the optimization algorithm.
示例性地,优化算法包括贝叶斯优化方法、RNN模型或强化学习算法。Exemplarily, the optimization algorithm includes Bayesian optimization method, RNN model or reinforcement learning algorithm.
应理解，步骤S409采用的优化算法与步骤S404采用的优化算法可以相同，也可以不同。It should be understood that the optimization algorithm used in step S409 may be the same as or different from the optimization algorithm used in step S404.
S410,将调整后的图像处理模块中的参数作为步骤S407中的图像处理模块中的参数,重复步骤S407至步骤S409,直至满足第二预设条件。S410, using the adjusted parameters in the image processing module as parameters in the image processing module in step S407, and repeating steps S407 to S409 until the second preset condition is met.
满足第二预设条件后,终止步骤S407至步骤S410。示例性地,当前得到的图像处理模块中的参数可以视为满足第二预设条件后得到的图像处理模块中的参数。After the second preset condition is satisfied, step S407 to step S410 are terminated. Exemplarily, the currently obtained parameters in the image processing module may be regarded as parameters in the image processing module obtained after satisfying the second preset condition.
例如,若当前视觉任务模型的准确度大于或等于没有设置图像处理模块的权重的情况下视觉任务模型的准确度,则终止步骤S407至步骤S410。For example, if the accuracy of the current visual task model is greater than or equal to the accuracy of the visual task model when the weight of the image processing module is not set, then step S407 to step S410 are terminated.
步骤S407至步骤S410可以视为方式3的具体实现方式,具体描述可以参考方式3中的描述,此处不再赘述。第二预设条件的设置方式可以参考方式3中的预设条件。Step S407 to step S410 can be regarded as a specific implementation manner of mode 3, and for specific description, refer to the description in mode 3, which will not be repeated here. For the setting method of the second preset condition, reference may be made to the preset condition in method 3.
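The second stage (steps S407 to S410) can be sketched as a search over the parameters of the retained modules, again scored only by the accuracy of the frozen vision task model. The discrete parameter space and the random-search proposal below are assumptions standing in for the Bayesian optimization method, RNN model or reinforcement learning algorithm:

    import random

    def tune_parameters(param_space, build_pipeline, vision_model, images, labels,
                        iterations=200, seed=0):
        """param_space: {module: {param: list of candidate values}} for the modules
        kept after step S406; build_pipeline(params) returns a callable that
        processes one image with the given parameter configuration."""
        rng = random.Random(seed)

        def accuracy(params):
            pipe = build_pipeline(params)
            hits = sum(int(vision_model(pipe(img)) == label)
                       for img, label in zip(images, labels))
            return hits / len(images)

        def sample():
            return {m: {p: rng.choice(vals) for p, vals in space.items()}
                    for m, space in param_space.items()}

        best_params = sample()
        best_acc = accuracy(best_params)
        for _ in range(iterations):
            cand = sample()                  # propose a new parameter configuration
            acc = accuracy(cand)
            if acc >= best_acc:              # analogue of the second preset condition
                best_params, best_acc = cand, acc
        return best_params, best_acc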
根据本申请实施例的方案，利用视觉任务模型得到的性能指标，例如，目标检测的准确度、目标分割准确率等，调整多个图像处理模块的权重，保留对视觉任务模型的性能指标影响较大的图像处理模块，或者说，保留能够维持或提升视觉任务模型的性能指标的图像处理模块。这样，能够得到适合视觉任务模型的图像处理模块，或者说，得到视觉任务模型所需的图像处理模块，减少了图像处理流程所需的时间，节省计算开销，减少计算力的需求，对硬件更加友好。According to the solution of the embodiment of the present application, the performance indicators obtained from the visual task model, such as the accuracy of target detection and the accuracy of target segmentation, are used to adjust the weights of the multiple image processing modules, so as to retain the image processing modules that have a relatively large impact on the performance indicators of the visual task model, or in other words, to retain the image processing modules that can maintain or improve the performance indicators of the visual task model. In this way, the image processing modules suitable for the visual task model, that is, the image processing modules required by the visual task model, can be obtained, which reduces the time required by the image processing flow, saves computing overhead, reduces the demand for computing power, and is more hardware-friendly.
而且，在第一阶段完成后，利用视觉任务模型得到的性能指标调整被保留的图像处理模块中的参数，例如，利用视觉任务模型得到的性能指标对图像处理模块进行设计空间的搜索，有利于得到各个图像处理模块最优的参数配置，以提升视觉任务模型的性能。Moreover, after the first stage is completed, the performance indicators obtained from the visual task model are used to adjust the parameters in the retained image processing modules, for example, to search the design space of the image processing modules, which is conducive to obtaining the optimal parameter configuration of each image processing module, so as to improve the performance of the visual task model.
在另一种可能的实现方式中，方法400中的第一阶段和第二阶段可以同时执行。也就是说同时调整图像处理模块的权重以及图像处理模块中的参数。下面对方法400的第一阶段和第二阶段同时执行的方式进行说明。方法400可以包括以下步骤。以下步骤可以参考前述方法400的第一阶段和第二阶段的描述，为了描述简洁，在描述以下步骤时适当省略部分描述。In another possible implementation manner, the first stage and the second stage in the method 400 may be executed simultaneously. That is to say, the weights of the image processing modules and the parameters in the image processing modules are adjusted at the same time. The manner in which the first stage and the second stage of the method 400 are executed simultaneously is described below. Method 400 may include the following steps. For the following steps, reference may be made to the description of the first stage and the second stage of the aforementioned method 400. For the sake of brevity, part of the description is appropriately omitted when describing the following steps.
1)为多个图像处理模块设置初始的权重。1) Set initial weights for multiple image processing modules.
2)将训练数据集中的图像输入至该多个图像处理模块中进行处理。2) Input the images in the training data set into the plurality of image processing modules for processing.
3)该多个图像处理模块处理后的结果输入至视觉任务模型中进行推理,得到视觉任务模型的推理结果。3) Input the processed results of the plurality of image processing modules into the visual task model for inference, and obtain the inference result of the visual task model.
4)将视觉任务模型的推理结果与训练数据集中的图像对应的真值进行对比,根据对比结果调整该多个图像处理模块的权重以及该多个图像处理模块中的参数。4) comparing the inference result of the vision task model with the true value corresponding to the image in the training data set, and adjusting the weights of the multiple image processing modules and the parameters in the multiple image processing modules according to the comparison results.
或者说,将对比结果反馈至优化算法中,利用优化算法调整该多个图像处理模块的权重。利用优化算法调整该多个图像处理模块中的参数。In other words, the comparison result is fed back to the optimization algorithm, and the optimization algorithm is used to adjust the weights of the plurality of image processing modules. An optimization algorithm is used to adjust parameters in the multiple image processing modules.
示例性地,优化算法包括贝叶斯优化方法、RNN模型、强化学习算法。Exemplarily, the optimization algorithm includes a Bayesian optimization method, an RNN model, and a reinforcement learning algorithm.
调整该多个图像处理模块的权重的优化算法和调整该多个图像处理模块中的参数的优化算法可以相同,也可以不同。The optimization algorithm for adjusting the weights of the multiple image processing modules and the optimization algorithm for adjusting the parameters in the multiple image processing modules may be the same or different.
5)将调整后的图像处理模块的权重作为步骤2)中的图像处理模块的权重,将调整后的图像处理模块中的参数作为步骤2)中的图像处理模块中的参数,重复步骤2)至步骤4),直至训练完成。5) The weight of the image processing module after adjustment is used as the weight of the image processing module in step 2), and the parameter in the image processing module after adjustment is used as the parameter in the image processing module in step 2), repeating step 2) Go to step 4) until the training is completed.
或者,将调整后的图像处理模块的权重进行归一化处理,将归一化处理后的权重作为步骤5)中的图像处理模块的权重。Alternatively, the adjusted weights of the image processing modules are normalized, and the normalized weights are used as the weights of the image processing modules in step 5).
也就是说，在每次调整图像处理模块的权重之后，对调整后的权重进行归一化处理，使得归一化后的权重的总和为1，或者总和接近1。That is to say, after the weights of the image processing modules are adjusted each time, the adjusted weights are normalized, so that the sum of the normalized weights is 1, or the sum is close to 1.
例如,若当前视觉任务模型的准确度大于或等于没有设置图像处理模块的权重的情况下视觉任务模型的推理的准确度,则训练完成。或者说,若当前视觉任务模型的准确度大于或等于方法400执行之前的视觉任务模型的推理的准确度,则训练完成。For example, if the accuracy of the current visual task model is greater than or equal to the inference accuracy of the visual task model when the weight of the image processing module is not set, the training is completed. In other words, if the accuracy of the current visual task model is greater than or equal to the inference accuracy of the visual task model before the method 400 is executed, the training is complete.
6)根据训练完成后的图像处理模块的权重删除部分图像处理模块。步骤6)与方法2中的步骤S304对应,具体描述可以参考方式2中的描述,此处不再赘述。6) Delete part of the image processing modules according to the weights of the image processing modules after training. Step 6) corresponds to step S304 in method 2. For specific description, please refer to the description in method 2, which will not be repeated here.
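When the two stages run simultaneously, as in steps 1) to 6) above, each optimizer step proposes new weights and new module parameters together before the next forward pass. A compressed sketch under the same placeholder interfaces as the earlier sketches; build_pipeline(params) is assumed to return a callable pipe(img, weights):

    import random
    import numpy as np

    def joint_tune(init_weights, param_space, build_pipeline, vision_model,
                   images, labels, iterations=200, step=0.05, seed=0):
        """Jointly perturb the module weights and the module parameters, as in
        steps 1) to 5) of the simultaneous variant of method 400."""
        rng_w = np.random.default_rng(seed)
        rng_p = random.Random(seed)

        def accuracy(weights, params):
            pipe = build_pipeline(params)
            hits = sum(int(vision_model(pipe(img, weights)) == label)
                       for img, label in zip(images, labels))
            return hits / len(images)

        def sample_params():
            return {m: {p: rng_p.choice(vals) for p, vals in space.items()}
                    for m, space in param_space.items()}

        best_w, best_p = dict(init_weights), sample_params()
        best_acc = accuracy(best_w, best_p)
        for _ in range(iterations):
            cand_w = {k: max(0.0, v + rng_w.normal(0.0, step))
                      for k, v in best_w.items()}
            total = sum(cand_w.values()) or 1.0
            cand_w = {k: v / total for k, v in cand_w.items()}   # normalize to sum 1
            cand_p = sample_params()
            acc = accuracy(cand_w, cand_p)
            if acc >= best_acc:
                best_w, best_p, best_acc = cand_w, cand_p, acc
        return best_w, best_p, best_acc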
这样，第一阶段和第二阶段同时执行，能够避免图像处理模块由于参数配置不合理而被删除，使得图像处理模块能够在较优的参数配置下对图像进行处理，进而判断较优的参数配置下的各个图像处理模块对视觉任务模型的性能指标的贡献程度，以保留视觉任务模型所需的图像处理模块，这样能够进一步提高视觉任务模型的性能指标。In this way, the first stage and the second stage are executed at the same time, which can prevent an image processing module from being deleted due to an unreasonable parameter configuration, so that the image processing modules can process images under a better parameter configuration. The degree of contribution of each image processing module, under the better parameter configuration, to the performance indicators of the visual task model can then be judged, so as to retain the image processing modules required by the visual task model, which can further improve the performance indicators of the visual task model.
方法400仅为将方式1、方式2和方式4进行结合的一种示例。方式1、方式2、方式3和方式4还可以以其他实现方式进行结合。Method 400 is only an example of combining mode 1, mode 2 and mode 4. Way 1, way 2, way 3 and way 4 can also be combined in other implementation ways.
示例性地,将方式1、方式2和方式3进行结合。Exemplarily, mode 1, mode 2 and mode 3 are combined.
例如，步骤S304可以包括：根据视觉任务模型的处理结果调整多个图像处理模块的权重和该多个图像处理模块的处理顺序，根据调整后的图像处理模块的权重从该多个图像处理模块中删除部分图像处理模块。For example, step S304 may include: adjusting the weights of the multiple image processing modules and the processing order of the multiple image processing modules according to the processing results of the visual task model, and deleting some image processing modules from the multiple image processing modules according to the adjusted weights of the image processing modules.
再如，步骤S304可以包括：根据视觉任务模型的处理结果调整多个图像处理模块的权重，根据调整后的图像处理模块的权重从该多个图像处理模块中删除部分图像处理模块；根据视觉任务模型的处理结果调整未被删除的图像处理模块的处理顺序。也就是将步骤S304分为两个阶段，在第一阶段中删除部分图像处理模块，在第二阶段中调整未被删除的图像处理模块的处理顺序。For another example, step S304 may include: adjusting the weights of the multiple image processing modules according to the processing results of the visual task model, and deleting some image processing modules from the multiple image processing modules according to the adjusted weights of the image processing modules; and adjusting the processing order of the image processing modules that have not been deleted according to the processing results of the visual task model. That is, step S304 is divided into two stages: in the first stage, some image processing modules are deleted, and in the second stage, the processing order of the image processing modules that have not been deleted is adjusted.
具体的结合方式可以参考方法400,此处不再赘述。For a specific combination manner, reference may be made to the method 400, which will not be repeated here.
应理解,以上结合方式均为示例,还可以对上述四种方式中的任两种及两种以上的方式进行结合,本申请实施例对此不做限定。It should be understood that the above combination manners are examples, and any two or more of the foregoing four manners may also be combined, which is not limited in this embodiment of the present application.
本申请实施例中,调整后的图像处理模块为视觉任务模型所需的图像处理模块。调整后的图像处理模块与视觉任务模型之间具有对应关系。不同的视觉任务模型可以对应不同的图像处理模块。这样,能够根据应用场景选择合适的图像处理流程。In the embodiment of the present application, the adjusted image processing module is an image processing module required by the visual task model. There is a corresponding relationship between the adjusted image processing module and the vision task model. Different vision task models can correspond to different image processing modules. In this way, an appropriate image processing flow can be selected according to the application scenario.
图6示出了本申请实施例提供的图像处理方法700，图6所示的方法可以由图像处理装置执行，该装置可以是云服务设备，也可以是终端设备，例如，电脑、服务器等运算能力足以用来执行图像处理的装置，也可以是由云服务设备和终端设备构成的系统。示例性地，方法700可以由图1中的预处理模块执行。FIG. 6 shows an image processing method 700 provided by an embodiment of the present application. The method shown in FIG. 6 may be executed by an image processing apparatus. The apparatus may be a cloud service device or a terminal device, for example, an apparatus whose computing capability is sufficient to perform image processing, such as a computer or a server, or may be a system composed of a cloud service device and a terminal device. Exemplarily, the method 700 may be executed by the preprocessing module in FIG. 1.
方法700中的目标图像处理模块是由方法300或方法400得到的。为了避免不必要的重复,下面在介绍方法700时适当省略重复的描述。The target image processing module in method 700 is obtained by method 300 or method 400 . In order to avoid unnecessary repetition, repeated descriptions are appropriately omitted when introducing the method 700 below.
方法700包括步骤S701至步骤S704。下面对步骤S701至步骤S704进行详细介绍。The method 700 includes steps S701 to S704. Steps S701 to S704 will be described in detail below.
S701,获取第三图像。S701. Acquire a third image.
第三图像为待处理的图像。The third image is an image to be processed.
示例性地,第三图像可以为传感器获取的raw图。Exemplarily, the third image may be a raw image acquired by the sensor.
示例性地，第三图像可以是终端设备(或者电脑、服务器等其他装置或设备)通过摄像头拍摄到的图像，或者，第三图像还可以是从终端设备(或者电脑、服务器等其他装置或设备)内部获得的图像(例如，终端设备的相册中存储的图像，或者终端设备从云端获取的图像)，本申请实施例对此并不限定。Exemplarily, the third image may be an image captured by a camera of the terminal device (or another apparatus or device such as a computer or a server), or the third image may be an image obtained from inside the terminal device (or another apparatus or device such as a computer or a server), for example, an image stored in an album of the terminal device or an image obtained by the terminal device from the cloud, which is not limited in this embodiment of the present application.
S702,根据视觉任务模型确定至少一个目标图像处理模块。S702. Determine at least one target image processing module according to the visual task model.
该至少一个目标图像处理模块是与视觉任务模型对应的一个或多个图像处理模块。The at least one target image processing module is one or more image processing modules corresponding to the visual task model.
示例性地,视觉任务包括:目标检测、图像分类、目标分割、目标跟踪或图像识别等。Exemplarily, the vision task includes: target detection, image classification, target segmentation, target tracking, or image recognition.
视觉任务模型用于执行视觉任务。例如,视觉任务为目标检测,则视觉任务模型为目标检测模型。再如,视觉任务为图像识别,则视觉任务模型为图像识别模型。The visual task model is used to perform visual tasks. For example, if the vision task is target detection, then the vision task model is the target detection model. For another example, if the visual task is image recognition, then the visual task model is an image recognition model.
视觉任务模型可以为训练好的模型。The vision task model can be a trained model.
在不同的应用场景中,可以采用不同的视觉任务模型,相应地,根据不同的视觉任务模型即可确定与该视觉任务模型匹配的至少一个目标图像处理模块。这样,可以根据不同的应用场景选用不同的图像处理模块。In different application scenarios, different visual task models may be used, and accordingly, at least one target image processing module matching the visual task model may be determined according to different visual task models. In this way, different image processing modules can be selected according to different application scenarios.
对于相同的视觉任务,在不同的应用场景下,可能采用不同的视觉任务模型。例如,对于驾驶场景中的目标检测任务,在曝光过度和曝光不足的情况下采用的视觉任务模型可能是相同的,也可能是不同的。在驾驶的过程中,若当前场景被识别为曝光过度,可以采用第一目标检测模型作为视觉任务模型,根据第一目标检测模型确定第一目标检测模型对应的至少一个目标图像处理模块。若当前场景被识别为曝光不足,可以采用第二目标检测模型作为视觉任务模型,根据第二目标检测模型确定与第二目标检测模型对应的至少一个目标图像处理模块。第一目标检测模型和第二目标检测模型为不同的目标检测模型。这样,可以根据不同的应用场景选择不同的图像处理流程,提高视觉任务模型的性能。For the same vision task, different vision task models may be used in different application scenarios. For example, for an object detection task in a driving scene, the visual task model employed may or may not be the same in overexposed and underexposed situations. During driving, if the current scene is identified as being overexposed, the first target detection model may be used as the visual task model, and at least one target image processing module corresponding to the first target detection model may be determined according to the first target detection model. If the current scene is identified as being underexposed, the second target detection model may be used as the visual task model, and at least one target image processing module corresponding to the second target detection model is determined according to the second target detection model. The first target detection model and the second target detection model are different target detection models. In this way, different image processing processes can be selected according to different application scenarios to improve the performance of the vision task model.
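The overexposure/underexposure example can be read as a simple dispatch from a detected scene condition to a (vision task model, target image processing modules) pair; the scene labels, model names and module lists below are purely illustrative assumptions:

    # Hypothetical registry mapping a scene condition to the detection model and
    # to the target image processing modules obtained for it by method 300/400.
    registry = {
        "overexposed": ("first_detection_model",
                        ["black_level_compensation", "gamma_correction"]),
        "underexposed": ("second_detection_model",
                         ["black_level_compensation", "demosaic",
                          "auto_white_balance", "gamma_correction"]),
    }

    def select_pipeline(scene_condition):
        """Return the vision task model and its target modules for the scene."""
        return registry[scene_condition]

    model_name, target_modules = select_pipeline("underexposed")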
S703,通过该至少一个目标图像处理模块对第三图像进行处理,得到第四图像。S703. Process the third image by the at least one target image processing module to obtain a fourth image.
也就是说,利用与视觉任务模型对应的一个或多个图像处理模块对输入的第三图像进行处理,得到第四图像。That is to say, one or more image processing modules corresponding to the visual task model are used to process the input third image to obtain the fourth image.
示例性地,第四图像可以为RGB图像。或者,第四图像可以为8bit的RGB图像。此处仅为示例,第四图像的类型可以根据视觉任务模型的输入需要设置。Exemplarily, the fourth image may be an RGB image. Alternatively, the fourth image may be an 8-bit RGB image. This is only an example, and the type of the fourth image can be set according to the input requirements of the visual task model.
S704,通过视觉任务模型对第四图像进行处理,得到第四图像的处理结果。S704. Process the fourth image by using the visual task model to obtain a processing result of the fourth image.
第四图像的处理结果也可以理解为第三图像的处理结果。The processing result of the fourth image can also be understood as the processing result of the third image.
第四图像的处理结果即为视觉任务模型的推理结果。视觉任务模型的推理结果与视觉任务的类型有关。The processing result of the fourth image is the reasoning result of the visual task model. The inference results of the visual task model are related to the type of visual task.
例如,视觉任务为目标检测,则视觉任务模型的推理结果可以为第四图像上的目标框以及该目标框中的物体的类别。再如,视觉任务为图像分类,则视觉任务模型的推理结果可以为第四图像的类别。For example, if the vision task is target detection, the inference result of the vision task model may be the target frame on the fourth image and the category of the object in the target frame. For another example, if the vision task is image classification, the reasoning result of the vision task model may be the category of the fourth image.
视觉任务模型和图像处理模块的配置之间具有对应关系。根据视觉任务模型和图像处理模块的配置之间的对应关系可以确定与当前的视觉任务模型匹配的图像处理模块的配置。There is a corresponding relationship between the vision task model and the configuration of the image processing module. According to the corresponding relationship between the visual task model and the configuration of the image processing module, the configuration of the image processing module matching the current visual task model can be determined.
示例性地,图像处理模块的配置包括以下至少一项:图像处理模块的组合、图像处理模块的权重、图像处理模块的处理顺序或者图像处理模块中的参数。Exemplarily, the configuration of the image processing modules includes at least one of the following: a combination of image processing modules, a weight of the image processing modules, a processing order of the image processing modules, or parameters in the image processing modules.
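One way to hold this correspondence is a configuration record per vision task model covering the items listed above (module combination, weights, processing order and module parameters); the concrete names and values below are placeholder assumptions:

    # Hypothetical lookup: vision task model -> image processing configuration.
    pipeline_config = {
        "detection_model_v1": {
            # The list order doubles as the processing order of the combination.
            "modules": ["black_level_compensation", "demosaic",
                        "auto_white_balance", "gamma_correction"],
            "weights": {"black_level_compensation": 0.3, "demosaic": 0.3,
                        "auto_white_balance": 0.2, "gamma_correction": 0.2},
            "params":  {"gamma_correction": {"gamma": 2.2}},
        },
    }

    def configure_for(model_name):
        """Step S702: look up the configuration matched to the vision task model."""
        return pipeline_config[model_name]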
可选地,步骤S702包括:根据视觉任务模型从多个候选图像处理模块中确定至少一个目标图像处理模块。Optionally, step S702 includes: determining at least one target image processing module from multiple candidate image processing modules according to the visual task model.
也就是说,根据视觉任务模型从多个候选图像处理模块中确定一个图像处理模块的组合,该图像处理模块的组合中的图像处理模块即为该至少一个目标图像处理模块。That is to say, a combination of image processing modules is determined from multiple candidate image processing modules according to the visual task model, and an image processing module in the combination of image processing modules is the at least one target image processing module.
在该情况下,当视觉任务模型发生变化时,相应地,图像处理模块的组合也可能发生变化。In this case, when the visual task model changes, the combination of image processing modules may also change accordingly.
视觉任务模型和图像处理模块的组合之间具有对应关系。根据该对应关系即可确定当前视觉任务模型对应的图像处理模块的组合,或者说,根据该对应关系即可确定用于该视觉任务模型所需的图像处理模块,即该至少一个目标图像处理模块。该至少一个目标图像处理模块可以是通过方法300或者方法400得到的。或者,可以理解为,视觉任务模型和图像处理模块的组合之间的对应关系是通过方法300或方法400得到的。There is a correspondence between the combination of the visual task model and the image processing module. According to the corresponding relationship, the combination of image processing modules corresponding to the current visual task model can be determined, or in other words, the image processing module required for the visual task model can be determined according to the corresponding relationship, that is, the at least one target image processing module . The at least one target image processing module may be obtained through the method 300 or the method 400 . Alternatively, it can be understood that the correspondence between the combination of the visual task model and the image processing module is obtained through the method 300 or the method 400 .
例如,视觉任务模型为图5所示的模型,则该至少一个目标图像处理模块包括:黑电平补偿模块、去马赛克模块、自动白平衡模块和伽马校正模块。For example, if the visual task model is the model shown in FIG. 5 , then the at least one target image processing module includes: a black level compensation module, a demosaic module, an automatic white balance module and a gamma correction module.
这样，不同的视觉任务模型对应不同的图像处理模块的组合，当视觉任务模型发生变化时，图像处理模块的组合能够自适应匹配视觉任务模型，使得当前的图像处理模块的组合更适合当前的视觉任务模型，有利于提高视觉任务模型的性能。In this way, different visual task models correspond to different combinations of image processing modules. When the visual task model changes, the combination of image processing modules can adaptively match the visual task model, making the current combination of image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the visual task model.
而且,根据视觉任务模型从多个候选图像处理模块中选择适合的图像处理模块,无需使用所有的候选图像处理模块对图像进行处理,减少了处理流程,降低了对计算力的要求。Moreover, by selecting a suitable image processing module from multiple candidate image processing modules according to the visual task model, it is not necessary to use all the candidate image processing modules to process images, which reduces the processing flow and reduces the requirement for computing power.
可选地,步骤S702包括:根据视觉任务模型确定至少一个目标图像处理模块的权重。至少一个目标图像处理模块的权重用于对至少一个目标图像处理模块的处理结果进行处理,得到第四图像。Optionally, step S702 includes: determining the weight of at least one target image processing module according to the visual task model. The weight of the at least one target image processing module is used to process the processing result of the at least one target image processing module to obtain a fourth image.
在一种实现方式中,不同的视觉任务模型对应的图像处理模块的组合是相同的。当视觉任务模型发生变化时,相应地,图像处理模块的权重可能发生变化。In an implementation manner, the combinations of image processing modules corresponding to different visual task models are the same. When the vision task model changes, the weight of the image processing module may change accordingly.
本申请实施例中,不同的视觉任务模型对应的图像处理模块的组合相同可以理解为不同的视觉任务模型所采用的图像处理模块所实现的功能是相同的。In the embodiment of the present application, the combination of image processing modules corresponding to different visual task models is the same, which may be understood to mean that the functions implemented by the image processing modules adopted by different visual task models are the same.
视觉任务模型和图像处理模块的权重之间具有对应关系。根据该对应关系可以确定当前的视觉任务模型对应的图像处理模块的权重,即该至少一个目标图像处理模块的权重。There is a corresponding relationship between the visual task model and the weights of the image processing module. According to the corresponding relationship, the weight of the image processing module corresponding to the current visual task model, that is, the weight of the at least one target image processing module, can be determined.
例如,视觉任务模型为图4所示的模型,则该至少一个目标图像处理模块可以为图4中的9个图像处理模块,图像处理模块的权重可以为步骤S405得到的权重。For example, if the visual task model is the model shown in FIG. 4, the at least one target image processing module may be the nine image processing modules in FIG. 4, and the weights of the image processing modules may be the weights obtained in step S405.
这样，不同的视觉任务模型对应不同的图像处理模块的权重，当视觉任务模型发生变化时，图像处理模块的权重能够自适应匹配视觉任务模型，使得当前的图像处理模块的权重更适合当前的视觉任务模型，有利于提高视觉任务模型的性能。In this way, different visual task models correspond to different weights of the image processing modules. When the visual task model changes, the weights of the image processing modules can adaptively match the visual task model, making the current weights of the image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the visual task model.
在另一种实现方式中,当视觉任务模型发生变化时,相应地,图像处理模块的权重也可能发生变化,图像处理模块的其他配置可能也发生变化。例如,图像处理模块的组合可能发生变化。In another implementation manner, when the visual task model changes, correspondingly, the weight of the image processing module may also change, and other configurations of the image processing module may also change. For example, the combination of image processing modules may change.
示例性地,视觉任务模型和图像处理模块的权重以及图像处理模块的其他配置情况具有对应关系。这样,根据视觉任务模型可以确定视觉任务模型对应的图像处理模块的权重,以及图像处理模块的其他配置情况。Exemplarily, the visual task model has a corresponding relationship with the weight of the image processing module and other configuration conditions of the image processing module. In this way, the weight of the image processing module corresponding to the visual task model and other configurations of the image processing module can be determined according to the visual task model.
例如,视觉任务模型和图像处理模块的组合以及图像处理模块的权重之间具有对应关系。步骤S702中可以确定视觉任务模型对应的图像处理模块的组合,以及该图像处理模块的组合中的图像处理模块的权重。For example, there is a corresponding relationship between the combination of the visual task model and the image processing module, and the weight of the image processing module. In step S702, a combination of image processing modules corresponding to the visual task model and weights of image processing modules in the combination of image processing modules may be determined.
若视觉任务模型为图5所示的模型,与该视觉任务模型对应的该至少一个目标图像处理模块可以是步骤S406得到的。该至少一个目标图像处理模块包括黑电平补偿模块、去马赛克模块、自动白平衡模块和伽马校正模块。该至少一个目标图像处理模块的权重可以是步骤S405得到的权重。If the visual task model is the model shown in FIG. 5 , the at least one target image processing module corresponding to the visual task model may be obtained in step S406. The at least one target image processing module includes a black level compensation module, a demosaic module, an automatic white balance module and a gamma correction module. The weight of the at least one target image processing module may be the weight obtained in step S405.
可选地,步骤S702包括:根据视觉任务模型确定至少一个目标图像处理模块的处理顺序。Optionally, step S702 includes: determining a processing sequence of at least one target image processing module according to the visual task model.
在一种实现方式中,不同的视觉任务模型对应的图像处理模块的组合是相同的。在该情况下,当视觉任务模型发生变化时,相应地,图像处理模块的处理顺序也可能发生变化。In an implementation manner, the combinations of image processing modules corresponding to different visual task models are the same. In this case, when the visual task model changes, the processing sequence of the image processing module may also change accordingly.
视觉任务模型和图像处理模块的处理顺序之间具有对应关系。根据该对应关系可以确定当前视觉任务模型对应的图像处理模块的处理顺序,即该至少一个目标图像处理模块的处理顺序。There is a corresponding relationship between the visual task model and the processing sequence of the image processing module. According to the corresponding relationship, the processing order of the image processing modules corresponding to the current visual task model can be determined, that is, the processing order of the at least one target image processing module.
这样，不同的视觉任务模型对应不同的图像处理模块的处理顺序，当视觉任务模型发生变化时，图像处理模块的处理顺序能够自适应匹配视觉任务模型，使得当前的图像处理模块的处理顺序更适合当前的视觉任务模型，有利于提高视觉任务模型的性能。In this way, different visual task models correspond to different processing orders of the image processing modules. When the visual task model changes, the processing order of the image processing modules can adaptively match the visual task model, making the current processing order of the image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the visual task model.
在另一种实现方式中,当视觉任务模型发生变化时,相应地,图像处理模块的处理顺序可能发生变化,图像处理模块的其他配置也可能发生变化。例如,图像处理模块的组合可能发生变化。In another implementation manner, when the visual task model changes, correspondingly, the processing order of the image processing module may change, and other configurations of the image processing module may also change. For example, the combination of image processing modules may change.
示例性地,视觉任务模型和图像处理模块的处理顺序以及图像处理模块的其他配置情况具有对应关系。这样,根据该对应关系可以确定视觉任务模型对应的图像处理模块的处理顺序,以及图像处理模块的其他配置情况。Exemplarily, the visual task model has a corresponding relationship with the processing order of the image processing module and other configurations of the image processing module. In this way, the processing sequence of the image processing module corresponding to the visual task model and other configurations of the image processing module can be determined according to the corresponding relationship.
例如,视觉任务模型和图像处理模块的组合以及图像处理模块的处理顺序之间具有对应关系。根据视觉任务模型可以确定视觉任务模型对应的图像处理模块的组合,以及该图像处理模块的组合中的图像处理模块的处理顺序。For example, there is a corresponding relationship between the combination of the visual task model and the image processing module, and the processing sequence of the image processing module. According to the vision task model, the combination of image processing modules corresponding to the vision task model and the processing order of the image processing modules in the combination of image processing modules can be determined.
在该情况下,不同的视觉任务模型对应的图像处理模块的组合可能是相同的,可能是不同的。例如,两个视觉任务模型对应的图像处理模块的组合是相同的,而该图像处理模块的组合中的图像处理模块的处理顺序是不同的。In this case, the combinations of image processing modules corresponding to different visual task models may be the same or different. For example, the combinations of image processing modules corresponding to the two visual task models are the same, but the processing orders of the image processing modules in the combination of image processing modules are different.
再如,视觉任务模型和图像处理模块的组合、图像处理模块的权重以及图像处理模块的处理顺序之间具有对应关系。步骤S702中可以确定视觉任务模型对应的图像处理模块的组合、该图像处理模块的权重以及图像处理模块的处理顺序,即从多个候选图像处理模块中确定目标图像处理模块、目标图像处理模块的权重以及目标图像处理模块的处理顺序。For another example, there is a corresponding relationship between the combination of the visual task model and the image processing module, the weight of the image processing module, and the processing order of the image processing module. In step S702, it is possible to determine the combination of image processing modules corresponding to the visual task model, the weight of the image processing modules, and the processing order of the image processing modules, that is, determine the target image processing module and the target image processing module from multiple candidate image processing modules. weights and the processing order of the target image processing module.
在该情况下,不同的视觉任务模型对应的图像处理模块的组合可能是相同的,也可能是不同的。在图像处理模块的组合相同的情况下,图像处理模块的组合中的图像处理模块的权重可能是相同的,也可能是不同的。在图像处理模块的组合相同的情况下,图像处理模块的组合中的图像处理模块的处理顺序可能是相同的,也可能是不同的。In this case, the combinations of image processing modules corresponding to different visual task models may be the same or different. In the case of the same combination of image processing modules, the weights of the image processing modules in the combination of image processing modules may be the same or different. In the case of the same combination of image processing modules, the processing order of the image processing modules in the combination of image processing modules may be the same or different.
可选地,步骤S702包括:根据视觉任务模型确定该至少一个目标图像处理模块中的参数。Optionally, step S702 includes: determining parameters in the at least one target image processing module according to the visual task model.
在一种实现方式中,不同的视觉任务模型对应的图像处理模块的组合是相同的。当视觉任务模型发生变化时,相应地,图像处理模块中的参数可能发生变化。In an implementation manner, the combinations of image processing modules corresponding to different visual task models are the same. When the vision task model changes, the parameters in the image processing module may change accordingly.
例如，第一视觉任务模型对应的图像处理模块包括：黑电平补偿模块和去马赛克模块。其中，黑电平补偿模块的参数包括参数A1，去马赛克模块的参数包括参数B1。第二视觉任务模型对应的图像处理模块包括：黑电平补偿模块和去马赛克模块。其中，黑电平补偿模块的参数包括参数A2，去马赛克模块的参数包括参数B2。图像在输入第一视觉任务模型和第二视觉任务模型之前，均需要经过黑电平补偿处理和去马赛克处理。但第一视觉任务模型之前的黑电平补偿处理和去马赛克处理与第二视觉任务模型之前的黑电平补偿处理和去马赛克处理所采用的参数是不同的。For example, the image processing modules corresponding to the first visual task model include: a black level compensation module and a demosaic module, where the parameters of the black level compensation module include parameter A1, and the parameters of the demosaic module include parameter B1. The image processing modules corresponding to the second visual task model include: a black level compensation module and a demosaic module, where the parameters of the black level compensation module include parameter A2, and the parameters of the demosaic module include parameter B2. Before being input into the first visual task model and the second visual task model, an image needs to undergo black level compensation processing and demosaic processing. However, the parameters used in the black level compensation processing and demosaic processing before the first visual task model are different from those used in the black level compensation processing and demosaic processing before the second visual task model.
视觉任务模型和图像处理模块中的参数之间具有对应关系。根据视觉任务模型可以确定视觉任务模型对应的图像处理模块中的参数,即该至少一个目标图像处理模块中的参数。There is a corresponding relationship between the visual task model and the parameters in the image processing module. According to the visual task model, parameters in the image processing module corresponding to the visual task model can be determined, that is, parameters in the at least one target image processing module.
这样，不同的视觉任务模型对应不同的图像处理模块中的参数，当视觉任务模型发生变化时，图像处理模块中的参数能够自适应匹配视觉任务模型，使得当前的图像处理模块中的参数更适合当前的视觉任务模型，有利于提高视觉任务模型的性能。In this way, different visual task models correspond to different parameters in the image processing modules. When the visual task model changes, the parameters in the image processing modules can adaptively match the visual task model, making the current parameters in the image processing modules more suitable for the current visual task model, which is beneficial to improving the performance of the visual task model.
在另一种实现方式中,视觉任务模型和图像处理模块中的参数以及图像处理模块的其他配置情况具有对应关系。这样,根据该对应关系可以确定当前视觉任务模型对应的图像处理模块中的参数,以及图像处理模块的其他配置情况。In another implementation manner, the visual task model has a corresponding relationship with parameters in the image processing module and other configurations of the image processing module. In this way, parameters in the image processing module corresponding to the current visual task model and other configurations of the image processing module can be determined according to the corresponding relationship.
例如,视觉任务模型和图像处理模块的组合以及图像处理模块中的参数之间具有对应关系。根据该对应关系可以确定当前视觉任务模型对应的图像处理模块的组合以及该图像处理模块的组合中的图像处理模块中的参数。For example, there is a corresponding relationship between the combination of the visual task model and the image processing module, and the parameters in the image processing module. According to the corresponding relationship, the combination of image processing modules corresponding to the current visual task model and the parameters of the image processing modules in the combination of image processing modules can be determined.
在该情况下,不同的视觉任务模型对应的图像处理模块的组合可能是相同的,可能是不同的。例如,两个视觉任务模型对应的图像处理模块的组合是相同的,而该图像处理模块的组合中的图像处理模块中的参数是不同的。In this case, the combinations of image processing modules corresponding to different visual task models may be the same or different. For example, the combinations of image processing modules corresponding to the two vision task models are the same, but the parameters of the image processing modules in the combination of image processing modules are different.
再如,视觉任务模型和图像处理模块的组合、图像处理模块的权重以及图像处理模块中的参数之间具有对应关系。根据该对应关系可以确定视觉任务模型对应的图像处理模块的组合、该图像处理模块的权重以及图像处理模块中的参数。For another example, there is a corresponding relationship between the combination of the visual task model and the image processing module, the weight of the image processing module, and the parameters in the image processing module. According to the corresponding relationship, the combination of the image processing modules corresponding to the visual task model, the weight of the image processing modules, and the parameters in the image processing modules can be determined.
在该情况下,不同的视觉任务模型对应的图像处理模块的组合可能是相同的,也可能是不同的。在图像处理模块的组合相同的情况下,图像处理模块的组合中的图像处理模块的权重可能是相同的,也可能是不同的。在图像处理模块的组合相同的情况下,图像处理模块的组合中的图像处理模块中的参数可能是相同的,也可能是不同的。In this case, the combinations of image processing modules corresponding to different visual task models may be the same or different. In the case of the same combination of image processing modules, the weights of the image processing modules in the combination of image processing modules may be the same or different. In the case of the same combination of image processing modules, the parameters of the image processing modules in the combination of image processing modules may be the same or different.
根据本申请实施例的方案，不同的视觉任务模型对应不同的图像处理模块的配置，当视觉任务模型发生变化时，图像处理模块能够自适应匹配视觉任务模型，使得图像处理流程更适合视觉任务模型，有利于提高视觉任务模型的性能。According to the solution of the embodiment of the present application, different visual task models correspond to different configurations of the image processing modules. When the visual task model changes, the image processing modules can adaptively match the visual task model, making the image processing flow more suitable for the visual task model, which is beneficial to improving the performance of the visual task model.
下面结合图7至图8对本申请实施例的装置进行说明。应理解,下面描述的装置能够执行前述本申请实施例的方法,为了避免不必要的重复,下面在介绍本申请实施例的装置时适当省略重复的描述。The device of the embodiment of the present application will be described below with reference to FIG. 7 to FIG. 8 . It should be understood that the device described below can execute the method of the aforementioned embodiment of the present application. In order to avoid unnecessary repetition, repeated descriptions are appropriately omitted when introducing the device of the embodiment of the present application below.
图7是本申请实施例的图像处理装置的示意性框图。图7所示的图像处理装置4000包括获取单元4010和处理单元4020。FIG. 7 is a schematic block diagram of an image processing device according to an embodiment of the present application. The image processing device 4000 shown in FIG. 7 includes an acquisition unit 4010 and a processing unit 4020 .
获取单元4010和处理单元4020可以用于执行本申请实施例的图像处理方法。The acquisition unit 4010 and the processing unit 4020 may be used to execute the image processing method of the embodiment of the present application.
在一种可能的实现方式中,装置4000可以用于执行方法300或方法400。In a possible implementation manner, the apparatus 4000 may be used to execute the method 300 or the method 400 .
具体地,获取单元4010用于获取第一图像。Specifically, the acquiring unit 4010 is configured to acquire the first image.
处理单元4020用于:通过至少一个图像处理模块对第一图像进行处理,得到第二图像;将第二图像输入至视觉任务模型中进行处理;根据视觉任务模型的处理结果调整至少一个图像处理模块。The processing unit 4020 is used to: process the first image through at least one image processing module to obtain a second image; input the second image into the visual task model for processing; adjust at least one image processing module according to the processing result of the visual task model .
可选地,作为一个实施例,至少一个图像处理模块包括多个图像处理模块,处理单元4020具体用于:Optionally, as an embodiment, at least one image processing module includes multiple image processing modules, and the processing unit 4020 is specifically configured to:
根据视觉任务模型的处理结果删除多个图像处理模块中的部分图像处理模块。Part of the image processing modules in the plurality of image processing modules are deleted according to the processing results of the visual task model.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型的处理结果调整多个图像处理模块的权重,多个图像处理模块的权重用于对多个图像处理模块的处理结果进行处理,得到第二图像;根据调整后的多个图像处理模块的权重删除多个图像处理模块中的部分图像处理模块。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: adjust the weights of multiple image processing modules according to the processing results of the visual task model, and the weights of the multiple image processing modules are used to process the processing results of the multiple image processing modules Perform processing to obtain a second image; delete part of the image processing modules in the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型的处理结果调整至少一个图像处理模块中的参数。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: adjust parameters in at least one image processing module according to a processing result of the visual task model.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型的处理结果调整至少一个图像处理模块的处理顺序。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: adjust a processing sequence of at least one image processing module according to a processing result of the visual task model.
可选地，作为一个实施例，至少一个图像处理模块包括：黑电平补偿模块、绿平衡模块、坏点修正模块、去马赛克模块、拜耳降噪模块、自动白平衡模块、色彩校正模块、伽马校正模块或降噪及锐化模块。Optionally, as an embodiment, the at least one image processing module includes: a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
在另一种可能的实现方式中,装置4000可以用于执行方法700。In another possible implementation manner, the apparatus 4000 may be used to execute the method 700 .
具体地,获取单元4010用于获取第三图像。Specifically, the acquiring unit 4010 is configured to acquire a third image.
处理单元4020用于:根据视觉任务模型确定至少一个目标图像处理模块;通过至少一个目标图像处理模块对第三图像进行处理,得到第四图像;通过视觉任务模型对第四图像进行处理,得到第四图像的处理结果。The processing unit 4020 is configured to: determine at least one target image processing module according to the visual task model; process the third image through at least one target image processing module to obtain the fourth image; process the fourth image through the visual task model to obtain the fourth image Four image processing results.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型从多个候选图像处理模块中确定至少一个目标图像处理模块。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: determine at least one target image processing module from multiple candidate image processing modules according to the visual task model.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型确定至少一个目标图像处理模块中的参数。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: determine parameters in at least one target image processing module according to the visual task model.
可选地,作为一个实施例,处理单元4020具体用于:根据视觉任务模型确定至少一个目标图像处理模块的处理顺序。Optionally, as an embodiment, the processing unit 4020 is specifically configured to: determine a processing sequence of at least one target image processing module according to the visual task model.
可选地，作为一个实施例，至少一个目标图像处理模块包括：黑电平补偿模块、绿平衡模块、坏点修正模块、去马赛克模块、拜耳降噪模块、自动白平衡模块、色彩校正模块、伽马校正模块或降噪及锐化模块。Optionally, as an embodiment, the at least one target image processing module includes: a black level compensation module, a green balance module, a bad pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
需要说明的是,上述装置4000以功能单元的形式体现。这里的术语“单元”可以通过软件和/或硬件形式实现,对此不作具体限定。It should be noted that the above device 4000 is embodied in the form of functional units. The term "unit" here may be implemented in the form of software and/or hardware, which is not specifically limited.
例如，"单元"可以是实现上述功能的软件程序、硬件电路或二者结合。所述硬件电路可能包括应用特有集成电路(application specific integrated circuit, ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。For example, a "unit" may be a software program, a hardware circuit, or a combination of the two that implements the above functions. The hardware circuit may include an application-specific integrated circuit (ASIC), an electronic circuit, a processor (for example, a shared processor, a dedicated processor, or a group processor) and a memory for executing one or more software or firmware programs, a merged logic circuit, and/or other suitable components that support the described functions.
因此，在本申请的实施例中描述的各示例的单元，能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行，取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能，但是这种实现不应认为超出本申请的范围。Therefore, the units of each example described in the embodiments of the present application can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
图8是本申请实施例提供的图像处理装置的硬件结构示意图。图8所示的图像处理装置6000(该装置6000具体可以是一种计算机设备)包括存储器6001、处理器6002、通信接口6003以及总线6004。其中,存储器6001、处理器6002、通信接口6003通过总线6004实现彼此之间的通信连接。FIG. 8 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application. The image processing apparatus 6000 shown in FIG. 8 (the apparatus 6000 may specifically be a computer device) includes a memory 6001 , a processor 6002 , a communication interface 6003 and a bus 6004 . Wherein, the memory 6001 , the processor 6002 , and the communication interface 6003 are connected to each other through a bus 6004 .
存储器6001可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器6001可以存储程序,当存储器6001中存储的程序被处理器6002执行时,处理器6002用于执行本申请实施例的图像处理方法的各个步骤。具体地,处理器6002可以执行上文中的方法300、方法400或方法700。The memory 6001 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM). The memory 6001 may store programs, and when the programs stored in the memory 6001 are executed by the processor 6002, the processor 6002 is configured to execute various steps of the image processing method of the embodiment of the present application. Specifically, the processor 6002 may execute the method 300, the method 400 or the method 700 above.
处理器6002可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请方法实施例的图像处理方法。The processor 6002 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU) or one or more The integrated circuit is used to execute related programs to realize the image processing method of the method embodiment of the present application.
处理器6002还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的图像处理方法的各个步骤可以通过处理器6002中的硬件的集成逻辑电路或者软件形式的指令完成。The processor 6002 may also be an integrated circuit chip with signal processing capabilities. During implementation, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 6002 or instructions in the form of software.
上述处理器6002还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器6001,处理器6002读取存储器6001中的信息,结合其硬件完成图7所示的装置中包括的单元所需执行的功能,或者,执行本申请方法实施例的图像处理方法。The above-mentioned processor 6002 can also be a general-purpose processor, a digital signal processor (digital signal processing, DSP), an application-specific integrated circuit (ASIC), a ready-made programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, Discrete gate or transistor logic devices, discrete hardware components. Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register. The storage medium is located in the memory 6001, and the processor 6002 reads the information in the memory 6001, and combines its hardware to complete the functions required by the units included in the device shown in Figure 7, or execute the image processing method of the method embodiment of the present application .
通信接口6003使用例如但不限于收发器一类的收发装置,来实现装置6000与其他设备或通信网络之间的通信。例如,可以通过通信接口6003获取训练数据。The communication interface 6003 implements communication between the apparatus 6000 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 6003 .
总线6004可包括在装置6000各个部件(例如,存储器6001、处理器6002、通信接口6003)之间传送信息的通路。The bus 6004 may include pathways for transferring information between various components of the device 6000 (eg, memory 6001 , processor 6002 , communication interface 6003 ).
应注意,尽管上述装置6000仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,装置6000还可以包括实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当理解,装置6000还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,装置6000也可仅仅包括实现本申请实施例所必须的器件,而不必包括图8中所示的全部器件。It should be noted that although the above device 6000 only shows a memory, a processor, and a communication interface, those skilled in the art should understand that the device 6000 may also include other devices necessary for normal operation during specific implementation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 6000 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the device 6000 may also only include the components necessary to realize the embodiment of the present application, and does not necessarily include all the components shown in FIG. 8 .
本申请实施例还提供了一种计算机可读存储介质，该计算机可读介质存储用于设备执行的程序代码，该程序代码包括用于执行本申请实施例中的图像处理方法。An embodiment of the present application further provides a computer-readable storage medium. The computer-readable medium stores program code to be executed by a device, and the program code is used to execute the image processing method in the embodiments of the present application.
本申请实施例还提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行本申请实施例中的图像处理方法。The embodiment of the present application further provides a computer program product including instructions, and when the computer program product is run on a computer, the computer is made to execute the image processing method in the embodiment of the present application.
The embodiments of the present application further provide a chip. The chip includes a processor and a data interface, and the processor reads, through the data interface, instructions stored in a memory, to perform the image processing method in the embodiments of the present application.
Optionally, in an implementation, the chip may further include a memory, and the memory stores instructions. The processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to perform the method in any one of the implementations of the first aspect or the second aspect.
The chip may specifically be an FPGA or an ASIC.
It should be understood that the processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or may be any conventional processor.
It should further be understood that the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both a volatile memory and a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchlink dynamic random access memory (SLDRAM), and a direct rambus random access memory (DR RAM).
All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When software is used for implementation, the foregoing embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or may be transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive.
It should be understood that the term "and/or" in this specification describes only an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate the following three cases: only A exists, both A and B exist, and only B exists, where A and B may be singular or plural. In addition, the character "/" in this specification generally indicates an "or" relationship between the associated objects, but may also indicate an "and/or" relationship, which may be understood with reference to the context.
In this application, "at least one" means one or more, and "a plurality of" means two or more. "At least one of the following" or a similar expression means any combination of the listed items, including a single item or any combination of a plurality of items. For example, at least one of a, b, or c may indicate a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c each may be singular or plural.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the foregoing processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation processes of the embodiments of the present application.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed in this specification can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such an implementation should not be considered as going beyond the scope of the present application.
A person skilled in the art may clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical function division and may be another division in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be implemented in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each of the units may exist alone physically, or two or more units may be integrated into one unit.
When the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present application essentially, or the part contributing to the prior art, or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
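For illustration only, and not as part of the claimed subject matter, the following is a minimal Python sketch of the first image processing method recited in claims 1 to 6 below: a first image is processed by several weighted image processing modules, the fused second image is scored by a visual task model, and the module weights are adjusted (with low-weight modules deleted) according to the model's processing result. The module implementations, the weighting and fusion scheme, the finite-difference adjustment, and the pruning threshold are all assumptions introduced for this sketch and are not taken from the application. A second sketch, corresponding to claims 7 to 11, follows the claims.

import numpy as np

class GammaCorrection:
    """Hypothetical gamma correction module (one candidate ISP module)."""
    def __init__(self, gamma=2.2):
        self.gamma = gamma

    def __call__(self, img):
        return np.clip(img, 0.0, 1.0) ** (1.0 / self.gamma)

class NoiseReduction:
    """Hypothetical noise reduction module (a 3x3 box blur as a stand-in)."""
    def __call__(self, img):
        padded = np.pad(img, 1, mode="edge")
        out = np.zeros_like(img)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += padded[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]
        return out / 9.0

def fuse(outputs, weights):
    # The module weights are used to process the module outputs to obtain the second image.
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return sum(wi * oi for wi, oi in zip(w, outputs))

def adjust_pipeline(first_image, modules, weights, task_loss, lr=0.1, prune_below=0.05):
    """One adjustment step: estimate how the visual task model's loss reacts to each
    module's weight, update the weights accordingly, and delete modules whose weight
    falls below the (assumed) pruning threshold."""
    outputs = [m(first_image) for m in modules]
    base = task_loss(fuse(outputs, weights))   # processing result of the visual task model
    new_weights = []
    for i, w in enumerate(weights):
        bumped = list(weights)
        bumped[i] = w + 1e-2                   # finite-difference probe of this module's weight
        grad = (task_loss(fuse(outputs, bumped)) - base) / 1e-2
        new_weights.append(max(w - lr * grad, 0.0))
    kept = [(m, w) for m, w in zip(modules, new_weights) if w >= prune_below]
    return [m for m, _ in kept], [w for _, w in kept]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    first_image = rng.random((8, 8))
    modules = [GammaCorrection(), NoiseReduction()]
    weights = [0.5, 0.5]
    # Stand-in for the visual task model: the "loss" is lower when the image is brighter.
    task_loss = lambda img: float(np.mean((img - 1.0) ** 2))
    modules, weights = adjust_pipeline(first_image, modules, weights, task_loss)
    print(len(modules), weights)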

Claims (26)

  1. An image processing method, comprising:
    acquiring a first image;
    processing the first image by using at least one image processing module to obtain a second image;
    inputting the second image into a visual task model for processing; and
    adjusting the at least one image processing module according to a processing result of the visual task model.
  2. The method according to claim 1, wherein the at least one image processing module comprises a plurality of image processing modules, and the adjusting the at least one image processing module according to a processing result of the visual task model comprises:
    deleting some of the plurality of image processing modules according to the processing result of the visual task model.
  3. The method according to claim 2, wherein the deleting some of the plurality of image processing modules according to the processing result of the visual task model comprises:
    adjusting weights of the plurality of image processing modules according to the processing result of the visual task model, wherein the weights of the plurality of image processing modules are used to process processing results of the plurality of image processing modules to obtain the second image; and
    deleting some of the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
  4. The method according to any one of claims 1 to 3, wherein the adjusting the at least one image processing module according to a processing result of the visual task model comprises:
    adjusting a parameter in the at least one image processing module according to the processing result of the visual task model.
  5. The method according to any one of claims 1 to 4, wherein the adjusting the at least one image processing module according to a processing result of the visual task model comprises:
    adjusting a processing order of the at least one image processing module according to the processing result of the visual task model.
  6. The method according to any one of claims 1 to 5, wherein the at least one image processing module comprises:
    a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  7. An image processing method, comprising:
    acquiring a third image;
    determining at least one target image processing module according to a visual task model;
    processing the third image by using the at least one target image processing module to obtain a fourth image; and
    processing the fourth image by using the visual task model to obtain a processing result of the fourth image.
  8. The method according to claim 7, wherein the determining at least one target image processing module according to a visual task model comprises:
    determining the at least one target image processing module from a plurality of candidate image processing modules according to the visual task model.
  9. The method according to claim 7 or 8, wherein the determining at least one target image processing module according to a visual task model comprises:
    determining a parameter in the at least one target image processing module according to the visual task model.
  10. The method according to any one of claims 7 to 9, wherein the determining at least one target image processing module according to a visual task model comprises:
    determining a processing order of the at least one target image processing module according to the visual task model.
  11. The method according to any one of claims 7 to 10, wherein the at least one target image processing module comprises:
    a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  12. An image processing apparatus, comprising:
    an acquisition unit, configured to acquire a first image; and
    a processing unit, configured to:
    process the first image by using at least one image processing module to obtain a second image;
    input the second image into a visual task model for processing; and
    adjust the at least one image processing module according to a processing result of the visual task model.
  13. The apparatus according to claim 12, wherein the at least one image processing module comprises a plurality of image processing modules, and the processing unit is specifically configured to:
    delete some of the plurality of image processing modules according to the processing result of the visual task model.
  14. The apparatus according to claim 13, wherein the processing unit is specifically configured to:
    adjust weights of the plurality of image processing modules according to the processing result of the visual task model, wherein the weights of the plurality of image processing modules are used to process processing results of the plurality of image processing modules to obtain the second image; and
    delete some of the plurality of image processing modules according to the adjusted weights of the plurality of image processing modules.
  15. The apparatus according to any one of claims 12 to 14, wherein the processing unit is specifically configured to:
    adjust a parameter in the at least one image processing module according to the processing result of the visual task model.
  16. The apparatus according to any one of claims 12 to 15, wherein the processing unit is specifically configured to:
    adjust a processing order of the at least one image processing module according to the processing result of the visual task model.
  17. The apparatus according to any one of claims 12 to 16, wherein the at least one image processing module comprises:
    a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  18. An image processing apparatus, comprising:
    an acquisition unit, configured to acquire a third image; and
    a processing unit, configured to:
    determine at least one target image processing module according to a visual task model;
    process the third image by using the at least one target image processing module to obtain a fourth image; and
    process the fourth image by using the visual task model to obtain a processing result of the fourth image.
  19. The apparatus according to claim 18, wherein the processing unit is specifically configured to:
    determine the at least one target image processing module from a plurality of candidate image processing modules according to the visual task model.
  20. The apparatus according to claim 18 or 19, wherein the processing unit is specifically configured to:
    determine a parameter in the at least one target image processing module according to the visual task model.
  21. The apparatus according to any one of claims 18 to 20, wherein the processing unit is specifically configured to:
    determine a processing order of the at least one target image processing module according to the visual task model.
  22. The apparatus according to any one of claims 18 to 21, wherein the at least one target image processing module comprises:
    a black level compensation module, a green balance module, a dead pixel correction module, a demosaic module, a Bayer noise reduction module, an automatic white balance module, a color correction module, a gamma correction module, or a noise reduction and sharpening module.
  23. An image processing apparatus, comprising a processor and a memory, wherein the memory is configured to store program instructions, and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1 to 6 or claims 7 to 11.
  24. A computer-readable storage medium, wherein the computer-readable storage medium is configured to store program code to be executed by a device, and the program code includes instructions for performing the method according to any one of claims 1 to 6 or claims 7 to 11.
  25. A computer program product comprising instructions, wherein when the computer program product is run on a computer, the computer is enabled to perform the method according to any one of claims 1 to 6 or claims 7 to 11.
  26. A chip, comprising a processor and a data interface, wherein the processor reads, through the data interface, instructions stored in a memory to perform the method according to any one of claims 1 to 6 or claims 7 to 11.
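For illustration only, and again not as part of the claims, the following is a minimal Python sketch corresponding to the second method recited in claims 7 to 11: the target image processing modules, their parameters, and their processing order are determined according to the visual task model (here keyed by a task type), the third image is processed through them to obtain a fourth image, and the fourth image is then processed by the visual task model. The candidate module implementations, the task-to-module mapping, and the stand-in model are assumptions made for this sketch.

from typing import Callable, Dict, List

import numpy as np

# Hypothetical registry of candidate image processing modules.
CANDIDATE_MODULES: Dict[str, Callable[[np.ndarray], np.ndarray]] = {
    "black_level": lambda img: np.clip(img - 0.02, 0.0, None),
    "white_balance": lambda img: img * np.array([1.1, 1.0, 0.9]),
    "gamma": lambda img: np.clip(img, 0.0, 1.0) ** (1.0 / 2.2),
}

# Assumed mapping from visual task type to the target modules and their processing order.
TASK_CONFIG: Dict[str, List[str]] = {
    "detection": ["black_level", "gamma"],
    "classification": ["black_level", "white_balance", "gamma"],
}

def determine_target_modules(task_type: str) -> List[Callable[[np.ndarray], np.ndarray]]:
    # Determine the target image processing modules (and their order) according to
    # the visual task model that will consume the processed image.
    return [CANDIDATE_MODULES[name] for name in TASK_CONFIG[task_type]]

def run(third_image: np.ndarray, task_type: str,
        visual_task_model: Callable[[np.ndarray], object]):
    fourth_image = third_image
    for module in determine_target_modules(task_type):
        fourth_image = module(fourth_image)    # obtain the fourth image
    return visual_task_model(fourth_image)     # processing result of the fourth image

if __name__ == "__main__":
    img = np.random.default_rng(1).random((4, 4, 3))
    # Stand-in for a visual task model: returns per-channel means as a dummy "result".
    result = run(img, "detection", lambda x: x.mean(axis=(0, 1)))
    print(result)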
PCT/CN2021/102739 2021-06-28 2021-06-28 Image processing method and apparatus WO2023272431A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/102739 WO2023272431A1 (en) 2021-06-28 2021-06-28 Image processing method and apparatus
CN202180099442.4A CN117529725A (en) 2021-06-28 2021-06-28 Image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/102739 WO2023272431A1 (en) 2021-06-28 2021-06-28 Image processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2023272431A1 (en)

Family

ID=84690936

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/102739 WO2023272431A1 (en) 2021-06-28 2021-06-28 Image processing method and apparatus

Country Status (2)

Country Link
CN (1) CN117529725A (en)
WO (1) WO2023272431A1 (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7235768B1 (en) * 2005-02-28 2007-06-26 United States Of America As Represented By The Secretary Of The Air Force Solid state vision enhancement device
CN102663745A (en) * 2012-03-23 2012-09-12 北京理工大学 Color fusion image quality evaluation method based on vision task.
CN110348572A (en) * 2019-07-09 2019-10-18 上海商汤智能科技有限公司 The processing method and processing device of neural network model, electronic equipment, storage medium
CN111901594A (en) * 2020-06-29 2020-11-06 北京大学 Visual analysis task-oriented image coding method, electronic device and medium
CN111898638A (en) * 2020-06-29 2020-11-06 北京大学 Image processing method, electronic device and medium fusing different visual tasks
CN111881785A (en) * 2020-07-13 2020-11-03 北京市商汤科技开发有限公司 Passenger flow analysis method and device, storage medium and system
CN112529150A (en) * 2020-12-01 2021-03-19 华为技术有限公司 Model structure, model training method, image enhancement method and device

Also Published As

Publication number Publication date
CN117529725A (en) 2024-02-06

Similar Documents

Publication Publication Date Title
WO2020253416A1 (en) Object detection method and device, and computer storage medium
WO2021043273A1 (en) Image enhancement method and apparatus
WO2021120719A1 (en) Neural network model update method, and image processing method and device
WO2021043168A1 (en) Person re-identification network training method and person re-identification method and apparatus
WO2020192736A1 (en) Object recognition method and device
WO2021018163A1 (en) Neural network search method and apparatus
WO2021147325A1 (en) Object detection method and apparatus, and storage medium
WO2022001805A1 (en) Neural network distillation method and device
WO2021018251A1 (en) Image classification method and device
CN110222717B (en) Image processing method and device
CN110309856A (en) Image classification method, the training method of neural network and device
US20220148291A1 (en) Image classification method and apparatus, and image classification model training method and apparatus
CN111914997B (en) Method for training neural network, image processing method and device
WO2021018245A1 (en) Image classification method and apparatus
CN113011562B (en) Model training method and device
CN112561027A (en) Neural network architecture searching method, image processing method, device and storage medium
CN110222718B (en) Image processing method and device
CN111291809A (en) Processing device, method and storage medium
CN111695673B (en) Method for training neural network predictor, image processing method and device
WO2022267036A1 (en) Neural network model training method and apparatus and data processing method and apparatus
WO2022179606A1 (en) Image processing method and related apparatus
CN112529146A (en) Method and device for training neural network model
WO2022156475A1 (en) Neural network model training method and apparatus, and data processing method and apparatus
CN113807183A (en) Model training method and related equipment
CN113128285A (en) Method and device for processing video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21947395

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180099442.4

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21947395

Country of ref document: EP

Kind code of ref document: A1