WO2023025063A1 - Image signal processor optimization method and device - Google Patents

Image signal processor optimization method and device

Info

Publication number
WO2023025063A1
Authority
WO
WIPO (PCT)
Prior art keywords
evaluation score
sample
evaluation
image signal
signal processor
Prior art date
Application number
PCT/CN2022/113673
Other languages
English (en)
French (fr)
Inventor
沈凌浩
伊藤厚史
Original Assignee
索尼集团公司
沈凌浩
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 索尼集团公司, 沈凌浩
Priority to CN202280056428.0A (published as CN118159995A)
Publication of WO2023025063A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/96 Management of image or video recognition tasks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50 Constructional details
    • H04N23/54 Mounting of pick-up tubes, electronic image sensors, deviation or focusing coils
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H04N23/84 Camera processing pipelines; Components thereof for processing colour signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H04N23/84 Camera processing pipelines; Components thereof for processing colour signals
    • H04N23/88 Camera processing pipelines; Components thereof for processing colour signals for colour balance, e.g. white-balance circuits or colour temperature control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/64 Circuits for processing colour signals

Definitions

  • The present disclosure relates to image signal processing, and more particularly to optimization of image signal processors.
  • An image signal processor (ISP) is the underlying image processing device in electronic photography equipment. It converts the raw light signal captured by the optical sensor into pictures that can be viewed by human eyes on various display devices, and it is widely used in current digital cameras, mobile phone cameras, and other equipment.
  • The performance of the ISP has a great influence on the quality of the final captured image.
  • ISPs generally provide a large number of configuration parameters for adjustment, and ISP manufacturers often have experts tune the configuration parameters.
  • The tuning target of the ISP is the human visual experience, such as texture clarity and visual noise.
  • In one aspect, an optimization apparatus for an image signal processor is provided, the optimization apparatus including a processing circuit configured to: process a sample picture for image signal processor optimization using a simulator of the image signal processor to obtain a result picture; acquire an evaluation score, obtained based on a task model of a specific task to which the image signal processor is applied, for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture; and adjust a configuration parameter of the image signal processor based on the evaluation score.
  • In another aspect, an optimization method for an image signal processor is provided, the method comprising: processing a sample picture for image signal processor optimization using a simulator of the image signal processor to obtain a result picture; acquiring an evaluation score, obtained based on a task model of a specific task to which the image signal processor is applied, for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture; and adjusting configuration parameters of the image signal processor based on the evaluation score.
  • In another aspect, a photographic device is provided, comprising an image signal processor for generating an image based on an electrical signal converted by an image sensor from light collected by the photographic device, and the optimization device as described herein, which is used for optimizing the image signal processor.
  • In another aspect, an optimization device is provided, comprising at least one processor and at least one storage device having stored thereon instructions which, when executed by the at least one processor, cause the at least one processor to perform a method as described herein.
  • In another aspect, a storage medium is provided, storing instructions which, when executed by a processor, cause the method as described herein to be performed.
  • In another aspect, a program product is provided, comprising instructions which, when executed by a processor, cause the processor to perform a method as described herein.
  • In another aspect, a computer program is provided, comprising instructions which, when executed by a computer, cause the computer to perform the method as described herein.
  • FIG. 1 shows a general conceptual diagram of an image signal processing flow.
  • FIG. 2 shows a schematic diagram of an image signal processor optimization application scenario according to an embodiment of the present disclosure.
  • FIG. 3A shows a block diagram of an optimization device for an image signal processor according to an embodiment of the present disclosure.
  • FIG. 3B shows a flowchart of an optimization method for an image signal processor according to an embodiment of the present disclosure.
  • FIG. 4A shows an application scenario analysis of image signal processor optimization according to an embodiment of the present disclosure.
  • FIG. 4B shows a schematic flowchart of image signal processor optimization according to an embodiment of the present disclosure.
  • FIG. 5A shows an exemplary process of performing ISP automatic tuning based on the KITTI data set according to an embodiment of the present disclosure.
  • FIG. 5B shows an optimization effect diagram according to an embodiment of the present disclosure, which shows the prediction results of the model on the ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
  • FIG. 5C shows pictures generated with manually tuned ISP parameters and with automatically tuned ISP parameters.
  • FIG. 6 shows the effect of unsupervised tuning, which shows the prediction results of the model on the ISP-processed picture after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
  • FIG. 7 shows the effect of semi-supervised tuning, which shows the prediction results of the model on the ISP-processed picture after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
  • FIG. 8 shows the performance of the model after tuning based on different amounts of annotated data, which shows the prediction results of the model on the ISP-processed picture after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
  • FIG. 9 shows the model performance after different ISP simulator parameters are tuned, which shows the prediction results of the model on the ISP-processed picture after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
  • FIG. 10 shows a photographing device according to an embodiment of the present disclosure.
  • FIG. 11 shows a block diagram showing an exemplary hardware configuration of a computer system capable of implementing embodiments of the present invention.
  • Figure 1 shows the conceptual arrangement of the ISP in the capture architecture.
  • The image sensor converts the light received through the lens into an electrical signal and sends the electrical signal to the image signal processor. Based on the received electrical signal, the image signal processor generates an image for presentation to a user, or the image is further processed, for example by an image processor (e.g., a GPU), and the result can be presented to the user.
  • An image signal processor (ISP) unit may perform a series of signal processing operations aimed at making an image visually pleasing and suitable for viewing by a user.
  • The processing performed by the ISP may include, but is not limited to, AEC (Automatic Exposure Control), AGC (Automatic Gain Control), AWB (Automatic White Balance), denoising, demosaicing, sharpening, color correction, gamma mapping, tone mapping, compression, and more.
  • The signal generated by the ISP unit may be in any appropriate format, as long as the signal can be further processed or is suitable for viewing by the user; for example, it may be an image in JPEG or another format.
  • ISPs can be widely used in various image applications. For example, with the development of machine learning, a large number of pictures are used for computer vision tasks. However, there are still difficulties in ISP optimization/tuning, especially for advanced computer vision tasks such as autonomous driving. Specifically, there are two main difficulties in tuning for computer vision tasks: one is that it is difficult for human experts to reach the optimum for a computer vision algorithm by tuning based on visual effects; the other is that computer vision algorithms iterate rapidly, and it is difficult for human experts to tune ISP parameters at that pace.
  • Some attempts at ISP automatic tuning have been proposed.
  • One approach is to tune the ISP automatically based on experts' understanding of which ISP image processing effects help computer vision tasks, adjusting some of the ISP parameters to enhance the image features that experts deem effective.
  • Although this approach does not require manual adjustment by experts, it still requires their domain knowledge to make judgments.
  • Another approach is to optimize the ISP in a modular fashion, in particular by simplifying the simulation of each ISP module through a mathematically similar algorithm, for example abstracting the ISP functions into several convolutional neural networks (CNNs) and training them as a whole for the downstream task.
  • The present disclosure proposes an improved ISP tuning solution.
  • The present disclosure proposes a concept of ISP tuning for tasks, especially for task execution effects.
  • ISP tuning is performed with the goal of improving how well the application task is completed.
  • The solution disclosed in the present disclosure does not require the participation of experts and can be optimized for a specific task, so that the pictures processed by the ISP have a better effect on the task. In this way, the application effect of the ISP can be optimized better than by optimizing for the visual effect of the picture.
  • An image signal processor (ISP) is automatically tuned for the performance of a specific model on a specific task using a black-box optimization algorithm, in which individual parameters are explicitly tuned while the ISP is considered as a whole, so that ISP parameters that complete the task more effectively can be obtained.
  • The technical solutions of the present disclosure can be applied to various appropriate tasks, including but not limited to computer vision tasks.
  • The computer vision tasks include at least one of image classification, object detection, object segmentation, instance segmentation, and panoptic segmentation, so that the ISP can be tuned for different tasks.
  • Computer vision tasks are generally used in various application scenarios, and ISP tuning can be performed adaptively for the computer vision tasks themselves or for the application scenarios in which they are used.
  • One such application scenario is autonomous driving.
  • Autonomous driving requires the support of a series of computer vision tasks, including lane detection, signal light recognition, sign recognition, and vehicle and pedestrian detection.
  • Lane detection needs to segment each lane based on the image. This task must accurately identify the lane lines on the road surface. It does not require the ISP to have good color accuracy, but it needs clear edge features to help curve recognition.
  • Signal light recognition needs to identify the position and color of the signal light, and mainly requires accurate color correction and prevention of overexposure from the ISP.
  • Sign recognition needs to recognize lane markings on the road or street signs on the roadside, and the ISP needs to reproduce the color of the signs stably for positioning and classification.
  • Another application scenario is unmanned retail.
  • Unmanned retail mainly requires face detection, face verification, and commodity detection functions.
  • Face detection requires the ISP to accurately reflect the skin color and shape of the face, and face verification requires the ISP to clearly display the feature points of the face.
  • Commodity detection requires the ISP to accurately reflect the texture characteristics of packaging, such as color, pattern, and text. Therefore, in the present disclosure, ISP tuning for unmanned retail applications can be performed by combining the effects of the computer vision task models for face detection, face verification, and commodity detection.
  • ISP tuning for the final effect of a certain application brings better results than ISP tuning for image visual features or for a single computer vision task.
  • FIG. 2 shows a schematic diagram of a specific application scenario to which an embodiment of the present disclosure can be applied, where an ISP tuning process according to an embodiment of the present disclosure can be performed.
  • This application scenario is related to computer vision tasks, and ISP tuning can be performed with the help of models for computer vision tasks.
  • The optical sensor processes the incident light to generate an electrical signal and sends it to the image signal processor (ISP), which generates an image, such as an 8-bit RGB picture, based on the received electrical signal.
  • The images are input to the computer vision task model, which uses them to complete the computer vision task.
  • The ISP parameters can be automatically adjusted based on the effect of the computer vision task model, and the model effect can then be improved based on the adjusted ISP parameters.
  • The ISP parameters can thus be optimized through the interaction between the ISP parameters and the computer vision task model, which in turn improves model performance.
  • FIG. 3A shows a block diagram of an optimization device for an image signal processor (ISP) according to an embodiment of the present disclosure.
  • The device 30 includes a processing circuit 302 configured to: process a sample picture for image signal processor optimization using a simulator of the image signal processor to obtain a result picture; acquire an evaluation score, obtained based on a task model of a specific task to which the image signal processor is applied, for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture; and adjust configuration parameters of the image signal processor based on the evaluation score.
  • The ISP simulator simulates a real/actual ISP: it can implement the same functions as the real ISP, and can perform the same processing as the real ISP by applying the configuration parameters of the ISP. That is to say, the parameters and functional effects of the ISP simulator are in one-to-one correspondence with the parameters and functional effects of the image signal processor. Therefore, in some embodiments, the simulator can process the input sample picture based on the configuration parameters of the ISP, so as to obtain a processing result that is basically consistent with that of the real ISP.
  • The ISP simulator can be implemented in various ways, such as through hardware, software, or firmware.
  • The ISP simulator itself can be a black box that simulates a real ISP, and its internal structure has no influence on the implementation.
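  • As a minimal sketch of the black-box view described above, the simulator can be thought of as a function from a raw input and a set of configuration parameters to an 8-bit result. The `IspConfig` fields (`gain`, `gamma`) and the `simulate_isp` function are hypothetical placeholders chosen for illustration; they are not parameters named in the source, and a real simulator would cover far more processing stages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IspConfig:
    """Hypothetical ISP configuration parameters (illustrative only)."""
    gain: float = 1.0   # linear gain applied to the raw signal
    gamma: float = 2.2  # exponent used for gamma mapping


def simulate_isp(raw: List[float], cfg: IspConfig) -> List[int]:
    """Toy black-box ISP: map normalized raw values in [0, 1] to 8-bit values."""
    out = []
    for value in raw:
        value = min(max(value * cfg.gain, 0.0), 1.0)  # apply gain, then clip
        value = value ** (1.0 / cfg.gamma)            # gamma mapping
        out.append(round(value * 255))                # quantize to 8 bits
    return out
```

  • The tuner only ever calls `simulate_isp(raw, cfg)` and observes the result, so the internals can be swapped for hardware, firmware, or a more faithful software model without changing the tuning loop.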
  • The sample picture can be any suitable image that can be processed by an image signal processor; for example, it can be selected from pre-stored sample pictures, or it can be obtained by photographing equipment within a certain period of time.
  • The sample picture can be an original image, or an image obtained by applying specific preprocessing to the original image, such as preliminary filtering, anti-aliasing, color adjustment, contrast adjustment, normalization, and so on.
  • The preprocessing operation may also include other types of preprocessing operations known in the art, which will not be described in detail here.
  • The sample pictures may have a specific labeling status, namely fully manually labeled, partially manually labeled, or not manually labeled; tuning under these labeling situations corresponds to supervised tuning, semi-supervised tuning, and unsupervised tuning, respectively. This is described in detail below.
  • The task model is a model characterizing a specific task to which the ISP is applied.
  • Different models can be used for different tasks.
  • The task model may be a computer vision task model, or any other appropriate task model, which will not be described in detail here.
  • The task model can be implemented in various appropriate ways, such as a neural network.
  • The task model may be implemented using any one of a deep neural network, a convolutional neural network, and the like.
  • The task model of the present disclosure can be trained on a model training data set to obtain a model for a specific task; its input is a sample image processed by the ISP simulator, and its output represents the execution result of the corresponding task.
  • The task model in the present disclosure can be trained on a model training data set, and in particular can be a more complex model trained on a large-scale data set. This means that the tuning scheme of the present disclosure can be realized with complex models and can cope well with complex application scenarios.
  • The task model in this disclosure is not trained on the sample pictures used for tuning. Instead, the task model has already been trained and can perform the related task during ISP tuning for model performance evaluation, without further training or modification. In this way, the performance of the ISP on specific tasks can be improved without modifying the model itself, and the model's behavior remains stable during the tuning process.
  • The evaluation score can be used to evaluate the performance of a specific task. It can correspond to the task accuracy of the corresponding task model, such as the task accuracy of a computer vision model, and can then be used to tune the ISP so as to improve the effect of completing tasks in the real environment.
  • The evaluation score may be obtained based on a task model of the specific task to which the image signal processor is applied, in particular based on the task model's processing of the result picture obtained from the sample picture.
  • The evaluation score may include and/or may consist of various appropriate forms of evaluation scores.
  • The evaluation score may include a distribution evaluation score indicating a distribution deviation of the sample pictures.
  • The distribution evaluation score indicates the distribution difference between the model's training set and the sample pictures. The smaller the distribution difference, the higher the distribution evaluation score, which means that the model can achieve the desired effect when operating on the sample picture, that is, the task execution effect is good.
  • A task model applied to a specific task is trained on a certain training set. Generally speaking, if the pictures to which the task model is applied are similar to the training set, the effect will be better, and vice versa. Therefore, when the ISP-processed sample pictures are fed to the task model for further calculation, if the deviation between the distribution of the ISP-processed pictures and the distribution of the training set is small, the model can give better results in most cases.
  • The distribution evaluation score can be calculated for various types of input samples, and is especially suitable for samples without manual annotation.
  • The distribution evaluation score may be referred to as an unsupervised evaluation value. Therefore, the distribution evaluation score is especially suitable for task evaluation using unsupervised sample pictures and for the corresponding ISP tuning.
  • The distribution evaluation score can also be used for task evaluation with supervised and semi-supervised sample pictures and for the corresponding ISP tuning.
  • The distribution evaluation score may be calculated in various suitable ways. According to an embodiment of the present disclosure, the task model has a hierarchical structure, and the distribution evaluation score may be calculated based on at least one of the statistical characteristics of a specific layer included in the task model and the statistical characteristics of the operation results of the result picture at and/or before that specific layer.
  • When the task model is a deep neural network, the distribution evaluation score can be calculated using its characteristic batch normalization (BN) layer as this specific layer.
  • The statistical characteristics of the batch normalization layer can be obtained in an appropriate manner, such as performing calculations based on the model or reading from existing data. Preferably, they can be read directly from the weights of the model, thereby avoiding the use of training data for calculation, improving efficiency, and saving computation overhead.
  • The statistical characteristics of the operation results of the result picture at and/or before the specific layer may be obtained and recorded while the task model operates on the result picture; the recorded values, called activation values, are used in the calculation of the evaluation score.
  • The distribution evaluation score may be calculated based on both the statistical characteristics of a particular layer in the task model and the statistical characteristics of the operation results of the result pictures before that layer.
  • The distribution evaluation score can be calculated by: reading the weights of the batch normalization layer of the deep neural network; calculating the activation values of the test samples before the batch normalization layer; and computing the sample distribution difference from the statistical features of the weights and the activation values.
  • The statistical characteristics of the batch normalization layer include the mean and variance of the batch normalization layer.
  • The statistical features of the batch normalization layer include the mean and variance read directly from the weights of the model, especially the weights of the BN layer of the model.
  • The operation result of the result picture at the batch normalization layer includes the mean and variance of the distribution of each channel of the operation result.
  • Directly recording all the activation values during the calculation would occupy a large amount of video memory and be difficult to implement. Therefore, preferably, the recorded first-order and second-order moments are updated with the activation values of each input sample during operation, and after the operation on all batch samples is complete, the mean and variance are calculated from these records.
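  • The running-moment bookkeeping described above can be sketched as follows. This is an illustrative single-channel version in plain Python; the class name `RunningMoments` is an assumption, and in practice one such accumulator would be kept per channel of each BN layer.

```python
from typing import Iterable, Tuple


class RunningMoments:
    """Accumulate first- and second-order moments without storing activations."""

    def __init__(self) -> None:
        self.count = 0     # number of activation values seen so far
        self.sum_x = 0.0   # running sum of x   (first-order moment * count)
        self.sum_x2 = 0.0  # running sum of x^2 (second-order moment * count)

    def update(self, activations: Iterable[float]) -> None:
        """Fold one batch of activation values into the running sums."""
        for a in activations:
            self.count += 1
            self.sum_x += a
            self.sum_x2 += a * a

    def mean_var(self) -> Tuple[float, float]:
        """Return (mean, variance) computed from the recorded moments."""
        mean = self.sum_x / self.count
        # Var[x] = E[x^2] - E[x]^2; clamp tiny negative values from rounding.
        var = max(self.sum_x2 / self.count - mean * mean, 0.0)
        return mean, var
```

  • Memory use stays constant regardless of how many batches are processed, because only three scalars are kept per channel.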
  • The distribution evaluation score can take various suitable forms.
  • The distribution evaluation score can be appropriately selected for the task to which the ISP is applied.
  • The distribution evaluation score may be calculated from the difference between the mean of the batch normalization layer and the mean of the operation result, and from the ratio of the variance of the batch normalization layer to the variance of the operation result.
  • Such a distribution evaluation score may be the KL divergence.
  • KL divergence can be used for unsupervised, semi-supervised, or fully supervised optimization, and is particularly suitable for unsupervised optimization.
  • KL divergence calculations can be performed in any suitable manner known in the art.
  • The KL divergence can be calculated as shown in the following formula:
  • i represents the i-th sample data
  • b_i is the batch size of the data
  • N is the total sample data volume
  • x_i is the input sample at the current batch normalization layer (hereinafter referred to as BN).
  • μ1 and σ1 are the mean and standard deviation of the input samples at the current BN. Since μ1 and σ1 can be calculated by summation, the video memory occupied by the calculation is equal to that required by a single x_i, which avoids taking up too much video memory as the amount of data increases.
  • μ1 and σ1 in formula 2 are the calculation results of formula 1
  • μ2 and σ2 are the mean and standard deviation of the current BN, which can be read directly from the weights of the model, thus avoiding the use of training data for calculation.
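  • The formula images referenced above are not reproduced in this text. One form consistent with the definitions given (moments accumulated by summation, and a divergence that depends on the mean difference and the variance ratio) is sketched below. The index j over the elements of batch i, the relation N = Σ b_i, and the direction of the divergence are assumptions; the second expression is the standard closed form of the KL divergence between two univariate Gaussians.

```latex
% Formula 1 (assumed form): moments accumulated batch by batch, with N = \sum_i b_i
\[
\mu_1 = \frac{1}{N}\sum_{i}\sum_{j=1}^{b_i} x_{i,j},
\qquad
\sigma_1^2 = \frac{1}{N}\sum_{i}\sum_{j=1}^{b_i} x_{i,j}^2 - \mu_1^2
\]

% Formula 2 (assumed direction): KL divergence between two Gaussians
\[
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_1,\sigma_1^2)\,\middle\|\,\mathcal{N}(\mu_2,\sigma_2^2)\right)
= \ln\frac{\sigma_2}{\sigma_1}
+ \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2}
- \frac{1}{2}
\]
```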
  • Alternatively, the distribution evaluation score may be calculated from the difference between the mean of the batch normalization layer and the mean of the operation result, and from the difference between the variance of the batch normalization layer and the variance of the operation result.
  • Such a distribution evaluation score may be an L-norm.
  • For example, the L1 norm difference is calculated according to the following formula and used as the evaluation score:
  • μ1, σ1, μ2, and σ2 are consistent with the aforementioned definitions.
  • With such distribution evaluation scores, the tuning of the ISP can be advantageously guided.
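  • The two kinds of distribution score discussed above can be sketched as follows. The KL expression is the standard closed form for two univariate Gaussians; the exact L1 expression, the summation over channels, and the negation that turns a divergence into a "higher is better" score are assumptions made for illustration, and all function names are hypothetical.

```python
import math
from typing import Callable, Iterable, Tuple

# (mu1, sigma1, mu2, sigma2) for one channel: activation stats, then BN stats.
Stats = Tuple[float, float, float, float]


def gaussian_kl(mu1: float, sigma1: float, mu2: float, sigma2: float) -> float:
    """KL divergence of N(mu1, sigma1^2) from N(mu2, sigma2^2)."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
            - 0.5)


def l1_difference(mu1: float, sigma1: float, mu2: float, sigma2: float) -> float:
    """Assumed L1 form: |mean difference| + |variance difference|."""
    return abs(mu1 - mu2) + abs(sigma1 ** 2 - sigma2 ** 2)


def distribution_score(channels: Iterable[Stats],
                       divergence: Callable[..., float] = gaussian_kl) -> float:
    """Negated total divergence, so a smaller deviation gives a higher score."""
    return -sum(divergence(*stats) for stats in channels)
```

  • For instance, `distribution_score([(0.1, 1.0, 0.0, 1.0)])` is close to zero, while activations whose mean or spread drifts far from the BN statistics yield a strongly negative score.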
  • The evaluation score may further include a model evaluation score of the model output obtained when the task model operates on the result picture.
  • The evaluation score may also be determined based on the model evaluation score.
  • A score for evaluating the task completion effect may be calculated from the model output results and used as the model evaluation score.
  • Model evaluation scores are especially suitable for evaluating task performance when the input contains labeled samples, and such model evaluation scores can be considered supervised evaluation values.
  • The model evaluation score may be calculated for the labeled sample pictures based on the annotation information they contain.
  • The model evaluation score is at least one selected from the group consisting of the F1 value, the mean Average Precision (mAP) value, the mean Average Recall (mAR) value, the Intersection over Union (IoU) value, the dice coefficient, and the Panoptic Quality (PQ) value.
  • Model evaluation scores can use different types of values for different types of tasks. As an example, the F1 value is preferred for image classification, the mAP value for object detection, the dice coefficient for object segmentation, the mAR value for instance segmentation, and the PQ value for panoptic segmentation.
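  • A few of the metrics listed above have simple closed forms, sketched below with their standard definitions. The set-of-pixel-indices representation, the function names, and the `PREFERRED_METRIC` table (which only restates the preferences given in the text) are illustrative assumptions; mAP, mAR, and PQ need full matching logic and are omitted.

```python
from typing import Dict, Set

# Preferred metric per task type, as stated in the text above.
PREFERRED_METRIC: Dict[str, str] = {
    "image classification": "F1",
    "object detection": "mAP",
    "object segmentation": "dice",
    "instance segmentation": "mAR",
    "panoptic segmentation": "PQ",
}


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def iou(pred: Set[int], target: Set[int]) -> float:
    """Intersection over Union of two pixel-index sets."""
    union = pred | target
    return len(pred & target) / len(union) if union else 1.0


def dice(pred: Set[int], target: Set[int]) -> float:
    """Dice coefficient of two pixel-index sets."""
    total = len(pred) + len(target)
    return 2 * len(pred & target) / total if total else 1.0
```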
  • The evaluation score may be obtained based on both the distribution evaluation score and the model evaluation score, so as to indicate the task execution effect more accurately.
  • Such evaluation scores are especially suitable for supervised evaluation, semi-supervised evaluation, fully supervised evaluation, etc.
  • The evaluation score may be obtained by weighting the distribution evaluation score and the model evaluation score.
  • The weights applied to the distribution evaluation score and the model evaluation score may be appropriately selected and are not particularly limited, as long as the evaluation score is calculated such that the better the task performance, the higher the evaluation score.
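  • The weighting described above can be sketched as a simple linear combination. The default weights of 0.5 and the fallback to the distribution score alone when no labels are available are assumptions for illustration; the source does not prescribe particular weights. Both inputs are assumed to follow the "higher is better" convention.

```python
from typing import Optional


def combined_score(distribution_score: float,
                   model_score: Optional[float] = None,
                   w_dist: float = 0.5,
                   w_model: float = 0.5) -> float:
    """Weighted evaluation score; higher means better task performance.

    When model_score is None (no labeled samples), only the distribution
    score is used, which corresponds to the unsupervised case.
    """
    if model_score is None:
        return distribution_score
    return w_dist * distribution_score + w_model * model_score
```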
  • an evaluation score for the task effect of a specific task to which the ISP is applied or expected to be applied can be generated.
  • the configuration parameters of the ISP can be optimized based on the generated evaluation scores, and the ISP configuration parameters are configuration parameters used by the ISP simulator to process the sample pictures during the process of generating the evaluation scores. It also corresponds to the corresponding configuration parameters of the real ISP.
  • the optimization of configuration parameters takes into account the execution effect/completion status of the task, and optimizes with the goal of making the execution effect of the task better.
  • the configuration parameters may be adjusted so that the task effect obtained when the specific task is completed based on the adjusted configuration parameters is better.
  • the evaluation score is substantially in one-to-one correspondence with the configuration parameters of the ISP.
  • evaluation score generation may be performed at least once. In each generation operation, initial ISP configuration parameters are set, the ISP simulator processes the sample pictures based on those parameters, and the evaluation score is then generated based on the task model. At least one evaluation score is thereby obtained, each corresponding to one set of configuration parameters, and the resulting set of evaluation scores can be used for subsequent parameter adjustment.
  • an optimization operation may be performed such that the adjusted configuration parameters are closer to the configuration parameters that result in a better evaluation score.
  • the processing circuit is further configured to: acquire multiple sets of evaluation scores corresponding to multiple sets of configuration parameters, the multiple sets of evaluation scores being calculated by the task model for sample pictures processed based on the multiple sets of configuration parameters; and adjust the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets of evaluation scores, and farther from the configuration parameters corresponding to the worse evaluation scores.
  • the above-mentioned process from processing the sample image with the ISP simulator to obtaining the corresponding evaluation score can be repeated for a specific number of times to obtain multiple sets of configuration parameters and corresponding evaluation scores.
  • the number of repetitions can be set arbitrarily; preferably, the process is repeated 12 times. It should be pointed out that the sample pictures and the task model can remain unchanged across the repeated operations, while each operation sets its own initial ISP configuration parameters to process the pictures and thereby generate the corresponding evaluation score.
  • the configuration parameters can be adjusted, so that the adjusted configuration parameters can be close to the configuration parameters that generate high evaluation scores in the aforementioned multiple groups of configuration parameters, and at the same time keep away from the configuration parameters that generate low evaluation scores as much as possible.
  • the processing circuit is further configured to iteratively perform adjustment of configuration parameters of the image signal processor. That is to say, the above-mentioned process from processing the sample picture by using the ISP simulator to adjusting the configuration parameters can be iteratively executed. Each process in the iteration can be performed as described above, in particular, the aforementioned process of generating multiple sets of evaluation scores can be performed for adjustment. In some embodiments, the iterative process of configuration parameter adjustment may be performed in any suitable manner. In some embodiments, iterations may be terminated based on certain conditions.
  • the iteration termination condition includes at least one of the following: the iteration stops when the number of iterations reaches a preset threshold number; the iteration stops when the evaluation score corresponding to one iteration is no longer better than the evaluation score corresponding to the previous iteration; and the iteration stops when the evaluation scores corresponding to a specific number of iterations are no longer better than the evaluation scores corresponding to the previous specific number of iterations.
  • the preset threshold number may be any appropriate value and may be set by the operator, for example, according to experience, according to the workload requirements of the related equipment, or according to the results of previous parameter adjustment operations; for example, it can be set to an empirical value of the iteration count of previous parameter adjustment operations.
  • the predetermined number of times threshold may be 500 times.
  • iterations may be stopped if the evaluation score has not improved over the last iteration for the first threshold number of consecutive iterations.
  • the evaluation scores for the first threshold number of consecutive iterations refer to the evaluation scores obtained by performing configuration parameter adjustment the first threshold number of times in succession and evaluating with the adjusted configuration parameters each time. Each adjustment can be performed as described above and is not described in detail here.
  • the first threshold is 50 times.
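  • as a hedged illustration, the termination conditions above can be sketched as a small check. This is only one possible reading: the function name `should_stop` and its parameters are hypothetical, and "no improvement" is interpreted here as the best score of the most recent `patience` iterations not exceeding the best score seen before them.

```python
def should_stop(scores, max_iters=500, patience=50):
    """Decide whether iterative tuning should stop.

    scores    -- evaluation score of each completed iteration, in order
                 (higher is better)
    max_iters -- preset threshold on the iteration count (500 in the text)
    patience  -- "first threshold": consecutive iterations allowed without
                 improvement (50 in the text)
    """
    # Condition 1: the iteration count has reached the preset threshold.
    if len(scores) >= max_iters:
        return True
    # Not enough history yet to judge a lack of improvement.
    if len(scores) <= patience:
        return False
    # Condition 2 (assumed reading): the last `patience` iterations did not
    # beat the best score obtained before them.
    best_before = max(scores[:-patience])
    best_recent = max(scores[-patience:])
    return best_recent <= best_before


if __name__ == "__main__":
    history = [0.40, 0.52, 0.51, 0.50]
    print(should_stop(history, max_iters=10, patience=2))
```

  • setting `patience=1` reduces the second condition to comparing one iteration with the best of the earlier ones, which is close to the "no better than the previous iteration" condition described above.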
  • configuration parameters may be initially set by an operator and adjusted as described above.
  • configuration parameters may be set and adjusted by suitable means.
  • configuration parameters of an image signal processor can be generated and adjusted using an optimizer.
  • the optimizer is a black box optimizer.
  • the optimizer is a CMA-ES optimizer.
  • the configuration parameters of the image signal processor may be obtained by processing the values generated by the optimizer to meet the parameter requirements of the image signal processor.
  • An example of a method of generating configuration parameters of an image signal processor based on an optimizer is described below. First, a group of random numbers, equal in count to the number of internal parameters of the optimizer, is selected as the initial value of the optimizer. Then, the optimizer is invoked to generate a set of values equal in count to the number of ISP parameters. These values have a one-to-one correspondence with the parameters of the ISP. The generated values are then processed according to the range and value type of the actual ISP parameters, so that they meet the requirements of the ISP parameters.
  • ISP parameters can also be generated according to parameter types. For example, if the parameter type is discrete, the optimizer can directly generate and process discrete data. As another example, if the optimizer produces continuous values, the continuous values are converted to discrete by rounding. In this way, corresponding configuration parameters can be generated by the optimizer.
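  • the value processing described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `ParamSpec` type, the parameter names `denoise_strength` and `sharpen_gain`, and the use of clipping to enforce the range are all hypothetical and are not taken from the source; only the rounding of continuous values to discrete ones follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamSpec:
    """Hypothetical description of one ISP parameter."""
    name: str
    low: float
    high: float
    discrete: bool


def to_isp_params(values, specs):
    """Map raw optimizer values (one per ISP parameter) to valid settings."""
    if len(values) != len(specs):
        raise ValueError("one optimizer value is required per ISP parameter")
    params = {}
    for value, spec in zip(values, specs):
        # Assumption: out-of-range values are clipped into [low, high].
        clipped = min(max(value, spec.low), spec.high)
        # Continuous values are converted to discrete ones by rounding.
        params[spec.name] = int(round(clipped)) if spec.discrete else clipped
    return params


# Hypothetical two-parameter ISP used only for illustration.
SPECS = [
    ParamSpec("denoise_strength", 0, 15, discrete=True),
    ParamSpec("sharpen_gain", 0.0, 4.0, discrete=False),
]

if __name__ == "__main__":
    print(to_isp_params([7.6, 5.2], SPECS))
```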
  • the processing circuit is further configured to: update the state of the optimizer with the evaluation score; and adjust the configuration parameter based on the values produced by the updated optimizer.
  • the evaluation score corresponds to the configuration parameters, and in turn to the values of the optimizer that generated the configuration parameters; in operation, the state of the optimizer can therefore be adjusted based on the evaluation score, and the configuration parameters can then be adjusted.
  • the optimizer is updated such that the values produced by the updated optimizer are closer to the values corresponding to the better evaluation scores.
  • the optimization of the optimizer may adopt the aforementioned configuration parameter optimization manner.
  • multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer can be obtained, and the state of the optimizer is updated so that, after the update, the values generated by the optimizer are closer to the values corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the values corresponding to the worse evaluation scores.
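  • the generate-evaluate-update cycle above can be illustrated with a deliberately simplified evolution strategy. This sketch is NOT CMA-ES: it keeps only a mean and a fixed-decay step size, whereas CMA-ES also adapts a covariance matrix and the step size from the data. The class `SimpleES`, its methods and the function `toy_score` are hypothetical stand-ins; in the described framework the score would come from the ISP simulator and the task model.

```python
import random


class SimpleES:
    """Toy Gaussian evolution strategy; a stand-in, not real CMA-ES."""

    def __init__(self, mean, sigma=1.0, popsize=12, seed=0):
        self.mean = list(mean)
        self.sigma = sigma
        self.popsize = popsize          # 12 sets of values per generation
        self.rng = random.Random(seed)

    def ask(self):
        """Sample `popsize` candidate value vectors around the current mean."""
        return [[m + self.sigma * self.rng.gauss(0.0, 1.0) for m in self.mean]
                for _ in range(self.popsize)]

    def tell(self, candidates, scores):
        """Move the mean toward the better-scoring half (higher is better)."""
        ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
        elite = [cand for _, cand in ranked[:max(1, len(ranked) // 2)]]
        self.mean = [sum(col) / len(elite) for col in zip(*elite)]
        self.sigma *= 0.9  # fixed shrink, an assumption made for brevity


def toy_score(values):
    """Hypothetical evaluation score whose optimum is at (3, -1)."""
    return -((values[0] - 3.0) ** 2 + (values[1] + 1.0) ** 2)


es = SimpleES(mean=[0.0, 0.0], sigma=1.0, popsize=12)
for _ in range(40):
    candidates = es.ask()                       # multiple sets of values
    scores = [toy_score(c) for c in candidates]  # one score per set
    es.tell(candidates, scores)                 # update optimizer state
print(es.mean)
```

  • the update only pulls the sampling mean toward high-scoring values; moving away from low-scoring values happens implicitly because those values are excluded from the elite set.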
  • the ISP is tuned based on the evaluation score, and the evaluation score can be appropriately determined, and in particular can be selected and determined in consideration of the labeling status of the samples. For example, an appropriate evaluation score may be determined considering whether a sample is unlabeled, partially labeled, or fully labeled.
  • ISP tuning can be performed relatively efficiently when the input sample data carries annotations. Therefore, the present disclosure further proposes that the input sample pictures can be labeled, so as to realize more efficient and improved ISP tuning.
  • samples can be randomly labeled.
  • samples may be labeled according to certain criteria.
  • the sample labeling is performed based on at least one of labeling importance, priority and the like of the sample.
  • the processing circuit is further configured to: label a predetermined number of sample pictures based on the labeling importance of the sample pictures.
  • the samples may be sorted according to the importance of labeling of the sample pictures, and the first predetermined number of sample pictures may be labeled for training.
  • the predetermined number can be appropriately set. For example, it may be set by an operator based on experience, or may be appropriately set in consideration of tuning effect, efficiency, cost, and the like.
  • various appropriate methods may be used to determine the annotation importance of the sample picture.
  • importance can characterize how representative a sample is within the test set. Representative samples, such as samples that are highly concentrated, samples that differ strongly from other samples, or samples that are otherwise highly representative, can be assigned a high importance, priority, etc.
  • concentration degree of the samples may be considered to set the labeling importance of the samples, for example, the more concentrated the samples, the higher the labeling importance of the samples.
  • the approximate likelihood of the sample can be considered, for example, the smaller the approximate likelihood, the higher the labeling importance.
  • the processing circuit is further configured to: calculate the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to the sample, adjacent samples being those whose image-feature distance to the sample is less than a certain threshold; compute an approximate likelihood for each sample, where the approximate likelihood is computed using the sample's image features and the mean and variance of the corresponding batch normalization layer; and compute the ratio of the centrality to the approximate likelihood as the labeling importance of the sample.
  • the labeling importance of a sample can be calculated according to the following formulas.
  • $K(x) = \{\, x' : d\big(g(f(x)),\, g(f(x'))\big) < D \,\}$
  • $L(x) = N\big(g(f(x));\ \mu_2,\ \sigma_2^2\big)$
  • $R(x) = |K(x)| \,/\, L(x)$
  • K(x) represents the set of samples adjacent to sample x; f represents the function of the current ISP; g(f(x)) is the sample image feature, obtained from the output of the task model (for example, the output of the last convolution layer) with global average pooling applied; d is the weighted L2 norm, in which the weight of each dimension is the reciprocal of the standard deviation of that dimension; and D is the distance threshold.
  • the parameters used by f in this embodiment are the same as the parameters being tuned for the ISP.
  • L(x) represents the approximate likelihood of sample x; $\mu_2$ and $\sigma_2^2$ are the mean and variance recorded at the output of the task model (for example, at the last BN layer of the model backbone network); and $N(\cdot;\ \mu_2, \sigma_2^2)$ is a multidimensional normal distribution with independent dimensions.
  • R(x) is the labeling importance of sample x, and |K(x)| is the number of adjacent samples of x, that is, the centrality.
  • the samples are sorted according to labeling importance and then labeled in order from high to low; labeling data according to labeling importance yields a better tuning effect than random labeling. Specifically, the samples are sorted in descending order of R(x). For a sample x, if a higher-ranked sample is adjacent to it, x is moved to the end of the sequence. The sorted list finally obtained in this way is used as the ranking of sample labeling importance, and the first predetermined number of samples in the sorted list are labeled. In this way, the top-ranked samples, that is, the important samples, are labeled preferentially, so that better results can be achieved with the same labeling amount, and the image signal processor can be better tuned for the computer vision neural network.
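  • the ranking procedure above can be sketched in a few lines. This is a sketch under assumptions, not the source's implementation: the function names are hypothetical, a sample is not counted as its own neighbor, and the importance is handled in log form (log of centrality minus log-likelihood) to avoid numerical underflow, which preserves the ordering by centrality divided by likelihood.

```python
import math


def weighted_l2(a, b, std):
    """L2 distance with each dimension weighted by 1 / std of that dimension."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, std)))


def log_likelihood(feat, mean, var):
    """Log density of a normal distribution with independent dimensions."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(feat, mean, var))


def labeling_order(feats, std, bn_mean, bn_var, dist_thresh):
    """Return (sample indices in labeling order, log-importance per sample)."""
    n = len(feats)
    # Adjacent samples: feature distance below the threshold (self excluded).
    neighbors = [{j for j in range(n) if j != i and
                  weighted_l2(feats[i], feats[j], std) < dist_thresh}
                 for i in range(n)]
    # log(centrality / likelihood); -inf when a sample has no neighbors.
    log_imp = [math.log(len(neighbors[i])) - log_likelihood(feats[i], bn_mean, bn_var)
               if neighbors[i] else float("-inf") for i in range(n)]
    ranked = sorted(range(n), key=lambda i: log_imp[i], reverse=True)
    kept, deferred, seen = [], [], set()
    for i in ranked:
        # A sample with a higher-ranked adjacent sample goes to the end.
        (deferred if neighbors[i] & seen else kept).append(i)
        seen.add(i)
    return kept + deferred, log_imp


# Hypothetical 2-D features: three clustered samples and one outlier.
FEATS = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
ORDER, LOG_IMP = labeling_order(FEATS, std=(1.0, 1.0), bn_mean=(0.0, 0.0),
                                bn_var=(1.0, 1.0), dist_thresh=0.5)
print(ORDER)
```

  • labeling the first predetermined number of entries of `ORDER` then picks one representative per cluster before spending labels on near-duplicates of samples already chosen.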
  • the evaluation of the present invention considers how well the ISP result fits the task; that is, the evaluation score considers whether the ISP output is better suited to executing the task, not just the quality of the picture itself, so that ISP tuning can be performed with the aim of optimizing task execution and is not limited by human recognition ability.
  • the ISP tuning implemented in this way can improve the execution effect of the task to which the ISP is applied.
  • the present disclosure can improve the accuracy of the computer vision task.
  • the image signal processor does not need to be manually tuned, but can be performed automatically with the help of appropriate equipment, thus reducing the labor consumption of manual tuning of the image signal processor.
  • existing data and models can be better utilized to save work overhead.
  • an appropriate evaluation score can be selected, in particular, an appropriate evaluation score can be selected and calculated according to the labeling of the data.
  • the model is pre-trained and remains unchanged during the tuning process, so that image quality changes brought by the image sensor do not alter the existing machine learning model, and the stability of tuning can be maintained.
  • appropriate labeling may be performed on the sample data to facilitate image sensor tuning.
  • the processing circuit 302 may be in the form of a general processor, or may be a special processing circuit, such as an ASIC.
  • the processing circuit 302 can be configured by an electric circuit (hardware) or a central processing device such as a central processing unit (CPU).
  • a program (software) for operating the circuit (hardware) or central processing device may be carried on the processing circuit 302 .
  • the program can be stored in a memory, for example a memory arranged in the device or an external storage medium connected from the outside, or can be downloaded via a network such as the Internet.
  • the processing circuit 302 may include various units for realizing the above-mentioned functions, such as: a picture obtaining unit 304, which uses an image signal processor simulator to process a sample picture for image signal processor optimization to obtain a result picture; an evaluation score obtaining unit 306, which acquires an evaluation score that is obtained based on the task model of the specific task to which the image signal processor is applied and is used to evaluate the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture; and a parameter adjustment unit 308, which adjusts configuration parameters of the image signal processor based on the evaluation score.
  • the above units can be implemented in various suitable ways.
  • the picture obtaining unit 304 can be realized by an ISP simulator, and the evaluation score obtaining unit 306 can include a calculation unit 3061 for receiving the processing result of the task model to calculate the evaluation score.
  • the calculation unit can also be outside the evaluation score obtaining unit 306, outside the processing circuit 302, or even outside the optimization device 30.
  • the parameter adjustment unit 308 may further include an optimization unit 3081, which uses the evaluation score to update the state of the optimizer; and adjusts the configuration parameters based on the values generated by the updated optimizer.
  • the optimization unit can be implemented in any suitable way, for example, it can be implemented as an optimizer whose input is an evaluation score and can update its own state according to the evaluation score, and the output is a value generated based on the updated state.
  • the processing circuit 302 may further include a labeling unit 310 configured to label the sample pictures, especially according to the importance/priority of the sample pictures.
  • the labeling unit 310 may further include an importance calculation unit 3101, which may calculate the labeling importance of the samples by calculating the concentration and approximate likelihood of the samples.
  • Such an importance calculation unit 3101 may not be included in the labeling unit 310 , and it may be outside the labeling unit 310 , outside the processing circuit 302 , or even outside the optimization device 30 .
  • the calculation unit 3061, the optimization unit 3081, the labeling unit 310, and the importance calculation unit 3101 are drawn with dotted lines to illustrate that these units are not necessarily included in the processing circuit, or do not exist. It should be noted that although the various units are shown as separate units in FIG. 3A , one or more of these units may be combined into one unit, or split into multiple units.
  • each of the above units may be implemented as an independent physical entity, or may also be implemented by a single entity (for example, a processor (CPU or DSP, etc.), an integrated circuit, etc.).
  • the above-mentioned units are shown with dotted lines in the drawings to indicate that these units may not actually exist, and the operations/functions realized by them may be realized by the processing circuit itself.
  • it should be noted that FIG. 3A is only a schematic structural configuration of an optimization device for an image signal processor, and that the optimization device 30 may also include other possible components, such as a memory, a network interface, a controller, etc., which are not shown for clarity.
  • a processing circuit may be associated with a memory.
  • the processing circuit may be directly or indirectly (eg, other components may be connected therebetween) connected to the memory for accessing data related to image processing.
  • the memory may store various data and/or information generated by the processing circuitry 302 .
  • the memory can also be located within the optimization device but outside the processing circuitry, or even outside the optimization device.
  • the memory can be volatile memory and/or non-volatile memory.
  • memory may include, but is not limited to, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), read only memory (ROM), flash memory.
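  • in the picture obtaining step, the image signal processor simulator is used to process the sample picture for image signal processor optimization to obtain a result picture.
  • in the evaluation score obtaining step S303, an evaluation score obtained based on the task model of the specific task to which the image signal processor is applied is acquired.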
  • the parameter adjustment step S305 may further include an optimization step, which uses the evaluation score to update the state of the optimizer; and adjusts the configuration parameters based on the values generated by the updated optimizer.
  • the optimization step can be implemented in any appropriate way, for example, it can be executed by an optimizer, the optimizer can input the evaluation score and can update its own state according to the evaluation score, and the output is a value generated based on the updated state.
  • the method 300 may further include a labeling step, which is performed before the image acquisition step, and which is configured to label the sample pictures, especially according to the importance/priority of the sample pictures.
  • the method 300 may further include an importance calculation step, which may calculate the labeling importance of the samples by calculating the concentration and approximate likelihood of the samples. Such an importance calculation step can be included in the labeling step or outside the labeling step. It should be noted that the above optimization step and labeling step may not be included in the tuning method of the present disclosure.
  • FIG. 4A shows a conceptual block diagram of ISP parameter tuning according to an embodiment of the present disclosure, which also shows an information interaction flow during the ISP parameter tuning process
  • FIG. 4B shows an ISP according to an embodiment of the present disclosure.
  • parties involved in performing ISP parameter tuning may include an automatic tuning framework, tuning samples input into the automatic tuning framework, and a computer vision model interacting with the automatic tuning framework.
  • the automatic tuning framework may include an ISP simulator, a black-box optimizer, and an evaluation indicator calculation component, and may also include a not-shown computer capable of running required codes and models.
  • the automatic tuning framework may correspond to an exemplary implementation of an optimization device according to an embodiment of the present disclosure.
  • the optimization device of the present application may contain more or fewer components than the automatic tuning framework.
  • the optimization device of the present disclosure may not include the evaluation index calculation means, and the evaluation score may be calculated outside the optimization device and input to the optimization device.
  • the automatic tuning framework receives tuning samples, then calls the ISP simulator for processing, and sends the processed pictures to the computer vision model.
  • tuning samples may be streamed in, and processed images may be streamed out to the computer vision model. Since the tuning data set contains multiple samples, data stream input and output play an important role in improving the operating efficiency of the framework and saving system resources.
  • tuning samples may be obtained in any suitable manner, eg from any suitable training set.
  • tuning samples may contain any suitable images.
  • the tuning samples obtained here may be original tuning samples, which may initially contain their own labeling information, and of course, labels may be actively added through labeling operations according to the present application, so as to further improve the tuning effect.
  • the ISP simulator can process the tuning samples similar to the ISP to obtain the ISP-processed picture as the input of the computer vision model.
  • the ISP simulator can be implemented in various appropriate ways, such as software modules.
  • the ISP simulator takes an original sample picture as input (as an example, a 24-bit Bayer picture) and provides an interface for adjusting parameters; given the input picture, it outputs the picture processed by the ISP.
  • the ISP emulator used in the present invention corresponds to the hardware ISP of the Sony FUJI sensor, and its basic functions include demosaicing, white balance, noise reduction, sharpening, tone mapping, and bit length compression.
  • the ISP parameters may be set to appropriate initial values, or the ISP parameters may be appropriately set based on an optimizer, as described above.
  • a computer vision model is a vision model applied to a specific task, can be in any suitable form, and used in this disclosure is a convolutional neural network (CNN).
  • a CNN is a neural network that can accomplish a specific task after training. Different models can be used for different tasks; optionally, the present invention uses YOLOv3 for object detection, Mask R-CNN for object segmentation and instance segmentation, and Deeplab-v3 for panoptic segmentation.
  • YOLOv3 is trained using KITTI and COCO datasets
  • Mask R-CNN is trained using COCO datasets
  • Deeplab-v3 is trained using COCO datasets.
  • the computer vision model takes the picture processed by the ISP as input (as an example, a 3-channel 8-bit sRGB picture) and outputs the result of the corresponding task, which may include the picture recognition result.
  • the model will record the mean and variance of the distribution of each channel of the model's calculation results (called activation values) at/before the batch normalization layer (Batch normalization, BN for short).
  • the model outputs and activations thus obtained can be fed into the auto-tuning framework for evaluation metrics calculation.
  • the samples can be processed in batches.
  • multiple sample pictures are divided into multiple batches. For each batch of sample pictures, the simulator processes the batch, and the resulting pictures are provided to the task model so that the task model can process that batch of result pictures; at the same time, the simulator processes the next batch of sample pictures.
  • in this way, picture processing by the simulator runs in parallel with the task model's processing of pictures to generate the evaluation score, which can further improve efficiency.
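  • the batch overlap described above can be sketched with a single background worker. This is a minimal sketch: `run_pipeline`, `simulate` and `run_model` are hypothetical names, the two callables stand in for the ISP simulator and the task model, and real overlap with Python threads is only obtained when those callables release the interpreter lock (for example, native image processing or GPU inference).

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_pipeline(samples, batch_size, simulate, run_model):
    """Overlap ISP simulation of batch k+1 with model inference on batch k."""
    outputs = []
    batches = list(batched(samples, batch_size))
    if not batches:
        return outputs
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(simulate, batches[0])
        for next_batch in batches[1:]:
            pictures = pending.result()                 # wait for batch k
            pending = pool.submit(simulate, next_batch)  # start batch k+1
            outputs.extend(run_model(pictures))          # model runs meanwhile
        outputs.extend(run_model(pending.result()))      # last batch
    return outputs


if __name__ == "__main__":
    # Toy stand-ins: "simulation" doubles a value, the "model" adds one.
    result = run_pipeline([1, 2, 3, 4, 5], 2,
                          simulate=lambda batch: [2 * x for x in batch],
                          run_model=lambda pics: [p + 1 for p in pics])
    print(result)
```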
  • the auto-tuning framework can receive values input from the aforementioned models and calculate evaluation metrics. Indicator calculations can be performed as previously described.
  • the difference in data distribution can be evaluated, specifically, distribution evaluation scores, such as KL divergence or L1 norm, can be obtained based on the mean and variance of the computer vision model BN.
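  • as a hedged illustration of a distribution evaluation score, the sketch below compares per-channel means and variances using the closed-form KL divergence between normal distributions with independent dimensions, and an L1 distance over the same statistics. The function names, the direction of the KL divergence (statistics observed on the tuning pictures relative to the recorded BN statistics) and the exact form of the L1 score are assumptions; the form actually used may differ.

```python
import math


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """KL(P || Q) for normal distributions with independent dimensions.

    P: statistics observed on the ISP-processed tuning pictures (assumed).
    Q: mean and variance recorded by the BN layer of the model (assumed).
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def l1_distance(mean_p, var_p, mean_q, var_q):
    """Sum of absolute differences of the means and of the variances."""
    return (sum(abs(a - b) for a, b in zip(mean_p, mean_q))
            + sum(abs(a - b) for a, b in zip(var_p, var_q)))


if __name__ == "__main__":
    observed_mean, observed_var = [0.2, -0.1], [1.5, 0.8]
    recorded_mean, recorded_var = [0.0, 0.0], [1.0, 1.0]
    print(kl_diag_gaussians(observed_mean, observed_var, recorded_mean, recorded_var))
    print(l1_distance(observed_mean, observed_var, recorded_mean, recorded_var))
```

  • both quantities are zero when the observed statistics match the recorded ones and grow as the distributions drift apart, which is why they enter the evaluation score with a non-positive weight.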
  • the evaluation index can also be obtained based on the annotation information contained in the sample picture.
  • the task performance can also be evaluated based on the model output.
  • the model evaluation score can be appropriately selected and determined as described above. Also preferably, the sum of the model evaluation score and the distribution evaluation score according to a certain weight can be used as the evaluation score of the numerical value generated by the current optimizer.
  • the calculated evaluation score can be fed back as an evaluation index to the black-box optimizer in the ISP automatic tuning framework, so that the internal state of the black-box optimizer is adaptively updated. Based on the updated state, the optimizer generates updated values, from which updated ISP parameters are obtained, thereby realizing the tuning of the ISP parameters.
  • the black-box optimizer may be a CMA-ES optimizer, and its optimization goal is to improve the performance of the model while reducing the difference between the distribution of the sample pictures processed by the ISP and the distribution of the possible model training set.
  • the operations from calling the ISP simulator to process the tuning samples through calculating the evaluation index can be repeated a certain number of times, in order to obtain multiple sets of values generated by the current optimizer and their corresponding evaluation scores, which are input into the optimizer so as to update the state of the optimizer.
  • the optimizer compares the evaluation scores and updates the internal state so that the newly generated values are more likely to be close to values with high evaluation scores and farther away from values with low evaluation scores.
  • the internal state of the optimizer can be tuned toward a state that produces values with high evaluation scores, for example, toward the optimizer internal state corresponding to the highest evaluation score in the sequence of evaluation scores, or to within a certain range of that state.
  • ISP tuning may be performed iteratively as previously described.
  • FIG. 5A shows an exemplary flow chart of ISP automatic tuning according to an embodiment of the present disclosure.
  • an existing public data set is used to reflect the effect improvement of ISP automatic tuning.
  • the public dataset used is KITTI, which is a commonly used dataset in the field of autonomous driving.
  • KITTI dataset for object recognition.
  • the KITTI data set is divided into a training set (about 80%), used for training the YOLOv3 object detection model, which corresponds to the task model of the present disclosure, and the remaining 20% of the pictures, which are used to generate the original samples before ISP processing. Hereinafter, "sample" refers to the original sample before ISP processing, and "picture" refers to the output of the sample after ISP processing.
  • 256 samples are used to tune the ISP parameters, and the remaining samples are used to test the detection effect of the model on the pictures processed by ISP.
  • ISP f(·) in FIG. 5A is the function representing the ISP simulator used in this embodiment. This ISP simulator comprises a denoiser based on bilateral filtering and Gaussian filtering, an edge enhancer based on high-pass filtering, and a tone mapper based on the Durand tone mapping algorithm. It emulates several important functions of the Sony Fuji Family ISP. In order to simulate the discrete characteristics of the parameters in the hardware ISP, the parameters used in the ISP simulator are also discrete. In this embodiment, the CMA-ES optimizer is used as the automatic optimizer; it is set to generate 12 sets of parameters each time, and its internal state is updated based on the evaluation scores of the pictures simulated with those 12 sets of parameters.
  • the evaluation score consists of three indicators, namely the mAP@0.5 value (hereinafter referred to as mAP), the mAR@det10 value (hereinafter referred to as mAR), and the KL divergence.
  • evaluation score = mAP + 0.1 × mAR - 0.1 × KL divergence.
  • mAP and mAR take positive weights, while the KL divergence takes a non-positive weight.
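  • the weighted combination of this embodiment can be written out as a short worked example. The function name `evaluation_score` and the input numbers are hypothetical; the default weights are the ones given above, and the check on the signs of the weights reflects the stated requirement.

```python
def evaluation_score(m_ap, m_ar, kl, w_ap=1.0, w_ar=0.1, w_kl=-0.1):
    """Weighted sum of mAP, mAR and KL divergence (higher is better)."""
    # mAP and mAR must carry positive weights, the KL divergence a
    # non-positive weight, so that better task performance raises the score.
    if w_ap <= 0 or w_ar <= 0 or w_kl > 0:
        raise ValueError("mAP/mAR weights must be > 0 and the KL weight <= 0")
    return w_ap * m_ap + w_ar * m_ar + w_kl * kl


if __name__ == "__main__":
    # Hypothetical numbers: 0.6 + 0.1 * 0.5 - 0.1 * 2.0
    print(evaluation_score(m_ap=0.6, m_ar=0.5, kl=2.0))
```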
  • mAP can be calculated in any suitable manner known in the art.
  • the calculation method of the mAP value is as follows: 1. For a certain category, first set the detection confidence threshold, and eliminate model predictions below the threshold; 2. For each remaining prediction, calculate the area of the intersection and the area of the union of the predicted region and the annotated region; if the intersection area is greater than 0.5 times the union area, the prediction is regarded as a correct detection, otherwise as an error; 3. Based on the numbers of correct and erroneous detections in 2, calculate the corresponding precision value and recall value; 4. By adjusting the confidence threshold in 1, obtain a curve of the precision value against the recall value; 5. Calculate the area under the curve as the AP value of the category; 6. Average the AP values of all categories to obtain the mAP value.
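  • the steps above can be sketched for a single category and a single image as follows. This is a simplified sketch with hypothetical function names: boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, the confidence threshold is swept by visiting detections in descending confidence, and the area under the curve is taken with a plain rectangle rule, whereas common benchmarks use interpolated precision.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one category; detections is a list of (confidence, box)."""
    if not gt_boxes:
        return 0.0
    matched, tp, fp = set(), 0, 0
    area, prev_recall = 0.0, 0.0
    # Lowering the confidence threshold admits detections one at a time.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(gt_boxes):
            if idx in matched:
                continue  # each annotated box can be matched only once
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_idx = value, idx
        if best_idx is not None and best_iou > iou_thresh:
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1
        recall = tp / len(gt_boxes)
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision  # rectangle rule
        prev_recall = recall
    return area


def mean_average_precision(per_category):
    """Average AP over categories; per_category maps name -> (dets, gts)."""
    aps = [average_precision(d, g) for d, g in per_category.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

  • for example, with two annotated boxes and three detections of which the first and third are correct, the precision is 1 at recall 0.5 and 2/3 at recall 1, giving an AP of 0.5 × 1 + 0.5 × 2/3.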
  • mAR can be calculated by any suitable means known in the art. As an example, mAR is calculated similarly to mAP, but instead of calculating the area under the curve, the average recall is calculated.
  • the KL divergence can be calculated in various suitable ways, such as those described above.
  • Fig. 5B shows the prediction results of the model after the manual adjustment of the ISP parameters and the automatic adjustment of the ISP parameters on the ISP processed pictures.
  • Figure 5C shows some comparative samples. It can be seen that since manual tuning only considers human visual experience, the effect of the processed picture is significantly different from that of the automatically tuned picture.
  • the same data set division and the same model as in the previous embodiment can be used, but the 256 samples used for tuning do not use their corresponding labels, so as to simulate the situation that the data is not manually labeled.
  • the ISP simulator and optimizer used are the same as those in the previous embodiment, and the evaluation score can be determined by KL divergence or L1 norm respectively, but mAP and mAR are not used in the evaluation score.
  • the method of unsupervised tuning based on the L1 norm is also compared, and the L1 norm can be calculated as described above.
  • ISP auto-tuning according to still other embodiments of the present disclosure, wherein the effect of ISP auto-tuning can suggest the module design of the ISP.
  • the same data set division and the same model as in the previous embodiments are used, but several different ISP module designs are considered at the same time, and the model performance of the different designs after tuning is compared. Since automatic ISP tuning can efficiently find the best parameter configuration for a given module design, it can provide an experimental reference for evaluating different ISP module designs. In this embodiment, four different ISP simulators were tested, corresponding to different ISP module designs.
  • ISP1 is Gaussian filter and gamma transform
  • ISP2 is non-local mean filtering, high-pass filtering, Durand tone mapping
  • ISP3 is bilateral filtering, high-pass filtering, contrast compression, global tone mapping
  • ISP4 is the same as the ISP simulator in 3.1.
  • Different function effects can correspond to different ISP modules.
  • the schematic diagram of the effect is shown in Figure 9, which includes two different performance evaluation values (left: mAP, right: mAR) and two image sizes (416 × 416 pixels, 640 × 640 pixels). The surrounding shaded area is the standard deviation. Since the model performance that can be achieved after tuning ISPs of different module designs also differs, the experimental data can be used as a reference for designing the ISP.
  • the optimization device of the present disclosure may be integrated in any device including an image signal processor (ISP), such as a photographic device or other image acquisition/processing device, for example integrated in the form of an integrated circuit or a processor , even integrated into the existing processing circuit of the device; or it can also be detachably connected to the device as a separate component, for example, it can be used as a separate module, or together with other components that can be detachably mounted on the device. In some embodiments, it may even be provided on a remote device with which the device can communicate.
  • the solution of the present disclosure can be realized by a software algorithm, so that it can be easily integrated in various types of equipment including an image signal processor (ISP), such as a video camera, a camera such as a SLR camera, a mirrorless camera etc., as well as portable photography equipment, and other image acquisition/processing equipment.
  • the method of the present disclosure may be implemented as a computer program, instruction, etc. by a processor of a photographic device, so as to perform ISP tuning.
  • a photographic device including: an image signal processor for generating an image based on an electrical signal converted by an image sensor from light collected by the photographic device, and an optimization device for optimizing the image signal processor.
  • the optimization device may be implemented in various appropriate ways, especially the optimization device for an image signal processor according to the present disclosure as described above.
  • the photographing device may further include a lens unit, a photographic filter, etc., which may process the collected light.
  • the image acquisition device may also include other components as long as the image to be processed can be obtained.
  • FIG. 10 shows a photography device according to an embodiment of the present disclosure, wherein the photography device 1000 includes an image signal processor 1002 and an optimization device 1004 .
  • FIG. 11 is a block diagram showing an example structure of a personal computer of an optimization device employable in an embodiment of the present disclosure.
  • the personal computer may correspond to the above-described exemplary optimization device according to the present disclosure.
  • a central processing unit (CPU) 1101 executes various processes according to programs stored in a read only memory (ROM) 1102 or loaded from a storage section 1108 to a random access memory (RAM) 1103 .
  • data required when the CPU 1101 executes various processing and the like is also stored as necessary.
  • the CPU 1101, ROM 1102, and RAM 1103 are connected to each other via a bus 1104.
  • An input/output interface 1105 is also connected to the bus 1104 .
  • the following components are connected to the input/output interface 1105: an input section 1106 including a keyboard, a mouse, etc.; an output section 1107 including a display such as a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 1108 , including a hard disk, etc.; and the communication part 1109, including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 1109 performs communication processing via a network such as the Internet.
  • a driver 1110 is also connected to the input/output interface 1105 as needed.
  • a removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc. is mounted on the drive 1110 as necessary, so that a computer program read therefrom is installed into the storage section 1108 as necessary.
  • the programs constituting the software are installed from a network such as the Internet or a storage medium such as the removable medium 1111 .
  • a storage medium is not limited to the removable medium 1111 shown in FIG. 11 in which the program is stored and distributed separately from the device to provide the program to the user.
  • the removable media 1111 include magnetic disks (including floppy disks (registered trademark)), optical disks (including compact disk read only memory (CD-ROM) and digital versatile disks (DVD)), magneto-optical disks (including )) and semiconductor memory.
  • the storage medium may be a ROM 1102, a hard disk contained in the storage section 1108, or the like, in which programs are stored and distributed to users together with devices containing them.
  • the methods and systems of the present invention can be implemented in a variety of ways.
  • the methods and systems of the present invention may be implemented by software, hardware, firmware, or any combination thereof.
  • the sequence of steps of the method described above is illustrative only, and unless specifically stated otherwise, the steps of the method of the present invention are not limited to the sequence specifically described above.
  • the present invention can also be embodied as a program recorded in a recording medium, including machine-readable instructions for implementing the method according to the present invention. Therefore, the present invention also covers a recording medium storing a program for implementing the method according to the present invention.
  • Such storage media may include, but are not limited to, floppy disks, optical disks, magneto-optical disks, memory cards, memory sticks, and the like.
  • embodiments of the present disclosure may also include the following illustrative examples (EE).
  • the sample picture for image signal processor optimization is processed using the simulator of the image signal processor to obtain the result picture;
  • an evaluation score is acquired, the evaluation score being obtained based on a task model of a specific task to which the image signal processor is applied and being used for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture;
  • a configuration parameter of an image signal processor is adjusted based on the evaluation score.
  • EE 2 The optimization device according to EE 1, wherein the distribution evaluation score indicates a distribution difference between a training set of the task model and the sample pictures.
  • EE 3 The optimization device according to EE 2 or 3, wherein the distribution evaluation score is calculated based on at least one of the statistical characteristics of the batch normalization layer included in the task model and the operation result of the result picture at the batch normalization layer.
  • EE 6 The optimization device according to any one of EE 3-5, wherein the distribution evaluation score is calculated based on the difference between the mean value of the batch normalization layer and the mean value of the operation result, and on the ratio of the variance of the batch normalization layer to the variance of the operation result.
  • EE 7 The optimization device according to any one of EE 3-5, wherein the distribution evaluation score is calculated based on the difference between the mean value of the batch normalization layer and the mean value of the operation result, and on the difference between the variance of the batch normalization layer and the variance of the operation result.
  • EE 8 The optimization device according to any one of EE 1-7, wherein the distribution evaluation score is at least one selected from the group comprising an L norm, Kullback-Leibler (KL) divergence, Jensen-Shannon (JS) divergence, and Wasserstein distance.
  • EE 9 The optimization device according to any one of EE 1-8, wherein the distribution evaluation score is obtained for unlabeled sample pictures.
  • EE 10 The optimization device according to any one of EE 1-9, wherein the evaluation score further includes a model evaluation score of a model output obtained by operating the task model on the result picture.
  • EE 11 The optimization device according to EE 10, wherein the model evaluation score is at least one selected from the group consisting of the F1 value, the mean Average Precision (mAP) value, the mean Average Recall (mAR) value, the Intersection over Union (IoU) value, the Dice coefficient, and the Panoptic Quality (PQ) value.
  • EE 12 The optimization device according to EE 10 or 11, wherein the model evaluation score is calculated for labeled sample pictures based on label information contained in labeled sample pictures.
  • EE 13 The optimization device according to any one of EE 10-12, wherein the evaluation score is calculated based on a weighted sum of the distribution evaluation score and the model evaluation score.
  • EE 14 The optimization device according to EE 1, wherein the parameters and functional effects of the simulator are in one-to-one correspondence with the parameters and functional effects of the image signal processor.
  • the configuration parameters are adjusted so that the task effect obtained by completing the specific task based on the adjusted configuration parameters is better.
  • the multiple sets of evaluation scores respectively corresponding to multiple sets of configuration parameters, and are multiple sets of evaluation scores obtained by processing sample pictures based on the multiple sets of configuration parameters for the operation of the task model;
  • EE 18 The optimization device according to EE 1-17, wherein the processing circuit is further configured to iteratively perform adjustment of configuration parameters of the image signal processor.
  • EE 20 The optimization device according to EE 1, wherein the configuration parameters of the image signal processor are obtained by processing the values generated by the optimizer to meet the parameter requirements of the image signal processor.
  • the configuration parameters are adjusted based on the values produced by the updated optimizer.
  • EE 23 The optimization device according to any one of EE 20-22, wherein the processing circuit is further configured to:
  • EE 24 The optimization device according to any one of EE 20-23, wherein the optimizer is a black box optimizer.
  • EE 25 The optimization device according to any one of EE 20-23, wherein the optimizer is a CMA-ES optimizer.
  • EE 26 The optimization device according to any one of EE 1-25, wherein the task model is a task-specific model trained on a large-scale data set, and the output result of the task model corresponds to the execution result of the task.
  • EE 27 The optimization device according to any one of EE 1-26, wherein the sample picture comprises a plurality of sample pictures, and the processing circuit is further configured to process the samples in batches,
  • the multiple sample pictures are divided into multiple batches
  • the simulator is used to process a batch of sample pictures, and the result pictures obtained from the processing are provided to the task model for calculation; at the same time, the simulator processes the next batch of sample pictures.
  • EE 28 The optimization device according to any one of EE 1-27, wherein the processing circuit is further configured to:
  • the centrality of the sample indicates the number of samples adjacent to the sample, and adjacent samples are defined as the distance between image features is less than a certain threshold;
  • the ratio of the absolute value of the centrality to the approximate likelihood is calculated to determine the labeling importance of the sample.
  • An optimization method for an image signal processor comprising:
  • the sample picture for image signal processor optimization is processed using the simulator of the image signal processor to obtain the result picture;
  • an evaluation score is acquired, the evaluation score being obtained based on a task model of a specific task to which the image signal processor is applied and being used for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture;
  • a configuration parameter of an image signal processor is adjusted based on the evaluation score.
  • EE 31 The method according to EE 30, wherein the distribution evaluation score is indicative of a distribution difference between a training set of the model and the sample pictures.
  • EE 32 The method according to EE 30 or 31, wherein the evaluation score further includes a model evaluation score based on a model output obtained by operating the task model on the result picture.
  • the configuration parameters are adjusted so that the task effect obtained by completing the specific task based on the adjusted configuration parameters is better.
  • the multiple sets of evaluation scores respectively corresponding to multiple sets of configuration parameters, and are multiple sets of evaluation scores obtained by processing sample pictures based on the multiple sets of configuration parameters for the operation of the task model;
  • EE 35 The method according to any one of EE 30-34, further comprising: iteratively performing the adjustment of the configuration parameters of the image signal processor.
  • EE 36 The method according to EE 35, wherein the iteration termination condition comprises at least one of the following:
  • EE 37 The method according to EE 30, wherein the configuration parameters of the image signal processor are obtained by processing the values generated by the optimizer to meet the parameter requirements of the image signal processor.
  • the configuration parameters are adjusted based on the values produced by the updated optimizer.
  • EE 40 The method according to any one of EE 30-39, wherein the sample picture comprises a plurality of sample pictures, and the method further comprises processing the samples in batches,
  • the multiple sample pictures are divided into multiple batches
  • the simulator is used to process a batch of sample pictures, and the result pictures obtained from the processing are provided to the task model for calculation; at the same time, the simulator processes the next batch of sample pictures.
  • EE 41 The method according to any one of EE 30-40, further comprising:
  • the centrality of the sample indicates the number of samples adjacent to the sample, and adjacent samples are defined as the distance between image features is less than a certain threshold;
  • the ratio of the absolute value of the centrality to the approximate likelihood is calculated to determine the labeling importance of the sample.
  • a photographic device comprising:
  • an image signal processor for generating an image based on an electrical signal converted by the image sensor from light collected by the photographic device
  • An optimization device for optimizing an image signal processor.
  • At least one storage device storing instructions thereon which, when executed by the at least one processor, cause the at least one processor to perform the optimization method according to any one of EE 30-42.
  • EE 45 A storage medium storing instructions which, when executed by a processor, enable execution of the optimization method according to any one of EE 30-42.
  • EE 46 A program product comprising instructions which, when executed by a processor, enable execution of the optimization method according to any one of EE 30-42.
  • EE 47 A computer program comprising instructions which, when executed by a computer, cause the computer to perform the optimization method according to any one of EE 30-42.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

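The present disclosure relates to a method and a device for optimizing an image signal processor. An optimization device for an image signal processor is provided, the optimization device comprising a processing circuit configured to: process a sample picture for image signal processor optimization using a simulator of the image signal processor to obtain a result picture; acquire an evaluation score, obtained based on a task model of a specific task to which the image signal processor is applied, for evaluating the execution effect of the specific task on the sample picture, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample picture; and adjust configuration parameters of the image signal processor based on the evaluation score.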

Description

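Image signal processor optimization method and device
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 202110965449.1, filed on August 23, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to image signal processing, and in particular to the optimization of image signal processors.
Background
As electronic photographic devices such as digital cameras and camera-equipped portable devices become increasingly widespread, people increasingly use them to capture photos and videos of various scenes.
The image signal processor (ISP) is the low-level image processing unit of an electronic photographic device. It converts the raw light signal captured by the device's optical sensor into a picture that the human eye can view on various display devices, and it is widely used in today's digital cameras, mobile phone cameras, and similar devices. The performance of the ISP has a considerable influence on the quality of the final captured image. An ISP generally exposes a large number of adjustable configuration parameters, and ISP manufacturers usually employ experts to tune them. The tuning target is generally human visual perception, such as texture sharpness and visual noise.
Unless otherwise stated, it should not be assumed that any approach described in this section qualifies as prior art merely by virtue of its inclusion in this section. Likewise, unless otherwise stated, issues identified with respect to one or more approaches should not be assumed, on the basis of this section, to have been recognized in any prior art.
Summary
This summary is provided to introduce, in simplified form, concepts that are described in detail in the detailed description below.
An object of the present disclosure is to optimize an image signal processor, in particular with respect to the specific task to which the image signal processor is applied.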
在本公开的一个方面,提供了一种用于图像信号处理器的优化设备,所述优化设备包括处理电路,被配置为使用图像信号处理器的模拟器对用于图像信号处理器优化的样本图片进行处理以获得结果图片;获取基于所述图像信号处理器所应用于的特定任务的任务模 型获得的、用于评价所述特定任务对于样本图片的执行效果的评估分数,所述评估分数包括指示样本图片分布偏差的分布评估分数;以及基于所述评估分数来调整图像信号处理器的配置参数。
在本公开的另一个方面,提供了一种用于图像信号处理器的优化方法,所述方法包括:使用图像信号处理器的模拟器对用于图像信号处理器优化的样本图片进行处理以获得结果图片;获取基于所述图像信号处理器所应用于的特定任务的任务模型获得的、用于评价所述特定任务对于样本图片的执行效果的评估分数,所述评估分数包括指示样本图片分布偏差的分布评估分数;以及基于所述评估分数来调整图像信号处理器的配置参数。
在本公开的另一方面,提供了一种摄影设备,包括图像信号处理器,其用于基于由图像传感器将摄影设备所采集的光转化成的电信号,产生图像,以及如本文所述的优化设备,其用于对图像信号处理器进行优化。
在还另一方面,提供了一种优化设备,包括至少一个处理器和至少一个存储设备,所述至少一个存储设备其上存储有指令,该指令在由所述至少一个处理器执行时可使得所述至少一个处理器执行如本文所述的方法。
在仍另一方面,提供了一种存储有指令的存储介质,该指令在由处理器执行时可以使得执行如本文所述的方法。
在仍另一方面,提供了一种程序产品,所述程序产品包含指令,该指令在由处理器执行时可使得所述处理器执行如本文所述的方法。
在仍另一方面,提供了一种计算机程序,所述计算机程序包括指令,所述指令在由计算机执行时使得计算机执行如本文所述的方法。
从参照附图的示例性实施例的以下描述,本发明的其它特征将变得清晰。
附图说明
下面参照附图说明本公开的优选实施例。此处所说明的附图用来提供对本公开的进一步理解,各附图连同下面的具体描述一起包含在本说明书中并形成说明书的一部分,用于解释本公开。应当理解的是,下面描述中的附图仅仅涉及本公开的一些实施例,而非对本公开构成限制。
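Fig. 1 shows a general conceptual diagram of the image signal processing flow.
Fig. 2 shows a schematic diagram of an application scenario of image signal processor tuning according to an embodiment of the present disclosure.
Fig. 3A shows a block diagram of an optimization device for an image signal processor according to an embodiment of the present disclosure.
Fig. 3B shows a flowchart of an optimization method for an image signal processor according to an embodiment of the present disclosure.
Fig. 4A shows an application scenario analysis of image signal processor tuning according to an embodiment of the present disclosure, and Fig. 4B shows a schematic flowchart of image signal processor tuning according to an embodiment of the present disclosure.
Fig. 5A shows an exemplary flow of automatic ISP tuning based on the KITTI data set according to an embodiment of the present disclosure; Fig. 5B shows the tuning effect according to an embodiment of the present disclosure, namely the model's predictions on ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters; and Fig. 5C shows pictures generated with manually tuned ISP parameters and with automatically tuned ISP parameters.
Fig. 6 shows the unsupervised tuning effect, namely the model's predictions on ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
Fig. 7 shows the semi-supervised tuning effect, namely the model's predictions on ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
Fig. 8 shows the model performance after tuning with different amounts of labeled data, namely the model's predictions on ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
Fig. 9 shows the model performance after parameter tuning of different ISP simulators, namely the model's predictions on ISP-processed pictures after manual adjustment of the ISP parameters and after automatic tuning of the ISP parameters.
Fig. 10 shows a photographic device according to an embodiment of the present disclosure.
Fig. 11 shows a block diagram of an exemplary hardware configuration of a computer system capable of implementing embodiments of the present invention.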
应当明白,为了便于描述,附图中所示出的各个部分的尺寸并不一定是按照实际的比例关系绘制的。在各附图中使用了相同或相似的附图标记来表示相同或者相似的部件。因此,一旦某一项在一个附图中被定义,则在随后的附图中可能不再对其进行进一步讨论。
具体实施方式
下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述,但是显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。附图以及下文对实施例的描述实际上也仅仅是说明性的,决不作为对本公开及其应用或使用的任何限制。应当理解的是,本公开可以通过各种形式来实现,而且不应该被解释为限于这里阐述的实施例。
此外,在下文中结合附图对本公开的示范性实施例进行描述时,为了清楚和简明起见,在说明书中并未描述实施例的所有特征。应当注意,为了避免因不必要的细节而模糊了本公开,在附图中仅仅示出了与至少根据本公开的方案密切相关的处理步骤和/或设备结构, 而省略了与本公开关系不大的其他细节。
应当理解,本公开的方法实施方式中记载的各个步骤可以按照不同的顺序执行,和/或并行执行。此外,方法实施方式可以包括附加的步骤和/或省略执行示出的步骤。本公开的范围在此方面不受限制。除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值应被解释为仅仅是示例性的,不限制本公开的范围。
在本公开中,术语“第一”、“第二”等仅仅用于区分元件或者步骤,而不是要指示时间顺序、优先选择或者重要性。
ISP(Image Signal Processor),即图像信号处理器,是电子摄影设备,诸如各种数码相机、便携设备搭载摄影设备等中重要的组成部分。图1示出了ISP在拍摄架构中的概念性布置。特别地,在拍照时,光线通过镜头模组进入透镜模组,紧接着图像传感器将通过透镜接到的光转化为电信号,电信号将会发送到图像信号处理器,图像信号处理器基于所接收到的电信号来产生图像,以供被呈现给用户,或者进一步被处理,例如经由图像处理器(例如,GPU)处理,其处理结果可被呈现给用户。在操作中,图像信号处理器(ISP)单元可以执行一系列的信号处理过程,旨在使图像在视觉呈现上较为美观并且适于用户观察。ISP所执行的处理可以包括但不限于AEC(自动曝光控制)、AGC(自动增益控制)、AWB(自动白平衡)、去噪、去马赛克、锐化、色彩校正、伽马映射、色调映射、压缩等等。ISP单元所产生的信号可以为任何适当的格式,只要该信号可以被进一步处理或者适于用户观看即可,例如可以是JPEG,JPG等格式的图像。
ISP可以在各种图像应用中得到广泛地使用。例如,随着机器学习的发展,大量图片被用于计算机视觉任务,然而,但是ISP优化/调优、尤其是针对自动驾驶等高级计算机视觉任务的ISP优化/调优尚存在困难。具体而言,针对计算机视觉任务进行调优主要有两个难点:一是人类专家基于视觉效果的调优很难得到计算机视觉算法的最优解;二是计算机视觉模型会随着技术与数据的积累快速迭代,而人类专家难以如此快的对ISP参数进行调优。
已经提出了一些ISP自动调优的尝试。一种思路是基于专家的理解在考虑了ISP图像处理效果的情况下对ISP进行自动调优,例如基于专家理解,得出图像的诸如锐利度、对比度等图像特征会对配准、行人检测等计算机视觉任务有所帮助,因此通过调整ISP的部分参数来增强专家认为有效的图像特征。该思路尽管不需要专家手动调整,但依然需要专家的领域知识进行判断,同时,仅对一部分ISP的参数进行调节,难以充分利用和改善ISP的功能。另一种思路是对ISP进行模块化的优化,特别地,通过数学上效果近似的算 法对ISP的各个模块进行简化模拟,例如将ISP功能抽象为数个卷积神经网络(CNN)并整体针对下游任务进行训练。该思路的问题在于针对ISP的独立模块进行优化,而忽视了模块的互相影响,无法实现有效的优化。而且,各个CNN与ISP参数无准确的对应关系,无法将CNN结果有效地应用于ISP参数的调优,也就无法对现有的硬件加以利用。
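In view of this, the present disclosure proposes an improved ISP tuning solution. In particular, it proposes the idea of tuning the ISP for a task, and especially for the execution effect of the task. Specifically, for a particular application task of the image signal processor, the ISP is tuned with the goal of improving how well that task is completed. Unlike current methods, in which experts tune the ISP mainly according to personal perception, the solution of the present disclosure requires no expert involvement and can optimize for a specific task, so that pictures processed by the ISP give better results on that task. Compared with tuning for the visual quality of pictures, this better optimizes the ISP's effect in its application.
Further, the present disclosure proposes automatically tuning the image signal processor (ISP) according to the performance of the task model used for the specific task. In particular, a black-box optimization algorithm is used to tune the ISP automatically according to the performance of a specific model for the specific task, in which the ISP is treated as a whole while each parameter is explicitly tuned, so that ISP parameters giving a better completion effect for the task can be obtained.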
应指出,本公开的技术方案可以应用于各种适当的任务,包括但不局限于计算机视觉任务。在一些实施例中,计算机视觉任务包括图像分类、物体检测、物体分割、实例分割、全景分割中的至少一个,从而可以针对不同的任务来分别调优。计算机视觉任务通常可以用于各种应用场景,继而ISP调优可以针对计算机视觉任务、或者计算机视觉任务所用于的各种应用场景来适应性地进行。
作为一个示例,其应用场景之一为自动驾驶。自动驾驶的应用需要一系列计算机视觉任务的支撑,包括车道检测、信号灯识别、标志识别、车辆与行人检测等。车道检测需要基于图像分割出各个车道,该任务需要能准确的识别路面的车道线,不需要ISP有很好的色彩准确性,却需要清晰的边缘特征以帮助曲线识别。信号灯识别需要能识别信号灯的位置与颜色,主要需要ISP准确的颜色校正与防止过度曝光。标志识别需要识别路面的车道标志或路边的路牌标志,需要ISP能稳定的获得标志的颜色,用于进行定位与分类。车辆与行人检测最为复杂,需要ISP整体的性能较强。而自动驾驶中也有多类不同的细分应用,对不同任务的依赖程度不一,例如自动巡航主要需要先检测当前车道并检测道路前方障碍物,因此对于ISP色彩准确性的需求较低;同时,车道检测一般以车道线的分割为性能评测标准,然而该场景下车道障碍物检测准确性为更好的评测标准,因此针对该场景,本公开的方案需要先进行车道线分割,再基于其结果进行障碍物检测,并根据此障碍物检测准 确率来进行ISP调优。
针对应用进行的ISP调优,其另一应用场景为无人零售。无人零售的应用主要需要人脸检测,人脸验证,商品检测功能。其中,人脸检测需要ISP能准确反映人脸肤色、形状等;人脸验证则需要ISP能清晰的展示人脸特征点。商品检测需要ISP能准确反映包装袋颜色、花纹、文字等纹理特征。因此,在本公开中,可以通过人脸检测、人脸验证、商品检测的计算机视觉任务模型效果的综合,来进行无人零售应用的ISP调优。
总之,针对某一个应用的最终效果进行ISP调优,比针对图片视觉特征或单一计算机视觉任务进行ISP调优带来更好的效果。
图2示出了根据本公开的实施例可应用于其中的特定应用场景的示意图,其中可执行根据本公开的实施例的ISP调优处理。该应用场景与计算机视觉任务相关,并且可以借助于用于计算机视觉任务的模型来进行ISP调优。具体而言,光学传感器处理入射光以产生电信号并将之发送到图像信号处理器,图像信号处理器(ISP)基于所接收到的电信号来产生图像,例如8位RGB图片,所产生的图像被输入到计算机视觉任务模型以用于完成计算机视觉任务。在该应用场景中,可以基于计算机视觉任务模型效果来自动调节ISP参数,继而基于调节后的ISP参数来提高模型效果。这样,通过ISP参数与计算机视觉任务模型之间的交互来优化ISP参数,继而能够提高模型表现。
图3A示出了根据本公开的实施例的用于图像信号处理器(ISP)的优化设备的框图。如图3A所示,设备30包括处理电路302,处理电路302被配置用于使用图像信号处理器的模拟器对用于图像信号处理器优化的样本图片进行处理以获得结果图片;获取基于所述图像信号处理器所应用于的特定任务的任务模型获得的、用于评价所述特定任务对于样本图片的执行效果的评估分数,所述评估分数包括指示样本图片分布偏差的分布评估分数;以及基于所述评估分数来调整图像信号处理器的配置参数。
根据本公开的实施例,ISP模拟器是针对真实/实际ISP的模拟,其可实现与真实ISP相同的功能,并且可应用ISP的配置参数实现与真实ISP一致的处理。也就是说,ISP模拟器的参数和功能效果与图像信号处理器的参数和功能效果一一对应。因此,在一些实施例中,模拟器可基于ISP的配置参数对输入样本图片进行处理,以获得与真实ISP基本一致的处理结果。作为示例,ISP模拟器可以采用各种方式来实现,例如可以通过硬件、软件或者固件来实现。例如,ISP模拟器本身可以是个黑盒来模拟真实ISP,而内部结构对实现并无影响。
在一些实施例中,样本图片可以是可由图像信号处理器处理的任何适当的图像,例如 可以是从预先存储在数据中的样本图片中选择的,或者是通过摄影设备在特定时间段内拍摄得到的。另外,样本图片可以是原始图片,或者已对原始图片进行过特定处理的图像,例如初步过滤,去混叠,颜色调整,对比度调整,规范化等等。应指出,预处理操作还可以包括本领域已知的其它类型的预处理操作,这里将不再详细描述。在一些实施例中,样本图片可以具有特定标注状态,例如其标注情况为全都有人工标注、部分有人工标注、无人工标注的其中一类,从而在这样的标注情况下进行的调优可分别对应有监督调优、半监督调优、无监督调优。以下将详细描述。
根据本公开的实施例,任务模型是表征应用了ISP的特定任务的模型。针对不同任务,可以使用不同模型。例如,该任务模型可以是计算机视觉任务模型,也可以是任何其它适当的任务的模型,这里将不再详细描述。在一些实施例中,任务模型可以是采用各种适当的方式来实现,例如神经网络等。作为示例,任务模型可以采用深度神经网络、卷积神经网络等中的任一个来实现。
在一些实施例中,本公开的任务模型可以是基于模型训练数据集训练得到针对特定任务的模型,其输入为经ISP模拟器处理后的样本图片,并且模型输出结果可以表征对应任务的执行结果。应指出,本公开中的任务模型可基于模型训练数据集被训练得到,特别地可以是更为复杂的且基于大规模数据集而训练得到的模型,这样就意味着本公开的调优方案可以通过复杂的模型来实现,能够很好地应对复杂的应用场景。而且,本公开中的任务模型不是基于调优所使用的样本图片被训练,而是任务模型已经训练好并且能够在ISP调优过程中执行相关的任务以便进行模型执行效果评估,而无需再进一步进行训练和修改。这样无需对模型本身进行修改就可以提高ISP完成特定任务的性能,而且调优过程中能够保持性能稳定。
根据本公开的实施例,评估分数可以用于评价特定任务的执行效果,其可对应于相应的任务模型的任务准确性,例如计算机视觉模型的任务准确性,继而可被用于对ISP进行调优来改善真实环境中完成任务的效果。在一些实施例中,评估分数可以基于所述图像信号处理器所应用于的特定任务的任务模型被获得,特别地基于任务模型对于得自样本图像的结果图像的处理而获得的。
根据本公开的实施例,评估分数可以包含各种适当形式的评估分数,和/或可以由各种适当形式的评估分数组成。
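According to embodiments of the present disclosure, the evaluation score may include a distribution evaluation score indicating a distribution deviation of the sample pictures. In particular, the distribution evaluation score indicates the distribution difference between the training set of the model and the sample pictures. The smaller the distribution difference, the higher the distribution evaluation score, and the more likely it is that the model's processing of the sample pictures achieves the desired effect, which indicates good task execution. Specifically, a task model applied to a specific task is trained on a particular training set. In general, the model performs well when the pictures it is applied to resemble the training set, and poorly otherwise. Therefore, when ISP-processed sample pictures are fed to the task model for further computation, the model can in most cases give better results if the distribution of the ISP-processed pictures deviates little from the training-set distribution.
In some embodiments of the present disclosure, the distribution evaluation score can be calculated for various types of input samples, and is particularly suitable for samples without manual labels. In this case the distribution evaluation score may be called an unsupervised evaluation value. The distribution evaluation score is therefore particularly suited to task evaluation, and the corresponding ISP tuning, using unsupervised sample pictures. Of course, it can also be used for task evaluation and the corresponding ISP tuning with supervised or semi-supervised sample pictures.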
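The distribution evaluation score can be calculated in various suitable ways. According to embodiments of the present disclosure, it may be calculated based on at least one of the statistical characteristics of a specific layer included in a task model with a layered structure and the statistical characteristics of the operation result of the result picture at and/or before that specific layer. As an example, the task model is a deep neural network, and its characteristic batch normalization (BN) layer can serve as the specific layer for calculating the distribution evaluation score.
In some embodiments, the statistical characteristics of the batch normalization layer can be obtained in a suitable way, for example by computation with the model or by reading existing data. Preferably, they are read directly from the model's weights, which avoids computing them from training data, improves efficiency, and saves computational overhead. In some embodiments, the statistical characteristics of the operation result of the result picture at and/or before the specific layer can be acquired and recorded while the task model operates on the result picture; the recorded values are called activation values and are used to calculate the evaluation score. In some embodiments, the distribution evaluation score may be calculated from both the statistical characteristics of the specific layer in the task model and those of the operation result of the result picture before the specific layer. In particular, as an example, the distribution evaluation score can be calculated as follows: read the weights of the batch normalization layer of the deep neural network; compute the activation values of the test samples before the batch normalization layer; and compute the sample distribution difference from the statistical characteristics of the weights and the activation values.
According to embodiments of the present disclosure, the statistical characteristics of the batch normalization layer include its mean and variance. Preferably, they include the mean and variance read directly from the model's weights, in particular the weights of the model's BN layer. According to embodiments of the present disclosure, the operation result of the result picture at the batch normalization layer includes the mean and variance of the distribution of each channel of that operation result. In some embodiments, directly recording all activation values during computation would occupy a large amount of GPU memory and would therefore be hard to implement. Preferably, therefore, the recorded first and second moments are updated in operation from the activation values of each input sample, and the mean and variance are calculated from the record after all batches of samples have been processed.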
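The moment-accumulation idea described above can be sketched as follows. This is an illustrative NumPy example under stated assumptions: the class name `RunningMoments`, the `(batch, channels, height, width)` activation layout, and the per-channel reduction are choices made for the sketch, not details given in the source. Only two per-channel sums and a counter are kept, so memory use does not grow with the number of samples.

```python
import numpy as np


class RunningMoments:
    """Accumulates per-channel first and second moments batch by batch."""

    def __init__(self, num_channels):
        self.count = 0
        self.sum1 = np.zeros(num_channels)   # running sum of x
        self.sum2 = np.zeros(num_channels)   # running sum of x ** 2

    def update(self, activations):
        # activations: array of shape (batch, channels, height, width),
        # i.e. the values observed at the BN layer for one batch.
        x = np.asarray(activations, dtype=np.float64)
        self.count += x.shape[0] * x.shape[2] * x.shape[3]
        self.sum1 += x.sum(axis=(0, 2, 3))
        self.sum2 += (x ** 2).sum(axis=(0, 2, 3))

    def finalize(self):
        # Mean and (biased) variance recovered from the recorded moments.
        mean = self.sum1 / self.count
        var = np.maximum(self.sum2 / self.count - mean ** 2, 0.0)
        return mean, var
```

In a deep-learning framework, `update` would typically be called from a hook on the BN layer's input; that wiring is framework-specific and is left out here.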
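In some embodiments, the distribution evaluation score can take various suitable forms, for example at least one selected from the group comprising an L norm, the Kullback-Leibler (KL) divergence, the Jensen-Shannon (JS) divergence, and the Wasserstein distance. The distribution evaluation score can be chosen appropriately for the task to which the ISP is applied.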
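According to some embodiments of the present disclosure, the distribution evaluation score may be calculated based on the difference between the mean of the batch normalization layer and the mean of the operation result, and on the ratio of the variance of the batch normalization layer to the variance of the operation result. In some embodiments, such a distribution evaluation score may be the KL divergence. As an example, the KL divergence can be used for optimization in the unsupervised, semi-supervised, or fully supervised case, and is particularly suitable for the unsupervised case. The KL divergence can be calculated in any suitable manner known in the art. As an example, it can be calculated as shown in the following formulas:
[Formula 1]
Figure PCTCN2022113673-appb-000001
where
Figure PCTCN2022113673-appb-000002
[Formula 2]
Figure PCTCN2022113673-appb-000003
In Formula 1, $i$ denotes the $i$-th sample datum, $b_i$ is the size of the batch containing that datum, $N$ is the total number of sample data, and $x_i$ is the computed result of the input sample at the current batch normalization (BN) layer. $\mu_1, \sigma_1$ are the mean and standard deviation at the current BN layer. Because $\mu_1$ and $\sigma_1$ can both be computed from sums, the GPU memory their computation occupies equals that needed for a single $x_i$, which avoids excessive memory use as the amount of data grows. In Formula 2, $\mu_1, \sigma_1$ are the results of Formula 1, while $\mu_2, \sigma_2$ are the mean and standard deviation of the current BN layer, which can be read directly from the model's weights, avoiding computation from training data.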
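The formula images are not reproduced in this text, so the following sketch should not be read as the patent's exact Formula 2. It uses the standard closed-form KL divergence between two univariate Gaussians, which depends on the mean difference and the variance ratio as the paragraph describes. The function name `gaussian_kl`, the direction of the divergence (observed statistics relative to BN statistics), and the averaging over channels are assumptions.

```python
import numpy as np


def gaussian_kl(mu1, var1, mu2, var2, eps=1e-12):
    """Mean over channels of KL( N(mu1, var1) || N(mu2, var2) ).

    mu1, var1: statistics of the activations observed at the BN layer.
    mu2, var2: running statistics stored in the BN layer's weights.
    """
    mu1, var1, mu2, var2 = (
        np.asarray(a, dtype=np.float64) for a in (mu1, var1, mu2, var2)
    )
    var1 = var1 + eps   # guard against zero variance
    var2 = var2 + eps
    kl = 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)
    return float(kl.mean())
```

The divergence is zero when the observed statistics match the stored ones and grows as they drift apart, so a tuner would minimize it (or maximize its negative).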
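It should be noted that this embodiment computes the KL divergence only for the first BN layer; the KL divergence computation can be carried out for any number of BN layers at the same time. In that case, for the computation at any BN layer, $x_i$ in the above formulas is the computed result of the input sample at that BN layer, and the KL divergence of that BN layer can then be calculated according to the above formulas.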
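According to some embodiments of the present disclosure, the distribution evaluation score may be calculated based on the difference between the mean of the batch normalization layer and the mean of the operation result, and on the difference between the variance of the batch normalization layer and the variance of the operation result. In some embodiments, such a distribution evaluation score may be an L norm. Specifically, the L1 norm difference is calculated by the following formula and used as the evaluation score:
[Formula 3]
$$\text{L1 norm difference} = |\sigma_1^2 - \sigma_2^2| + |\mu_1 - \mu_2|,$$
where $\mu_1, \sigma_1$ and $\mu_2, \sigma_2$ are as defined above.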
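Formula 3 translates almost directly into code. In this small sketch the function name `l1_norm_difference` is hypothetical, and summing the per-channel terms into a single scalar is an assumption, since the source does not say how channels are aggregated.

```python
import numpy as np


def l1_norm_difference(mu1, var1, mu2, var2):
    """|var1 - var2| + |mu1 - mu2|, summed over channels (aggregation assumed)."""
    mu1, var1, mu2, var2 = (
        np.asarray(a, dtype=np.float64) for a in (mu1, var1, mu2, var2)
    )
    return float(np.sum(np.abs(var1 - var2) + np.abs(mu1 - mu2)))
```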
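Thus, using the distribution evaluation score to evaluate the execution effect of the specific task to which the ISP is applied can usefully guide ISP tuning.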
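According to embodiments of the present disclosure, the evaluation score may further include a model evaluation score based on the model output obtained when the task model operates on the result picture. In particular, the evaluation score can also be determined based on the model evaluation score. In some embodiments, a score for evaluating how well the task is completed can be calculated from the model output and used as the model evaluation score. The model evaluation score is particularly suited to evaluating task execution when the input contains labeled samples, and such a score can be regarded as a supervised evaluation value. In some embodiments, the model evaluation score may be calculated for labeled sample pictures based on the label information they contain.
In some embodiments, the model evaluation score is at least one selected from the group comprising the F1 value, the mean Average Precision (mAP) value, the mean Average Recall (mAR) value, the Intersection over Union (IoU) value, the Dice coefficient, and the Panoptic Quality (PQ) value. In particular, different types of values can be chosen for different types of tasks. As an example, the F1 value is preferred for image classification, the mAP value for object detection, the Dice coefficient for object segmentation, the mAR value for instance segmentation, and the PQ value for panoptic segmentation. These values can be calculated in various suitable ways, which are not described in detail here.
According to embodiments of the present disclosure, the evaluation score may be obtained from both the distribution evaluation score and the model evaluation score, so as to indicate the task execution effect more appropriately. Such an evaluation score is particularly suited to supervised evaluation, semi-supervised evaluation, fully supervised evaluation, and so on. In particular, the evaluation score may be obtained as a weighted sum of the distribution evaluation score and the model evaluation score. The weights applied to the two scores can be chosen appropriately and are not particularly limited, as long as the evaluation score is calculated such that the better the task execution effect, the higher the evaluation score.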
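One way to combine the two scores is sketched below. The function name `combined_score` and the default weights are hypothetical. Because a divergence such as the KL divergence or the L1 norm difference shrinks as the distributions get closer, the sketch negates it so that a higher combined score always means better task execution, as the paragraph requires. Passing `model_score=None` covers the unlabeled (unsupervised) case.

```python
def combined_score(divergence, model_score=None, w_dist=1.0, w_model=1.0):
    """Weighted sum of a distribution term and an optional model term.

    divergence:  distribution deviation (lower is better), e.g. KL or L1.
    model_score: supervised metric (higher is better), e.g. mAP; None if
                 no labeled samples are available.
    """
    score = -w_dist * divergence          # negate: smaller deviation, higher score
    if model_score is not None:
        score += w_model * model_score
    return score
```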
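In this way, the above process produces an evaluation score of the task effect for the specific task to which the ISP is applied or is expected to be applied.
According to embodiments of the present disclosure, the ISP configuration parameters can be optimized based on the generated evaluation score. These are the configuration parameters used by the ISP simulator to process the sample pictures while the evaluation score is generated, and they also correspond to the respective configuration parameters of the real ISP. Here, the optimization of the configuration parameters takes the execution effect or completion status of the task into account, and aims to make the execution effect of the task better. In some embodiments, the configuration parameters can be adjusted so that a better task effect is achieved when the specific task is completed with the adjusted configuration parameters.
According to embodiments of the present disclosure, the evaluation scores essentially correspond one-to-one to the ISP configuration parameters. Specifically, during optimization the evaluation score can be generated at least once. In each generation operation, initial ISP configuration parameters are set, the ISP simulator processes the sample pictures based on these initial parameters, and the evaluation score is then generated based on the task model. This yields at least one evaluation score, each corresponding to one set of configuration parameters, and the resulting set of evaluation scores is used for the subsequent parameter adjustment.
In some embodiments, the optimization can be performed such that the adjusted configuration parameters are closer to the configuration parameters that lead to better evaluation scores. In some embodiments, the processing circuit is further configured to: acquire multiple sets of evaluation scores, which correspond respectively to multiple sets of configuration parameters and are obtained by processing the sample pictures based on those sets of configuration parameters for operation by the task model; and adjust the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets, and farther from those corresponding to the worse evaluation scores.
As an example, for more stable optimization, the above process, from processing the sample pictures with the ISP simulator up to acquiring the corresponding evaluation score, can preferably be repeated a specific number of times to obtain multiple sets of configuration parameters and their evaluation scores. The number of repetitions can be set arbitrarily; preferably, the process is repeated 12 times. It should be noted that the sample pictures and the task model can remain unchanged across the repetitions, while each repetition sets its own initial ISP configuration parameters for picture processing and thereby generates its own evaluation score. Then, during optimization, the configuration parameters can be adjusted so that they approach those of the multiple sets that produce high evaluation scores while staying as far as possible from those that produce low evaluation scores.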
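According to embodiments of the present disclosure, the processing circuit is further configured to iteratively perform the adjustment of the configuration parameters of the image signal processor. That is, the above process, from processing the sample pictures with the ISP simulator up to the adjustment of the configuration parameters, can be performed iteratively. Each iteration can be performed as described above; in particular, the aforementioned generation of multiple sets of evaluation scores can be performed for the adjustment. In some embodiments, the iterative adjustment can be carried out in any suitable way. In some embodiments, the iteration can be terminated according to a specific condition. In particular, the iteration termination condition includes at least one of the following: the iteration stops when the number of iterations reaches a preset threshold; the iteration stops when the evaluation score of an iteration is no longer better than that of the previous iteration; and the iteration stops when the evaluation score after a specific number of iterations is no longer better than the evaluation score corresponding to the preceding specific number of iterations.
As an example, the predetermined count threshold can be any suitable value. It can be set by the operator, for example from experience or according to the workload requirements of the relevant equipment, or according to the results of previous parameter adjustment operations, for example as an empirical value of the number of iterations of previous operations. For example, the predetermined count threshold may be 500. As another example, the iteration can stop if the evaluation score has not improved over the previous repetition for a first threshold number of consecutive times. Here, this refers to the evaluation scores calculated with the adjusted configuration parameters after each of a first threshold number of consecutive configuration parameter adjustments. The consecutive adjustments can be performed as described above and are not described again here. Preferably, the first threshold is 50.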
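The iterative procedure and its termination conditions can be sketched as a small black-box search loop. To stay self-contained, the sketch swaps in a very simple elite-averaging evolution strategy rather than CMA-ES, so it illustrates the control flow only. The function name `tune`, the callback `evaluate` (standing in for "run the ISP simulator, run the task model, compute the evaluation score"), the starting point, and the fixed step size `sigma` are assumptions; the population of 12, the cap of 500 iterations, and the patience of 50 follow the preferred values in the text.

```python
import random


def tune(evaluate, dim, population=12, max_iters=500, patience=50,
         sigma=0.3, seed=0):
    """Maximize evaluate(params) with a toy evolution strategy.

    Stops after max_iters iterations, or earlier once the best score has
    not improved for `patience` consecutive iterations.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim                      # search starting point (assumed)
    best_score, best_params, stale = float("-inf"), None, 0
    iterations = 0
    for iterations in range(1, max_iters + 1):
        # Sample one candidate configuration per population slot.
        candidates = [[m + rng.gauss(0.0, sigma) for m in mean]
                      for _ in range(population)]
        scored = sorted(((evaluate(c), c) for c in candidates),
                        key=lambda pair: pair[0], reverse=True)
        # Move toward the better-scoring candidates, away from the worse ones.
        elite = [c for _, c in scored[:max(1, population // 4)]]
        mean = [sum(col) / len(elite) for col in zip(*elite)]
        if scored[0][0] > best_score:
            best_score, best_params, stale = scored[0][0], scored[0][1], 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_params, best_score, iterations
```

For example, `tune(lambda p: -sum((v - 0.3) ** 2 for v in p), dim=3)` searches for parameters close to 0.3 in every dimension. In practice a maintained CMA-ES implementation would take the place of the sampling and update steps.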
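According to embodiments of the present disclosure, the configuration parameters can be set and adjusted in various suitable ways. As an example, the configuration parameters can be set initially by the operator and adjusted as described above. As another example, they can be set and adjusted by a suitable apparatus. According to embodiments of the present disclosure, the configuration parameters of the image signal processor can be generated and adjusted by an optimizer. Preferably, the optimizer is a black-box optimizer. Particularly preferably, the optimizer is a CMA-ES optimizer.
In some embodiments, the configuration parameters of the image signal processor can be obtained by processing the values generated by the optimizer so that they meet the parameter requirements of the image signal processor. One example of generating the configuration parameters with an optimizer is described below. First, a set of numbers equal in count to the optimizer's internal parameters is drawn at random as the optimizer's initial values. The optimizer is then invoked and produces a set of values equal in count to the ISP parameters; these values correspond one-to-one to the ISP parameters. The generated values are processed according to the range and numeric type of the actual ISP parameters so that they meet the ISP parameter requirements. The processing includes, but is not limited to, truncating, scaling, or reflecting values that fall outside the range of the corresponding parameter so that they satisfy the range requirement. In particular, the ISP parameters can also be generated according to parameter type. For example, if the parameter type is discrete, the optimizer can directly generate discrete data, which are then processed. As another example, if the optimizer generates continuous values, they are converted to discrete values by rounding. In this way, the optimizer can generate the corresponding configuration parameters.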
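The post-processing of raw optimizer values can be sketched as follows. The parameter specification format `(low, high, kind)` and the names `reflect` and `to_isp_params` are hypothetical; the sketch shows truncation and reflection for out-of-range values and round-half-up conversion for discrete parameters (scaling, also mentioned in the text, is omitted).

```python
import math


def reflect(value, low, high):
    """Fold an out-of-range value back into [low, high] by mirroring."""
    if high == low:
        return low
    span = high - low
    t = (value - low) % (2 * span)
    return low + (t if t <= span else 2 * span - t)


def to_isp_params(raw_values, specs, mode="clip"):
    """Map raw optimizer outputs to valid ISP parameters.

    specs: one (low, high, kind) tuple per parameter, kind in {"float", "int"}.
    mode:  "clip" truncates to the range, "reflect" mirrors at the bounds.
    """
    params = []
    for value, (low, high, kind) in zip(raw_values, specs):
        if mode == "clip":
            value = min(max(value, low), high)
        elif mode == "reflect":
            value = reflect(value, low, high)
        else:
            raise ValueError(f"unknown mode: {mode}")
        if kind == "int":
            value = int(math.floor(value + 0.5))   # round half up
        params.append(value)
    return params
```

Clipping tends to pile candidates up at the bounds, whereas reflection keeps them spread inside the range; which behaves better depends on the optimizer and is not specified in the source.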
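According to some embodiments, the processing circuit is further configured to: update the state of the optimizer using the evaluation score; and adjust the configuration parameters based on the values generated by the updated optimizer. In particular, since an evaluation score corresponds to a set of configuration parameters, and hence to the optimizer values that produced those parameters, the state of the optimizer can be adjusted in operation based on the evaluation score, which in turn adjusts the configuration parameters. In particular, in some embodiments the optimizer is updated so that the values it subsequently generates are closer to the values corresponding to better evaluation scores.
In some embodiments, the optimizer can be optimized in the same way as the configuration parameters described above. As an example, multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer can be acquired, and the state of the optimizer updated so that the values it generates after the update are closer to the values corresponding to the better evaluation scores among the multiple sets and farther from the values corresponding to the worse evaluation scores.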
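The foregoing describes tuning the ISP based on the evaluation score. The evaluation score can be determined appropriately, in particular selected and determined in view of the labeling status of the samples, for example according to whether the samples are unlabeled, partially labeled, or fully labeled. In the present disclosure it is noted that ISP tuning can be carried out relatively efficiently when the input sample data are labeled. The present disclosure therefore further proposes labeling the input sample pictures to achieve more efficient and improved ISP tuning.
According to embodiments of the present disclosure, samples can be labeled in various suitable ways. As an example, samples can be labeled at random. As another example, samples can be labeled according to a specific criterion, in particular based on at least one of the samples' labeling importance, priority, and the like.
According to embodiments of the present disclosure, the processing circuit is further configured to label a predetermined number of sample pictures based on the labeling importance of the sample pictures. As an example, the samples can be sorted by labeling importance and the first predetermined number of sample pictures labeled for use in training. The predetermined number can be set appropriately, for example by the operator from experience, or in view of tuning effect, efficiency, overhead, and so on.
In the context of the present disclosure, the labeling importance of a sample picture can be determined by various suitable methods. In particular, the importance can characterize how representative a sample is within the test set: representative samples, for example those that are highly concentrated, differ greatly from other samples, or are otherwise highly representative, can be given high importance, priority, and so on. In one example, the labeling importance can be set in view of the sample's concentration, for example the more concentrated the sample, the higher its labeling importance. On the other hand, the approximate likelihood of the sample can be considered, for example the smaller the approximate likelihood, the higher the labeling importance.
In some embodiments, the processing circuit is further configured to: calculate the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to it, and adjacent samples are defined as those whose image features lie within a certain threshold distance; calculate the approximate likelihood of each sample, using the image features of the sample and the mean and variance of the corresponding batch normalization layer; and calculate the ratio of the absolute value of the centrality to the approximate likelihood to determine the labeling importance of the sample. In particular, as an example, the labeling importance can be calculated by the following formulas.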
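[Formula 4]
$$K(x) = \{\, x' \mid d(g(f(x)), g(f(x'))) < D \,\},$$
where $K(x)$ denotes the samples adjacent to sample $x$; $f$ denotes the function of the current ISP; $g(f(x))$ is the image feature of the sample, obtained by applying global average pooling to a model output of the task model (for example, the output of the last convolutional layer of the model's backbone network); $d$ is a weighted L2 norm in which the weight of each dimension is the reciprocal of that dimension's standard deviation; and $D$ is the distance threshold. In this embodiment, $f$ uses the same parameters as the ISP tuning.
[Formula 5]
$$L(x) = N(x; \mu_2, \sigma_2^2),$$
where $L(x)$ denotes the approximate likelihood of sample $x$; $\mu_2, \sigma_2^2$ are the mean and variance recorded by the task model (for example, by the last BN layer of the model's backbone network); and $N(x; \mu_2, \sigma_2^2)$ is a multivariate normal distribution with independent dimensions.
[Formula 6]
Figure PCTCN2022113673-appb-000004
where $R(x)$ is the labeling importance of sample $x$, and $|K(x)|$ is the number of samples adjacent to $x$, that is, the centrality.
Thus, the labeling importance of each sample can be determined.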
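The samples are then sorted by labeling importance and labeled in order from high to low, so that tuning with data labeled by importance achieves better results than random labeling. The samples are sorted in descending order of $R(x)$; going from high to low, if a sample $x$ has a higher-ranked sample among its neighbors, it is moved to the end of the sequence. The resulting list serves as the ranking of labeling importance, and the first predetermined number of data in the sorted list are labeled. In this way, the top-ranked, that is, the most important, samples are labeled first, which achieves better results for the same amount of labeling and allows the image signal processor to be tuned better for a computer vision neural network.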
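The ranking procedure can be sketched as follows. Several details are assumptions, because the image for Formula 6 is not reproduced here: the importance is taken as the ratio $|K(x)| / L(x)$ stated in the text, the ratio is compared in the log domain for numerical stability, the per-dimension standard deviation in the distance weight is estimated from the given features, and a sample is demoted when any neighbor precedes it in the initial ranking. The function name `labeling_order` is hypothetical.

```python
import numpy as np


def labeling_order(features, bn_mean, bn_var, dist_threshold, budget):
    """Return indices of the `budget` samples to label first.

    features: (n_samples, n_dims) pooled image features g(f(x)).
    bn_mean, bn_var: per-dimension mean and variance recorded by the model.
    """
    feats = np.asarray(features, dtype=np.float64)
    mean = np.asarray(bn_mean, dtype=np.float64)
    var = np.asarray(bn_var, dtype=np.float64)

    # Weighted L2 distance: weight of each dimension is 1 / std (assumed
    # to be the std of the features themselves).
    weights = 1.0 / (feats.std(axis=0) + 1e-12)
    diff = feats[:, None, :] - feats[None, :, :]
    dist = np.sqrt((weights * diff ** 2).sum(axis=-1))
    neighbors = dist < dist_threshold
    np.fill_diagonal(neighbors, False)
    centrality = neighbors.sum(axis=1)                      # |K(x)|

    # Log-likelihood under a normal distribution with independent dimensions.
    log_lik = -0.5 * (np.log(2 * np.pi * var) + (feats - mean) ** 2 / var).sum(axis=1)

    # log R(x) = log |K(x)| - log L(x); isolated samples get -inf.
    with np.errstate(divide="ignore"):
        log_importance = np.log(centrality) - log_lik
    ranked = [int(i) for i in np.argsort(-log_importance, kind="stable")]

    # Demote a sample to the end if a neighbor is ranked above it.
    kept, demoted, seen = [], [], []
    for idx in ranked:
        if any(neighbors[idx, j] for j in seen):
            demoted.append(idx)
        else:
            kept.append(idx)
        seen.append(idx)
    return (kept + demoted)[:budget]
```

The demotion step spreads the labeling budget across clusters instead of spending it on several near-duplicate samples from the same dense region.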
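The foregoing outlines the scheme of the present disclosure, which may be called a task-adaptive ISP tuning scheme. In embodiments of the present disclosure, the evaluation considers how well the ISP result fits the task; that is, the evaluation score reflects whether the ISP output is better suited to performing the task rather than merely the quality of the picture itself, so that the ISP is tuned for task performance instead of being limited by human perception. ISP tuning achieved in this way improves the performance of the task to which the ISP is applied; for a computer vision task, for example, the present disclosure can improve accuracy. Moreover, in embodiments of the present disclosure the image signal processor does not need manual tuning and can be tuned automatically by suitable equipment, which reduces the labor of manually tuning the image signal processor.
Furthermore, embodiments of the present disclosure make better use of existing data and models and save work. As an example, an appropriate evaluation score can be selected for given sample data, in particular selected and computed according to the annotation status of the data. Moreover, as described above, the model is trained in advance and remains unchanged during tuning, so changes in picture quality introduced by the image sensor have little effect on the existing machine learning model, which keeps the tuning stable.
Further, in embodiments of the present disclosure, sample data can be annotated appropriately to facilitate image sensor tuning. In particular, the annotation importance of samples can be determined in order to actively annotate the sample data, and the actively annotated data can then be used for tuning, achieving better optimization of the image sensor and in turn further improving task performance.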
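In the structural example of the apparatus above, the processing circuit 302 may take the form of a general-purpose processor or of a dedicated processing circuit such as an ASIC. For example, the processing circuit 302 can be constructed from a circuit (hardware) or a central processing device such as a central processing unit (CPU). In addition, the processing circuit 302 may carry a program (software) for operating the circuit (hardware) or the central processing device. The program can be stored in a memory (for example, a built-in memory) or in an externally connected storage medium, or downloaded via a network such as the Internet.
According to an embodiment of the present disclosure, the processing circuit 302 may include units for implementing the functions above, for example: a picture obtaining unit 304, which uses a simulator of the image signal processor to process sample pictures used for image signal processor optimization to obtain result pictures; an evaluation score obtaining unit 306, which obtains an evaluation score that is obtained based on a task model of the specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating the distribution deviation of the sample pictures; and a parameter adjustment unit 308, which adjusts the configuration parameters of the image signal processor based on the evaluation score. These units may be implemented in any suitable manner. As an example, the picture obtaining unit 304 may be implemented by an ISP simulator, and the evaluation score obtaining unit 306 may include a calculation unit 3061 that receives the processing results of the task model and computes the evaluation score; the calculation unit may also be located outside the evaluation score obtaining unit 306, outside the processing circuit 302, or even outside the optimization device 30.
Preferably, the parameter adjustment unit 308 may further include an optimization unit 3081, which updates the state of an optimizer using the evaluation score and adjusts the configuration parameters based on values generated by the updated optimizer. The optimization unit may be implemented in any suitable manner, for example as an optimizer whose input is the evaluation score, which updates its own state according to that score, and whose output is the values generated from the updated state.
Preferably, the processing circuit 302 may further include an annotation unit 310 configured to annotate sample pictures, in particular according to their importance or priority. Preferably, the annotation unit 310 may further include an importance calculation unit 3101, which computes the annotation importance of a sample from its concentration and approximate likelihood. The importance calculation unit 3101 need not be contained in the annotation unit 310; it may be located outside the annotation unit 310, outside the processing circuit 302, or even outside the optimization device 30.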
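Note that in Fig. 3A the calculation unit 3061, the optimization unit 3081, the annotation unit 310, and the importance calculation unit 3101 are drawn with dashed lines to indicate that these units are not necessarily contained in the processing circuit, or may not be present. Although Fig. 3A shows the units as discrete, one or more of them may be merged into a single unit or split into multiple units.
Note that the units above are merely logical modules divided according to the specific functions they implement and do not limit the specific implementation; they may be implemented in software, hardware, or a combination of the two. In an actual implementation, each unit may be implemented as an independent physical entity, or the units may be implemented by a single entity (for example, a processor such as a CPU or DSP, or an integrated circuit). In addition, units drawn with dashed lines in the drawings may not actually exist, and the operations or functions they implement may be performed by the processing circuit itself.
Fig. 3A shows only a schematic structural configuration of the optimization device for an image signal processor; the optimization device 30 may include other components, such as a memory, a network interface, and a controller, which are omitted for clarity. In particular, the processing circuit may be associated with a memory. For example, the processing circuit may be connected to the memory directly or indirectly (for example, through other intervening components) to access data related to image processing. The memory may store various data and/or information generated by the processing circuit 302. The memory may be located inside the optimization device but outside the processing circuit, or even outside the optimization device. The memory may be volatile and/or non-volatile, and may include, without limitation, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), read-only memory (ROM), and flash memory.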
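A flowchart of a method for optimizing an image signal processor according to an embodiment of the present disclosure is described below with reference to Fig. 3B. In method 300: in a picture obtaining step S301, a simulator of the image signal processor is used to process sample pictures used for image signal processor optimization to obtain result pictures; in an evaluation score obtaining step S303, an evaluation score is obtained that is based on a task model of the specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating the distribution deviation of the sample pictures; and in a parameter adjustment step S305, the configuration parameters of the image signal processor are adjusted based on the evaluation score.
Preferably, the parameter adjustment step S305 may further include an optimization step that updates the state of an optimizer using the evaluation score and adjusts the configuration parameters based on values generated by the updated optimizer. The optimization step may be implemented in any suitable manner, for example by an optimizer whose input is the evaluation score, which updates its own state according to that score, and whose output is the values generated from the updated state.
Preferably, method 300 may further include an annotation step, performed before the picture obtaining step, in which sample pictures are annotated, in particular according to their importance or priority. Preferably, method 300 may further include an importance calculation step that computes the annotation importance of a sample from its concentration and approximate likelihood. The importance calculation step may be part of the annotation step or separate from it. The optimization step and the annotation step may also be omitted from the tuning method of the present disclosure.
These steps may be performed by any suitable device or device element, for example the tuning device described above, the processing circuit in the tuning device, or the corresponding elements in the processing circuit. The image processing method according to embodiments of the present disclosure may also include other steps, such as the further processing described above, which may likewise be performed by suitable devices or device elements and is not described again here.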
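An exemplary implementation according to an embodiment of the present disclosure is described below with reference to the drawings, using a computer vision task as the example to explain the concept and effects of the present disclosure more clearly. Fig. 4A shows a conceptual block diagram of ISP parameter tuning according to an embodiment of the present disclosure, including the information flow during tuning. Fig. 4B shows an exemplary flowchart of ISP parameter tuning according to an embodiment of the present disclosure, which may be executed by the device framework in Fig. 4A.
In embodiments of the present disclosure, the parties involved in ISP parameter tuning may include an automatic tuning framework, the tuning samples input to that framework, and a computer vision model that interacts with it. The automatic tuning framework may include an ISP simulator, a black-box optimizer, and an evaluation metric calculation component, and may also include a computer (not shown) capable of running the required code and models.
Here, the automatic tuning framework may correspond to an exemplary implementation of the optimization device according to embodiments of the present disclosure. Compared with the automatic tuning framework, however, the optimization device of the present application may contain more or fewer components. As an example, the optimization device may omit the evaluation metric calculation component, with the evaluation score computed outside the optimization device and input to it.
In operation, the automatic tuning framework receives tuning samples, invokes the ISP simulator to process them, and outputs the processed pictures to the computer vision model. In some embodiments, the tuning samples may be input as a data stream, and the processed pictures output to the computer vision model as a data stream. Because the tuning dataset contains many samples, streaming input and output plays an important role in improving the running efficiency of the framework and saving system resources.
The tuning samples may be obtained in any suitable manner, for example from any suitable training set, and may contain any suitable images. The tuning samples obtained here may be original tuning samples that already carry their own annotation information, or annotations may be actively added through the annotation operation of the present application to further improve the tuning effect.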
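The ISP simulator processes tuning samples in the same way an ISP would, producing ISP-processed pictures as input to the computer vision model. The ISP simulator may be implemented in any suitable manner, for example as a software module. As an example, the ISP simulator takes raw sample pictures as input (for example, 24-bit Bayer pictures), provides an interface for adjusting parameters, and outputs ISP-processed pictures according to the parameters and the input pictures. Preferably, the ISP simulator used in the present invention corresponds to the hardware ISP of the Sony FUJI sensor, and its basic functions include demosaicing, white balance, noise reduction, sharpening, tone mapping, and bit-depth compression. The ISP parameters may be given suitable initial values, or may be set based on the optimizer as described above.
The computer vision model is a vision model applied to a specific task and may take any suitable form; the present disclosure uses convolutional neural networks (CNNs). In embodiments of the present disclosure, each CNN is a fully trained neural network capable of performing its specific task. Different models may be used for different tasks; optionally, the present invention uses YOLOv3 for object detection, Mask R-CNN for object segmentation and instance segmentation, and Deeplab-v3 for panoptic segmentation. Optionally, YOLOv3 is trained on the KITTI and COCO datasets, Mask R-CNN on the COCO dataset, and Deeplab-v3 on the COCO dataset.
In operation, the computer vision model takes ISP-processed pictures as input (for example, 3-channel 8-bit sRGB pictures) and outputs the results of the corresponding task, which may include picture recognition results. At the same time, the mean and variance of the per-channel distribution of the model's intermediate results (called activation values) for the result pictures, taken at or before the batch normalization (BN) layers, are recorded. The model outputs and activation values obtained in this way can be input to the automatic tuning framework for evaluation metric calculation.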
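Preferably, the samples may be processed in batches to improve computational efficiency. In some embodiments, the sample pictures are divided into multiple batches. For each batch, the simulator processes the sample pictures of that batch, and the resulting pictures are provided to the task model for processing; while the task model processes that batch, the simulator processes the sample pictures of the next batch. Running the simulator's picture processing in parallel with the task-model processing that produces the evaluation score further improves efficiency.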
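The overlap between simulating the next batch and evaluating the current one can be sketched with a single background worker. This is only an illustrative sketch: `simulate_isp` and `run_task_model` are hypothetical stand-ins for the ISP simulator and the task model, and the `gain` parameter is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_isp(batch, params):
    """Placeholder ISP simulator: scales every sample value by a gain parameter."""
    return [x * params["gain"] for x in batch]

def run_task_model(images):
    """Placeholder task model: reduces a batch of result pictures to one number."""
    return sum(images)

def process_in_batches(samples, params, batch_size):
    """Simulate batch k+1 in the background while the model handles batch k."""
    batches = [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
    outputs = []
    if not batches:
        return outputs
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(simulate_isp, batches[0], params)
        for next_batch in batches[1:] + [None]:
            images = pending.result()          # wait for the current batch
            if next_batch is not None:         # start simulating the next batch
                pending = pool.submit(simulate_isp, next_batch, params)
            outputs.append(run_task_model(images))
    return outputs

outputs = process_in_batches(list(range(8)), {"gain": 2}, batch_size=3)
```

With pure-Python placeholders the threads gain little, but the same structure applies when the simulator runs on the CPU and the task model runs on an accelerator.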
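The automatic tuning framework receives the values input from the model and computes the evaluation metric, which may be done as described above. In particular, the difference between data distributions can be evaluated: a distribution evaluation score, such as a KL divergence or an L1 norm, can be obtained from the mean and variance of the BN layers of the computer vision model. An evaluation metric can further be obtained from the annotation information contained in the sample pictures. In particular, if at least some sample pictures carry annotations, task performance can also be evaluated from the model outputs, giving a model evaluation score that is selected and determined in a suitable manner as described above. Preferably, the model evaluation score and the distribution evaluation score are combined in a weighted sum that serves as the evaluation score for the values currently generated by the optimizer.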
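A distribution evaluation score of this kind can be sketched as follows. The sketch assumes each channel is modelled as a one-dimensional Gaussian and uses the standard closed-form KL divergence between two Gaussians; the direction of the divergence, the averaging over channels, and the function names are assumptions and may differ from the formulas given earlier in the disclosure.

```python
import math

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) for one-dimensional Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

def kl_score(act_mean, act_var, bn_mean, bn_var):
    """Mean per-channel KL between activation statistics and stored BN statistics."""
    terms = [gaussian_kl(m, v, bm, bv)
             for m, v, bm, bv in zip(act_mean, act_var, bn_mean, bn_var)]
    return sum(terms) / len(terms)

def l1_score(act_mean, act_var, bn_mean, bn_var):
    """Mean per-channel L1 difference of means plus L1 difference of variances."""
    terms = [abs(m - bm) + abs(v - bv)
             for m, v, bm, bv in zip(act_mean, act_var, bn_mean, bn_var)]
    return sum(terms) / len(terms)

# Two channels: the first matches the BN statistics, the second has a shifted mean.
kl = kl_score([0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
l1 = l1_score([0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
```

The KL form depends on the difference of means and the ratio of variances, while the L1 form depends on the differences of both, which matches the two variants distinguished in the illustrative examples.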
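The computed evaluation score can be fed back as the evaluation metric to the black-box optimizer in the ISP automatic tuning framework, whose internal state is updated accordingly, so that the optimizer generates updated values from its updated state to obtain updated ISP parameters, thereby tuning the ISP parameters. In embodiments of the present disclosure, the black-box optimizer may be a CMA-ES optimizer whose objective is to improve model performance while reducing the difference between the distribution of the ISP-processed sample pictures and the likely distribution of the model's training set.
In some embodiments, for more stable optimization, the operations from invoking the ISP simulator on the tuning samples through computing the evaluation metric are preferably repeated a specific number of times, to obtain multiple sets of values generated by the current optimizer together with their evaluation scores. When the optimizer state is updated, these sets of values and their evaluation scores are input to the optimizer. The optimizer compares the evaluation scores and updates its internal state so that newly generated values are more likely to be close to values with high evaluation scores and far from values with low evaluation scores. As an example, the internal state of the optimizer may be moved closer to the internal state that produces values with high evaluation scores, for example set to the internal state corresponding to the highest evaluation score in the sequence, or to within a specific range of it.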
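The ask, evaluate, and update cycle described above can be illustrated with a deliberately simplified evolution strategy. This is not CMA-ES: it keeps only a mean vector and a single step size, moves the mean toward the best-scoring candidates, and has no covariance adaptation or explicit push away from poor candidates. The class name, the population and elite sizes, the decay factor, and the toy scoring function are all hypothetical.

```python
import random

class SimpleEvolutionStrategy:
    """Toy stand-in for a black-box optimizer with an ask/tell interface."""

    def __init__(self, dim, sigma=0.3, population=12, seed=0):
        self.rng = random.Random(seed)
        self.mean = [self.rng.random() for _ in range(dim)]  # random initial state
        self.sigma = sigma
        self.population = population

    def ask(self):
        """Generate one set of candidate values per population member."""
        return [[m + self.sigma * self.rng.gauss(0.0, 1.0) for m in self.mean]
                for _ in range(self.population)]

    def tell(self, candidates, scores):
        """Move the mean toward the candidates with the highest scores."""
        ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
        elite = [cand for _, cand in ranked[:max(1, self.population // 4)]]
        self.mean = [sum(column) / len(elite) for column in zip(*elite)]
        self.sigma *= 0.9  # shrink the search radius over time

def toy_score(values):
    """Hypothetical evaluation score: higher is better, best at (0.2, 0.8)."""
    target = [0.2, 0.8]
    return -sum((v - t) ** 2 for v, t in zip(values, target))

es = SimpleEvolutionStrategy(dim=2)
initial = toy_score(es.mean)
for _ in range(30):
    candidates = es.ask()
    es.tell(candidates, [toy_score(c) for c in candidates])
final = toy_score(es.mean)
```

In the tuning framework, `toy_score` would be replaced by the full pipeline: map each candidate to ISP parameters, run the simulator and the task model, and compute the evaluation score.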
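In this way, ISP parameters can be optimized automatically from the tuning samples through the information exchange flow. In some embodiments, ISP tuning may be performed iteratively as described above.
Some exemplary implementations according to embodiments of the present disclosure are described further below to illustrate how the scheme is realized.
Fig. 5A shows an exemplary flowchart of automatic ISP tuning according to an embodiment of the present disclosure. This embodiment uses an existing public dataset to demonstrate the improvement from automatic ISP tuning. As an example, the public dataset is KITTI, a dataset commonly used in autonomous driving, on which we perform object recognition. In this embodiment, about 80% of the KITTI dataset forms the training set used to train a Yolov3 object detection model, which corresponds to the task model of the present disclosure. The remaining 20% of the pictures are used to generate raw samples before ISP processing; below, "sample" refers specifically to a raw sample before ISP processing, and "picture" refers specifically to the output after a sample has been processed by the ISP. 256 samples are used to tune the ISP parameters, and the remaining samples are used to test the model's detection performance on ISP-processed pictures. To minimize randomness in testing, we randomly draw 10 groups of 256 samples from the 20% of the data for ISP tuning, and use the remaining samples to test the corresponding tuning result.
ISP f(θ) in Fig. 5A is the function representing the ISP simulator used in this embodiment. The simulator contains a denoiser based on bilateral and Gaussian filtering, edge enhancement based on high-pass filtering, and a tone mapper based on the Durand tone mapping algorithm, and it can simulate several important functions of the Sony Fuji Family ISP. To mimic the discrete nature of parameters in a hardware ISP, the parameters used in the ISP simulator are also discrete. This embodiment uses a CMA-ES optimizer as the automatic optimizer, set to generate 12 sets of parameters at a time and to update its internal state based on the evaluation scores of the pictures simulated with those 12 sets.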
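The evaluation score consists of three metrics: the mAP@0.5 value (hereinafter mAP), the mAR@det10 value (hereinafter mAR), and the KL divergence. In this embodiment, evaluation score = mAP + 0.1 × mAR − 0.1 × KL divergence. Different weights may be used for the metrics in different situations, but the weights of mAP and mAR must be positive and the weight of the KL divergence must be non-positive.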
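The weighted combination and its sign constraints can be written down directly. The function and argument names below are hypothetical; the default weights are the ones stated for this embodiment.

```python
def evaluation_score(m_ap, m_ar, kl, w_ap=1.0, w_ar=0.1, w_kl=-0.1):
    """Weighted sum of model metrics and the distribution score (higher is better)."""
    if w_ap <= 0 or w_ar <= 0:
        raise ValueError("mAP and mAR weights must be positive")
    if w_kl > 0:
        raise ValueError("the KL divergence weight must be non-positive")
    return w_ap * m_ap + w_ar * m_ar + w_kl * kl

# Example values chosen only for illustration.
score = evaluation_score(m_ap=0.6, m_ar=0.5, kl=0.2)
```

The KL term enters with a non-positive weight because a larger divergence means the ISP output has drifted further from the statistics the task model was trained on.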
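mAP may be computed in any suitable manner known in the art. As one example, the mAP value is computed as follows: 1. For a given class, first set a detection confidence threshold and discard model predictions below it. 2. For each remaining predicted box and each manually annotated box, compute the area of their intersection and the area of their union; if the intersection area is greater than 0.5 times the union area, the detection is counted as correct, otherwise as incorrect. 3. From the numbers of correct and incorrect detections in step 2, compute the corresponding precision and recall. 4. Varying the confidence threshold of step 1 yields a curve of precision as a function of recall; the area under this curve is the AP value of the class. Averaging the AP values of all classes gives the mAP value.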
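The four steps above can be sketched for a single image and a single class. This is a simplified illustration: it uses greedy one-to-one matching and a non-interpolated area under the precision-recall curve, whereas public benchmarks typically use interpolated variants, so its numbers will not match benchmark tooling exactly. Box format and function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def average_precision(predictions, ground_truth, iou_threshold=0.5):
    """AP for one class; predictions are (confidence, box) pairs."""
    if not ground_truth:
        return 0.0
    matched, hits = set(), []
    # Walking predictions by descending confidence is equivalent to sweeping
    # the confidence threshold from high to low.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx not in matched and iou(box, gt) > best_iou:
                best_iou, best_idx = iou(box, gt), idx
        if best_idx is not None and best_iou > iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)
    ap, true_positives, prev_recall = 0.0, 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            recall = true_positives / len(ground_truth)
            ap += (true_positives / rank) * (recall - prev_recall)
            prev_recall = recall
    return ap

def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
ap = average_precision(predictions, ground_truth)
```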
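mAR may be computed in any suitable manner known in the art. As one example, mAR is computed similarly to mAP, except that the average recall is computed instead of the area under the curve.
The KL divergence may be computed in any suitable manner, such as the manner described above.
Fig. 5B shows the model's prediction results on ISP-processed pictures with manually adjusted ISP parameters and with automatically tuned ISP parameters. We evaluate the model with mAP and mAR and consider tuning with pictures of different sizes (small: 416×416 pixels; large: 640×640 pixels). For all 10 different splits of tuning and test data, automatic ISP parameter tuning performs significantly better than manual tuning. Fig. 5C shows some comparison samples; because manual tuning considers only human visual perception, its processed pictures differ noticeably from the automatically tuned ones.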
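An example of unsupervised automatic ISP tuning according to the present disclosure, which can improve model performance, is described below.
This embodiment may use the same dataset split and the same model as the preceding embodiment, but the 256 tuning samples are used without their labels, simulating data that has no manual annotation. The ISP simulator and optimizer are the same as in the preceding embodiment. The evaluation score is given by either the KL divergence or the L1 norm, and mAP and mAR are not used. This embodiment also compares unsupervised tuning based on the L1 norm, which may be computed as described above.
Specifically, testing uses 416×416-pixel pictures that were used in neither training nor tuning, and mAP@0.5 measures the model's performance under different ISP parameters; the results are shown in Fig. 6. Tuning based on the KL divergence performs better than tuning based on the L1 norm difference, and both outperform manually tuned ISP parameters, which shows that unsupervised automatic ISP tuning is effective in improving the performance of computer vision models.
Another embodiment according to the present disclosure, involving semi-supervised automatic ISP tuning that can improve model performance, is described below. Here at least some of the sample pictures have been annotated, so the corresponding evaluation scores can be determined from the annotation information of the pictures.
This embodiment uses the same dataset split and the same model as the preceding embodiments, but only 16 of the 256 tuning samples are annotated; the remaining samples are used without their labels, simulating data that is only partially annotated by hand. The ISP simulator and optimizer are the same as in the preceding embodiments. For annotated samples the evaluation score is the same as in 3.1, and for unannotated samples the evaluation score is the KL divergence. The results are shown in Fig. 7. Automatically tuned ISP parameters give better model performance than manual tuning, and semi-supervised tuning also performs better than tuning with only the small number of annotated samples, coming closer to the model performance obtained when all 256 samples are annotated.
Another embodiment according to the present disclosure, in which data annotated in order of annotation importance is used for tuning, is described below; it compares the cases without and with importance-based annotation. This embodiment uses the same model, ISP simulator, optimizer, and training set as the preceding embodiments. There are two cases for the tuning set. Case 1 is the same as in the preceding embodiments: 10 randomly drawn groups. Case 2 first computes the annotation importance of each sample, sorts the samples by annotation importance, and annotates the first predetermined number of samples. The results are illustrated in Fig. 8. In general, tuning with the case 1 tuning set performs slightly worse than with case 2, which shows that annotating data in this priority order gives better results for the same amount of annotation.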
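Automatic ISP tuning according to still other embodiments of the present disclosure is described further below, in which the results of automatic ISP tuning can inform the design of ISP modules.
This embodiment uses the same dataset split and the same model as the preceding embodiments, but considers several different ISP module designs and compares the model performance of each design after tuning. Because automatic ISP tuning efficiently finds the best parameter configuration for a given design, it provides a good experimental reference for evaluating different ISP module designs. In this embodiment, we test four different ISP simulators corresponding to different ISP module designs. The designs may be, without limitation: ISP1, a Gaussian filter and gamma transform; ISP2, non-local means filtering, high-pass filtering, and Durand tone mapping; ISP3, bilateral filtering, high-pass filtering, contrast compression, and global tone mapping; ISP4, the same as the ISP simulator in 3.1. Different functional effects may correspond to different ISP modules. The results are illustrated in Fig. 9, which includes two performance metrics (left: mAP; right: mAR) and two picture sizes (416×416 pixels and 640×640 pixels); the lines in the figure are the means over 10 tuning runs on different data, and the shaded regions around them are the standard deviations. Because ISPs with different module designs reach different model performance after tuning, this experimental data can serve as a reference for ISP design.
In some embodiments, the optimization device of the present disclosure may be integrated into any device containing an image signal processor (ISP), such as a photographic device or other image acquisition or processing device, for example in the form of an integrated circuit or processor, or even within the device's existing processing circuit. It may also be detachably connected to the device as a separate component, for example as a standalone module or together with other components that can be attached to and detached from the device. In some embodiments, it may even be located on a remote device with which the device can communicate.
In some embodiments, the scheme of the present disclosure may be implemented as a software algorithm and can therefore be conveniently integrated into various types of devices containing an image signal processor (ISP), for example video cameras, still cameras such as single-lens reflex cameras and mirrorless cameras, portable photographic devices, and other image acquisition or processing devices. In particular, the method of the present disclosure may be executed by the processor of a photographic device as a computer program, instructions, or the like, in order to tune the ISP.
According to an embodiment of the present disclosure, a photographic device is provided that includes: an image signal processor that generates an image based on an electrical signal into which an image sensor converts the light captured by the photographic device; and an optimization device that optimizes the image signal processor. The optimization device may be implemented in any suitable manner, in particular as the optimization device for an image signal processor according to the present disclosure described above. Although not shown, the photographic device may also include a lens unit, photographic filters, and the like, which process the captured light. Although not shown, the image acquisition apparatus may also contain other components, as long as the image to be processed can be obtained. Fig. 10 shows a photographic device according to an embodiment of the present disclosure, in which the photographic device 1000 includes an image signal processor 1002 and an optimization device 1004.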
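It should also be understood that the series of processes and devices above may be implemented in software and/or firmware. In that case, a program constituting the software is installed from a storage medium or a network onto a computer with a dedicated hardware structure, for example the general-purpose personal computer 1100 shown in Fig. 11, which can perform various functions when various programs are installed. Fig. 11 is a block diagram showing an example structure of a personal computer that can serve as the optimization device in embodiments of the present disclosure. In one example, the personal computer may correspond to the exemplary optimization device according to the present disclosure described above.
In Fig. 11, a central processing unit (CPU) 1101 performs various processes according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 into a random access memory (RAM) 1103. The RAM 1103 also stores, as needed, data required when the CPU 1101 performs the various processes.
The CPU 1101, the ROM 1102, and the RAM 1103 are connected to one another via a bus 1104. An input/output interface 1105 is also connected to the bus 1104.
The following components are connected to the input/output interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including a display, such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, and the like. The communication section 1109 performs communication processing via a network such as the Internet.
A drive 1110 is also connected to the input/output interface 1105 as needed. A removable medium 1111, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 1110 as needed, so that a computer program read from it is installed into the storage section 1108 as needed.
When the series of processes above is implemented in software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1111.
Those skilled in the art will understand that such a storage medium is not limited to the removable medium 1111 shown in Fig. 11, which stores the program and is distributed separately from the device to provide the program to the user. Examples of the removable medium 1111 include a magnetic disk (including a floppy disk (registered trademark)), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a MiniDisc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 1102, a hard disk contained in the storage section 1108, or the like, which stores the program and is distributed to the user together with the device containing it.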
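The methods and devices described herein may be implemented as software, firmware, hardware, or any combination of these. Some components may, for example, be implemented as software running on a digital signal processor or a microprocessor. Other components may, for example, be implemented as hardware and/or application-specific integrated circuits.
In addition, the method and system of the present invention may be carried out in many ways, for example by software, hardware, firmware, or any combination of these. The order of the method steps described above is merely illustrative, and unless specifically stated otherwise, the steps of the method of the present invention are not limited to the order specifically described above. Furthermore, in some embodiments, the present invention may also be embodied as a program recorded in a recording medium, comprising machine-readable instructions for implementing the method according to the present invention. The present invention therefore also covers a recording medium storing a program for implementing the method according to the present invention. Such storage media may include, without limitation, floppy disks, optical discs, magneto-optical discs, memory cards, and memory sticks.
Those skilled in the art will appreciate that the boundaries between the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be performed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be changed in various other embodiments. Other modifications, variations, and substitutions are likewise possible. Accordingly, this specification and the drawings are to be regarded as illustrative rather than restrictive.
In addition, embodiments of the present disclosure may also include the following illustrative examples (EE).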
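EE 1. An optimization device for an image signal processor (ISP), the optimization device comprising a processing circuit configured to:
process, using a simulator of the image signal processor, sample pictures used for image signal processor optimization to obtain result pictures;
obtain an evaluation score that is obtained based on a task model of a specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample pictures; and
adjust configuration parameters of the image signal processor based on the evaluation score.
EE 2. The optimization device according to EE 1, wherein the distribution evaluation score indicates a distribution difference between a training set of the task model and the sample pictures.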
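EE 3. The optimization device according to EE 1 or 2, wherein the distribution evaluation score is computed based on at least one of statistical features of a batch normalization layer contained in the task model and operation results of the result pictures at the batch normalization layer.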
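EE 4. The optimization device according to EE 3, wherein the statistical features of the batch normalization layer include the mean and variance of the batch normalization layer.
EE 5. The optimization device according to EE 3, wherein the operation results of the result pictures at the batch normalization layer include the mean and variance of the per-channel distribution of the operation results.
EE 6. The optimization device according to any one of EE 3-5, wherein the distribution evaluation score is computed based on the difference between the mean of the batch normalization layer and the mean of the operation results, and the ratio of the variance of the batch normalization layer to the variance of the operation results.
EE 7. The optimization device according to any one of EE 3-5, wherein the distribution evaluation score is computed based on the difference between the mean of the batch normalization layer and the mean of the operation results, and the difference between the variance of the batch normalization layer and the variance of the operation results.
EE 8. The optimization device according to any one of EE 1-7, wherein the distribution evaluation score is at least one selected from the group consisting of an L-norm, Kullback-Leibler (KL) divergence, Jensen-Shannon (JS) divergence, and Wasserstein distance.
EE 9. The optimization device according to any one of EE 1-8, wherein the distribution evaluation score is obtained for unannotated sample pictures.
EE 10. The optimization device according to any one of EE 1-9, wherein the evaluation score further includes a model evaluation score based on a model output obtained by the task model operating on the result pictures.
EE 11. The optimization device according to EE 10, wherein the model evaluation score is at least one selected from the group consisting of an F1 value, a mean Average Precision (mAP) value, a mean Average Recall (mAR) value, an Intersection over Union (IoU) value, a Dice coefficient, and a Panoptic Quality (PQ) value.
EE 12. The optimization device according to EE 10 or 11, wherein the model evaluation score is computed for annotated sample pictures based on the annotation information they contain.
EE 13. The optimization device according to any one of EE 10-12, wherein the evaluation score is computed based on a weighted sum of the distribution evaluation score and the model evaluation score.
EE 14. The optimization device according to EE 1, wherein the parameters and functional effects of the simulator correspond one-to-one to the parameters and functional effects of the image signal processor.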
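EE 15. The optimization device according to EE 1, wherein the processing circuit is further configured to:
adjust the configuration parameters so that the task effect achieved in completing the specific task with the adjusted configuration parameters is better.
EE 16. The optimization device according to EE 15, wherein the adjusted configuration parameters are closer to configuration parameters that lead to a better evaluation score.
EE 17. The optimization device according to EE 1 or 2, wherein the processing circuit is further configured to:
obtain multiple sets of evaluation scores, which correspond respectively to multiple sets of configuration parameters and are obtained by processing the sample pictures with the multiple sets of configuration parameters for operation by the task model; and
adjust the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the configuration parameters corresponding to the worse evaluation scores among them.
EE 18. The optimization device according to any one of EE 1-17, wherein the processing circuit is further configured to iteratively perform the adjustment of the configuration parameters of the image signal processor.
EE 19. The optimization device according to EE 18, wherein the iteration termination condition includes at least one of the following:
stopping the iteration when the number of iterations reaches a preset threshold;
stopping the iteration when the evaluation score of an iteration is no longer better than the evaluation score of the previous iteration; and
stopping the iteration when the evaluation score after a specific number of iterations is no longer better than the evaluation score of the preceding specific number of iterations.
EE 20. The optimization device according to EE 1, wherein the configuration parameters of the image signal processor are obtained by processing values generated by an optimizer so that they satisfy the parameter requirements of the image signal processor.
EE 21. The optimization device according to EE 20, wherein the processing circuit is further configured to:
update the state of the optimizer using the evaluation score; and
adjust the configuration parameters based on values generated by the updated optimizer.
EE 22. The optimization device according to EE 21, wherein the optimizer is updated so that the values generated by the updated optimizer are closer to the values corresponding to better evaluation scores.
EE 23. The optimization device according to any one of EE 20-22, wherein the processing circuit is further configured to:
obtain multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer; and
update the state of the optimizer so that the values generated by the updated optimizer are closer to the values corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the values corresponding to the worse evaluation scores among them.
EE 24. The optimization device according to any one of EE 20-23, wherein the optimizer is a black-box optimizer.
EE 25. The optimization device according to any one of EE 20-23, wherein the optimizer is a CMA-ES optimizer.
EE 26. The optimization device according to any one of EE 1-25, wherein the task model is a model for a specific task that has been trained on a large-scale dataset, and the output of the task model is the execution result of the corresponding task.
EE 27. The optimization device according to any one of EE 1-26, wherein the sample pictures comprise a plurality of sample pictures, and the processing circuit is further configured to process the samples in batches,
wherein the plurality of sample pictures are divided into a plurality of batches,
wherein, for each batch of sample pictures, the simulator processes the sample pictures of that batch and the resulting pictures are provided to the task model for operation; and, at the same time, the simulator processes the sample pictures of the next batch.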
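EE 28. The optimization device according to any one of EE 1-27, wherein the processing circuit is further configured to:
sort the samples by the annotation importance of the sample pictures, and
annotate the first predetermined number of sample pictures for use in training.
EE 29. The optimization device according to EE 28, wherein the processing circuit is further configured to:
compute the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to that sample, and adjacent samples are those whose image features are separated by a distance below a threshold;
compute the approximate likelihood of each sample, where the approximate likelihood is computed using the image features of the sample and the mean and variance of the corresponding batch normalization layer; and
determine the annotation importance of the sample as the ratio of the absolute value of the centrality to the approximate likelihood value.
EE 30. An optimization method for an image signal processor (ISP), comprising:
processing, using a simulator of the image signal processor, sample pictures used for image signal processor optimization to obtain result pictures;
obtaining an evaluation score that is obtained based on a task model of a specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample pictures; and
adjusting configuration parameters of the image signal processor based on the evaluation score.
EE 31. The method according to EE 30, wherein the distribution evaluation score indicates a distribution difference between a training set of the task model and the sample pictures.
EE 32. The method according to EE 30 or 31, wherein the evaluation score further includes a model evaluation score based on a model output obtained by the task model operating on the result pictures.
EE 33. The method according to EE 30, further comprising:
adjusting the configuration parameters so that the task effect achieved in completing the specific task with the adjusted configuration parameters is better.
EE 34. The method according to EE 30, further comprising:
obtaining multiple sets of evaluation scores, which correspond respectively to multiple sets of configuration parameters and are obtained by processing the sample pictures with the multiple sets of configuration parameters for operation by the task model; and
adjusting the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the configuration parameters corresponding to the worse evaluation scores among them.
EE 35. The method according to any one of EE 30-34, further comprising: iteratively performing the adjustment of the configuration parameters of the image signal processor.
EE 36. The method according to EE 35, wherein the iteration termination condition includes at least one of the following:
stopping the iteration when the number of iterations reaches a preset threshold;
stopping the iteration when the evaluation score of an iteration is no longer better than the evaluation score of the previous iteration; and
stopping the iteration when the evaluation score after a specific number of iterations is no longer better than the evaluation score of the preceding specific number of iterations.
EE 37. The method according to EE 30, wherein the configuration parameters of the image signal processor are obtained by processing values generated by an optimizer so that they satisfy the parameter requirements of the image signal processor.
EE 38. The method according to EE 37, further comprising:
updating the state of the optimizer using the evaluation score; and
adjusting the configuration parameters based on values generated by the updated optimizer.
EE 39. The method according to EE 37 or 38, further comprising:
obtaining multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer; and
updating the state of the optimizer so that the values generated by the updated optimizer are closer to the values corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the values corresponding to the worse evaluation scores among them.
EE 40. The method according to any one of EE 30-39, wherein the sample pictures comprise a plurality of sample pictures, and the method further comprises processing the samples in batches,
wherein the plurality of sample pictures are divided into a plurality of batches,
wherein, for each batch of sample pictures, the simulator processes the sample pictures of that batch and the resulting pictures are provided to the task model for operation, and, at the same time, the simulator processes the sample pictures of the next batch.
EE 41. The method according to any one of EE 30-40, further comprising:
sorting the samples by the annotation importance of the sample pictures, and
annotating the first predetermined number of sample pictures for use in training.
EE 42. The method according to EE 41, further comprising:
computing the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to that sample, and adjacent samples are those whose image features are separated by a distance below a threshold;
computing the approximate likelihood of each sample, where the approximate likelihood is computed using the image features of the sample and the mean and variance of the corresponding batch normalization layer; and
determining the annotation importance of the sample as the ratio of the absolute value of the centrality to the approximate likelihood value.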
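EE 43. A photographic device, comprising:
an image signal processor that generates an image based on an electrical signal into which an image sensor converts the light captured by the photographic device, and
the optimization device according to any one of EE 1-29, which optimizes the image signal processor.
EE 44. A device, comprising
at least one processor; and
at least one storage device storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the optimization method according to any one of EE 30-42.
EE 45. A storage medium storing instructions that, when executed by a processor, cause the optimization method according to any one of EE 30-42 to be performed.
EE 46. A program product containing instructions that, when executed by a processor, cause the optimization method according to any one of EE 30-42 to be performed.
EE 47. A computer program comprising instructions that, when executed by a computer, cause the computer to perform the optimization method according to any one of EE 30-42.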
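Although the present disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made without departing from the spirit and scope of the present disclosure as defined by the appended claims. Moreover, the terms "include", "comprise", and any other variants thereof in the embodiments of the present disclosure are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Although some specific embodiments of the present disclosure have been described in detail, those skilled in the art should understand that the embodiments above are merely illustrative and do not limit the scope of the present disclosure, and that they may be combined, modified, or substituted without departing from the scope and essence of the present disclosure. The scope of the present disclosure is defined by the appended claims.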

Claims (46)

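  1. An optimization device for an image signal processor (ISP), the optimization device comprising a processing circuit configured to:
    process, using a simulator of the image signal processor, sample pictures used for image signal processor optimization to obtain result pictures;
    obtain an evaluation score that is obtained based on a task model of a specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample pictures; and
    adjust configuration parameters of the image signal processor based on the evaluation score.
  2. The optimization device according to claim 1, wherein the distribution evaluation score indicates a distribution difference between a training set of the task model and the sample pictures.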
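  3. The optimization device according to claim 1 or 2, wherein the distribution evaluation score is computed based on at least one of statistical features of a batch normalization layer contained in the task model and operation results of the result pictures at the batch normalization layer.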
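  4. The optimization device according to claim 3, wherein the statistical features of the batch normalization layer include the mean and variance of the batch normalization layer.
  5. The optimization device according to claim 3, wherein the operation results of the result pictures at the batch normalization layer include the mean and variance of the per-channel distribution of the operation results.
  6. The optimization device according to any one of claims 3-5, wherein the distribution evaluation score is computed based on the difference between the mean of the batch normalization layer and the mean of the operation results, and the ratio of the variance of the batch normalization layer to the variance of the operation results.
  7. The optimization device according to any one of claims 3-5, wherein the distribution evaluation score is computed based on the difference between the mean of the batch normalization layer and the mean of the operation results, and the difference between the variance of the batch normalization layer and the variance of the operation results.
  8. The optimization device according to any one of claims 1-7, wherein the distribution evaluation score is at least one selected from the group consisting of an L-norm, Kullback-Leibler (KL) divergence, Jensen-Shannon (JS) divergence, and Wasserstein distance.
  9. The optimization device according to any one of claims 1-8, wherein the distribution evaluation score is obtained for unannotated sample pictures.
  10. The optimization device according to any one of claims 1-9, wherein the evaluation score further includes a model evaluation score based on a model output obtained by the task model operating on the result pictures.
  11. The optimization device according to claim 10, wherein the model evaluation score is at least one selected from the group consisting of an F1 value, a mean Average Precision (mAP) value, a mean Average Recall (mAR) value, an Intersection over Union (IoU) value, a Dice coefficient, and a Panoptic Quality (PQ) value.
  12. The optimization device according to claim 10 or 11, wherein the model evaluation score is computed for annotated sample pictures based on the annotation information they contain.
  13. The optimization device according to any one of claims 10-12, wherein the evaluation score is computed based on a weighted sum of the distribution evaluation score and the model evaluation score.
  14. The optimization device according to claim 1, wherein the parameters and functional effects of the simulator correspond one-to-one to the parameters and functional effects of the image signal processor.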
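  15. The optimization device according to claim 1, wherein the processing circuit is further configured to:
    adjust the configuration parameters so that the task effect achieved in completing the specific task with the adjusted configuration parameters is better.
  16. The optimization device according to claim 15, wherein the adjusted configuration parameters are closer to configuration parameters that lead to a better evaluation score.
  17. The optimization device according to claim 1 or 2, wherein the processing circuit is further configured to:
    obtain multiple sets of evaluation scores, which correspond respectively to multiple sets of configuration parameters and are obtained by processing the sample pictures with the multiple sets of configuration parameters for operation by the task model; and
    adjust the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the configuration parameters corresponding to the worse evaluation scores among them.
  18. The optimization device according to any one of claims 1-17, wherein the processing circuit is further configured to iteratively perform the adjustment of the configuration parameters of the image signal processor.
  19. The optimization device according to claim 18, wherein the iteration termination condition includes at least one of the following:
    stopping the iteration when the number of iterations reaches a preset threshold;
    stopping the iteration when the evaluation score of an iteration is no longer better than the evaluation score of the previous iteration; and
    stopping the iteration when the evaluation score after a specific number of iterations is no longer better than the evaluation score of the preceding specific number of iterations.
  20. The optimization device according to claim 1, wherein the configuration parameters of the image signal processor are obtained by processing values generated by an optimizer so that they satisfy the parameter requirements of the image signal processor.
  21. The optimization device according to claim 20, wherein the processing circuit is further configured to:
    update the state of the optimizer using the evaluation score; and
    adjust the configuration parameters based on values generated by the updated optimizer.
  22. The optimization device according to claim 21, wherein the optimizer is updated so that the values generated by the updated optimizer are closer to the values corresponding to better evaluation scores.
  23. The optimization device according to any one of claims 20-22, wherein the processing circuit is further configured to:
    obtain multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer; and
    update the state of the optimizer so that the values generated by the updated optimizer are closer to the values corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the values corresponding to the worse evaluation scores among them.
  24. The optimization device according to any one of claims 20-23, wherein the optimizer is a black-box optimizer.
  25. The optimization device according to any one of claims 20-23, wherein the optimizer is a CMA-ES optimizer.
  26. The optimization device according to any one of claims 1-25, wherein the task model is a model for a specific task that has been trained on a large-scale dataset, and the output of the task model is the execution result of the corresponding task.
  27. The optimization device according to any one of claims 1-26, wherein the sample pictures comprise a plurality of sample pictures, and the processing circuit is further configured to process the samples in batches,
    wherein the plurality of sample pictures are divided into a plurality of batches,
    wherein, for each batch of sample pictures, the simulator processes the sample pictures of that batch and the resulting pictures are provided to the task model for operation; and, at the same time, the simulator processes the sample pictures of the next batch.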
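  28. The optimization device according to any one of claims 1-27, wherein the processing circuit is further configured to:
    sort the samples by the annotation importance of the sample pictures, and
    annotate the first predetermined number of sample pictures for use in training.
  29. The optimization device according to claim 28, wherein the processing circuit is further configured to:
    compute the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to that sample, and adjacent samples are those whose image features are separated by a distance below a threshold;
    compute the approximate likelihood of each sample, where the approximate likelihood is computed using the image features of the sample and the mean and variance of the corresponding batch normalization layer; and
    determine the annotation importance of the sample as the ratio of the absolute value of the centrality to the approximate likelihood value.
  30. An optimization method for an image signal processor (ISP), comprising:
    processing, using a simulator of the image signal processor, sample pictures used for image signal processor optimization to obtain result pictures;
    obtaining an evaluation score that is obtained based on a task model of a specific task to which the image signal processor is applied and that evaluates how well the specific task performs on the sample pictures, the evaluation score including a distribution evaluation score indicating a distribution deviation of the sample pictures; and
    adjusting configuration parameters of the image signal processor based on the evaluation score.
  31. The method according to claim 30, wherein the distribution evaluation score indicates a distribution difference between a training set of the task model and the sample pictures.
  32. The method according to claim 30 or 31, wherein the evaluation score further includes a model evaluation score based on a model output obtained by the task model operating on the result pictures.
  33. The method according to claim 30, further comprising:
    adjusting the configuration parameters so that the task effect achieved in completing the specific task with the adjusted configuration parameters is better.
  34. The method according to claim 30, further comprising:
    obtaining multiple sets of evaluation scores, which correspond respectively to multiple sets of configuration parameters and are obtained by processing the sample pictures with the multiple sets of configuration parameters for operation by the task model; and
    adjusting the configuration parameters of the image signal processor so that the adjusted configuration parameters are closer to the configuration parameters corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the configuration parameters corresponding to the worse evaluation scores among them.
  35. The method according to any one of claims 30-34, further comprising: iteratively performing the adjustment of the configuration parameters of the image signal processor.
  36. The method according to claim 35, wherein the iteration termination condition includes at least one of the following:
    stopping the iteration when the number of iterations reaches a preset threshold;
    stopping the iteration when the evaluation score of an iteration is no longer better than the evaluation score of the previous iteration; and
    stopping the iteration when the evaluation score after a specific number of iterations is no longer better than the evaluation score of the preceding specific number of iterations.
  37. The method according to claim 30, wherein the configuration parameters of the image signal processor are obtained by processing values generated by an optimizer so that they satisfy the parameter requirements of the image signal processor.
  38. The method according to claim 37, further comprising:
    updating the state of the optimizer using the evaluation score; and
    adjusting the configuration parameters based on values generated by the updated optimizer.
  39. The method according to claim 37 or 38, further comprising:
    obtaining multiple sets of evaluation scores corresponding to multiple sets of values generated by the optimizer; and
    updating the state of the optimizer so that the values generated by the updated optimizer are closer to the values corresponding to the better evaluation scores among the multiple sets of evaluation scores and farther from the values corresponding to the worse evaluation scores among them.
  40. The method according to any one of claims 30-39, wherein the sample pictures comprise a plurality of sample pictures, and the method further comprises processing the samples in batches,
    wherein the plurality of sample pictures are divided into a plurality of batches,
    wherein, for each batch of sample pictures, the simulator processes the sample pictures of that batch and the resulting pictures are provided to the task model for operation, and, at the same time, the simulator processes the sample pictures of the next batch.
  41. The method according to any one of claims 30-40, further comprising:
    sorting the samples by the annotation importance of the sample pictures, and
    annotating the first predetermined number of sample pictures for use in training.
  42. The method according to claim 41, further comprising:
    computing the centrality of each sample, where the centrality of a sample indicates the number of samples adjacent to that sample, and adjacent samples are those whose image features are separated by a distance below a threshold;
    computing the approximate likelihood of each sample, where the approximate likelihood is computed using the image features of the sample and the mean and variance of the corresponding batch normalization layer; and
    determining the annotation importance of the sample as the ratio of the absolute value of the centrality to the approximate likelihood value.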
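  43. A photographic device, comprising:
    an image signal processor that generates an image based on an electrical signal into which an image sensor converts the light captured by the photographic device, and
    the optimization device according to any one of claims 1-29, which optimizes the image signal processor.
  44. A device, comprising
    at least one processor; and
    at least one storage device storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the optimization method according to any one of claims 30-42.
  45. A storage medium storing instructions that, when executed by a processor, cause the optimization method according to any one of claims 30-42 to be performed.
  46. A program product containing instructions that, when executed by a processor, cause the optimization method according to any one of claims 30-42 to be performed.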
PCT/CN2022/113673 2021-08-23 2022-08-19 Image signal processor optimization method and device WO2023025063A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202280056428.0A CN118159995A (zh) 2021-08-23 2022-08-19 Image signal processor optimization method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110965449.1A CN115719440A (zh) 2021-08-23 2021-08-23 Image signal processor optimization method and device
CN202110965449.1 2021-08-23

Publications (1)

Publication Number Publication Date
WO2023025063A1 true WO2023025063A1 (zh) 2023-03-02

Family

ID=85253303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/113673 WO2023025063A1 (zh) 2021-08-23 2022-08-19 Image signal processor optimization method and device

Country Status (2)

Country Link
CN (2) CN115719440A (zh)
WO (1) WO2023025063A1 (zh)

Also Published As

Publication number Publication date
CN115719440A (zh) 2023-02-28
CN118159995A (zh) 2024-06-07

Similar Documents

Publication Publication Date Title
US11544831B2 (en) Utilizing an image exposure transformation neural network to generate a long-exposure image from a single short-exposure image
US9978003B2 (en) Utilizing deep learning for automatic digital image segmentation and stylization
CN111062871B (zh) 一种图像处理方法、装置、计算机设备及可读存储介质
CN109583483B (zh) 一种基于卷积神经网络的目标检测方法和系统
US12100192B2 (en) Method, apparatus, and electronic device for training place recognition model
WO2019242416A1 (zh) 视频图像处理方法及装置、计算机可读介质和电子设备
CN109284733B (zh) 一种基于yolo和多任务卷积神经网络的导购消极行为监控方法
CN110532970B (zh) 人脸2d图像的年龄性别属性分析方法、系统、设备和介质
US8571332B2 (en) Methods, systems, and media for automatically classifying face images
US20160034788A1 (en) Learning image categorization using related attributes
CN111738243A (zh) 人脸图像的选择方法、装置、设备及存储介质
US20170185870A1 (en) Method of image processing
WO2023206944A1 (zh) 一种语义分割方法、装置、计算机设备和存储介质
ur Rehman et al. DeepRPN-BIQA: Deep architectures with region proposal network for natural-scene and screen-content blind image quality assessment
US11934958B2 (en) Compressing generative adversarial neural networks
TWI803243B (zh) 圖像擴增方法、電腦設備及儲存介質
CN109325435B (zh) 基于级联神经网络的视频动作识别及定位方法
WO2023160645A1 (zh) 图像增强方法及设备
CN110533046A (zh) 一种图像实例分割方法和装置
CN116670687A (zh) 用于调整训练后的物体检测模型以适应域偏移的方法和系统
CN116543261A (zh) 用于图像识别的模型训练方法、图像识别方法设备及介质
CN111667495A (zh) 一种图像场景解析方法和装置
WO2023025063A1 (zh) 图像信号处理器优化方法及设备
CN115798005A (zh) 基准照片的处理方法及装置、处理器和电子设备
US20210224652A1 (en) Methods and systems for performing tasks on media using attribute specific joint learning

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202280056428.0

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22860409

Country of ref document: EP

Kind code of ref document: A1