WO2022205058A1

WO2022205058A1 - Method and apparatus for determining image processing mode

Info

Publication number: WO2022205058A1
Application number: PCT/CN2021/084373
Authority: WO
Inventors: 林永兵; 马莎; 罗达新; 高鲁涛
Original assignee: 华为技术有限公司
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2022-10-06
Also published as: CN113366531A

Abstract

The present application relates to a method and apparatus for determining an image processing mode, which can be applied to assistant driving and autonomous driving. The method comprises: according to a precision threshold value of a service requirement and a first correspondence, determining a distortion threshold value corresponding to the precision threshold value, wherein the first correspondence is a correspondence between precision and a distortion degree; and according to the distortion threshold value and a second correspondence, determining a code rate threshold value corresponding to the distortion threshold value, wherein the second correspondence is a correspondence between the distortion degree and a code rate. By means of the method provided by the embodiments of the present application, the decoupling of a compression algorithm and AI processing is realized. If a new compression algorithm needs to be evaluated, there is no need to perform end-to-end evaluation, such that the efficiency of evaluating an autonomous driving or assistant driving system can be improved. The method can be applied to the Internet of vehicles, such as vehicle-to-everything (V2X), long term evolution-vehicle (LTE-V) and vehicle-to-vehicle (V2V).

Description

Method and device for determining image processing mode

Technical field

The present application relates to the field of image technology, and in particular to a method and device for determining an image processing mode.

Background

With the development of society, intelligent terminals such as intelligent transportation equipment, smart home equipment, and robots are gradually entering people's daily lives. Sensors play a very important role in intelligent terminals. The various sensors installed on an intelligent terminal, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, perceive the surrounding environment while the terminal moves, collect data, identify and track moving objects, recognize static scene elements such as lane lines and signs, and combine this information with navigator and map data for path planning. Sensors can detect possible dangers in advance and assist with, or even autonomously take, the necessary evasive measures, effectively increasing the safety and comfort of intelligent terminals.

The camera offers high resolution, non-contact sensing, ease of use, and low cost, and is an essential sensor for environmental perception in autonomous driving. More and more cameras can be installed on a vehicle. During autonomous driving, the cameras capture images of the environment, and machine vision processing is performed on those images to identify obstacles or targets, so as to achieve coverage without blind spots.

As camera resolution, frame rate, sampling depth, and other parameters continue to improve, the video output by the camera demands ever more transmission bandwidth. FIG. 1a is a schematic diagram of data transmission in a compression-based perception system in the related art. As shown in Figure 1a, the perception system includes a camera and an image signal processor (ISP), and the perception system transmits the processed image data to the mobile data center (MDC) for further processing. Specifically, the Bayer RAW image output by the camera is processed by the ISP and sent to the MDC, and the MDC performs machine vision processing on the ISP-processed image.

The Bayer RAW image output by the camera in Figure 1a can be an ultra high definition (UHD) image with a resolution of 4K, a frame rate of 30 fps, and a bit depth of 16 bits. The bandwidth requirement is then up to about 4 Gbps (4K × 2K × 30 × 16). To ease the pressure on the transmission network, images can be compressed before transmission to reduce the bandwidth requirement, so that new UHD video transmission services can be carried without upgrading the existing network.
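The bandwidth figure above can be checked with a few lines of arithmetic. The sketch below assumes that "4K × 2K" means 4096 × 2048 samples per frame with one 16-bit sample per Bayer pixel; the function name is illustrative and not part of the application.

```python
def raw_bandwidth_bps(width, height, fps, bit_depth):
    """Uncompressed bandwidth in bits per second for a one-sample-per-pixel RAW stream."""
    return width * height * fps * bit_depth


# Assumption: "4K x 2K" is read as 4096 x 2048 samples per frame.
bps = raw_bandwidth_bps(4096, 2048, 30, 16)
print(f"{bps / 1e9:.2f} Gbps")  # prints "4.03 Gbps"
```

Reading 4K as 3840 × 2160 instead gives roughly 3.98 Gbps, so either interpretation is consistent with the "up to 4 Gbps" figure.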

Autonomous driving has high safety requirements, so the autonomous driving system is sensitive to the delay of the perception system. Taking the perception system shown in Figure 1a as an example, the requirements for the compression algorithm may include: support for encoding RAW-format images, low latency, low complexity, and high compression performance. To meet these requirements, an architecture for video compression in the RAW domain has been designed in the related art. FIG. 1b shows a schematic diagram of an architecture of video compression according to an example in the related art. As shown in Figure 1b, the camera outputs an image in RAW format, which is encoded by an encoder; the encoder output is a compressed image in RAW format, which can be transmitted to the MDC. The MDC can include a decoder, an ISP, and a deep neural network. The decoder decodes the received compressed image to obtain a decoded image, the ISP processes the decoded image and outputs an image in red green blue (RGB) or YUV format, and that image goes to the deep neural network for further processing. The ISP processing may include: a demosaic operation, used to convert an image from RAW format to RGB format; a white balance (WB) operation, used to perform white balance processing on the image; a color correction matrix (CCM), used to convert from the sensor_RGB color space to the sRGB color space so that the color matching characteristics of the camera meet the Luther condition; and gamma correction, used to correct the nonlinear relationship between the display characteristics of the display and the input image. The ISP processing may also include other image processing procedures, and the present application is not limited to the processing listed above. The processing of the image by the deep neural network can include image recognition, segmentation, and so on.
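As a rough illustration of the ISP steps named above, the following sketch applies white balance, a color correction matrix, and gamma correction to a single already-demosaiced pixel. Everything numeric here is an assumption made for illustration: the gains, the matrix, and the simple power-law gamma are hypothetical values, not parameters from the application, and demosaicing is omitted.

```python
def white_balance(rgb, gains):
    """Scale each channel by its white-balance gain."""
    return tuple(c * g for c, g in zip(rgb, gains))


def apply_ccm(rgb, ccm):
    """Multiply the pixel by a 3x3 color correction matrix (row-major)."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in ccm)


def gamma_encode(rgb, gamma=2.2):
    """Clamp to [0, 1] and apply a plain power-law gamma (a simplification)."""
    return tuple(min(max(c, 0.0), 1.0) ** (1.0 / gamma) for c in rgb)


def isp_pixel(rgb, gains, ccm, gamma=2.2):
    """White balance -> CCM -> gamma, in that order, for one linear RGB pixel."""
    return gamma_encode(apply_ccm(white_balance(rgb, gains), ccm), gamma)


# Hypothetical parameters; each CCM row sums to 1 so neutral grey stays neutral.
WB_GAINS = (2.0, 1.0, 1.5)
CCM = ((1.5, -0.3, -0.2),
       (-0.2, 1.4, -0.2),
       (-0.1, -0.4, 1.5))

pixel_out = isp_pixel((0.25, 0.5, 1.0 / 3.0), WB_GAINS, CCM)
print(pixel_out)
```

With these assumed gains the input pixel becomes neutral grey after white balance, so the three output channels come out (almost exactly) equal.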

In the example shown in Figure 1b, compressing in the RAW domain can reduce the delay from the perception system to the MDC, and placing the ISP and the deep neural network in the MDC can provide more flexible ISP capabilities and better image quality.

Lossy image/video compression can achieve higher compression ratios. Commonly used lossy compression standards include Joint Photographic Experts Group (JPEG), H.264/H.265, and JPEG-XS (Joint Photographic Experts Group Extra Speed). Among them, JPEG-XS is a new compression standard proposed by the Joint Photographic Experts Group. The image quality loss introduced by lossy compression is inevitable, and it affects subsequent machine vision processing, which may lead to problems such as reduced recognition accuracy and inaccurate image segmentation.

In order to evaluate the impact of compression-induced image quality loss on subsequent artificial intelligence (AI) processing, some image quality evaluation methods have been proposed in the related art to determine the bit rate threshold at which compression meets the machine-lossless requirement. Machine lossless means that, compared with uncompressed images, the accuracy index for recognizing compressed images is within a certain error range. That is, the difference between the accuracy index for recognizing the compressed image and the accuracy index for recognizing the original (uncompressed) image is within a certain error range.
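The machine-lossless criterion can be written as a simple comparison. This is a minimal sketch; the function name, the tolerance value, and the accuracy figures are hypothetical and only serve to illustrate the definition.

```python
def is_machine_lossless(acc_original, acc_compressed, tolerance):
    """True if the accuracy on compressed images is within `tolerance`
    of the accuracy on the original (uncompressed) images."""
    return abs(acc_original - acc_compressed) <= tolerance


# Hypothetical accuracy values (e.g. mAP) and tolerance.
print(is_machine_lossless(0.80, 0.79, 0.02))  # True
print(is_machine_lossless(0.80, 0.70, 0.02))  # False
```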

The evaluation method proposed in the related art is an end-to-end evaluation; that is, it evaluates the accuracy of the entire process from front-end compression to back-end artificial intelligence processing. When different compression algorithms need to be evaluated, this end-to-end method is relatively inefficient.

Summary of the invention

In view of this, a method and device for determining an image processing mode are proposed, which decouple the compression algorithm from AI processing and can improve the efficiency of evaluation.

In a first aspect, an embodiment of the present application provides a method for determining an image processing mode, the method comprising: determining, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and degree of distortion; and determining, according to the distortion threshold and a second correspondence, a code rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between degree of distortion and code rate.

The accuracy threshold required by the service may refer to the accuracy requirement in a given application scenario, such as autonomous driving or assisted driving. Different application scenarios may have different processing accuracy requirements, so each application scenario has its own accuracy threshold required by the service.

In the method of the embodiment of the present application, the degree of distortion is introduced as an intermediate variable: the first correspondence is used to evaluate the influence of post-compression distortion on accuracy, and the second correspondence is used to evaluate the distortion that results from compressing at different code rates. The compression process and the post-compression processing are thus evaluated separately, which decouples compression from post-compression processing and improves evaluation efficiency.

Exemplarily, the post-compression processing may be AI processing, and the method according to the embodiment of the present application decouples the compression algorithm from the AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; the second correspondence for the new compression algorithm can be obtained by evaluating only the compression process. Similarly, if a new AI module is to be used to recognize images, the AI recognition process can be re-evaluated with existing data to obtain a new first correspondence, again without end-to-end evaluation. The method provided in the embodiments of the present application can therefore improve the efficiency of evaluation.
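The two-stage determination can be sketched as two curve lookups chained through the distortion value. The sketch below is one possible reading, not the application's implementation: it assumes both correspondences are stored as sampled points of monotonic curves, uses MSE as the distortion measure, and interpolates linearly between samples. All sample values and names are hypothetical.

```python
def invert_curve(points, target_y):
    """Given (x, y) samples of a monotonic curve, return the x at which
    y equals target_y, using linear interpolation between neighbours."""
    pts = sorted(points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if min(y0, y1) <= target_y <= max(y0, y1):
            if y0 == y1:
                return x0
            return x0 + (target_y - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target value lies outside the sampled range")


# Hypothetical first correspondence: (distortion as MSE, recognition accuracy).
first_corr = [(0, 0.80), (10, 0.78), (20, 0.74), (40, 0.60)]
# Hypothetical second correspondence: (code rate in Mbps, distortion as MSE).
second_corr = [(50, 40), (100, 20), (200, 10), (400, 5)]

required_accuracy = 0.76
# Stage 1: accuracy threshold -> distortion threshold.
distortion_thr = invert_curve(first_corr, required_accuracy)
# Stage 2: distortion threshold -> code rate threshold.
rate_thr = invert_curve(second_corr, distortion_thr)
print(distortion_thr, rate_thr)
```

Because the two tables are independent, replacing `second_corr` with the table for another compression algorithm changes only the second stage.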

In a possible implementation manner, the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy. According to the first aspect, in a first possible implementation manner, determining the distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and the first correspondence includes: determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing the original image; and determining, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
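One way to read this implementation is sketched below. It assumes the second accuracy is the first accuracy minus the allowed accuracy drop, and that the first correspondence is a table of tested points in which accuracy does not increase as distortion grows; both are assumptions for illustration, as are the table values.

```python
def distortion_threshold(first_corr, first_accuracy, accuracy_threshold):
    """Return (distortion threshold, second accuracy).

    first_corr: list of (distortion, accuracy) points.
    Assumption: second accuracy = first accuracy - accuracy threshold.
    The threshold is the largest tabulated distortion that still reaches
    the second accuracy (no interpolation between tested points).
    """
    second_accuracy = first_accuracy - accuracy_threshold
    acceptable = [d for d, acc in first_corr if acc >= second_accuracy]
    if not acceptable:
        raise ValueError("no tested distortion level meets the second accuracy")
    return max(acceptable), second_accuracy


# Hypothetical table: (distortion as MSE, recognition accuracy).
table = [(0, 0.80), (10, 0.78), (20, 0.74), (40, 0.60)]
thr, second_acc = distortion_threshold(table, first_accuracy=0.80,
                                       accuracy_threshold=0.05)
print(thr, second_acc)
```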

According to the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner, the code rate is the sampling frequency at which the compressed image is obtained by compressing the original image using a compression algorithm, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.

According to the first aspect or any one of the first or second possible implementation manners of the first aspect, in a third possible implementation manner, the second correspondence is obtained by testing the compression algorithm, the second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.

According to the third possible implementation manner of the first aspect, in a fourth possible implementation manner, determining the code rate threshold corresponding to the distortion threshold according to the distortion threshold and the second correspondence includes: determining the compression algorithm used for compressing the original image; determining the sub-correspondence corresponding to that compression algorithm; and determining the code rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.
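The per-algorithm lookup can be sketched with a dictionary keyed by compression algorithm. The algorithm names, table values, and the choice to return the lowest tested code rate that meets the distortion threshold are all assumptions for illustration.

```python
# Hypothetical second correspondence: one sub-correspondence per algorithm,
# each a list of (code rate in Mbps, distortion as MSE) test points.
SECOND_CORRESPONDENCE = {
    "codec_a": [(50, 40.0), (100, 20.0), (200, 10.0)],
    "codec_b": [(50, 30.0), (100, 12.0), (200, 6.0)],
}


def rate_threshold(algorithm, distortion_threshold):
    """Pick the sub-correspondence of `algorithm`, then return the lowest
    tested code rate whose distortion does not exceed the threshold."""
    sub = SECOND_CORRESPONDENCE[algorithm]
    acceptable = [rate for rate, dist in sub if dist <= distortion_threshold]
    if not acceptable:
        raise ValueError("no tested code rate meets the distortion threshold")
    return min(acceptable)


print(rate_threshold("codec_a", 15.0))  # 200
print(rate_threshold("codec_b", 15.0))  # 100
```

The same distortion threshold yields a different code rate threshold for each algorithm, while the first correspondence that produced the threshold is shared.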

The determination method of the embodiment of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new algorithm can be evaluated at different code rate points. Specifically, the new compression algorithm is used to compress images at different code rate points, and the degree of distortion of each compressed image is output to obtain the second correspondence for the new algorithm. There is no need to run AI processing on the compressed images to obtain the processing accuracy of the AI module, and no need to establish a new first correspondence; the previously established first correspondence can be reused. The processor only needs to establish the second correspondence for the new compression algorithm. To determine the code rate threshold for the new compression algorithm from the accuracy threshold required by the service, the processor looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding code rate threshold, that is, the code rate threshold at which compression with the new algorithm meets the accuracy threshold required by the service. The determination method provided in this application decouples compression from AI processing and enables staged evaluation. The accuracy of AI processing is independent of the compression algorithm, so only the degree of distortion and the code rate need to be re-tested for a new compression algorithm. No end-to-end testing is required, the testing process is simplified, and evaluation efficiency is higher.
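Building the second correspondence for a new compression algorithm needs only the codec and a distortion measure, with no AI inference. The sketch below illustrates this under stated assumptions: `toy_codec` is a stand-in quantizer rather than a real compression algorithm, "bits per sample" stands in for the code rate, MSE is the distortion measure, and images are flat lists of 8-bit samples.

```python
def mse(reference, test):
    """Mean square error between two equally sized sample lists."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def toy_codec(image, bits):
    """Stand-in for compress + decompress: keep only the top `bits` bits
    of each 8-bit sample. Not a real codec; for illustration only."""
    step = 2 ** (8 - bits)
    return [(p // step) * step for p in image]


def build_second_correspondence(codec, images, rate_points):
    """Return [(rate, mean distortion)] for the codec at each rate point.
    Only compression and a distortion metric are run; no AI module is involved."""
    curve = []
    for rate in rate_points:
        total = sum(mse(img, codec(img, rate)) for img in images)
        curve.append((rate, total / len(images)))
    return curve


# Hypothetical test images and code rate points.
images = [[0, 37, 128, 255], [12, 200, 77, 90]]
curve = build_second_correspondence(toy_codec, images, [2, 4, 6, 8])
for rate, distortion in curve:
    print(rate, distortion)
```

The resulting table can be paired with a previously established first correspondence to derive a code rate threshold for the new algorithm.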

According to the first aspect or any one of the first or second possible implementation manners of the first aspect, in a fifth possible implementation manner, the degree of distortion is obtained based on one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity (SSIM), and perceptual loss.
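Two of the listed indicators, MSE and PSNR, follow directly from their standard definitions. The sketch below assumes images are flat lists of samples and that the peak value is 255 (8-bit data); for 16-bit RAW data the peak would be 65535.

```python
import math


def mse(reference, test):
    """Mean square error between two equally sized sample lists."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of samples")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, test)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / err)


ref = [0, 0, 0, 0]
rec = [0, 0, 0, 2]
print(mse(ref, rec))   # 1.0
print(psnr(ref, rec))  # about 48.13 dB
```

Note that PSNR rises as distortion falls, so a distortion threshold expressed in PSNR is a lower bound, whereas one expressed in MSE is an upper bound.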

According to the first aspect or any one of the first or second possible implementation manners of the first aspect, in a sixth possible implementation manner, the accuracy is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (mIoU).
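As an example of one accuracy indicator, mean intersection over union for a segmentation result can be computed as below. This is a minimal sketch that assumes label maps are flat lists of class ids and that classes absent from both prediction and ground truth are skipped, which is one common convention and not something the application specifies.

```python
def mean_iou(predicted, truth, num_classes):
    """Average of per-class IoU = |pred AND truth| / |pred OR truth|."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(predicted, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(predicted, truth) if p == c or t == c)
        if union:  # skip classes that appear in neither map
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Hypothetical 4-pixel label maps with two classes.
truth = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
print(mean_iou(pred, truth, num_classes=2))  # (1/2 + 2/3) / 2
```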

According to the second possible implementation manner of the first aspect, in a seventh possible implementation manner, the original image is a Bayer RAW image, the compressed image is a red green blue (RGB) image, and compressing the original image is: compressing the original image in the RAW domain, the RGB domain, or the YUV domain.

According to the second possible implementation manner of the first aspect, in an eighth possible implementation manner, the original image is a Bayer RAW image, the compressed image is a YUV image, and compressing the original image is: compressing the original image in the YUV domain.

According to the second possible implementation manner of the first aspect, in a ninth possible implementation manner, both the original image and the compressed image are Bayer RAW images, and compressing the original image is: compressing the original image in the RAW domain.

In a second aspect, an embodiment of the present application provides a device for determining an image processing mode, the device including: a first determining module, configured to determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and degree of distortion; and a second determining module, configured to determine, according to the distortion threshold and a second correspondence, a code rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between degree of distortion and code rate.

By introducing the degree of distortion as an intermediate variable, the device in the embodiment of the present application uses the first correspondence to evaluate the influence of post-compression distortion on accuracy, and the second correspondence to evaluate the distortion that results from compressing at different code rates. The compression process and the post-compression processing are thus evaluated separately, which decouples compression from post-compression processing and improves evaluation efficiency.

Exemplarily, the post-compression processing may be AI processing. The device according to the embodiment of the present application decouples the compression algorithm from the AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; the second correspondence for the new compression algorithm can be obtained by evaluating only the compression process. Similarly, if a new AI module is to be used to recognize images, the AI recognition process can be re-evaluated with existing data to obtain a new first correspondence, again without end-to-end evaluation. The device provided by the embodiment of the present application can therefore improve the efficiency of evaluation.

According to the second aspect, in a first possible implementation manner, the first determining module includes: a first determining unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing the original image; and a second determining unit, configured to determine the distortion threshold corresponding to the second accuracy according to the second accuracy and the first correspondence.

According to the second aspect or the first possible implementation manner of the second aspect, in a second possible implementation manner, the code rate is the sampling frequency at which the compressed image is obtained by compressing the original image using a compression algorithm, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.

According to the second aspect or any one of the first or second possible implementation manners of the second aspect, in a third possible implementation manner, the second correspondence is obtained by testing the compression algorithm, the second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.

According to a third possible implementation manner of the second aspect, in a fourth possible implementation manner, the second determining module includes: a third determining unit, configured to determine the compression algorithm used for compressing the original image; a fourth determining unit, configured to determine the sub-correspondence corresponding to that compression algorithm; and a fifth determining unit, configured to determine the code rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.

The device of the embodiment of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new algorithm can be evaluated at different code rate points. Specifically, the new compression algorithm is used to compress images at different code rate points, and the degree of distortion of each compressed image is output to obtain the second correspondence for the new algorithm. There is no need to run AI processing on the compressed images to obtain the processing accuracy of the AI module, and no need to establish a new first correspondence; the previously established first correspondence can be reused. The processor only needs to establish the second correspondence for the new compression algorithm. To determine the code rate threshold for the new compression algorithm from the accuracy threshold required by the service, the processor looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding code rate threshold, that is, the code rate threshold at which compression with the new algorithm meets the accuracy threshold required by the service. The device provided in this application decouples compression from AI processing and enables staged evaluation. The accuracy of AI processing is independent of the compression algorithm, so only the degree of distortion and the code rate need to be re-tested for a new compression algorithm. End-to-end testing is not required, which simplifies the testing process and increases evaluation efficiency.

According to the second aspect or any one of the first or second possible implementation manners of the second aspect, in a fifth possible implementation manner, the degree of distortion is obtained based on one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity (SSIM), and perceptual loss.

According to the second aspect or any one of the first or second possible implementation manners of the second aspect, in a sixth possible implementation manner, the accuracy is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (mIoU).

In a third aspect, embodiments of the present application provide an electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor, when executing the instructions, can perform the method for determining an image processing mode of the first aspect or of one or more of the possible implementation manners of the first aspect.

In a fourth aspect, embodiments of the present application provide a computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in an electronic device, the processor in the electronic device performs the method for determining an image processing mode of the first aspect or of one or more of the possible implementation manners of the first aspect.

In a fifth aspect, embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored; when executed by a processor, the computer program instructions implement the method for determining an image processing mode of the first aspect or of one or more of the possible implementation manners of the first aspect.

In a sixth aspect, an embodiment of the present application further provides a sensor system for providing a sensing function for a vehicle. The sensor system includes at least one device for determining an image processing mode mentioned in the above embodiments of the present application, and at least one other sensor such as a camera or a radar. At least one sensor device in the system can be integrated into a whole machine or device, or can be provided independently as an element or device.

In a seventh aspect, the embodiments of the present application further provide a system, which is applied in unmanned driving or intelligent driving, which includes at least one device for determining the image processing method mentioned in the above-mentioned embodiments of the present application, and sensors such as cameras and radars. At least one device in the system can be integrated into a whole machine or equipment, or at least one device in the system can also be independently set as a component or device.

In an eighth aspect, an embodiment of the present application further provides a vehicle, where the vehicle includes at least one device for determining an image processing mode mentioned in the above embodiments of the present application, or any of the above-mentioned systems.

These and other aspects of the present application will be more clearly understood in the following description of the embodiment(s).

Description of drawings

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.

FIG. 1a is a schematic diagram of data transmission in a compression-based sensing system in the related art.

FIG. 1b shows a schematic diagram of an architecture of video compression according to an example in the related art.

Fig. 2a shows a schematic diagram of an evaluation framework according to an embodiment of the present application.

FIG. 2b shows a schematic diagram of a rate-precision curve according to an embodiment of the present application.

FIG. 3 shows a schematic diagram of an application scenario of a method for determining an image processing mode according to an embodiment of the present application.

Fig. 4a shows a schematic diagram of a curve of the first correspondence according to an embodiment of the present application.

FIG. 4b shows a schematic diagram of a curve of the second correspondence according to an embodiment of the present application.

FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application.

FIG. 6a shows a schematic diagram of a manner of determining a distortion threshold according to an embodiment of the present application.

FIG. 6b shows a schematic diagram of a manner of determining a code rate threshold according to an embodiment of the present application.

FIG. 7 shows a schematic diagram of an assessment framework according to some examples of the present application.

FIG. 8 shows a schematic diagram of an assessment framework according to some examples of the present application.

FIG. 9 shows a schematic diagram of an assessment framework according to some examples of the present application.

FIG. 10 shows a block diagram of an apparatus for determining an image processing method according to an embodiment of the present application.

Detailed description

Various exemplary embodiments, features and aspects of the present application will be described in detail below with reference to the accompanying drawings. The same reference numbers in the figures denote elements that have the same or similar functions. While various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.

The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

In addition, in order to better illustrate the present application, numerous specific details are given in the following detailed description. It should be understood by those skilled in the art that the present application may be practiced without certain specific details. In some instances, methods, means, components and circuits well known to those skilled in the art have not been described in detail so as not to obscure the subject matter of the present application.

Fig. 2a shows a schematic diagram of an evaluation framework according to an embodiment of the present application. The evaluation framework shown in Fig. 2a is a machine-vision-oriented image quality evaluation method defined by the Moving Picture Experts Group (MPEG) Video Coding for Machines (VCM) working group, and it uses an end-to-end evaluation process.

As shown in Figure 2a, the camera outputs the video processed by the ISP to the VCM encoder. The video processed by the ISP can be in RGB or YUV format. The VCM encoder encodes the video, and the encoded video is transmitted to the VCM decoder. The VCM decoder decodes the video, and machine vision processing is performed on the decoded video. Specifically, the decoded images can be output to a neural network (NN), which performs the machine vision processing.

The evaluation framework shown in Figure 2a is a tightly coupled system: the camera, the compression algorithm (encoder + decoder), and the NN module are coupled together. To evaluate the compression performance of a compression algorithm, an end-to-end evaluation must be performed, which is complex and inefficient. Specifically, for each compression algorithm, determining the code rate threshold at which compression meets the machine-lossless requirement requires end-to-end testing with the framework shown in Fig. 2a to obtain the accuracy corresponding to multiple code rate points of that algorithm. From the accuracy at these code rate points, a code rate-accuracy curve can be drawn, and the code rate threshold can then be determined from the accuracy threshold required by the service and that curve; this is relatively inefficient. The accuracy threshold required by the service may refer to the accuracy requirement in a given application scenario, such as autonomous driving or assisted driving. Different application scenarios may have different processing accuracy requirements, so each application scenario has its own accuracy threshold required by the service. In a possible implementation manner, the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy.

FIG. 2b shows a schematic diagram of a code rate-precision curve according to an embodiment of the present application. As shown in Figure 2b, the abscissa represents the code rate and the ordinate represents the precision. In the example of Figure 2b, the precision index is the mean average precision (mAP). The dotted line represents the recognition accuracy for the original (uncompressed) images, and the other three curves show the relationship between recognition accuracy and code rate for compressed images obtained by compressing the original images with three different compression algorithms: the X265 default configuration (X265_medium), the X264 default configuration (X264_medium), and the X264 fast configuration (X264_ultrafast). As the code rate increases, the recognition accuracy of the compressed images approaches the machine-lossless level (dotted line). When the code rate is relatively high, the three curves tend to coincide. When the code rate is relatively low, the three curves are relatively scattered, that is, the recognition accuracy depends strongly on the compression algorithm used. For different compression algorithms, to determine the code rate threshold at which compression still meets the machine-lossless requirement, the framework shown in Figure 2a must be used to perform end-to-end testing and obtain the accuracy corresponding to multiple code rate points of each compression algorithm. A code rate-accuracy curve can then be drawn from these points, and the code rate threshold can be determined from the accuracy index required by the service and the code rate-accuracy curve, which is relatively inefficient.

In related technologies, the IEEE-P2020 standard image quality evaluation working group for imaging systems for autonomous driving has defined probability-based machine vision evaluation indicators for autonomous driving, including contrast detection probability (CDP), color separation probability (CSP), and geometric resolution probability (GRP), to achieve module-level evaluation of perception systems. These probabilistic indicators characterize the imaging quality of a machine-vision-oriented perception system in order to measure the impact of image quality on subsequent machine vision AI processing. However, these indicators only consider the capabilities of the imaging system. They are separated from the back-end AI processing tasks and cannot adequately reflect the impact of image quality on AI processing.

In order to solve the above technical problems, the present application provides a method for determining an image processing mode. FIG. 3 shows a schematic diagram of an application scenario of a method for determining an image processing mode according to an embodiment of the present application. As shown in FIG. 3 , in the application scenario of the embodiment of the present application, a compression module, an AI module, and a processor may be included. Among them, the compression module can compress the received image. During the compression process, the image can be sampled (the sampling frequency is the code rate), and the compressed image can be transmitted to the AI module for target detection, image segmentation and other processing.

It should be noted that the embodiments of the present application can be tested directly on an existing test set, for example, the Cityscapes data set. The Cityscapes data set includes training images, validation images, and test images. The images are annotated and can be directly compressed and recognized to obtain the test result data of the embodiments of the present application (the distortion and precision data corresponding to each code rate). Here, the code rate is the sampling frequency of the compressed image obtained by compressing the original image with the compression algorithm, and the distortion degree is the difference between the compressed image and the real environment.

In this case, the simulation device may include the above-mentioned compression module, AI module, and processor. The compression module and the AI module may be software programs stored in the memory of the simulation device, and the processor may call the corresponding programs to process the images and obtain the test result data. From the obtained test result data, the processor may establish the correspondence between the distortion degree and the precision (the first correspondence) and the correspondence between the code rate and the distortion degree (the second correspondence), and store the first correspondence and the second correspondence.

The methods of the embodiments of the present application can also be tested in actual application scenarios, for example, on an automatic driving system. The automatic driving system may include a camera, and may also include, but is not limited to, in-vehicle terminals, in-vehicle controllers, in-vehicle modules, in-vehicle components, in-vehicle chips, in-vehicle units, and other sensors such as in-vehicle radar or in-vehicle cameras.

The compression module may be an encoder, and the encoder may be located on the camera. The compression module may also include a decoder, and the decoder may be located on the MDC. The AI module and the processor may both be located on the MDC. Alternatively, the AI module is located on the MDC and the processor is a processor of an external device (a device other than the automatic driving system).

The external device can be a general-purpose device or a dedicated device. In a specific implementation, the external device may be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with processing functions. This embodiment of the present application does not limit the type of the external device. The external device may have a chip or processor with a processing function (such as the processor shown in FIG. 3), and may include multiple processors. Each processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.

Taking the case where the processor shown in FIG. 3 is a processor of an external device as an example, the method for determining the image processing mode of the present application may be performed offline by the external device. The code rate of the encoder can be set during the test. After the camera captures an image, the encoder encodes it and sends it to the MDC, and the image decoded by the decoder can be stored in the MDC. The AI module can recognize the decoded image to obtain precision data. The code rate of the encoder can be set multiple times, once for each code rate point, and the above process is repeated to obtain the test result data.

The obtained test result data can be output to the external device. The external device can obtain the distortion degree of the compressed image from the decoded image, establish the correspondence between the distortion degree and the precision (the first correspondence) and the correspondence between the code rate and the distortion degree (the second correspondence), and store the first correspondence and the second correspondence.
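As a minimal sketch of how the two correspondences could be assembled from the test result data, the Python below sorts per-code-rate measurements into two tables. The `RatePoint` record, the `build_correspondences` helper, and all numeric values are hypothetical and chosen only for illustration. They are not taken from the application.

```python
from typing import NamedTuple


class RatePoint(NamedTuple):
    """One measurement at a single code rate point (hypothetical layout)."""
    rate: float        # code rate set on the encoder, e.g. in Mbps
    distortion: float  # distortion of the decoded image, e.g. PSNR in dB
    precision: float   # recognition precision of the AI module, e.g. mAP


def build_correspondences(points):
    """Return (first, second) correspondences as sorted lists of pairs.

    first:  (precision, distortion) pairs sorted by precision, as in Table 1
    second: (distortion, rate) pairs sorted by distortion, as in Table 2
    """
    first = sorted((p.precision, p.distortion) for p in points)
    second = sorted((p.distortion, p.rate) for p in points)
    return first, second


# Illustrative values only, not measurements from the application.
points = [
    RatePoint(rate=4.0, distortion=36.0, precision=0.40),
    RatePoint(rate=1.0, distortion=30.0, precision=0.31),
    RatePoint(rate=2.0, distortion=33.0, precision=0.37),
]
first_corr, second_corr = build_correspondences(points)
```

Keeping the two tables separate mirrors the decoupling described above: the second table can be rebuilt for another compression algorithm without touching the first.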

If both the AI module and the processor shown in FIG. 3 are located on the MDC, the automatic driving system can execute the method for determining the image processing mode of the embodiments of the present application online. For example, the code rate of the encoder can be set during testing. After the camera captures an image, the encoder encodes it and sends it to the MDC, and the image decoded by the decoder can be stored in the MDC. The MDC can obtain the distortion degree of the compressed image from the decoded image, and can obtain precision data by recognizing the decoded image. From the obtained test result data, the MDC can establish the correspondence between the distortion degree and the precision (the first correspondence) and the correspondence between the code rate and the distortion degree (the second correspondence), and store the first correspondence and the second correspondence. The MDC can also obtain the code rate threshold corresponding to the precision threshold required by the service according to that precision threshold, the first correspondence, and the second correspondence, and set the encoding code rate of the encoder according to the code rate threshold. In this way, the encoder can achieve machine-lossless processing.

It should be noted that the above example, in which both the AI module and the processor are located on the MDC, is only an example and does not limit the present application in any way. For example, the AI module and the processor may also be located on other components of the automatic driving system, which is not limited in this application.

In the embodiments of the present application, the index used for the distortion degree may be the peak signal-to-noise ratio (PSNR), the mean square error (MSE), the structural similarity index (SSIM), or the perception loss (P-loss). The distortion degree may also be a combination of several of these indicators, so that the distortion degree of a compressed image is evaluated comprehensively. For example, a weighted index of PSNR and SSIM can be used as the final distortion degree, which suits applications that require both signal fidelity (PSNR) and human visual quality (SSIM).
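The following Python sketch illustrates two of the distortion indicators named above, MSE and PSNR, using their standard textbook definitions on flat sequences of pixel values. The `weighted_distortion` function is a hypothetical example of a combined index: the application does not specify the weights or the normalization, so the equal weights and the 50 dB cap are assumptions.

```python
import math


def mse(original, compressed):
    """Mean square error between two equally sized sequences of pixel values."""
    if not original or len(original) != len(compressed):
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)


def psnr(original, compressed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


def weighted_distortion(psnr_db, ssim, w_psnr=0.5, w_ssim=0.5, psnr_cap=50.0):
    """Hypothetical combined index in [0, 1].

    PSNR is clipped to [0, psnr_cap] and scaled to [0, 1] so that it can be
    mixed with SSIM, which is assumed here to lie in [0, 1] already.
    """
    psnr_norm = min(max(psnr_db, 0.0), psnr_cap) / psnr_cap
    return w_psnr * psnr_norm + w_ssim * ssim
```

For example, `psnr([0, 0, 0, 0], [1, 1, 1, 1])` evaluates the case MSE = 1 with an 8-bit peak value of 255.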

After the AI module continues to process the compressed image, the processing accuracy of the AI module can be obtained, and the accuracy can be the accuracy of identifying the compressed image by the AI module. In this way, when the compression algorithm is used to compress the image, the distortion degree and precision data corresponding to different bit rate points can be obtained.

In the embodiments of the present application, the index used for precision may be the mean average precision (mAP), the average precision (AP), the average recall (AR), or the mean intersection over union (MIoU). The AP may be AP50, AP60, AP70, weightedAP, and so on. The precision index can also be a combination of several of these indicators, so that the accuracy of AI processing is evaluated comprehensively. For example, a weighted index of mAP and AR can be used as the precision index. The present application does not limit the specific index used for precision.

In the embodiments of the present application, the first correspondence and the second correspondence may be one-to-one values stored in the form of table entries, or may be expressed in the form of functions, which are not limited in this application.

Exemplarily, the first correspondence may be represented in the form shown in Table 1, and the second correspondence may be represented in the form shown in Table 2.

Table 1

Precision	Distortion
P1	D1
P2	D2
…	…
Pn	Dn

Table 2

Distortion	Code rate
D1	R1
D2	R2
…	…
Dn	Rn

Exemplarily, the first correspondence can also be expressed in the form of a function, as shown in the following formula (1):

$$P = f_i(D), \quad D \in D_i, \quad i = 1, 2, \ldots, n \tag{1}$$

Here, P represents the precision, D represents the distortion degree, fi(D) represents the functional relationship between the precision and the distortion degree over the numerical range Di, and i is a positive integer from 1 to n. In other words, the relationship between precision and distortion can be expressed in the form of a piecewise function.

In a possible implementation manner, over the numerical range Di, P and D may have a linear relationship. The relationship between accuracy and distortion can be expressed as a piecewise linear function.
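A piecewise linear first correspondence can be sketched in Python as a list of segments, each holding a range Di and the coefficients of a linear fi(D). The segment boundaries, slopes, and intercepts below are hypothetical values (PSNR in dB mapped to mAP) chosen so that the function is continuous. They are not data from the application.

```python
# Each segment is (lo, hi, slope, intercept) and defines
# f_i(D) = slope * D + intercept on the range lo <= D <= hi.
# Hypothetical coefficients mapping PSNR (dB) to mAP.
SEGMENTS = [
    (28.0, 32.0, 0.030, -0.64),
    (32.0, 36.0, 0.015, -0.16),
    (36.0, 40.0, 0.005, 0.20),
]


def precision_from_distortion(segments, d):
    """Evaluate the piecewise linear function P = f_i(D) for D in D_i."""
    for lo, hi, slope, intercept in segments:
        if lo <= d <= hi:
            return slope * d + intercept
    raise ValueError("distortion value is outside every range D_i")
```

Storing only the coefficients per range keeps the stored correspondence small, at the cost of a fitting step after testing.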

Fig. 4a shows a schematic diagram of a curve of the first correspondence according to an embodiment of the present application. As shown in FIG. 4a, the abscissa may represent the distortion degree, and the ordinate may represent the accuracy. In the example shown in FIG. 4a, the distortion index is PSNR and the accuracy index is mAP. The curves of the first correspondence for the three different compression algorithms almost overlap, that is, the first correspondence does not depend on the specific compression algorithm used.

In other words, the performance of machine vision mainly depends on the distortion degree of the input image and does not depend on the specific compression algorithm. In addition, the performance of machine vision is related to the specific neural network used.

Similarly, the second correspondence can also be expressed in the form of a function, as shown in the following formula (2):

$$D = g_i(R), \quad R \in R_i \tag{2}$$

Here, D represents the distortion degree, R represents the code rate, and gi(R) represents the functional relationship between the distortion degree and the code rate over the numerical range Ri. In other words, the relationship between the distortion degree and the code rate can be expressed in the form of a piecewise function.

In a possible implementation manner, in the numerical range Ri, D and R may have a linear relationship, and the relationship between the distortion degree and the code rate may be expressed as a piecewise linear function.

FIG. 4b shows a schematic diagram of a curve of the second correspondence according to an embodiment of the present application. As shown in FIG. 4b, the abscissa may represent the code rate, and the ordinate may represent the distortion degree. In the example shown in FIG. 4b, the distortion index is PSNR. The curves of the second correspondence for the three different compression algorithms are relatively scattered, that is, even if different compression algorithms compress the image at the same code rate, the distortion degrees of the resulting compressed images differ considerably. The second correspondence therefore depends on the specific compression algorithm used.

Through the above embodiments, the compression algorithm and the AI processing are decoupled. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required: it is sufficient to evaluate only the compression process and obtain the second correspondence of the new compression algorithm. Similarly, if a new AI module is to be used to recognize images, the existing data can be used to re-evaluate only the AI recognition process and obtain a new first correspondence, without end-to-end evaluation. The methods provided in the embodiments of the present application can thus improve the efficiency of evaluation.

After the first correspondence and the second correspondence are obtained, the precision threshold can be determined according to the precision index required by each service, the distortion threshold corresponding to the precision threshold can be determined according to the precision threshold and the first correspondence, and the code rate threshold corresponding to the distortion threshold can be determined according to the distortion threshold and the second correspondence. In this way, the code rate used for compression can be determined according to different service requirements.

FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application. In the embodiments of the present application, the image processing mode may refer to a code rate used for compressing an image, and determining the image processing mode may refer to a process of determining a code rate used for compressing an image. As shown in Figure 5, the method for determining the image processing mode may include the following steps:

Step S500: Determine a distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and a first corresponding relationship, wherein the first corresponding relationship is a corresponding relationship between the accuracy and the degree of distortion.

Step S501: Determine a code rate threshold corresponding to the distortion threshold according to the distortion threshold and a second correspondence, wherein the second correspondence is a correspondence between the distortion degree and the code rate.

The accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the first accuracy. The first accuracy may be the accuracy of recognizing the original image, that is, the accuracy of recognizing the uncompressed image. As shown in Figure 4a, the first accuracy is the value marked by the dotted line.

In a possible implementation manner, step S500 may include: determining a second precision corresponding to the precision threshold according to the precision threshold and a first precision, where the first precision is the precision of recognizing the original image; and determining the distortion threshold corresponding to the second precision according to the second precision and the first correspondence.

The code rate points tested in the testing process can be discrete. If the first correspondence is stored in the form of a function, the calculation method shown in formula (1) can be stored, which in effect stores the curve of the correspondence between the accuracy and the distortion degree. FIG. 6a shows a schematic diagram of a manner of determining a distortion threshold according to an embodiment of the present application. As shown in Figure 6a, Pth can represent the precision threshold required by the service, that is, the difference between the second precision required by the service and the first precision. Pmax can represent the first precision, that is, the accuracy of recognizing the uncompressed image. P can represent the second precision, that is, the accuracy required by the service. PSNR th can represent the distortion threshold, that is, the distortion degree corresponding to the accuracy required by the service. The second precision may be the difference between the first precision and the precision threshold: P=Pmax−Pth. After P is determined, the PSNR th corresponding to P can be read from the curve shown in Figure 6a. Specifically, the distortion threshold D (PSNR th) corresponding to the second precision P can be calculated by using the function fi(D) in formula (1) for the range to which P belongs.
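When the first correspondence is stored as a function P = f(D), finding the distortion threshold means solving f(D) = Pmax − Pth for D. The Python sketch below does this by bisection, under the assumption that f is non-decreasing over the search interval. The saturating `precision_curve` and all numbers are hypothetical and stand in for a fitted curve such as the one in Figure 6a.

```python
import math


def invert_monotonic(f, target, lo, hi, tol=1e-6):
    """Find x in [lo, hi] with f(x) close to target, for non-decreasing f."""
    if not f(lo) <= target <= f(hi):
        raise ValueError("target precision is outside the range covered by f")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid   # the solution lies in the upper half
        else:
            hi = mid   # the solution lies in the lower half
    return (lo + hi) / 2.0


P_MAX = 0.40  # hypothetical first precision (uncompressed images)
P_TH = 0.02   # hypothetical precision threshold required by the service


def precision_curve(psnr_db):
    """Hypothetical fitted curve: precision saturates towards P_MAX."""
    return P_MAX * (1.0 - math.exp(-(psnr_db - 20.0) / 5.0))


second_precision = P_MAX - P_TH  # P = Pmax - Pth
psnr_th = invert_monotonic(precision_curve, second_precision, 20.0, 60.0)
```

A closed-form inverse is preferable when the stored function has one. Bisection is only a generic fallback for monotonic curves.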

The code rate points tested in the testing process may be discrete. If the first correspondence is stored in the form of value pairs, the processor may use linear interpolation to handle points that are not stored in the first correspondence. For example, if the precision threshold is determined according to the precision index required by the service but the corresponding precision value is not stored in the first correspondence, the processor can obtain the precision values adjacent to it in the first correspondence and perform linear interpolation on those precision values and their corresponding distortion values to obtain the distortion threshold. Taking Table 1 as an example, assuming that the determined second precision P is greater than P1 but less than P2, the distortion threshold D corresponding to the second precision P can be calculated by the following linear interpolation formula (3):

$$D = D_1 + \frac{P - P_1}{P_2 - P_1}\,(D_2 - D_1) \tag{3}$$

For step S501, the code rate threshold corresponding to the distortion threshold can likewise be determined according to the specific storage method. If the second correspondence is stored in the form of a function, the calculation method shown in formula (2) can be stored, which in effect stores the curve of the correspondence between the distortion degree and the code rate. FIG. 6b shows a schematic diagram of a manner of determining a code rate threshold according to an embodiment of the present application. As shown in Figure 6b, PSNR th may represent the distortion threshold determined in step S500, that is, the distortion degree corresponding to the accuracy required by the service. For any compression algorithm, the correspondence between the distortion degree and the code rate can be expressed in the form of formula (2). After the distortion threshold is determined, the code rate threshold R corresponding to the distortion threshold can be calculated using the function gi(R) of the range to which the distortion threshold belongs. The example shown in FIG. 6b includes three curves of the second correspondence for three different compression algorithms, each with its own function expression. From the determined distortion threshold and the function expression of each curve, the code rate thresholds corresponding to the three compression algorithms can be determined: R_X265_medium, R_X264_medium, and R_X264_ultrafast.

If the second correspondence is stored in the form of a pair of values, the processor may use linear interpolation to process the points that are not stored in the second correspondence. For a specific manner, reference may be made to the process of determining the distortion threshold through linear interpolation, which will not be repeated here.
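The table-based lookup with linear interpolation can be sketched as follows. The `interpolate` helper is a generic illustration that works for both Table 1 (precision to distortion) and Table 2 (distortion to code rate). The table contents are hypothetical values, not results from the application.

```python
def interpolate(table, x):
    """Linearly interpolate y at x from (x, y) pairs sorted by ascending x."""
    if not table[0][0] <= x <= table[-1][0]:
        raise ValueError("x is outside the stored range")
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return table[-1][1]  # only reached for a single-entry table


# Hypothetical Table 1: (precision as mAP, distortion as PSNR in dB).
TABLE_1 = [(0.31, 30.0), (0.37, 33.0), (0.40, 36.0)]
# Hypothetical Table 2: (distortion as PSNR in dB, code rate in Mbps).
TABLE_2 = [(30.0, 1.0), (33.0, 2.0), (36.0, 4.0)]

distortion_threshold = interpolate(TABLE_1, 0.34)            # between P1 and P2
rate_threshold = interpolate(TABLE_2, distortion_threshold)  # between D1 and D2
```

Rejecting values outside the stored range, rather than extrapolating, is a design choice of this sketch. The application does not say how out-of-range thresholds are handled.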

In a possible implementation manner, the second correspondence is obtained by testing compression algorithms. The second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondences corresponding to different compression algorithms are the same.

As can be seen from the above, the first correspondence does not depend on the specific compression algorithm used. Therefore, the first correspondence may be the same for different compression algorithms. That is to say, if the way in which the processor (such as the processor of the MDC) performs machine vision processing on the image does not change, the same first correspondence can be used to evaluate image quality for different compression algorithms. Specifically, the test data obtained by recognizing the compressed images produced by different compression algorithms can be fitted to obtain the curve of the first correspondence, or the first correspondence between the distortion degree and the recognition precision of the compressed images can be stored directly, which is not limited in this application.

For different compression algorithms, because the compression standards differ, the distortion degree of the compressed image obtained at the same code rate may differ. Therefore, the second correspondences of different compression algorithms may be different. As shown in Figure 4b and Figure 6b, the rate-distortion curves corresponding to the three compression algorithms X265_medium, X264_medium, and X264_ultrafast are different.

Therefore, in a possible implementation manner, determining the code rate threshold corresponding to the distortion threshold according to the distortion threshold and the second correspondence in step S501 may include: determining the compression algorithm used to compress the original image; determining the sub-correspondence corresponding to that compression algorithm; and determining the code rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.

As shown in FIG. 3, the sub-correspondences (second correspondences) corresponding to the compression algorithms are pre-stored in the processor, and the processor can receive an input compression algorithm in addition to the input precision threshold. The processor can determine the distortion threshold corresponding to the accuracy threshold according to the accuracy threshold and the first correspondence, determine the compression algorithm used to compress the original image according to the input compression algorithm, and determine the code rate threshold corresponding to the distortion threshold according to the pre-stored sub-correspondence of that compression algorithm and the distortion threshold.
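Putting steps S500 and S501 together with per-algorithm sub-correspondences, the processor's lookup could be sketched as below. The algorithm names come from the examples in this application, but every numeric value, the table layout, and the function names are hypothetical. A single first correspondence is shared by all algorithms, while each algorithm has its own sub-correspondence.

```python
# Shared first correspondence: (precision as mAP, distortion as PSNR in dB).
FIRST_CORRESPONDENCE = [(0.31, 30.0), (0.37, 33.0), (0.40, 36.0)]

# One sub-correspondence per compression algorithm:
# (distortion as PSNR in dB, code rate in Mbps). Values are illustrative.
SUB_CORRESPONDENCES = {
    "X265_medium": [(30.0, 0.8), (33.0, 1.6), (36.0, 3.2)],
    "X264_medium": [(30.0, 1.0), (33.0, 2.0), (36.0, 4.0)],
    "X264_ultrafast": [(30.0, 1.5), (33.0, 3.0), (36.0, 6.0)],
}


def _lookup(table, x):
    """Linear interpolation over (x, y) pairs sorted by ascending x."""
    if not table[0][0] <= x <= table[-1][0]:
        raise ValueError("value is outside the stored correspondence")
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= x <= x1:
            return y0 if x1 == x0 else y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return table[-1][1]  # single-entry table


def rate_threshold(p_max, precision_threshold, algorithm):
    """Return the code rate threshold for the given compression algorithm."""
    if algorithm not in SUB_CORRESPONDENCES:
        raise KeyError(f"no sub-correspondence stored for {algorithm!r}")
    second_precision = p_max - precision_threshold  # P = Pmax - Pth
    # Step S500: precision -> distortion threshold (first correspondence).
    distortion_threshold = _lookup(FIRST_CORRESPONDENCE, second_precision)
    # Step S501: distortion threshold -> code rate threshold (sub-correspondence).
    return _lookup(SUB_CORRESPONDENCES[algorithm], distortion_threshold)
```

Supporting a new compression algorithm in this sketch only requires adding one entry to `SUB_CORRESPONDENCES`. `FIRST_CORRESPONDENCE` stays unchanged.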

The method of the embodiment of the present application is simple, efficient, and easy to expand. For example, if a new compression algorithm is to be used for image compression, the new compression algorithm can be evaluated for different bit rate points. Specifically, a new compression algorithm is used to compress the image for different bit rate points, and the distortion degree of the compressed image is output to obtain the second correspondence corresponding to the new compression algorithm. There is no need to perform AI processing on the compressed image to obtain the processing accuracy of the AI module, and there is no need to re-establish a new first correspondence, and the first correspondence established before can be used.

The processor can establish a second correspondence for the new compression algorithm. To determine the code rate threshold for compression with the new compression algorithm according to the accuracy threshold required by the service, the processor can look up the previously established first correspondence according to the precision threshold to determine the corresponding distortion threshold, and then look up the second correspondence of the new compression algorithm according to the distortion threshold to determine the corresponding code rate threshold, that is, the code rate threshold that meets the accuracy threshold required by the service when the new compression algorithm is used.

However, if the previous evaluation method is used, end-to-end testing of the new compression algorithm is required. As shown in Figure 2a, the new compression algorithm must be tested end to end at different code rate points to establish the correspondence between code rate and accuracy. The code rate threshold for compression with the new compression algorithm is then determined according to the accuracy threshold required by the service and the correspondence between code rate and accuracy.

Comparing the above two processes, it can be seen that the method provided in this application decouples compression from AI processing and realizes staged evaluation. Because the precision of AI processing is independent of the compression algorithm, only the relationship between code rate and distortion needs to be re-tested for a new compression algorithm, without end-to-end testing, which simplifies the testing process and makes the evaluation more efficient.

In the embodiments of the present application, if a new neural network model is to be used to recognize images, the new neural network model can be used to process the labeled data set to obtain the first correspondence of the new neural network model, without re-executing the compression process. Because the distortion degree of the compressed images was already labeled during the previous processing, processing the labeled data set with the new neural network model yields a new first correspondence between the distortion degree and the accuracy.

However, if the existing evaluation method is used, the end-to-end testing process needs to be re-executed according to the framework shown in Figure 2a, with the compressed images processed by the new neural network model. The first correspondence of the new neural network model is then obtained from the processing results and the labeled data set.

Comparing the above two processes, it can be seen that the method provided in this application decouples compression from AI processing and realizes staged evaluation. The accuracy of AI processing depends on the model, while the evaluation of the compression process is independent of the model. A new neural network model can therefore process the labeled data set without re-executing the compression process. Compared with the existing end-to-end testing process, this simplifies the evaluation process and improves evaluation efficiency.

The method for determining the image processing mode of the present application will be described below with reference to specific application scenarios and application examples.

FIG. 7 shows a schematic diagram of an assessment framework according to some examples of the present application. As shown in Figure 7, the evaluation process can be divided into two stages: the first stage and the second stage. The first stage is used to test the compression algorithm, and the second correspondence between the bit rate and the distortion degree can be obtained. The second stage is used to test the recognition accuracy of the neural network, and the first correspondence between distortion and accuracy can be obtained. In the example of FIG. 7 , the degree of distortion may be defined in the RGB domain, and the indicator used for the degree of distortion may be the above-mentioned PSNR, or MSE, or SSIM, or P-loss, or the weighted result of the above multiple indicators.

For compressed transmission systems, distortion is mainly quantization noise introduced by compression coding. For the precision-distortion curve, the distortion degree mainly depends on the energy of the quantization noise introduced by compression and has little to do with the specific form of the noise. In this case, PSNR/MSE is a suitable indicator for measuring compression distortion. PSNR and MSE are related logarithmically, and MSE characterizes the energy of the compression noise. The PSNR/MSE index is simple to calculate and easy to use.
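For reference, the logarithmic relationship between PSNR and MSE can be written out using the standard definition, which the application does not spell out. Here MAX denotes the peak sample value (an assumption for illustration: 255 for 8-bit images).

```latex
\mathrm{PSNR}
  = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}
  = 20 \log_{10} \mathrm{MAX} - 10 \log_{10} \mathrm{MSE}
  \quad \text{(dB)}

\mathrm{MSE} \to \tfrac{1}{2}\,\mathrm{MSE}
  \;\Longrightarrow\;
  \Delta \mathrm{PSNR} = 10 \log_{10} 2 \approx 3.01\ \text{dB}
```

In other words, halving the compression noise energy raises the PSNR by about 3 dB, regardless of how the noise is distributed in the image.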

Example (a) represents the reference evaluation process. The RAW image is processed by the ISP, and the resulting RGB image is output to the deep neural network. The distortion data of the RGB image can be output at the same time. The deep neural network performs machine vision processing on the uncompressed RGB image to obtain the recognition accuracy data.

Example (b) represents the scenario of compressing RAW images in the RAW domain. The RAW images are compressed by the encoder/decoder to obtain compressed images. The ISP processes the compressed images to obtain RGB images and outputs the distortion data of the RGB images. The deep neural network performs machine vision processing on the RGB images to obtain the recognition accuracy data. Compared with compression in the RGB/YUV domain, compressing RAW images in the RAW domain can reduce the complexity of the compression algorithm because the amount of data in the RAW domain is smaller.

Example (c) represents the scenario of compressing RGB images in the RGB domain. The RAW image is processed by the ISP to obtain an RGB image, the encoder/decoder compresses the RGB image to obtain a compressed RGB image, and the distortion data of the compressed RGB image is output. The deep neural network performs machine vision processing on the compressed RGB image, which yields the recognition accuracy data.

Example (d) represents the scenario of compressing YUV images in the YUV domain. The RAW image is processed by the ISP to obtain an RGB image, and an RGB-to-YUV format conversion of the RGB image yields a YUV image. The YUV image is compressed by the encoder/decoder to obtain a compressed image, and a YUV-to-RGB format conversion of the compressed image yields a compressed RGB image, whose distortion data is output. The deep neural network performs machine vision processing on the compressed RGB image, which yields the recognition accuracy data.

Example (a) yields the accuracy of recognizing uncompressed images, that is, the first accuracy Pmax, as shown in FIG. 6a. A variety of different compression algorithms can be tested on the frameworks of example (b), example (c), and example (d), which yields the second correspondence of each compression algorithm on the framework of each example, as well as the first correspondence of each example.

For example, for the three compression algorithms X265_medium, X264_medium, and X264_ultrafast, tests can be performed on the frameworks of example (b), example (c), and example (d) according to the above process.

Taking example (b) as an example, the compression algorithm X265_medium is used to compress the RAW image at different bit rates to obtain compressed images; the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output. The second correspondence between bit rate and distortion degree for the compression algorithm X265_medium can thus be established, as shown in FIG. 6b. The deep neural network then performs machine vision processing on the RGB images to obtain the recognition accuracy data, from which the first correspondence between distortion degree and accuracy can be established.

The compression algorithm X264_medium is likewise used to compress the RAW image at different bit rates to obtain compressed images; the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output. The second correspondence between bit rate and distortion degree for the compression algorithm X264_medium can thus be established, as shown in FIG. 6b. For X264_medium, the subsequent machine vision processing does not need to be tested again: when determining the machine-lossless bit rate, the first correspondence obtained by testing X265_medium can be reused. For the compression algorithm X264_ultrafast, the same process as for X264_medium can be repeated to obtain its second correspondence, as shown in FIG. 6b.

It can be seen that the method of the embodiments of the present application is simple and efficient: it decouples compression from AI processing and enables staged evaluation. The accuracy of AI processing is independent of the compression algorithm. For a new compression algorithm, only the relationship between distortion degree and bit rate needs to be re-tested, without end-to-end testing, which simplifies the testing process and makes the evaluation more efficient.
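The staged evaluation described above can be sketched as follows. This is a toy illustration, not the applicant's test harness: the codecs and the recognizer are hypothetical stand-in functions (`toy_codec`, `toy_recognizer`) that return made-up numbers, distortion is expressed as PSNR in dB, and the bit rate unit is left unspecified. The point is only the call pattern: stage 1 runs once per compression algorithm, stage 2 runs once in total.

```python
from typing import Callable, Dict, List, Tuple

Curve = List[Tuple[float, float]]


def rate_distortion_curve(codec: Callable[[float], float], bitrates: List[float]) -> Curve:
    """Stage 1: encode at each bit rate and record (bit rate, PSNR in dB)."""
    return [(rate, codec(rate)) for rate in bitrates]


def distortion_accuracy_curve(rd_curve: Curve, recognizer: Callable[[float], float]) -> Curve:
    """Stage 2: run recognition at each operating point and record (PSNR, accuracy)."""
    return sorted((psnr, recognizer(psnr)) for _, psnr in rd_curve)


def toy_codec(offset_db: float) -> Callable[[float], float]:
    # Hypothetical stand-in: PSNR grows linearly with bit rate. Not measured data.
    return lambda rate: 30.0 + offset_db + 2.0 * rate


recognizer_calls: List[float] = []


def toy_recognizer(psnr: float) -> float:
    # Hypothetical stand-in for the deep neural network; logs every invocation.
    recognizer_calls.append(psnr)
    return min(0.9, 0.02 * psnr)


BITRATES = [1.0, 2.0, 4.0]
CODECS: Dict[str, Callable[[float], float]] = {
    "X265_medium": toy_codec(3.0),
    "X264_medium": toy_codec(1.0),
    "X264_ultrafast": toy_codec(0.0),
}

# Stage 1 is repeated for every compression algorithm.
second = {name: rate_distortion_curve(codec, BITRATES) for name, codec in CODECS.items()}

# Stage 2 is run only on the output of one algorithm and then shared by all of them.
first = distortion_accuracy_curve(second["X265_medium"], toy_recognizer)

print(len(recognizer_calls))  # number of neural-network evaluations performed
```

In this sketch the recognizer is invoked once per tested bit rate of X265_medium only; adding a fourth codec would add one more entry to `second` without any further recognizer calls.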

Comparing example (b) with example (a) makes it possible to evaluate the impact of the compression algorithm on machine vision processing. For example, when images are compressed as in example (b), the closer the recognition accuracy is to that of example (a), the closer the compression is to being machine-lossless. According to the first correspondence and the second correspondence established from the test data, together with the accuracy threshold required by the service, the machine-lossless bit rate threshold can be determined. For the specific process, refer to the processes in FIG. 5, FIG. 6a and FIG. 6b, which are not repeated here.
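A minimal sketch of this two-step lookup is given below. It assumes that both correspondences are stored as sampled points, that distortion is expressed as PSNR in dB (so a higher value means less distortion), that both curves increase monotonically, and that linear interpolation between samples is acceptable. The sample values and the names `solve_x` and `bitrate_threshold` are illustrative assumptions, not data or identifiers from the application.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (x, y) samples, y increasing with x


def solve_x(curve: Curve, y_target: float) -> float:
    """Return the x at which the piecewise-linear curve reaches y_target."""
    points = sorted(curve)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= y_target <= y1:
            if y1 == y0:
                return x0
            return x0 + (x1 - x0) * (y_target - y0) / (y1 - y0)
    raise ValueError("target lies outside the measured range")


def bitrate_threshold(accuracy_threshold: float, first: Curve, second: Curve) -> Tuple[float, float]:
    """Map accuracy -> distortion threshold -> bit rate threshold."""
    distortion_threshold = solve_x(first, accuracy_threshold)  # first correspondence
    rate_threshold = solve_x(second, distortion_threshold)     # second correspondence
    return distortion_threshold, rate_threshold


# Assumed sample points, for illustration only.
FIRST = [(30.0, 0.60), (35.0, 0.80), (40.0, 0.90)]  # (PSNR dB, accuracy)
SECOND = [(1.0, 32.0), (2.0, 36.0), (4.0, 39.0)]    # (bit rate, PSNR dB)

psnr_min, rate_min = bitrate_threshold(0.85, FIRST, SECOND)
print(psnr_min, rate_min)
```

With these assumed samples, an accuracy threshold of 0.85 maps to a PSNR of about 37.5 dB, which in turn maps to a bit rate of about 3.0 in the units of the second curve.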

FIG. 8 shows a schematic diagram of an evaluation framework according to some examples of the present application. As shown in FIG. 8, the evaluation process can still be decomposed into two stages, a first stage and a second stage, but the stages are divided differently from the example of FIG. 7. In the example of FIG. 8, the distortion degree can be defined in the YUV domain, and PSNR, MSE, SSIM, or P-loss, or a weighted result of several of these indicators, can likewise be used as the distortion indicator. Accordingly, the first stage can consist of compressing the YUV image with a compression algorithm to obtain a compressed YUV image and outputting the distortion data of the compressed YUV image. In the example of FIG. 8, example (e) may represent the reference evaluation process, and example (f) may represent the scenario of compressing an image in the YUV domain. The other processes are similar to the example of FIG. 7 and are not repeated here.

FIG. 9 shows a schematic diagram of an evaluation framework according to some examples of the present application. As shown in FIG. 9, the evaluation process can still be decomposed into two stages, a first stage and a second stage, but the stages are divided differently from the examples of FIG. 7 and FIG. 8. In the example of FIG. 9, the distortion degree can be defined in the RAW domain, and PSNR, MSE, SSIM, or P-loss, or a weighted result of several of these indicators, can likewise be used as the distortion indicator. Accordingly, the first stage can consist of compressing the RAW image with a compression algorithm to obtain a compressed RAW image and outputting the distortion data of the compressed RAW image. In the example of FIG. 9, example (g) may represent the reference evaluation process, and example (h) may represent the scenario of compressing an image in the RAW domain. The other processes are similar to the example of FIG. 7 and are not repeated here.

According to the embodiments of the present application, the method for determining the image processing mode can be applied to various scenarios, is highly general, and is easy to extend to the evaluation of new compression algorithms or AI modules, which can improve the efficiency of evaluation.

The present application further provides an apparatus for determining an image processing mode, and FIG. 10 shows a block diagram of an apparatus for determining an image processing mode according to an embodiment of the present application. As shown in FIG. 10, the apparatus may include: a first determination module 100, configured to determine a distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and a first correspondence, wherein the first correspondence is the correspondence between accuracy and distortion degree; and a second determination module 101, configured to determine a bit rate threshold corresponding to the distortion threshold according to the distortion threshold and a second correspondence, wherein the second correspondence is the correspondence between distortion degree and bit rate.

The device according to the embodiment of the present application decouples the compression algorithm from the AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; it suffices to evaluate only the compression process to obtain the second correspondence for the new compression algorithm. Similarly, if a new AI module is to be used to recognize images, the existing data can be reused to re-evaluate only the AI recognition process and obtain a new first correspondence, again without end-to-end evaluation. The device provided by the embodiment of the present application can therefore improve the efficiency of evaluation.

In a possible implementation manner, the first determination module 100 includes: a first determination unit, configured to determine a second accuracy corresponding to the accuracy threshold according to the accuracy threshold and a first accuracy, wherein the first accuracy is the accuracy of recognizing the original image; and a second determination unit, configured to determine the distortion threshold corresponding to the second accuracy according to the second accuracy and the first correspondence.
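This passage does not spell out how the accuracy threshold and the first accuracy are combined into the second accuracy. The sketch below therefore rests on an assumption: the accuracy threshold is read as a tolerated loss relative to the first accuracy Pmax, either as a fraction of Pmax or as an absolute number of accuracy points. The function name `second_accuracy` and both conventions are hypothetical.

```python
def second_accuracy(first_accuracy: float, allowed_drop: float, relative: bool = True) -> float:
    """Lowest acceptable accuracy on compressed images (assumed convention).

    relative=True  : allowed_drop is a fraction of the first accuracy.
    relative=False : allowed_drop is an absolute number of accuracy points.
    """
    if not 0.0 <= allowed_drop <= 1.0:
        raise ValueError("allowed_drop must lie in [0, 1]")
    if relative:
        return first_accuracy * (1.0 - allowed_drop)
    return first_accuracy - allowed_drop


# Assumed example: Pmax = 0.80 on uncompressed images, 5 % tolerated loss.
print(second_accuracy(0.80, 0.05))                  # relative convention
print(second_accuracy(0.80, 0.05, relative=False))  # absolute convention
```

Whichever convention the service adopts, the resulting second accuracy is the value that is then looked up on the first correspondence to obtain the distortion threshold.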

In a possible implementation manner, the bit rate is a sampling frequency at which the compressed image is obtained by compressing the original image using a compression algorithm, the distortion degree is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.

In a possible implementation manner, the second determination module 101 includes: a third determination unit, configured to determine the compression algorithm used to compress the original image; a fourth determination unit, configured to determine the sub-correspondence corresponding to the compression algorithm; and a fifth determination unit, configured to determine the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.
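The three units can be sketched as a keyed lookup followed by interpolation. The sketch assumes that each sub-correspondence is stored as sampled (bit rate, PSNR in dB) points under the name of its compression algorithm, and that linear interpolation between samples is acceptable. The sample values and the names `SUB_CORRESPONDENCES`, `bitrate_for_psnr`, and `bitrate_threshold` are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Curve = List[Tuple[float, float]]  # (bit rate, PSNR in dB) samples

# Assumed sample points, one sub-correspondence per compression algorithm.
SUB_CORRESPONDENCES: Dict[str, Curve] = {
    "X265_medium": [(1.0, 34.0), (2.0, 38.0), (4.0, 42.0)],
    "X264_medium": [(1.0, 32.0), (2.0, 36.0), (4.0, 40.0)],
}


def bitrate_for_psnr(curve: Curve, psnr_threshold: float) -> float:
    """Interpolate the bit rate at which the curve reaches psnr_threshold."""
    points = sorted(curve)
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if p0 <= psnr_threshold <= p1:
            if p1 == p0:
                return r0
            return r0 + (r1 - r0) * (psnr_threshold - p0) / (p1 - p0)
    raise ValueError("distortion threshold lies outside the measured range")


def bitrate_threshold(codec_name: str, psnr_threshold: float,
                      table: Dict[str, Curve] = SUB_CORRESPONDENCES) -> float:
    """Pick the sub-correspondence of the codec in use, then look up the bit rate."""
    if codec_name not in table:
        raise KeyError(f"no sub-correspondence measured for {codec_name!r}")
    return bitrate_for_psnr(table[codec_name], psnr_threshold)


for name in SUB_CORRESPONDENCES:
    print(name, bitrate_threshold(name, 37.0))
```

The same distortion threshold yields a different bit rate threshold for each codec, which is why the second correspondence is kept per compression algorithm while the first correspondence is shared.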

In a possible implementation manner, the distortion degree is obtained according to one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity indicator (SSIM), and perceptual loss.

In a possible implementation manner, the accuracy is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
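As an illustration of one of these indicators, the sketch below computes mean intersection over union for a semantic segmentation result. It assumes that predictions and ground truth are flat sequences of integer class labels of equal length, and that classes absent from both are skipped; the function name `mean_iou` is illustrative and not taken from the application.

```python
from typing import Sequence


def mean_iou(predicted: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Average, over the classes that occur, of |intersection| / |union|."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, truth) if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, truth) if p == cls or t == cls)
        if union > 0:  # skip classes that appear in neither sequence
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) occurs in the labels")
    return sum(ious) / len(ious)


# Assumed toy labels: four pixels, two classes.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the evaluation framework, such a score would be computed once on uncompressed images to obtain the first accuracy and once per distortion level to build the first correspondence.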

The device for determining the image processing mode may be a chip with processing function or a program module in a processor, and the processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The chip or processor can implement the methods of the foregoing embodiments of the present application by executing the program.

An embodiment of the present application provides an electronic device, including: a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to implement the methods of the foregoing embodiments of the present application when executing the instructions.

The above apparatus or electronic device for determining the image processing mode may be a general-purpose device or a special-purpose device. In a specific implementation, the apparatus can be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with processing functions. The embodiments of the present application do not limit the type of the apparatus for determining the image processing mode.

Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, and the computer program instructions, when executed by a processor, implement the above method.

Embodiments of the present application provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.

A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.

Claims

1. A method for determining an image processing mode, characterized in that the method comprises:

determining, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and distortion degree; and

determining, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between distortion degree and bit rate.

2. The method according to claim 1, wherein determining, according to the accuracy threshold required by the service and the first correspondence, the distortion threshold corresponding to the accuracy threshold comprises:

determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing an original image; and

determining, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.

3. The method according to claim 1 or 2, wherein the bit rate is a sampling frequency at which a compressed image is obtained by compressing the original image using a compression algorithm, the distortion degree is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.

4. The method according to any one of claims 1-3, wherein

the second correspondence is obtained by testing compression algorithms, the second correspondence comprises a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondences corresponding to different compression algorithms are the same.

5. The method according to claim 4, wherein

determining, according to the distortion threshold and the second correspondence, the bit rate threshold corresponding to the distortion threshold comprises:

determining the compression algorithm used to compress the original image;

determining the sub-correspondence corresponding to the compression algorithm; and

determining, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.

6. The method according to any one of claims 1-3, wherein the distortion degree is obtained according to one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity indicator (SSIM), and perceptual loss.

7. The method according to any one of claims 1-3, wherein the accuracy is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
8. A device for determining an image processing mode, characterized in that the device comprises:

a first determination module, configured to determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and distortion degree; and

a second determination module, configured to determine, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between distortion degree and bit rate.

9. The device according to claim 8, wherein the first determination module comprises:

a first determination unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing an original image; and

a second determination unit, configured to determine, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.

10. The device according to claim 8 or 9, wherein the bit rate is a sampling frequency at which a compressed image is obtained by compressing the original image using a compression algorithm, the distortion degree is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.

11. The device according to any one of claims 8-10, characterized in that

the second correspondence is obtained by testing compression algorithms, the second correspondence comprises a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondences corresponding to different compression algorithms are the same.

12. The device according to claim 11, wherein

the second determination module comprises:

a third determination unit, configured to determine the compression algorithm used to compress the original image;

a fourth determination unit, configured to determine the sub-correspondence corresponding to the compression algorithm; and

a fifth determination unit, configured to determine, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.

13. The device according to any one of claims 8-10, wherein the distortion degree is obtained according to one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity indicator (SSIM), and perceptual loss.

14. The device according to any one of claims 8-10, wherein the accuracy is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
15. A computer program product comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method of any one of claims 1-7.

16. An electronic device, comprising:

a processor; and

a memory for storing processor-executable instructions;

wherein the processor is configured to implement the method of any one of claims 1-7 when executing the instructions.

17. A non-volatile computer-readable storage medium on which computer program instructions are stored, characterized in that, when the computer program instructions are executed by a processor, the method of any one of claims 1-7 is implemented.