WO2022205058A1 - Method and apparatus for determining image processing mode - Google Patents

Method and apparatus for determining image processing mode

Publication number
WO2022205058A1
Authority
WO
WIPO (PCT)
Prior art keywords
correspondence
distortion
threshold
accuracy
precision
Prior art date
Application number
PCT/CN2021/084373
Other languages
French (fr)
Chinese (zh)
Inventor
林永兵
马莎
罗达新
高鲁涛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2021/084373 priority Critical patent/WO2022205058A1/en
Priority to CN202180001346.1A priority patent/CN113366531A/en
Publication of WO2022205058A1 publication Critical patent/WO2022205058A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection

Definitions

  • the present application relates to the field of image technology, and in particular, to a method and device for determining an image processing method.
  • the camera has the characteristics of high resolution, non-contact, convenient use and low cost, and is an essential sensor for environmental perception of autonomous driving. More and more cameras can be installed on the vehicle. During automatic driving, the camera collects images in the environment and performs machine vision processing to identify obstacles or targets in the environment, so as to achieve no blind spot coverage.
  • FIG. 1a is a schematic diagram of data transmission in a compression-based sensing system in the related art.
  • the perception system includes a camera and an image signal processor (ISP), and the perception system transmits the processed image data to the mobile data center (MDC), which is further processed by the MDC.
  • the Bayer RAW image output by the camera is processed by the ISP and sent to the MDC, and the MDC performs machine vision processing on the image processed by the ISP.
  • The Bayer RAW image output by the camera in Figure 1a can be an ultra-high-definition (UHD) image with 4K resolution, a frame rate of 30 fps, and a bit depth of 16 bits.
  • Transmitting such images uncompressed requires a bandwidth of up to about 4 Gbps (4K × 2K × 30 × 16).
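  • The bandwidth figure above can be checked with a short calculation. The sketch below assumes that "4K × 2K" means 4096 × 2048 samples per frame (the text does not give exact dimensions) and that each Bayer RAW sample occupies one 16-bit value; the function name is illustrative only.

```python
def raw_bandwidth_bps(width, height, fps, bit_depth):
    """Uncompressed bandwidth in bits per second for single-channel RAW frames."""
    return width * height * fps * bit_depth


# Assumed frame size: 4096 x 2048 samples, 30 frames per second, 16 bits per sample.
bps = raw_bandwidth_bps(4096, 2048, 30, 16)
print(f"{bps / 1e9:.2f} Gbps")
```

Under these assumptions the result is slightly above 4 Gbps, which matches the order of magnitude stated in the text.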
  • the method of compressing and transmitting images can be used to reduce bandwidth requirements, and new services of UHD video transmission can be carried out without upgrading the existing network.
  • FIG. 1b shows a schematic diagram of an architecture of video compression according to an example in the related art.
  • As shown in Figure 1b, the camera outputs an image in RAW format, and an encoder encodes it into a compressed image that is still in RAW format. The encoded image can be transmitted to the MDC, which can include a decoder, an ISP, and a deep neural network.
  • The decoder decodes the received compressed image to obtain a decoded image, and the ISP then processes the decoded image and outputs an image in three-primary-color (red, green, blue; RGB) or YUV format. This image is passed to the deep neural network for further processing.
  • The ISP processing may include: a demosaic operation, which converts an image from RAW format to RGB format; a white balance (WB) operation, which performs white balance processing on the image; a color correction matrix (CCM), which converts from the sensor_RGB color space to the sRGB color space so that the color matching characteristics of the camera meet the Luther condition; and gamma correction, which corrects for the nonlinear relationship between the display characteristics of the display and the input image.
  • the ISP processing may also include other processing procedures for images, and the present application is not limited to the above-mentioned processing.
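  • To make the order of the ISP stages concrete, the following is a minimal per-pixel sketch of white balance, CCM, and gamma correction. It is not the ISP of the application: demosaicing is omitted, and the gain values, the identity matrix, and the simple power-law gamma of 2.2 are all assumed values chosen for illustration.

```python
def white_balance(rgb, gains):
    """Scale each channel by its white-balance gain."""
    return tuple(c * g for c, g in zip(rgb, gains))


def apply_ccm(rgb, ccm):
    """Multiply the RGB vector by a 3x3 color correction matrix."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in ccm)


def gamma_encode(rgb, gamma=2.2):
    """Clamp to [0, 1] and apply a simple power-law gamma (assumed value)."""
    return tuple(min(max(c, 0.0), 1.0) ** (1.0 / gamma) for c in rgb)


def isp_pixel(rgb, gains, ccm, gamma=2.2):
    """Run one already-demosaiced pixel through WB, CCM, and gamma in that order."""
    return gamma_encode(apply_ccm(white_balance(rgb, gains), ccm), gamma)


IDENTITY_CCM = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # placeholder matrix
out = isp_pixel((0.25, 0.5, 0.25), (2.0, 1.0, 2.0), IDENTITY_CCM)
print(out)
```

With these placeholder gains the greenish input pixel becomes neutral gray before gamma encoding, which is the intended effect of white balance.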
  • the processing of the image by the deep neural network can include: image recognition, segmentation, etc.
  • In the example shown in Figure 1b, compression is performed in the RAW domain, which can reduce the delay from the perception system to the MDC; the ISP and the deep neural network can both be placed in the MDC, which provides more flexible ISP capabilities and better image quality.
  • lossy image/video compression technology can achieve higher compression rates.
  • Commonly used lossy compression standards include Joint Photographic Experts Group (JPEG), H.264/H.265, and JPEG-XS (Joint Photographic Experts Group Extra Speed).
  • JPEG-XS is a new compression standard proposed by the Joint Photographic Experts Group.
  • the image quality damage caused by the introduction of compression technology is inevitable, and the image quality damage will have an impact on subsequent machine vision processing, which may lead to problems such as a decrease in the accuracy of recognition and inaccurate image segmentation.
  • the evaluation method proposed in the related art is an end-to-end evaluation, that is to say, the evaluation method in the related art is an evaluation of the accuracy of the entire process from the front-end compression processing to the back-end artificial intelligence processing. If you want to evaluate different compression algorithms, the end-to-end evaluation method is relatively inefficient.
  • a method and device for determining an image processing method are proposed, which realizes the decoupling of compression algorithm and AI processing, and can improve the efficiency of evaluation.
  • An embodiment of the present application provides a method for determining an image processing mode, the method comprising: determining, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and degree of distortion; and determining, according to the distortion threshold and a second correspondence, a code rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between degree of distortion and code rate.
  • The accuracy threshold required by the service may refer to the accuracy requirements of different application scenarios, such as automatic driving or assisted driving. Different application scenarios may have different processing accuracy requirements, so each application scenario has its own accuracy threshold required by the service.
  • The degree of distortion is introduced as an intermediate variable: the first correspondence is used to evaluate the influence of post-compression distortion on accuracy, and the second correspondence is used to evaluate the distortion produced by compressing at different code rates. The compression process and the post-compression processing are thus evaluated separately, which decouples compression from post-compression processing and improves evaluation efficiency.
  • The post-compression processing may be AI processing, so the method according to the embodiment of the present application decouples the compression algorithm from the AI processing. If a new compression algorithm is to be evaluated, no end-to-end evaluation is needed; the second correspondence for the new compression algorithm can be obtained by evaluating the compression process alone. Similarly, if a new AI module is to be used to recognize images, existing data can be used to re-evaluate the AI recognition process and obtain a new first correspondence, again without end-to-end evaluation.
  • The methods provided in the embodiments of the present application can therefore improve evaluation efficiency.
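  • The two-stage determination described above can be sketched as two table lookups chained through the distortion value. This is only an illustration under stated assumptions: both correspondences are stored as monotonic point lists, values between measured points are linearly interpolated (the text does not prescribe an interpolation scheme), distortion is expressed as MSE, and every number and name below is hypothetical.

```python
def interp(x, points):
    """Piecewise-linear interpolation over (x, y) points sorted by ascending x."""
    if len(points) < 2 or not points[0][0] <= x <= points[-1][0]:
        raise ValueError("need at least two points and x inside the measured range")
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical first correspondence: accuracy (mAP) -> tolerated distortion (MSE).
FIRST = [(0.60, 40.0), (0.70, 20.0), (0.75, 10.0)]
# Hypothetical second correspondence for one codec: distortion (MSE) -> bit rate (Mbps).
SECOND = [(10.0, 8.0), (20.0, 4.0), (40.0, 2.0)]


def rate_threshold(required_accuracy):
    """Stage 1: accuracy -> distortion threshold. Stage 2: distortion -> bit rate."""
    distortion_thr = interp(required_accuracy, FIRST)
    return distortion_thr, interp(distortion_thr, SECOND)


print(rate_threshold(0.70))
```

Because the two tables are independent, replacing `SECOND` with measurements for another codec leaves `FIRST` untouched, which is the decoupling the text describes.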
  • The accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy, that is, the accuracy obtained without compression loss.
  • Determining the distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and the first correspondence includes: determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing the original image; and determining, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
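  • One way to read this step is sketched below. It assumes that the accuracy threshold is an allowed drop relative to the first accuracy, so the second accuracy is obtained by subtraction; the text does not state the formula, so this is an interpretation. The table values are hypothetical, with accuracy in mAP percentage points and distortion as MSE.

```python
def second_accuracy(first_accuracy, accuracy_threshold):
    """Assumed relation: lowest acceptable accuracy = baseline minus allowed drop."""
    return first_accuracy - accuracy_threshold


def distortion_threshold(target_accuracy, first_correspondence):
    """Return the largest measured distortion whose accuracy still meets the target.

    first_correspondence: list of (distortion, accuracy) pairs, assumed to show
    accuracy falling as distortion grows.
    """
    acceptable = [dist for dist, acc in first_correspondence if acc >= target_accuracy]
    if not acceptable:
        raise ValueError("no measured point meets the target accuracy")
    return max(acceptable)


# Hypothetical first correspondence: (distortion as MSE, mAP in percent).
FIRST_TABLE = [(5.0, 75.5), (10.0, 74.2), (20.0, 71.0), (40.0, 62.0)]

target = second_accuracy(76.0, 2.0)
print(target, distortion_threshold(target, FIRST_TABLE))
```

Picking the nearest measured point on the safe side, rather than interpolating, keeps the result conservative when the table is sparse.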
  • The code rate is the sampling frequency used when the compressed image is obtained by compressing the original image with a compression algorithm; the degree of distortion is the difference between the compressed image and the real environment; and the accuracy is the accuracy of recognizing the compressed image.
  • The second correspondence is obtained by testing the compression algorithm. The second correspondence includes a plurality of different sub-correspondences, each of which corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
  • Determining the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and the second correspondence includes: determining the compression algorithm used for compressing the original image; determining the sub-correspondence corresponding to the compression algorithm; and determining, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
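  • The selection of a sub-correspondence per compression algorithm can be sketched with a dictionary keyed by algorithm name. The codec names, the measured points, and the choice of the lowest feasible measured bit rate are assumptions made for illustration.

```python
# Hypothetical second correspondence: one sub-correspondence per compression algorithm,
# each a list of (bit rate in Mbps, distortion as MSE) points.
SECOND_CORRESPONDENCE = {
    "codec_a": [(2.0, 40.0), (4.0, 20.0), (8.0, 10.0)],
    "codec_b": [(2.0, 30.0), (4.0, 12.0), (8.0, 6.0)],
}


def bitrate_threshold(algorithm, distortion_thr, second_correspondence):
    """Pick the sub-correspondence for the algorithm, then the lowest feasible rate."""
    sub = second_correspondence[algorithm]
    feasible = [rate for rate, dist in sub if dist <= distortion_thr]
    if not feasible:
        raise ValueError("no measured bit rate meets the distortion threshold")
    return min(feasible)


for name in SECOND_CORRESPONDENCE:
    print(name, bitrate_threshold(name, 15.0, SECOND_CORRESPONDENCE))
```

The same distortion threshold yields a different bit rate threshold for each codec, while the first correspondence that produced the threshold is shared.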
  • the determination method of the embodiment of the present application is simple, efficient, and easy to expand. For example, if a new compression algorithm is to be used for image compression, the new compression algorithm can be evaluated for different bit rate points. Specifically, a new compression algorithm is used to compress the image for different bit rate points, and the distortion degree of the compressed image is output to obtain the second correspondence corresponding to the new compression algorithm. There is no need to perform AI processing on the compressed image to obtain the processing accuracy of the AI module, and there is no need to re-establish a new first correspondence, and the first correspondence established before can be used. The processor can establish a second correspondence corresponding to the new compression algorithm.
  • The processor can then work from the accuracy threshold required by the service: it looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding bit rate threshold, that is, the bit rate threshold that meets the accuracy threshold required by the service when the new compression algorithm is used.
  • The determination method provided in this application decouples compression from AI processing and enables staged evaluation. Because the accuracy of AI processing is independent of the compression algorithm, only the distortion and bit rate need to be re-tested for a new compression algorithm; no end-to-end testing is required, so the testing process is simpler and evaluation is more efficient.
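  • Evaluating a new compression algorithm at several bit rate points, without running any AI module, can be sketched as follows. The "codec" here is a deliberately trivial stand-in that keeps only the top bits of each 8-bit sample, the bit rate point is expressed as bits per sample, and distortion is measured as MSE against the original samples; none of this comes from the application.

```python
def toy_compress(pixels, bits):
    """Stand-in codec: keep only the top `bits` bits of each 8-bit sample."""
    shift = 8 - bits
    return [(p >> shift) << shift for p in pixels]


def mse(reference, test):
    """Mean squared error between two equally long sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(reference, test)) / len(reference)


def build_second_correspondence(pixels, rate_points):
    """Measure distortion at each bit rate point; no recognition step is involved."""
    return [(bits, mse(pixels, toy_compress(pixels, bits))) for bits in rate_points]


pixels = list(range(256))  # synthetic test samples
curve = build_second_correspondence(pixels, [2, 4, 6, 8])
print(curve)
```

Only this table has to be rebuilt for a new codec; a previously established first correspondence can be reused as is.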
  • The degree of distortion is obtained from one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity (SSIM), and perceptual loss.
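  • MSE and PSNR, two of the distortion indicators listed above, can be computed as in the following sketch. It assumes 8-bit samples (peak value 255) and flat sample lists, and it uses the original image as the reference.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally long sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(reference, test)) / len(reference)


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


print(psnr([100, 100], [110, 90]))
```

Note that PSNR rises as distortion falls, whereas MSE rises with distortion, so the direction of a threshold comparison depends on which indicator is used.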
  • The accuracy is obtained from one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (mIoU).
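  • As an example of one accuracy indicator, the sketch below computes mean intersection over union for a segmentation result. It assumes per-pixel class labels in flat lists and skips classes that appear in neither the prediction nor the ground truth; the label data are made up.

```python
def mean_iou(pred, truth, num_classes):
    """Average the per-class intersection-over-union over classes that occur."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious)


truth = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
print(mean_iou(pred, truth, num_classes=2))
```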
  • The original image is a Bayer RAW image, the compressed image is a red-green-blue (RGB) image, and compressing the original image means compressing it in the RAW domain, the RGB domain, or the YUV domain.
  • The original image is a Bayer RAW image, the compressed image is a YUV image, and compressing the original image means compressing it in the YUV domain.
  • Both the original image and the compressed image are Bayer RAW images, and compressing the original image means compressing it in the RAW domain.
  • An embodiment of the present application provides a device for determining an image processing mode, the device including: a first determining module, configured to determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is the correspondence between accuracy and degree of distortion; and a second determining module, configured to determine, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, wherein the second correspondence is the correspondence between degree of distortion and code rate.
  • The apparatus in the embodiment of the present application uses the first correspondence to evaluate the influence of post-compression distortion on accuracy, and the second correspondence to evaluate the distortion produced by compressing at different code rates. The compression process and the post-compression processing are thus evaluated separately, which decouples compression from post-compression processing and improves evaluation efficiency.
  • The post-compression processing may be AI processing, so the device according to the embodiment of the present application decouples the compression algorithm from the AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; the second correspondence for the new compression algorithm can be obtained by evaluating the compression process alone. Similarly, if a new AI module is to be used to recognize images, existing data can be used to re-evaluate the AI recognition process and obtain a new first correspondence, again without end-to-end evaluation.
  • The device provided by the embodiment of the present application can therefore improve evaluation efficiency.
  • The first determination module includes: a first determination unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognizing the original image; and a second determination unit, configured to determine, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
  • The code rate is the sampling frequency used when the compressed image is obtained by compressing the original image with a compression algorithm; the degree of distortion is the difference between the compressed image and the real environment; and the accuracy is the accuracy of recognizing the compressed image.
  • The second correspondence is obtained by testing the compression algorithm. The second correspondence includes a plurality of different sub-correspondences, each of which corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
  • The second determination module includes: a third determination unit, configured to determine the compression algorithm used for compressing the original image; a fourth determination unit, configured to determine the sub-correspondence corresponding to the compression algorithm; and a fifth determination unit, configured to determine, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
  • The device of the embodiment of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new compression algorithm can be evaluated at different bit rate points. Specifically, the new compression algorithm is used to compress images at different bit rate points, and the degree of distortion of each compressed image is output to obtain the second correspondence for the new compression algorithm. The processor can then establish the second correspondence for the new compression algorithm.
  • The processor can then work from the accuracy threshold required by the service: it looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding bit rate threshold, that is, the bit rate threshold that meets the accuracy threshold required by the service when the new compression algorithm is used.
  • the device provided in this application can decouple the process of compression and AI processing, and realize staged evaluation.
  • the precision of AI processing has nothing to do with the compression algorithm. For the new compression algorithm, it only needs to re-test the distortion degree and bit rate. End-to-end testing is not required, which simplifies the testing process and increases evaluation efficiency.
  • The degree of distortion is obtained from one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity (SSIM), and perceptual loss.
  • The accuracy is obtained from one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (mIoU).
  • Embodiments of the present application provide an electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor, when executing the instructions, implements the method for determining an image processing mode of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method for determining an image processing mode of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, they implement the method for determining an image processing mode of the first aspect or of one or more of the possible implementations of the first aspect.
  • An embodiment of the present application further provides a sensor system for providing a sensing function for a vehicle. The system includes at least one device for determining the image processing mode mentioned in the above embodiments of the present application, and at least one other sensor such as a camera or a radar. At least one sensor device in the system can be integrated into a whole machine or device, or can be provided independently as an element or device.
  • the embodiments of the present application further provide a system, which is applied in unmanned driving or intelligent driving, which includes at least one device for determining the image processing method mentioned in the above-mentioned embodiments of the present application, and sensors such as cameras and radars.
  • At least one device in the system can be integrated into a whole machine or equipment, or at least one device in the system can also be independently set as a component or device.
  • an embodiment of the present application further provides a vehicle, where the vehicle includes at least one image processing method determination device or any of the above-mentioned systems mentioned in the above-mentioned embodiments of the present application.
  • FIG. 1a is a schematic diagram of data transmission in a compression-based sensing system in the related art.
  • FIG. 1b shows a schematic diagram of an architecture of video compression according to an example in the related art.
  • Fig. 2a shows a schematic diagram of an evaluation framework according to an embodiment of the present application.
  • FIG. 2b shows a schematic diagram of a rate-precision curve according to an embodiment of the present application.
  • FIG. 3 shows a schematic diagram of an application scenario of a method for determining an image processing mode according to an embodiment of the present application.
  • Fig. 4a shows a schematic diagram of a curve of the first correspondence according to an embodiment of the present application.
  • FIG. 4b shows a schematic diagram of a curve of the second correspondence according to an embodiment of the present application.
  • FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application.
  • FIG. 6a shows a schematic diagram of a manner of determining a distortion threshold according to an embodiment of the present application.
  • FIG. 6b shows a schematic diagram of a manner of determining a code rate threshold according to an embodiment of the present application.
  • FIG. 7 shows a schematic diagram of an assessment framework according to some examples of the present application.
  • FIG. 8 shows a schematic diagram of an assessment framework according to some examples of the present application.
  • FIG. 9 shows a schematic diagram of an assessment framework according to some examples of the present application.
  • FIG. 10 shows a block diagram of an apparatus for determining an image processing method according to an embodiment of the present application.
  • Fig. 2a shows a schematic diagram of an evaluation framework according to an embodiment of the present application. The evaluation framework shown in Fig. 2a is a machine-vision-oriented image quality evaluation method defined by the Moving Picture Experts Group (MPEG) Video Coding for Machines (VCM) working group, and it uses an end-to-end evaluation process.
  • the camera outputs the video processed by the ISP to the VCM encoder.
  • the video processed by the ISP can be in RGB or YUV format.
  • the VCM encoder encodes the video to obtain the encoded video, and the encoded video is transmitted.
  • the VCM decoder performs video decoding to obtain the decoded video, and performs machine vision processing on the decoded video.
  • the decoded image can be output to a neural network, and machine vision processing is performed through the neural network.
  • the evaluation framework shown in Figure 2a is a tightly coupled system.
  • The camera, the compression algorithm (encoder + decoder), and the neural network (NN) modules are coupled together. Evaluating the compression performance of a compression algorithm therefore requires an end-to-end evaluation, which is complex and inefficient.
  • As shown in Fig. 2a, for different compression algorithms, determining the code rate threshold at which compression still meets the machine-lossless requirement calls for end-to-end testing with the framework of Fig. 2a to obtain the accuracy at multiple bit rate points of each compression algorithm.
  • The accuracy threshold required by the service may refer to the accuracy requirements of different application scenarios, such as automatic driving or assisted driving. Different application scenarios may have different processing accuracy requirements, so each application scenario has its own accuracy threshold required by the service.
  • The accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy, that is, the accuracy obtained without compression loss.
  • FIG. 2b shows a schematic diagram of a rate-precision curve according to an embodiment of the present application.
  • In Figure 2b, the abscissa represents the bit rate and the ordinate represents the accuracy; the accuracy metric may be the mean average precision (mAP).
  • The dotted line represents the recognition accuracy on the original (uncompressed) images, and the other three curves show how the recognition accuracy on compressed images varies with bit rate for three different compression algorithms: the X265 default configuration (X265_medium), the X264 default configuration (X264_medium), and the X264 fast configuration (X264_ultrafast).
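  • Reading a bit rate threshold off end-to-end rate-accuracy curves of this kind can be sketched as follows. The curves below are invented placeholders, not values from Figure 2b, and generic codec names are used on purpose; the tolerance around the uncompressed baseline is likewise an assumption.

```python
def lowest_rate_near_baseline(curve, baseline_accuracy, tolerance):
    """Lowest measured bit rate whose accuracy is within `tolerance` of the baseline.

    curve: (bit rate, accuracy) points measured end to end for one codec.
    """
    target = baseline_accuracy - tolerance
    feasible = [rate for rate, acc in curve if acc >= target]
    return min(feasible) if feasible else None


BASELINE = 75.0  # hypothetical mAP (%) on uncompressed images (the dotted line)
CURVES = {  # hypothetical (bit rate in Mbps, mAP in %) measurements
    "codec_a": [(1.0, 60.0), (2.0, 70.0), (4.0, 74.0), (8.0, 74.8)],
    "codec_b": [(1.0, 50.0), (2.0, 64.0), (4.0, 71.0), (8.0, 74.2)],
}

result = {name: lowest_rate_near_baseline(c, BASELINE, 1.0) for name, c in CURVES.items()}
print(result)
```

Every point on such a curve requires a full compress, decode, and recognize run, which is why adding a codec or a recognition model to an end-to-end evaluation is costly.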
  • The IEEE-P2020 standard working group on image quality evaluation of imaging systems for autonomous driving has defined probability-based machine vision evaluation indicators for autonomous driving, including contrast detection probability (CDP), color separation probability (CSP), and geometric resolution probability (GRP), to achieve module-level evaluation of perception systems.
  • These probabilistic indicators are used to characterize the imaging quality of the machine vision-oriented perception system to measure the impact of image quality on subsequent machine vision AI processing.
  • However, these indicators consider only the capabilities of the imaging system, in isolation from the back-end AI processing tasks, and therefore cannot adequately reflect the impact of image quality on AI processing.
  • FIG. 3 shows a schematic diagram of an application scenario of a method for determining an image processing mode according to an embodiment of the present application.
  • As shown in Figure 3, a compression module can compress the received image. During compression, the image can be sampled (the sampling frequency is the code rate), and the compressed image can be transmitted to the AI module for target detection, image segmentation, and other processing.
  • The embodiments of the present application can be tested directly on an existing test set; for example, the test can be performed on the Cityscapes data set.
  • The Cityscapes data set includes training images, validation images, and test images. The images are annotated, so they can be compressed and recognized directly to obtain the test result data of the embodiments of the present application (the distortion and accuracy data corresponding to each code rate).
  • Here, the code rate is the sampling frequency used when the compressed image is obtained by compressing the original image with the compression algorithm, and the degree of distortion is the difference between the compressed image and the real environment.
  • The simulation device may include the above-mentioned compression module, AI module, and processor, where the compression module and the AI module may be software programs stored in the memory of the simulation device, and the processor may call the corresponding software programs to process the images and obtain the test result data.
  • The processor may establish the correspondence between the degree of distortion and the accuracy (the first correspondence) and the correspondence between the code rate and the degree of distortion (the second correspondence), and store the first correspondence and the second correspondence.
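  • Establishing and storing the two correspondences from test result data can be sketched as splitting each test record into two tables. The record layout of (bit rate, distortion, accuracy) and all values are assumptions for illustration.

```python
def build_correspondences(records):
    """Split (bit rate, distortion, accuracy) records into the two correspondences.

    Returns (first, second), where first is a sorted list of (distortion, accuracy)
    pairs and second is a sorted list of (bit rate, distortion) pairs.
    """
    first = sorted({(dist, acc) for _, dist, acc in records})
    second = sorted({(rate, dist) for rate, dist, _ in records})
    return first, second


# Hypothetical test result data for one compression algorithm.
records = [(8.0, 10.0, 74.2), (2.0, 40.0, 62.0), (4.0, 20.0, 71.0)]
first, second = build_correspondences(records)
print(first)
print(second)
```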
  • the methods of the embodiments of the present application can also be tested in actual application scenarios, for example, tested on an automatic driving system.
  • The automatic driving system may include a camera, and may also include, but is not limited to, other sensors such as in-vehicle terminals, in-vehicle controllers, in-vehicle modules, in-vehicle components, in-vehicle chips, in-vehicle units, in-vehicle radar, or in-vehicle cameras.
  • The compression module may be an encoder located on the camera; the compression module may also include a decoder located on the MDC. The AI module and the processor may both be located on the MDC, or the AI module may be located on the MDC while the processor is a processor of an external device (a device other than the autonomous driving system).
  • the external device can be a general-purpose device or a dedicated device.
  • the external device may be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or other devices with processing functions.
  • This embodiment of the present application does not limit the type of the external device.
  • the external device may have a chip or processor with a processing function (such as the processor shown in FIG. 3), the external device may include multiple processors, and each processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the method for determining the image processing mode of the present application may be performed offline by the above-mentioned external device.
  • the code rate of the encoder can be set during the test. After the camera captures the image, the encoder encodes it and sends it to the MDC. The image decoded by the decoder can be stored in the MDC.
  • the AI module can identify the decoded image and obtain precision data. For different code rate points, the code rate of the encoder can be set multiple times and the above process repeated to obtain the test result data.
  • the obtained test result data can be output to an external device. The external device can obtain the distortion degree of the compressed image according to the decoded image, establish the correspondence between the distortion degree and the precision (the first correspondence) and the correspondence between the code rate and the distortion degree (the second correspondence), and store the first correspondence and the second correspondence.
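  • The offline test loop described above can be sketched as follows. This is only a minimal illustration under stated assumptions: `run_test` is a hypothetical callback that stands in for the encode, decode, and recognition chain at one code rate, and the numbers in `fake_results` are made up for demonstration, not measurements.

```python
from typing import Callable, Iterable, List, Tuple

Point = Tuple[float, float]


def build_correspondences(
    rates: Iterable[float],
    run_test: Callable[[float], Tuple[float, float]],
) -> Tuple[List[Point], List[Point]]:
    """Sweep code-rate points and build the two correspondences.

    run_test(rate) is a hypothetical callback: it compresses and recognizes
    the test images at the given code rate and returns (distortion, precision),
    for example (PSNR in dB, mAP).
    """
    first: List[Point] = []   # (distortion, precision) pairs
    second: List[Point] = []  # (code rate, distortion) pairs
    for rate in sorted(rates):
        distortion, precision = run_test(rate)
        first.append((distortion, precision))
        second.append((rate, distortion))
    first.sort()  # keep the first correspondence ordered by distortion
    return first, second


# Made-up illustrative results: code rate -> (PSNR in dB, mAP).
fake_results = {1.0: (30.0, 0.50), 2.0: (34.0, 0.58), 4.0: (38.0, 0.62)}
first_corr, second_corr = build_correspondences(
    fake_results, fake_results.__getitem__
)
```

  • Storing the two tables separately is what later allows one of them to be re-measured without touching the other.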
  • the automatic driving system can execute the method for determining the image processing method of the embodiments of the present application online.
  • the code rate of the encoder can be set during testing; after the camera captures an image, the image is encoded by the encoder and sent to the MDC.
  • the image decoded by the decoder can be stored in the MDC.
  • the MDC can obtain the distortion degree of the compressed image according to the decoded image.
  • accuracy data can be obtained by having the AI module identify the decoded image.
  • the MDC can establish the correspondence between the degree of distortion and the precision (the first correspondence) and the correspondence between the code rate and the degree of distortion (the second correspondence), and store the first correspondence and the second correspondence.
  • the MDC can also obtain the bit rate threshold corresponding to the precision threshold according to the precision threshold required by the service, the first correspondence, and the second correspondence, and set the bit rate used by the encoder according to the bit rate threshold. In this way, the encoding can be lossless from the point of view of machine processing.
  • both the AI module and the processor are located on the MDC, which is only an example of the application, and does not limit the application in any way.
  • the AI module and the processor may also be located on other components of the automatic driving system, which is not limited in this application.
  • the index used for the distortion degree may be the peak signal-to-noise ratio (PSNR), the mean square error (MSE), the structural similarity index (SSIM), or the perception loss (P-loss).
  • the degree of distortion may also be a combination of multiple indicators above, for example, combining multiple indicators of distortion to comprehensively evaluate the degree of distortion of a compressed image.
  • a weighted index of PSNR and SSIM can be used as the final distortion degree, which can be applied to applications requiring both signal fidelity (PSNR) and human vision (SSIM).
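  • One possible way to form such a weighted index is sketched below. The weight `w_psnr`, the cap `psnr_cap_db`, and the normalization of PSNR to the range 0 to 1 are all assumptions made for illustration; the application does not specify how the two indicators are scaled or weighted.

```python
def combined_distortion_index(
    psnr_db: float,
    ssim: float,
    w_psnr: float = 0.5,
    psnr_cap_db: float = 50.0,
) -> float:
    """Weighted combination of PSNR and SSIM (higher means less distortion).

    PSNR is clipped to [0, psnr_cap_db] and scaled to [0, 1] so that it is
    comparable with SSIM; both the cap and the weight are assumed values.
    """
    if not 0.0 <= w_psnr <= 1.0:
        raise ValueError("w_psnr must lie in [0, 1]")
    psnr_norm = min(max(psnr_db, 0.0), psnr_cap_db) / psnr_cap_db
    return w_psnr * psnr_norm + (1.0 - w_psnr) * ssim


index = combined_distortion_index(psnr_db=25.0, ssim=0.8)
```

  • A weighted precision index, for example of mAP and AR, could be built in the same manner.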
  • the processing accuracy of the AI module can be obtained, and the accuracy can be the accuracy of identifying the compressed image by the AI module.
  • when the compression algorithm is used to compress the image at different bit rate points, the distortion degree and precision data corresponding to each bit rate point can be obtained.
  • the indicators used for precision may be mean average precision (mAP), average precision (AP), average recall (AR), or mean intersection over union (MIoU).
  • the AP may be AP50, AP60, AP70, or weightedAP, and so on.
  • the indicators used for accuracy can also be a combination of multiple indicators above.
  • the accuracy of AI processing can be comprehensively evaluated by combining multiple indicators above.
  • the weighted index of mAP and AR can be used as the precision index, and the present application does not limit the specific index used for the precision.
  • the first correspondence and the second correspondence may be one-to-one values stored in the form of table entries, or may be expressed in the form of functions, which are not limited in this application.
  • the first corresponding relationship may be represented in the form shown in Table 1
  • the second corresponding relationship may be represented in the form shown in Table 2.
  • the first correspondence can also be expressed in the form of a function, as shown in the following formula (1):
  • fi(D) represents the functional relationship between the precision and the degree of distortion in the numerical range Di, and i is a positive integer from 1 to n.
  • the relationship between precision and distortion can be expressed in the form of a piecewise function.
  • P and D may have a linear relationship.
  • the relationship between accuracy and distortion can be expressed as a piecewise linear function.
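  • A piecewise form consistent with the definitions above could be written as follows. This is a reconstruction from the surrounding text rather than the original typeset formula; the coefficients a_i and b_i of the linear case are hypothetical symbols introduced here.

```latex
P = f(D) =
\begin{cases}
f_1(D), & D \in D_1 \\
f_2(D), & D \in D_2 \\
\;\;\vdots \\
f_n(D), & D \in D_n
\end{cases}
\qquad
\text{with } f_i(D) = a_i D + b_i \text{ in the piecewise linear case.}
```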
  • Fig. 4a shows a schematic diagram of a curve of the first correspondence according to an embodiment of the present application.
  • the abscissa may represent the degree of distortion
  • the ordinate may represent the accuracy.
  • the index of the degree of distortion is PSNR
  • the index of accuracy is mAP.
  • the curves of the first correspondence for the three different compression algorithms almost overlap; that is, the first correspondence is independent of the specific compression algorithm used.
  • the performance of machine vision mainly depends on the distortion degree of the input image, not on the compression algorithm that produced it.
  • the performance of machine vision is related to the specific neural network used.
  • the second correspondence can also be expressed in the form of a function, as shown in the following formula (2):
  • gi(R) represents the functional relationship between the distortion degree and the code rate in the numerical range Ri.
  • the relationship between the distortion degree and the code rate can be expressed in the form of a piecewise function.
  • D and R may have a linear relationship, and the relationship between the distortion degree and the code rate may be expressed as a piecewise linear function.
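  • By analogy with the first correspondence, the second correspondence could be written in the following piecewise form. This is a reconstruction from the surrounding text; the number of ranges m and the coefficients c_i and d_i of the linear case are assumptions introduced here.

```latex
D = g(R) =
\begin{cases}
g_1(R), & R \in R_1 \\
g_2(R), & R \in R_2 \\
\;\;\vdots \\
g_m(R), & R \in R_m
\end{cases}
\qquad
\text{with } g_i(R) = c_i R + d_i \text{ in the piecewise linear case.}
```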
  • FIG. 4b shows a schematic diagram of a curve of the second correspondence according to an embodiment of the present application.
  • the abscissa may represent the code rate
  • the ordinate may represent the degree of distortion.
  • the index of the degree of distortion used is PSNR.
  • the curves of the second correspondence for the three different compression algorithms are relatively scattered; that is, even if different compression algorithms compress the image at the same bit rate, the distortion degrees of the resulting compressed images differ considerably.
  • the second correspondence therefore depends on the specific compression algorithm used.
  • the decoupling of the compression algorithm and AI processing is realized. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; it is sufficient to evaluate only the compression process to obtain the second correspondence corresponding to the new compression algorithm. Similarly, if a new AI module is to be used to recognize images, the existing data can be used to re-evaluate only the AI recognition process to obtain a new first correspondence, without end-to-end evaluation.
  • the methods provided in the embodiments of the present application can improve the efficiency of evaluation.
  • the corresponding precision threshold may be determined according to the precision indicators required by different services, the distortion threshold corresponding to the precision threshold may be determined according to the precision threshold and the first correspondence, and the code rate threshold corresponding to the distortion threshold may be determined according to the distortion threshold and the second correspondence. In this way, the code rate used for compression can be determined according to different service requirements.
  • FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application.
  • the image processing mode may refer to a code rate used for compressing an image
  • determining the image processing mode may refer to a process of determining a code rate used for compressing an image.
  • the method for determining the image processing mode may include the following steps:
  • Step S500: determine a distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and a first correspondence, wherein the first correspondence is a correspondence between the accuracy and the degree of distortion.
  • Step S501: determine a bit rate threshold corresponding to the distortion threshold according to the distortion threshold and a second correspondence, wherein the second correspondence is a correspondence between the degree of distortion and the bit rate.
  • the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the first accuracy, where the first accuracy may be the accuracy of recognizing the original image, that is, the uncompressed image; the first accuracy is the accuracy value marked by the dotted line in FIG. 4a.
  • the distortion degree is introduced as an intermediate variable
  • the first correspondence relationship is used to evaluate the influence of the distortion degree after compression on the accuracy
  • the second correspondence relationship is used to evaluate the distortion degree after compression using different code rates
  • the compression process and the post-compression process are evaluated separately, which decouples compression from post-compression processing and improves the evaluation efficiency.
  • step S500 may include: determining a second precision corresponding to the precision threshold according to the precision threshold and a first precision, where the first precision is the precision of identifying the original image; and determining the distortion threshold corresponding to the second precision according to the second precision and the first correspondence.
  • FIG. 6a shows a schematic diagram of a manner of determining a distortion threshold according to an embodiment of the present application.
  • Pth can represent the precision threshold required by the service, that is, the difference between the second precision required by the service and the first precision.
  • Pmax can represent the first precision, that is, the precision of recognizing the uncompressed image.
  • P can represent the second precision, that is, the precision required by the service.
  • PSNR th can represent the distortion threshold, that is, the degree of distortion corresponding to the precision required by the service.
  • the PSNR th corresponding to P can be obtained according to the curve shown in Figure 6a.
  • the distortion threshold D (PSNR th) corresponding to the second precision P can be calculated specifically by using the function fi(D) in the formula (1) according to the range of P.
  • the code rate points tested in the testing process may be discrete.
  • the processor may use linear interpolation to process the points not stored in the first correspondence. For example, if the corresponding precision threshold is determined according to the precision index required by the business, but the precision data corresponding to the precision threshold is not stored in the first correspondence, the processor can obtain the precision data adjacent to the precision threshold in the first correspondence , the distortion threshold corresponding to the precision threshold can be obtained by performing linear interpolation according to the precision data adjacent to the precision threshold and the corresponding distortion degree data. Taking Table 1 as an example, assuming that the determined second precision P is greater than P1 but less than P2, then the distortion threshold corresponding to the second precision P can be calculated by the following linear interpolation formula (3):
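  • The interpolation step can be sketched as below. Assuming D1 and D2 denote the distortion entries stored alongside P1 and P2 (names introduced here, since Table 1 is not reproduced), linear interpolation gives D = D1 + (P - P1)(D2 - D1)/(P2 - P1). The sketch also assumes that the second precision is obtained as P = Pmax - Pth, and the table values are made up for illustration.

```python
from typing import Sequence, Tuple


def distortion_threshold(
    p_max: float,
    p_th: float,
    table: Sequence[Tuple[float, float]],
) -> float:
    """Return the distortion threshold for the precision P = p_max - p_th.

    table holds (precision, distortion) pairs of the first correspondence,
    sorted by increasing precision. Values between stored points are
    obtained by linear interpolation.
    """
    p = p_max - p_th  # second precision (assumed sign convention)
    if not table or not table[0][0] <= p <= table[-1][0]:
        raise ValueError("required precision lies outside the tested range")
    for (p1, d1), (p2, d2) in zip(table, table[1:]):
        if p1 <= p <= p2:
            if p2 == p1:
                return d1
            return d1 + (p - p1) * (d2 - d1) / (p2 - p1)
    raise ValueError("table must contain at least two sorted points")


# Made-up first correspondence: (mAP, PSNR in dB).
table_1 = [(0.50, 30.0), (0.58, 34.0), (0.62, 38.0)]
psnr_th = distortion_threshold(p_max=0.62, p_th=0.02, table=table_1)
```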
  • the bit rate threshold corresponding to the distortion threshold can also be determined according to the specific storage method. If the second correspondence is stored in the form of a function, then the calculation method shown in formula (2) can be stored, which is actually a curve storing the correspondence between the distortion degree and the code rate.
  • FIG. 6b shows a schematic diagram of a manner of determining a code rate threshold according to an embodiment of the present application.
  • PSNR th may represent the distortion threshold determined in step S500, that is, the distortion degree corresponding to the accuracy required by the service.
  • the correspondence between the distortion degree and the code rate can be expressed in the form of formula (2). After the distortion threshold is determined, the code rate threshold can be determined according to the range in which the distortion threshold falls.
  • the example shown in FIG. 6b includes three curves of the second correspondence corresponding to three different compression algorithms, and each curve has a corresponding function expression. According to the determined distortion threshold and the function expression corresponding to each curve, the code rate thresholds corresponding to the three compression algorithms can be determined: R_X265_medium, R_X264_medium, and R_X264_ultrafast.
  • the processor may use linear interpolation to process the points that are not stored in the second correspondence. For a specific manner, reference may be made to the process of determining the distortion threshold through linear interpolation, which will not be repeated here.
  • the second correspondence is obtained by testing a compression algorithm
  • the second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to a compression algorithm, and the first correspondences corresponding to different compression algorithms are the same.
  • the first correspondence is independent of the specific compression algorithm used. Therefore, the first correspondence corresponding to different compression algorithms may be the same. That is to say, if the way in which the processor (such as the processor of the MDC) performs machine vision processing on the image does not change, the same first correspondence can be used to evaluate the image quality for different compression algorithms. Specifically, the test data obtained by identifying the compressed images produced by different compression algorithms can be fitted to obtain the curve of the first correspondence, or the first correspondence between the distortion degree of the compressed image and the recognition precision can be stored directly, which is not limited in this application.
  • the distortion degree of the compressed image obtained by compressing the image at the same bit rate may differ between compression algorithms. Therefore, the second correspondences of different compression algorithms may be different. As shown in FIG. 4b and FIG. 6b, the rate-distortion curves corresponding to the three compression algorithms X265_medium, X264_medium, and X264_ultrafast are different.
  • determining the bit rate threshold corresponding to the distortion threshold may include: determining the compression algorithm used to compress the original image; determining the sub-correspondence corresponding to the compression algorithm; and determining the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.
  • the sub-correspondence (second correspondence) corresponding to the compression algorithm is pre-stored in the processor, and the processor can receive the input compression algorithm in addition to the input precision threshold.
  • the processor can determine the distortion threshold corresponding to the accuracy threshold according to the accuracy threshold and the first correspondence, determine the compression algorithm used to compress the original image according to the input compression algorithm, and look up the pre-stored sub-correspondence corresponding to that compression algorithm.
  • according to this sub-correspondence and the distortion threshold, the processor can determine the bit rate threshold corresponding to the distortion threshold.
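  • The two lookups, including the selection of the sub-correspondence by compression algorithm, can be sketched as follows. The helper names and all curve values are hypothetical and chosen only for illustration; the sketch assumes both correspondences are stored as sorted point tables and that each curve is monotonically increasing.

```python
from typing import Dict, List, Tuple

Curve = List[Tuple[float, float]]  # (x, y) points sorted by increasing x


def interp_inverse(curve: Curve, y: float) -> float:
    """Return x such that curve(x) == y, assuming y increases with x."""
    for (x1, y1), (x2, y2) in zip(curve, curve[1:]):
        if y1 <= y <= y2:
            if y2 == y1:
                return x1
            return x1 + (y - y1) * (x2 - x1) / (y2 - y1)
    raise ValueError("value outside the tested range")


def rate_threshold(
    required_precision: float,
    first_corr: Curve,             # (distortion, precision) points
    sub_corrs: Dict[str, Curve],   # algorithm -> (code rate, distortion) points
    algorithm: str,
) -> float:
    # Step S500: precision -> distortion threshold via the first correspondence.
    d_th = interp_inverse(first_corr, required_precision)
    # Step S501: distortion threshold -> code rate via the sub-correspondence.
    return interp_inverse(sub_corrs[algorithm], d_th)


# Made-up example data (PSNR in dB, mAP, code rate in arbitrary units).
first = [(30.0, 0.50), (34.0, 0.58), (38.0, 0.62)]
subs = {
    "X265_medium": [(1.0, 30.0), (2.0, 36.0), (4.0, 40.0)],
    "X264_medium": [(1.0, 28.0), (2.0, 32.0), (4.0, 36.0), (8.0, 40.0)],
}
r_x265 = rate_threshold(0.60, first, subs, "X265_medium")
r_x264 = rate_threshold(0.60, first, subs, "X264_medium")
```

  • With these made-up curves, the same precision requirement maps to a different code rate for each algorithm, while the first correspondence is shared.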
  • the method of the embodiment of the present application is simple, efficient, and easy to expand. For example, if a new compression algorithm is to be used for image compression, the new compression algorithm can be evaluated for different bit rate points. Specifically, a new compression algorithm is used to compress the image for different bit rate points, and the distortion degree of the compressed image is output to obtain the second correspondence corresponding to the new compression algorithm. There is no need to perform AI processing on the compressed image to obtain the processing accuracy of the AI module, and there is no need to re-establish a new first correspondence, and the first correspondence established before can be used.
  • the processor can establish a second correspondence corresponding to the new compression algorithm. If the code rate threshold for compression with the new compression algorithm is to be determined according to the accuracy threshold required by the service, the processor can look up the established first correspondence according to the accuracy threshold to determine the corresponding distortion threshold, and then look up the second correspondence corresponding to the new compression algorithm according to the distortion threshold to determine the corresponding bit rate threshold, that is, the bit rate threshold that meets the accuracy threshold required by the service when the new compression algorithm is used.
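  • The extension to a new compression algorithm can be sketched as below. `StagedEvaluator`, `add_algorithm`, and `compress_and_measure` are hypothetical names; the point of the sketch is only that registering a new algorithm measures distortion per code rate and leaves the shared first correspondence untouched.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Curve = List[Tuple[float, float]]


class StagedEvaluator:
    """Keeps one shared first correspondence and one second correspondence
    per compression algorithm."""

    def __init__(self, first_corr: Curve) -> None:
        self.first_corr = first_corr              # (distortion, precision)
        self.second_corrs: Dict[str, Curve] = {}  # name -> (rate, distortion)

    def add_algorithm(
        self,
        name: str,
        rates: Iterable[float],
        compress_and_measure: Callable[[float], float],
    ) -> None:
        """Evaluate only the compression stage of a new algorithm.

        compress_and_measure(rate) is a hypothetical callback returning the
        distortion of the image compressed at that rate; no AI inference is
        run here.
        """
        self.second_corrs[name] = [
            (rate, compress_and_measure(rate)) for rate in sorted(rates)
        ]


evaluator = StagedEvaluator(first_corr=[(30.0, 0.50), (38.0, 0.62)])
# Made-up distortion model standing in for a real codec measurement.
evaluator.add_algorithm("new_codec", [4.0, 1.0, 2.0], lambda r: 30.0 + 2.0 * r)
```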
  • the method provided in this application can decouple the compression and AI processing processes, and realize staged evaluation.
  • the precision of AI processing is independent of the compression algorithm.
  • for a new compression algorithm, only the distortion degree and the code rate need to be re-tested, without end-to-end testing, which simplifies the testing process and makes the evaluation more efficient.
  • the new neural network model can be used to process the labeled data set to obtain the first correspondence of the new neural network model, without re-executing the compression process. Because the distortion degree of each compressed image has already been marked in the previous evaluation, processing the marked data set with the new neural network model yields a new first correspondence between the distortion degree and the accuracy.
  • the end-to-end testing process needs to be re-implemented according to the framework shown in Figure 2a, and the compressed image is processed with a new neural network model. According to the processing results and the labeled dataset, The first correspondence of the new neural network model can be obtained.
  • the method provided in this application can decouple the compression and AI processing processes to realize staged evaluation.
  • the accuracy of AI processing is related to the model, and the evaluation of the compression process is independent of the model.
  • a new network model only needs to process the labeled data set, without re-executing the compression process. Compared with the existing end-to-end testing process, this simplifies the evaluation process and improves the evaluation efficiency.
  • FIG. 7 shows a schematic diagram of an assessment framework according to some examples of the present application.
  • the evaluation process can be divided into two stages: the first stage and the second stage.
  • the first stage is used to test the compression algorithm, and the second correspondence between the bit rate and the distortion degree can be obtained.
  • the second stage is used to test the recognition accuracy of the neural network, and the first correspondence between distortion and accuracy can be obtained.
  • the degree of distortion may be defined in the RGB domain, and the indicator used for the degree of distortion may be the above-mentioned PSNR, or MSE, or SSIM, or P-loss, or the weighted result of the above multiple indicators.
  • distortion is mainly quantization noise introduced by compression coding.
  • the degree of distortion mainly depends on the energy of the compressed and quantized noise and has little to do with the specific noise form.
  • PSNR/MSE becomes a suitable indicator to measure the compression distortion.
  • PSNR and MSE are related logarithmically. MSE characterizes the energy of the compression noise, and the PSNR/MSE index is simple to calculate and easy to use.
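  • The logarithmic relationship is the standard definition PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible sample value. A minimal sketch, assuming 8-bit samples (MAX = 255) and images given as flat sequences of sample values:

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], test: Sequence[float]) -> float:
    """Mean square error between two equally sized sample sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)


def psnr(
    reference: Sequence[float],
    test: Sequence[float],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, test)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


original = [0, 0, 0, 0]
compressed = [0, 0, 0, 2]
error = mse(original, compressed)     # one sample differs by 2
quality = psnr(original, compressed)  # in dB, for 8-bit samples
```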
  • Example (a) represents the evaluation process of the reference.
  • the RAW image is processed by the ISP and then the RGB image is output to the deep neural network.
  • the distortion data of the RGB image can be output.
  • the deep neural network performs machine vision processing on the uncompressed RGB image to obtain the recognition accuracy data.
  • Example (b) represents the scene of compressing RAW images in the RAW domain.
  • the RAW images are compressed by the encoder/decoder to obtain the compressed images.
  • the ISP can process the compressed images to obtain RGB images, and the distortion data of the RGB images is output; the deep neural network then performs machine vision processing on the RGB images to obtain the recognition accuracy data.
  • compressing RAW images in RAW domain can reduce the complexity of compression algorithm because the amount of data in RAW domain is less.
  • Example (c) represents the scene of compressing RGB images in the RGB domain.
  • the RAW image is processed by ISP to obtain an RGB image, and the encoder/decoder compresses the RGB image to obtain a compressed image, and outputs the distortion of the compressed RGB image.
  • the accuracy data of the recognition can be obtained by the machine vision processing of the compressed RGB image by the deep neural network.
  • Example (d) represents a scene where YUV images are compressed in the YUV domain.
  • RAW images are processed by the ISP to obtain RGB images.
  • YUV images are obtained by converting the RGB images from RGB to YUV format.
  • the YUV images are compressed by the encoder/decoder to obtain compressed images.
  • the compressed images are converted from YUV back to RGB format to obtain compressed RGB images, the distortion data of the compressed RGB images is output, and the deep neural network performs machine vision processing on the compressed RGB images.
  • in this way, the recognition accuracy data can be obtained.
  • Example (a) can obtain the accuracy of recognizing uncompressed images, that is, the first accuracy Pmax, as shown in FIG. 6a.
  • the frameworks of example (b), example (c), and example (d) can be used for testing, to obtain the second correspondence corresponding to each compression algorithm on the framework of each example, as well as the first correspondence of each example.
  • tests can be performed on the frameworks of example (b), example (c), and example (d) according to the above process.
  • for example, the compression algorithm X265_medium is used to compress the RAW image at different bit rates to obtain compressed images; the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output.
  • from this data, the second correspondence between the bit rate and the distortion degree corresponding to the compression algorithm X265_medium can be established.
  • the deep neural network performs machine vision processing on the RGB images to obtain the recognition accuracy data, from which the first correspondence between the distortion degree and the accuracy can be established.
  • the compression algorithm X264_medium is then used to compress the RAW image at different bit rates to obtain compressed images; the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output, from which the second correspondence corresponding to the compression algorithm X264_medium can be established.
  • for X264_medium, the first correspondence obtained by testing the compression algorithm X265_medium may be reused; for the compression algorithm X264_ultrafast, the same process as for X264_medium may be repeated to obtain the corresponding second correspondence, as shown in FIG. 6b.
  • the method of the embodiment of the present application is simple and efficient, decouples the process of compression and AI processing, and realizes staged evaluation.
  • the precision of AI processing has nothing to do with the compression algorithm.
  • for a new compression algorithm, only the distortion degree and the code rate need to be re-tested, without end-to-end testing, which simplifies the testing process and makes the evaluation more efficient.
  • by comparing example (b) with example (a), the impact of the compression algorithm on machine vision processing can be evaluated. For example, when images are compressed as in example (b), the closer the recognition accuracy is to the accuracy of example (a), the closer the compression is to being lossless for the machine. According to the first correspondence and the second correspondence established from the test result data, together with the accuracy threshold required by the service, the machine-lossless code rate threshold can be determined. For the specific process, refer to the processes in FIG. 5, FIG. 6a, and FIG. 6b, which will not be repeated here.
  • FIG. 8 shows a schematic diagram of an assessment framework according to some examples of the present application.
  • the evaluation process can still be decomposed into two stages: the first stage and the second stage, and the way of dividing the stages is different from that in the example of FIG. 7 .
  • the distortion degree can be defined in the YUV domain, and PSNR, or MSE, or SSIM, or P-loss, or the weighted result of the above multiple indicators can also be used as the distortion degree indicator. Therefore, the first stage can be divided into using a compression algorithm to compress the YUV image to obtain a compressed YUV image, and the distortion data of the compressed YUV image can be output.
  • example (e) may represent a reference evaluation process, and example (f) may represent a scene of compressing an image in the YUV domain. Other processes are similar to the example in FIG. 7 and will not be repeated.
  • the evaluation process can still be decomposed into two stages, the first stage and the second stage, but the stages are divided differently from the examples of FIG. 7 and FIG. 8.
  • the distortion degree can be defined in the RAW domain, and PSNR, or MSE, or SSIM, or P-loss, or the weighted result of the above multiple indicators can also be used as the distortion degree indicator. Therefore, the first stage can be divided into using a compression algorithm to compress the RAW image to obtain a compressed RAW image, and the distortion data of the compressed RAW image can be output.
  • example (g) may represent a reference evaluation process
  • example (h) may represent a scene of compressing an image in the RAW domain.
  • Other processes are similar to the example in FIG. 7 , and will not be repeated here.
  • the method for determining the image processing method of the present application can be applied to various scenarios, has strong versatility, and is easy to extend the evaluation of new compression algorithms or AI modules, which can improve the efficiency of evaluation.
  • FIG. 10 shows a block diagram of an apparatus for determining an image processing method according to an embodiment of the present application.
  • the apparatus may include: a first determination module 100, configured to determine a distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and a first correspondence, wherein the first correspondence is the correspondence between the precision and the degree of distortion; and a second determination module 101, configured to determine the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and a second correspondence, wherein the second correspondence is the correspondence between the degree of distortion and the bit rate.
  • the apparatus in the embodiment of the present application uses the first correspondence relationship to evaluate the influence of the distortion degree after compression on the accuracy, and the second correspondence relationship to evaluate the distortion degree after compression with different code rates.
  • the compression process and the post-compression process are evaluated separately, which decouples compression from post-compression processing and improves the evaluation efficiency.
  • the device realizes the decoupling of the compression algorithm and the AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required; it is sufficient to evaluate only the compression process to obtain the second correspondence corresponding to the new compression algorithm. Similarly, if a new AI module is to be used to recognize images, the existing data can be used to re-evaluate only the AI recognition process to obtain a new first correspondence, without end-to-end evaluation.
  • the device provided by the embodiment of the present application can improve the efficiency of evaluation.
  • the first determining module 100 includes: a first determining unit, configured to determine a second accuracy corresponding to the accuracy threshold according to the accuracy threshold and the first accuracy, wherein the first accuracy is the accuracy of identifying the original image; and a second determining unit, configured to determine the distortion threshold corresponding to the second accuracy according to the second accuracy and the first correspondence.
  • the bit rate is a sampling frequency at which the compressed image is obtained by compressing the original image using a compression algorithm
  • the distortion degree is the difference between the compressed image and the real environment
  • the accuracy is the accuracy of identifying the compressed image.
  • the second correspondence is obtained by testing a compression algorithm
  • the second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to a compression algorithm, and the first correspondences corresponding to different compression algorithms are the same.
  • the second determining module 101 includes: a third determining unit, configured to determine a compression algorithm used to compress the original image; and a fourth determining unit, configured to determine the corresponding compression algorithm The sub-correspondence relationship of ; the fifth determination unit is configured to determine the bit rate threshold value corresponding to the distortion threshold value according to the distortion threshold value and the sub-correspondence relationship.
  • the device of the embodiment of the present application is simple, efficient, and easy to expand.
  • the new compression algorithm can be evaluated for different bit rate points.
  • a new compression algorithm is used to compress the image for different bit rate points, and the distortion degree of the compressed image is output to obtain the second correspondence corresponding to the new compression algorithm.
  • the processor can establish a second correspondence corresponding to the new compression algorithm.
  • the processor can look up the established first correspondence with the accuracy threshold required by the service to determine the corresponding distortion threshold, and then look up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding bit rate threshold
  • this is the bit rate threshold that meets the accuracy threshold required by the service when compressing with the new compression algorithm.
  • the device provided in this application can decouple the process of compression and AI processing, and realize staged evaluation.
  • the precision of AI processing has nothing to do with the compression algorithm. For the new compression algorithm, it only needs to re-test the distortion degree and bit rate. End-to-end testing is not required, which simplifies the testing process and increases evaluation efficiency.
  • the distortion degree is obtained according to one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity indicator (SSIM), and perceptual loss.
  • the precision is obtained according to one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
  • the device for determining the image processing mode may be a chip with processing function or a program module in a processor, and the processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the chip or processor can implement the methods of the foregoing embodiments of the present application by executing the program.
  • An embodiment of the present application provides an electronic device, including: a processor and a memory for storing instructions executable by the processor; wherein the processor is configured to implement the methods of the foregoing embodiments of the present application when executing the instructions.
  • the apparatus or electronic device for determining the above image processing method may be a general-purpose device or a special-purpose device.
  • the apparatus can also be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with processing functions.
  • the embodiments of the present application do not limit the type of the apparatus for determining the image processing method.
  • Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, implement the above method.
  • Embodiments of the present application provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application relates to a method and apparatus for determining an image processing mode, which can be applied to assistant driving and autonomous driving. The method comprises: according to a precision threshold value of a service requirement and a first correspondence, determining a distortion threshold value corresponding to the precision threshold value, wherein the first correspondence is a correspondence between precision and a distortion degree; and according to the distortion threshold value and a second correspondence, determining a code rate threshold value corresponding to the distortion threshold value, wherein the second correspondence is a correspondence between the distortion degree and a code rate. By means of the method provided by the embodiments of the present application, the decoupling of a compression algorithm and AI processing is realized. If a new compression algorithm needs to be evaluated, there is no need to perform end-to-end evaluation, such that the efficiency of evaluating an autonomous driving or assistant driving system can be improved. The method can be applied to the Internet of vehicles, such as vehicle-to-everything (V2X), long term evolution-vehicle (LTE-V) and vehicle-to-vehicle (V2V).

Description

Method and device for determining image processing mode
Technical field
The present application relates to the field of image technology, and in particular to a method and device for determining an image processing mode.
Background
With the development of society, intelligent terminals such as intelligent transportation equipment, smart home equipment, and robots are gradually entering people's daily lives. Sensors play a very important role in intelligent terminals. The various sensors installed on an intelligent terminal, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, perceive the surrounding environment while the terminal is moving, collect data, identify and track moving objects, recognize static scenery such as lane lines and signs, and perform path planning in combination with navigator and map data. Sensors can detect possible dangers in advance and assist in taking, or even autonomously take, the necessary evasive measures, which effectively increases the safety and comfort of intelligent terminals.
Cameras offer high resolution, non-contact sensing, ease of use, and low cost, and are an essential sensor for environmental perception in autonomous driving. More and more cameras can be installed on a vehicle. During autonomous driving, the cameras capture images of the environment, and machine vision processing is performed on them to identify obstacles or targets in the environment, thereby achieving coverage without blind spots.
As camera parameters such as resolution, frame rate, and sampling depth continue to increase, the video output by a camera demands more and more transmission bandwidth. FIG. 1a is a schematic diagram of data transmission in a compression-based perception system in the related art. As shown in FIG. 1a, the perception system includes a camera and an image signal processor (ISP), and transmits the processed image data to a mobile data center (MDC), which processes it further. Specifically, the Bayer RAW image output by the camera is processed by the ISP and then sent to the MDC, and the MDC performs machine vision processing on the ISP-processed image.
The Bayer RAW image output by the camera in FIG. 1a may be an ultra high definition (UHD) image with 4K resolution, a frame rate of 30 fps, and a bit depth of 16 bits, in which case the bandwidth requirement of the image stream is as high as 4 Gbps (4K*2K*30*16). To relieve the pressure on the transmission network, the images can be compressed before transmission to reduce the bandwidth requirement, so that new UHD video transmission services can be deployed without upgrading the existing network.
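The 4 Gbps figure can be checked with a few lines of arithmetic. The sketch below assumes a 4096 x 2048 frame, because the text only gives "4K*2K"; the frame rate and bit depth come from the text, and the function name is illustrative.

```python
# Raw (uncompressed) bandwidth of the camera stream described above.
WIDTH = 4096     # assumed horizontal resolution for "4K"
HEIGHT = 2048    # assumed vertical resolution for "2K"
FPS = 30         # frames per second, from the text
BIT_DEPTH = 16   # bits per sample, from the text


def raw_bandwidth_bps(width: int, height: int, fps: int, bit_depth: int) -> int:
    """Bits per second for a Bayer RAW stream (one sample per pixel)."""
    return width * height * fps * bit_depth


bps = raw_bandwidth_bps(WIDTH, HEIGHT, FPS, BIT_DEPTH)
print(f"{bps / 1e9:.2f} Gbps")
```

With a 3840 x 2160 frame instead, the same formula gives a value just under 4 Gbps, so the order of magnitude does not depend on which reading of "4K" is used.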
Autonomous driving has high safety requirements, so an autonomous driving system is sensitive to the latency of the perception system. Taking the scenario shown in FIG. 1a as an example of a perception system, the requirements on the compression algorithm may include: support for encoding RAW-format images, low latency, low complexity, and high compression performance. To meet these requirements, an architecture for video compression in the RAW domain has been designed in the related art. FIG. 1b shows a schematic diagram of a video compression architecture according to an example in the related art. As shown in FIG. 1b, the camera outputs an image in RAW format, and an encoder encodes it and outputs a compressed image in RAW format, which can be transmitted to the MDC. The MDC may include a decoder, an ISP, and a deep neural network. The decoder decodes the received compressed image to obtain a decoded image, which is processed by the ISP and then output in three-primary-color (Red Green Blue, RGB) or YUV format to the deep neural network for further processing.
The ISP processing may include: a demosaic operation, which converts the image from RAW format to RGB format; a white balance (WB) operation, which performs white balance processing on the image; a color correction matrix (CCM), which converts the sensor_RGB color space to the sRGB color space so that the color matching characteristics of the camera satisfy the Luther condition; and gamma correction, which corrects for the nonlinear relationship between the display characteristics of the display and the input image. The ISP processing may also include other image processing steps, and the present application is not limited to the processing above. The processing performed on the image by the deep neural network may include image recognition, segmentation, and the like.
Compressing in the RAW domain, as in the example shown in FIG. 1b, can reduce the delay from the perception system to the MDC. The ISP and the deep neural network shown in FIG. 1b can be placed in the MDC, which provides more flexible ISP capability and better image quality, and reduces the delay from the perception system to the MDC.
Lossy image/video compression can achieve a high compression ratio. Commonly used lossy compression standards include Joint Photographic Experts Group (JPEG), H.264/H.265, and JPEG-XS (Joint Photographic Experts Group Extra Speed). JPEG-XS is a new compression standard proposed by the Joint Photographic Experts Group. The loss of image quality introduced by compression is unavoidable, and it affects subsequent machine vision processing: it may lower recognition accuracy, make image segmentation inaccurate, and cause similar problems.
To evaluate how the image quality loss caused by compression affects subsequent artificial intelligence (AI) processing, some image quality evaluation methods have been proposed in the related art to determine at what bit rate threshold compression still meets the machine-lossless requirement. Machine lossless means that, compared with the uncompressed image, the accuracy metric for recognizing the compressed image is within a certain error range. In other words, the difference between the accuracy metric for recognizing the compressed image and that for recognizing the original (uncompressed) image is within a certain error range.
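The machine-lossless condition amounts to a single comparison. The following is a minimal sketch; the function name, the example accuracies, and the tolerance are illustrative and not taken from the application.

```python
def is_machine_lossless(acc_original: float, acc_compressed: float,
                        tolerance: float) -> bool:
    """True if the accuracy on compressed images stays within `tolerance`
    of the accuracy on the original (uncompressed) images."""
    return abs(acc_original - acc_compressed) <= tolerance


# Illustrative values only: 0.80 accuracy on originals, 0.02 allowed error.
print(is_machine_lossless(0.80, 0.79, 0.02))
print(is_machine_lossless(0.80, 0.70, 0.02))
```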
The evaluation methods proposed in the related art are end-to-end: they evaluate the accuracy of the whole process, from the compression at the front end to the AI processing at the back end. When different compression algorithms have to be evaluated, end-to-end evaluation is inefficient.
Summary of the invention
In view of this, a method and device for determining an image processing mode are proposed, which decouple the compression algorithm from the AI processing and can improve evaluation efficiency.
In a first aspect, an embodiment of the present application provides a method for determining an image processing mode. The method includes: determining, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, where the first correspondence is a correspondence between accuracy and distortion degree; and determining, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, where the second correspondence is a correspondence between distortion degree and bit rate.
The accuracy threshold required by a service may refer to the accuracy requirement of a particular application scenario, such as autonomous driving or assisted driving. Different application scenarios may have different requirements for processing accuracy, so each application scenario has its own service-required accuracy threshold.
The method of the embodiments of the present application introduces the distortion degree as an intermediate variable. The first correspondence is used to evaluate how the distortion degree after compression affects accuracy, and the second correspondence is used to evaluate the distortion degree after compression at different bit rates. Evaluating the compression process separately from the processing that follows compression decouples the two and improves evaluation efficiency.
For example, the processing after compression may be AI processing, and the method of the embodiments of the present application decouples the compression algorithm from the AI processing. To evaluate a new compression algorithm, no end-to-end evaluation is needed: it is sufficient to evaluate only the compression process to obtain the second correspondence for the new compression algorithm. Similarly, to recognize images with a new AI module, it is sufficient to re-evaluate the AI recognition process on existing data to obtain a new first correspondence, again without end-to-end evaluation. The method provided by the embodiments of the present application can therefore improve evaluation efficiency.
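The two-stage lookup of the first aspect can be sketched as follows. This is only an illustration under several assumptions that the application does not fix: the distortion degree is expressed as PSNR in dB (higher means less distortion), the accuracy threshold is treated as a minimum required accuracy, each correspondence is a table of measured points, and values between points are linearly interpolated. All numbers and names are invented.

```python
from typing import List, Tuple

# A correspondence is a list of measured (x, y) points sorted by x,
# with y non-decreasing in x. All numbers are invented for illustration.
Curve = List[Tuple[float, float]]


def smallest_x_reaching(curve: Curve, target_y: float) -> float:
    """Smallest x at which the piecewise-linear curve reaches target_y."""
    if target_y <= curve[0][1]:
        return curve[0][0]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y0 < target_y <= y1:
            return x0 + (target_y - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target is not reachable on the measured curve")


# First correspondence: distortion (PSNR, dB) -> recognition accuracy.
FIRST: Curve = [(30.0, 0.60), (35.0, 0.70), (40.0, 0.76), (45.0, 0.78)]
# Second correspondence: bit rate (Mbps) -> PSNR (dB) for one algorithm.
SECOND: Curve = [(100.0, 30.0), (200.0, 35.0), (400.0, 40.0), (800.0, 45.0)]


def bit_rate_threshold(required_accuracy: float) -> Tuple[float, float]:
    """Return (distortion threshold, bit rate threshold)."""
    distortion_threshold = smallest_x_reaching(FIRST, required_accuracy)
    rate_threshold = smallest_x_reaching(SECOND, distortion_threshold)
    return distortion_threshold, rate_threshold


psnr_min, rate_min = bit_rate_threshold(0.73)
print(f"PSNR >= {psnr_min:.1f} dB, bit rate >= {rate_min:.0f} Mbps")
```

Because the two tables are independent, replacing `SECOND` with measurements for another compression algorithm leaves `FIRST` untouched, which is the decoupling the text describes.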
In a possible implementation, the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy. According to the first aspect, in a first possible implementation, determining the distortion threshold corresponding to the accuracy threshold according to the accuracy threshold required by the service and the first correspondence includes: determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, where the first accuracy is the accuracy of recognizing the original image; and determining, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
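A minimal sketch of this implementation follows. It assumes that the accuracy threshold is an allowed drop, so that the second accuracy equals the first accuracy minus the threshold (the text does not state the sign convention), and that the first correspondence is a table of (PSNR in dB, accuracy) pairs. The table values and function names are invented.

```python
from typing import List, Tuple


def second_accuracy(first_accuracy: float, accuracy_threshold: float) -> float:
    """Assumed convention: the threshold is the allowed accuracy drop."""
    return first_accuracy - accuracy_threshold


def distortion_threshold(first_correspondence: List[Tuple[float, float]],
                         target_accuracy: float) -> float:
    """Lowest measured PSNR (dB) whose accuracy still meets the target."""
    candidates = [psnr_db for psnr_db, acc in first_correspondence
                  if acc >= target_accuracy]
    if not candidates:
        raise ValueError("no measured point meets the target accuracy")
    return min(candidates)


# Invented (PSNR dB, accuracy) measurements for one AI module.
first_correspondence = [(30.0, 0.60), (35.0, 0.70), (40.0, 0.76), (45.0, 0.78)]

target = second_accuracy(first_accuracy=0.78, accuracy_threshold=0.03)
threshold_db = distortion_threshold(first_correspondence, target)
print(threshold_db)
```

Picking the nearest measured point rather than interpolating is a conservative design choice: the returned threshold is always a distortion level that was actually tested.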
According to the first aspect or the first possible implementation of the first aspect, in a second possible implementation, the bit rate is the sampling frequency at which the compressed image is obtained by compressing the original image with a compression algorithm, the distortion degree is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.
According to the first aspect or either of the first and second possible implementations of the first aspect, in a third possible implementation, the second correspondence is obtained by testing compression algorithms and includes a plurality of different sub-correspondences, each corresponding to one compression algorithm, while the first correspondence is the same for different compression algorithms.
According to the third possible implementation of the first aspect, in a fourth possible implementation, determining the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and the second correspondence includes: determining the compression algorithm used to compress the original image; determining the sub-correspondence corresponding to that compression algorithm; and determining the bit rate threshold corresponding to the distortion threshold according to the distortion threshold and the sub-correspondence.
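One way to picture the sub-correspondences is a dictionary keyed by compression algorithm. In the sketch below, the algorithm names `codec_a` and `codec_b`, the (bit rate in Mbps, PSNR in dB) pairs, and the use of PSNR as the distortion measure are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical second correspondence: one sub-correspondence per algorithm,
# each a list of (bit rate in Mbps, PSNR in dB) measurements.
SUB_CORRESPONDENCES: Dict[str, List[Tuple[float, float]]] = {
    "codec_a": [(100.0, 31.0), (200.0, 36.0), (400.0, 41.0)],
    "codec_b": [(100.0, 29.0), (200.0, 33.0), (400.0, 38.0), (800.0, 43.0)],
}


def bit_rate_threshold(algorithm: str, distortion_threshold_db: float) -> float:
    """Lowest measured bit rate of `algorithm` that meets the PSNR threshold."""
    table = SUB_CORRESPONDENCES[algorithm]  # select the sub-correspondence
    rates = [rate for rate, psnr_db in table
             if psnr_db >= distortion_threshold_db]
    if not rates:
        raise ValueError("no measured bit rate meets the distortion threshold")
    return min(rates)


print(bit_rate_threshold("codec_a", 40.0))
print(bit_rate_threshold("codec_b", 40.0))
```

The same distortion threshold can map to different bit rate thresholds for different algorithms, while the first correspondence that produced the distortion threshold is shared.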
The determining method of the embodiments of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new compression algorithm can be evaluated at different bit rate points. Specifically, the image is compressed with the new compression algorithm at each bit rate point and the distortion degree of the compressed image is output, which yields the second correspondence for the new compression algorithm. There is no need to run AI processing on the compressed images to obtain the accuracy of the AI module, and no need to establish a new first correspondence; the previously established first correspondence can be reused. Once the processor has established the second correspondence for the new compression algorithm, the bit rate threshold for compressing with the new algorithm can be determined from the accuracy threshold required by the service as follows: the processor looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding bit rate threshold, which is the bit rate threshold that satisfies the service-required accuracy threshold when compressing with the new algorithm. The determining method provided in the present application thus decouples compression from AI processing and enables staged evaluation. Because the accuracy of AI processing is independent of the compression algorithm, a new compression algorithm only requires re-testing distortion degree against bit rate; no end-to-end testing is needed, which simplifies testing and makes evaluation more efficient.
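Building the second correspondence for a new algorithm only requires compressing at each bit rate point and measuring distortion. The sketch below illustrates that loop with a hypothetical stand-in codec (`toy_codec`, which just drops low-order bits of 8-bit samples), bits per sample as a proxy for bit rate, and MSE as the distortion measure; none of these choices comes from the application.

```python
from typing import Callable, List, Sequence, Tuple


def mse(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean square error between two equally long sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_codec(samples: Sequence[int], bits: int) -> List[int]:
    """Hypothetical stand-in for a real codec: keep the top `bits` bits of
    each 8-bit sample and reconstruct at the middle of the quantization bin."""
    shift = 8 - bits
    half_step = (1 << shift) >> 1
    return [((s >> shift) << shift) + half_step for s in samples]


def build_second_correspondence(
    codec: Callable[[Sequence[int], int], List[int]],
    samples: Sequence[int],
    rate_points: Sequence[int],
) -> List[Tuple[int, float]]:
    """Measure distortion at each rate point; no AI processing is involved."""
    return [(bits, mse(samples, codec(samples, bits))) for bits in rate_points]


samples = list(range(256))  # synthetic 8-bit "image" covering every value
second = build_second_correspondence(toy_codec, samples, [2, 3, 4, 5, 6])
for bits, distortion in second:
    print(bits, distortion)
```

Note that the loop never invokes a recognition model, which is the point of the staged evaluation: only the compression side is re-measured.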
According to the first aspect or either of the first and second possible implementations of the first aspect, in a fifth possible implementation, the distortion degree is obtained from one or more of the following metrics: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity index (SSIM), and perceptual loss.
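Two of the listed metrics, MSE and PSNR, are simple enough to sketch directly. The example uses the standard definitions on flat lists of samples and assumes a peak value of 255 (8-bit data); SSIM and perceptual loss are omitted because they need considerably more machinery.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], test: Sequence[float]) -> float:
    """Mean square error between a reference image and a test image."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference: Sequence[float], test: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


reference = [0, 64, 128, 255]
decoded = [2, 62, 130, 253]
print(mse(reference, decoded), psnr(reference, decoded))
```

MSE grows with distortion while PSNR shrinks, so any lookup table built on one of them has to respect the corresponding direction.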
According to the first aspect or either of the first and second possible implementations of the first aspect, in a sixth possible implementation, the accuracy is obtained from one or more of the following metrics: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
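As an illustration of one of these metrics, the sketch below computes a mean intersection over union for a segmentation result given as flat lists of class labels. It follows the common per-class definition, averaged over classes that appear in either list; the exact variant used in the application is not specified, and the labels are invented.

```python
from typing import Sequence


def mean_iou(predicted: Sequence[int], truth: Sequence[int],
             num_classes: int) -> float:
    """Average of per-class IoU over classes present in either labeling."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, truth)
                           if p == c and t == c)
        union = sum(1 for p, t in zip(predicted, truth) if p == c or t == c)
        if union:  # skip classes absent from both labelings
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in either labeling")
    return sum(ious) / len(ious)


predicted = [0, 0, 1, 1]
truth = [0, 1, 1, 1]
print(mean_iou(predicted, truth, num_classes=2))
```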
According to the second possible implementation of the first aspect, in a seventh possible implementation, the original image is a Bayer RAW image, the compressed image is a red-green-blue (RGB) image, and compressing the original image means compressing it in the RAW domain, the RGB domain, or the YUV domain.
According to the second possible implementation of the first aspect, in an eighth possible implementation, the original image is a Bayer RAW image, the compressed image is a YUV image, and compressing the original image means compressing it in the YUV domain.
According to the second possible implementation of the first aspect, in a ninth possible implementation, both the original image and the compressed image are Bayer RAW images, and compressing the original image means compressing it in the RAW domain.
第二方面,本申请的实施例提供了一种图像处理方式的确定装置,所述装置包括:第一确定模块,用于根据业务要求的精度阈值和第一对应关系,确定所述精度阈值对应的失真阈值;其中,所述第一对应关系为精度和失真度之间的对应关系;第二确定模块,用于根据所述失真阈值和第二对应关系,确定所述失真阈值对应的码率阈值;其中,所述第二对应关系为失真度和码率之间的对应关系。In a second aspect, an embodiment of the present application provides a device for determining an image processing method, the device includes: a first determining module, configured to determine the corresponding accuracy threshold value according to the accuracy threshold value required by the service and the first corresponding relationship The first corresponding relationship is the corresponding relationship between the accuracy and the degree of distortion; the second determining module is used to determine the corresponding bit rate of the distortion threshold according to the distortion threshold and the second corresponding relationship. Threshold; wherein, the second correspondence is the correspondence between the distortion degree and the code rate.
本申请实施例的装置通过引入失真度作为中间变量,采用第一对应关系评价压缩之后的失真度对精度的影响,采用第二对应关系评价采用不同的码率压缩之后的失真度,将压缩的过程和压缩之后的处理过程的评价分开处理,可以实现压缩和压缩之后的处理过程的解耦,提高评测的效率。By introducing the distortion degree as an intermediate variable, the apparatus in the embodiment of the present application uses the first correspondence relationship to evaluate the influence of the distortion degree after compression on the accuracy, and the second correspondence relationship to evaluate the distortion degree after compression with different code rates. The evaluation of the process and the post-compression process are processed separately, which can realize the decoupling of the compression and the post-compression process, and improve the evaluation efficiency.
示例性的,压缩之后的处理过程可以是AI处理,根据本申请实施例的装置实现了压缩算法与AI处理的解耦,如果要评测新的压缩算法,不需要进行端到端评测,可以只对压缩处理的过程进行评测得到新的压缩算法对应的第二对应关系即可。同样的,如果要采用新的AI模块对图像进行识别,也可以采用已有的数据对AI识别的过程重新进行评测得到新的第一对应关系即可,不需要进行端到端的评测。本申请实施例提供的装置可以提高评测的效率。Exemplarily, the processing process after compression may be AI processing. The device according to the embodiment of the present application realizes the decoupling of the compression algorithm and AI processing. If a new compression algorithm is to be evaluated, end-to-end evaluation is not required, and only The second correspondence relationship corresponding to the new compression algorithm can be obtained by evaluating the compression processing process. Similarly, if you want to use a new AI module to recognize images, you can also use existing data to re-evaluate the AI recognition process to obtain a new first correspondence, without end-to-end evaluation. The device provided by the embodiment of the present application can improve the efficiency of evaluation.
According to the second aspect, in a first possible implementation, the first determining module includes: a first determining unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, where the first accuracy is the accuracy of recognizing the original image; and a second determining unit, configured to determine, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
According to the second aspect or the first possible implementation of the second aspect, in a second possible implementation, the bit rate is the sampling frequency at which a compression algorithm compresses the original image to obtain the compressed image, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognizing the compressed image.
According to the second aspect or either of the first and second possible implementations of the second aspect, in a third possible implementation, the second correspondence is obtained by testing compression algorithms; the second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
According to the third possible implementation of the second aspect, in a fourth possible implementation, the second determining module includes: a third determining unit, configured to determine the compression algorithm used to compress the original image; a fourth determining unit, configured to determine the sub-correspondence corresponding to that compression algorithm; and a fifth determining unit, configured to determine, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
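The selection of a sub-correspondence per compression algorithm can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the algorithm names, the table values, and the function name `bitrate_threshold` are hypothetical, and the sketch assumes distortion is expressed as PSNR (higher is better), so the distortion threshold is a minimum PSNR.

```python
# Hypothetical second correspondence: one sub-correspondence per compression
# algorithm, each a list of (bit rate in kbps, PSNR in dB). Values are made up.
SECOND_CORRESPONDENCE = {
    "codec_a": [(500, 30.0), (1000, 34.0), (2000, 38.0)],
    "codec_b": [(500, 27.0), (1000, 31.0), (2000, 36.0)],
}


def bitrate_threshold(algorithm, distortion_threshold_db):
    """Return the lowest tabulated bit rate whose PSNR meets the threshold."""
    # Pick the sub-correspondence that belongs to the algorithm in use.
    sub = SECOND_CORRESPONDENCE[algorithm]
    # Keep only the bit rates whose distortion satisfies the threshold.
    candidates = [rate for rate, psnr in sub if psnr >= distortion_threshold_db]
    if not candidates:
        raise ValueError("no tabulated bit rate meets the distortion threshold")
    return min(candidates)


print(bitrate_threshold("codec_a", 33.0))
print(bitrate_threshold("codec_b", 33.0))
```

With the same distortion threshold, the two hypothetical algorithms yield different bit rate thresholds, which is why each algorithm needs its own sub-correspondence.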
The apparatus of the embodiments of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new algorithm can be evaluated at different bit rate points. Specifically, the image is compressed with the new algorithm at each bit rate point and the distortion of the compressed image is output, which yields the second correspondence for the new algorithm. There is no need to run AI processing on the compressed images to obtain the accuracy of the AI module, and no need to establish a new first correspondence; the previously established first correspondence can be reused. The processor can establish the second correspondence for the new compression algorithm. To determine, from the accuracy threshold required by the service, the bit rate threshold for compression with the new algorithm, the processor looks up the established first correspondence to determine the distortion threshold corresponding to the accuracy threshold, and then looks up the second correspondence of the new algorithm to determine the bit rate threshold corresponding to that distortion threshold. The result is the bit rate threshold that satisfies the service's accuracy threshold when compressing with the new algorithm. The apparatus provided in the present application decouples compression from AI processing and enables staged evaluation: the accuracy of the AI processing is independent of the compression algorithm, so for a new compression algorithm only distortion and bit rate need to be retested. No end-to-end test is required, which simplifies testing and improves evaluation efficiency.
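The two lookups described above can be sketched as a short chain. This is an illustrative sketch under stated assumptions, not the patent's implementation: the tables, their values, and the function names are hypothetical; distortion is assumed to be PSNR and accuracy mAP, both of which increase with quality.

```python
# Hypothetical first correspondence, shared by all compression algorithms:
# (PSNR in dB, mAP). Values are illustrative only.
FIRST = [(28.0, 0.30), (32.0, 0.36), (36.0, 0.39), (40.0, 0.40)]


def distortion_threshold(required_accuracy):
    """Step 1: lowest tabulated PSNR whose accuracy reaches the requirement."""
    ok = [psnr for psnr, acc in FIRST if acc >= required_accuracy]
    if not ok:
        raise ValueError("required accuracy is not reachable in the table")
    return min(ok)


def rate_threshold(second, required_accuracy):
    """Step 2: lowest tabulated bit rate whose PSNR reaches the threshold."""
    d_min = distortion_threshold(required_accuracy)
    ok = [rate for rate, psnr in second if psnr >= d_min]
    if not ok:
        raise ValueError("no tabulated bit rate meets the distortion threshold")
    return min(ok)


# Only this table has to be measured for a new algorithm; FIRST is reused.
new_codec = [(400, 29.0), (800, 33.0), (1600, 37.0)]
print(rate_threshold(new_codec, 0.38))
```

Adding another algorithm in this sketch only requires another (bit rate, distortion) table; the AI module is never rerun.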
According to the second aspect or either of the first and second possible implementations of the second aspect, in a fifth possible implementation, the distortion is obtained from one or more of the following metrics: peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity index (SSIM), and perceptual loss.
According to the second aspect or either of the first and second possible implementations of the second aspect, in a sixth possible implementation, the accuracy is obtained from one or more of the following metrics: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
In a third aspect, embodiments of the present application provide an electronic device, including a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, can perform the method for determining an image processing mode according to the first aspect or one or more of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in an electronic device, a processor in the electronic device performs the method for determining an image processing mode according to the first aspect or one or more of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a non-volatile computer-readable storage medium storing computer program instructions, where the computer program instructions, when executed by a processor, implement the method for determining an image processing mode according to the first aspect or one or more of the possible implementations of the first aspect.
In a sixth aspect, embodiments of the present application further provide a sensor system for providing a sensing function for a vehicle. The system includes at least one apparatus for determining an image processing mode as described in the foregoing embodiments of the present application, and at least one other sensor such as a camera or a radar. At least one sensor apparatus in the system may be integrated into a complete machine or device, or may be provided independently as a component or apparatus.
In a seventh aspect, embodiments of the present application further provide a system applied to unmanned driving or intelligent driving. The system includes at least one apparatus for determining an image processing mode as described in the foregoing embodiments of the present application, and at least one sensor such as a camera or a radar. At least one apparatus in the system may be integrated into a complete machine or device, or may be provided independently as a component or apparatus.
In an eighth aspect, embodiments of the present application further provide a vehicle, where the vehicle includes at least one apparatus for determining an image processing mode as described in the foregoing embodiments of the present application, or any one of the foregoing systems.
These and other aspects of the present application will be more concise and easier to understand in the following description of the embodiment(s).
Description of Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present application and, together with the description, serve to explain the principles of the present application.
FIG. 1a is a schematic diagram of data transmission in a compression-based sensing system in the related art.
FIG. 1b is a schematic diagram of a video compression architecture according to an example in the related art.
FIG. 2a is a schematic diagram of an evaluation framework according to an embodiment of the present application.
FIG. 2b is a schematic diagram of bit rate-accuracy curves according to an embodiment of the present application.
FIG. 3 is a schematic diagram of an application scenario of the method for determining an image processing mode according to an embodiment of the present application.
FIG. 4a is a schematic diagram of curves of the first correspondence according to an embodiment of the present application.
FIG. 4b is a schematic diagram of curves of the second correspondence according to an embodiment of the present application.
FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application.
FIG. 6a is a schematic diagram of a manner of determining the distortion threshold according to an embodiment of the present application.
FIG. 6b is a schematic diagram of a manner of determining the bit rate threshold according to an embodiment of the present application.
FIG. 7 is a schematic diagram of an evaluation framework according to some examples of the present application.
FIG. 8 is a schematic diagram of an evaluation framework according to some examples of the present application.
FIG. 9 is a schematic diagram of an evaluation framework according to some examples of the present application.
FIG. 10 is a block diagram of an apparatus for determining an image processing mode according to an embodiment of the present application.
Detailed Description
Various exemplary embodiments, features, and aspects of the present application are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the following detailed description to better illustrate the present application. Those skilled in the art should understand that the present application can also be practiced without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the subject matter of the present application.
FIG. 2a is a schematic diagram of an evaluation framework according to an embodiment of the present application. The evaluation framework shown in FIG. 2a is the machine-vision-oriented image quality evaluation method defined by the Moving Picture Experts Group (MPEG) Video Coding for Machines (VCM) working group, and it uses an end-to-end evaluation flow.
As shown in FIG. 2a, the camera outputs ISP-processed video to the VCM encoder; the ISP-processed video may be in RGB or YUV format. The VCM encoder encodes the video, and the encoded video is transmitted to the VCM decoder, which decodes it to obtain decoded video. Machine vision processing is then performed on the decoded video. Specifically, the decoded images may be output to a neural network, which performs the machine vision processing.
The evaluation framework shown in FIG. 2a is a tightly coupled system in which the camera, the compression algorithm (encoder plus decoder), and the neural network (NN) modules are coupled together. Evaluating the compression performance of a compression algorithm therefore requires an end-to-end evaluation, which is complex and inefficient. Specifically, as shown in FIG. 2a, to determine for a given compression algorithm the bit rate threshold at which compression still meets the machine-lossless requirement, an end-to-end test must be run with the framework of FIG. 2a to obtain the accuracy at multiple bit rate points of that algorithm. A bit rate-accuracy curve can be drawn from these points, and the bit rate threshold corresponding to the accuracy threshold required by the service can then be determined from the curve. This procedure is inefficient. The accuracy threshold required by the service may refer to the accuracy requirement of a particular application scenario, such as automatic driving or assisted driving. Different application scenarios may have different requirements on processing accuracy, so each application scenario has its own service-required accuracy threshold. In a possible implementation, the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and the machine-lossless accuracy.
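The end-to-end procedure described above, together with a threshold expressed as a difference from the machine-lossless accuracy, can be sketched as follows. This is only an illustration: the curve values and the function name are hypothetical, and the sketch assumes the threshold is the permitted drop below the machine-lossless accuracy, which is one possible reading of the text.

```python
def lowest_rate_meeting(curve, lossless_accuracy, accuracy_threshold):
    """Return the lowest measured bit rate whose accuracy is acceptable.

    curve: (bit rate, accuracy) points measured end to end for one algorithm.
    accuracy_threshold: assumed here to be the permitted accuracy drop
    relative to the machine-lossless (uncompressed) accuracy.
    """
    target = lossless_accuracy - accuracy_threshold
    rates = [rate for rate, acc in curve if acc >= target]
    return min(rates) if rates else None


# Hypothetical end-to-end measurements for one compression algorithm.
curve = [(500, 0.31), (1000, 0.37), (2000, 0.39), (4000, 0.40)]
print(lowest_rate_meeting(curve, lossless_accuracy=0.40, accuracy_threshold=0.02))
```

Every point of `curve` requires a full compress, decode, and recognize pass, and the whole curve must be measured again for each new algorithm, which is the inefficiency the text points out.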
FIG. 2b is a schematic diagram of bit rate-accuracy curves according to an embodiment of the present application. As shown in FIG. 2b, the abscissa represents the bit rate and the ordinate represents the accuracy; in the example of FIG. 2b, the accuracy metric may be mean average precision (mAP). The dashed line represents the accuracy of recognizing the captured original image (the uncompressed image). The other three curves each show, for one of three compression algorithms, the relationship between the bit rate and the accuracy of recognizing the compressed image obtained by compressing the original image. The three compression algorithms are the X265 default configuration (X265_medium), the X264 default configuration (X264_medium), and the X264 fast configuration (X264_ultrafast). As shown in FIG. 2b, as the bit rate increases, the accuracy of recognizing the compressed image approaches the machine-lossless level (dashed line). At higher bit rates the three curves converge, whereas at lower bit rates they are spread apart; that is, at lower bit rates the recognition accuracy depends strongly on the compression algorithm used. Consequently, to determine for each compression algorithm the bit rate threshold at which compression meets the machine-lossless requirement, the end-to-end test of FIG. 2a must be run to obtain the accuracy at multiple bit rate points, a bit rate-accuracy curve must be drawn, and the bit rate threshold must be read from that curve according to the accuracy required by the service, which is inefficient.
In the related art, the IEEE P2020 working group on image quality evaluation of imaging systems for automatic driving has defined probability-based machine vision evaluation metrics for automatic driving, including contrast detection probability (CDP), color separation probability (CSP), and geometric resolution probability (GRP), which enable module-level evaluation of a perception system. These probability metrics characterize the imaging quality of a machine-vision-oriented perception system in order to measure how image quality affects subsequent machine vision AI processing. However, these metrics consider only the capability of the imaging system; they are detached from the back-end AI processing task and do not reflect well the impact of image quality on AI processing.
To solve the foregoing technical problem, the present application provides a method for determining an image processing mode. FIG. 3 is a schematic diagram of an application scenario of the method according to an embodiment of the present application. As shown in FIG. 3, the application scenario may include a compression module, an AI module, and a processor. The compression module compresses the received image; during compression the image may be sampled (the sampling frequency being the bit rate). The compressed image may be transmitted to the AI module for processing such as object detection and image segmentation.
It should be noted that the embodiments of the present application can be tested directly on an existing test set, for example the Cityscapes dataset. The Cityscapes dataset includes training, validation, and test images, all annotated, so compression and recognition can be performed on them directly to obtain the test result data of the embodiments of the present application (the distortion and accuracy corresponding to each bit rate). Here the bit rate is the sampling frequency at which a compression algorithm compresses the original image to obtain the compressed image, and the distortion is the difference between the compressed image and the real environment.
In this case, a simulation device may include the compression module, the AI module, and the processor. The compression module and the AI module may be software programs stored in the memory of the simulation device, and the processor may invoke the corresponding module to process the images in the test set and obtain the test result data. From the test result data, the processor may establish the correspondence between distortion and accuracy (the first correspondence) and the correspondence between bit rate and distortion (the second correspondence), and store both.
The method of the embodiments of the present application can also be tested in an actual application scenario, for example on an automatic driving system. The automatic driving system may include a camera, and may further include, without limitation, other sensors such as a vehicle-mounted terminal, vehicle-mounted controller, vehicle-mounted module, vehicle-mounted component, vehicle-mounted chip, vehicle-mounted unit, vehicle-mounted radar, or vehicle-mounted camera.
The compression module may be an encoder located on the camera, and may further include a decoder located on the MDC. The AI module and the processor may both be located on the MDC; alternatively, the AI module is located on the MDC and the processor is a processor of an external device (a device outside the automatic driving system).
The external device may be a general-purpose device or a dedicated device. In a specific implementation, the external device may be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with a processing function. The embodiments of the present application do not limit the type of the external device. The external device may have a chip or processor with a processing function (such as the processor shown in FIG. 3) and may include multiple processors; each processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
Taking the case in which the processor shown in FIG. 3 is a processor of an external device as an example, the method for determining an image processing mode of the present application may be performed offline by the external device. During testing, the bit rate of the encoder is set; the camera captures an image, the encoder encodes it, and the encoded image is sent to the MDC. The image obtained after decoding by the decoder may be stored on the MDC, and the AI module recognizes the decoded image to obtain accuracy data. For different bit rate points, the encoder bit rate is set in turn and the foregoing process is repeated to obtain the test result data.
The test result data may be output to the external device. The external device obtains the distortion of the compressed image from the decoded image, establishes the correspondence between distortion and accuracy (the first correspondence) and the correspondence between bit rate and distortion (the second correspondence), and stores both.
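How the two correspondences might be assembled from the test result data can be sketched as follows. This is a minimal sketch under assumptions: each test run is taken to yield one (bit rate, distortion, accuracy) triple, the correspondences are stored as sorted lists of pairs, and the function name and sample values are hypothetical.

```python
def build_correspondences(results):
    """Split test results into the first and second correspondence.

    results: iterable of (bit_rate, distortion, accuracy) triples, one per
    bit rate point tested with a single compression algorithm.
    """
    results = list(results)
    # First correspondence: distortion -> accuracy, sorted by distortion.
    first = sorted((d, a) for _, d, a in results)
    # Second correspondence: bit rate -> distortion, sorted by bit rate.
    second = sorted((r, d) for r, d, _ in results)
    return first, second


# Hypothetical measurements: (bit rate in kbps, PSNR in dB, mAP).
runs = [(1000, 34.0, 0.37), (500, 30.0, 0.31), (2000, 38.0, 0.39)]
first, second = build_correspondences(runs)
print(first)
print(second)
```

Keeping the two tables separate is what later allows one of them to be remeasured without touching the other.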
If both the AI module and the processor shown in FIG. 3 are located on the MDC, the automatic driving system can perform the method for determining an image processing mode online. For example, during testing the bit rate of the encoder is set; the camera captures an image, the encoder encodes it, and the encoded image is sent to the MDC. The image obtained after decoding may be stored on the MDC. The MDC obtains the distortion of the compressed image from the decoded image, and the AI module recognizes the decoded image to obtain accuracy data. From the test result data, the MDC may establish the correspondence between distortion and accuracy (the first correspondence) and the correspondence between bit rate and distortion (the second correspondence), and store both. The MDC may further obtain, from the accuracy threshold required by the service and the first and second correspondences, the bit rate threshold corresponding to the accuracy threshold, and set the encoding bit rate of the encoder according to that bit rate threshold. In this way, the encoder can perform machine-lossless processing of the image.
It should be noted that placing both the AI module and the processor on the MDC is merely one example and does not limit the present application in any way. For example, the AI module and the processor may also be located on other components of the automatic driving system, which is not limited in the present application.
In the embodiments of the present application, the distortion metric may be peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity index (SSIM), or perceptual loss (P-loss). The distortion may also combine several of these metrics, so that the distortion of the compressed image is evaluated jointly. For example, a weighted combination of PSNR and SSIM may be used as the final distortion, which suits applications with requirements on both signal fidelity (PSNR) and human visual perception (SSIM).
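The MSE and PSNR metrics, and one way of weighting PSNR with SSIM, can be sketched as follows. MSE and PSNR follow their standard definitions; the weighting scheme in `combined_distortion` (equal weights, PSNR normalized by a 50 dB cap) is purely an assumption for illustration, since the text does not specify weights. Images are represented as flat lists of pixel values, and the SSIM value is taken as an input rather than computed.

```python
import math


def mse(original, compressed):
    """Mean square error between two equally sized flat pixel lists."""
    if len(original) != len(compressed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(original, compressed)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


def combined_distortion(psnr_db, ssim, w_psnr=0.5, w_ssim=0.5, psnr_cap=50.0):
    """Hypothetical weighted index: PSNR scaled to [0, 1] plus SSIM."""
    return w_psnr * min(psnr_db, psnr_cap) / psnr_cap + w_ssim * ssim


print(mse([0, 0], [0, 2]))
print(combined_distortion(25.0, 0.8))
```

Because PSNR is in decibels and SSIM is at most 1, some normalization is needed before weighting; the cap used here is just one possible choice.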
After the AI module processes the compressed image, the accuracy of the AI module is obtained; the accuracy may be the accuracy with which the AI module recognizes the compressed image. In this way, the distortion and accuracy data corresponding to different bit rate points are obtained for the compression algorithm used.
In the embodiments of the present application, the accuracy metric may be mean average precision (mAP), average precision (AP), average recall (AR), or mean intersection over union (MIoU). AP may be AP50, AP60, AP70, weighted AP, and so on. The accuracy may also combine several of these metrics, so that the accuracy of the AI processing is evaluated jointly. For example, a weighted combination of mAP and AR may be used as the accuracy metric. The present application does not limit the specific metric used for accuracy.
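As an illustration of one of the accuracy metrics listed above, mean intersection over union for a segmentation result can be sketched as follows. This is a minimal sketch: labels are given as flat per-pixel lists, classes absent from both prediction and ground truth are skipped (one common convention, assumed here), and the function name is hypothetical.

```python
def mean_iou(predicted, truth, num_classes):
    """Mean IoU over classes for flat per-pixel label lists."""
    if len(predicted) != len(truth):
        raise ValueError("label lists must have equal length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both lists.
        inter = sum(1 for p, t in zip(predicted, truth) if p == c and t == c)
        # Pixels labelled c in at least one list.
        union = sum(1 for p, t in zip(predicted, truth) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.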
In the embodiments of the present application, the first correspondence and the second correspondence may be stored as pairs of values in table entries, or may be expressed as functions; this is not limited in the present application.
For example, the first correspondence may take the form shown in Table 1, and the second correspondence may take the form shown in Table 2.
Table 1
Accuracy    Distortion
P1          D1
P2          D2
Pn          Dn
Table 2
Distortion  Bit rate
D1          R1
D2          R2
Dn          Rn
Exemplarily, the first correspondence may also be expressed as a function, as shown in formula (1):
[Formula (1): Figure PCTCN2021084373-appb-000001]
where f_i(D) denotes the functional relationship between the accuracy P and the distortion D over the numerical range D_i, that is, P = f_i(D) for D in D_i, and i is a positive integer from 1 to n. In other words, the relationship between accuracy and distortion can be expressed as a piecewise function.
In a possible implementation, P and D may be linearly related over each numerical range D_i, so that the relationship between accuracy and distortion can be expressed as a piecewise linear function.
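A piecewise linear correspondence of this kind can be evaluated by interpolating between tabulated breakpoints. The sketch below is illustrative only: the breakpoints are hypothetical, the function name is an assumption, and values outside the tabulated range are clamped to the end points, which is a choice made here rather than something the text specifies.

```python
from bisect import bisect_right


def piecewise_linear(points, x):
    """Evaluate the piecewise linear function through sorted (x, y) points."""
    xs = [p[0] for p in points]
    # Clamp outside the tabulated range (an assumption of this sketch).
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    # x lies in the segment between points[i - 1] and points[i].
    i = bisect_right(xs, x)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical first correspondence: (PSNR in dB, mAP) breakpoints.
first = [(28.0, 0.30), (32.0, 0.36), (36.0, 0.40)]
print(piecewise_linear(first, 30.0))
```

Each consecutive pair of breakpoints plays the role of one range D_i with its own linear f_i.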
图4a示出根据本申请一实施例的第一对应关系的曲线的示意图。如图4a所示,横坐标可以表示失真度,纵坐标可以表示精度,在图4a所示的示例中采用的失真度的指标为PSNR,精度的指标为mAP。图4a所示的示例中,三种不同压缩算法对应的第一对应关系的曲线几乎是重合的,也就是说,第一对应关系与具体采用的压缩算法无关,不依赖于具体的压缩算法。Fig. 4a shows a schematic diagram of a curve of the first correspondence according to an embodiment of the present application. As shown in FIG. 4a, the abscissa may represent the degree of distortion, and the ordinate may represent the accuracy. In the example shown in FIG. 4a, the index of the degree of distortion is PSNR, and the index of accuracy is mAP. In the example shown in FIG. 4a, the curves of the first correspondences corresponding to the three different compression algorithms almost overlap, that is to say, the first correspondences are independent of the specific compression algorithm used and do not depend on the specific compression algorithm.
换言之,机器视觉的性能主要取决于输入的图像的失真度,与压缩算法无关,不依赖于具体的压缩算法。另外,机器视觉的性能和具体采用的神经网络是有关系的。In other words, the performance of machine vision mainly depends on the distortion degree of the input image, which has nothing to do with the compression algorithm and does not depend on the specific compression algorithm. In addition, the performance of machine vision is related to the specific neural network used.
Similarly, the second correspondence can also be expressed as a function, as shown in formula (2):
$$D = g_i(R), \quad R \in R_i \tag{2}$$
where R denotes the bit rate and g_i(R) denotes the functional relationship between distortion and bit rate over the value range R_i. In other words, the relationship between distortion and bit rate can be expressed as a piecewise function.
In a possible implementation, D and R may be linearly related over each value range R_i, so that the relationship between distortion and bit rate can be expressed as a piecewise linear function.
FIG. 4b is a schematic diagram of curves of the second correspondence according to an embodiment of the present application. As shown in FIG. 4b, the abscissa represents bit rate and the ordinate represents distortion; in this example the distortion metric is PSNR. The curves of the second correspondence for the three compression algorithms are relatively far apart; that is, even at the same bit rate, different compression algorithms produce compressed images whose distortion differs considerably, so the second correspondence depends on the specific compression algorithm used.
The above implementation decouples the compression algorithm from the AI processing. To evaluate a new compression algorithm, no end-to-end evaluation is required; only the compression process needs to be evaluated to obtain the second correspondence for the new algorithm. Similarly, if a new AI module is to be used to recognize images, the AI recognition process can be re-evaluated on existing data to obtain a new first correspondence, again without end-to-end evaluation. The method provided in the embodiments of the present application can therefore improve evaluation efficiency.
After the first and second correspondences are obtained, an accuracy threshold can be determined from the accuracy target of each service. The distortion threshold corresponding to the accuracy threshold is determined from the accuracy threshold and the first correspondence, and the bit rate threshold corresponding to the distortion threshold is determined from the distortion threshold and the second correspondence. In this way, the bit rate used for compression can be determined according to different service requirements.
FIG. 5 shows a method for determining an image processing mode according to an embodiment of the present application. In the embodiments of the present application, the image processing mode may refer to the bit rate used to compress an image, and determining the image processing mode may refer to the process of determining that bit rate. As shown in FIG. 5, the method may include the following steps:
Step S500: determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, where the first correspondence is a correspondence between accuracy and distortion.
Step S501: determine, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, where the second correspondence is a correspondence between distortion and bit rate.
Here, the accuracy threshold required by the service may refer to the difference between the accuracy required by the service and a first accuracy. The first accuracy may be the accuracy of recognizing the original image, that is, the uncompressed image; in FIG. 4a, the first accuracy is the accuracy value marked by the dashed line.
By introducing distortion as an intermediate variable, the method of the embodiments of the present application uses the first correspondence to evaluate how post-compression distortion affects accuracy, and uses the second correspondence to evaluate the distortion produced by compression at different bit rates. Evaluating the compression process separately from the processing that follows compression decouples the two and improves evaluation efficiency.
In a possible implementation, step S500 may include: determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, where the first accuracy is the accuracy of recognizing the original image; and determining, according to the second accuracy and the first correspondence, the distortion threshold corresponding to the second accuracy.
The bit rate points tested during testing may be discrete. If the first correspondence is stored as a function, the calculation shown in formula (1) can be stored, which in effect stores the curve of the correspondence between accuracy and distortion. FIG. 6a is a schematic diagram of a manner of determining the distortion threshold according to an embodiment of the present application. As shown in FIG. 6a, Pth denotes the accuracy threshold required by the service, that is, the difference between the first accuracy and the second accuracy required by the service; Pmax denotes the first accuracy, that is, the accuracy of recognizing the uncompressed image; P denotes the second accuracy, that is, the accuracy required by the service; and PSNR_th denotes the distortion threshold, that is, the distortion corresponding to the accuracy required by the service. The second accuracy is the first accuracy minus the accuracy threshold: P = Pmax - Pth. Once P is determined, the corresponding PSNR_th can be obtained from the curve shown in FIG. 6a: the range in which P falls determines which function f_i(D) in formula (1) is used to calculate the distortion threshold D (PSNR_th) corresponding to P.
The bit rate points tested during testing may be discrete. If the first correspondence is stored as value pairs, the processor may use linear interpolation for points that are not stored in the first correspondence. For example, if the accuracy threshold has been determined from the accuracy target required by the service but the first correspondence stores no accuracy data for it, the processor can obtain the accuracy data adjacent to it in the first correspondence and linearly interpolate between those accuracy values and their corresponding distortion values to obtain the distortion threshold. Taking Table 1 as an example, suppose the determined second accuracy P is greater than P1 but less than P2; the distortion threshold corresponding to P can then be calculated by the following linear interpolation formula (3):
$$D = D_1 + \frac{P - P_1}{P_2 - P_1}\,(D_2 - D_1) \tag{3}$$
where D_1 and D_2 denote the distortion values stored for P_1 and P_2, respectively.
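The value-pair variant of step S500 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the pair layout (accuracy, distortion), and the sample numbers are hypothetical, and the stored pairs are assumed to be sorted by increasing accuracy.

```python
def distortion_threshold(p_max, p_th, pairs):
    """Return the distortion threshold for a service accuracy threshold.

    p_max: first accuracy (accuracy on uncompressed images).
    p_th: accuracy threshold, i.e. the tolerated drop from p_max.
    pairs: stored (accuracy, distortion) value pairs, sorted by accuracy
           (hypothetical layout, not the application's Table 1).
    """
    p = p_max - p_th  # second accuracy: P = Pmax - Pth
    for (p1, d1), (p2, d2) in zip(pairs, pairs[1:]):
        if p1 <= p <= p2:
            if p2 == p1:
                return d1
            # Linear interpolation between the two adjacent stored points.
            return d1 + (d2 - d1) * (p - p1) / (p2 - p1)
    raise ValueError("required accuracy lies outside the stored range")
```

With hypothetical pairs (0.60, 30 dB), (0.70, 35 dB), (0.72, 40 dB), a first accuracy of 0.73 and a threshold of 0.08 give a required accuracy of 0.65, which falls halfway between the first two points and yields a threshold of about 32.5 dB.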
For step S501, the bit rate threshold corresponding to the distortion threshold can likewise be determined according to how the correspondence is stored. If the second correspondence is stored as a function, the calculation shown in formula (2) can be stored, which in effect stores the curve of the correspondence between distortion and bit rate. FIG. 6b is a schematic diagram of a manner of determining the bit rate threshold according to an embodiment of the present application. As shown in FIG. 6b, PSNR_th denotes the distortion threshold determined in step S500, that is, the distortion corresponding to the accuracy required by the service. For any compression algorithm, the correspondence between distortion and bit rate can be expressed in the form of formula (2); once the distortion threshold is determined, the range in which it falls determines which function g_i(R) in formula (2) is used to calculate the corresponding bit rate threshold R. The example in FIG. 6b includes three curves of the second correspondence for three different compression algorithms, each with its own function expression. From the determined distortion threshold and the function expression of each curve, the bit rate thresholds of the three compression algorithms can be determined: R_X265_medium, R_X264_medium, and R_X264_ultrafast.
If the second correspondence is stored as value pairs, the processor may use linear interpolation for points that are not stored in the second correspondence. The procedure is the same as that for determining the distortion threshold by linear interpolation and is not repeated here.
In a possible implementation, the second correspondence is obtained by testing compression algorithms. The second correspondence includes a plurality of different sub-correspondences, each of which corresponds to one compression algorithm, whereas the first correspondence is the same for different compression algorithms.
As described above, the first correspondence does not depend on the specific compression algorithm used, so different compression algorithms can share the same first correspondence. That is, as long as the way the processor (for example, the processor of an MDC) performs machine vision processing on images does not change, the same first correspondence can be used to evaluate image quality for different compression algorithms. Specifically, test data can be obtained by recognizing compressed images produced by different compression algorithms, and the test data can be fitted to obtain the curve of the first correspondence; alternatively, the first correspondence between the distortion of the compressed images and the recognition accuracy can be stored directly. This application does not limit this.
Because compression standards differ, different compression algorithms may produce compressed images with different distortion even at the same bit rate, so the second correspondence may differ between compression algorithms. As shown in FIG. 4b and FIG. 6b, the bit rate-distortion curves of the three compression algorithms X265_medium, X264_medium, and X264_ultrafast are different.
Therefore, in a possible implementation, step S501 of determining, according to the distortion threshold and the second correspondence, the bit rate threshold corresponding to the distortion threshold may include: determining the compression algorithm used to compress the original image; determining the sub-correspondence corresponding to that compression algorithm; and determining, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
As shown in FIG. 3, the sub-correspondences (the second correspondence) of the compression algorithms are stored in the processor in advance. In addition to the accuracy threshold, the processor can also receive the compression algorithm as input. The processor determines the distortion threshold corresponding to the accuracy threshold from the accuracy threshold and the first correspondence, determines from the input the compression algorithm used to compress the original image, and determines the bit rate threshold corresponding to the distortion threshold from the pre-stored sub-correspondence of that algorithm and the distortion threshold.
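The selection of a sub-correspondence by compression algorithm can be illustrated with a short Python sketch. The algorithm names come from the text; the table layout, the function names, and all numeric (bit rate, PSNR) points are hypothetical placeholders, not measured data, and PSNR is assumed to increase with bit rate.

```python
# Hypothetical pre-stored second correspondence: one sub-correspondence per
# compression algorithm, as (bit rate, PSNR) pairs sorted by bit rate.
SECOND_CORRESPONDENCE = {
    "X265_medium": [(1.0, 30.0), (2.0, 35.0), (4.0, 40.0)],
    "X264_medium": [(1.0, 28.0), (2.0, 33.0), (4.0, 38.0)],
    "X264_ultrafast": [(1.0, 25.0), (2.0, 30.0), (4.0, 35.0)],
}


def bitrate_threshold(psnr_th, rd_points):
    """Interpolate the bit rate at which the PSNR reaches psnr_th."""
    for (r1, d1), (r2, d2) in zip(rd_points, rd_points[1:]):
        if d1 <= psnr_th <= d2:
            if d2 == d1:
                return r1
            return r1 + (r2 - r1) * (psnr_th - d1) / (d2 - d1)
    raise ValueError("distortion threshold lies outside the stored range")


def bitrate_for(algorithm, psnr_th):
    """Pick the sub-correspondence of the given algorithm, then look up."""
    return bitrate_threshold(psnr_th, SECOND_CORRESPONDENCE[algorithm])
```

Supporting another compression algorithm in this sketch only requires one more dictionary entry; the lookup that produces the distortion threshold is untouched, which mirrors the decoupling described in the text.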
The method of the embodiments of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, it can be evaluated at different bit rate points. Specifically, the new compression algorithm is used to compress images at different bit rate points, and the distortion of the compressed images is output, which yields the second correspondence for the new algorithm. There is no need to run AI processing on the compressed images to obtain the accuracy of the AI module, and no need to establish a new first correspondence; the previously established first correspondence can be used.
The processor can establish the second correspondence for the new compression algorithm. To determine the bit rate threshold for the new compression algorithm from the accuracy threshold required by the service, the processor looks up the established first correspondence with the accuracy threshold to determine the corresponding distortion threshold, and then looks up the second correspondence of the new compression algorithm with the distortion threshold to determine the corresponding bit rate threshold. This is the bit rate threshold at which compression with the new algorithm satisfies the accuracy threshold required by the service.
With the earlier evaluation method, by contrast, the new compression algorithm must be tested end to end. As shown in FIG. 2a, end-to-end tests are run for the new compression algorithm at different bit rate points to establish a correspondence between bit rate and accuracy; the bit rate threshold for the new compression algorithm is then determined from the accuracy threshold required by the service and that correspondence.
Comparing the two processes shows that the method provided in this application decouples compression from AI processing and enables staged evaluation. Because the accuracy of the AI processing is independent of the compression algorithm, a new compression algorithm only requires the distortion and bit rate to be retested; no end-to-end test is needed, which simplifies testing and makes evaluation more efficient.
In the embodiments of the present application, if a new neural network model is to be used to recognize images, the new model can be run on the already labeled data set to obtain its first correspondence, without repeating the compression process. Because the distortion of the compressed images was already labeled during the earlier AI processing, running the new neural network model on the labeled data set directly yields the new first correspondence between distortion and accuracy.
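A rough sketch of re-deriving the first correspondence for a new model from an existing labeled data set is shown below. It is a simplification under stated assumptions: accuracy is computed as the plain fraction of correct predictions rather than mAP, and the sample layout (distortion, image, label) and the function name are hypothetical.

```python
from collections import defaultdict


def first_correspondence(samples, model):
    """Build (distortion, accuracy) pairs for a model without recompressing.

    samples: iterable of (distortion, image, label) records; the distortion
             was recorded when the image was compressed earlier.
    model: callable mapping an image to a predicted label.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for distortion, image, label in samples:
        totals[distortion] += 1
        hits[distortion] += int(model(image) == label)
    # One accuracy value per recorded distortion level, sorted by distortion.
    return sorted((d, hits[d] / totals[d]) for d in totals)
```

Only the model call changes when a different network is evaluated; the stored distortion labels are reused as they are.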
With the existing evaluation method, by contrast, the end-to-end test process must be repeated according to the framework shown in FIG. 2a: the new neural network model processes the compressed images, and the first correspondence of the new model is obtained from the processing results and the labeled data set.
Comparing the two processes shows that the method provided in this application decouples compression from AI processing and enables staged evaluation. The accuracy of the AI processing depends on the model, whereas the evaluation of the compression process does not. A new neural network model can therefore be run on the already labeled data set without repeating the compression process; compared with the existing approach, which requires the end-to-end test to be repeated, this simplifies evaluation and makes it more efficient.
The method for determining an image processing mode of the present application is described below with reference to specific application scenarios and examples.
FIG. 7 is a schematic diagram of evaluation frameworks according to some examples of the present application. As shown in FIG. 7, the evaluation process can be divided into a first stage and a second stage. The first stage tests the compression algorithm and yields the second correspondence between bit rate and distortion; the second stage tests the recognition accuracy of the neural network and yields the first correspondence between distortion and accuracy. In the example of FIG. 7, distortion may be defined in the RGB domain, and the distortion metric may be the PSNR described above, MSE, SSIM, P-loss, or a weighted combination of several of these metrics.
For a compressed transmission system, distortion is mainly the quantization noise introduced by compression coding. For the accuracy-distortion curve, the distortion depends mainly on the energy of the compression quantization noise and only slightly on the specific form of the noise. In this case PSNR and MSE are suitable metrics for compression distortion: PSNR is a logarithmic function of MSE, MSE characterizes the energy of the compression noise, and both are simple to compute and easy to use.
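The logarithmic relationship between PSNR and MSE follows the standard definition PSNR = 10·log10(MAX² / MSE), where MAX is the peak sample value. A minimal Python sketch, assuming 8-bit samples passed as flat sequences (the function names and the flat-list input format are illustrative choices, not from the application):

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equal-length sample sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)
```

For example, an error of one grey level on every 8-bit sample gives an MSE of 1 and a PSNR of 20·log10(255), roughly 48.13 dB.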
Example (a) shows the reference evaluation process. The RAW image is processed by the ISP, which outputs an RGB image to the deep neural network; the distortion data of the RGB image can be output at the same time. The deep neural network performs machine vision processing on the uncompressed RGB image to obtain recognition accuracy data.
Example (b) shows a scenario in which the RAW image is compressed in the RAW domain. An encoder/decoder compresses the RAW image to obtain a compressed image, the ISP processes the compressed image to obtain an RGB image, and the distortion data of the RGB image is output. The deep neural network performs machine vision processing on the RGB image to obtain recognition accuracy data. Compared with compression in the RGB or YUV domain, compressing the RAW image in the RAW domain can reduce the complexity of the compression algorithm because the RAW domain contains less data.
Example (c) shows a scenario in which the RGB image is compressed in the RGB domain. The RAW image is processed by the ISP to obtain an RGB image, an encoder/decoder compresses the RGB image to obtain a compressed image, and the distortion data of the compressed RGB image is output. The deep neural network performs machine vision processing on the compressed RGB image to obtain recognition accuracy data.
Example (d) shows a scenario in which the YUV image is compressed in the YUV domain. The RAW image is processed by the ISP to obtain an RGB image, and RGB-to-YUV format conversion of the RGB image yields a YUV image. An encoder/decoder compresses the YUV image to obtain a compressed image, and YUV-to-RGB format conversion of the compressed image yields a compressed RGB image, whose distortion data is output. The deep neural network performs machine vision processing on the compressed RGB image to obtain recognition accuracy data.
Example (a) yields the accuracy of recognizing uncompressed images, that is, the first accuracy Pmax shown in FIG. 6a. Multiple compression algorithms can be tested on the frameworks of example (b), example (c), and example (d), yielding the second correspondence of each compression algorithm on each framework and the first correspondence of each example.
For example, the three compression algorithms X265_medium, X264_medium, and X264_ultrafast can each be tested on the frameworks of example (b), example (c), and example (d) according to the above process.
Taking example (b): the compression algorithm X265_medium compresses the RAW image at different bit rates to obtain compressed images, the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output. This establishes the second correspondence between bit rate and distortion for X265_medium, as shown in FIG. 6b. The deep neural network then performs machine vision processing on the RGB images to obtain recognition accuracy data, which establishes the first correspondence between distortion and accuracy. Next, the compression algorithm X264_medium compresses the RAW image at different bit rates, the ISP processes the compressed images to obtain RGB images, and the distortion data of the RGB images is output, which establishes the second correspondence for X264_medium, as shown in FIG. 6b. For X264_medium, the subsequent machine vision processing does not need to be tested: when the machine-lossless bit rate is determined, the first correspondence obtained from the X265_medium test can be used. For X264_ultrafast, the same process as for X264_medium is repeated to obtain its second correspondence, as shown in FIG. 6b. The method of the embodiments of the present application is thus simple and efficient. It decouples compression from AI processing and enables staged evaluation; because the accuracy of the AI processing is independent of the compression algorithm, a new compression algorithm only requires the distortion and bit rate to be retested, without end-to-end testing, which simplifies testing and makes evaluation more efficient.
Comparing example (b) with example (a) makes it possible to evaluate the impact of the compression algorithm on machine vision processing: the closer the recognition accuracy obtained with compression in example (b) is to the accuracy of example (a), the closer the compression is to machine-lossless. The machine-lossless bit rate threshold can be determined from the first and second correspondences established from the test data and from the accuracy threshold required by the service. For the specific procedure, see the descriptions of FIG. 5, FIG. 6a, and FIG. 6b, which are not repeated here.
FIG. 8 is a schematic diagram of evaluation frameworks according to some examples of the present application. As shown in FIG. 8, the evaluation process can still be divided into a first stage and a second stage, but the stages are divided differently from the example of FIG. 7. In the example of FIG. 8, distortion may be defined in the YUV domain, and PSNR, MSE, SSIM, P-loss, or a weighted combination of several of these metrics can again be used as the distortion metric. The first stage therefore ends where the compression algorithm compresses the YUV image to obtain a compressed YUV image, and the distortion data of the compressed YUV image can be output. In FIG. 8, example (e) shows the reference evaluation process and example (f) shows a scenario in which the image is compressed in the YUV domain. The other processes are similar to the example of FIG. 7 and are not repeated here.
FIG. 9 is a schematic diagram of evaluation frameworks according to some examples of the present application. As shown in FIG. 9, the evaluation process can still be divided into a first stage and a second stage, but the stages are divided differently from the examples of FIG. 7 and FIG. 8. In the example of FIG. 9, distortion may be defined in the RAW domain, and PSNR, MSE, SSIM, P-loss, or a weighted combination of several of these metrics can again be used as the distortion metric. The first stage therefore ends where the compression algorithm compresses the RAW image to obtain a compressed RAW image, and the distortion data of the compressed RAW image can be output. In FIG. 9, example (g) shows the reference evaluation process and example (h) shows a scenario in which the image is compressed in the RAW domain. The other processes are similar to the example of FIG. 7 and are not repeated here.
As the embodiments of the present application show, the method for determining an image processing mode can be applied to a variety of scenarios, is highly general, and is easy to extend to the evaluation of new compression algorithms or AI modules, which can improve evaluation efficiency.
The present application further provides an apparatus for determining an image processing mode. FIG. 10 shows a block diagram of an apparatus for determining an image processing mode according to an embodiment of the present application. As shown in FIG. 10, the apparatus may include: a first determining module 100, configured to determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, where the first correspondence is a correspondence between accuracy and distortion; and a second determining module 101, configured to determine, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, where the second correspondence is a correspondence between distortion and bit rate.
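The two determining modules can be illustrated with a minimal sketch. It assumes that each correspondence is stored as a small table of sample points, that distortion is expressed as PSNR in dB (so a higher value means less distortion), and that accuracy does not decrease as PSNR rises. The table contents, the function names `distortion_threshold` and `bitrate_threshold`, and the units are all hypothetical and are not taken from the application.

```python
from typing import List, Tuple

# Hypothetical sample points; every number here is illustrative, not measured.
# First correspondence: (PSNR in dB, recognition accuracy).
FIRST_CORRESPONDENCE: List[Tuple[float, float]] = [
    (28.0, 0.60), (32.0, 0.70), (36.0, 0.76), (40.0, 0.79),
]
# Second correspondence for one codec: (bit rate in Mbit/s, PSNR in dB).
SECOND_CORRESPONDENCE: List[Tuple[float, float]] = [
    (1.0, 27.0), (2.0, 31.0), (4.0, 35.0), (8.0, 38.0), (16.0, 41.0),
]


def distortion_threshold(first: List[Tuple[float, float]],
                         accuracy_threshold: float) -> float:
    """Role of the first determining module: lowest PSNR that still
    reaches the required accuracy."""
    candidates = [psnr for psnr, acc in first if acc >= accuracy_threshold]
    if not candidates:
        raise ValueError("no sample point reaches the accuracy threshold")
    return min(candidates)


def bitrate_threshold(second: List[Tuple[float, float]],
                      psnr_threshold: float) -> float:
    """Role of the second determining module: lowest bit rate whose PSNR
    reaches the distortion threshold."""
    candidates = [rate for rate, psnr in second if psnr >= psnr_threshold]
    if not candidates:
        raise ValueError("no sample point reaches the distortion threshold")
    return min(candidates)


d_th = distortion_threshold(FIRST_CORRESPONDENCE, 0.75)
r_th = bitrate_threshold(SECOND_CORRESPONDENCE, d_th)
```

With these illustrative tables, an accuracy threshold of 0.75 is first mapped to a PSNR threshold and then to a bit rate threshold. A real implementation might interpolate between sample points instead of picking the nearest satisfying one.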
The apparatus in this embodiment of the present application introduces distortion as an intermediate variable. It uses the first correspondence to evaluate how post-compression distortion affects accuracy, and uses the second correspondence to evaluate the distortion that results from compressing at different bit rates. Evaluating the compression process separately from the processing that follows compression decouples the two and improves evaluation efficiency.
The apparatus according to this embodiment of the present application decouples the compression algorithm from the AI processing. To evaluate a new compression algorithm, no end-to-end evaluation is needed: it is enough to evaluate only the compression process and obtain the second correspondence for the new compression algorithm. Likewise, if a new AI module is to be used for image recognition, the AI recognition process can be re-evaluated on existing data to obtain a new first correspondence, again without end-to-end evaluation. The apparatus provided in this embodiment of the present application can therefore improve evaluation efficiency.
In a possible implementation, the first determining module 100 includes: a first determining unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, where the first accuracy is the accuracy of recognition performed on the original image; and a second determining unit, configured to determine, according to the second accuracy and the first correspondence, a distortion threshold corresponding to the second accuracy.
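This passage does not state how the second accuracy is derived from the accuracy threshold and the first accuracy. The sketch below shows two hypothetical readings only as an illustration: the threshold as a maximum absolute accuracy drop, or as a minimum fraction of the first accuracy to be retained. The function name `second_accuracy` and the `mode` parameter are assumptions and do not come from the application.

```python
def second_accuracy(first_accuracy: float,
                    accuracy_threshold: float,
                    mode: str = "absolute_drop") -> float:
    """Turn a relative requirement into an absolute target accuracy.

    Both modes are assumptions made for illustration:
    - "absolute_drop": the threshold is the largest tolerated accuracy loss.
    - "relative_ratio": the threshold is the fraction of the first
      accuracy that must be retained.
    """
    if mode == "absolute_drop":
        return first_accuracy - accuracy_threshold
    if mode == "relative_ratio":
        return first_accuracy * accuracy_threshold
    raise ValueError(f"unknown mode: {mode}")


# Accuracy on the original image is 0.80; at most 0.02 may be lost.
target = second_accuracy(0.80, 0.02)
```

The resulting target accuracy is what the second determining unit would then look up in the first correspondence.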
In a possible implementation, the bit rate is the sampling frequency at which a compression algorithm compresses the original image to obtain the compressed image, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognition performed on the compressed image.
In a possible implementation, the second correspondence is obtained by testing compression algorithms. The second correspondence includes a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
In a possible implementation, the second determining module 101 includes: a third determining unit, configured to determine the compression algorithm used to compress the original image; a fourth determining unit, configured to determine the sub-correspondence corresponding to the compression algorithm; and a fifth determining unit, configured to determine, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
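The three units can be sketched as a lookup keyed by compression algorithm. This is a minimal illustration that assumes distortion is expressed as PSNR in dB and that each sub-correspondence is a table of (bit rate, PSNR) sample points. The codec names `codec_a` and `codec_b`, the numbers, and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

# One sub-correspondence per compression algorithm (illustrative values):
# each entry is (bit rate in Mbit/s, PSNR in dB).
SUB_CORRESPONDENCES: Dict[str, List[Tuple[float, float]]] = {
    "codec_a": [(1.0, 27.0), (2.0, 31.0), (4.0, 35.0), (8.0, 38.0)],
    "codec_b": [(1.0, 30.0), (2.0, 34.0), (4.0, 37.0), (8.0, 40.0)],
}


def bitrate_threshold_for(algorithm: str, psnr_threshold: float) -> float:
    """Select the sub-correspondence of the chosen algorithm, then return
    the lowest bit rate whose PSNR reaches the distortion threshold."""
    if algorithm not in SUB_CORRESPONDENCES:
        raise ValueError(f"no sub-correspondence for {algorithm!r}")
    table = SUB_CORRESPONDENCES[algorithm]
    rates = [rate for rate, psnr in table if psnr >= psnr_threshold]
    if not rates:
        raise ValueError("distortion threshold not reachable with this codec")
    return min(rates)
```

With these tables, the same distortion threshold of 36 dB maps to different bit rate thresholds for the two hypothetical codecs, while the first correspondence that produced the 36 dB value is shared.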
The apparatus in this embodiment of the present application is simple, efficient, and easy to extend. For example, if a new compression algorithm is to be used for image compression, the new algorithm can be evaluated at different bit rate points. Specifically, the new compression algorithm compresses the image at each bit rate point and the distortion of each compressed image is output, which yields the second correspondence for the new compression algorithm. The compressed images do not need to go through AI processing again to obtain the accuracy of the AI module, and no new first correspondence needs to be established; the previously established first correspondence can be reused. The processor can establish the second correspondence for the new compression algorithm. To determine the bit rate threshold for compression with the new algorithm from the accuracy threshold required by the service, the processor looks up the accuracy threshold in the established first correspondence to determine the corresponding distortion threshold, and then looks up that distortion threshold in the second correspondence of the new compression algorithm to determine the corresponding bit rate threshold. This is the bit rate threshold that satisfies the accuracy threshold required by the service when the new compression algorithm is used. The apparatus provided in the present application decouples compression from AI processing and enables staged evaluation. Because the accuracy of the AI processing is independent of the compression algorithm, a new compression algorithm only requires the distortion and bit rate to be retested. No end-to-end test is needed, which simplifies testing and makes evaluation more efficient.
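Building the second correspondence for a new compression algorithm, without rerunning any AI processing, can be sketched as follows. The example is self-contained and makes several assumptions: images are flat lists of 8-bit samples, distortion is measured as MSE, and `toy_compress` is a hypothetical stand-in that simply drops low-order bits. It is not a real codec, and "bits per sample" is used here only as a simple proxy for bit rate.

```python
from typing import Callable, List, Sequence, Tuple


def toy_compress(pixels: Sequence[int], bits_per_sample: int) -> List[int]:
    """Hypothetical stand-in codec: keep the top bits of each 8-bit sample."""
    shift = 8 - bits_per_sample
    return [(p >> shift) << shift for p in pixels]


def mse(ref: Sequence[int], test: Sequence[int]) -> float:
    """Mean squared error between two equally sized sample lists."""
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def build_sub_correspondence(
    compress: Callable[[Sequence[int], int], List[int]],
    images: Sequence[Sequence[int]],
    rate_points: Sequence[int],
) -> List[Tuple[int, float]]:
    """Compress every image at every rate point and record the average
    distortion; no recognition step is involved."""
    table = []
    for rate in rate_points:
        errors = [mse(img, compress(img, rate)) for img in images]
        table.append((rate, sum(errors) / len(errors)))
    return table


def lowest_rate_within(table: Sequence[Tuple[int, float]],
                       max_mse: float) -> int:
    """Lowest rate point whose measured MSE stays within the threshold."""
    rates = [rate for rate, err in table if err <= max_mse]
    if not rates:
        raise ValueError("no rate point meets the distortion threshold")
    return min(rates)


images = [list(range(256))]          # one synthetic test image
table = build_sub_correspondence(toy_compress, images, [2, 4, 6, 8])
```

Only `build_sub_correspondence` has to be rerun when the codec changes; the distortion threshold passed to `lowest_rate_within` still comes from the unchanged first correspondence.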
In a possible implementation, the distortion is obtained from one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean squared error (MSE), structural similarity index (SSIM), and perceptual loss.
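Two of the listed indicators, MSE and PSNR, are simple enough to sketch directly. The example assumes 8-bit samples stored in flat lists (peak value 255). The `weighted_distortion` helper is a hypothetical way of combining several indicators; the application does not specify how weights are chosen, and indicators with opposite directions (higher-is-better PSNR, lower-is-better MSE) would need to be normalized before being mixed.

```python
import math
from typing import Dict, Sequence


def mse(ref: Sequence[float], test: Sequence[float]) -> float:
    """Mean squared error between a reference and a compressed image."""
    if len(ref) != len(test) or not ref:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref: Sequence[float], test: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def weighted_distortion(scores: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Hypothetical weighted combination of already normalized indicators."""
    return sum(weights[name] * scores[name] for name in weights)


ref = [0, 0, 0, 0]
compressed = [0, 0, 0, 2]
error = mse(ref, compressed)
quality_db = psnr(ref, compressed)
```

SSIM and perceptual loss are omitted because they depend on windowed statistics or a trained network and would normally come from an image-processing library.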
In a possible implementation, the accuracy is obtained from one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
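As an illustration of one of these indicators, the sketch below computes mean intersection over union for a segmentation result. It assumes that the prediction and the ground truth are flat lists of class labels, one label per pixel, and that classes absent from both lists are skipped. The function name `mean_iou` is hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], truth: Sequence[int],
             num_classes: int) -> float:
    """Average, over classes, of |pred AND truth| / |pred OR truth|."""
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same size")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes that appear in neither label map
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this small case class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12. Evaluating such a score on images compressed to different distortion levels is one way the first correspondence could be sampled.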
The apparatus for determining an image processing mode may be a chip with a processing function or a program module in a processor. The processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The chip or processor can implement the methods of the foregoing embodiments of the present application by executing a program.
An embodiment of the present application provides an electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to implement the methods of the foregoing embodiments of the present application when executing the instructions.
The foregoing apparatus for determining an image processing mode, or the electronic device, may be a general-purpose device or a dedicated device. In a specific implementation, the apparatus may also be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with a processing function. The embodiments of the present application do not limit the type of the apparatus for determining an image processing mode.
An embodiment of the present application provides a non-volatile computer-readable storage medium on which computer program instructions are stored, where the computer program instructions implement the foregoing method when executed by a processor.
An embodiment of the present application provides a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code, where a processor in an electronic device executes the foregoing method when the computer-readable code runs in the processor of the electronic device.
A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or raised structures in a groove on which instructions are stored, and any suitable combination of the foregoing.

Claims (17)

  1. A method for determining an image processing mode, characterized in that the method comprises:
    determining, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is a correspondence between accuracy and distortion; and
    determining, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, wherein the second correspondence is a correspondence between distortion and bit rate.
  2. The method according to claim 1, characterized in that determining, according to the accuracy threshold required by the service and the first correspondence, the distortion threshold corresponding to the accuracy threshold comprises:
    determining, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognition performed on the original image; and
    determining, according to the second accuracy and the first correspondence, a distortion threshold corresponding to the second accuracy.
  3. The method according to claim 1 or 2, characterized in that the bit rate is the sampling frequency at which a compression algorithm compresses the original image to obtain the compressed image, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognition performed on the compressed image.
  4. The method according to any one of claims 1 to 3, characterized in that
    the second correspondence is obtained by testing compression algorithms, the second correspondence comprises a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
  5. The method according to claim 4, characterized in that
    determining, according to the distortion threshold and the second correspondence, the bit rate threshold corresponding to the distortion threshold comprises:
    determining the compression algorithm used to compress the original image;
    determining the sub-correspondence corresponding to the compression algorithm; and
    determining, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
  6. The method according to any one of claims 1 to 3, characterized in that the distortion is obtained from one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean squared error (MSE), structural similarity index (SSIM), and perceptual loss.
  7. The method according to any one of claims 1 to 3, characterized in that the accuracy is obtained from one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
  8. An apparatus for determining an image processing mode, characterized in that the apparatus comprises:
    a first determining module, configured to determine, according to an accuracy threshold required by a service and a first correspondence, a distortion threshold corresponding to the accuracy threshold, wherein the first correspondence is a correspondence between accuracy and distortion; and
    a second determining module, configured to determine, according to the distortion threshold and a second correspondence, a bit rate threshold corresponding to the distortion threshold, wherein the second correspondence is a correspondence between distortion and bit rate.
  9. The apparatus according to claim 8, characterized in that the first determining module comprises:
    a first determining unit, configured to determine, according to the accuracy threshold and a first accuracy, a second accuracy corresponding to the accuracy threshold, wherein the first accuracy is the accuracy of recognition performed on the original image; and
    a second determining unit, configured to determine, according to the second accuracy and the first correspondence, a distortion threshold corresponding to the second accuracy.
  10. The apparatus according to claim 8 or 9, characterized in that the bit rate is the sampling frequency at which a compression algorithm compresses the original image to obtain the compressed image, the distortion is the difference between the compressed image and the real environment, and the accuracy is the accuracy of recognition performed on the compressed image.
  11. The apparatus according to any one of claims 8 to 10, characterized in that
    the second correspondence is obtained by testing compression algorithms, the second correspondence comprises a plurality of different sub-correspondences, each sub-correspondence corresponds to one compression algorithm, and the first correspondence is the same for different compression algorithms.
  12. The apparatus according to claim 11, characterized in that
    the second determining module comprises:
    a third determining unit, configured to determine the compression algorithm used to compress the original image;
    a fourth determining unit, configured to determine the sub-correspondence corresponding to the compression algorithm; and
    a fifth determining unit, configured to determine, according to the distortion threshold and the sub-correspondence, the bit rate threshold corresponding to the distortion threshold.
  13. The apparatus according to any one of claims 8 to 10, characterized in that the distortion is obtained from one or more of the following indicators: peak signal-to-noise ratio (PSNR), mean squared error (MSE), structural similarity index (SSIM), and perceptual loss.
  14. The apparatus according to any one of claims 8 to 10, characterized in that the accuracy is obtained from one or more of the following indicators: mean average precision (mAP), average precision (AP), average recall (AR), and mean intersection over union (MIoU).
  15. A computer program product, comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 7.
  16. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to implement the method according to any one of claims 1 to 7 when executing the instructions.
  17. A non-volatile computer-readable storage medium on which computer program instructions are stored, characterized in that the computer program instructions implement the method according to any one of claims 1 to 7 when executed by a processor.
PCT/CN2021/084373 2021-03-31 2021-03-31 Method and apparatus for determining image processing mode WO2022205058A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/084373 WO2022205058A1 (en) 2021-03-31 2021-03-31 Method and apparatus for determining image processing mode
CN202180001346.1A CN113366531A (en) 2021-03-31 2021-03-31 Method and device for determining image processing mode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/084373 WO2022205058A1 (en) 2021-03-31 2021-03-31 Method and apparatus for determining image processing mode

Publications (1)

Publication Number Publication Date
WO2022205058A1 true WO2022205058A1 (en) 2022-10-06

Family

ID=77523047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084373 WO2022205058A1 (en) 2021-03-31 2021-03-31 Method and apparatus for determining image processing mode

Country Status (2)

Country Link
CN (1) CN113366531A (en)
WO (1) WO2022205058A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112804578A (en) * 2021-01-28 2021-05-14 广州虎牙科技有限公司 Atmosphere special effect generation method and device, electronic equipment and storage medium
WO2022220723A1 (en) * 2021-04-15 2022-10-20 Telefonaktiebolaget Lm Ericsson (Publ) Method to determine encoder parameters
CN114153590A (en) * 2021-10-19 2022-03-08 广州文远知行科技有限公司 Large-scale simulation and model derivation method, device, equipment and storage medium
CN114786036B (en) * 2022-03-02 2024-03-22 上海仙途智能科技有限公司 Method and device for monitoring automatic driving vehicle, storage medium and computer equipment
CN114743076A (en) * 2022-04-22 2022-07-12 清华大学 Automatic driving image processing and evaluating method, related equipment, medium and product

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101365125A (en) * 2008-09-27 2009-02-11 腾讯科技(深圳)有限公司 Multipath video communication method and system
CN101521819A (en) * 2008-02-27 2009-09-02 深圳市融合视讯科技有限公司 Method for optimizing rate distortion in video image compression
CN101888561A (en) * 2010-07-02 2010-11-17 西南交通大学 Multi-view video transmission error control method for rate distortion optimization dynamic regulation
US20130089150A1 (en) * 2011-10-06 2013-04-11 Synopsys, Inc. Visual quality measure for real-time video processing
US20150110204A1 (en) * 2012-08-21 2015-04-23 Huawei Technologies Co., Ltd. Method and apparatus for acquiring video coding compression quality
CN108769685A (en) * 2018-06-05 2018-11-06 腾讯科技(深圳)有限公司 The method, apparatus and storage medium of detection image compression coding efficiency
CN111918067A (en) * 2020-07-23 2020-11-10 腾讯科技(深圳)有限公司 Data processing method and device and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1937002B1 (en) * 2006-12-21 2017-11-01 Rohde & Schwarz GmbH & Co. KG Method and device for estimating the image quality of compressed images and/or video sequences
JP4824712B2 (en) * 2008-02-29 2011-11-30 日本電信電話株式会社 Motion estimation accuracy estimation method, motion estimation accuracy estimation device, motion estimation accuracy estimation program, and computer-readable recording medium recording the program
CN111901594B (en) * 2020-06-29 2021-07-20 北京大学 Visual analysis task-oriented image coding method, electronic device and medium
CN112437301B (en) * 2020-10-13 2021-11-02 北京大学 Code rate control method and device for visual analysis, storage medium and terminal

Also Published As

Publication number Publication date
CN113366531A (en) 2021-09-07

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21933735; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21933735; Country of ref document: EP; Kind code of ref document: A1)