WO2022151053A1 - Data processing method, apparatus and system, and computer storage medium - Google Patents

Data processing method, apparatus and system, and computer storage medium

Info

Publication number
WO2022151053A1
WO2022151053A1 PCT/CN2021/071513 CN2021071513W WO2022151053A1 WO 2022151053 A1 WO2022151053 A1 WO 2022151053A1 CN 2021071513 W CN2021071513 W CN 2021071513W WO 2022151053 A1 WO2022151053 A1 WO 2022151053A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
layer image
enhancement layer
pixel
base layer
Application number
PCT/CN2021/071513
Other languages
English (en)
French (fr)
Inventor
Zhou Changbo
Li Chen
Original Assignee
SZ DJI Technology Co., Ltd.
Application filed by SZ DJI Technology Co., Ltd.
Priority to PCT/CN2021/071513
Publication of WO2022151053A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N 19/23 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic

Definitions

  • the present invention generally relates to the technical field of data processing, and more particularly to a data processing method, device and system, and a computer storage medium.
  • a first aspect of the present invention provides a data processing method, the method including: encoding an image to be processed with a first preset encoding rule to obtain a base layer image code stream; encoding the enhancement layer image data of a region of interest in the image to be processed with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream.
  • a second aspect of the present invention provides a data processing method, the method including: acquiring a base layer image code stream and an enhancement layer image code stream; decoding the base layer image code stream according to the first preset encoding rule to obtain base layer image data of an image; decoding the enhancement layer image code stream according to the second preset encoding rule to obtain enhancement layer image data of a region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.
  • a third aspect of the present invention provides a data processing apparatus, comprising: a memory for storing executable program instructions; and one or more processors for executing the program instructions stored in the memory, so that the apparatus executes: encoding the image to be processed with a first preset encoding rule to obtain a base layer image code stream; encoding the enhancement layer image data of the region of interest in the image to be processed with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream.
  • a fourth aspect of the present invention provides a data processing apparatus, comprising: a memory for storing executable program instructions; and one or more processors for executing the program instructions stored in the memory, so that the apparatus executes: acquiring the base layer image code stream and the enhancement layer image code stream; decoding the base layer image code stream with the first preset encoding rule to obtain the base layer image data of the image; decoding the enhancement layer image code stream with the second preset encoding rule to obtain the enhancement layer image data of the region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.
  • the present application further provides a data processing system, the data processing system comprising: the data processing apparatus described in the foregoing third aspect; and the data processing apparatus described in the foregoing fourth aspect.
  • the present application also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the data processing method of the first aspect or the data processing method of the second aspect.
  • the image data to be processed is encoded according to the first preset encoding rule to obtain the base layer image code stream, the enhancement layer image data of the region of interest in the image data to be processed is encoded with the second preset encoding rule to obtain the enhancement layer image code stream, and the base layer image code stream and the enhancement layer image code stream are transmitted, thereby completing the encoding of the image data to be processed while keeping the enhancement layer code stream compatible with the base layer code stream. Since the enhancement layer is applied only to the region of interest of the image, it only needs to be calculated in the region of interest, which effectively reduces the amount of calculation.
  • the image data to be processed can be HDR image data, which can be obtained by performing nonlinear image transformation processing on the original image data, so that the method according to the present application can finally encode the HDR image data corresponding to the region of interest.
  • the base layer image data can belong to a mainstream image format, such as JPEG, so that according to different requirements either the mainstream-format image data or the HDR image data can be decoded. The decoded base layer image data and enhancement layer image data are fused to obtain a target image, for example an HDR image, so that the dissemination of HDR images can be greatly improved.
  • since the decoding end can decode the base layer image code stream and the enhancement layer image code stream and fuse the decoded base layer image data and enhancement layer image data, a target image with a high dynamic range close to that of the image to be processed can be obtained. Since the bit depth of the enhancement layer image data is larger than the bit depth of the base layer image data corresponding to the base layer image code stream, image precision is retained, so that the user has a larger post-editing space and more creative freedom, which is more user-friendly.
  • the corresponding apparatus, system, and storage medium provided by the embodiments of the present application based on the above-mentioned data processing method may also have the above-mentioned corresponding technical effects.
  • FIG. 1 is a schematic flowchart of a data processing method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a data processing method provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of dividing into slices provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of slice division provided by an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a scanning block provided by an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a scanning slice provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of an enhancement layer image code stream provided by an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of header information of an enhancement layer image code stream provided by an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of header information of an enhancement layer image code stream provided by an embodiment of the present invention.
  • FIG. 10 is a schematic flowchart of a data processing method provided by an embodiment of the present invention.
  • FIG. 11 is a schematic block diagram of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 12 is a schematic block diagram of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 13 is a schematic block diagram of a data processing system according to an embodiment of the present invention.
  • JPEG XT is a new JPEG-compatible HDR image compression standard, but it is display-oriented, not friendly to users for post-editing images, and has limited editing space.
  • an embodiment of the present application proposes a data processing method, the method including: encoding a to-be-processed image according to a first preset encoding rule to obtain a base layer image code stream; encoding the enhancement layer image data of the region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream, so as to complete the encoding of the image data to be processed while keeping the enhancement layer image code stream compatible with the base layer image code stream.
  • since only the enhancement layer image data of the region of interest is encoded, adding enhancement layer image data over the whole image range is avoided, which can effectively reduce the amount of enhancement layer image data.
  • since the bit depth of the enhancement layer image data is larger than the bit depth of the base layer image data corresponding to the base layer image code stream, image precision is preserved, so that the user has a larger space for post-editing, more creative space and degrees of freedom, and a more user-friendly experience.
  • FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present invention.
  • the method 100 provided by an embodiment of the present application may be executed by a data processing device having an encoder, where the encoder may be set in the data processing device.
  • the data processing device may be a camera, an unmanned aerial vehicle, a handheld gimbal, a mobile phone, a tablet computer, etc., or another data processing device with a photographing and/or video recording function.
  • the method 100 includes the following steps S110 to S130:
  • in step S110, the image to be processed is encoded according to the first preset encoding rule to obtain a base layer image code stream.
  • the image to be processed is obtained by performing at least one nonlinear image transformation process on the original image; wherein, the bit depth of the image data before transformation is higher than or equal to the bit depth of the transformed image data.
  • through the nonlinear transformation processing, the loss of detail caused by linear compression can be avoided, so that the image data to be processed retains some detail information of the bright parts or the dark parts.
  • the original image is high dynamic range image data.
  • the original image data includes linear image data or nonlinear image data.
  • the linear image data may be image data acquired by an image sensor.
  • the nonlinear image data may be high dynamic range image data generated based on multiple images, or may be image data acquired by an image sensor.
  • the original image data may refer to the original image data collected by the image acquisition device in the data processing device; the original image data may be linear image data, for example linear HDR image data.
  • the image acquisition device may be a camera, a video camera, etc.; for example, the original image data may be acquired through the camera of a mobile phone, and the original image data may be 16-bit linear image data.
  • the image data to be processed may be obtained from the original image data; it may be nonlinear image data, and its bit depth may be 12 bits.
  • the acquisition methods can include the following methods:
  • for example, the camera of the mobile phone captures an image of the external environment, and the image is 16-bit linear image data.
  • after the encoder in the mobile phone obtains the image data, it can obtain 12-bit nonlinear image data through nonlinear image transformation, which can be used as intermediate layer image data or directly as the image data to be processed. That is, the image data to be processed can be nonlinear HDR image data.
  • the image data to be processed may also be obtained by performing linear image transformation processing on the original image data; the specific implementation process is similar to that described above and will not be repeated here. By way of description only: for example, linear image transformation processing (e.g., linear cropping) transforms 16-bit original image data into 15-bit image data, which compresses the amount of data for subsequent image processing.
  • after the nonlinear transformation, the HDR image will be closer to the displayable image of the display device than the original image.
  • the display device of an electronic device that has a decoder usually supports images with lower bit depths, such as 8-bit images.
  • meanwhile, the actual amount of data is not reduced too much: in the nonlinear transformation, the details to which the human eye is more sensitive are retained, while the details to which people are not sensitive are compressed more strongly.
  • therefore, the nonlinear 12-bit HDR image will be closer to the 8-bit image, while the linear 16-bit original image will be less close to it. Since the 12-bit HDR image is closer to the 8-bit image in terms of visual effect, it is friendlier for the user's viewing, and since it still retains the information of the high bit-depth image, it also leaves room for editing. It is thus very user-friendly, both in terms of visual effect and of editing.
  • more than one nonlinear transformation may be performed to obtain the image data to be processed; it is also possible to continue with another nonlinear transformation on the basis of a previous nonlinear transformation.
  • performing nonlinear image transformation processing on the original image data to obtain the to-be-processed image data includes: performing at least one nonlinear image transformation processing on the original image data to obtain the to-be-processed image data.
  • for example, on the basis of the 12-bit image data, the encoder in the mobile phone may continue to perform a nonlinear transformation to obtain 10-bit image data as the nonlinear HDR image data.
  • performing linear image transformation processing on the original image data to obtain the image data to be processed includes: performing linear image transformation processing on the original image data at least once to obtain the image data to be processed.
  • At least one nonlinear image transformation process and at least one linear image transformation process are performed on the original image data to obtain the to-be-processed image data. It should be understood that the processing order of the nonlinear image transformation processing and the linear image transformation processing is not limited.
  • the mobile phone continues to perform linear compression on the basis of the 12-bit image data, such as linear clipping, to obtain 10-bit image data as the final nonlinear HDR image data.
  • for example, the encoder in the mobile phone can perform nonlinear image transformation processing on the 16-bit original image data to obtain 14-bit image data, then perform linear cropping on the 14-bit image data to obtain 12-bit image data, and perform linear cropping once more to obtain 10-bit image data, which is used as the final nonlinear HDR image data.
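  • As a minimal sketch of the bit-depth reduction chain described above (the patent does not fix a specific nonlinear curve, so the gamma-style curve, the gamma value, and the array shapes below are illustrative assumptions):

```python
import numpy as np

def nonlinear_transform(img16: np.ndarray, out_bits: int = 12, gamma: float = 0.45) -> np.ndarray:
    """Map 16-bit linear image data to a lower bit depth with a nonlinear curve.

    A power-law (gamma) curve stands in for the unspecified nonlinear
    transformation: it compresses detail the eye is sensitive to less
    aggressively than a plain linear crop would.
    """
    x = img16.astype(np.float64) / 65535.0           # normalize 16-bit input
    y = np.power(x, gamma)                           # nonlinear companding
    return np.round(y * (2 ** out_bits - 1)).astype(np.uint16)

def linear_crop(img: np.ndarray, in_bits: int, out_bits: int) -> np.ndarray:
    """Linear cropping: drop the least significant bits (e.g. 12 bit -> 10 bit)."""
    return (img >> (in_bits - out_bits)).astype(np.uint16)

raw = np.random.randint(0, 65536, (1080, 1920), dtype=np.uint16)  # stand-in sensor data
hdr12 = nonlinear_transform(raw, out_bits=12)       # 16-bit linear -> 12-bit nonlinear
hdr10 = linear_crop(hdr12, 12, 10)                  # optional further linear crop
```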
  • the final data amount of the image data to be processed can also be selected for different display devices; if the display device can support larger display image data, such as 10-bit image data or 12-bit image data, the above-mentioned number of transformations and the targets of the image transformation can change accordingly, but the specific transformation steps are similar and will not be repeated here.
  • the above-mentioned image transformation process may also be driven by the user.
  • performing nonlinear image transformation processing on the original image data to obtain the image data to be processed includes: determining the target of the nonlinear image transformation of the original image data based on the image precision loss; and performing nonlinear image transformation processing on the original image data according to the target to obtain the image data to be processed.
  • the image precision loss refers to the loss of precision of the image during the transformation process, which is determined according to the final transformation target of the image data.
  • the image precision loss is preset, and there may be multiple losses, and each loss may correspond to a final transformation target, that is, the amount of data for the final transformation of the image.
  • for example, the image data is finally transformed into 10-bit image data, 12-bit image data, or 14-bit image data.
  • each transformation target may also correspond to a preset transformation process.
  • the encoder in the mobile phone can display various transformation targets in advance through a display device, such as the screen of the mobile phone where the encoder is located, for the user to select.
  • the encoder in the mobile phone determines the final transformation target in response to the user's selection, and transforms the original image through the above process and transformation target to obtain image data to be processed, which is used as the final nonlinear HDR image data.
  • the number of transformations can also be set, so that the transformation process is determined according to the number of transformation processes. Since similar content has been described above, it will not be repeated here.
  • similarly, it is also possible to perform linear image transformation processing on the original image data to obtain the image data to be processed, including: determining the target of the linear image transformation of the original image data based on the image precision loss; and performing linear image transformation processing on the original image according to the target to obtain the image data to be processed. This will not be repeated here.
  • original image data is acquired, and image transformation processing is performed on the original image data to obtain image data to be processed.
  • the image transformation processing may include linear transformation processing, nonlinear transformation processing, or a combination of linear and nonlinear transformation processing.
  • the base layer image data refers to image data at the base bit depth, that is, low bit-depth image data such as 8-bit image data, and its dynamic range is likewise the base dynamic range.
  • the base layer image data is compliant with the displayable image data of conventional display devices.
  • extracting the base layer image data of the preset data amount from the to-be-processed image data or the original image data includes: performing linear image transformation processing on the to-be-processed image data or the original image data to obtain the base layer image data of the preset data amount.
  • the preset data amount may be determined according to the requirements of the data processing device associated with or where the decoder is located, such as 8 bits.
  • the encoder in the mobile phone can perform linear cropping after acquiring the above-mentioned 12-bit image data, and directly obtain the linearly cropped 8-bit base layer image data.
  • the base layer image data of the preset data amount can also be obtained by performing linear image transformation processing from the original image data. It is not repeated here, but it should be understood that the dynamic range of the image data of the base layer is smaller than the dynamic range of the image data to be processed.
  • the first preset encoding rule is an encoding rule determined according to the JPEG encoding rule; for example, based on the image data to be processed, such as the aforementioned 12-bit image data, JPEG compression is performed to obtain the base layer image code stream.
  • it can also be an encoding rule determined according to the encoding rule of another current mainstream image format, such as TIFF (Tag Image File Format), BMP (Bitmap), etc.
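  • A minimal sketch of the base layer extraction and encoding described so far, reusing hdr12 from the earlier sketch (the 4-bit shift and Pillow as the JPEG encoder are illustrative assumptions, not mandated by the text):

```python
import numpy as np
from PIL import Image  # Pillow stands in for any JPEG encoder

def extract_base_layer(hdr12: np.ndarray) -> np.ndarray:
    """Linearly crop 12-bit to-be-processed data to 8-bit base layer data."""
    return (hdr12 >> 4).astype(np.uint8)

def encode_base_layer(base8: np.ndarray, path: str = "base_layer.jpg") -> None:
    """Encode the 8-bit base layer with the first preset rule (JPEG here)."""
    Image.fromarray(base8).save(path, format="JPEG", quality=90)

base8 = extract_base_layer(hdr12)   # 12-bit -> 8-bit by linear cropping
encode_base_layer(base8)            # base layer image code stream on disk
```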
  • in step S120, the enhancement layer image data of the region of interest in the image to be processed is encoded with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream.
  • the enhancement layer image data is obtained based on the image to be processed and the image code stream of the base layer.
  • the method of the present application further includes: decoding the base layer image code stream to obtain the base layer image data; and determining the enhancement layer image data of the image to be processed based on the to-be-processed image data and the base layer image data.
  • the enhancement layer image data refers to the image data between the high bit-depth image data and the base layer image data; its dynamic range is also a high dynamic range, higher than the dynamic range of the base layer image data.
  • decoding the base layer image code stream to obtain the base layer image data includes: decoding the base layer image code stream based on a first preset decoding rule corresponding to the first preset encoding rule, and using the decoded data as the base layer image data.
  • for example, after acquiring the base layer image data, that is, the unencoded 8-bit image data, the encoder in the mobile phone performs JPEG encoding on it to obtain the encoded base layer image data, from which the decoded base layer image data can then be obtained through image decoding.
  • JPEG is a widely used image compression standard because the standard software and hardware algorithms are simple to implement and can ensure the quality of images at a higher compression ratio.
  • other image formats can also be used for encoding and decoding; whichever image format is selected, when the entire image data to be processed is subsequently encoded, images in that format can remain compatible with HDR images.
  • determining the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data includes: calculating a residual based on the to-be-processed image data and the base layer image data (that is, the data obtained by decoding the base layer image code stream); and determining the enhancement layer image data of the image to be processed based on the residual.
  • the residual refers to image data obtained by removing the image data of the base layer from the image data to be processed.
  • the residual can be directly used as the enhancement layer image data of the image to be processed.
  • the residual may be calculated based on image data to be processed, such as 12-bit image data, and base layer image data obtained by decoding the base layer image code stream.
  • the calculation of the residual based on the image data to be processed and the image data of the base layer may include: subtracting the image data of the base layer from the image data to be processed to obtain the residual.
  • for example, the encoder can subtract the corresponding 8-bit image data from the 12-bit image data to obtain the enhancement layer image data of the image to be processed.
  • it should be noted that after the image data to be processed has been encoded, the electronic device that decodes the target image code stream can only obtain the decoded base layer image data from the decoded target image; it cannot obtain the unencoded base layer image data. Therefore, in order to better decode the target image later, the decoded enhancement layer image data must be fused with the decoded base layer image data (here, the base layer image compressed in the first image format, that is, the encoded base layer image) to obtain the decoded target image. Accordingly, when encoding the image data to be processed, the enhancement layer image data can be determined with respect to the encoded-and-decoded base layer image data; that is, the enhancement layer image data of the to-be-processed image can be obtained from the to-be-processed image data and the encoded-and-decoded base layer image data.
  • alternatively, the procedure can be: performing decoding processing on the base layer image code stream based on the first preset decoding rule corresponding to the first preset encoding rule, and using the decoded data as the base layer image data; calculating the residual then includes: calculating the residual based on the to-be-processed image data and this base layer image data.
  • for example, after acquiring the base layer image data, that is, the unencoded 8-bit image data, the encoder in the mobile phone performs JPEG encoding on it to obtain the encoded base layer image data, and then obtains the decoded base layer image data through image decoding. The corresponding decoded base layer image data is then subtracted from the to-be-processed image data to obtain the enhancement layer image data.
  • random noise can be added to the base layer image data obtained above (which can be the unencoded base layer image data or the decoded base layer image data); this is equivalent to adding random noise to the enhancement layer image data, because the enhancement layer image data is obtained from the base layer image data with the random noise added. Blockiness can thus be improved at the cost of a larger code stream and some noise.
  • accordingly, the method 100 may further include: adding random noise data to the base layer image data; calculating the residual based on the to-be-processed image data and the base layer image data then includes: calculating the residual based on the to-be-processed image data and the base layer image data with the random noise data added.
  • the method 100 may also further include: after calculating the residual, adding random noise data to the residual; or, adding random noise data in the process of calculating the residual, so that the obtained residual carries random noise data.
  • before the residual is calculated, the dynamic range represented by the image data to be processed can be aligned with the dynamic range represented by the base layer image data (the base layer image data involved here and below can be the unencoded base layer image data, or the base layer image data obtained after decoding the base layer image code stream).
  • for example, the alignment may be: hdr/16, where hdr is the image data to be processed, that is, the above-mentioned nonlinear HDR image data, and the base layer image data is 8-bit image data. hdr/16 yields an 8-bit integer part and a 4-bit fractional part, thereby aligning with the dynamic range of the base layer image data.
  • the aligned to-be-processed image data is then compensated, and the base layer image data is subtracted from the compensated to-be-processed image data to obtain the residual.
  • the compensation method may be, for example, compensating the hdr/16 data as hdr/16*drc_coef, where drc_coef is the reciprocal of the ratio between the dynamic range of the base layer image data and the dynamic range of the HDR image (i.e., hdr) data.
  • subtracting the base layer image data then yields the enhancement layer image data.
  • that is, calculating the residual based on the to-be-processed image data and the base layer image data includes: aligning the dynamic range represented by the to-be-processed image data with the dynamic range represented by the base layer image data; compensating the aligned to-be-processed image data according to the ratio of the dynamic range of the to-be-processed image data to the dynamic range of the base layer image data; and subtracting the base layer image data from the compensated to-be-processed image data to obtain the residual.
  • further, the enhancement layer image data can be selected for retention.
  • determining the enhancement layer image data based on the residual includes: obtaining the data to be retained based on the residual; and selecting the retained data from the data to be retained and using it as the enhancement layer image data of the image to be processed.
  • selecting and retaining data refers to performing data retention or data selection on the obtained residual.
  • the way to select the retained data can be, for example, (hdr/16*drc_coef - base)*2^n, where n is the number of fractional bits to keep. As noted above, hdr/16 yields 4 fractional bits, so the maximum value of n is 4 and the minimum is 1; choosing n = 3 keeps three fractional bits. (hdr/16*drc_coef - base) is the residual, and random noise data can be added to this residual.
  • selecting the reserved data from the data to be reserved and using it as the enhancement layer image data includes: adjusting the reserved data to obtain the enhancement layer image data with positive integer data.
  • res is the enhancement layer image data; it can be understood that it is adjusted to a positive integer through an offset of 2048.
  • base can be the pre-processed base layer image data.
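  • Assembling the fragments above into a minimal sketch (the drc_coef value, the uniform noise model, and the absence of clipping are assumptions; the text only fixes hdr/16, the *2^n retention with n = 3, and the 2048 offset):

```python
import numpy as np

def enhancement_layer(hdr12: np.ndarray, base8: np.ndarray,
                      drc_coef: float = 1.0, n: int = 3,
                      offset: int = 2048, add_noise: bool = False) -> np.ndarray:
    """res = (hdr/16 * drc_coef - base) * 2**n + 2048, per the text above.

    hdr12    : 12-bit to-be-processed (nonlinear HDR) image data
    base8    : base layer image data (ideally the encoded-then-decoded version)
    drc_coef : dynamic-range compensation factor (reciprocal of the base/HDR
               dynamic-range ratio); 1.0 is a placeholder
    n        : number of fractional bits to keep (1..4, here 3)
    offset   : shift that makes the result a positive integer
    """
    aligned = hdr12.astype(np.float64) / 16.0           # 8-bit integer + 4-bit fraction
    compensated = aligned * drc_coef                    # dynamic-range compensation
    residual = compensated - base8.astype(np.float64)   # remove the base layer
    if add_noise:                                       # optional dither against blockiness
        residual += np.random.uniform(-0.5, 0.5, residual.shape)
    return np.round(residual * (2 ** n) + offset).astype(np.int32)

res = enhancement_layer(hdr12, base8)  # reuses arrays from the earlier sketches
```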
  • the method further includes: encoding the enhancement layer image data of the region of interest in the to-be-processed image data according to a second preset encoding rule to obtain an enhancement layer image code stream.
  • the second preset encoding rule refers to the encoding rule for the enhancement layer image data, according to which the enhancement layer image data is encoded.
  • as shown in Figure 2, the encoding process may include: dividing into slices (slice image data), determining a region of interest, DCT (Discrete Cosine Transform) transformation, quantization, and calculation of the Qp map (quantization parameter map). The DCT transform coefficients obtained after the DCT transform are quantized, thereby encoding the image data.
  • the Qp map can be determined from the base layer image data.
  • the enhancement layer image data can first be divided into slices, each slice then divided into 8×8-pixel blocks, and DCT transform performed on each block to obtain the DCT transform coefficients of each block.
  • the enhancement layer image data of the image to be processed is divided to obtain multiple groups of macroblocks, and the macroblocks are divided into sub-blocks; for the sub-blocks, second information corresponding to the sub-blocks is determined.
  • the enhancement layer image data is divided to obtain multiple groups of macroblocks, and the macroblocks are divided into sub-blocks; based on the multiple groups of macroblocks, slice image data is generated by combining.
  • the encoder in the mobile phone can firstly divide the input enhancement layer image data into slices for smooth subsequent encoding.
  • the enhancement layer image data is divided into macroblocks MB, the size of which is 16x16 pixels.
  • the MB can then be further divided into 4 blocks, each with a size of 8x8 pixels, and these blocks can be the basic unit of subsequent processing.
  • the MBs of a slice should preferably be in the same MB row.
  • if fewer than 8 MBs remain, the number of MBs in the slice can be 4; then the number of MBs in the slice can be 2; and so on, until there is only 1 MB in the slice.
  • a schematic diagram of the division into slices is given in Figure 3, in which there are 8 slices in each row, 19 slice rows in total, and 152 slices in total. Each slice can be encoded and decoded in parallel. Slice 301 has 8 MBs, slice 302 has 4 MBs, slice 303 has 2 MBs, and slice 304 has 1 MB.
  • the input of the sub-module corresponding to the division may be the enhancement layer image data, and the output may be the divided image data.
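  • A sketch of this slice division, assuming the greedy 8/4/2/1 shrinking implied by Figure 3 (function and variable names are illustrative):

```python
def divide_into_slices(width_mb: int, height_mb: int, max_mbs: int = 8):
    """Divide an MB grid into slices of 8, 4, 2, or 1 macroblocks.

    width_mb/height_mb are the image dimensions in 16x16-pixel MBs;
    each slice stays within a single MB row, as the text prefers.
    """
    slices = []
    for row in range(height_mb):
        col, size = 0, max_mbs
        while col < width_mb:
            while size > width_mb - col:     # shrink 8 -> 4 -> 2 -> 1 at the row end
                size //= 2
            slices.append((row, col, size))  # (MB row, first MB column, MB count)
            col += size
    return slices

# e.g. a row 61 MBs wide yields slices of 8,8,8,8,8,8,8,4,1 macroblocks
print(divide_into_slices(61, 1))
```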
  • a region of interest in the image to be processed is determined.
  • the region of interest may be an area in the image to be processed determined based on an input instruction of the user, or may also be an area automatically determined by the data processing device through intelligent analysis.
  • the region of interest may include high-brightness areas and/or low-brightness areas in the image to be processed.
  • determining the region of interest in the image to be processed includes: determining the region of interest in the image based on the first pixel information of the image data to be processed and/or the second pixel information of the enhancement layer image data.
  • for example, first pixel information corresponding to the pixel units in each pixel region of the image data to be processed is obtained, and/or second pixel information corresponding to the pixel units in each pixel region of the enhancement layer image data of the image data to be processed is obtained; when the first pixel information of at least one pixel unit in a first pixel region satisfies a first preset threshold condition and/or its second pixel information satisfies a second preset threshold condition, the first pixel region is determined to belong to the region of interest.
  • the first pixel area may correspond to any one of the multiple slices in the foregoing description, or the first pixel area may also be any one of the multiple pixel areas divided based on other rules.
  • the first preset threshold condition and the second preset threshold condition may be reasonably set according to prior experience.
  • the first pixel information includes luminance information, a first chrominance component and a second chrominance component
  • the first preset threshold condition includes: the luminance information is greater than a first luminance threshold; or the luminance information is less than a second luminance threshold, where the first luminance threshold is greater than the second luminance threshold; or the first chrominance component is greater than a first chrominance threshold; or the first chrominance component is smaller than a second chrominance threshold, where the first chrominance threshold is greater than the second chrominance threshold; or the second chrominance component is greater than a third chrominance threshold; or the second chrominance component is less than a fourth chrominance threshold, where the third chrominance threshold is greater than the fourth chrominance threshold.
  • the high-brightness pixel unit and the low-brightness pixel unit in the to-be-processed image can be determined through the first preset threshold condition.
  • the second pixel information includes a luminance residual, a first chrominance component residual, and a second chrominance component residual.
  • the second preset threshold condition includes: the absolute value of the luminance residual is greater than a first luminance residual threshold; or the absolute value of the first chrominance component residual is greater than a first chrominance component residual threshold; or the absolute value of the second chrominance component residual is greater than a second chrominance component residual threshold.
  • a region of interest may include one or more pixel regions, and an image to be processed may include one or more regions of interest.
  • the method of the present application further includes: setting labels for the pixel units and/or pixel areas in the image to be processed, where a pixel unit and/or pixel area determined to belong to the region of interest is set with a first label, the first label being used to indicate that the corresponding pixel unit and/or pixel area has enhancement layer image data, and a pixel unit and/or pixel area that does not belong to the region of interest is set with a second label or with no label; that is, when a pixel unit and/or pixel area does not carry the first label, or carries the second label, this indicates that the corresponding pixel unit and/or pixel area has no enhancement layer image data.
  • in the enhancement layer image data corresponding to the pixel units and/or pixel areas set with the first label, that is, in the enhancement layer image data of the region of interest, more detail and dynamic range are preserved.
  • the first label may be "true", or the first label may be 1 and the second label may be 0, or may be any other suitable label type.
  • the image data to be processed is data in YUV format
  • “Y” represents brightness (Luminance or Luma), that is, Grayscale value
  • "U” and “V” represent chrominance (Chrominance or Chroma), which is used to describe the color and saturation of the image, and is used to specify the color of the pixel
  • the brightness information of the first pixel information includes the Y value
  • the first chrominance component of the first pixel information includes a U value
  • the second chrominance component of the first pixel information includes a V value
  • all pixel units in the image to be processed correspond to their respective Y, U, and V values.
  • the first luminance threshold is YTH1, the second luminance threshold is YTH2, the first chrominance threshold is UTH1, the second chrominance threshold is UTH2, the third chrominance threshold is VTH1, and the fourth chrominance threshold is VTH2.
  • the first luminance residual threshold is DY_TH, the first chrominance component residual threshold is DU_TH, and the second chrominance component residual threshold is DV_TH.
  • if no pixel unit in a slice carries a "true" flag, the enhancement layer image data of the corresponding slice is not saved; the enhancement layer image data is considered to be 0, that is, no enhancement layer image data is carried. If there is at least one "true" flag in the slice, the enhancement layer image data of the corresponding slice is saved.
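  • The per-pixel labeling can be sketched as follows (all threshold values here are illustrative placeholders; the text only says they are set from prior experience):

```python
import numpy as np

def roi_mask(Y, U, V, dY=None, dU=None, dV=None,
             YTH1=224, YTH2=32, UTH1=200, UTH2=56, VTH1=200, VTH2=56,
             DY_TH=8, DU_TH=8, DV_TH=8):
    """Mark pixel units with the first label ("true") from the thresholds above.

    Y/U/V are the planes of the to-be-processed image (first pixel
    information); dY/dU/dV are optional enhancement-layer residual planes
    (second pixel information). True means the pixel carries the first label.
    """
    mask = ((Y > YTH1) | (Y < YTH2) |
            (U > UTH1) | (U < UTH2) |
            (V > VTH1) | (V < VTH2))
    if dY is not None:  # second preset threshold condition, if residuals are given
        mask |= (np.abs(dY) > DY_TH) | (np.abs(dU) > DU_TH) | (np.abs(dV) > DV_TH)
    return mask
```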
  • determining the region of interest in the image to be processed based on the first pixel information of the image data to be processed and/or the second pixel information of the enhancement layer image data further includes: surrounding, with at least one frame, at least a portion of the pixel units set with the first label, where the proportion of the pixel units surrounded by the at least one frame among all pixel units set with the first label is not less than a threshold ratio; and/or surrounding, with at least one frame, at least a portion of the pixel areas set with the first label, where the proportion of the pixel areas surrounded by the at least one frame among all pixel areas set with the first label is not less than the threshold ratio. The area enclosed by the frame (which includes at least part of the pixel units or pixel areas marked with the first label, and may also include some pixel units or pixel areas not marked with the first label or set with the second label) is the region of interest.
  • At least one frame may be one frame or multiple frames, and one frame corresponds to one region of interest.
  • the threshold ratio can be reasonably set according to actual needs, for example, the threshold ratio is greater than or equal to 80%, such as 80%, 90%, etc.
  • this range of the threshold ratio is only an example; other suitable ratios also apply in this application.
  • the shape of the frame can be any suitable shape, such as a rectangular frame, a circular frame, a polygonal frame, and the like.
  • the frame may be a rectangular frame.
  • for example, 1 to N rectangular frames are selected in the image to be processed, and the area marked with the first label, such as "true", by the marker map (MASK) is covered with a rectangular area as small as possible; for example, a rectangular box is selected that covers 90% of the area marked "true" by the marker map (MASK).
  • the size and position of the rectangular frame correspond to the size and position of the region of interest and can be generated by an iterative algorithm: starting from a rectangular frame covering the entire image to be processed, the length and width of the rectangular frame are continuously reduced and the rectangle is moved within the image; when the rectangle can no longer be moved or shrunk without violating the coverage requirement of not less than the threshold ratio, such as 90% of the area marked with the first label "true" in the marker map (MASK), the iteration stops, and the rectangle at this point is used as the final frame, that is, the frame of the region of interest.
  • the enhancement layer image data is calculated and saved within the range of the rectangular frame, and the position and size of the rectangular frame are recorded as the position and size of the region of interest.
  • the above steps of determining the region of interest can also be performed before the enhancement layer image data is calculated; that is, the region of interest can be determined based on the image data to be processed, and the series of enhancement layer image data calculations can be performed only for the pixels corresponding to the region of interest.
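  • A simple greedy variant of that iterative shrinking can be sketched as follows (the trimming order and tie-breaking are implementation choices not fixed by the text):

```python
import numpy as np

def roi_rect(mask: np.ndarray, coverage: float = 0.9):
    """Shrink a frame from the whole image while it still covers >= coverage
    (e.g. 90%) of the "true" pixels in the marker map (MASK)."""
    total = mask.sum()
    top, left, bottom, right = 0, 0, mask.shape[0], mask.shape[1]
    shrunk = True
    while shrunk:
        shrunk = False
        # try trimming one row or column from each side in turn
        for t, l, b, r in ((top + 1, left, bottom, right),
                           (top, left + 1, bottom, right),
                           (top, left, bottom - 1, right),
                           (top, left, bottom, right - 1)):
            if b - t < 1 or r - l < 1:
                continue
            if mask[t:b, l:r].sum() >= coverage * total:
                top, left, bottom, right = t, l, b, r
                shrunk = True
                break
    return top, left, bottom, right  # position and size of the ROI frame
```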
  • for different ROI regions, different strategies can be used to save the image information in the ROI in a targeted manner. For an ROI mainly composed of high-brightness pixel units, compression can be applied to retain the highlight information with higher precision; for an ROI mainly composed of low-brightness pixel units, brightness amplification will be performed to preserve the information in shadows with higher precision.
  • encoding the enhancement layer image data of the region of interest in the to-be-processed image according to the second preset encoding rule to obtain the enhancement layer image code stream includes: determining, according to the brightness information of the pixel units and/or pixel areas in the region of interest, the brightness state of the pixel units and/or pixel areas of the region of interest, the brightness state including a first brightness state and a second brightness state; compressing the enhancement layer image data corresponding to the pixel units and/or pixel areas of the first brightness state so that it is compressed into a predetermined input range; or enlarging the enhancement layer image data corresponding to the pixel units and/or pixel areas of the second brightness state so that it is enlarged into the predetermined input range.
  • the brightness corresponding to the first brightness state is higher than the brightness corresponding to the second brightness state; that is, the first brightness state may correspond to high brightness and the second brightness state to low brightness, where the brightness corresponding to the first brightness state is higher than the first brightness threshold, the brightness corresponding to the second brightness state is lower than the second brightness threshold, and the first brightness threshold is greater than the second brightness threshold. These brightness thresholds can be set reasonably according to prior experience, and the judgment can be made based on the first pixel information of the image data to be processed or the second pixel information of the enhancement layer image data in the region of interest, using at least one of the first and second preset conditions. For example, when Y > YTH1, or U > UTH1, or V > VTH1 for a pixel unit, the pixel unit is determined to be a high-brightness pixel, that is, in the first brightness state; when Y < YTH2, or U < UTH2, or V < VTH2, the pixel unit is determined to be a low-brightness pixel, that is, in the second brightness state.
  • alternatively, encoding the enhancement layer image data of the region of interest in the to-be-processed image according to the second preset encoding rule to obtain the enhancement layer image code stream includes: determining, according to the brightness information of the pixel units and/or pixel areas in the region of interest, the brightness state of the pixel units and/or pixel areas of the region of interest, the brightness state including a first brightness state and a second brightness state, where the brightness corresponding to the first brightness state is higher than that corresponding to the second brightness state; when the proportion of pixel units and/or pixel areas in the region of interest whose brightness state is the first brightness state is greater than a first threshold ratio, compressing the enhancement layer image data corresponding to the region of interest so that it is compressed into the predetermined input range; and when the proportion of pixel units and/or pixel areas in the region of interest whose brightness state is the second brightness state is greater than a second threshold ratio, enlarging the enhancement layer image data corresponding to the region of interest so that it is enlarged into the predetermined input range.
  • the first threshold ratio and the second threshold ratio can be set reasonably according to actual needs; for example, each can be 70%, 80%, or 90%, etc.
  • the dominant brightness state of the pixel units in the region of interest can thus be judged through the threshold ratios, so that different measures are adopted for different regions of interest and the image information in the ROI is saved in a targeted manner.
  • the aforementioned predetermined input range refers to the input range of the compression algorithm; different compression algorithms may have different input ranges or the same input range. For example, the input range of the compression algorithm here is 0-4095.
  • the enhancement layer image data corresponding to the region of interest can be compressed into the predetermined input range with any suitable compression method known to those skilled in the art. For example, the enhancement layer image data corresponding to the region of interest is multiplied by a compression factor, where the compression factor is not less than 0 and not greater than 1, and further not less than 1/256 and not greater than 1; or, based on a first preset curve, the enhancement layer image data corresponding to the region of interest is nonlinearly compressed into the predetermined input range. By spreading the pixel values uniformly over the input range of the compression algorithm, e.g. 0-4095, the highlight information is preserved with greater precision.
  • likewise, the enhancement layer image data corresponding to the region of interest can be enlarged into the predetermined input range with any suitable enlargement method known to those skilled in the art; for example, by adding a compensation offset (e.g., a larger offset, such as 4000) to the enhancement layer image data corresponding to the region of interest, or, based on a second preset curve, by nonlinearly amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range, so that the pixel values are distributed more evenly within the input range of the compression algorithm and the information in shadows is preserved with greater precision.
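  • A sketch of the two ROI strategies against the 0-4095 input range (the factor and offset defaults are illustrative; the text gives only the bounds on the compression factor and 4000 as an example offset):

```python
import numpy as np

IN_MAX = 4095  # predetermined input range of the compression algorithm: 0-4095

def compress_roi(res: np.ndarray, factor: float = 0.25) -> np.ndarray:
    """Highlight-dominated ROI: scale values down into the input range.

    factor is a compression factor in [1/256, 1]; 0.25 is a placeholder.
    """
    return np.clip(res * factor, 0, IN_MAX).astype(np.int32)

def amplify_roi(res: np.ndarray, offset: int = 4000) -> np.ndarray:
    """Shadow-dominated ROI: add a compensation offset (e.g. 4000) so the
    small shadow residuals spread over the input range."""
    return np.clip(res + offset, 0, IN_MAX).astype(np.int32)
```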
  • in addition, the encoding process for the enhancement layer image data can also include: DCT (Discrete Cosine Transform) transformation, quantization, calculation of the Qp map (quantization parameter map), that is, the quantization parameters in Figure 2, coefficient scanning, entropy coding, etc.
  • the DCT transform coefficients obtained after the DCT transform are quantized, thereby encoding the image data.
  • the Qp map can be determined from the base layer image data (here, the base layer image data refers to the unencoded base layer image data).
  • encoding the enhancement layer image data based on the second preset encoding rule includes: encoding the enhancement layer image data based on the base layer image data.
  • encoding the enhancement layer image data based on the base layer image data includes: determining first information reflecting the compression of the enhancement layer image data according to the base layer image data; encoding the enhancement layer image data based on the first information .
  • the first information refers to information reflecting the compression of the enhancement layer image, such as the Qp map: the larger a Qp in the Qp map, the more strongly the enhancement layer image at the corresponding position is compressed; the smaller the Qp, the more completely the information at that position of the enhancement layer image is preserved.
  • here, compression also refers to the image data information that is removed.
  • for example, the number of pixels corresponding to the bright parts and/or dark parts of the base layer image data can be determined first, and then the size of Qp determined through a quantity threshold, so as to encode accordingly.
  • determining the first information reflecting the compression of the enhancement layer image data of the region of interest according to the base layer image data includes: determining, according to the brightness distribution of the base layer image data, the first information reflecting the compression of the enhancement layer image data of the region of interest.
  • the brightness distribution of the base layer image data refers to the distribution of bright parts (also referred to as highlights) and dark parts in the base layer image data. If the number of bright and/or dark pixels in a certain image data area is larger, that area is stored more completely and its Qp is smaller; otherwise, its Qp is larger.
  • the number of highlight pixels is determined according to the brightness in each preset image data area of the image, the number of dark pixels is determined according to the brightness in each preset image data area, and the corresponding brightness distribution is then determined according to the number of highlight pixels and/or the number of dark pixels.
  • the number of dark pixels and the number of highlighted pixels can be determined by the color information corresponding to the image data.
  • for example, the base layer image is converted from YUV format to RGB format; the number of highlight pixels is determined according to the brightness in each preset image data area of the RGB-format image, the number of dark pixels is determined according to the brightness in each preset image data area of the YUV-format image, and the corresponding brightness distribution is determined according to the number of highlight pixels and/or the number of dark pixels.
  • specifically, the encoder in the mobile phone first converts the input base layer image data from the YUV color format ("Y" represents luminance (Luminance or Luma), that is, the grayscale value; "U" and "V" represent chrominance (Chrominance or Chroma), which describe the color and saturation of the image and specify the color of a pixel) into RGB (red, green, blue color mode) image data.
  • the base layer image data may be divided into a plurality of image data regions in advance, and the sum of the number of pixels in the highlight and dark regions may be counted for each region.
  • if the sum exceeds a preset quantity threshold, the quantization parameter qIdx of this area is set to 2; otherwise it is set to 8. When the quantization parameter has been determined for each area, the Qp map that guides the quantization process of the enhancement layer image data is obtained. It should be understood that each region shares the same set of quantization parameters.
  • the division method may be the above-mentioned division into slices (slice image data, every 8 MBs forming one slice, where an MB (macroblock) has a size of 16x16 pixels), or the base layer image data can be divided into 16*32 quantization blocks (Quantization Block, QB); the same quantization parameter is used within the same QB and/or within the same slice. It should be understood that the image data may be divided into regions according to requirements, which is not limited here.
  • the determining of the first information may thus be: determining the first information reflecting the compression of the enhancement layer image data according to the luminance distribution of the base layer image data, including: dividing the base layer image data into quantization blocks; for each quantization block, determining the corresponding luminance distribution; and, according to the luminance distribution, determining the quantization parameter of each quantization block.
  • the determination of the brightness distribution may specifically include: determining the corresponding brightness distribution for each slice image data, including: determining the corresponding brightness distribution for each slice image data, including: according to each slice image data In the case of medium brightness, determine the number of highlighted pixels; according to the brightness in each slice image data, determine the number of pixels in the dark; determine the corresponding brightness according to the number of highlighted pixels and/or the number of pixels in the dark Distribution.
  • Determining the number of highlight pixels according to the brightness in each slice includes: converting the base layer image data from YUV format to RGB format and determining the number of highlight pixels according to the brightness of each slice in RGB format. Determining the number of dark pixels according to the brightness of each slice includes: determining the number of dark pixels according to the brightness of each slice in YUV format. The corresponding brightness distribution is then determined from the number of highlight pixels and/or the number of dark pixels.
  • Determining the quantization parameter of a slice according to the brightness distribution may be: determining the first information reflecting the compression of the enhancement layer image data according to the brightness distribution of the base layer image data includes: dividing the base layer image data into slices; determining the corresponding brightness distribution for each slice; and determining the quantization parameter of each slice according to its brightness distribution. Since this has been described above, it is not repeated here.
  • A pixel is determined to be a highlight pixel if the value of any one of its three RGB channels is greater than 220. A pixel is determined to be a dark pixel if the value of its Y channel in YUV is less than 64. The counts are accumulated accordingly.
  • Determining the quantization parameter of each slice according to the brightness distribution includes: determining the quantization parameter of the corresponding slice according to the number of highlight pixels and/or the number of dark pixels in the brightness distribution.
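As a minimal sketch of how such a Qp map could be built, assuming the 220/64 thresholds and the qIdx values 2 and 8 mentioned above; the region size, the minimum pixel count, and the YUV-to-RGB conversion coefficients are illustrative assumptions not taken from the source:

```python
import numpy as np

def build_qp_map(y, u, v, region=(16, 32), min_count=1):
    """Assign a quantization parameter per region of the base layer image.

    y, u, v: full-resolution YUV planes (uint8 arrays of the same shape).
    region:  (height, width) of one quantization region, e.g. a 16x32 QB.
    A region containing highlight or dark pixels gets the small qIdx (2),
    so it is encoded at higher quality; other regions get the large qIdx (8).
    """
    # BT.601 full-range YUV -> RGB (an illustrative choice of matrix)
    yf = y.astype(np.float32)
    uf = u.astype(np.float32) - 128
    vf = v.astype(np.float32) - 128
    r = yf + 1.402 * vf
    g = yf - 0.344136 * uf - 0.714136 * vf
    b = yf + 1.772 * uf

    highlight = (r > 220) | (g > 220) | (b > 220)  # any RGB channel above 220
    dark = y < 64                                  # Y channel below 64

    h, w = y.shape
    rh, rw = region
    qp_map = np.full((h // rh, w // rw), 8, dtype=np.uint8)
    for by in range(qp_map.shape[0]):
        for bx in range(qp_map.shape[1]):
            win = np.s_[by * rh:(by + 1) * rh, bx * rw:(bx + 1) * rw]
            if highlight[win].sum() + dark[win].sum() >= min_count:
                qp_map[by, bx] = 2  # higher quality in highlight/dark regions
    return qp_map
```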
  • The above is a process of adaptively setting quantization parameters according to the importance of the content of the image data. The embodiment of the present application uses relatively small quantization parameters in these parts of the image data, so that these parts are encoded at higher quality, preserving the user's room for post-editing of the image.
  • The first information can also be determined according to the brightness distribution of the HDR image data (for example, the above-mentioned 12-bit image data to be processed). The implementation is similar to the above and is not repeated; it suffices to replace the base layer image data with the image data to be processed.
  • The encoding process of the enhancement layer image data shown in FIG. 2 can be implemented in a modular fashion. For example, division into slices, DCT (Discrete Cosine Transform), quantization, calculation of the Qp map (quantization parameter map, quantization parameter table), coefficient scanning, entropy coding, etc. can each be implemented as a sub-module realizing the corresponding function, with each sub-module producing its output from its input information.
  • For the Qp map sub-module, the input can be the base layer image data or the image data to be processed, and the output is the Qp map; quantization can then be performed.
  • Encoding the enhancement layer image data based on the first information includes: determining second information that divides the enhancement layer image data by image data frequency; quantizing the second information based on the first information; and encoding the enhancement layer image data according to the quantization result.
  • The second information refers to information that divides the enhancement layer image data by image data frequency, for example the DCT transform coefficients, which separate the image data into high and low frequencies: the low frequencies lie in the upper left corner of the DCT coefficient matrix and the high frequencies in the lower right corner. This facilitates the subsequent scanning and coding, which compress the high-frequency information of the image data while retaining the low-frequency information.
  • Determining the second information that divides the enhancement layer image data by frequency may include: dividing the enhancement layer image data into slices, further dividing each slice into blocks of 8x8 pixels, and performing a DCT on each block to obtain the DCT transform coefficients of each block.
  • The DCT can also be computed as a 1D-DCT (one-dimensional DCT), whose calculation formula is $X(k) = c(k)\sum_{i=0}^{7} x(i)\cos\frac{(2i+1)k\pi}{16}$ for $k = 0, 1, \ldots, 7$, where $X(k)$ is the one-dimensional DCT transform coefficient, $x(i)$ is the image data signal of the block, $i$ indexes the corresponding pixels, and $c(k)$ is a scaling factor. Applying the 1D-DCT to each block yields the transform coefficients of each block.
  • Equivalently, the transform can be organized into matrix form, so that each block is transformed by multiplication with the DCT basis matrix.
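A minimal sketch of the 8-point 1D-DCT above, with a separable 2D variant for an 8x8 block; the orthonormal scaling factor c(k) is an assumption, as the source does not spell out the normalization:

```python
import numpy as np

def dct_1d_8(x):
    """8-point 1D DCT-II: X(k) = c(k) * sum_i x(i) * cos((2i+1)k*pi/16)."""
    x = np.asarray(x, dtype=np.float64)
    i = np.arange(8)
    X = np.empty(8)
    for k in range(8):
        c = np.sqrt(1 / 8) if k == 0 else np.sqrt(2 / 8)  # assumed scaling
        X[k] = c * np.sum(x * np.cos((2 * i + 1) * k * np.pi / 16))
    return X

def dct_2d_8x8(block):
    """Separable 2D DCT of an 8x8 block: 1D-DCT on rows, then on columns."""
    tmp = np.apply_along_axis(dct_1d_8, 1, block)
    return np.apply_along_axis(dct_1d_8, 0, tmp)
```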
  • Quantizing the second information, i.e., the coefficients after the DCT transform, can be realized by the following quantization formulas (given here in a form consistent with the symbol definitions below). For the DC component: $QF[0][0] = \mathrm{floor}\big((TF[0][0] - 2^{bitdepth+2}) / (qMat[0][0]\cdot qIdx)\big)$; for the AC components: $QF[i][j] = \mathrm{floor}\big(TF[i][j] / (qMat[i][j]\cdot qIdx)\big)$.
  • TF[i][j] is the DCT transform coefficient after the DCT transform;
  • bitdepth is the bit width of the DCT transform coefficients;
  • QF[i][j] is the quantized DCT transform coefficient;
  • qMat[i][j] is the quantization matrix, which is preset;
  • qIdx is the quantization parameter, obtained from the corresponding position of the input Qp map;
  • floor is the round-down function;
  • i, j are the coordinates within the 8*8 block.
  • As for the DC component, since its value is greater than that of the AC components, in order to reduce its excessive share, save storage resources and reduce the subsequent amount of calculation, it is offset by $2^{bitdepth+2}$, i.e., 16384: $TF[0][0] - 2^{bitdepth+2}$ is used to prevent the DC component from being too large.
  • Quantization can be performed block by block in units of 8*8.
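A sketch of the quantization step under the formulas above; the exact rounding behavior and the qMat values are assumptions, and qIdx is looked up from the Qp map:

```python
import numpy as np

def quantize_block(TF, qMat, qIdx, bitdepth=12):
    """Quantize one 8x8 block of DCT coefficients TF.

    DC: QF[0][0] = floor((TF[0][0] - 2**(bitdepth+2)) / (qMat[0][0] * qIdx))
    AC: QF[i][j] = floor(TF[i][j] / (qMat[i][j] * qIdx))
    With bitdepth = 12, the DC offset 2**(bitdepth+2) equals 16384.
    """
    QF = np.floor(TF / (qMat * qIdx)).astype(np.int32)
    QF[0, 0] = int(np.floor((TF[0, 0] - 2 ** (bitdepth + 2))
                            / (qMat[0, 0] * qIdx)))
    return QF
```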
  • The relationship between block, MB, QB and slice is shown in FIG. 4. Each small block 401 represents an 8*8 block, the basic unit of the DCT transform and quantization; every 4 blocks constitute an MB, with the data arrangement order of the 4 blocks in one MB shown in FIG. 4. Every 8 blocks form a QB (16*32), i.e., every 8 blocks are quantized with the same qIdx (the sub-blocks in the same QB share the same quantization parameter), and this value is indexed from the Qp map. The whole of FIG. 4, 16*16*8, is the size of one slice.
  • The slice at the end of a row may become smaller, i.e., it may contain fewer QBs, and requires special consideration. Such a slice still uses a single qIdx, but when the slice lies on a texture boundary of the image data, using a small qIdx (such as 2) for the whole slice wastes code stream, while using a large qIdx (such as 8) for the whole slice degrades quality. Such slices can therefore be further divided into smaller units, such as QBs, and the number of highlight and/or dark pixels in each QB can be counted again according to the method above; each QB is then allocated its own qIdx, i.e., one QB shares one quantization parameter, and each block is quantized accordingly to complete the quantization.
  • Alternatively, the enhancement layer image data can be divided into QBs from the start, each QB containing multiple blocks; the blocks within each QB are DCT-transformed, and each block is quantized using the quantization parameter shared by its QB. The QB can also be further divided into smaller image data area units, with the above Qp map and DCT steps repeated to achieve the final quantization, which is not repeated here.
  • The same quantization parameter may also be shared at the slice level. That is, according to the foregoing, a quantization parameter can be determined and applied per slice, with QB-level sharing of a quantization parameter used only at image data texture boundaries. This is not repeated here.
  • For the quantization sub-module, the input is the DCT transform coefficients and the Qp map, and the output is the quantized coefficients. The quantization result, i.e., the quantized coefficients, can then be scanned and encoded to generate a code stream.
  • Encoding the enhancement layer image data according to the quantization result includes: scanning the quantization result and encoding the scanned quantization result; obtaining the enhancement layer image code stream includes: generating the enhancement layer image code stream based on the encoding result.
  • The scanning may be performed as follows: scanning the quantization results includes: scanning the quantization results in each sub-block according to a preset scanning mode to obtain first-scan results; then, for each slice, scanning the first-scan results of the sub-blocks in sequence to obtain second-scan results, which serve as the final scanned quantization results.
  • the preset scanning mode may be a progressive scanning mode.
  • The scanning is still based on the sub-blocks within a slice; the enhancement layer image data is therefore divided into groups of macroblocks, the macroblocks are divided into sub-blocks, and slice image data is generated by combining the groups of macroblocks.
  • For example, the encoder in the mobile phone scans each block separately by the progressive scanning method to obtain the scan result, i.e., the reordered block. In FIG. 5, the left picture shows the block before scanning and the right picture shows the block after progressive scanning, with the arrows indicating the scanning order.
  • All 8x8 blocks in the slice are then scanned together: the frequency component at the same position in each block is scanned first, followed by the next frequency component in each block, with the scanning order running from low frequency to high frequency, as indicated by the arrows in FIG. 6. That is, the scanning is performed in order of frequency, from low to high: the small blocks (pixels) holding the lowest frequency component of every block are scanned first, then the small blocks at the next frequency position are scanned in sequence, and so on until the end.
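A sketch of this two-level scan: a progressive scan inside each block, then a frequency-major scan across all blocks of a slice. Treating the "preset scanning mode" inside a block as a simple raster order is an assumption:

```python
import numpy as np

def scan_slice(blocks):
    """blocks: array of shape (n_blocks, 8, 8) of quantized coefficients.

    First each block is read in a progressive (raster) order; then, across
    the slice, the same frequency position of every block is emitted before
    moving to the next position, so output runs from low to high frequency.
    """
    n = blocks.shape[0]
    flat = blocks.reshape(n, 64)   # per-block progressive scan
    return flat.T.reshape(-1)      # frequency-major order across blocks
```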
  • For the scanning sub-module, the input is the quantized coefficients and the output is the quantized coefficients reordered by the scan. The reordered quantized coefficients are then encoded; for example, the encoder in the mobile phone entropy-encodes the quantized coefficients reordered after scanning, as described above.
  • The entropy encoding process in the embodiment of the present application adopts Rice-Golomb mixed entropy coding. The input of the entropy coding sub-module is the quantized coefficients reordered after scanning, and the output is the entropy-coded binary sequence.
  • The above method of encoding the enhancement layer image data of the region of interest, shown in FIG. 2, is only an example; other suitable methods are also applicable to this application, such as general video coding methods like H264, H265 and H266.
  • In step S130, the enhancement layer image code stream and the base layer image code stream are transmitted. The two code streams may be two independent code streams; alternatively, a target image code stream may be generated based on the enhancement layer image code stream and the base layer image code stream and then transmitted. The transmission of the enhancement layer and base layer image code streams may be performed at the encoding end, and the encoded code stream may be transmitted to the decoding end for decoding and display when necessary.
  • Generating a target image code stream based on the enhancement layer image code stream and the base layer image code stream includes: writing the enhancement layer image code stream into a preset field in the base layer image code stream to generate the target image code stream, where the preset field is chosen according to the coding rule. For the coding rule corresponding to JPEG, for example, it may be the position immediately following the APP0 application segment.
  • For example, the encoder in the mobile phone finds the position of the APP0 application segment in the base layer image code stream (marker bytes FF E0), reads the length of the application segment, and moves the pointer back by that length; this is the insertion position for the enhancement layer image code stream. The enhancement layer image code stream is inserted into the base layer image code stream there, and the final image code stream is output.
  • The slice header (slice header information) stores the qIdx values of the different QBs in the slice; the qIdx of each QB is stored in 1 bit, where 1 means qIdx is 8 and 0 means qIdx is 2.
  • The enhancement layer image code stream can also be encapsulated. For example, the encoder in the mobile phone encapsulates the enhancement layer image code stream using APP7 application segments (marker bytes FF E7). Since JPEG specifies that the length of an application segment cannot exceed 0xFFFF, the enhancement layer image code stream must be cut into pieces and each piece packaged separately. The above encapsulation field is only illustrative; in a specific implementation it may be another APP field.
  • The method 100 further includes: dividing the enhancement layer image code stream and encapsulating the divided code streams; writing the enhancement layer image code stream into a preset field in the base layer image code stream then includes writing the encapsulated code streams into the preset field, each encapsulated code stream carrying corresponding code stream identification information. For example, the encoder in the mobile phone divides the enhancement layer image code stream, encapsulates the divided code streams, and inserts them at the insertion position; both the enhancement layer image code stream and the divided code streams carry corresponding code stream identification information. A sketch of this insertion follows below.
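A sketch of splitting the enhancement layer stream and inserting it after the APP0 segment; the chunk size bound and the bare-bones segment layout (marker plus 2-byte length) are illustrative assumptions, and any per-segment identification header described above is omitted for brevity:

```python
def insert_enhancement_stream(jpeg, enh, max_payload=0xFFFF - 4):
    """Insert enhancement layer bytes into a JPEG stream as APP7 segments.

    Finds the APP0 marker (FF E0), skips its length field, and inserts one
    or more APP7 (FF E7) segments there, since a JPEG application segment
    length cannot exceed 0xFFFF.
    """
    pos = jpeg.find(b"\xff\xe0")
    assert pos >= 0, "no APP0 segment found"
    seg_len = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
    insert_at = pos + 2 + seg_len                 # byte right after APP0

    out = bytearray(jpeg[:insert_at])
    for i in range(0, len(enh), max_payload):
        chunk = enh[i:i + max_payload]
        out += b"\xff\xe7"                        # APP7 marker
        out += (len(chunk) + 2).to_bytes(2, "big")  # length includes itself
        out += chunk
    out += jpeg[insert_at:]
    return bytes(out)
```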
  • Each segment needs certain header information as a mark; the content of the header information is shown in FIG. 8. This header information describes the overall information of the enhancement layer image code stream and is recorded only once. FIG. 9 shows the header information of each segment of the cut enhancement layer image code stream, which every segment carries.
  • The method of the present application further includes: transmitting the position information and size information of the region of interest, which can be carried in the metadata of the enhancement layer. The position information and size information are transmitted to the decoding end, which can decide accordingly whether to fuse the base layer and the enhancement layer to generate the enhanced dynamic-range image (i.e., the target image).
  • In this way, the base layer image code stream is obtained by encoding the image data to be processed according to the first preset encoding rule, and the enhancement layer image data of the region of interest in the image data to be processed is encoded with the second preset encoding rule. The image data to be processed can be HDR image data, obtained by performing nonlinear image transformation on the original image data, so the method according to the present application ultimately encodes the HDR image data corresponding to the region of interest. The base layer image data can belong to a mainstream image format, such as JPEG, so that according to different requirements either the mainstream-format image data or the HDR image data can be decoded; the decoded base layer image data and enhancement layer image data are fused to obtain a target image, for example an HDR image, which greatly improves the disseminability of HDR images. If the decoding end decodes the base layer image code stream and the enhancement layer image code stream and fuses the decoded base layer and enhancement layer image data, a high-dynamic-range target image close to the image to be processed is obtained. Since the bit depth of the enhancement layer image data is greater than that of the base layer image data corresponding to the base layer image code stream, the precision of the image is retained, giving the user greater post-editing latitude and providing more creative space and freedom, which is more user-friendly.
  • FIG. 10 is a schematic flowchart of a data processing method provided by an embodiment of the present invention.
  • The data processing method 1000 provided in this embodiment of the present application may be executed by a data processing apparatus that has a decoder; the decoder may be provided in the data processing apparatus, which may be a camera, an unmanned aerial vehicle, a handheld gimbal, a mobile phone, a computer, etc., i.e., any data processing apparatus, including one with camera or photographing functions.
  • the method 1000 includes the following steps:
  • In step S1001, the base layer image code stream and the enhancement layer image code stream are acquired; for example, they are obtained as transmitted by the data processing apparatus at the encoding end, or they are obtained from a target code stream carrying both the base layer image code stream and the enhancement layer image code stream.
  • In step S1002, the base layer image code stream is decoded according to the first preset coding rule to obtain the base layer image data of the image. The first preset decoding rule corresponds to the first preset coding rule, i.e., the first preset coding rule set above; for example, if the coding rule is determined according to the JPEG encoding rule, the first preset decoding rule may correspondingly be determined according to the JPEG decoding rule.
  • For example, the decoding end receives the target image code stream sent by the encoding end. The decoder at the decoding end extracts the JPEG code stream from the code stream and decodes it according to the decoding rule determined by the JPEG decoding rule to obtain the base layer image data for display. Although the target image code stream may take the form of a JPEG code stream, it also includes the enhancement layer image data code stream in addition to the JPEG code stream.
  • The base layer image data is used to generate the base layer image; the decoder at the decoding end can directly generate a base layer image in JPEG format from this data, which can be displayed on the display screen of the decoding end.
  • In step S1003, the enhancement layer image code stream is decoded according to the second preset coding rule to obtain the enhancement layer image data of the region of interest in the image, where the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data. The enhancement layer image data is combined with the base layer image data to generate a target image whose dynamic range is larger than that of the base layer image.
  • The second preset decoding rule corresponds to the second preset encoding rule set above and may be the inverse process of the second preset encoding rule.
  • The decoder at the decoding end, such as a computer or a mobile phone, can directly obtain the enhancement layer image code stream of the region of interest, or obtain the enhancement layer image data code stream from the preset bytes in the target image code stream. If the code stream was divided, the divided code streams may be spliced into a complete enhancement layer image code stream. The decoder may then perform decoding according to the inverse process of the aforementioned enhancement layer encoding rule (i.e., the second preset encoding rule) to obtain the enhancement layer image data of the region of interest.
  • In step S1004, the base layer image data and the enhancement layer image data are fused to obtain a target image. For example, the enhancement layer image data of the region of interest and the base layer image data are fused to obtain an HDR image, and the HDR image is displayed.
  • The method further includes: acquiring the position information and size information of the region of interest. The decoder at the decoding end can determine, according to the position information and size information, whether to fuse the base layer and the enhancement layer to generate the enhanced dynamic-range image.
  • Fusing the base layer image data and the enhancement layer image data to obtain a target image includes: obtaining labels of pixel units and/or pixel regions in the image, the labels including a first label and a second label, where the first label indicates that the corresponding pixel unit and/or pixel region has enhancement layer image data, and the absence of the first label, or the presence of the second label, indicates that the corresponding pixel unit and/or pixel region has no enhancement layer data. The enhancement layer image data corresponding to the pixel units and/or pixel regions marked with the first label is fused with the base layer image data to generate the target image, for example an HDR image, which is then displayed.
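A sketch of fusion over a rectangular region of interest; it assumes the enhancement layer stores a residual that is added back to the bit-depth-aligned base layer, with the ROI rectangle, the bit-depth gap, and the 12-bit target all being illustrative assumptions, since the actual reconstruction depends on how the residual was formed at the encoder:

```python
import numpy as np

def fuse(base8, enh_residual, roi, shift=4):
    """Fuse an 8-bit base layer with enhancement residuals inside the ROI.

    base8:        8-bit base layer image (H x W).
    enh_residual: residual values for the ROI (same shape as the ROI window).
    roi:          (top, left, height, width) from the transmitted metadata.
    shift:        bit-depth gap between layers (8-bit base vs 12-bit target).
    Pixels without enhancement data keep the up-scaled base layer value.
    """
    target = base8.astype(np.int32) << shift     # align dynamic ranges
    t, l, h, w = roi
    target[t:t + h, l:l + w] += enh_residual     # apply residual in the ROI
    return np.clip(target, 0, (1 << 12) - 1).astype(np.uint16)
```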
  • The steps of the data processing method 1000 are not limited to the order in which they are described and may be executed concurrently; the method can be executed by multiple processing elements of the data processing apparatus operating in parallel.
  • Since the data processing method of this embodiment of the present application uses the enhancement layer image code stream and base layer image code stream obtained by the aforementioned data processing method, it can obtain a target image with a higher dynamic range; moreover, because the enhancement layer image code stream only contains enhancement layer image data corresponding to the region of interest, the computational complexity of the fusion process is smaller.
  • FIG. 11 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention.
  • The data processing device can be any electronic device with an encoder, for example: a camera, a movable platform (such as an aircraft, a vehicle, a ship or a robot), a handheld gimbal, a mobile phone, a tablet computer, a computer, or any other device with an image data processing function; the data processing device may also have imaging and/or photographing functions.
  • the data processing device 1100 includes one or more processors 1101 , one or more memories 1102 , an image acquisition device (not shown), a communication interface, a display unit, and the like. These components are interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the memory 1102 is used for storing executable instructions, for example, for storing corresponding steps and program instructions for implementing the data processing method 100 according to the embodiment of the present application.
  • The memory 1102 can store an operating system (including programs for handling basic system services and performing hardware-related tasks) and the application programs required for at least one function (such as a camera control function, a sound playback function or an image playback function). Some of these applications can be downloaded from external servers via wireless communication; other applications may be installed in the memory 1102 at the time of manufacture or at the factory. The memory may also include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, a read only memory (ROM), a hard disk, a flash memory (eg, an SD card), and the like.
  • the communication interface may be used for communication between the data processing apparatus 1100 and other external devices such as the data processing apparatus at the decoding end, including wired or wireless communication.
  • the communication interface may be access to a wireless network based on a communication standard, such as WiFi (eg WiFi6), software-defined radio (SDR for short), 2G, 3G, 4G, 5G or a combination thereof.
  • the communication interface receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • The communication interface also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • The data processing device 1100 further includes an input device (not shown) used by the user to input various operations; it may include one or more of function keys (such as touch keys, or buttons such as volume control keys and switch keys), mechanical keys, soft keys, joysticks, microphones, touchscreens, etc., or other user input devices enabling a user to enter information. Data (e.g., audio, video, images) is obtained through the input device.
  • The processor 1101 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another form of processing unit with data processing capabilities and/or instruction execution capabilities, and can control other components in the apparatus to perform desired functions.
  • the processor 1101 can execute the instructions stored in the memory 1102 to execute the corresponding steps of the data processing method 100 described herein. For details, reference may be made to the foregoing method description.
  • processor 1101 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), ISPs, encoders, decoders or their combination.
  • One or more processors 1101 are configured to execute the program instructions stored in the memory, so that the processor performs the following steps: encoding the image data to be processed according to a first preset encoding rule to obtain a base layer image code stream (optionally, the first preset encoding rule is an encoding rule determined according to the JPEG encoding rule); encoding the enhancement layer image data of the region of interest in the to-be-processed image data with a second preset encoding rule to obtain an enhancement layer image code stream, where the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream.
  • the dynamic range of the image data to be processed is higher than the dynamic range of the base layer image data.
  • The processor 1101 is further configured to: decode the base layer image code stream to obtain the base layer image data; determine the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data; and determine the region of interest in the to-be-processed image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data.
  • the image data to be processed is obtained by performing at least one nonlinear image transformation process on the original image data; wherein, the bit depth of the image data before the transformation is higher than or equal to the bit depth of the transformed image data.
  • The processor 1101 is further configured to determine the region of interest in the image based on the first pixel information of the image data to be processed and/or the second pixel information of the enhancement layer image data, including: acquiring the first pixel information corresponding to the pixel units in each pixel region of the image data to be processed, and/or the second pixel information corresponding to the pixel units in each pixel region of the enhancement layer image data of the image data to be processed; and, when the first pixel information of at least one pixel unit in a first pixel region satisfies a first preset threshold condition and/or its second pixel information satisfies a second preset threshold condition, determining that the first pixel region belongs to the region of interest.
  • The first pixel information includes luminance information, a first chrominance component and a second chrominance component, and the first preset threshold condition includes: the luminance information is greater than a first luminance threshold; or the luminance information is less than a second luminance threshold, where the first luminance threshold is greater than the second luminance threshold; or the first chrominance component is greater than a first chrominance threshold; or the first chrominance component is smaller than a second chrominance threshold, where the first chrominance threshold is greater than the second chrominance threshold; or the second chrominance component is greater than a third chrominance threshold; or the second chrominance component is less than a fourth chrominance threshold, where the third chrominance threshold is greater than the fourth chrominance threshold.
  • The second pixel information includes a luminance residual, a first chrominance component residual and a second chrominance component residual, and the second preset threshold condition includes: the absolute value of the luminance residual is greater than a luminance residual threshold; or the absolute value of the first chrominance component residual is greater than a first chrominance component residual threshold; or the absolute value of the second chrominance component residual is greater than a second chrominance component residual threshold.
  • the processor 1101 is further configured to transmit the location information and size information of the region of interest.
  • The processor 1101 is further configured to: set a label for the pixel units and/or pixel regions in the image to be processed, where a pixel unit and/or pixel region determined to belong to the region of interest is set to the first label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data; the absence of the first label, or a label being the second label, indicates that the corresponding pixel unit and/or pixel region has no enhancement layer image data.
  • The processor 1101 is further configured to determine the region of interest in the image to be processed based on the first pixel information of the image data to be processed and/or the second pixel information of the enhancement layer image data, further including: surrounding at least a part of the pixel units set to the first label with at least one frame, the proportion of pixel units surrounded by the at least one frame among all pixel units set to the first label being not less than a threshold ratio; and/or surrounding at least a part of the pixel regions set to the first label with at least one frame, the proportion of pixel regions surrounded by the at least one frame among all pixel regions set to the first label being not less than the threshold ratio; the area enclosed by the frame is the region of interest.
  • the threshold ratio is greater than or equal to 80%.
  • The processor 1101 is further configured to transmit the base layer image code stream and the enhancement layer image code stream, further including: generating a target image code stream based on the enhancement layer image code stream and the base layer image code stream, and transmitting the target image code stream.
  • The processor 1101 is further configured to encode the enhancement layer image data of the region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, including: determining, according to the brightness information of the pixel units and/or pixel regions in the region of interest, the brightness states of the pixel units and/or pixel regions in the region of interest, the brightness states including a first brightness state and a second brightness state, where the brightness corresponding to the first brightness state is higher than the brightness corresponding to the second brightness state; compressing the enhancement layer image data corresponding to the pixel units and/or pixel regions in the first brightness state so as to compress it into a predetermined input range; or amplifying the enhancement layer image data corresponding to the pixel units and/or pixel regions in the second brightness state so as to amplify it into the predetermined input range.
  • The processor 1101 is further configured to encode the enhancement layer image data of the region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, including: determining, according to the brightness information of the pixel units and/or pixel regions in the region of interest, the brightness states of the pixel units and/or pixel regions in the region of interest, the brightness states including a first brightness state and a second brightness state, where the brightness corresponding to the first brightness state is higher than that corresponding to the second brightness state; when the proportion of pixel units and/or pixel regions in the first brightness state within the region of interest is greater than a first threshold ratio, compressing the enhancement layer image data corresponding to the region of interest to within a predetermined input range; and when the proportion of pixel units and/or pixel regions in the second brightness state is greater than a second threshold ratio, amplifying the enhancement layer image data corresponding to the region of interest to within the predetermined input range.
  • The processor 1101 is further configured to compress the enhancement layer image data corresponding to the region of interest so as to be within a predetermined input range, including: multiplying the enhancement layer image data corresponding to the region of interest by a compression factor so as to compress it into the predetermined input range, where the compression factor is not less than 0 and not greater than 1 and, further, may be not less than 1/256 and not greater than 1; or performing nonlinear compression on the enhancement layer image data corresponding to the region of interest based on a first preset curve so as to compress it into the predetermined input range.
  • The processor 1101 is further configured to amplify the enhancement layer image data corresponding to the region of interest so as to be within a predetermined input range, including: adding a compensation offset to the enhancement layer image data corresponding to the region of interest so as to amplify it into the predetermined input range; or performing nonlinear amplification on the enhancement layer image data corresponding to the region of interest based on a second preset curve so as to amplify it into the predetermined input range.
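A sketch of mapping enhancement data into a predetermined input range; the range bounds, the compression factor, and the offset value are illustrative assumptions, and the nonlinear curve variants described above are omitted:

```python
import numpy as np

def to_input_range(enh, bright, lo=0, hi=1023, factor=0.5, offset=256):
    """Map enhancement layer data into the predetermined range [lo, hi].

    bright=True:  data from a bright ROI is compressed by a factor in (0, 1]
                  (per the text above, not less than 1/256).
    bright=False: data from a dark ROI is shifted up by a compensation offset.
    """
    enh = np.asarray(enh, dtype=np.float32)
    mapped = enh * factor if bright else enh + offset
    return np.clip(mapped, lo, hi)
```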
  • The processor 1101 is further configured to determine the enhancement layer image data of the image to be processed based on the image data to be processed and the base layer image data, including: calculating a residual based on the image data to be processed and the base layer image data, and determining the enhancement layer image data of the image to be processed based on the residual. Calculating the residual may include: subtracting the base layer image data from the image data to be processed to obtain the residual.
  • The processor 1101 is further configured to calculate the residual based on the to-be-processed image data and the base layer image data, including: aligning the dynamic range represented by the to-be-processed image data with the dynamic range represented by the base layer image data; compensating the aligned to-be-processed image data according to the ratio of the dynamic range of the to-be-processed image data to the dynamic range of the base layer image data; and subtracting the base layer image data from the compensated to-be-processed image data to obtain the residual.
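A sketch of the aligned residual computation; performing the alignment by scaling the base layer up to the 12-bit range (rather than scaling the to-be-processed data down and compensating it) is an assumption about how the two dynamic ranges are matched:

```python
import numpy as np

def residual(to_process12, base8):
    """Residual between 12-bit to-be-processed data and an 8-bit base layer.

    The base layer is first aligned to the 12-bit range using the
    dynamic-range ratio, then the residual is the difference between the
    to-be-processed data and the aligned base layer.
    """
    ratio = (1 << 12) // (1 << 8)                 # dynamic-range ratio = 16
    aligned_base = base8.astype(np.int32) * ratio
    return to_process12.astype(np.int32) - aligned_base
```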
  • The processor 1101 is further configured to determine the enhancement layer image data based on the residual, including: obtaining data to be retained based on the residual, and selecting retained data from the data to be retained as the enhancement layer image data. Selecting the retained data and using it as the enhancement layer image data may include: adjusting the retained data so as to obtain enhancement layer image data with positive integer values.
  • The processor 1101 is further configured to encode the enhancement layer image data corresponding to the region of interest with a second preset encoding rule to obtain an enhancement layer image code stream, including: encoding the enhancement layer image data corresponding to the region of interest based on the base layer image data.
  • The data processing apparatus further comprises a display unit (not shown), which may be used to display the images or video generated by the processor.
  • the display unit may include, for example, a liquid crystal display (LCD) device or an organic light emitting diode (OLED) device.
  • The display unit may display various images, such as menus, selected operating parameters, a capture mode selection interface, images captured by an image sensor or received by the processor, and/or other information received from a user interface of the device.
  • The data processing device further includes an input device used by the user to input various operations, such as one or more of a keyboard, a mouse, function keys (such as volume control keys and switch keys), a joystick, a microphone, a touch screen, etc.
  • Since the data processing apparatus in this embodiment of the present application can execute the foregoing data processing method 100, it has the same advantages as the foregoing method 100.
  • FIG. 12 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention.
  • The data processing device can be any electronic device with a decoder, for example: a camera, a movable platform (such as an aircraft, a car, a ship or a robot), a handheld gimbal, a mobile phone, a tablet computer, a computer, or any other device with an image processing function; the data processing device may also have camera and/or photographing functions.
  • the data processing device 1200 includes one or more processors 1201, one or more memories 1202, an image acquisition device (not shown), a communication interface, a display unit, and the like. These components are interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the memory 1202 is used for storing executable instructions, for example, for storing corresponding steps and program instructions for implementing the data processing method 1000 according to the embodiment of the present application.
  • The memory 1202 can store an operating system (including programs for handling basic system services and performing hardware-related tasks) and the application programs required for at least one function (such as a control function of a photographing device, a sound playback function or an image playback function). Some of these applications can be downloaded from external servers via wireless communication; other applications may be installed in the memory 1202 at the time of manufacture or at the factory. The memory may also include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, a read only memory (ROM), a hard disk, a flash memory (eg, an SD card), and the like.
  • the communication interface can be used for communication between the data processing apparatus 1200 and other external devices, such as the data processing apparatus at the encoding end, etc., including wired or wireless communication.
  • the communication interface may be access to a wireless network based on a communication standard, such as WiFi (eg WiFi6), software-defined radio (SDR for short), 2G, 3G, 4G, 5G or a combination thereof.
  • the communication interface receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • The communication interface also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • The data processing device 1200 further includes an input device (not shown) used by the user to input various operations; it may include one or more of function keys (such as touch keys, or buttons such as volume control keys and switch keys), mechanical keys, soft keys, joysticks, microphones, touchscreens, etc., or other user input devices enabling a user to enter information. Data (e.g., audio, video, images) is obtained through the input device.
  • The processor 1201 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another form of processing unit with data processing capabilities and/or instruction execution capabilities, and can control other components in the apparatus to perform desired functions.
  • The processor 1201 can execute the instructions stored in the memory 1202 to execute the corresponding steps of the data processing method 1000 described herein.
  • processor 1201 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), ISPs, encoders, decoders or their combination.
  • One or more processors 1201 are configured to execute the program instructions stored in the memory, so that the processor performs the following steps: acquiring a base layer image code stream and an enhancement layer image code stream; decoding the base layer image code stream with a first preset encoding rule to obtain the base layer image data of the image; decoding the enhancement layer image code stream with a second preset encoding rule to obtain the enhancement layer image data of the region of interest in the image, where the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.
  • the processor 1201 is further configured to acquire position information and size information of the region of interest.
  • The processor 1201 is configured to fuse the base layer image data and the enhancement layer image data to obtain a target image, including: obtaining labels of pixel units and/or pixel regions in the image, the labels including a first label and a second label, where the first label indicates that the corresponding pixel unit and/or pixel region has enhancement layer image data, and the absence of the first label, or the presence of the second label, indicates that the corresponding pixel unit and/or pixel region has no enhancement layer data; and fusing the enhancement layer image data corresponding to the pixel units and/or pixel regions marked with the first label with the base layer image data to generate the target image.
  • The data processing apparatus further comprises a display unit (not shown), which may be used to display the images or video generated by the processor.
  • the display unit may include, for example, a liquid crystal display (LCD) device or an organic light emitting diode (OLED) device.
  • The display unit may display various images, such as menus, selected operating parameters, a capture mode selection interface, images captured by an image sensor or received by the processor, and/or other information received from a user interface of the device.
  • The data processing device further includes an input device used by the user to input various operations, such as one or more of a keyboard, a mouse, function keys (such as volume control keys and switch keys), a joystick, a microphone, a touch screen, etc.
  • Since the data processing apparatus in this embodiment of the present application can execute the foregoing data processing method 1000, it has the same advantages as the foregoing method 1000.
  • FIG. 13 is a schematic block diagram of a data processing system provided by an embodiment of the present invention.
  • The data processing system 1300 includes the aforementioned data processing apparatus 1100 and data processing apparatus 1200, where the data processing apparatus 1100 and the data processing apparatus 1200 may be the same apparatus or device, or two independent apparatuses or devices. For example, the data processing apparatus 1100 and the data processing apparatus 1200 may be the same electronic device, such as a camera, a mobile phone, a computer or a tablet computer, which can both encode and decode when needed.
  • Alternatively, the data processing apparatus 1100 and the data processing apparatus 1200 are two communicatively connected apparatuses or devices.
  • For example, the data processing apparatus 1100 can collect image data, encode the image data according to the data processing method 100 above, and output the base layer image code stream and the enhancement layer image code stream; the data processing apparatus 1200 can receive the base layer image code stream and the enhancement layer image code stream and, after decoding, fuse them to generate a target image.
  • For example, the data processing apparatus 1100 may be a drone with a camera, and the data processing apparatus 1200 may be a control terminal communicatively connected to the drone, such as a mobile phone, a tablet computer or a smart wearable device.
  • Since the data processing system can perform both encoding and decoding, it also has the advantages of the data processing methods described above.
  • In addition, an embodiment of the present invention further provides a computer storage medium on which a computer program is stored. One or more computer program instructions may be stored on the computer-readable storage medium, and a processor may execute the stored program instructions to implement the functions of the embodiments of the present invention described herein and/or other desired functions, for example, to execute the corresponding steps of the data processing method 100 and/or the data processing method 1000 according to the embodiments of the present application. The storage medium may also store various application programs and various data, such as data used and/or generated by the applications.
  • The computer storage medium may include, for example, a memory card of a smartphone, a storage unit of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, or any combination of the above storage media.
  • the computer-readable storage medium can be any combination of one or more computer-readable storage media.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical functional division; in actual implementation there may be other ways of division, for example multiple units or components may be combined or integrated into another device, and some features may be omitted or not implemented.
  • Various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules according to the embodiments of the present invention.
  • The present invention may also be implemented as apparatus programs (e.g., computer programs and computer program products) for performing part or all of the methods described herein.
  • Such a program implementing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such signals may be downloaded from Internet sites, or provided on carrier signals, or in any other form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A data processing method, apparatus and system, and a computer storage medium. The data processing method (100) includes: encoding a to-be-processed image with a first preset encoding rule to obtain a base layer image code stream (S110); encoding the enhancement layer image data of a region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream (S120), wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream (S130). Based on the data processing method (100), adding enhancement layer image data over the full image range is avoided, which effectively reduces the amount of enhancement layer image data; at the same time, when the enhancement layer is applied to the image, computation is only needed within the region of interest, effectively reducing the amount of computation.

Description

Data processing method, apparatus and system, and computer storage medium

Specification

Technical Field

The present invention generally relates to the technical field of data processing, and more particularly to a data processing method, apparatus and system, and a computer storage medium.
Background Art

As the hardware performance of image sensors gradually increases, the dynamic range of the real scenes that cameras can capture is also growing. However, the mainstream image compression formats most widely used on the Internet today, such as JPEG (an international lossy compression standard for still images proposed by the Joint Photographic Experts Group), can only retain part of the dynamic range of the captured scene due to the limitation of data bit depth.

In recent years, to accommodate high dynamic range (HDR) images, image compression techniques supporting high bit depths have been continually proposed, but they are not compatible with some mainstream image formats.
Summary of the Invention

The present invention is proposed to solve at least one of the above problems. Specifically, a first aspect of the present invention provides a data processing method, the method including: encoding a to-be-processed image with a first preset encoding rule to obtain a base layer image code stream; encoding the enhancement layer image data of a region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream.

A second aspect of the present invention provides a data processing method, the method including: acquiring a base layer image code stream and an enhancement layer image code stream; decoding the base layer image code stream with a first preset encoding rule to obtain base layer image data of an image; decoding the enhancement layer image code stream with a second preset encoding rule to obtain enhancement layer image data of a region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.

A third aspect of the present invention provides a data processing apparatus, including: a memory for storing executable program instructions; and one or more processors for executing the program instructions stored in the memory, so that the processor performs: encoding a to-be-processed image with a first preset encoding rule to obtain a base layer image code stream; encoding the enhancement layer image data of a region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream.

A fourth aspect of the present invention provides a data processing apparatus, including: a memory for storing executable program instructions; and one or more processors for executing the program instructions stored in the memory, so that the processor performs: acquiring a base layer image code stream and an enhancement layer image code stream; decoding the base layer image code stream with a first preset encoding rule to obtain base layer image data of an image; decoding the enhancement layer image code stream with a second preset encoding rule to obtain enhancement layer image data of a region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.

The present application further provides a data processing system, including: the data processing apparatus of the aforementioned third aspect; and the data processing apparatus of the aforementioned fourth aspect.

In addition, the present application provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the data processing method of the aforementioned first aspect, or implements the data processing method of the aforementioned second aspect.

According to the data processing method of the present application, the to-be-processed image data is encoded with the first preset encoding rule to obtain the base layer image code stream, the enhancement layer image data of the region of interest in the to-be-processed image data is encoded with the second preset encoding rule to obtain the enhancement layer image code stream, and the base layer image code stream and the enhancement layer image code stream are transmitted, thereby completing the encoding of the to-be-processed image data while remaining compatible with both the enhancement layer image data code stream and the base layer image data code stream. Moreover, since only the enhancement layer image data of the region of interest is encoded, adding enhancement layer image data over the full image range is avoided, which effectively reduces the amount of enhancement layer image data; and when the enhancement layer is applied to the image, computation is only needed within the region of interest, effectively reducing the amount of computation.

In addition, since the to-be-processed image data can be HDR image data, obtained by performing nonlinear image transformation on original image data, the method of the present application can ultimately encode the HDR image data corresponding to the region of interest. The base layer image data can belong to a mainstream image format, such as JPEG, so that according to different requirements either the mainstream-format image data or the HDR image data can be decoded; by fusing the decoded base layer image data and enhancement layer image data, a target image, for example an HDR image, is obtained, which greatly improves the disseminability of HDR images.

Furthermore, if the decoding end can decode the base layer image code stream and the enhancement layer image code stream and fuse the decoded base layer and enhancement layer image data, a target image with a high dynamic range close to that of the to-be-processed image can be obtained. Since the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream, the precision of the image is retained, giving the user greater post-editing latitude and providing more creative space and freedom, which is more user-friendly.

Correspondingly, the apparatuses, system and storage medium additionally provided by the embodiments of the present application based on the above data processing method can also achieve the corresponding technical effects above.
Brief Description of the Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of the data processing method provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of division into slices provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram of the division of a slice provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of scanning a block provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of scanning a slice provided by an embodiment of the present invention;

FIG. 7 is a schematic diagram of an enhancement layer image code stream provided by an embodiment of the present invention;

FIG. 8 is a schematic diagram of the header information of an enhancement layer image code stream provided by an embodiment of the present invention;

FIG. 9 is a schematic diagram of the header information of an enhancement layer image code stream provided by an embodiment of the present invention;

FIG. 10 is a schematic flowchart of a data processing method provided by an embodiment of the present invention;

FIG. 11 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention;

FIG. 12 is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention;

FIG. 13 is a schematic block diagram of a data processing system provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, example embodiments according to the present invention will be described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described here. All other embodiments obtained by a person skilled in the art without creative effort, based on the embodiments of the present invention described herein, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are given to provide a more thorough understanding of the present invention. However, it is obvious to those skilled in the art that the present invention can be implemented without one or more of these details. In other examples, some technical features well known in the art are not described in order to avoid obscuring the present invention.
It should be understood that the present invention can be implemented in different forms and should not be construed as being limited to the embodiments presented here. On the contrary, these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art.
The terminology used here is only for describing specific embodiments and is not intended as a limitation of the present invention. As used here, the singular forms "a", "an" and "the/said" are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the terms "compose" and/or "comprise", when used in this specification, determine the presence of the stated features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups. As used here, the term "and/or" includes any and all combinations of the associated listed items.
For a thorough understanding of the present invention, a detailed structure will be presented in the following description in order to explain the technical solutions proposed by the present invention. Optional embodiments of the present invention are described in detail below; however, in addition to these detailed descriptions, the present invention may also have other embodiments.
Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where there is no conflict, the following embodiments and the features in the embodiments can be combined with each other.
As the hardware performance of image sensors gradually improves, the dynamic range of the real scenes that cameras can capture keeps increasing. JPEG, as the most widely used image compression format on the Internet today, can only retain part of the dynamic range of the captured scene due to the limitation of data bit depth.
In recent years, to accommodate High Dynamic Range (HDR) images, image compression techniques supporting high bit depths have been continuously proposed, but because they are incompatible with the JPEG image format, they have not been supported by many camera manufacturers. JPEG XT is a new JPEG-compatible HDR image compression standard, but it is display-oriented, is unfriendly to users who edit images afterwards, and offers limited editing space.
Therefore, in view of the above problems, an embodiment of the present application proposes a data processing method, the method comprising: encoding a to-be-processed image with a first preset encoding rule to obtain a base layer image code stream; encoding enhancement layer image data of a region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream. The encoding of the to-be-processed image data is thereby completed while remaining compatible with both the enhancement layer image data code stream and the base layer image data code stream; and since only the enhancement layer image data of the region of interest is encoded, adding enhancement layer image data over the full image range is avoided, which effectively reduces the amount of enhancement layer image data, and when the enhancement layer is applied to the image, computation is only needed within the region of interest, which effectively reduces the amount of computation. In addition, since the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream, the precision of the image is retained, leaving the user a large post-editing space and more creative freedom, which is more user-friendly.
Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where there is no conflict between the embodiments, the following embodiments and the features in the embodiments can be combined with each other.
Fig. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present invention. The method 100 provided by this embodiment of the present application may be executed by a data processing apparatus that has an encoder; the encoder may be provided in the data processing apparatus, and the data processing apparatus may be a camera, an unmanned aerial vehicle, a handheld gimbal, a mobile phone, a tablet computer, or any other data processing apparatus with video and/or photo capture functions.
As an example, as shown in Fig. 1, the method 100 includes the following steps S110 to S130:
First, in step S110, a to-be-processed image is encoded with a first preset encoding rule to obtain a base layer image code stream.
The to-be-processed image is obtained by applying at least one nonlinear image transform to an original image, where the bit depth of the image data before the transform is higher than or equal to the bit depth of the image data after the transform. The nonlinear transform avoids the loss of detail caused by linear compression, so that the to-be-processed image data retains some detail information in highlights or shadows.
The original image is image data with a high dynamic range. The original image data includes linear image data or nonlinear image data. Linear image data may be image data collected by an image sensor. Nonlinear image data may be high dynamic range image data generated from multiple images, or image data collected by an image sensor.
The original image data may refer to original image data collected by an image acquisition device in the data processing apparatus; the original image data may be linear image data, for example linear HDR image data. The image acquisition device may be a camera, a camera module, etc.; for example, original image data may be obtained through the camera of a mobile phone, and this original image data may be 16-bit linear image data.
After the original image data is obtained, the to-be-processed image data can be obtained from it; it may be nonlinear image data, and its size may be 12 bits. It may be obtained in the following ways:
For example, as described above, the camera of the mobile phone captures an image of the external environment, the image being 16-bit linear image data. After obtaining the image data, the encoder in the mobile phone can obtain 12-bit nonlinear image data through a nonlinear image transform as intermediate layer image data, which can be used directly as the to-be-processed image data; the to-be-processed image data may be nonlinear HDR image data.
It should be noted that the to-be-processed image data can also be obtained by applying a linear image transform to the original image data; the specific implementation is similar to the above and is not repeated here. As an illustration only: for example, 16-bit original image data is transformed linearly (e.g., linearly cropped) to 15-bit image data. This compresses the data amount and facilitates subsequent image processing.
Since it is the nonlinear HDR image data that is encoded, and the HDR image is obtained from the original image through a nonlinear/linear transform, the HDR image is closer than the original image to the displayable images supported by display devices (i.e., the display device of an electronic device that may have a decoder). For example, conventional display devices usually support images of lower bit depth, such as 8-bit images; for a 16-bit original image nonlinearly transformed to a 12-bit HDR image, the data amount does not actually decrease very much. In the nonlinear transform, the detail information to which the human eye is more sensitive is retained, while the details to which the eye is less sensitive are compressed more heavily. Visually, a nonlinear 12-bit HDR image is closer to an 8-bit image, whereas a linear 16-bit original image is not. Since the 12-bit HDR image looks closer to an 8-bit image, it is friendlier for the user to view, and since the 12-bit HDR image still retains information of the high-bit-depth image, it also leaves the user room for editing. It is therefore very user-friendly both visually and for editing.
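As an illustrative sketch only (not part of the claimed embodiments), such a nonlinear transform from 16-bit linear data to 12-bit nonlinear data could take the form of a simple gamma-style companding curve; the function name and the gamma value are hypothetical, since this embodiment does not fix a particular transfer function:

```python
import numpy as np

def nonlinear_to_12bit(linear16: np.ndarray, gamma: float = 2.2) -> np.ndarray:
    # Normalize the 16-bit linear samples to [0, 1].
    x = linear16.astype(np.float64) / 65535.0
    # Gamma companding keeps more code values in the shadows the eye is
    # sensitive to and compresses the highlights more aggressively.
    y = np.power(x, 1.0 / gamma)
    # Requantize to 12-bit nonlinear HDR samples.
    return np.round(y * 4095.0).astype(np.uint16)
```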
To be visually even closer to the image presented by the display device, or to retain more information of the high-bit-depth image, the to-be-processed image data need not be obtained with only a single nonlinear transform; a further nonlinear transform can be applied on top of the previous one.
Specifically, applying a nonlinear image transform to the original image data to obtain the to-be-processed image data includes: applying at least one nonlinear image transform to the original image data to obtain the to-be-processed image data.
For example, as described above, after obtaining the above 12-bit image data, the encoder in the mobile phone can continue with a nonlinear transform to obtain 10-bit image data as the nonlinear HDR image data.
Correspondingly, applying a linear image transform to the original image data to obtain the to-be-processed image data includes: applying at least one linear image transform to the original image data to obtain the to-be-processed image data.
It is also possible to apply at least one nonlinear image transform and at least one linear image transform to the original image data to obtain the to-be-processed image data. It should be understood that the order of the nonlinear and linear image transforms is not limited.
For example, as described above, the mobile phone continues with linear compression, such as linear cropping, on the 12-bit image data to obtain 10-bit image data as the final nonlinear HDR image data. Alternatively, the encoder in the mobile phone can apply a nonlinear image transform to the 16-bit original image data to obtain 14-bit image data, then linearly crop the 14-bit image data once to obtain 12-bit image data, and linearly crop once more on that basis to obtain 10-bit image data as the final nonlinear HDR image data.
It should be noted that larger original images may be collected, such as 20-bit original image data. As the original image data grows, the number of transforms applied to it can also grow; the process is not restricted to one or two compression passes. It should be understood that the first transform applied to the original image data may be a nonlinear image transform, followed by multiple further linear image transforms, and so on.
Correspondingly, the final data amount of the to-be-processed image data can also be chosen for different display devices. If the display device supports larger displayable image data, such as 10-bit or 12-bit image data, the number of transforms and the transform targets can change accordingly, but the specific implementation steps are similar and are not repeated here.
To give the user autonomy, the above image transform process can also be driven by the user.
Specifically, applying a nonlinear image transform to the original image data to obtain the to-be-processed image data includes: determining a target of the nonlinear image transform of the original image data based on an image precision loss; and applying the nonlinear image transform to the original image based on that target to obtain the to-be-processed image data.
The image precision loss refers to the precision lost by the image during the transform, and is determined according to the final transform target of the image data. In this embodiment of the present application, the image precision loss is preset; there may be multiple presets, each corresponding to a final transform target, i.e., the final data amount of the image, e.g., the image data is finally transformed to 10-bit, 12-bit or 14-bit image data, and each transform target may also correspond to a preset transform process.
For example, as described above, the encoder in the mobile phone can display the transform targets in advance through a display device, such as the screen of the mobile phone where the encoder is located, for the user to choose from. In response to the user's selection, the encoder in the mobile phone determines the final transform target and, through the above process and the transform target, transforms the original image to obtain the to-be-processed image data as the final nonlinear HDR image data.
It should be noted that, besides the above transform target, the number of transforms can also be set, so that the transform process is determined by the number of transform passes. Similar content has been described above and is not repeated here.
In addition, the linear image transform can work likewise: applying a linear image transform to the original image data to obtain the to-be-processed image data includes: determining a target of the linear image transform of the original image data based on an image precision loss; and applying the linear image transform to the original image based on that target to obtain the to-be-processed image data. This is not repeated here.
In another embodiment, original image data is acquired, and an image transform is applied to the original image data to obtain the to-be-processed image data. The image transform may include a linear transform, a nonlinear transform, or a combination of linear and nonlinear transforms.
Base layer image data of a preset data amount can also be extracted from the to-be-processed image data or the original image data. Base layer image data refers to image data of a base bit depth, i.e., low-bit-depth image data, whose dynamic range is also the base dynamic range, e.g., 8-bit image data. The base layer image data conforms to the displayable image data of conventional display devices.
Extracting base layer image data of a preset data amount from the to-be-processed image data or the original image data includes: applying a linear image transform to the to-be-processed image data or the original image data to obtain base layer image data of the preset data amount. The preset data amount may depend on the needs of the data processing apparatus associated with or containing the decoder, e.g., 8 bits.
For example, as described above, after obtaining the above 12-bit image data, the encoder in the mobile phone can perform linear cropping to directly obtain the linearly cropped 8-bit base layer image data.
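A minimal sketch of this linear crop, assuming the base layer simply keeps the 8 most significant bits of the 12-bit data (the helper name is illustrative):

```python
import numpy as np

def base_layer_8bit(hdr12: np.ndarray) -> np.ndarray:
    # Linear crop: discard the 4 least significant bits of the 12-bit data.
    return (hdr12.astype(np.uint16) >> 4).astype(np.uint8)
```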
It should be noted that the base layer image data of the preset data amount can also be obtained by applying a linear image transform to the original image data. This is not repeated here, but it should be understood that the dynamic range of the base layer image data is smaller than the dynamic range of the to-be-processed image data.
The first preset encoding rule is an encoding rule determined according to the JPEG encoding rules; for example, the to-be-processed image data, e.g., the aforementioned 12-bit image data, is compressed according to the JPEG encoding rules to obtain the base layer image code stream. It may also be an encoding rule determined according to the encoding rules of other current mainstream image formats, such as TIFF (Tag Image File Format) or BMP (Bitmap).
Next, still as shown in Fig. 1, in step S120, the enhancement layer image data of the region of interest in the to-be-processed image is encoded with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream.
The enhancement layer image data is obtained based on the to-be-processed image and the base layer image code stream. For example, the method of the present application further includes: decoding the base layer image code stream to obtain the base layer image data; and determining the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data.
Enhancement layer image data refers to the image data between the high-bit-depth image data and the base layer image data. Its dynamic range is also a high dynamic range, higher than the dynamic range of the base layer image data.
Decoding the base layer image code stream to obtain the base layer image data includes: decoding the base layer image code stream based on a first preset decoding rule corresponding to the first preset encoding rule, and taking the decoded data as the base layer image data.
For example, as described above and as shown in Fig. 2, after obtaining the base layer image data, i.e., the un-encoded 8-bit image data, the encoder in the mobile phone JPEG-encodes it to obtain the encoded base layer image data, and can then decode that again to obtain the decoded base layer image data.
It should be noted that JPEG has become a widely used image compression standard because its software and hardware algorithms are simple to implement and it can guarantee image quality at a relatively high compression ratio. Besides JPEG, other image formats can also be used for encoding/decoding; whichever image format is chosen, when the entire to-be-processed image data is subsequently encoded, compatibility can be provided between images of that format and HDR images.
In one example, determining the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data includes: computing a residual based on the to-be-processed image data and the base layer image data (i.e., the base layer image data obtained by decoding the base layer image code stream); and determining the enhancement layer image data of the to-be-processed image based on the residual.
The residual refers to the image data obtained by removing the base layer image data from the to-be-processed image data. The residual can be used directly as the enhancement layer image data of the to-be-processed image.
In one embodiment, the residual can be computed based on the to-be-processed image data, e.g., the 12-bit image data, and the base layer image data obtained by decoding the base layer image code stream.
In another embodiment, specifically: computing the residual based on the to-be-processed image data and the base layer image data includes: subtracting the base layer image data from the to-be-processed image data to obtain the residual.
For example, as described above, after obtaining the to-be-processed image data (the 12-bit image data) and the 8-bit base layer image data, the encoder can subtract the corresponding 8-bit image data from the 12-bit image data to obtain the enhancement layer image data of the to-be-processed image.
In addition, after the to-be-processed image data has been encoded and then decoded, an electronic device that decodes the target image code stream can only obtain the decoded base layer image data from the decoded target image (e.g., corresponding to the code stream encoded according to the JPEG encoding rules) and cannot obtain the un-encoded base layer image data. Therefore, for better subsequent decoding of the target image, the decoded target image is obtained from the decoded enhancement layer image data and the decoded base layer image data (here, the base layer image compressed, i.e., encoded, in the first image format); accordingly, when encoding the to-be-processed image data, the enhancement layer image data can be determined with respect to the encoded-and-decoded base layer image data.
Furthermore, when the base layer image data is encoded-and-decoded base layer image data, the enhancement layer image data of the to-be-processed image can be obtained from the to-be-processed image data and the encoded-and-decoded base layer image data.
Specifically, this may also be: decoding the base layer image code stream based on the first preset decoding rule corresponding to the first preset encoding rule, and taking the decoded data as the base layer image data; then computing the residual based on the to-be-processed image data and that base layer image data.
For example, as described above and as shown in Fig. 2, after obtaining the base layer image data, i.e., the un-encoded 8-bit image data, the encoder in the mobile phone JPEG-encodes it to obtain the encoded base layer image data, then decodes it to obtain the decoded base layer image data, and then subtracts the corresponding decoded base layer image data from the to-be-processed image data to obtain the enhancement layer image data.
It should be noted that, as mentioned above, besides JPEG other image formats can also be used for encoding/decoding, and compatibility can then be provided between images of the chosen format and HDR images.
In addition, in some relatively flat regions of the enhancement layer image, after the subsequent encoding of the enhancement layer image data, e.g., the DCT transform and quantization, only the DC component of the enhancement layer image data remains, so that the final decoded target image, e.g., a 12-bit image, exhibits blocking artifacts. To improve the quality of the final decoded target image, random noise can be added to the base layer image data obtained above (which may be the un-encoded base layer image data or the decoded base layer image data); this is equivalent to adding random noise to the enhancement layer image data, because the enhancement layer image data is derived from the base layer image data with the random noise added. The blocking artifacts can thus be improved at the cost of a larger code stream and some noise.
Specifically, the method 100 may further include: adding random noise data to the base layer image data; and computing the residual based on the to-be-processed image data and the base layer image data includes: computing the residual based on the to-be-processed image data and the base layer image data with the random noise added.
Since this situation only occurs in particularly dark places in the image data, 0/1 random noise is added only where the gray value of the base layer image data is less than 3, and the residual is then determined. This is not repeated here.
In addition, the method 100 may further include: after the residual is computed, adding random noise data to the residual. Alternatively, the method 100 may further include: adding random noise data during the computation of the residual, so that the obtained residual carries random noise data.
To determine the enhancement layer image data more conveniently, when computing the residual, the dynamic range of the to-be-processed image data can be aligned with the dynamic range of the base layer image data (here and below, the base layer image may be the base layer image data, or the base layer image data obtained by decoding the base layer image code stream).
Specifically, the dynamic range represented by the to-be-processed image data is aligned with the dynamic range represented by the base layer image data.
For example, the alignment may be hdr/16, where hdr is the to-be-processed image data, i.e., the above nonlinear HDR image data. When hdr is 12-bit and the base layer image data is 8-bit image data, hdr/16 yields an 8-bit integer part and a 4-bit fractional part, thereby aligning with the dynamic range of the base layer image data.
When the residual is determined from the base layer image data, the base layer image data inevitably introduces a loss, and this loss can be compensated.
Specifically, the aligned to-be-processed image data is compensated by the ratio of the dynamic range of the enhancement layer image data to the dynamic range of the base layer image data, and the base layer image data is then subtracted from the compensated to-be-processed image data to obtain the residual.
The compensation may be, for example, compensating the hdr/16 data as hdr/16*drc_coef, where drc_coef is the reciprocal of the ratio of the dynamic range of the base layer image data to the dynamic range of the HDR image data (i.e., hdr). After compensation, the base layer image data is subtracted to obtain the enhancement layer image data.
From the above discussion, computing the residual based on the to-be-processed image data and the base layer image data includes: aligning the dynamic range represented by the to-be-processed image data with the dynamic range represented by the base layer image data; compensating the aligned to-be-processed image data by the ratio of the dynamic range of the enhancement layer image data to the dynamic range of the base layer image data; and subtracting the base layer image data from the compensated to-be-processed image data to obtain the residual.
As this has already been described above, it is not repeated here.
To optimize the computation, reduce the amount of computation, and improve computational efficiency and time, the data of the enhancement layer image can be selected.
In one example, determining the enhancement layer image data based on the residual includes: obtaining to-be-retained data based on the residual; and selecting retained data from the to-be-retained data as the enhancement layer image data of the to-be-processed image.
Selecting retained data refers to retaining or selecting data from the obtained residual.
The retained data may be selected, for example, as (hdr/16*drc_coef - base)*2^n, where n is the number of fractional bits to retain. From the above, hdr/16 yields 4 fractional bits, so n is at most 4 and at least 1. Here n may be chosen as 3, so that three fractional bits are retained. (hdr/16*drc_coef - base) is the residual. Random noise data can be added within this residual.
Since the subsequent image encoding operates on positive integers throughout, the fractional values also need to be converted into positive integers.
Specifically, selecting retained data from the to-be-retained data as the enhancement layer image data includes: adjusting the retained data to obtain enhancement layer image data with positive integer values.
For example, as described above, the adjustment can be made by the following formula 1) (reconstructed here from the surrounding definitions):
res = (hdr/16 * drc_coef - base) * 2^n + 2048    1)
where res is the enhancement layer image data; it can be understood that the offset 2048 adjusts the values to positive integers, and base may be the data of the pre-processed base layer image.
Alternatively, the residual may be adjusted directly to obtain a residual with positive integer values, which is used directly as the enhancement layer image data of the to-be-processed image.
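Putting the alignment, compensation, retention and positive-integer adjustment together, a minimal numpy sketch might look as follows; drc_coef and n = 3 follow the example above, and the function name is hypothetical:

```python
import numpy as np

def enhancement_residual(hdr12: np.ndarray, base8: np.ndarray,
                         drc_coef: float, n: int = 3) -> np.ndarray:
    aligned = hdr12.astype(np.float64) / 16.0      # align 12-bit range to 8-bit
    compensated = aligned * drc_coef               # compensate the base layer loss
    res = (compensated - base8.astype(np.float64)) * (2 ** n)  # keep n fraction bits
    return np.round(res).astype(np.int32) + 2048   # shift to positive integers, formula 1)
```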
Thereafter, the method further includes: encoding the enhancement layer image data of the region of interest in the to-be-processed image data with the second preset encoding rule to obtain the enhancement layer image code stream. The second preset encoding rule refers to the encoding rule for the enhancement layer image data, by which the encoding is performed. As shown in Fig. 2, the encoding process may include: partitioning into slices (slice image data), i.e., the slice partitioning in Fig. 2, determining the region of interest, the DCT (Discrete Cosine Transform), quantization, computation of the Qp map (quantization parameter map), i.e., the quantization parameters in Fig. 2, coefficient scanning, entropy encoding, and so on. The DCT coefficients obtained after the DCT transform are quantized on the basis of the Qp map, and the image data is thereby encoded. The Qp map can be determined from the base layer image data.
The encoding process of the second preset encoding rule is described below.
As shown in Fig. 2, the enhancement layer image data can first be partitioned into slices; each slice is then partitioned into 8x8-pixel block sub-blocks, and a DCT transform is applied to each block to obtain the DCT coefficients of each block.
The specific partitioning of blocks may be as follows:
The enhancement layer image data of the to-be-processed image is partitioned to obtain groups of macroblocks, and the macroblocks are partitioned into sub-blocks; for each sub-block, the second information corresponding to the sub-block is determined.
The enhancement layer image data is partitioned to obtain groups of macroblocks, the macroblocks are partitioned into sub-blocks, and slice image data is generated by combining groups of macroblocks.
For example, as described above, after obtaining the enhancement layer image data, the encoder in the mobile phone can first partition the input enhancement layer image data into slices so that subsequent encoding can proceed smoothly. First, the enhancement layer image data is partitioned into macroblocks (MB), each 16x16 pixels in size. Each MB can then be further partitioned into 4 blocks, each 8x8 pixels in size; these blocks can be the basic units of subsequent processing.
When the resolution of the enhancement layer image data is not a multiple of 16, the enhancement layer image data can first be padded until the resolution is a multiple of 16. Finally, every 8 MBs of the enhancement layer image data are grouped into one slice.
The MBs of a slice should preferably be in the same MB row. When fewer than 8 MBs remain in a row, the number of MBs in the slice may be 4; when fewer than 4 MBs remain, the number of MBs in the slice may be 2; and so on, down to a slice containing only 1 MB, as sketched after the figure description below.
Fig. 3 shows a schematic diagram of partitioning into slices, with 8 slices per row and 19 slice rows, 152 slices in total. Each slice can be encoded and decoded in parallel. Slice 301 has 8 MBs, slice 302 has 4 MBs, slice 303 has 2 MBs, and slice 304 has 1 MB. The input of the sub-module corresponding to this partitioning may be the enhancement layer image data, and the output is the partitioned image data.
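A sketch of the row-wise slice sizing rule above (8, then 4, 2, 1 MBs); the function name is hypothetical:

```python
def partition_row_into_slices(row_width_mb: int) -> list:
    """Return (start_mb, size_mb) pairs for one MB row."""
    slices, x, size = [], 0, 8
    while x < row_width_mb:
        # Halve the slice size until it fits in the remaining MBs.
        while size > 1 and row_width_mb - x < size:
            size //= 2
        slices.append((x, size))
        x += size
    return slices

# For a 19-MB row this yields slices of 8, 8, 2 and 1 MBs.
```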
Then, as shown in Fig. 2, the region of interest (ROI) in the to-be-processed image is determined. The region of interest may be a region in the to-be-processed image determined based on a user input instruction, or a region automatically determined by the data processing apparatus through intelligent analysis. For example, the region of interest may include high-luminance regions and/or low-luminance regions in the to-be-processed image.
In one example, determining the region of interest in the to-be-processed image includes: determining the region of interest in the image based on first pixel information of the to-be-processed image data and/or second pixel information of the enhancement layer image data; for example, acquiring the first pixel information corresponding to the pixel units in each pixel region of the to-be-processed image data, and/or the second pixel information corresponding to the pixel units in each pixel region of the enhancement layer image data of the to-be-processed image data; and when the first pixel information of at least one pixel unit in a first pixel region among the pixel regions satisfies a first preset threshold condition and/or the second pixel information satisfies a second preset threshold condition, determining that the first pixel region belongs to the region of interest. The first pixel region may correspond to any one of the multiple slices described above, or may be any one of multiple pixel regions partitioned according to other rules. The first preset threshold condition and the second preset threshold condition can be set reasonably according to prior experience.
For example, the first pixel information includes luminance information, a first chrominance component and a second chrominance component, and the first preset threshold condition includes: the luminance information is greater than a first luminance threshold; or the luminance information is less than a second luminance threshold, where the first luminance threshold is greater than the second luminance threshold; or the first chrominance component is greater than a first chrominance threshold; or the first chrominance component is less than a second chrominance threshold, where the first chrominance threshold is greater than the second chrominance threshold; or the second chrominance component is greater than a third chrominance threshold; or the second chrominance component is less than a fourth chrominance threshold, where the third chrominance threshold is greater than the fourth chrominance threshold. The first preset threshold condition identifies the high-luminance and low-luminance pixel units in the to-be-processed image.
As another example, the second pixel information includes a luminance residual, a first chrominance component residual and a second chrominance component residual, and the second preset threshold condition includes: the absolute value of the luminance residual is greater than a first luminance residual threshold; or the absolute value of the first chrominance component residual is greater than a first chrominance component residual threshold; or the absolute value of the second chrominance component residual is greater than a second chrominance component residual threshold. Referring to the description above, since the residual is obtained from the to-be-processed image and the base layer image, it reflects the gap between their pixel values; the larger the absolute value of the residual, the larger the difference between the two, indicating that the corresponding pixel unit in the to-be-processed image is either of high luminance or of low luminance.
A region of interest may include one or more pixel regions, and one to-be-processed image may include one or more regions of interest.
In one example, the method of the present application further includes: setting labels for pixel units and/or pixel regions in the to-be-processed image, where pixel units and/or pixel regions determined to belong to the region of interest are set with a first label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and pixel units and/or pixel regions not belonging to the region of interest are set with a second label or no label; that is, when a pixel unit and/or pixel region does not carry the first label, or its label is the second label, this indicates that the corresponding pixel unit and/or pixel region has no enhancement layer image data. By saving and encoding the enhancement layer image data corresponding to the pixel units and/or pixel regions set with the first label, more detail and dynamic range are retained in the enhancement layer image data of the region of interest.
In typical use, most regions of an image have moderate luminance and largely sufficient detail, and the enhancement layer effect is not obvious there. By indicating pixel units and/or pixel regions in a mask map with a label such as the first label, adding enhancement layer image data over the full image range can also be avoided, effectively reducing the amount of enhancement layer data; and when applying the enhancement layer to the image, computation is only needed within the designated regions, effectively reducing the amount of computation.
The first label may be "true"; or the first label may be 1 and the second label 0; or any other suitable label type may be used.
In a specific example, when the to-be-processed image data is data in YUV format, e.g., HDR image data (e.g., 12-bit YUV444), "Y" denotes luminance (Luma), i.e., the gray value, while "U" and "V" denote chrominance (Chroma), which describe the color and saturation of the image and specify the color of a pixel. The luminance information of the first pixel information includes the Y value, the first chrominance component includes the U value, and the second chrominance component includes the V value; every pixel unit in the to-be-processed image has its own Y, U, V. Let the first luminance threshold be YTH1, the second luminance threshold YTH2, the first chrominance threshold UTH1, the second chrominance threshold UTH2, the third chrominance threshold VTH1, and the fourth chrominance threshold VTH2. Denote the second pixel information of all pixel units of the enhancement layer image data as Yres, Ures, Vres, computed by the aforementioned formula 1); then the absolute value of the luminance residual is abs(Yres - 2048), the absolute value of the first chrominance component residual can be expressed as abs(Ures - 2048), and the absolute value of the second chrominance component residual as abs(Vres - 2048). Let the first luminance residual threshold be DY_TH, the first chrominance component residual threshold DU_TH, and the second chrominance component residual threshold DV_TH. A mask map (MASK) of the same size as the enhancement layer image is taken, and each pixel unit is marked as "true" only when it satisfies at least one of the following conditions: abs(Yres - 2048) > DY_TH or abs(Ures - 2048) > DU_TH or abs(Vres - 2048) > DV_TH or Y > YTH2 or U > UTH2 or V > VTH2 or Y < YTH1 or U < UTH1 or V < VTH1.
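The mask construction above can be sketched as follows; the threshold values are hypothetical inputs that would be tuned from prior experience:

```python
import numpy as np

def build_mask(Y, U, V, Yres, Ures, Vres,
               DY_TH, DU_TH, DV_TH, YTH1, YTH2, UTH1, UTH2, VTH1, VTH2):
    # Residual conditions: large deviation from the 2048 offset of formula 1).
    m = (np.abs(Yres.astype(np.int32) - 2048) > DY_TH) \
        | (np.abs(Ures.astype(np.int32) - 2048) > DU_TH) \
        | (np.abs(Vres.astype(np.int32) - 2048) > DV_TH)
    # Luminance/chrominance conditions, following the example in the text.
    m |= (Y > YTH2) | (U > UTH2) | (V > VTH2)
    m |= (Y < YTH1) | (U < UTH1) | (V < VTH1)
    return m   # boolean MASK of the same size as the enhancement layer image
```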
For the slices partitioned above, if a slice contains no "true" mark, the enhancement layer image data of that slice is not saved; at decoding, e.g., when restoring the HDR image, the enhancement layer image data is taken to be 0, i.e., no enhancement layer image data is carried. If a slice contains at least one "true" mark, the enhancement layer image data of that slice is saved.
In one example, determining the region of interest in the to-be-processed image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data further includes: enclosing at least some of the pixel units set with the first label with at least one bounding box, where the proportion of the pixel units enclosed by the at least one bounding box among all pixel units set with the first label is not less than a threshold proportion; and/or enclosing at least some of the pixel regions set with the first label with at least one bounding box, where the proportion of the pixel regions enclosed by the at least one bounding box among all pixel regions set with the first label is not less than a threshold proportion. The region enclosed by the bounding box (which includes at least some of the pixel units marked with the first label, or at least some of the pixel regions marked with the first label, and possibly also some pixel units or pixel regions that are unmarked or set with the second label) is the region of interest. The at least one bounding box may be one box or multiple boxes, one box corresponding to one region of interest. Optionally, the threshold proportion can be set reasonably according to actual needs; for example, the threshold proportion is greater than or equal to 80%, e.g., 80% or 90%. This range of threshold proportions is only an example; other suitable proportions are equally applicable to the present application.
The bounding box may have any suitable shape, such as a rectangular box, a circular box or a polygonal box. In the embodiments of the present application, the bounding box may be a rectangular box.
For example, 1 to N rectangular boxes are selected within the to-be-processed image to cover, with as small a rectangular area as possible, the main regions marked with the first label, e.g., "true", in the mask map (MASK). For example, one rectangular box is selected to cover 90% of the regions marked "true" in the mask map (MASK).
The size and position of the rectangular box, which correspond to the size and position of the region of interest, can be generated by an iterative algorithm (see the sketch after this paragraph): starting from a rectangular box covering the entire to-be-processed image, the length and width of the box are shrunk continuously and the box is moved within the to-be-processed image, until any further move or shrink can no longer cover at least the threshold proportion, e.g., 90%, of the regions marked with the first label, e.g., "true", in the mask map (MASK); the iteration then stops, and the rectangular box at this point serves as the final bounding box, i.e., the bounding box of the region of interest. The enhancement layer image data is computed and saved within this rectangular box, and the position and size of the rectangular box are recorded as the position and size of the region of interest.
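A greedy approximation of such an iterative fit (shrinking one side at a time stands in for both the shrink and the move; this is a sketch, not necessarily the exact search used in practice):

```python
import numpy as np

def fit_roi_box(mask: np.ndarray, cover: float = 0.9):
    """Shrink a rectangle greedily while it still covers >= cover of the marks."""
    need = cover * mask.sum()
    top, left, bot, right = 0, 0, mask.shape[0], mask.shape[1]
    shrunk = True
    while shrunk:
        shrunk = False
        for t, l, b, r in ((top + 1, left, bot, right),
                           (top, left + 1, bot, right),
                           (top, left, bot - 1, right),
                           (top, left, bot, right - 1)):
            if b > t and r > l and mask[t:b, l:r].sum() >= need:
                top, left, bot, right = t, l, b, r
                shrunk = True
                break
    return top, left, bot, right   # position and size of the ROI
```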
It is worth mentioning that the above steps of determining the region of interest can also be performed before the computation of the enhancement layer image data; for example, the region of interest can be determined based on the to-be-processed image data, and the series of enhancement layer computations can be performed only for the pixels corresponding to the region of interest.
Different strategies can be adopted for different ROI regions to save the image information within the ROI in a targeted manner. For example, an ROI composed mainly of high-luminance pixel units is compressed, so that the highlight information is retained with higher precision; an ROI composed mainly of low-luminance pixels undergoes luminance amplification, so that the shadow information is retained with higher precision.
In one example, encoding the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream includes: determining, according to the luminance information of the pixel units and/or pixel regions in the region of interest, the luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state including a first luminance state and a second luminance state; and compressing the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the first luminance state, to compress it into a predetermined input range; or amplifying the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the second luminance state, to amplify it into the predetermined input range.
The luminance corresponding to the first luminance state is higher than the luminance corresponding to the second luminance state; that is, the first luminance state may correspond to high luminance and the second luminance state to low luminance, where the luminance of the first luminance state is higher than a first luminance threshold and the luminance of the second luminance state is lower than a second luminance threshold, the first luminance threshold being greater than the second luminance threshold. These luminance thresholds can be set reasonably according to prior experience, and the judgment can be made based on at least one of the preset conditions used above when determining the region of interest, e.g., the first preset condition based on the first pixel information of the to-be-processed image data or the second preset condition based on the second pixel information of the enhancement layer image data. For example, when a pixel unit satisfies Y > YTH2 or U > UTH2 or V > VTH2, the pixel unit is determined to be a high-luminance pixel, i.e., in the first luminance state; when Y < YTH1 or U < UTH1 or V < VTH1, the pixel unit is determined to be a low-luminance pixel, i.e., in the second luminance state.
In another example, encoding the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream includes: determining, according to the luminance information of the pixel units and/or pixel regions in the region of interest, the luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state including a first luminance state and a second luminance state, where the luminance of the first luminance state is higher than that of the second luminance state; when the proportion of pixel units and/or pixel regions in the first luminance state within the region of interest is greater than a first threshold proportion, compressing the enhancement layer image data corresponding to the region of interest into a predetermined input range; and when the proportion of pixel units and/or pixel regions in the second luminance state within the region of interest is greater than a second threshold proportion, amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range. The first and second threshold proportions can be set reasonably according to actual needs; for example, the first threshold proportion may be 70%, 80% or 90%, and likewise the second threshold proportion may be 70%, 80% or 90%. These threshold proportions allow the dominant luminance state of the pixel units within the region of interest to be judged, so that different measures are adopted for different regions of interest and the image information within the ROI is saved in a targeted manner.
It is worth mentioning that the aforementioned predetermined input range refers to the input range of the compression algorithm; different compression algorithms may have different input ranges or the same input range. For the embodiments of the present application, the input range of the compression algorithm is, for example, 0-4095.
In one example, for a region of interest composed mainly of high-luminance pixels, compressing the enhancement layer image data corresponding to the region of interest into the predetermined input range can use any suitable compression method well known to those skilled in the art, e.g., multiplying the enhancement layer image data corresponding to the region of interest by a compression factor to compress it into the predetermined input range, where the compression factor is not less than 0 and not greater than 1, and further not less than 1/256 and not greater than 1; or, based on a first preset curve, nonlinearly compressing the enhancement layer image data corresponding to the region of interest into the predetermined input range. By distributing the pixel values relatively evenly within the input range of the compression algorithm (e.g., 0-4095), the highlight information is retained with higher precision.
In one example, for an ROI composed mainly of low-luminance pixels, amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range can use any suitable amplification method well known to those skilled in the art, e.g., adding a compensation offset to the enhancement layer image data corresponding to the region of interest (e.g., adding a large offset such as 4000) to amplify it into the predetermined input range; or, based on a second preset curve, nonlinearly amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range, thereby distributing the pixel values relatively evenly within the input range of the compression algorithm and retaining the shadow information with higher precision.
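A sketch of the two range-mapping strategies above; the factor and offset values are illustrative placeholders:

```python
import numpy as np

def map_into_input_range(res: np.ndarray, state: str,
                         factor: float = 0.5, offset: int = 4000,
                         lo: int = 0, hi: int = 4095) -> np.ndarray:
    if state == "bright":
        out = res * factor        # compress highlights; factor in (0, 1]
    else:
        out = res + offset        # amplify shadows with a large bias
    return np.clip(out, lo, hi).astype(np.int32)
```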
Further, the encoding process for the enhancement layer image data may also include: the DCT (Discrete Cosine Transform), quantization, computation of the Qp map (quantization parameter map), i.e., the quantization parameters in Fig. 2, coefficient scanning, entropy encoding, and so on. The DCT coefficients obtained after the DCT transform are quantized on the basis of the Qp map, and the image data is thereby encoded. The Qp map can be determined from the base layer image data (here, the base layer image data refers to the un-encoded base layer image data).
On this basis, encoding the enhancement layer image data based on the second preset encoding rule includes: encoding the enhancement layer image data based on the base layer image data.
Specifically, encoding the enhancement layer image data based on the base layer image data includes: determining, from the base layer image data, first information reflecting how the enhancement layer image data is compressed; and encoding the enhancement layer image data based on the first information.
The first information refers to information reflecting how the enhancement layer image is compressed, such as the Qp map: the larger the Qp in the Qp map, the more heavily the enhancement layer image at the corresponding position is compressed; the smaller the Qp, the more completely the information of the enhancement layer image at the corresponding position is preserved. It should be understood that compression here also refers to the image data information that is removed.
Specifically, the numbers of bright and/or dark pixels in the base layer image data can first be determined, and the magnitude of Qp is then determined through a count threshold, and encoding proceeds accordingly.
Determining, from the base layer image data, the first information reflecting how the enhancement layer image data of the region of interest is compressed includes: determining, from the luminance distribution of the base layer image data, the first information reflecting how the enhancement layer image data of the region of interest is compressed.
The luminance distribution of the base layer image data refers to the distribution of bright places (also called highlights) and dark places in the base layer image data. The more bright and/or dark pixels a given image data region contains, the more completely it is preserved and the smaller its Qp; otherwise, the larger its Qp.
More specifically, the number of highlight pixels is determined from the luminance in each preset image data region of the image; the number of dark pixels is determined from the luminance in each preset image data region; and the corresponding luminance distribution is determined from the number of highlight pixels and/or the number of dark pixels.
The above numbers of dark and highlight pixels can be determined from the color information of the corresponding image data.
Specifically, the base layer image is converted from YUV format to RGB format; the number of highlight pixels is determined from the luminance in each preset image data region of the RGB-format image; the number of dark pixels is determined from the luminance in each preset image data region of the YUV format; and the corresponding luminance distribution is determined from the number of highlight pixels and/or the number of dark pixels.
For example, as described above, the encoder in the mobile phone first converts the input base layer image data in YUV (a color encoding method: "Y" denotes luminance (Luma), i.e., the gray value, while "U" and "V" denote chrominance (Chroma), which describe the color and saturation of the image and specify the color of a pixel) into image data in RGB image format (RGB color mode, the red-green-blue color model). Based on a prior partitioning of the base layer image data into multiple image data regions, the sum of the numbers of highlight and dark pixels is counted for each region. If the sum of the numbers of highlight and dark pixels in a region is greater than 8% of the number of pixels in that region (e.g., if the region is 16x32 pixels, greater than 16x32x8%), the quantization parameter qIdx of that region is set to 2, otherwise to 8, until the quantization parameter has been set for every region, yielding the Qp map that guides the quantization of the enhancement layer image data. It should be understood that all pixels of a region share the same configured quantization parameter.
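The per-region decision just described can be sketched as follows, using the 220/64 thresholds and the 8% ratio from this example (the function name is illustrative):

```python
import numpy as np

def region_qidx(rgb_region: np.ndarray, y_region: np.ndarray,
                ratio: float = 0.08) -> int:
    """qIdx for one region: 2 (fine) if enough bright/dark pixels, else 8."""
    bright = int((rgb_region.max(axis=-1) > 220).sum())  # any RGB channel > 220
    dark = int((y_region < 64).sum())                    # Y channel < 64
    return 2 if (bright + dark) > ratio * y_region.size else 8
```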
As for partitioning the base layer image data into multiple regions, the partitioning may be the above partitioning into slices (slice image data: every 8 MBs form one slice, where an MB (macroblock) is 16x16 pixels in size), or the base layer image data may be partitioned into 16x32 quantization blocks (QB), with the same quantization parameter used within a QB and/or the same quantization parameter used within a slice. It should be understood that the image data can be partitioned into regions as needed, without limitation here.
For quantization blocks, determining the first information may be: determining, from the luminance distribution of the base layer image data, the first information reflecting how the enhancement layer image data is compressed includes: partitioning the base layer image data into quantization blocks; determining the corresponding luminance distribution for each quantization block; and determining the quantization parameter of each quantization block from the luminance distribution.
A similar implementation has already been described above and is not repeated here.
For slices, determining the luminance distribution may specifically be: determining the corresponding luminance distribution for each slice includes: determining the number of highlight pixels from the luminance in each slice; determining the number of dark pixels from the luminance in each slice; and determining the corresponding luminance distribution from the number of highlight pixels and/or the number of dark pixels.
More specifically, determining the number of highlight pixels from the luminance in each slice includes: converting the base layer image data from YUV format to RGB format and determining the number of highlight pixels from the luminance in each slice in RGB format; determining the number of dark pixels from the luminance in each slice includes: determining the number of dark pixels from the luminance in each slice in YUV format; and the corresponding luminance distribution is determined from the number of highlight pixels and/or the number of dark pixels.
As this has already been described above, it is not repeated here.
As for determining the quantization parameter of a slice from the luminance distribution: determining, from the luminance distribution of the base layer image data, the first information reflecting how the enhancement layer image data is compressed includes: partitioning the base layer image data into slice image data; determining the corresponding luminance distribution for each slice; and determining the quantization parameter of each slice from the luminance distribution. As this has already been described above, it is not repeated here.
The numbers of highlight and/or dark pixels in each region are counted. The condition for a highlight pixel is that any one of the three RGB channels has a value greater than 220; the condition for a dark pixel is that the Y channel in YUV has a value less than 64. The statistics are gathered accordingly.
It should be understood that, for any region, if only dark pixels or only highlight pixels are counted, the determination can be made from the count of dark or highlight pixels alone; if the region has both dark and highlight pixels, the judgment can be based on the sum of the counts of dark and highlight pixels. This is not repeated here.
Determining the quantization parameter of a slice may be:
Specifically, determining the quantization parameter of each slice from the luminance distribution includes: determining the quantization parameter of the corresponding slice from the number of highlight pixels and/or the number of dark pixels in the luminance distribution.
As this has already been described above, it is not repeated here.
It should be noted that the above is a process of adaptively setting quantization parameters according to the importance of the image data content. Considering that users, at the post-editing stage, make large adjustments to the two ends of the dynamic range of the image data, i.e., the bright and dark parts of the image data, the embodiments of the present application use relatively small quantization parameters for these parts of the image data, so that the encoding quality of these parts is higher, guaranteeing the user's post-editing space.
In addition, besides determining the first information from the luminance distribution of the base layer image data, the first information can also be determined from the luminance distribution of the HDR image data (e.g., the above 12-bit to-be-processed image data); the implementation is similar to the above and is not repeated here, except that the base layer image data is replaced with the to-be-processed image data.
Moreover, the encoding process of the enhancement layer image data shown in Fig. 2 can be implemented in a modular way: partitioning into slices (slice image data), the DCT (Discrete Cosine Transform), quantization, the computation of the Qp map (quantization parameter map), coefficient scanning, entropy encoding, etc. can each be implemented by a sub-module realizing the corresponding function. Each sub-module produces its output result from its input information. For the Qp map sub-module, the input may be the base layer image data or the to-be-processed image data, and the output is the Qp map.
After the first information is determined, quantization can proceed.
Specifically, encoding the enhancement layer image data based on the first information includes: determining second information that partitions the image data frequencies of the enhancement layer image data; quantizing the second information based on the first information; and encoding the enhancement layer image data according to the quantization result.
The second information refers to information used to partition the image data frequencies of the enhancement layer image data, such as the DCT coefficients, which can partition the image data frequencies of the enhancement layer image data, separating the high and low frequencies: the low frequencies lie in the upper-left corner of the DCT coefficient matrix and the high frequencies in the lower-right corner, which facilitates subsequent scan encoding, compressing more of the high-frequency information of the image data and retaining more of the low-frequency information.
Determining the second information that partitions the image data frequencies of the enhancement layer image data may include:
The enhancement layer image data can first be partitioned into slices; each slice is further partitioned into 8x8-pixel blocks, and the DCT transform is applied to each block to obtain the DCT coefficients of each block.
The specific partitioning of blocks may refer to the description of Fig. 3 above and is not repeated here.
The DCT transform can also be computed as a 1D-DCT (one-dimensional DCT), whose formula (reconstructed here as the standard 8-point DCT-II) is as follows:
X(k) = c(k) * sum_{i=0}^{7} x(i) * cos((2i+1) * k * pi / 16),  k = 0, 1, 2, ..., 7
This formula can also be applied per block to determine the 1D-DCT and obtain the transform coefficients of each block, where X(k) are the 1D-DCT transform coefficients, k is 0, 1, 2, ..., 7, x(i) is the image data signal of the block, and i indexes the corresponding pixel.
The transform matrix can be organized in the following form:
X = A * x
where A is the 8x8 transform matrix with entries:
A[k][i] = c(k) * cos((2i+1) * k * pi / 16)
c(0) = sqrt(1/8); c(k) = sqrt(2/8) = 1/2 for k = 1, ..., 7
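A direct matrix implementation of this 1D-DCT might look as follows (a sketch; the function name is illustrative):

```python
import numpy as np

def dct_1d_8(x: np.ndarray) -> np.ndarray:
    """Orthonormal 8-point 1D-DCT, computed as X = A @ x."""
    k = np.arange(8).reshape(-1, 1)
    i = np.arange(8).reshape(1, -1)
    c = np.full((8, 1), np.sqrt(2.0 / 8.0))
    c[0, 0] = np.sqrt(1.0 / 8.0)
    A = c * np.cos((2 * i + 1) * k * np.pi / 16.0)
    return A @ x
```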
Quantizing the second information, e.g., the coefficients after the DCT transform, based on the first information, e.g., the Qp map, can be realized through the following quantization formulas:
The quantization formula for the DC component is:
QF[i][j] = floor((TF[i][j] - 2^(bitdepth+2)) / (qMat[i][j] * qIdx))    (3)
Regarding 2^(bitdepth+2): when bitdepth is 12, this value is 16384. In a concrete implementation, the code can substitute this value directly.
The quantization formula for the AC components is:
QF[i][j] = floor(TF[i][j] / (qMat[i][j] * qIdx))    (4)
where TF[i][j] are the DCT coefficients after the DCT transform, bitdepth is the bit width of the DCT coefficients, QF[i][j] are the quantized DCT coefficients, qMat[i][j] is the preset quantization matrix, qIdx is the quantization parameter obtained from the corresponding position of the input Qp map, floor is the round-down function, and i, j denote the coordinates within the 8x8 block.
For the DC component, since its value is larger than the AC components, in order to reduce its excessive footprint, save storage resources and reduce subsequent computation, it is adjusted by 2^(bitdepth+2), i.e., 16384, to reduce the DC component; that is, TF[i][j] - 2^(bitdepth+2) is used to prevent the DC component from becoming too large.
If resources are sufficient and subsequent computation can be performed quickly, the adjustment can also be skipped, in which case the corresponding term is simply removed from the formula.
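A sketch implementing formulas (3) and (4) for one 8x8 block:

```python
import numpy as np

def quantize_block(TF: np.ndarray, qMat: np.ndarray, qIdx: int,
                   bitdepth: int = 12) -> np.ndarray:
    """Quantize one 8x8 block of DCT coefficients per formulas (3) and (4)."""
    QF = np.floor(TF / (qMat * qIdx)).astype(np.int32)          # AC, formula (4)
    QF[0, 0] = int(np.floor((TF[0, 0] - 2 ** (bitdepth + 2))
                            / (qMat[0, 0] * qIdx)))             # DC, formula (3)
    return QF
```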
Each quantization pass can be performed on blocks in units of 8x8. In the embodiments of the present application, the relationship between block, MB, QB and slice is shown in Fig. 4. In Fig. 4, each small block 401 represents an 8x8 block, the basic structure for the DCT transform and quantization; every 4 blocks form one MB, with the 4 blocks of an MB arranged in the order shown in Fig. 4. Every 8 blocks form one QB (16x32), i.e., those 8 blocks use the same qIdx during quantization (that is, the sub-blocks in the same slice image data share the same quantization parameter), indexed from the Qp map. The whole of Fig. 4, at 16x16x8, is the size of one slice. Slices at the end of a row may become smaller, i.e., may contain fewer QBs, and need special consideration.
In addition, image data texture boundaries need special consideration.
If such a slice still used a single qIdx, then, because the slice lies at an image data texture boundary, either the entire slice uses a small qIdx, e.g., 2, which wastes code stream, or the entire slice uses a large qIdx, e.g., 8, which degrades the encoding quality of part of the image data. Therefore, such a slice can be partitioned further into smaller blocks, e.g., QBs; the counts of highlight and/or dark pixels are then re-gathered for each QB in the manner described above and a corresponding qIdx is assigned to each QB, i.e., one QB now shares one quantization parameter, and each block is quantized to complete the quantization.
It should be understood that if the same quantization parameter is shared per QB, then, as described above, the partitioning of the enhancement layer image data can go down to QBs from the start, each QB containing multiple blocks. During the DCT transform, the blocks within the QB are transformed; quantization is then performed per block, with the quantization parameter shared by the whole QB. Of course, a QB can also be partitioned further into smaller image data region units, repeating the above Qp map and DCT transform steps to realize the final quantization; this is not repeated here.
It should be noted that the same quantization parameter can also be used per slice: as described above, a quantization parameter can be determined and applied for each slice, and only at image data texture boundaries does one QB share one quantization parameter. This is not repeated here.
For the sub-module corresponding to the quantization function, the input is the DCT coefficients and the Qp map, and the output is the quantized coefficients.
After the quantization result is obtained, the quantization result, i.e., the quantized coefficients, can be scanned and encoded to generate the code stream.
Specifically, encoding the enhancement layer image data according to the quantization result includes: scanning the quantization result and encoding the scanned quantization result; and obtaining the enhancement layer image code stream includes: generating the enhancement layer image data code stream based on the encoding result.
The scanning may be: scanning the quantization result includes: scanning the quantization result in each sub-block according to a preset scan order to obtain a first scanned quantization result; and, for the slice image data, scanning the quantization results of the sub-blocks in the slice image data in the order of the first scanned quantization result to obtain a second scanned quantization result as the final scanned quantization result.
The preset scan order may be a progressive scan.
Before scanning, the scan is still based on the sub-blocks in a slice, so the enhancement layer image data is partitioned to obtain groups of macroblocks, the macroblocks are partitioned into sub-blocks, and slice image data is generated by combining groups of macroblocks.
For example, as described above and as shown in Fig. 5, the encoder in the mobile phone scans in the progressive manner, i.e., each block is scanned individually to obtain the scan result, namely the reordered block result. In Fig. 5, the left diagram is the block before scanning and the right diagram shows the progressive scan of the block, with the arrows indicating the scan order. Then, after the 8x8 blocks have been scanned, all 8x8 blocks in the slice are scanned: the frequency component at the same position in each block is scanned first, then the next frequency component in each block, in order from low frequency to high frequency, as indicated by the arrows in Fig. 6. In Fig. 6, scanning starts from position 0 and proceeds in frequency order. In one embodiment, the frequency components may be scanned from low to high: the small blocks (pixels) of the lowest frequency component at the same position are scanned first, then the small blocks (pixels) of the frequency component at the next position are scanned in order; that is, the next frequency component is scanned until the end.
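The slice-level, frequency-first scan can be sketched as follows, where zigzag is the per-block scan order of Fig. 5 expressed as a list of (i, j) positions (an illustrative sketch):

```python
def scan_slice(blocks: list, zigzag: list) -> list:
    """Emit, for each frequency position from low to high, that coefficient
    from every 8x8 block of the slice in turn."""
    out = []
    for i, j in zigzag:          # 64 positions, low to high frequency
        for blk in blocks:       # all blocks of the slice
            out.append(blk[i][j])
    return out
```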
It should be noted that, for the sub-module corresponding to the scanning function, the input is the quantized coefficients and the output is the scanned, reordered quantized coefficients.
After scanning, the scanned and reordered quantized coefficients are encoded.
For example, as described above, the encoder in the mobile phone entropy-encodes the scanned and reordered quantized coefficients. The entropy encoding process in the embodiments of the present application uses Rice-Golomb mixed entropy encoding.
The input of the sub-module for the above entropy encoding function is the scanned and reordered quantized coefficients, and the output is the entropy-encoded binary sequence.
The above method of encoding the enhancement layer image data of the region of interest shown in Fig. 2 is only an example; other suitable methods are equally applicable to the present application, e.g., general video coding schemes such as H264, H265, H266 and the like.
Continuing as shown in Fig. 1, in step S130, the enhancement layer image code stream and the base layer image code stream are transmitted.
The two code streams may be two independent code streams; alternatively, a target image code stream may be generated based on the enhancement layer image code stream and the base layer image code stream, and the target image code stream is transmitted, thereby transmitting the enhancement layer image code stream and the base layer image code stream. The aforementioned encoding process may be performed at the encoding end, and the encoded code stream can then be transmitted to the decoding end when needed, for decoding and display.
Generating the target image code stream based on the enhancement layer image code stream and the base layer image code stream includes: writing the enhancement layer image code stream into a preset field in the base layer image code stream to generate the target image code stream, where the preset field can be chosen according to different encoding rules. For example, for the encoding rule corresponding to JPEG, it may be the position immediately following the APP0 application segment.
For example, as described above, the encoder in the mobile phone finds the position of the APP0 application segment of the base layer image code stream (code stream marker FF E0), reads the length of that application segment, and moves the pointer backwards by that length; that is the insertion position of the enhancement layer image code stream. The enhancement layer image code stream is inserted into the base layer image code stream, and the final image code stream is output.
If no APP0 application segment exists in the base layer image code stream, the enhancement layer image code stream is inserted at the end of the base layer image code stream or at another custom position.
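A byte-level sketch of locating the insertion point after APP0 (FF E0); the helper name is illustrative:

```python
def insert_after_app0(jpeg: bytes, enh: bytes) -> bytes:
    i = jpeg.find(b"\xff\xe0")
    if i < 0:
        # No APP0 segment: append at the end (or another custom position).
        return jpeg + enh
    seg_len = int.from_bytes(jpeg[i + 2:i + 4], "big")  # length includes itself
    insert_at = i + 2 + seg_len
    return jpeg[:insert_at] + enh + jpeg[insert_at:]
```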
Fig. 7 shows the organization of the enhancement layer image code stream. The slice header stores the qIdx values of the different QBs in the slice; the qIdx of each QB is stored in 1 bit, where 1 means qIdx is 8 and 0 means qIdx is 2.
In addition, before the enhancement layer image code stream is inserted, it can also be encapsulated. For example, the encoder in the mobile phone encapsulates the enhancement layer image code stream using APP7 (code stream marker FF E7) application segments; since JPEG stipulates that the length of an application segment cannot exceed 0xFFFF, the enhancement layer image code stream needs to be split and then encapsulated. The above encapsulation field is only an example; in a concrete implementation, other APP fields may be used.
Specifically, the method 100 further includes: splitting the enhancement layer image code stream and encapsulating the split code streams; and writing the enhancement layer image code stream into the preset field in the base layer image code stream includes: writing the encapsulated code streams into the preset field, the encapsulated code streams carrying corresponding code stream identification information.
The encoder in the mobile phone splits the enhancement layer image code stream, encapsulates the split code streams and inserts them at the insertion position; both the enhancement layer image code stream and the split code streams carry corresponding code stream identification information.
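A sketch of splitting and wrapping the enhancement stream into APP7 (FF E7) segments:

```python
def wrap_app7(enh: bytes) -> list:
    """Split into APP7 segments; a JPEG segment length field (which counts
    its own 2 bytes) cannot exceed 0xFFFF."""
    max_payload = 0xFFFF - 2
    segs = []
    for off in range(0, len(enh), max_payload):
        chunk = enh[off:off + max_payload]
        segs.append(b"\xff\xe7" + (len(chunk) + 2).to_bytes(2, "big") + chunk)
    return segs
```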
After the enhancement layer image code stream is cut into multiple segments, each segment needs certain header information as a marker. The header information content is shown in Fig. 8; this header describes the overall information of the enhancement layer image code stream, i.e., the code stream before cutting, and is recorded only once.
Fig. 9 shows the header information of each segment of the cut enhancement layer image code stream, which every segment carries.
In one example, the method of the present application further includes: transmitting the position information and size information of the region of interest. The position and size information of the ROI is carried in the metadata of the enhancement layer; by transmitting the position and size information of the region of interest to the decoding end, the decoding end can use this knowledge to decide whether to use the base layer and the enhancement layer to fuse and generate the enhanced dynamic image (i.e., the target image).
This completes the description of the data processing method in the embodiments of the present application. It can be understood that the steps of the above method of the present application can be reordered or interleaved where this causes no contradiction, and may possibly be executed concurrently.
In summary, the to-be-processed image data is encoded with the first preset encoding rule to obtain the base layer image code stream, the enhancement layer image data of the region of interest in the to-be-processed image data is encoded with the second preset encoding rule to obtain the enhancement layer image code stream, and the base layer image code stream and the enhancement layer image code stream are transmitted, thereby completing the encoding of the to-be-processed image data while remaining compatible with both the enhancement layer image data code stream and the base layer image data code stream. Since only the enhancement layer image data of the region of interest is encoded, adding enhancement layer image data over the full image range is avoided, which effectively reduces the amount of enhancement layer image data; and when the enhancement layer is applied to the image, computation is only needed within the region of interest, which effectively reduces the amount of computation.
In addition, since the to-be-processed image data may be HDR image data, obtained by applying a nonlinear image transform to original image data, the method of the present application can ultimately encode the HDR image data corresponding to the region of interest. The base layer image data may belong to a mainstream image format, such as JPEG, so that according to different needs either the mainstream-format image data or the HDR image data can be decoded, and the decoded base layer image data and enhancement layer image data can be fused to obtain the target image, e.g., an HDR image, which greatly improves the shareability of HDR images.
Furthermore, if the decoding end can decode the base layer image code stream and the enhancement layer image code stream, fusing the decoded base layer image data and enhancement layer image data yields a target image whose dynamic range is close to that of the to-be-processed image. Since the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream, the precision of the image is retained, leaving the user a large post-editing space and providing more creative freedom, which is more user-friendly.
Based on a similar inventive concept, another data processing method of an embodiment of the present application is described below with reference to Fig. 10, which is a schematic flowchart of a data processing method provided by an embodiment of the present invention.
The data processing method 1000 provided by this embodiment of the present application may be executed by a data processing apparatus that has a decoder; the decoder may be provided in the data processing apparatus, and the data processing apparatus may be a camera, an unmanned aerial vehicle, a handheld gimbal, a mobile phone, a computer, or any other data processing apparatus with video or photo capture functions. The method 1000 includes the following steps:
In step S1001, a base layer image code stream and an enhancement layer image code stream are acquired; for example, the base layer image code stream and the enhancement layer image code stream transmitted by the data processing apparatus at the encoding end are acquired, or a target code stream comprising the base layer image code stream and the enhancement layer image code stream is acquired.
In step S1002, the base layer image code stream is decoded according to a first preset encoding rule to obtain base layer image data of the image. The first preset decoding rule corresponds to the first preset encoding rule, i.e., the first preset encoding rule set out above, e.g., an encoding rule determined according to the JPEG encoding rules; correspondingly, the first preset decoding rule may be a decoding rule determined according to the JPEG decoding rules.
For example, the decoding end receives the target image code stream sent by the encoding end. The decoder at the decoding end obtains the JPEG code stream in that code stream, then decodes the JPEG code stream according to the decoding rule determined by the JPEG decoding rules to obtain the base layer image data, ready for display. It should be understood that, although the target image code stream may take the form of a JPEG code stream, since it also contains the enhancement layer image data code stream, the target image code stream contains the enhancement layer image data code stream in addition to the JPEG code stream.
In addition, the base layer image data is used to generate the base layer image.
For example, as described above, after obtaining the base layer image data, the decoder at the decoding end can directly generate the base layer image in JPEG image format from that data, which can be displayed on the display screen of the decoding end.
In step S1003, the enhancement layer image code stream is decoded according to a second preset encoding rule to obtain the enhancement layer image data of the region of interest in the image, where the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data. The enhancement layer image data is used together with the base layer image data to generate the target image, whose dynamic range is greater than the dynamic range of the base layer image.
The second preset decoding rule corresponds to the second preset encoding rule, i.e., the second preset encoding rule set out above; the second preset decoding rule may be the inverse process of the second preset encoding rule.
For example, as described above, the decoder at the decoding end, e.g., of a computer or a mobile phone, can directly acquire the enhancement layer image code stream of the region of interest, or obtain the enhancement layer image data code stream from the preset bytes in the target image code stream, where the split code streams may be spliced back into the complete enhancement layer image code stream. After obtaining the enhancement layer image code stream, the decoder can decode it through the inverse process of the aforementioned enhancement layer image data encoding rule (i.e., the second preset encoding rule) to obtain the enhancement layer image data of the region of interest.
In step S1004, the base layer image data and the enhancement layer image data are fused to obtain the target image.
Finally, the enhancement layer image data of the region of interest and the base layer image data are fused to obtain the target image, e.g., an HDR image, and the HDR image is displayed.
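Assuming the encoder used formula 1) above, a minimal fusion sketch simply inverts it (the function name is illustrative):

```python
import numpy as np

def fuse_to_hdr(base8: np.ndarray, res: np.ndarray,
                drc_coef: float, n: int = 3) -> np.ndarray:
    """Recover 12-bit HDR samples from the base layer and the residual."""
    r = (res.astype(np.float64) - 2048.0) / (2 ** n)   # undo offset and scaling
    hdr = (base8.astype(np.float64) + r) / drc_coef * 16.0
    return np.clip(np.round(hdr), 0, 4095).astype(np.uint16)
```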
In one example, when the position information and size information of the region of interest are recorded in the enhancement layer image code stream, the method further includes: acquiring the position information and size information of the region of interest. The decoder at the decoding end can determine, from that position and size information, whether to fuse the base layer and the enhancement layer to generate the enhanced dynamic image.
In another example, when labels were set for the pixel units and/or pixel regions in the region of interest during the encoding of the enhancement layer image data, fusing the base layer image data and the enhancement layer image data to obtain the target image includes: acquiring the labels of the pixel units and/or pixel regions in the image, the labels including a first label and a second label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and the absence of the first label, or the presence of the second label, indicating that the corresponding pixel unit and/or pixel region has no enhancement layer data; and fusing the enhancement layer image data corresponding to the pixel units and/or pixel regions marked with the first label with the base layer image data to generate the target image, e.g., an HDR image, and displaying the HDR image.
For some details of the data processing method 1000 provided by this embodiment of the present application, reference may be made to the description of the steps of the data processing method 100 above.
It should be noted that the steps of the data processing method 1000 are not limited in order; they need not follow the numbering or the order in which they are written, and may possibly be executed concurrently. Moreover, the data processing method 1000 can be executed by multiple processing elements in the data processing apparatus working in parallel.
Since the data processing method of this embodiment of the present application is performed using the enhancement layer image code stream and the base layer image code stream obtained by the preceding data processing method, it can obtain a target image with a higher dynamic range; and, during the fusion, since the enhancement layer image code stream only carries the enhancement layer image data corresponding to the region of interest, the amount of computation is smaller.
A data processing apparatus provided by an embodiment of the present application is described below with reference to Fig. 11, which is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention.
The data processing apparatus may be any electronic device with an encoder; for example, the data processing apparatus may be a camera, a movable platform (e.g., an aircraft, a vehicle, a boat, a robot, etc.), a handheld gimbal, a mobile phone, a camera on a movable platform, a tablet computer, a computer, or any other apparatus with image data processing functions; the data processing apparatus may also have video and/or photo capture functions.
As shown in Fig. 11, the data processing apparatus 1100 includes one or more processors 1101, one or more memories 1102, an image acquisition device (not shown), a communication interface, a display unit, etc. These components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
The memory 1102 is configured to store executable instructions, e.g., the program instructions for implementing the corresponding steps of the data processing method 100 according to the embodiments of the present application. The memory 1102 may store an operating system (including programs for handling various basic system services and for performing hardware-related tasks) and the application programs required by at least one function (such as the control function of the photographing apparatus, a sound playback function, an image playback function, etc.). Some of these applications can be downloaded from an external server via wireless communication; other applications can be installed in the memory 1102 at manufacture or before shipment. The memory may also include one or more computer program products, which may include various forms of computer-readable storage media, e.g., volatile memory and/or nonvolatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The nonvolatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory (e.g., an SD card), etc.
The communication interface (not shown) can be used for communication between the data processing apparatus 1100 and other external devices, e.g., the data processing apparatus at the decoding end, in a wired or wireless manner. The communication interface may provide access to a wireless network based on a communication standard, such as WiFi (e.g., WiFi 6), software-defined radio (SDR), 2G, 3G, 4G, 5G, or a combination thereof. In one exemplary embodiment, the communication interface receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication interface further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
Optionally, the data processing apparatus 1100 further includes an input device (not shown), which may be a device used by the user to input various operations and may include, for example, one or more of function keys (e.g., touch keys, buttons such as volume control buttons and switch buttons, mechanical keys, soft keys, etc.), a joystick, a microphone and a touch screen, or other user input devices for enabling the user to input information. Data (e.g., audio, video, images, etc.) is obtained through the input device.
The processor 1101 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another form of processing unit with data processing capability and/or instruction execution capability, and can control other components in the photographing apparatus to perform the desired functions. The processor 1101 can execute the instructions stored in the memory 1102 to perform the corresponding steps of the data processing method 100 described herein; for details, reference may be made to the method description above. For example, the processor 1101 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), ISPs, encoders, decoders, or combinations thereof.
The one or more processors 1101 are configured to execute the program instructions stored in the memory so that the processor performs the following steps: encoding to-be-processed image data with a first preset encoding rule to obtain a base layer image code stream (optionally, the first preset encoding rule is an encoding rule determined according to the JPEG encoding rules); encoding enhancement layer image data of a region of interest in the to-be-processed image data with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data corresponding to the base layer image code stream; and transmitting the base layer image code stream and the enhancement layer image code stream. Optionally, the dynamic range of the to-be-processed image data is higher than the dynamic range of the base layer image data.
In one example, the processor 1101 is further configured to: decode the base layer image code stream to obtain the base layer image data; determine the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data; and determine the region of interest in the to-be-processed image based on first pixel information of the to-be-processed image data and/or second pixel information of the enhancement layer image data of the to-be-processed image.
Optionally, the to-be-processed image data is obtained by applying at least one nonlinear image transform to original image data, where the bit depth of the image data before the transform is higher than or equal to the bit depth of the image data after the transform.
In one example, the processor 1101 is further configured to determine the region of interest in the image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data, including: acquiring the first pixel information corresponding to the pixel units in each pixel region of the to-be-processed image data, and/or the second pixel information corresponding to the pixel units in each pixel region of the enhancement layer image data of the to-be-processed image data; and when the first pixel information of at least one pixel unit in a first pixel region among the pixel regions satisfies a first preset threshold condition and/or the second pixel information satisfies a second preset threshold condition, determining that the first pixel region belongs to the region of interest.
For example, the first pixel information includes luminance information, a first chrominance component and a second chrominance component, and the first preset threshold condition includes: the luminance information is greater than a first luminance threshold; or the luminance information is less than a second luminance threshold, where the first luminance threshold is greater than the second luminance threshold; or the first chrominance component is greater than a first chrominance threshold; or the first chrominance component is less than a second chrominance threshold, where the first chrominance threshold is greater than the second chrominance threshold; or the second chrominance component is greater than a third chrominance threshold; or the second chrominance component is less than a fourth chrominance threshold, where the third chrominance threshold is greater than the fourth chrominance threshold.
As another example, the second pixel information includes a luminance residual, a first chrominance component residual and a second chrominance component residual, and the second preset threshold condition includes: the absolute value of the luminance residual is greater than a first luminance residual threshold; or the absolute value of the first chrominance component residual is greater than a first chrominance component residual threshold; or the absolute value of the second chrominance component residual is greater than a second chrominance component residual threshold.
In one example, the processor 1101 is further configured to transmit the position information and size information of the region of interest.
In one example, the processor 1101 is further configured to: set labels for pixel units and/or pixel regions in the to-be-processed image, where pixel units and/or pixel regions determined to belong to the region of interest are set with a first label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and where the absence of the first label, or a label that is the second label, indicates that the corresponding pixel unit and/or pixel region has no enhancement layer image data.
In one example, the processor 1101 is further configured so that determining the region of interest in the to-be-processed image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data further includes: enclosing at least some of the pixel units set with the first label with at least one bounding box, where the proportion of the pixel units enclosed by the at least one bounding box among all pixel units set with the first label is not less than a threshold proportion; and/or enclosing at least some of the pixel regions set with the first label with at least one bounding box, where the proportion of the pixel regions enclosed by the at least one bounding box among all pixel regions set with the first label is not less than the threshold proportion, the region enclosed by the bounding box being the region of interest. Optionally, the threshold proportion is greater than or equal to 80%.
In one example, the processor 1101 is further configured so that transmitting the base layer image code stream and the enhancement layer image code stream further includes: generating a target image code stream based on the enhancement layer image code stream and the base layer image code stream; and transmitting the target image code stream.
In one example, the processor 1101 is further configured to encode the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream, including: determining, according to the luminance information of the pixel units and/or pixel regions in the region of interest, the luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state including a first luminance state and a second luminance state, where the luminance of the first luminance state is higher than the luminance of the second luminance state; and compressing the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the first luminance state, to compress it into a predetermined input range; or amplifying the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the second luminance state, to amplify it into the predetermined input range.
In one example, the processor 1101 is further configured to encode the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream, including: determining, according to the luminance information of the pixel units and/or pixel regions in the region of interest, the luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state including a first luminance state and a second luminance state, where the luminance of the first luminance state is higher than the luminance of the second luminance state; when the proportion of pixel units and/or pixel regions in the first luminance state within the region of interest is greater than a first threshold proportion, compressing the enhancement layer image data corresponding to the region of interest into a predetermined input range; and when the proportion of pixel units and/or pixel regions in the second luminance state within the region of interest is greater than a second threshold proportion, amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range.
In one example, the processor 1101 is further configured to compress the enhancement layer image data corresponding to the region of interest into the predetermined input range, including: multiplying the enhancement layer image data corresponding to the region of interest by a compression factor to compress it into the predetermined input range, where the compression factor is not less than 0 and not greater than 1, and further, the compression factor is not less than 1/256 and not greater than 1; or, based on a first preset curve, nonlinearly compressing the enhancement layer image data corresponding to the region of interest into the predetermined input range.
In one example, the processor 1101 is further configured to amplify the enhancement layer image data corresponding to the region of interest into the predetermined input range, including: adding a compensation offset to the enhancement layer image data corresponding to the region of interest to amplify it into the predetermined input range; or, based on a second preset curve, nonlinearly amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range.
In one example, the processor 1101 is further configured to determine the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data, including: computing a residual based on the to-be-processed image data and the base layer image data; and determining the enhancement layer image data of the to-be-processed image based on the residual.
In one example, the processor 1101 is further configured to compute the residual based on the to-be-processed image data and the base layer image data, including: subtracting the base layer image data from the to-be-processed image data to obtain the residual.
In one example, the processor 1101 is further configured to compute the residual based on the to-be-processed image data and the base layer image data, including: aligning the dynamic range represented by the to-be-processed image data with the dynamic range represented by the base layer image data; compensating the aligned to-be-processed image data by the ratio of the dynamic range of the to-be-processed image data to the dynamic range of the base layer image data; and subtracting the base layer image data from the compensated to-be-processed image data to obtain the residual.
In one example, the processor 1101 is further configured to determine the enhancement layer image data based on the residual, including: obtaining to-be-retained data based on the residual; and selecting retained data from the to-be-retained data as the enhancement layer image data.
In one example, the processor 1101 is further configured to select retained data from the to-be-retained data as the enhancement layer image data, including: adjusting the retained data to obtain enhancement layer image data with positive integer values.
In one example, the processor 1101 is further configured to encode the enhancement layer image data corresponding to the region of interest with the second preset encoding rule to obtain the enhancement layer image code stream, including: encoding the enhancement layer image data corresponding to the region of interest based on the base layer image data.
Specifically, for a detailed description of the steps performed by the processor 1101, reference may be made to the description of the data processing method 100 above, which is not repeated here.
Optionally, the data processing apparatus further includes a display unit (not shown), a device that can be used to display the images or videos produced by the processor components. The display unit may include, for example, a liquid crystal display (LCD) device or an organic light-emitting diode (OLED) device. Based on the data obtained from the processor, the display unit can display various images, such as menus, selected operating parameters, a shooting mode selection interface, images captured by the image sensor, and/or other information received from the processor and/or from the user interface of the device.
Optionally, the data processing apparatus further includes an input device, which may be a device used by the user to input various operations and may include, for example, one or more of a keyboard, a mouse, function keys (such as volume control buttons, switch buttons, etc.), a joystick, a microphone and a touch screen.
Since the data processing apparatus of this embodiment of the present application can execute the aforementioned data processing method 100, it has the same advantages as the aforementioned method 100.
Another data processing apparatus of the present application is described below with reference to Fig. 12, which is a schematic block diagram of a data processing apparatus provided by an embodiment of the present invention.
The data processing apparatus may be any electronic device with a decoder; for example, the data processing apparatus may be a camera, a movable platform (e.g., an aircraft, a vehicle, a boat, a robot, etc.), a handheld gimbal, a mobile phone, a camera on a movable platform, a tablet computer, a computer, or any other apparatus with image processing functions; the data processing apparatus may also have video and/or photo capture functions.
As shown in Fig. 12, the data processing apparatus 1200 includes one or more processors 1201, one or more memories 1202, an image acquisition device (not shown), a communication interface, a display unit, etc. These components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
The memory 1202 is configured to store executable instructions, e.g., the program instructions for implementing the corresponding steps of the data processing method 1000 according to the embodiments of the present application. The memory 1202 may store an operating system (including programs for handling various basic system services and for performing hardware-related tasks) and the application programs required by at least one function (such as the control function of the photographing apparatus, a sound playback function, an image playback function, etc.). Some of these applications can be downloaded from an external server via wireless communication; other applications can be installed in the memory 1202 at manufacture or before shipment. The memory may also include one or more computer program products, which may include various forms of computer-readable storage media, e.g., volatile memory and/or nonvolatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The nonvolatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory (e.g., an SD card), etc.
The communication interface (not shown) can be used for communication between the data processing apparatus 1200 and other external devices, e.g., the data processing apparatus at the encoding end, in a wired or wireless manner. The communication interface may provide access to a wireless network based on a communication standard, such as WiFi (e.g., WiFi 6), software-defined radio (SDR), 2G, 3G, 4G, 5G, or a combination thereof. In one exemplary embodiment, the communication interface receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication interface further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
Optionally, the data processing apparatus 1200 further includes an input device (not shown), which may be a device used by the user to input various operations and may include, for example, one or more of function keys (e.g., touch keys, buttons such as volume control buttons and switch buttons, mechanical keys, soft keys, etc.), a joystick, a microphone and a touch screen, or other user input devices for enabling the user to input information. Data (e.g., audio, video, images, etc.) is obtained through the input device.
The processor 1201 may be a central processing unit (CPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another form of processing unit with data processing capability and/or instruction execution capability, and can control other components in the photographing apparatus to perform the desired functions. The processor 1201 can execute the instructions stored in the memory 1202 to perform the corresponding steps of the data processing method 1000 described herein; for details, reference may be made to the method description above. For example, the processor 1201 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), ISPs, encoders, decoders, or combinations thereof.
The one or more processors 1201 are configured to execute the program instructions stored in the memory so that the processor performs the following steps: acquiring a base layer image code stream and an enhancement layer image code stream; decoding the base layer image code stream according to a first preset encoding rule to obtain base layer image data of the image; decoding the enhancement layer image code stream according to a second preset encoding rule to obtain enhancement layer image data of the region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and fusing the base layer image data and the enhancement layer image data to obtain a target image.
In one example, the processor 1201 is further configured to acquire the position information and size information of the region of interest.
In one example, the processor 1201 is configured to fuse the base layer image data and the enhancement layer image data to obtain the target image, including: acquiring the labels of pixel units and/or pixel regions in the image, the labels including a first label and a second label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and the absence of the first label, or the presence of the second label, indicating that the corresponding pixel unit and/or pixel region has no enhancement layer data; and fusing the enhancement layer image data corresponding to the pixel units and/or pixel regions marked with the first label with the base layer image data to generate the target image.
Specifically, for a detailed description of the steps performed by the processor 1201, reference may be made to the description of the data processing method 1000 or the data processing method 100 above, which is not repeated here.
Optionally, the data processing apparatus further includes a display unit (not shown), a device that can be used to display the images or videos produced by the processor components. The display unit may include, for example, a liquid crystal display (LCD) device or an organic light-emitting diode (OLED) device. Based on the data obtained from the processor, the display unit can display various images, such as menus, selected operating parameters, a shooting mode selection interface, images captured by the image sensor, and/or other information received from the processor and/or from the user interface of the device.
Optionally, the data processing apparatus further includes an input device, which may be a device used by the user to input various operations and may include, for example, one or more of a keyboard, a mouse, function keys (such as volume control buttons, switch buttons, etc.), a joystick, a microphone and a touch screen.
Since the data processing apparatus of this embodiment of the present application can execute the aforementioned data processing method 1000, it has the same advantages as the aforementioned method 1000.
In addition, an embodiment of the present application further provides a data processing system; Fig. 13 is a schematic block diagram of a data processing system provided by an embodiment of the present invention.
The data processing system 1300 includes the aforementioned data processing apparatus 1100 and data processing apparatus 1200, where the data processing apparatus 1100 and the data processing apparatus 1200 may be the same apparatus or device, or two independent apparatuses or devices. For example, the data processing apparatus 1100 and the data processing apparatus 1200 may be the same electronic device, such as a camera, a mobile phone, a computer or a tablet computer, which can both encode and, when needed, decode.
In another example, the data processing apparatus 1100 and the data processing apparatus 1200 are two communicatively connected apparatuses or devices; for example, the data processing apparatus 1100 can collect image data, encode the image data according to the aforementioned data processing method 100 and output the base layer image code stream and the enhancement layer image code stream, while the data processing apparatus 1200 can receive the base layer image code stream and the enhancement layer image code stream, decode them, and fuse them to generate the target image.
In a specific example, the data processing apparatus 1100 may be an unmanned aerial vehicle with a camera, and the data processing apparatus 1200 may be a control terminal communicatively connected to the unmanned aerial vehicle, such as a mobile phone, a tablet computer or a smart wearable device.
Since the data processing system can both encode and decode, it likewise has the advantages of the data processing methods described above.
In addition, an embodiment of the present invention further provides a computer storage medium on which a computer program is stored. One or more computer program instructions can be stored on the computer-readable storage medium, and a processor can run the program instructions stored in the storage device to realize the functions (implemented by the processor) of the embodiments of the present invention described herein and/or other desired functions, e.g., to execute the corresponding steps of the data processing method 100 and/or the data processing method 1000 according to the embodiments of the present application. Various applications and various data, such as the various data used and/or produced by the applications, can also be stored in the computer-readable storage medium.
For example, the computer storage medium may include a memory card of a smartphone, a storage component of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, or any combination of the above storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.
Although example embodiments have been described here with reference to the accompanying drawings, it should be understood that the above example embodiments are only exemplary and are not intended to limit the scope of the present invention thereto. A person of ordinary skill in the art can make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as claimed in the appended claims.
A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods for each specific application to implement the described functions, but such implementations should not be considered to go beyond the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are only illustrative; e.g., the division of the units is only a logical functional division, and there may be other ways of division in actual implementation; e.g., multiple units or components may be combined or integrated into another device, or some features may be omitted or not executed.
In the specification provided here, numerous specific details are described. However, it can be understood that the embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the present invention and help in understanding one or more of the various inventive aspects, in the description of the exemplary embodiments of the present invention, the features of the present invention are sometimes grouped together into a single embodiment, figure or the description thereof. However, the method of the present invention should not be interpreted as reflecting the intention that the claimed invention requires more features than those expressly recited in each claim. More precisely, as reflected in the corresponding claims, the inventive point lies in that the corresponding technical problem can be solved with fewer than all features of a single disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art can understand that, except where features are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed can be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) can be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art can understand that although some embodiments described here include certain features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the present invention and form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.
The various component embodiments of the present invention can be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) can be used in practice to implement some or all of the functions of some modules according to the embodiments of the present invention. The present invention can also be implemented as apparatus programs (e.g., computer programs and computer program products) for performing part or all of the methods described here. Such a program implementing the present invention can be stored on a computer-readable medium, or can take the form of one or more signals. Such signals can be downloaded from Internet sites, or provided on carrier signals, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, no reference sign placed between parentheses shall be construed as limiting the claim. The present invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any order; these words can be interpreted as names.

Claims (31)

  1. A data processing method, characterized in that the method comprises:
    encoding a to-be-processed image with a first preset encoding rule to obtain a base layer image code stream;
    encoding enhancement layer image data of a region of interest in the to-be-processed image with a second preset encoding rule to obtain an enhancement layer image code stream, wherein the bit depth of the enhancement layer image data is greater than the bit depth of base layer image data corresponding to the base layer image code stream; and
    transmitting the base layer image code stream and the enhancement layer image code stream.
  2. The method of claim 1, characterized by further comprising:
    decoding the base layer image code stream to obtain the base layer image data;
    determining the enhancement layer image data of the to-be-processed image based on to-be-processed image data and the base layer image data; and
    determining the region of interest in the to-be-processed image based on first pixel information of the to-be-processed image data and/or second pixel information of the enhancement layer image data of the to-be-processed image.
  3. The method of claim 2, characterized in that the to-be-processed image data is obtained by applying at least one nonlinear image transform to original image data, wherein the bit depth of the image data before the transform is higher than or equal to the bit depth of the image data after the transform.
  4. The method of claim 2, characterized in that determining the region of interest in the image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data of the to-be-processed image comprises:
    acquiring the first pixel information corresponding to pixel units in each pixel region of the to-be-processed image data, and/or the second pixel information corresponding to pixel units in each pixel region of the enhancement layer image data of the to-be-processed image; and
    when the first pixel information of at least one pixel unit in a first pixel region among the pixel regions satisfies a first preset threshold condition and/or the second pixel information satisfies a second preset threshold condition, determining that the first pixel region belongs to the region of interest.
  5. The method of claim 4, characterized in that the first pixel information comprises luminance information, a first chrominance component and a second chrominance component, and the first preset threshold condition comprises:
    the luminance information is greater than a first luminance threshold; or
    the luminance information is less than a second luminance threshold, wherein the first luminance threshold is greater than the second luminance threshold; or
    the first chrominance component is greater than a first chrominance threshold; or
    the first chrominance component is less than a second chrominance threshold, wherein the first chrominance threshold is greater than the second chrominance threshold; or
    the second chrominance component is greater than a third chrominance threshold; or
    the second chrominance component is less than a fourth chrominance threshold, wherein the third chrominance threshold is greater than the fourth chrominance threshold.
  6. The method of claim 4, characterized in that the second pixel information comprises a luminance residual, a first chrominance component residual and a second chrominance component residual, and the second preset threshold condition comprises:
    the absolute value of the luminance residual is greater than a first luminance residual threshold; or
    the absolute value of the first chrominance component residual is greater than a first chrominance component residual threshold; or
    the absolute value of the second chrominance component residual is greater than a second chrominance component residual threshold.
  7. The method of claim 1, characterized by further comprising:
    transmitting position information and size information of the region of interest.
  8. The method of claim 1, characterized in that the method further comprises:
    setting labels for pixel units and/or pixel regions in the to-be-processed image, wherein pixel units and/or pixel regions determined to belong to the region of interest are set with a first label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and wherein the absence of the first label, or a label that is a second label, indicates that the corresponding pixel unit and/or pixel region has no enhancement layer image data.
  9. The method of claim 8, characterized in that determining the region of interest in the to-be-processed image based on the first pixel information of the to-be-processed image data and/or the second pixel information of the enhancement layer image data further comprises:
    enclosing at least some of the pixel units set with the first label with at least one bounding box, wherein the proportion of the pixel units enclosed by the at least one bounding box among all pixel units set with the first label is not less than a threshold proportion; and/or
    enclosing at least some of the pixel regions set with the first label with at least one bounding box, wherein the proportion of the pixel regions enclosed by the at least one bounding box among all pixel regions set with the first label is not less than a threshold proportion,
    wherein the region enclosed by the bounding box is the region of interest.
  10. The method of claim 9, characterized in that the threshold proportion is greater than or equal to 80%.
  11. The method of claim 1, characterized in that transmitting the base layer image code stream and the enhancement layer image code stream further comprises:
    generating a target image code stream based on the enhancement layer image code stream and the base layer image code stream; and
    transmitting the target image code stream.
  12. The method of claim 1, characterized in that encoding the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream comprises:
    determining, according to luminance information of pixel units and/or pixel regions in the region of interest, a luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state comprising a first luminance state and a second luminance state, wherein the luminance corresponding to the first luminance state is higher than the luminance corresponding to the second luminance state; and
    compressing the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the first luminance state, to compress it into a predetermined input range; or
    amplifying the enhancement layer image data corresponding to pixel units and/or pixel regions whose luminance state is the second luminance state, to amplify it into the predetermined input range.
  13. The method of claim 1, characterized in that encoding the enhancement layer image data of the region of interest in the to-be-processed image with the second preset encoding rule to obtain the enhancement layer image code stream comprises:
    determining, according to luminance information of pixel units and/or pixel regions in the region of interest, a luminance state of the pixel units and/or pixel regions of the region of interest, the luminance state comprising a first luminance state and a second luminance state, wherein the luminance corresponding to the first luminance state is higher than the luminance corresponding to the second luminance state;
    when the proportion of pixel units and/or pixel regions in the first luminance state within the region of interest is greater than a first threshold proportion, compressing the enhancement layer image data corresponding to the region of interest into a predetermined input range; and
    when the proportion of pixel units and/or pixel regions in the second luminance state within the region of interest is greater than a second threshold proportion, amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range.
  14. The method of claim 12 or 13, characterized in that compressing the enhancement layer image data corresponding to the region of interest into the predetermined input range comprises:
    multiplying the enhancement layer image data corresponding to the region of interest by a compression factor to compress it into the predetermined input range, wherein the compression factor is not less than 0 and not greater than 1; or
    nonlinearly compressing, based on a first preset curve, the enhancement layer image data corresponding to the region of interest into the predetermined input range.
  15. The method of claim 14, characterized in that the compression factor is not less than 1/256 and not greater than 1.
  16. The method of claim 12 or 13, characterized in that amplifying the enhancement layer image data corresponding to the region of interest into the predetermined input range comprises:
    adding a compensation offset to the enhancement layer image data corresponding to the region of interest to amplify it into the predetermined input range; or
    nonlinearly amplifying, based on a second preset curve, the enhancement layer image data corresponding to the region of interest into the predetermined input range.
  17. The method of claim 2, characterized in that determining the enhancement layer image data of the to-be-processed image based on the to-be-processed image data and the base layer image data comprises:
    computing a residual based on the to-be-processed image data and the base layer image data; and
    determining the enhancement layer image data of the to-be-processed image based on the residual.
  18. The method of claim 17, characterized in that computing the residual based on the to-be-processed image data and the base layer image data comprises:
    subtracting the base layer image data from the to-be-processed image data to obtain the residual.
  19. The method of claim 17, characterized in that computing the residual based on the to-be-processed image data and the base layer image data comprises:
    aligning the dynamic range represented by the to-be-processed image data with the dynamic range represented by the base layer image data;
    compensating the aligned to-be-processed image data by the ratio of the dynamic range of the to-be-processed image data to the dynamic range of the base layer image data; and
    subtracting the base layer image data from the compensated to-be-processed image data to obtain the residual.
  20. The method of any one of claims 17 to 19, characterized in that determining the enhancement layer image data based on the residual comprises:
    obtaining to-be-retained data based on the residual; and
    selecting retained data from the to-be-retained data as the enhancement layer image data.
  21. The method of claim 20, characterized in that selecting retained data from the to-be-retained data as the enhancement layer image data comprises:
    adjusting the retained data to obtain enhancement layer image data with positive integer values.
  22. The method of claim 1, characterized in that encoding the enhancement layer image data corresponding to the region of interest with the second preset encoding rule to obtain the enhancement layer image code stream comprises:
    encoding the enhancement layer image data corresponding to the region of interest based on the base layer image data.
  23. The method of claim 1, characterized in that the dynamic range of the to-be-processed image is higher than the dynamic range of the base layer image data.
  24. The method of claim 1, characterized in that the first preset encoding rule is an encoding rule determined according to the JPEG encoding rules.
  25. A data processing method, characterized by comprising:
    acquiring a base layer image code stream and an enhancement layer image code stream;
    decoding the base layer image code stream according to a first preset encoding rule to obtain base layer image data of an image;
    decoding the enhancement layer image code stream according to a second preset encoding rule to obtain enhancement layer image data of a region of interest in the image, wherein the bit depth of the enhancement layer image data is greater than the bit depth of the base layer image data; and
    fusing the base layer image data and the enhancement layer image data to obtain a target image.
  26. The method of claim 25, characterized in that the method further comprises:
    acquiring position information and size information of the region of interest.
  27. The method of claim 25, characterized in that fusing the base layer image data and the enhancement layer image data to obtain the target image comprises:
    acquiring labels of pixel units and/or pixel regions in the image, the labels comprising a first label and a second label, the first label indicating that the corresponding pixel unit and/or pixel region has enhancement layer image data, and the absence of the first label, or the presence of the second label, indicating that the corresponding pixel unit and/or pixel region has no enhancement layer data; and
    fusing the enhancement layer image data corresponding to the pixel units and/or pixel regions marked with the first label with the base layer image data to generate the target image.
  28. A data processing apparatus, characterized in that the data processing apparatus comprises:
    a memory for storing executable program instructions; and
    one or more processors for executing the program instructions stored in the memory, so that the processor executes the data processing method of any one of claims 1 to 24.
  29. A data processing apparatus, characterized in that the data processing apparatus comprises:
    a memory for storing executable program instructions; and
    one or more processors for executing the program instructions stored in the memory, so that the processor executes the data processing method of any one of claims 25 to 27.
  30. A data processing system, characterized in that the data processing system comprises:
    the data processing apparatus of claim 28; and
    the data processing apparatus of claim 29.
  31. A computer storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the data processing method of any one of claims 1 to 24, or implements the data processing method of any one of claims 25 to 27.
PCT/CN2021/071513 2021-01-13 2021-01-13 Data processing method, apparatus and system, and computer storage medium WO2022151053A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/071513 WO2022151053A1 (zh) 2021-01-13 2021-01-13 Data processing method, apparatus and system, and computer storage medium




