WO2023165082A1 - Image preview method, device, electronic equipment, storage medium, computer program and products thereof - Google Patents

Image preview method, device, electronic equipment, storage medium, computer program and products thereof

Info

Publication number
WO2023165082A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
group
preview
sample
Prior art date
Application number
PCT/CN2022/110220
Other languages
English (en)
French (fr)
Inventor
何岱岚
彭维崑
王岩
秦红伟
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023165082A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to an image preview method, device, electronic equipment, storage medium, computer program, and products thereof.
  • Embodiments of the present disclosure provide an image preview method, device, electronic equipment, storage medium, computer program and products thereof.
  • An embodiment of the present disclosure provides an image preview method, including: acquiring target coded data, wherein the target coded data is obtained after a target coding network performs image coding on a target image; and using a target preview network corresponding to the target coding network to perform image decoding on the target coded data to obtain a target preview image corresponding to the target image, wherein the resolution of the target preview image is smaller than the resolution of the target image.
  • An embodiment of the present disclosure provides an image preview device, including: an acquisition part configured to acquire target coded data, wherein the target coded data is obtained after a target coding network performs image coding on a target image; and an image preview part configured to use a target preview network corresponding to the target coding network to perform image decoding on the target coded data to obtain a target preview image corresponding to the target image, wherein the resolution of the target preview image is smaller than the resolution of the target image.
  • An embodiment of the present disclosure provides an electronic device, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • An embodiment of the present disclosure provides a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is implemented.
  • An embodiment of the present disclosure provides a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program. When the computer program is read and executed by a computer, some or all of the steps of the above method are implemented.
  • An embodiment of the present disclosure provides a computer program, where the computer program includes computer-readable code. When the computer-readable code runs in an electronic device, a processor of the electronic device executes it to implement some or all of the steps of the above method.
  • The target encoded data obtained after the target encoding network performs image encoding on the target image is acquired, and the target preview network corresponding to the target encoding network is used to decode the target encoded data to obtain a target preview image whose resolution is smaller than that of the target image.
  • The encoded data can thus be used to directly generate a target preview image that has a smaller resolution yet better retains the semantic information of the target image, which reduces the computational cost of the preview image, speeds up generation of the target preview image, and meets the user's preview needs.
  • FIG. 1 is a schematic flowchart of an image preview method provided by an embodiment of the present disclosure;
  • FIG. 2 is a block diagram of an image preview device provided by an embodiment of the present disclosure;
  • FIG. 3 is a block diagram of an electronic device provided by an embodiment of the present disclosure;
  • FIG. 4 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
  • The image preview method provided by the embodiments of the present disclosure can be applied to end-to-end image coding technology: for a target coding network of the end-to-end image coding technology, a target preview network capable of directly generating small-resolution preview images is trained.
  • The target coded data obtained after the target coding network performs image coding on the target image is acquired, and the target preview network corresponding to the target coding network is used to decode the target coded data to obtain a target preview image whose resolution is smaller than that of the target image. The encoded data can thus be used to directly generate a target preview image that has a smaller resolution yet better retains the semantic information of the target image, which reduces the computational cost of the preview image, speeds up generation of the target preview image, and meets the user's preview needs.
  • FIG. 1 is a schematic flowchart of an image preview method provided by an embodiment of the present disclosure.
  • the image preview method can be executed by electronic devices such as terminal equipment or servers, and the terminal equipment can be user equipment (User Equipment, UE), mobile equipment, user terminal, terminal, cellular phone, cordless phone, personal digital assistant (Personal Digital Assistant, PDA), handheld device, computing device, vehicle-mounted device, wearable device, etc.
  • the image preview method can be implemented by calling the computer-readable instructions stored in the memory by the processor.
  • the image preview method may include:
  • Step S11: acquiring target coded data, wherein the target coded data is obtained after the target coding network performs image coding on the target image.
  • After the target encoding network performs image encoding on the target image to obtain the target encoded data, if the target image needs to be previewed, the target encoded data can be acquired so that image decoding can subsequently be performed on it.
  • the target coding network here may be a coding network trained based on end-to-end image coding technology.
  • the network structure and training process of the target coding network may refer to the end-to-end image coding network in the related art, which is not limited in this disclosure.
  • Step S12: using the target preview network corresponding to the target coding network to perform image decoding on the target coded data to obtain a target preview image corresponding to the target image, wherein the resolution of the target preview image is smaller than the resolution of the target image.
  • A target preview image that has a smaller resolution yet better retains the semantic information of the target image can be generated directly.
  • How to train the target preview network, and how to use it to decode the target coded data to obtain the target preview image, will be described in detail below in combination with some implementations of the present disclosure.
  • The target encoded data obtained after the target encoding network performs image encoding on the target image is acquired, and the target preview network corresponding to the target encoding network is used to decode the target encoded data to obtain a target preview image whose resolution is smaller than that of the target image.
  • The encoded data can thus be used to directly generate a target preview image that has a smaller resolution yet better retains the semantic information of the target image, which reduces the computational cost of the preview image, speeds up generation of the target preview image, and meets the user's preview needs.
  • the corresponding target preview network is trained for the target encoding network in advance.
  • Before using the target preview network corresponding to the target coding network to perform image decoding on the target coded data, the image preview method further includes: using the target coding network to perform image coding on a sample image to obtain sample coded data of the sample image; using an initial preview network to perform image decoding on the sample coded data to obtain a predicted preview image corresponding to the sample image, wherein the resolution of the predicted preview image is smaller than the resolution of the sample image; determining, based on the predicted preview image, a predicted decoded image corresponding to the sample image; and using the sample image and the predicted decoded image to perform network training on the initial preview network to obtain the target preview network.
  • The target coding network is used to encode the sample image to obtain the sample coded data; the initial preview network is used to decode the sample coded data to obtain a predicted preview image with a smaller resolution; the predicted decoded image corresponding to the sample image is determined based on the predicted preview image; and the initial preview network is trained using the sample image and the predicted decoded image. The trained target preview network can then directly generate, from the encoded data of the target encoding network, a preview image that has a smaller resolution yet better retains the original semantic information of the image, which reduces the computational cost of the preview image, speeds up its generation, and meets the user's preview requirements.
  • a training sample set is constructed in advance, wherein the training sample set may include sample images of various resolutions.
  • the number of sample images included in the training sample set and the resolution of each sample image can be set according to actual conditions, which are not limited in this disclosure.
  • At least one sample image is randomly selected from the training sample set, and the sample image is input into the target coding network, and after image coding is performed by the target coding network, the sample coded data of the sample image is output.
  • The target encoding network is an encoding network based on a channel-grouping entropy encoding algorithm; the sample encoded data is obtained by performing entropy encoding on K groups of sample channel features corresponding to the sample image in sequence, and the sample encoded data includes the code stream data corresponding to each group of sample channel features.
  • entropy encoding can be performed sequentially on the K groups of sample channel features corresponding to the sample image, making full use of the structural redundancy between different channel groups, and improving the accuracy of the encoded data.
  • the value of K can be determined according to the actual situation, which is not limited in the present disclosure.
  • The sample image is first encoded as a sample feature tensor of size $H \times W \times C$, where $H$ and $W$ are the height and width of the spatial dimensions and $C$ is the number of feature channels in the channel dimension.
  • The sample feature tensor is grouped along the channel dimension to obtain K groups of sample channel features, whose channel counts are $C_1, C_2, \ldots, C_K$ respectively, where $C_1 + C_2 + \cdots + C_K = C$.
  • The values of $C_1, C_2, \ldots, C_K$ may be the same or different, which is not limited in the present disclosure.
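The channel grouping described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `split_channel_groups`, the tensor sizes, and the group sizes are all hypothetical.

```python
import numpy as np

def split_channel_groups(features, group_sizes):
    """Split an H x W x C feature tensor into K groups along the channel axis."""
    h, w, c = features.shape
    if sum(group_sizes) != c:
        raise ValueError("group sizes must sum to the channel count C")
    # Cumulative offsets C_1, C_1 + C_2, ... mark where each group ends.
    boundaries = np.cumsum(group_sizes)[:-1]
    return np.split(features, boundaries, axis=2)

# Hypothetical sizes: H = 4, W = 6, C = 16, split into K = 3 uneven groups.
sample_features = np.zeros((4, 6, 16), dtype=np.float32)
groups = split_channel_groups(sample_features, [4, 4, 8])
group_shapes = [g.shape for g in groups]
```

The group sizes are passed explicitly because the text allows the channel counts of the groups to differ.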
  • After entropy coding the second group of sample channel features, the first and second groups of sample channel features are input into the probability estimation model to determine the entropy parameters of the third group of sample channel features; using those entropy parameters, entropy encoding is performed on the third group of sample channel features to obtain the code stream data of the third group of sample channel features.
  • The code stream data of the 1st to Kth groups of sample channel features together constitute the sample coded data of the sample image.
  • Using the initial preview network, image decoding can be performed on the sample coded data to obtain a predicted preview image corresponding to the sample image that has a smaller resolution yet better retains the original semantic information of the sample image.
  • the initial preview network here may be an untrained target preview network in an initialization state, that is, the initial preview network and the trained target preview network have the same network structure, but the network parameters may be different.
  • the network structure of the initial preview network can be set according to actual needs, which is not limited in this disclosure.
  • Using the initial preview network to perform image decoding on the sample coded data to obtain the predicted preview image corresponding to the sample image includes: performing entropy decoding on the sample coded data to sequentially obtain the 1st to Nth groups of sample channel features corresponding to the sample image, where N ≤ K and N and K are positive integers; and using the initial preview network to perform image decoding on the 1st to Nth groups of sample channel features to obtain the predicted preview image.
  • Entropy decoding is performed on the code stream data of the first N groups of sample channel features, that is, the N groups entropy encoded earliest among the K groups, and the initial preview network performs image decoding on the decoded first N groups. Because the original semantic information of the sample image is concentrated in the code stream data of the groups that are entropy coded earlier, a small-resolution predicted preview image that better retains the original semantic information of the sample image can be generated, which reduces the computational overhead of generating the predicted preview image and speeds up its generation.
  • N can be set according to actual conditions, which is not limited in the present disclosure.
  • Performing entropy decoding on the sample encoded data to sequentially obtain the 1st to Nth groups of sample channel features corresponding to the sample image includes: determining the entropy parameters of the 1st group of sample channel features, and using them to perform entropy decoding on the code stream data corresponding to the 1st group of sample channel features to obtain the 1st group of sample channel features; and, based on the decoded 1st to (j-1)th groups of sample channel features, determining the entropy parameters of the jth group of sample channel features, and using them to perform entropy decoding on the code stream data corresponding to the jth group of sample channel features to obtain the jth group of sample channel features, where N ≥ j ≥ 2 and j is an integer.
  • When the jth group is entropy decoded, the 1st to (j-1)th groups of sample channel features that have already been entropy decoded are used as context information, thereby improving the accuracy of decoding the 1st to Nth groups of sample channel features.
  • the process of sequentially performing entropy decoding on the channel features of the first group to the Nth group of samples is the inverse process of sequentially entropy encoding the channel features of the first group to the Nth group of samples.
  • After decoding the second group of sample channel features, the first and second groups of sample channel features are input into the probability estimation model to determine the entropy parameters of the third group of sample channel features; using those entropy parameters, entropy decoding is performed on the code stream data corresponding to the third group of sample channel features to obtain the third group of sample channel features.
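The ordering dependency between groups can be sketched with a toy codec. Everything here is an assumption made for illustration: `estimate_entropy_params` is a hypothetical stand-in for the probability estimation model, residual arrays stand in for real entropy-coded code streams, and all groups are assumed to have the same shape. The sketch only shows that group j is recovered from groups 1 to j-1, and that decoding can stop after the first N of K groups.

```python
import numpy as np

def estimate_entropy_params(context_groups, shape):
    """Hypothetical stand-in for the probability estimation model.

    Returns a predicted mean for the next group. The first group has no
    context, so its prediction is all zeros.
    """
    if not context_groups:
        return np.zeros(shape)
    # Toy predictor: the element-wise mean of all previously coded groups.
    return np.mean(np.stack(context_groups), axis=0)

def encode_groups(groups):
    """Code groups 1..K in order; each toy 'code stream' is a residual array."""
    streams = []
    for k, group in enumerate(groups):
        mean = estimate_entropy_params(groups[:k], group.shape)
        streams.append(group - mean)
    return streams

def decode_first_n(streams, n):
    """Recover only groups 1..n, reusing already decoded groups as context."""
    decoded = []
    for j in range(n):
        mean = estimate_entropy_params(decoded, streams[j].shape)
        decoded.append(streams[j] + mean)
    return decoded

rng = np.random.default_rng(0)
groups = [rng.normal(size=(2, 2, 3)) for _ in range(4)]   # K = 4 (hypothetical)
streams = encode_groups(groups)
partial = decode_first_n(streams, n=2)                     # N = 2
```

Because the decoder rebuilds the same context the encoder used, the code streams of groups N+1 to K are never touched when only a preview is needed.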
  • After the 1st to Nth groups of sample channel features are decoded, they can be input into the initial preview network, which directly generates the predicted preview image corresponding to the sample image.
  • A progressive decoding method may also be used to obtain N progressive predicted preview images.
  • The 1st progressive predicted preview image is generated when the 1st group of sample channel features has been decoded and the 2nd to Nth groups have not.
  • The 2nd progressive predicted preview image is generated when the 1st and 2nd groups of sample channel features have been decoded and the 3rd to Nth groups have not; and so on, until the Nth progressive predicted preview image is generated after the 1st to Nth groups of sample channel features have all been decoded.
  • When generating each progressive predicted preview image, having to use the probability estimation model in the target encoding network to determine the entropy parameters of all undecoded sample channel features slows the generation of each progressive predicted preview image and seriously affects the user experience.
  • The predicted preview image includes N progressive predicted preview images. Using the initial preview network to perform image decoding on the 1st to Nth groups of sample channel features to obtain the predicted preview image includes: when the 1st to mth groups of sample channel features have been decoded and the (m+1)th to Nth groups have not, filling the entropy parameters of the (m+1)th to Nth groups of sample channel features with zero values to obtain the filling entropy parameters of the (m+1)th to Nth groups, where N-1 ≥ m ≥ 1 and m is an integer; and inputting the 1st to mth groups of sample channel features and the filling entropy parameters of the (m+1)th to Nth groups into the initial preview network to obtain the mth progressive predicted preview image.
  • Filling the entropy parameters of the undecoded sample channel features with zero values shortens the time needed to determine those entropy parameters, thereby speeding up generation of each progressive predicted preview image, reducing the user's waiting time, and improving the user experience.
  • When the 1st group of sample channel features has been decoded and the 2nd to Nth groups have not, the entropy parameters of the 2nd to Nth groups of sample channel features are filled with zero values to obtain the filling entropy parameters of the 2nd to Nth groups; the 1st group of sample channel features and the filling entropy parameters of the 2nd to Nth groups are input into the initial preview network to obtain the 1st progressive predicted preview image.
  • When the 1st and 2nd groups of sample channel features have been decoded and the 3rd to Nth groups have not, the entropy parameters of the 3rd to Nth groups of sample channel features are filled with zero values to obtain the filling entropy parameters of the 3rd to Nth groups; the 1st group of sample channel features, the 2nd group of sample channel features, and the filling entropy parameters of the 3rd to Nth groups are input into the initial preview network to obtain the 2nd progressive predicted preview image.
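The zero-value filling for the mth progressive preview can be sketched as below. The text does not say how the decoded features and the filling entropy parameters are combined before entering the preview network, so channel-wise concatenation is an assumption here, as is the assumption that each placeholder has the same shape as the group it replaces. The name `build_preview_input` is hypothetical.

```python
import numpy as np

def build_preview_input(decoded_groups, group_shapes, n):
    """Assemble a preview-network input for the m-th progressive preview.

    decoded_groups holds groups 1..m. Groups m+1..n are not decoded yet, so
    zero-valued placeholders take their place instead of estimated parameters.
    """
    m = len(decoded_groups)
    if not 1 <= m <= n:
        raise ValueError("need between 1 and n decoded groups")
    fillers = [np.zeros(group_shapes[k]) for k in range(m, n)]
    # Assumed combination rule: concatenate along the channel axis.
    return np.concatenate(list(decoded_groups) + fillers, axis=2)

shapes = [(2, 2, 3)] * 3                      # N = 3 groups of 3 channels each
first_group = np.ones(shapes[0])
network_input = build_preview_input([first_group], shapes, n=3)   # m = 1
```

The input shape stays fixed for every m, so the same network can serve all N progressive steps without running the probability estimation model on the undecoded groups.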
  • the resolution of the predicted preview image generated by the initial preview network may be determined based on a preset resolution ratio. For example, if the preset resolution ratio is 1:8, then the resolution of the predicted preview image is 1/8 of the resolution of the original sample image.
  • the value of the preset resolution ratio can be set according to the actual situation, which is not limited in the present disclosure.
  • the predictive decoding image corresponding to the sample image can be determined based on the predicted preview image.
  • determining a predictive decoded image corresponding to the sample image based on the predictive preview image includes: performing up-sampling on the predictive preview image to obtain a predictive decoded image, wherein the predictive decoded image has the same resolution as the sample image.
  • the resolution of the prediction preview image is smaller. Therefore, the prediction preview image is up-sampled to obtain a prediction decoding image with the same resolution as the original sample image, so as to perform subsequent comparison training.
  • the predictive preview image may be up-sampled by using a preset up-sampling algorithm.
  • The preset upsampling algorithm may be bilinear interpolation, nearest-neighbor interpolation, or another algorithm capable of upsampling the predicted preview image, which is not limited in this embodiment of the present disclosure.
  • For example, with a preset resolution ratio of 1:8, three successive rounds of bilinear interpolation can be applied to the predicted preview image to obtain a predicted decoded image with the same resolution as the original sample image.
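The arithmetic behind the 1:8 example can be sketched as follows, assuming the ratio applies to each side and that each round doubles height and width, so three rounds give 2 × 2 × 2 = 8. For brevity the sketch uses nearest-neighbor interpolation, which the text lists as an alternative, rather than bilinear interpolation. Function names and the preview size are hypothetical.

```python
import numpy as np

def upsample_2x_nearest(image):
    """Double height and width by repeating each pixel (nearest neighbor)."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def restore_resolution(preview, rounds=3):
    """Apply `rounds` successive 2x upsamplings; 3 rounds give 8x per side."""
    image = preview
    for _ in range(rounds):
        image = upsample_2x_nearest(image)
    return image

# Hypothetical 2 x 3 preview restored to a 16 x 24 decoded image.
preview = np.arange(6, dtype=np.float32).reshape(2, 3)
restored = restore_resolution(preview)
```

Swapping `upsample_2x_nearest` for a bilinear routine would change the pixel values but not the resulting shape.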
  • Using the sample image and the predicted decoded image to perform network training on the initial preview network to obtain the target preview network includes: determining the distortion rate of the predicted decoded image relative to the sample image; and performing network training on the initial preview network based on the distortion rate to obtain the target preview network.
  • Once the distortion rate of the predicted decoded image relative to the sample image is determined, the initial preview network can be trained on it to optimize its network parameters, thereby obtaining the trained target preview network.
  • Reducing the distortion rate of the predicted decoded image relative to the sample image is taken as the optimization goal, which is pursued using stochastic gradient descent. The above training process is executed iteratively, and training stops when the distortion rate of the predicted decoded image relative to the sample image falls below a preset distortion rate threshold or the function corresponding to the distortion rate converges.
  • the value of the preset distortion rate threshold may be set according to actual conditions, which is not limited in the present disclosure.
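The training loop above can be sketched on a deliberately tiny problem. The assumptions are loud ones: the "preview network" is a single scalar gain, the distortion rate is taken to be mean squared error (the text does not name a metric), upsampling is a 2x nearest-neighbor repeat, and the data is synthetic. The sketch only shows the loop structure: predict, upsample, measure distortion, take a stochastic gradient step, and stop at a threshold.

```python
import numpy as np

def upsample_2x(image):
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def max_distortion(gain, features, images):
    """Largest MSE between any predicted decoded image and its sample image."""
    return max(np.mean((gain * upsample_2x(f) - x) ** 2)
               for f, x in zip(features, images))

def train_preview_gain(features, images, lr=1.0, threshold=1e-8,
                       max_steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    gain = 0.0  # the single trainable parameter of the toy preview network
    for _ in range(max_steps):
        # Stop once the distortion is below the preset threshold.
        if max_distortion(gain, features, images) < threshold:
            break
        i = rng.integers(len(images))             # stochastic: one random sample
        upsampled = upsample_2x(features[i])
        error = gain * upsampled - images[i]      # predicted decoded - sample
        grad = np.mean(2.0 * error * upsampled)   # d(MSE) / d(gain)
        gain -= lr * grad
    return gain, max_distortion(gain, features, images)

# Synthetic data: each 8 x 8 image is exactly 2x its upsampled 4 x 4 feature.
data_rng = np.random.default_rng(1)
bases = [data_rng.uniform(0.5, 1.0, (4, 4)) for _ in range(3)]
images = [upsample_2x(b) for b in bases]
features = [0.5 * b for b in bases]
gain, distortion = train_preview_gain(features, images)
```

On this synthetic data the distortion is zero at a gain of 2, so the threshold criterion is reached; a real preview network would have many parameters and would typically rely on an automatic-differentiation framework for the gradient.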
  • the network parameters of the initial preview network can be adjusted after each progressive predictive preview image is generated, so that the next progressive predictive preview image can be generated by using the initial preview network after adjusting the network parameters.
  • After the target preview network corresponding to the target encoding network is trained, it can be applied to image preview scenarios: from the target encoded data obtained after the target encoding network performs image encoding on the target image, it directly generates a target preview image that preserves the original semantic information of the target image.
  • the target encoded data is obtained by performing entropy encoding on K sets of target channel features corresponding to the target image in sequence, and the target encoded data includes code stream data corresponding to each set of target channel features.
  • The process by which the target coding network performs image coding on the target image to obtain the target coded data is similar to the above process by which it performs image coding on the sample image to obtain the sample coded data, and can be realized by referring to the method of obtaining the sample coded data.
  • Using the target preview network corresponding to the target coding network to decode the target coded data to obtain the target preview image corresponding to the target image includes: performing entropy decoding on the target coded data to sequentially obtain the 1st to Nth groups of target channel features corresponding to the target image, where N ≤ K and N and K are positive integers; and using the target preview network to perform image decoding on the 1st to Nth groups of target channel features to obtain the target preview image.
  • Entropy decoding is performed on the code stream data of the first N groups of target channel features, that is, the N groups entropy encoded earliest among the K groups, and the target preview network performs image decoding on the decoded first N groups. Because the semantic information of the target image is concentrated in the code stream data of the groups that are entropy encoded earlier, a small-resolution target preview image that better retains the original semantic information of the target image can be generated, which reduces the computational overhead of generating the target preview image and speeds up its generation.
  • the value of N is the same as the value of N set during the training process of obtaining the target preview network by using the partial channel feature image decoding method.
  • The process of performing entropy decoding on the target coded data to sequentially obtain the 1st to Nth groups of target channel features corresponding to the target image is similar to the above process of performing entropy decoding on the sample coded data to sequentially obtain the 1st to Nth groups of sample channel features corresponding to the sample image, and can be realized by referring to the method of obtaining the sample channel features.
  • The decoded 1st to Nth groups of target channel features are input into the target preview network, which directly generates the target preview image corresponding to the target image.
  • the target preview network when the target preview network is trained by combining partial channel feature image decoding and progressive decoding, the first to Nth groups of target channel features will be obtained by decoding, and then input into the target preview network in turn, To obtain N progressive target preview images corresponding to the target image.
  • The target preview image includes N progressive target preview images. Using the target preview network to perform image decoding on the 1st to Nth groups of target channel features to obtain the target preview image includes: when the 1st to ith groups of target channel features have been decoded and the (i+1)th to Nth groups have not, filling the entropy parameters of the (i+1)th to Nth groups of target channel features with zero values to obtain the filling entropy parameters of the (i+1)th to Nth groups, where N-1 ≥ i ≥ 1 and i is an integer; and inputting the 1st to ith groups of target channel features and the filling entropy parameters of the (i+1)th to Nth groups into the target preview network to obtain the ith progressive target preview image.
  • Filling the entropy parameters of the undecoded target channel features with zero values shortens the time needed to determine those entropy parameters, thereby speeding up generation of each progressive target preview image, reducing the user's waiting time, and improving the user experience.
  • the process of using the target preview network to generate N progressive target preview images is similar to the above-mentioned method of using the initial preview network to generate N progressive predictive preview images, and can be realized by referring to the method of generating progressive predictive preview images.
  • the image preview method further includes: upsampling the target preview image to obtain a target decoded image corresponding to the target image, wherein the target decoded image has the same resolution as the target image.
  • the target preview image with a smaller resolution is up-sampled to obtain a target decoded image that restores the resolution of the target image, thereby satisfying the user's requirement for viewing a clear image with a larger resolution.
  • The present disclosure also provides an image preview device, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any image preview method provided in the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding records in the method section.
  • An image preview device corresponding to the image preview method is also provided in the embodiments of the present disclosure. Since the problem-solving principle of the device is similar to that of the above image preview method, the implementation of the device may refer to the implementation of the method.
  • FIG. 2 is a block diagram of an image preview device provided by an embodiment of the present disclosure. As shown in FIG. 2 , the device 20 includes:
  • the acquiring part 21 is configured to acquire target encoded data, wherein the target encoded data is obtained after the target encoding network performs image encoding on the target image;
  • the image preview part 22 is configured to use the target preview network corresponding to the target encoding network to perform image decoding on the target coded data to obtain a target preview image corresponding to the target image, wherein the resolution of the target preview image is smaller than that of the target image.
  • The target coded data is obtained by performing entropy coding on K groups of target channel features corresponding to the target image in sequence. The image preview part 22 includes: a first entropy decoding part configured to perform entropy decoding on the target coded data to sequentially obtain the 1st to Nth groups of target channel features corresponding to the target image, where N ≤ K and N and K are positive integers; and a first image preview part configured to use the target preview network to perform image decoding on the 1st to Nth groups of target channel features to obtain the target preview image.
  • The target preview image includes N progressive target preview images. The first image preview part is configured to: when the 1st to ith groups of target channel features have been decoded and the (i+1)th to Nth groups have not, fill the entropy parameters of the (i+1)th to Nth groups of target channel features with zero values to obtain the filling entropy parameters of the (i+1)th to Nth groups, where N-1 ≥ i ≥ 1 and i is an integer; and input the 1st to ith groups of target channel features and the filling entropy parameters of the (i+1)th to Nth groups into the target preview network to obtain the ith progressive target preview image.
  • the device 20 further includes: an upsampling part configured to upsample the target preview image to obtain a target decoded image corresponding to the target image, wherein the target decoded image has the same resolution as the target image.
  • the device 20 further includes:
  • the encoding part is configured to, before the target encoded data is decoded using the target preview network corresponding to the target encoding network, use the target encoding network to perform image encoding on a sample image to obtain sample encoded data of the sample image;
  • the decoding part is configured to use the initial preview network to perform image decoding on the sample encoded data to obtain a predicted preview image corresponding to the sample image, wherein the resolution of the predicted preview image is smaller than the resolution of the sample image;
  • the determining part is configured to determine, based on the predicted preview image, the predicted decoded image corresponding to the sample image;
  • the training part is configured to use the sample image and the predicted decoded image to perform network training on the initial preview network to obtain the target preview network.
  • the sample coded data is obtained by sequentially entropy coding K groups of sample channel features corresponding to the sample image;
  • the decoding part includes: a second entropy decoding part configured to perform entropy decoding on the sample encoded data to obtain, in sequence, the 1st to N-th groups of sample channel features corresponding to the sample image, where N < K, and N and K are positive integers; and a second image preview part configured to use the initial preview network to perform image decoding on the 1st to N-th groups of sample channel features to obtain the predicted preview image.
  • the sample encoded data includes bitstream data corresponding to each group of sample channel features; the second entropy decoding part is configured to: determine the entropy parameters of the 1st group of sample channel features, and use them to perform entropy decoding on the bitstream data corresponding to the 1st group, to obtain the 1st group of sample channel features; and, based on the decoded 1st to (j-1)-th groups of sample channel features, determine the entropy parameters of the j-th group of sample channel features, and use them to perform entropy decoding on the bitstream data corresponding to the j-th group, to obtain the j-th group of sample channel features, where N ≥ j ≥ 2 and j is an integer.
  • the predicted preview image includes N progressive predicted preview images; the second image preview part is configured to: when the 1st to m-th groups of sample channel features have been decoded and the (m+1)-th to N-th groups have not, fill the entropy parameters of the (m+1)-th to N-th groups of sample channel features with zero values to obtain padded entropy parameters of the (m+1)-th to N-th groups, where N-1 ≥ m ≥ 1; and input the 1st to m-th groups of sample channel features, together with the padded entropy parameters of the (m+1)-th to N-th groups, into the initial preview network to obtain the m-th progressive predicted preview image.
  • the determining part is configured to: perform up-sampling on the predictive preview image to obtain a predictive decoded image, wherein the predictive decoded image has the same resolution as the sample image.
  • the training part is configured to: determine the distortion rate of the predicted decoded image relative to the sample image; based on the distortion rate, perform network training on the initial preview network to obtain the target preview network.
  • the acquisition part 21 may be a data reading device
  • the image preview part 22 may be a graphics processor
  • This method has a specific technical association with the internal structure of the computer system, and can solve the technical problem of how to improve hardware computing efficiency or execution effect (including reducing the amount of data stored, reducing the amount of data transmitted, increasing hardware processing speed, etc.), thereby obtaining the technical effect of improving the internal performance of the computer system in conformity with the laws of nature.
  • the functions or parts included in the apparatus provided by the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments, and the implementation can refer to the descriptions of the above method embodiments.
  • a "part" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a unit, and it may be modular or non-modular.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor.
  • Computer readable storage media may be volatile or nonvolatile computer readable storage media.
  • An embodiment of the present disclosure also proposes an electronic device, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • An embodiment of the present disclosure also provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • An embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, the processor of the electronic device executes some or all of the steps of the above method.
  • Electronic devices may be provided as terminals, servers, or other forms of devices.
  • Fig. 3 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
  • the electronic device 300 may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, computing equipment, vehicle equipment, wearable equipment or other terminal equipment.
  • electronic device 300 may include one or more of the following components: processing component 302, memory 304, power supply component 306, multimedia component 308, audio component 310, input/output (I/O) interface 312, sensor component 314, and communication component 316 .
  • the processing component 302 generally controls the overall operations of the electronic device 300, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • the processing component 302 may include one or more processors 320 to execute instructions to complete all or part of the steps of the above method. Additionally, processing component 302 may include one or more components that facilitate interaction between processing component 302 and other components. For example, processing component 302 may include a multimedia portion to facilitate interaction between multimedia component 308 and processing component 302 .
  • the memory 304 is configured to store various types of data to support operations at the electronic device 300 . Examples of such data include instructions for any application or method operating on the electronic device 300, contact data, phonebook data, messages, pictures, videos, and the like.
  • Memory 304 can be realized by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random-Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
  • the power supply component 306 provides power to various components of the electronic device 300 .
  • Power components 306 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for electronic device 300 .
  • the multimedia component 308 includes a screen providing an output interface between the electronic device 300 and the user.
  • the screen may include a liquid crystal display (Liquid Crystal Display, LCD) and a touch panel (Touch Panel, TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor can sense the boundary of a touch or slide action, and can also detect the duration and pressure associated with the touch or slide action.
  • the multimedia component 308 includes at least one of a front camera and a rear camera. When the electronic device 300 is in an operation mode, such as a shooting mode or a video mode, at least one of the front camera and the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capability.
  • the audio component 310 is configured to at least one of outputting an audio signal, inputting an audio signal, and the like.
  • the audio component 310 includes a microphone (Microphone, MIC), which is configured to receive an external audio signal when the electronic device 300 is in an operation mode, such as a calling mode, a recording mode and a voice recognition mode. Received audio signals may be stored in memory 304 or sent via communication component 316 .
  • the audio component 310 also includes a speaker configured to output audio signals.
  • the input/output interface 312 provides an interface between the processing component 302 and peripheral interface parts, such as a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, volume buttons, start button, and lock button.
  • Sensor assembly 314 includes one or more sensors configured to provide various aspects of status assessment for electronic device 300 .
  • the sensor assembly 314 can detect the open/closed state of the electronic device 300 and the relative positioning of components, such as the display and the keypad of the electronic device 300; the sensor assembly 314 can also detect a change in position of the electronic device 300 or of one of its components, the presence or absence of user contact with the electronic device 300, the orientation or acceleration/deceleration of the electronic device 300, and temperature changes of the electronic device 300.
  • the sensor assembly 314 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 314 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge-coupled Device (CCD) image sensor, configured for use in imaging applications.
  • the sensor component 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 316 is configured to facilitate wired or wireless communication between the electronic device 300 and other devices.
  • the electronic device 300 can access a wireless network based on a communication standard, such as Wi-Fi, second-generation mobile communication technology (2G), third-generation mobile communication technology (3G), fourth-generation mobile communication technology (4G), Long Term Evolution (LTE), fifth-generation mobile communication technology (5G), or a combination thereof.
  • the communication component 316 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 316 also includes a near field communication (Near Field Communication, NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the electronic device 300 may be implemented by one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, to execute the above method.
  • a non-volatile computer-readable storage medium is also provided, such as the memory 304 including computer program instructions, which can be executed by the processor 320 of the electronic device 300 to implement the above method.
  • This disclosure relates to the field of augmented reality.
  • By acquiring image information of a target object in the real environment, and then using various vision-related algorithms to detect or identify the relevant features, states and attributes of the target object, an AR effect combining the virtual and the real that matches the specific application is obtained.
  • The target object may involve faces, limbs, gestures, actions, etc. related to the human body, or markers and signs related to objects, or sand tables, display areas or display items related to venues or places.
  • Vision-related algorithms can involve visual positioning, Simultaneous Localization And Mapping (SLAM), 3D reconstruction, image registration, background segmentation, object key point extraction and tracking, object pose or depth detection, etc.
  • Specific applications may involve not only interactive scenarios related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and virtual effect overlay and display, but also special-effects processing related to people, such as makeup beautification, body beautification, special-effect display and virtual model display.
  • the relevant features, states and attributes of the target object can be detected or identified through the convolutional neural network.
  • the above-mentioned convolutional neural network is a network model obtained by performing model training based on a deep learning framework.
  • Fig. 4 is a block diagram of an electronic device provided by an embodiment of the present disclosure, and the electronic device 400 may be provided as a server or a terminal device.
  • electronic device 400 includes processing component 422 , which may include one or more processors, and a memory resource represented by memory 432 configured to store instructions executable by processing component 422 , such as application programs.
  • An application program stored in memory 432 may include one or more portions each corresponding to a set of instructions.
  • the processing component 422 is configured to execute instructions to perform the above method.
  • the electronic device 400 may also include a power component 426 configured to perform power management of the electronic device 400 , a wired or wireless network interface 450 configured to connect the electronic device 400 to a network, and an input/output interface 458 .
  • the electronic device 400 can operate based on an operating system stored in the memory 432 .
  • a non-volatile computer-readable storage medium is also provided, such as the memory 432 including computer program instructions, which can be executed by the processing component 422 of the electronic device 400 to implement the above method.
  • the present disclosure may be at least one of systems, methods, computer programs and products thereof.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present disclosure.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Examples of computer-readable storage media include: portable computer discs, hard drives, Random Access Memory (RAM), ROM, Erasable Programmable Read-Only Memory (EPROM or flash memory), SRAM, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to each computing/processing device, or downloaded to an external computer or external storage device through a network, such as at least one of the Internet, a local area network, a wide area network, or a wireless network.
  • the network may include at least one of copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or it may be connected to an external computer (for example, via the Internet using an Internet service provider).
  • electronic circuits, such as programmable logic circuits, FPGAs, or programmable logic arrays (PLAs), can be customized by utilizing state information of the computer-readable program instructions; such electronic circuits can execute the computer-readable program instructions, thereby implementing various aspects of the present disclosure.
  • These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine such that when executed by the processor of the computer or other programmable data processing apparatus , producing an apparatus for realizing the functions/actions specified in one or more blocks in the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium; the instructions cause computers, programmable data processing apparatus and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a part, a program segment, or a portion of instructions, which includes one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations can be implemented by a dedicated hardware-based system that performs the specified function or action , or may be implemented by a combination of dedicated hardware and computer instructions.
  • the computer program product can be realized by hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in other embodiments, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • If the disclosed technical solution involves personal information, products applying the disclosed technical solution clearly inform individuals of the personal information processing rules and obtain their voluntary consent before processing personal information.
  • If the disclosed technical solution involves sensitive personal information, products applying the disclosed technical solution obtain the individual's separate consent before processing sensitive personal information, and also meet the requirement of "express consent". For example, at a personal information collection device such as a camera, a clear and prominent sign is set up to inform people that they have entered the scope of personal information collection and that personal information will be collected.
  • the personal information processing rules may include information such as the personal information processor, the purpose of personal information processing, the method of processing, and the types of personal information processed.
  • Embodiments of the present disclosure provide an image preview method, device, electronic equipment, storage medium, and computer program product, wherein the image preview method includes: acquiring target encoded data, wherein the target encoded data is obtained after a target encoding network performs image encoding on a target image; and using the target preview network corresponding to the target encoding network to perform image decoding on the target encoded data to obtain a target preview image corresponding to the target image, wherein the resolution of the target preview image is smaller than the resolution of the target image.
  • the above solution can not only reduce the calculation cost of the preview image, increase the generation speed of the target preview image, but also meet the user's preview requirements.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

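Embodiments of the present disclosure provide an image preview method, apparatus, electronic device, storage medium, computer program and product thereof. The method includes: acquiring target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image; and performing image decoding on the target encoded data using a target preview network corresponding to the target encoding network, to obtain a target preview image corresponding to the target image, where the resolution of the target preview image is smaller than the resolution of the target image.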

Description

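Image preview method, apparatus, electronic device, storage medium, computer program and product thereof
Cross-Reference to Related Applications
The embodiments of the present disclosure are based on, and claim priority to, the Chinese patent application with application number 202210210395.2, applicant Beijing SenseTime Technology Development Co., Ltd., filing date March 4, 2022, and title "Image preview method and apparatus, electronic device and storage medium", the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to an image preview method, apparatus, electronic device, storage medium, computer program and product thereof.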
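Background
In recent years, end-to-end image coding based on deep learning has developed rapidly and in depth. Some recently proposed techniques already match or exceed traditional image coding techniques, such as JPEG, BPG and VVC, in coding speed and compression rate. End-to-end image coding has now entered the standardization stage, and given its superior performance it can be expected to be widely used in the future. When image coding is used in practice, there is a need to preview images from data that has already been encoded.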
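Summary
Embodiments of the present disclosure provide an image preview method, apparatus, electronic device, storage medium, computer program and product thereof.
An embodiment of the present disclosure provides an image preview method, including: acquiring target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image; and performing image decoding on the target encoded data using a target preview network corresponding to the target encoding network, to obtain a target preview image corresponding to the target image, where the resolution of the target preview image is smaller than the resolution of the target image.
An embodiment of the present disclosure provides an image preview apparatus, including: an acquiring part configured to acquire target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image; and an image preview part configured to perform image decoding on the target encoded data using a target preview network corresponding to the target encoding network, to obtain a target preview image corresponding to the target image, where the resolution of the target preview image is smaller than the resolution of the target image.
An embodiment of the present disclosure provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to invoke the instructions stored in the memory to execute the above method.
An embodiment of the present disclosure provides a computer-readable storage medium on which computer program instructions are stored; the above method is implemented when the computer program instructions are executed by a processor.
An embodiment of the present disclosure provides a computer program product, including a non-transitory computer-readable storage medium storing a computer program; when the computer program is read and executed by a computer, some or all of the steps of the above method are implemented.
An embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor of the electronic device executes some or all of the steps of the above method.
In the embodiments of the present disclosure, target encoded data obtained after a target encoding network performs image encoding on a target image is acquired, and a target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, yielding a target preview image whose resolution is smaller than that of the corresponding target image. In this way, a target preview image that has a small resolution yet preserves the semantic information of the target image well can be generated directly from the encoded data, which reduces the computational cost of previewing, speeds up generation of the target preview image, and still meets the user's preview needs.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.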
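Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are described below.
The drawings here are incorporated into and constitute a part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 is a schematic flowchart of an image preview method provided by an embodiment of the present disclosure;
FIG. 2 is a block diagram of an image preview apparatus provided by an embodiment of the present disclosure;
FIG. 3 is a block diagram of an electronic device provided by an embodiment of the present disclosure;
FIG. 4 is a block diagram of an electronic device provided by an embodiment of the present disclosure.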
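Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure are described in detail below with reference to the drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as superior to or better than other embodiments.
The term "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one" in this document means any one of multiple items, or any combination of at least two of them; for example, including at least one of A, B and C may mean including any one or more elements selected from the set consisting of A, B and C.
In addition, numerous specific details are given in the following detailed description to better explain the present disclosure. Those skilled in the art should understand that the present disclosure can be practiced without some of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.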
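In recent years, end-to-end image coding based on deep learning has developed rapidly and in depth. Some recently proposed techniques already match or exceed traditional image coding techniques, such as JPEG, BPG and VVC, in coding speed and compression rate. End-to-end image coding has now entered the standardization stage, and given its superior performance it can be expected to be widely used in the future. When image coding is used in practice, there is a need to preview images from data that has already been encoded.
Generally, previewing an image does not require fully decoding it at its original resolution; it is enough to view a corresponding low-definition, small-resolution preview image. However, existing end-to-end image coding techniques have no dedicated method for generating preview images. The encoded data must first be fully decoded into an image at the original resolution, and that image must then be down-sampled to obtain the small-resolution preview image that is returned to the user. Because the decoding network used for full decoding generally consumes considerable power, previewing an ultra-high-definition original image with a very large resolution and pixel count incurs a large computational cost, wastes computing resources, and generates the preview image slowly, resulting in a poor user experience.
The image preview method provided by the embodiments of the present disclosure can be applied to the above end-to-end image coding techniques: for the target encoding network of an end-to-end image coding technique, a target preview network capable of directly generating small-resolution preview images is trained. Target encoded data obtained after the target encoding network performs image encoding on a target image is acquired, and the target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, yielding a target preview image whose resolution is smaller than that of the corresponding target image. In this way, a target preview image that has a small resolution yet preserves the semantic information of the target image well can be generated directly from the encoded data, which reduces the computational cost of previewing, speeds up generation of the target preview image, and still meets the user's preview needs.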
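The image preview method provided by the embodiments of the present disclosure is described in detail below.
FIG. 1 is a schematic flowchart of an image preview method provided by an embodiment of the present disclosure. The image preview method may be executed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The image preview method may be implemented by a processor invoking computer-readable instructions stored in a memory. As shown in FIG. 1, the image preview method may include:
Step S11: acquire target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image.
After the target encoded data has been obtained by encoding the target image with the target encoding network, if the target image needs to be previewed, the target encoded data can be acquired so that image decoding can subsequently be performed on it.
The target encoding network here may be an encoding network trained on the basis of end-to-end image coding techniques. For its network structure and training process, reference may be made to end-to-end image encoding networks in the related art; the present disclosure does not limit this.
Step S12: use the target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, to obtain a target preview image corresponding to the target image, where the resolution of the target preview image is smaller than the resolution of the target image.
By using the target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, a target preview image that has a small resolution yet preserves the semantic information of the target image well can be generated directly. How the target preview network is trained, and how it is used to decode the target encoded data into the target preview image, is described in detail later in connection with some implementations of the present disclosure.
In the embodiments of the present disclosure, target encoded data obtained after the target encoding network performs image encoding on the target image is acquired, and the target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, yielding a target preview image whose resolution is smaller than that of the corresponding target image. In this way, a target preview image that has a small resolution yet preserves the semantic information of the target image well can be generated directly from the encoded data, which reduces the computational cost of previewing, speeds up generation of the target preview image, and still meets the user's preview needs.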
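To be able to preview the encoded data produced by the target encoding network, a corresponding target preview network is trained for the target encoding network in advance.
In some implementations, before the target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, the image preview method further includes: performing image encoding on a sample image using the target encoding network, to obtain sample encoded data of the sample image; performing image decoding on the sample encoded data using an initial preview network, to obtain a predicted preview image corresponding to the sample image, where the resolution of the predicted preview image is smaller than the resolution of the sample image; determining, based on the predicted preview image, a predicted decoded image corresponding to the sample image; and performing network training on the initial preview network using the sample image and the predicted decoded image, to obtain the target preview network.
In other words, the sample image is encoded by the target encoding network into sample encoded data; the initial preview network decodes the sample encoded data into a small-resolution predicted preview image; a predicted decoded image corresponding to the sample image is determined from the predicted preview image; and the initial preview network is trained with the sample image and the predicted decoded image. The resulting trained target preview network generates, directly from the encoded data of the target encoding network, preview images that have a small resolution yet preserve the original semantic information of the image well, which reduces the computational cost of previewing, speeds up preview generation, and still meets the user's preview needs.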
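The training signal described above can be illustrated with a minimal sketch. It assumes, following the device embodiment of this document, that the predicted decoded image is obtained by up-sampling the predicted preview image to the sample resolution and that training minimises a distortion between the two images. Nearest-neighbour up-sampling, mean squared error, the single-channel list-of-rows image layout and all function names are assumptions made for illustration only, not details stated in the disclosure.

```python
from typing import List

Image = List[List[float]]  # single-channel image stored as rows of pixel values


def upsample_nearest(preview: Image, factor: int) -> Image:
    """Repeat every pixel `factor` times along both axes (assumed up-sampling)."""
    out: Image = []
    for row in preview:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def mse(sample: Image, predicted: Image) -> float:
    """Mean squared error between two images of identical resolution (assumed distortion)."""
    if len(sample) != len(predicted) or any(
        len(a) != len(b) for a, b in zip(sample, predicted)
    ):
        raise ValueError("images must have the same resolution")
    total = sum(
        (a - b) ** 2 for ra, rb in zip(sample, predicted) for a, b in zip(ra, rb)
    )
    return total / sum(len(r) for r in sample)


# Toy data: a 4x4 sample image and a 2x2 stand-in for the initial preview network's output.
sample_image = [
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 2.0, 4.0, 4.0],
    [8.0, 8.0, 6.0, 6.0],
    [8.0, 8.0, 6.0, 6.0],
]
predicted_preview = [[0.0, 4.0], [8.0, 6.0]]

predicted_decoded = upsample_nearest(predicted_preview, 2)  # same resolution as the sample
loss = mse(sample_image, predicted_decoded)                 # quantity a trainer would minimise
```

In an actual training loop this loss would be back-propagated through the preview network by a deep learning framework; the sketch only shows how the sample image and the predicted decoded image are compared.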
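To carry out network training, a training sample set is constructed in advance; the training sample set may include sample images of multiple resolutions. The number of sample images in the training sample set and the resolution of each sample image can be set according to actual conditions; the present disclosure does not limit this.
During training, at least one sample image is randomly selected from the training sample set and input into the target encoding network; after image encoding by the target encoding network, the sample encoded data of the sample image is output.
In some implementations, the target encoding network is an encoding network based on a channel-grouped entropy coding algorithm; the sample encoded data is obtained by sequentially entropy encoding K groups of sample channel features corresponding to the sample image, and includes the bitstream data corresponding to each group of sample channel features.
When the target encoding network is based on a channel-grouped entropy coding algorithm, the K groups of sample channel features corresponding to the sample image can be entropy encoded in sequence, making full use of the structural redundancy between different channel groups and improving the compression rate of the encoded data. The value of K can be determined according to actual conditions; the present disclosure does not limit this.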
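In some embodiments, the sample image is first encoded into an H×W×C sample feature tensor (its symbol appears in the original only as formula image PCTCN2022110220-appb-000001), where H and W are the height and width of the spatial dimensions and C is the number of feature channels in the channel dimension. For the process of encoding the sample image into the sample feature tensor, reference may be made to image encoding in the related art; the present disclosure does not limit this.
In the target encoding network based on the channel-grouped entropy coding algorithm, the sample feature tensor (formula image PCTCN2022110220-appb-000002) is grouped along the channel dimension to obtain K groups of sample channel features (formula image PCTCN2022110220-appb-000003). The numbers of channels in the groups are $C_1, C_2, \ldots, C_K$, where $C_1 + C_2 + \cdots + C_K = C$. The values of $C_1, C_2, \ldots, C_K$ may be the same or different; the present disclosure does not limit this.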
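The channel grouping can be sketched as follows. The sketch stores the feature tensor channel-first as a plain list of C feature maps purely for convenience (the text describes an H×W×C tensor); the function name `split_channels` and the toy sizes are hypothetical.

```python
from typing import List, Sequence


def split_channels(channels: Sequence, group_sizes: Sequence[int]) -> List[list]:
    """Split C feature channels into K consecutive groups with sizes C_1..C_K."""
    if any(size <= 0 for size in group_sizes):
        raise ValueError("every group must contain at least one channel")
    if sum(group_sizes) != len(channels):
        raise ValueError("group sizes must sum to the number of channels C")
    groups, start = [], 0
    for size in group_sizes:
        groups.append(list(channels[start:start + size]))
        start += size
    return groups


# Toy tensor: C = 8 channels, each a 2x2 feature map filled with its channel index.
H, W, C = 2, 2, 8
features = [[[float(c)] * W for _ in range(H)] for c in range(C)]

# K = 4 groups with unequal sizes, which the text explicitly allows.
groups = split_channels(features, [1, 1, 2, 4])
```

Unequal group sizes such as `[1, 1, 2, 4]` are used here only to show that $C_1, \ldots, C_K$ need not be equal; the disclosure does not prescribe any particular split.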
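When the K groups of sample channel features are entropy-encoded in sequence, for any i-th group other than the 1st group, the 1st to (i-1)-th groups that have already been entropy-encoded serve as context information, so that the structural redundancy between channel groups is fully exploited.
The probability estimation model in the target encoding network is used to determine the entropy parameters of the 1st group of sample channel features, and these entropy parameters are used to entropy-encode the 1st group, obtaining the bitstream data of the 1st group.
After the 1st group has been entropy-encoded, the 1st group of sample channel features is input into the probability estimation model to determine the entropy parameters of the 2nd group, and these entropy parameters are used to entropy-encode the 2nd group, obtaining the bitstream data of the 2nd group.
After the 2nd group has been entropy-encoded, the 1st and 2nd groups of sample channel features are input into the probability estimation model to determine the entropy parameters of the 3rd group, and these entropy parameters are used to entropy-encode the 3rd group, obtaining the bitstream data of the 3rd group.
The process continues in the same way until the K-th group of sample channel features has been entropy-encoded. The bitstream data of the 1st to K-th groups together constitute the sample encoded data of the sample image.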
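The control flow of the sequential, context-conditioned encoding can be sketched as follows. This is only an illustration under loud assumptions: `estimate_params` is a toy stand-in for the probability estimation model (it predicts one integer mean from all previously coded values), and `entropy_encode` is a toy stand-in for a real entropy coder (it stores residuals instead of producing an arithmetic-coded bitstream). Neither reflects the actual networks of the disclosure.

```python
def estimate_params(context_groups):
    """Toy probability model: one integer mean predicted from the context.

    Returns 0 when there is no context (i.e. for the 1st group).
    """
    values = [v for group in context_groups for v in group]
    return round(sum(values) / len(values)) if values else 0


def entropy_encode(group, mean):
    """Toy entropy coder: represent each value by its residual to the mean."""
    return [v - mean for v in group]


def encode_groups(groups):
    """Encode K groups in order; group i is conditioned on groups 1..i-1."""
    bitstreams = []
    for i, group in enumerate(groups):
        mean = estimate_params(groups[:i])  # context = already-coded groups
        bitstreams.append(entropy_encode(group, mean))
    return bitstreams
```

The point of the sketch is the dependency structure: the parameters for each group depend only on groups coded before it, which is what later allows a decoder to stop after the first N groups.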
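The initial preview network can perform image decoding on the sample encoded data and directly generate a predicted preview image corresponding to the sample image that has a lower resolution yet preserves the original semantic information of the sample image well.
The initial preview network here may be the target preview network in its initialized, untrained state; that is, the initial preview network and the trained target preview network have the same network structure, while their network parameters may differ. The network structure of the initial preview network may be set according to actual needs, which the present disclosure does not limit.
In some implementations, performing image decoding on the sample encoded data with the initial preview network to obtain the predicted preview image corresponding to the sample image includes: entropy-decoding the sample encoded data to obtain, in sequence, the 1st to N-th groups of sample channel features corresponding to the sample image, where N<K and N and K are positive integers; and performing image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image.
The bitstream data of the N groups of sample channel features that were entropy-encoded first among the K groups is entropy-decoded, and the initial preview network performs image decoding on the resulting first N groups. Because the original semantic information of the sample image is concentrated in the bitstream data of the groups that were entropy-encoded first, this produces a low-resolution predicted preview image that preserves the original semantic information of the sample image well, reduces the computational cost of generating the predicted preview image, and speeds up its generation.
The value of N may be set according to the actual situation, which the present disclosure does not limit.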
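In some implementations, entropy-decoding the sample encoded data to obtain the 1st to N-th groups of sample channel features corresponding to the sample image includes: determining the entropy parameters of the 1st group of sample channel features, and using them to entropy-decode the bitstream data corresponding to the 1st group, obtaining the 1st group of sample channel features; and determining, based on the decoded 1st to (j-1)-th groups of sample channel features, the entropy parameters of the j-th group, and using them to entropy-decode the bitstream data corresponding to the j-th group, obtaining the j-th group of sample channel features, where N≥j≥2 and j is an integer.
When the 1st to N-th groups of sample channel features are entropy-decoded in sequence, for any j-th group other than the 1st group, the 1st to (j-1)-th groups that have already been entropy-decoded serve as context information, which improves the accuracy of the decoded 1st to N-th groups.
The sequential entropy decoding of the 1st to N-th groups of sample channel features is the inverse of the sequential entropy encoding of those groups described above.
The probability estimation model in the target encoding network is used to determine the entropy parameters of the 1st group of sample channel features, and these entropy parameters are used to entropy-decode the bitstream data corresponding to the 1st group, obtaining the 1st group of sample channel features.
After the 1st group has been decoded, the decoded 1st group of sample channel features is input into the probability estimation model to determine the entropy parameters of the 2nd group, and these entropy parameters are used to entropy-decode the bitstream data corresponding to the 2nd group, obtaining the 2nd group of sample channel features.
After the 2nd group has been decoded, the decoded 1st and 2nd groups of sample channel features are input into the probability estimation model to determine the entropy parameters of the 3rd group, and these entropy parameters are used to entropy-decode the bitstream data corresponding to the 3rd group, obtaining the 3rd group of sample channel features.
The process continues in the same way until the N-th group of sample channel features has been decoded.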
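The partial decoding of only the first N of K groups can be sketched in the same toy setting. As before, `estimate_params` and `entropy_decode` are hypothetical stand-ins (a mean predictor and a residual decoder), not the probability estimation model or entropy coder of the disclosure; the sketch only shows that group j needs nothing beyond groups 1 to j-1.

```python
def estimate_params(context_groups):
    """Toy probability model: one integer mean predicted from the context."""
    values = [v for group in context_groups for v in group]
    return round(sum(values) / len(values)) if values else 0


def entropy_decode(residuals, mean):
    """Toy inverse of a residual coder: add the predicted mean back."""
    return [r + mean for r in residuals]


def decode_first_n(bitstreams, n):
    """Decode groups 1..N from the K per-group bitstreams, with N < K."""
    if not 1 <= n < len(bitstreams):
        raise ValueError("N must satisfy 1 <= N < K")
    decoded = []
    for j in range(n):
        mean = estimate_params(decoded)  # context = groups decoded so far
        decoded.append(entropy_decode(bitstreams[j], mean))
    return decoded
```

Because the loop never touches `bitstreams[n:]`, the remaining K-N groups cost nothing at preview time, which is the source of the speed-up the text describes.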
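In some embodiments, after the 1st to N-th groups of sample channel features have been decoded, they may be input into the initial preview network to obtain the predicted preview image corresponding to the sample image that the initial preview network generates directly.
In some embodiments, while the 1st to N-th groups of sample channel features are being decoded in sequence, a progressive decoding approach may also be used to obtain N progressive predicted preview images.
For example, when the 1st group of sample channel features has been decoded and the 2nd to N-th groups have not, the 1st progressive predicted preview image is generated; when the 1st and 2nd groups have been decoded and the 3rd to N-th groups have not, the 2nd progressive predicted preview image is generated; and so on, until the N-th progressive predicted preview image is generated after the 1st to N-th groups have all been decoded.
In the related art, generating each progressive predicted preview image requires using the probability estimation model in the target encoding network to determine the entropy parameters of all sample channel features not yet decoded, so each progressive predicted preview image is generated slowly, which seriously degrades the user experience.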
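In some implementations, the predicted preview image includes N progressive predicted preview images, and performing image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image includes: when the 1st to m-th groups of sample channel features have been decoded and the (m+1)-th to N-th groups have not, zero-padding the entropy parameters of each of the (m+1)-th to N-th groups to obtain padded entropy parameters of the (m+1)-th to N-th groups, where N-1≥m≥1 and m is an integer; and inputting the 1st to m-th groups of sample channel features and the padded entropy parameters of the (m+1)-th to N-th groups into the initial preview network to obtain the m-th progressive predicted preview image.
Zero-padding the entropy parameters of the undecoded sample channel features shortens the time needed to determine those entropy parameters, which speeds up generation of each progressive predicted preview image, reduces user waiting time, and improves the user experience.
When the 1st group of sample channel features has been decoded and the 2nd to N-th groups have not, the entropy parameters of each of the 2nd to N-th groups are zero-padded to obtain padded entropy parameters of the 2nd to N-th groups; the 1st group of sample channel features and the padded entropy parameters of the 2nd to N-th groups are input into the initial preview network to obtain the 1st progressive predicted preview image.
When the 1st and 2nd groups of sample channel features have been decoded and the 3rd to N-th groups have not, the entropy parameters of each of the 3rd to N-th groups are zero-padded to obtain padded entropy parameters of the 3rd to N-th groups; the 1st group, the 2nd group, and the padded entropy parameters of the 3rd to N-th groups are input into the initial preview network to obtain the 2nd progressive predicted preview image.
The process continues in the same way until the N-th progressive predicted preview image is obtained.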
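A minimal sketch of assembling the preview-network input for the m-th progressive preview is given below. The function name `build_preview_input` is hypothetical, and it is an assumption that the padded entropy parameters of an undecoded group occupy one slot per channel of that group; the disclosure states only that they are zero-filled.

```python
def build_preview_input(decoded_groups, group_sizes):
    """Return decoded groups 1..m followed by zero padding for groups m+1..N.

    `decoded_groups` holds the m groups decoded so far; `group_sizes` lists
    the sizes of all N groups used for previewing.
    """
    m, n = len(decoded_groups), len(group_sizes)
    if not 1 <= m <= n:
        raise ValueError("need at least one and at most N decoded groups")
    # No probability-model call is needed for the undecoded groups.
    padding = [[0.0] * size for size in group_sizes[m:]]
    return list(decoded_groups) + padding
```

When m equals N the padding list is empty, matching the N-th progressive preview, which is generated from all N decoded groups.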
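In the embodiments of the present disclosure, besides generating the predicted preview image from the first 1 to N groups of the K groups of sample channel features corresponding to the sample image as described above, the predicted preview image may also be generated from all K groups of sample channel features corresponding to the sample image, which the embodiments of the present disclosure do not limit.
The resolution of the predicted preview image generated by the initial preview network may be determined based on a preset resolution ratio. For example, if the preset resolution ratio is 1:8, the resolution of the predicted preview image is 1/8 of the resolution of the original sample image. The preset resolution ratio may be set according to the actual situation, which the present disclosure does not limit.
Because the resolution of the predicted preview image is lower than that of the sample image, a predicted decoded image corresponding to the sample image may be determined from the predicted preview image so that the sample image can be used in the subsequent network training.
In some implementations, determining, based on the predicted preview image, the predicted decoded image corresponding to the sample image includes: upsampling the predicted preview image to obtain the predicted decoded image, where the predicted decoded image has the same resolution as the sample image.
Compared with the original sample image, the predicted preview image has a lower resolution. The predicted preview image is therefore upsampled to obtain a predicted decoded image with the same resolution as the original sample image, which is then compared against the sample image during training.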
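In some embodiments, the predicted preview image may be upsampled with a preset upsampling algorithm. The preset upsampling algorithm may be bilinear interpolation, nearest-neighbor interpolation, or any other algorithm capable of upsampling the predicted preview image, which the embodiments of the present disclosure do not limit.
For example, if the resolution of the predicted preview image is 1/8 of that of the original sample image, bilinear interpolation may be applied to the predicted preview image three times in succession to obtain a predicted decoded image with the same resolution as the original sample image.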
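The repeated-upsampling example can be sketched as below. To stay dependency-free the sketch uses nearest-neighbor interpolation, which the text lists as an alternative to bilinear interpolation, and it assumes that the 1:8 ratio applies to each side of the image so that three 2× passes restore the full size. Function names are illustrative.

```python
def upsample2x_nearest(image):
    """Double the height and width of a 2-D list by repeating each pixel."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # duplicate the widened row
    return out


def upsample_to_full(preview, ratio=8):
    """Apply log2(ratio) successive 2x passes; `ratio` must be a power of 2."""
    passes = ratio.bit_length() - 1
    if ratio < 1 or (1 << passes) != ratio:
        raise ValueError("ratio must be a power of two")
    for _ in range(passes):
        preview = upsample2x_nearest(preview)
    return preview
```

With `ratio=8` the loop runs three times, mirroring the three successive interpolation passes in the example above.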
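In some implementations, training the initial preview network with the sample image and the predicted decoded image to obtain the target preview network includes: determining the distortion rate of the predicted decoded image relative to the sample image; and training the initial preview network based on the distortion rate to obtain the target preview network.
After the predicted decoded image has been obtained by upsampling the predicted preview image output by the initial preview network, the distortion rate of the predicted decoded image relative to the sample image is determined. Based on the distortion rate, the initial preview network is trained and its network parameters are optimized, yielding the target preview network.
In some embodiments, reducing the distortion rate of the predicted decoded image relative to the sample image is taken as the optimization objective, and the objective is optimized with stochastic gradient descent. The training process is executed iteratively and stops when the distortion rate of the predicted decoded image relative to the sample image falls below a preset distortion-rate threshold, or when the function corresponding to the distortion rate converges. The preset distortion-rate threshold may be set according to the actual situation, which the present disclosure does not limit.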
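The distortion-driven training loop with a threshold stop can be illustrated with a deliberately tiny example. Everything here is an assumption for illustration: the "preview network" is reduced to a single scalar gain `w`, the distortion is mean squared error (the disclosure does not name a specific distortion measure), and plain gradient descent on one sample stands in for stochastic gradient descent over a training set.

```python
def mse(pred, target):
    """Mean squared error, used here as a stand-in distortion measure."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def train_gain(features, sample, lr=0.1, threshold=1e-6, max_steps=1000):
    """Fit a scalar gain w so that w * features approximates the sample.

    Stops as soon as the distortion drops below `threshold`, or after
    `max_steps` updates.
    """
    w = 0.0
    loss = mse([w * f for f in features], sample)
    for _ in range(max_steps):
        if loss < threshold:
            break
        pred = [w * f for f in features]
        grad = sum(2 * (p - t) * f
                   for p, t, f in zip(pred, sample, features)) / len(sample)
        w -= lr * grad
        loss = mse([w * f for f in features], sample)
    return w, loss
```

A real preview network would replace the scalar gain with a decoder whose output is upsampled before the distortion is computed; the stopping rule is the part this sketch is meant to show.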
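For the progressive decoding approach described above, the network parameters of the initial preview network may be adjusted after each progressive predicted preview image is generated, so that the initial preview network with the adjusted parameters generates the next progressive predicted preview image.
After the target preview network corresponding to the target encoding network has been trained, it can be applied to image preview scenarios to directly generate, from the target encoded data obtained after the target encoding network encodes the target image, a low-resolution target preview image that preserves the original semantic information of the target image well.
In some implementations, the target encoded data is obtained by sequentially entropy-encoding K groups of target channel features corresponding to the target image, and the target encoded data includes the bitstream data corresponding to each group of target channel features.
When the target encoding network is based on the channel-grouped entropy coding algorithm, the process by which the target encoding network encodes the target image into the target encoded data is similar to the process, described above, by which it encodes the sample image into the sample encoded data, and may be implemented in the same way.
In some implementations, performing image decoding on the target encoded data with the target preview network corresponding to the target encoding network to obtain the target preview image corresponding to the target image includes: entropy-decoding the target encoded data to obtain, in sequence, the 1st to N-th groups of target channel features corresponding to the target image, where N<K and N and K are positive integers; and performing image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image.
The bitstream data of the N groups of target channel features that were entropy-encoded first among the K groups is entropy-decoded, and the target preview network performs image decoding on the resulting first N groups. Because the semantic information of the target image is concentrated in the bitstream data of the groups that were entropy-encoded first, this produces a low-resolution target preview image that preserves the original semantic information of the target image well, reduces the computational cost of generating the target preview image, and speeds up its generation.
The value of N is the same as the value of N set during training when the target preview network was trained with partial-channel-feature image decoding.
The process of entropy-decoding the target encoded data to obtain, in sequence, the 1st to N-th groups of target channel features corresponding to the target image is similar to the process, described above, of entropy-decoding the sample encoded data to obtain the 1st to N-th groups of sample channel features, and may be implemented in the same way.
In some embodiments, when the target preview network was trained with partial-channel-feature image decoding, the decoded 1st to N-th groups of target channel features are input into the target preview network to obtain the target preview image corresponding to the target image that the target preview network generates directly.
In some embodiments, when the target preview network was trained with partial-channel-feature image decoding combined with progressive decoding, the decoded 1st to N-th groups of target channel features are input into the target preview network in sequence to obtain N progressive target preview images corresponding to the target image.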
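In some implementations, the target preview image includes N progressive target preview images, and performing image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image includes: when the 1st to i-th groups of target channel features have been decoded and the (i+1)-th to N-th groups have not, zero-padding the entropy parameters of each of the (i+1)-th to N-th groups to obtain padded entropy parameters of the (i+1)-th to N-th groups, where N-1≥i≥1 and i is an integer; and inputting the 1st to i-th groups of target channel features and the padded entropy parameters of the (i+1)-th to N-th groups into the target preview network to obtain the i-th progressive target preview image.
Zero-padding the entropy parameters of the undecoded target channel features shortens the time needed to determine those entropy parameters, which speeds up generation of each progressive target preview image, reduces user waiting time, and improves the user experience.
The process of generating the N progressive target preview images with the target preview network is similar to the process, described above, of generating the N progressive predicted preview images with the initial preview network, and may be implemented in the same way.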
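Putting the inference-time steps together, progressive previewing can be sketched as a generator that decodes one group at a time and emits a preview after each. Both callables are hypothetical placeholders supplied by the caller: `decode_next_group` stands for context-conditioned entropy decoding of the next group, and `preview_net` stands for the trained target preview network. The per-channel zero padding is the same assumption as in the training-side description.

```python
def progressive_previews(decode_next_group, preview_net, group_sizes):
    """Yield N progressive previews, one after each newly decoded group.

    decode_next_group(decoded) -> next group, given the groups decoded so far
    preview_net(groups)        -> preview built from decoded + padded groups
    """
    decoded = []
    for _ in group_sizes:
        decoded.append(decode_next_group(decoded))
        # Zero padding replaces the entropy parameters of undecoded groups.
        padding = [[0.0] * size for size in group_sizes[len(decoded):]]
        yield preview_net(decoded + padding)
```

A caller can display each yielded preview immediately, so the first image appears after a single group has been decoded rather than after all N.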
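In some implementations, the image preview method further includes: upsampling the target preview image to obtain a target decoded image corresponding to the target image, where the target decoded image has the same resolution as the target image.
Upsampling the low-resolution target preview image yields a target decoded image at the resolution of the target image, which satisfies the user's need to view a larger, clearer image.
It can be understood that the method embodiments mentioned above in the present disclosure may be combined with one another to form combined embodiments, provided that principle and logic are not violated.
In addition, the present disclosure further provides an image preview apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any image preview method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding parts of the method section.
Based on the same inventive concept, the embodiments of the present disclosure further provide an image preview apparatus corresponding to the image preview method. Because the apparatus solves the problem on a principle similar to that of the image preview method described above, the implementation of the apparatus may refer to the implementation of the method.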
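Fig. 2 is a block diagram of an image preview apparatus provided by an embodiment of the present disclosure. As shown in Fig. 2, the apparatus 20 includes:
an obtaining part 21, configured to obtain target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image;
an image preview part 22, configured to use the target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, obtaining a target preview image corresponding to the target image, where the resolution of the target preview image is lower than the resolution of the target image.
In some implementations, the target encoded data is obtained by sequentially entropy-encoding K groups of target channel features corresponding to the target image; the image preview part 22 includes: a first entropy decoding part, configured to entropy-decode the target encoded data to obtain, in sequence, the 1st to N-th groups of target channel features corresponding to the target image, where N<K and N and K are positive integers; and a first image preview part, configured to perform image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image.
In some implementations, the target preview image includes N progressive target preview images; the first image preview part is configured to: when the 1st to i-th groups of target channel features have been decoded and the (i+1)-th to N-th groups have not, zero-pad the entropy parameters of each of the (i+1)-th to N-th groups to obtain padded entropy parameters of the (i+1)-th to N-th groups, where N-1≥i≥1 and i is an integer; and input the 1st to i-th groups of target channel features and the padded entropy parameters of the (i+1)-th to N-th groups into the target preview network to obtain the i-th progressive target preview image.
In some implementations, the apparatus 20 further includes: an upsampling part, configured to upsample the target preview image to obtain a target decoded image corresponding to the target image, where the target decoded image has the same resolution as the target image.
In some implementations, the apparatus 20 further includes:
an encoding part, configured to, before the target preview network corresponding to the target encoding network is used to decode the target encoded data, perform image encoding on a sample image with the target encoding network to obtain sample encoded data of the sample image; a decoding part, configured to perform image decoding on the sample encoded data with an initial preview network to obtain a predicted preview image corresponding to the sample image, where the resolution of the predicted preview image is lower than the resolution of the sample image; a determining part, configured to determine, based on the predicted preview image, a predicted decoded image corresponding to the sample image; and a training part, configured to train the initial preview network with the sample image and the predicted decoded image to obtain the target preview network.
In some implementations, the sample encoded data is obtained by sequentially entropy-encoding K groups of sample channel features corresponding to the sample image; the decoding part includes: a second entropy decoding part, configured to entropy-decode the sample encoded data to obtain the 1st to N-th groups of sample channel features corresponding to the sample image, where N<K and N and K are positive integers; and a second image preview part, configured to perform image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image.
In some implementations, the sample encoded data includes the bitstream data corresponding to each group of sample channel features; the second entropy decoding part is configured to: determine the entropy parameters of the 1st group of sample channel features, and use them to entropy-decode the bitstream data corresponding to the 1st group, obtaining the 1st group of sample channel features; and determine, based on the decoded 1st to (j-1)-th groups of sample channel features, the entropy parameters of the j-th group, and use them to entropy-decode the bitstream data corresponding to the j-th group, obtaining the j-th group of sample channel features, where N≥j≥2 and j is an integer.
In some implementations, the predicted preview image includes N progressive predicted preview images; the second image preview part is configured to: when the 1st to m-th groups of sample channel features have been decoded and the (m+1)-th to N-th groups have not, zero-pad the entropy parameters of each of the (m+1)-th to N-th groups to obtain padded entropy parameters of the (m+1)-th to N-th groups, where N-1≥m≥1; and input the 1st to m-th groups of sample channel features and the padded entropy parameters of the (m+1)-th to N-th groups into the initial preview network to obtain the m-th progressive predicted preview image.
In some implementations, the determining part is configured to: upsample the predicted preview image to obtain the predicted decoded image, where the predicted decoded image has the same resolution as the sample image.
In some implementations, the training part is configured to: determine the distortion rate of the predicted decoded image relative to the sample image; and train the initial preview network based on the distortion rate to obtain the target preview network.
In this embodiment, the obtaining part 21 may be a data reading device, and the image preview part 22 may be a graphics processing unit.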
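The method has a specific technical association with the internal structure of a computer system and can solve the technical problem of how to improve hardware computing efficiency or execution effect (including reducing the amount of data stored, reducing the amount of data transmitted, and increasing hardware processing speed), thereby achieving the technical effect of improving the internal performance of the computer system in conformity with the laws of nature.
In some embodiments, the functions or parts of the apparatus provided by the embodiments of the present disclosure may be used to execute the methods described in the method embodiments above; for their implementation, refer to the description of the method embodiments above.
In the embodiments of the present disclosure and in other embodiments, a "part" may be part of a circuit, part of a processor, part of a program or software, and so on; it may of course also be a unit, and it may be a module or non-modular.
An embodiment of the present disclosure further provides a computer-readable storage medium on which computer program instructions are stored, where the computer program instructions implement the above method when executed by a processor. The computer-readable storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to invoke the instructions stored in the memory to execute the above method.
An embodiment of the present disclosure further provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
An embodiment of the present disclosure provides a computer program, the computer program including computer-readable code, where, when the computer-readable code runs in an electronic device, a processor of the electronic device executes some or all of the steps for implementing the above method.
The electronic device may be provided as a terminal, a server, or a device in another form.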
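Fig. 3 is a block diagram of an electronic device provided by an embodiment of the present disclosure. As shown in Fig. 3, the electronic device 300 may be a terminal device such as a UE, a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device.
Referring to Fig. 3, the electronic device 300 may include one or more of the following components: a processing component 302, a memory 304, a power component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
The processing component 302 generally controls the overall operation of the electronic device 300, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 302 may include one or more processors 320 to execute instructions so as to complete all or some of the steps of the above method. In addition, the processing component 302 may include one or more parts that facilitate interaction between the processing component 302 and other components. For example, the processing component 302 may include a multimedia part to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support operation of the electronic device 300. Examples of such data include instructions for any application or method operating on the electronic device 300, contact data, phonebook data, messages, pictures, videos, and so on. The memory 304 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power component 306 supplies power to the various components of the electronic device 300. The power component 306 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 300.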
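The multimedia component 308 includes a screen that provides an output interface between the electronic device 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 308 includes at least one of a front camera and a rear camera. When the electronic device 300 is in an operating mode, such as a photographing mode or a video mode, at least one of the front camera and the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 310 is configured to perform at least one of outputting audio signals and inputting audio signals. For example, the audio component 310 includes a microphone (MIC); when the electronic device 300 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode, the microphone is configured to receive external audio signals. The received audio signals may be stored in the memory 304 or sent via the communication component 316. In some embodiments, the audio component 310 further includes a speaker configured to output audio signals.
The input/output interface 312 provides an interface between the processing component 302 and peripheral interface parts, which may be a keyboard, a click wheel, buttons, and the like. The buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 314 includes one or more sensors configured to provide status assessments of various aspects of the electronic device 300. For example, the sensor component 314 can detect the open/closed state of the electronic device 300 and the relative positioning of components, such as the display and keypad of the electronic device 300; it can also detect a change in position of the electronic device 300 or of one of its components, the presence or absence of user contact with the electronic device 300, the orientation or acceleration/deceleration of the electronic device 300, and temperature changes of the electronic device 300. The sensor component 314 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 314 may further include a light sensor, such as a complementary metal-oxide-semiconductor (CMOS) or charge-coupled device (CCD) image sensor, configured for use in imaging applications. In some embodiments, the sensor component 314 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the electronic device 300 and other devices. The electronic device 300 may access a wireless network based on a communication standard, such as Wi-Fi, second-generation mobile communication technology (2G), third-generation mobile communication technology (3G), fourth-generation mobile communication technology (4G), Long Term Evolution (LTE) of universal mobile communication technology, fifth-generation mobile communication technology (5G), or a combination thereof. In an exemplary embodiment, the communication component 316 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 316 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above method.
In an exemplary embodiment, a non-volatile computer-readable storage medium is further provided, for example the memory 304 including computer program instructions, where the computer program instructions can be executed by the processor 320 of the electronic device 300 to complete the above method.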
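The present disclosure relates to the field of augmented reality (AR). Image information of a target object in a real environment is acquired, and various vision-related algorithms are then used to detect or recognize relevant features, states, and attributes of the target object, so as to obtain an AR effect that combines the virtual and the real and matches the application. By way of example, the target object may involve faces, limbs, gestures, and actions related to the human body, or markers and signs related to objects, or sand tables, display areas, or display items related to venues or places. The vision-related algorithms may involve visual positioning, simultaneous localization and mapping (SLAM), three-dimensional reconstruction, image registration, background segmentation, keypoint extraction and tracking of objects, pose or depth detection of objects, and so on. Specific applications may involve not only interactive scenarios related to real scenes or objects, such as guided tours, navigation, explanation, reconstruction, and overlaid display of virtual effects, but also special-effect processing related to people, such as makeup beautification, body beautification, special-effect display, and virtual model display. Detection or recognition of the relevant features, states, and attributes of the target object may be implemented with a convolutional neural network, which is a network model obtained by model training based on a deep learning framework.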
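Fig. 4 is a block diagram of an electronic device provided by an embodiment of the present disclosure. The electronic device 400 may be provided as a server or a terminal device. As shown in Fig. 4, the electronic device 400 includes a processing component 422, which may include one or more processors, and memory resources represented by a memory 432, configured to store instructions executable by the processing component 422, such as application programs. The application programs stored in the memory 432 may include one or more parts, each corresponding to a set of instructions. In addition, the processing component 422 is configured to execute the instructions so as to execute the above method.
The electronic device 400 may further include a power component 426 configured to perform power management of the electronic device 400, a wired or wireless network interface 450 configured to connect the electronic device 400 to a network, and an input/output interface 458. The electronic device 400 may operate based on an operating system stored in the memory 432.
In an exemplary embodiment, a non-volatile computer-readable storage medium is further provided, for example the memory 432 including computer program instructions, where the computer program instructions can be executed by the processing component 422 of the electronic device 400 to complete the above method.
The present disclosure may be at least one of a system, a method, a computer program, and a product thereof. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example (but not limited to), an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. Examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), ROM, erasable programmable read-only memory (EPROM or flash memory), SRAM, a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove on which instructions are stored, and any suitable combination of the foregoing. The computer-readable storage medium as used here is not to be interpreted as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described here may be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via at least one of a network such as the Internet, a local area network, a wide area network, and a wireless network. The network may include at least one of copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, an FPGA, or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions; the electronic circuit can execute the computer-readable program instructions, thereby implementing aspects of the present disclosure.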
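Aspects of the present disclosure are described here with reference to at least one of flowcharts and block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a part, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
The computer program product may be implemented by hardware, software, or a combination thereof. In some embodiments, the computer program product is embodied as a computer storage medium; in other embodiments, the computer program product is embodied as a software product, such as a software development kit (SDK).
The descriptions of the embodiments above tend to emphasize the differences between the embodiments; for their identical or similar aspects, the embodiments may be referred to one another.
Those skilled in the art will understand that, in the above methods of the implementations, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation process; the order of execution of the steps should be determined by their functions and possible internal logic.
If the technical solution of the present disclosure involves personal information, a product applying the technical solution clearly informs individuals of the personal information processing rules and obtains their voluntary consent before processing the personal information. If the technical solution of the present disclosure involves sensitive personal information, a product applying the technical solution obtains the individual's separate consent before processing the sensitive personal information and also satisfies the requirement of "express consent". For example, at a personal information collection device such as a camera, a clear and conspicuous sign is set up to inform individuals that they have entered the range of personal information collection and that their personal information will be collected; if an individual voluntarily enters the collection range, this is regarded as consent to the collection of their personal information. Alternatively, on a device that processes personal information, where the personal information processing rules are communicated by conspicuous signs or information, personal authorization is obtained through pop-up messages or by asking individuals to upload their personal information themselves. The personal information processing rules may include information such as the personal information processor, the purpose of processing, the method of processing, and the types of personal information processed.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical applications, or their improvements over technologies in the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.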
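Industrial Applicability
The embodiments of the present disclosure provide an image preview method, an apparatus, an electronic device, a storage medium, and a computer program product. The image preview method includes: obtaining target encoded data, where the target encoded data is obtained after a target encoding network performs image encoding on a target image; and using the target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, obtaining a target preview image corresponding to the target image, where the resolution of the target preview image is lower than the resolution of the target image. This solution reduces the computational cost of previewing, speeds up generation of the target preview image, and still meets the user's preview needs.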

Claims (21)

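  1. An image preview method, comprising:
    obtaining target encoded data, wherein the target encoded data is obtained after a target encoding network performs image encoding on a target image;
    using a target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, obtaining a target preview image corresponding to the target image, wherein the resolution of the target preview image is lower than the resolution of the target image.
  2. The method according to claim 1, wherein the target encoded data is obtained by sequentially entropy-encoding K groups of target channel features corresponding to the target image;
    the using a target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, obtaining a target preview image corresponding to the target image, comprises:
    entropy-decoding the target encoded data to obtain, in sequence, the 1st to N-th groups of target channel features corresponding to the target image, wherein N<K, and N and K are positive integers;
    performing image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image.
  3. The method according to claim 2, wherein the target preview image comprises N progressive target preview images;
    the performing image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image comprises:
    when the 1st to i-th groups of target channel features have been decoded and the (i+1)-th to N-th groups of target channel features have not been decoded, zero-padding the entropy parameters of each of the (i+1)-th to N-th groups of target channel features to obtain padded entropy parameters of the (i+1)-th to N-th groups of target channel features, wherein N-1≥i≥1, and i is an integer;
    inputting the 1st to i-th groups of target channel features and the padded entropy parameters of the (i+1)-th to N-th groups of target channel features into the target preview network to obtain the i-th progressive target preview image.
  4. The method according to any one of claims 1 to 3, further comprising:
    upsampling the target preview image to obtain a target decoded image corresponding to the target image, wherein the target decoded image has the same resolution as the target image.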
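  5. The method according to any one of claims 1 to 4, wherein, before the target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, the method further comprises:
    performing image encoding on a sample image with the target encoding network to obtain sample encoded data of the sample image;
    performing image decoding on the sample encoded data with an initial preview network to obtain a predicted preview image corresponding to the sample image, wherein the resolution of the predicted preview image is lower than the resolution of the sample image;
    determining, based on the predicted preview image, a predicted decoded image corresponding to the sample image;
    training the initial preview network with the sample image and the predicted decoded image to obtain the target preview network.
  6. The method according to claim 5, wherein the sample encoded data is obtained by sequentially entropy-encoding K groups of sample channel features corresponding to the sample image;
    the performing image decoding on the sample encoded data with an initial preview network to obtain a predicted preview image corresponding to the sample image comprises:
    entropy-decoding the sample encoded data to obtain the 1st to N-th groups of sample channel features corresponding to the sample image, wherein N<K, and N and K are positive integers;
    performing image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image.
  7. The method according to claim 6, wherein the sample encoded data comprises bitstream data corresponding to each group of sample channel features;
    the entropy-decoding the sample encoded data to obtain the 1st to N-th groups of sample channel features corresponding to the sample image comprises:
    determining entropy parameters of the 1st group of sample channel features, and using the entropy parameters of the 1st group of sample channel features to entropy-decode the bitstream data corresponding to the 1st group of sample channel features, obtaining the 1st group of sample channel features;
    determining, based on the decoded 1st to (j-1)-th groups of sample channel features, entropy parameters of the j-th group of sample channel features, and using the entropy parameters of the j-th group of sample channel features to entropy-decode the bitstream data corresponding to the j-th group of sample channel features, obtaining the j-th group of sample channel features, wherein N≥j≥2, and j is an integer.
  8. The method according to claim 6 or 7, wherein the predicted preview image comprises N progressive predicted preview images;
    the performing image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image comprises:
    when the 1st to m-th groups of sample channel features have been decoded and the (m+1)-th to N-th groups of sample channel features have not been decoded, zero-padding the entropy parameters of each of the (m+1)-th to N-th groups of sample channel features to obtain padded entropy parameters of the (m+1)-th to N-th groups of sample channel features, wherein N-1≥m≥1, and m is an integer;
    inputting the 1st to m-th groups of sample channel features and the padded entropy parameters of the (m+1)-th to N-th groups of sample channel features into the initial preview network to obtain the m-th progressive predicted preview image.
  9. The method according to any one of claims 5 to 8, wherein the determining, based on the predicted preview image, a predicted decoded image corresponding to the sample image comprises:
    upsampling the predicted preview image to obtain the predicted decoded image, wherein the predicted decoded image has the same resolution as the sample image.
  10. The method according to any one of claims 5 to 9, wherein the training the initial preview network with the sample image and the predicted decoded image to obtain the target preview network comprises:
    determining a distortion rate of the predicted decoded image relative to the sample image;
    training the initial preview network based on the distortion rate to obtain the target preview network.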
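  11. An image preview apparatus, comprising:
    an obtaining part, configured to obtain target encoded data, wherein the target encoded data is obtained after a target encoding network performs image encoding on a target image;
    an image preview part, configured to use a target preview network corresponding to the target encoding network to perform image decoding on the target encoded data, obtaining a target preview image corresponding to the target image, wherein the resolution of the target preview image is lower than the resolution of the target image.
  12. The apparatus according to claim 11, wherein the target encoded data is obtained by sequentially entropy-encoding K groups of target channel features corresponding to the target image; the image preview part comprises:
    a first entropy decoding part, configured to entropy-decode the target encoded data to obtain, in sequence, the 1st to N-th groups of target channel features corresponding to the target image, wherein N<K, and N and K are positive integers;
    a first image preview part, configured to perform image decoding on the 1st to N-th groups of target channel features with the target preview network to obtain the target preview image.
  13. The apparatus according to claim 12, wherein the target preview image comprises N progressive target preview images; the first image preview part is configured to:
    when the 1st to i-th groups of target channel features have been decoded and the (i+1)-th to N-th groups of target channel features have not been decoded, zero-pad the entropy parameters of each of the (i+1)-th to N-th groups of target channel features to obtain padded entropy parameters of the (i+1)-th to N-th groups of target channel features, wherein N-1≥i≥1, and i is an integer;
    input the 1st to i-th groups of target channel features and the padded entropy parameters of the (i+1)-th to N-th groups of target channel features into the target preview network to obtain the i-th progressive target preview image.
  14. The apparatus according to any one of claims 11 to 13, further comprising:
    an upsampling part, configured to upsample the target preview image to obtain a target decoded image corresponding to the target image, wherein the target decoded image has the same resolution as the target image.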
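  15. The apparatus according to any one of claims 11 to 14, wherein, before the target preview network corresponding to the target encoding network is used to perform image decoding on the target encoded data, the apparatus further comprises:
    an encoding part, configured to perform image encoding on a sample image with the target encoding network to obtain sample encoded data of the sample image;
    a decoding part, configured to perform image decoding on the sample encoded data with an initial preview network to obtain a predicted preview image corresponding to the sample image, wherein the resolution of the predicted preview image is lower than the resolution of the sample image;
    a determining part, configured to determine, based on the predicted preview image, a predicted decoded image corresponding to the sample image;
    a training part, configured to train the initial preview network with the sample image and the predicted decoded image to obtain the target preview network.
  16. The apparatus according to claim 15, wherein the sample encoded data is obtained by sequentially entropy-encoding K groups of sample channel features corresponding to the sample image; the decoding part comprises:
    a second entropy decoding part, configured to entropy-decode the sample encoded data to obtain the 1st to N-th groups of sample channel features corresponding to the sample image, wherein N<K, and N and K are positive integers;
    a second image preview part, configured to perform image decoding on the 1st to N-th groups of sample channel features with the initial preview network to obtain the predicted preview image.
  17. The apparatus according to claim 16, wherein the sample encoded data comprises bitstream data corresponding to each group of sample channel features; the second entropy decoding part is configured to:
    determine entropy parameters of the 1st group of sample channel features, and use the entropy parameters of the 1st group of sample channel features to entropy-decode the bitstream data corresponding to the 1st group of sample channel features, obtaining the 1st group of sample channel features;
    determine, based on the decoded 1st to (j-1)-th groups of sample channel features, entropy parameters of the j-th group of sample channel features, and use the entropy parameters of the j-th group of sample channel features to entropy-decode the bitstream data corresponding to the j-th group of sample channel features, obtaining the j-th group of sample channel features, wherein N≥j≥2, and j is an integer.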
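  18. An electronic device, comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 10.
  19. A computer-readable storage medium on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 10.
  20. A computer program product, comprising a computer program or instructions, wherein, when the computer program or instructions run on an electronic device, the electronic device is caused to execute the method according to any one of claims 1 to 10.
  21. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor of the electronic device executes the method according to any one of claims 1 to 10.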
PCT/CN2022/110220 2022-03-04 2022-08-04 Image preview method and apparatus, electronic device, storage medium, and computer program and product thereof WO2023165082A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210210395.2 2022-03-04
CN202210210395.2A CN114581542A (zh) 2022-03-04 2022-03-04 Image preview method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023165082A1 true WO2023165082A1 (zh) 2023-09-07

Family

ID=81772619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/110220 WO2023165082A1 (zh) 2022-03-04 2022-08-04 Image preview method and apparatus, electronic device, storage medium, and computer program and product thereof

Country Status (2)

Country Link
CN (1) CN114581542A (zh)
WO (1) WO2023165082A1 (zh)


Also Published As

Publication number Publication date
CN114581542A (zh) 2022-06-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22929515

Country of ref document: EP

Kind code of ref document: A1