CN113453017B

CN113453017B - Video processing method, device, equipment and computer program product

Info

Publication number: CN113453017B
Application number: CN202110715548.4A
Authority: CN
Inventors: 张健; 李立锋
Original assignee: China Mobile Communications Group Co Ltd; MIGU Culture Technology Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; MIGU Culture Technology Co Ltd
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2022-08-23
Anticipated expiration: 2041-06-24
Also published as: CN113453017A

Abstract

The invention discloses a video processing method, which comprises the following steps: performing lens segmentation operation on a key frame of a video to be processed to obtain a plurality of image groups; acquiring color difference, and respectively determining each reference frame and a frame to be processed based on the first frame; respectively acquiring color value corresponding relations between the part areas of the frames to be processed in each image group and the color value ranges; and acquiring a first low color gamut image frame corresponding to each frame to be processed, and determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame. The invention also discloses a video processing device, equipment and a computer program product. According to the method, the structure of the key frame is disassembled, the color value ranges of different part areas are compressed, the compression ratio of intraframe compression is improved, the selection of the reference frame is carried out according to the similarity of each frame in each lens, and the accuracy of the reference frame is improved.

Description

Video processing method, device, equipment and computer program product

Technical Field

The present invention relates to the field of video processing, and in particular, to a video processing method, apparatus, device, and computer program product.

Background

With the development of mobile devices such as smart phones, tablet computers and electronic readers, multimedia data is widely used in mobile terminals. Since the network environment of the mobile terminal is variable and complex, and the available bandwidth is easily limited, when the mobile terminal needs to download or upload a video file with a large data volume to the network side, the video file needs to be compressed.

In general, video compression includes intra-frame compression and inter-frame compression. Intra-frame compression techniques compress individual frames, often referred to as I-frames or key frames. The interframe compression techniques compress frames with reference to previous or subsequent frames, which are often referred to as predicted frames, P-frames, or B-frames. In the conventional video compression, intra-frame compression of key frames in a GOP (Group of Pictures) and inter-frame prediction using image residuals are adopted for video compression. However, the conventional intra-frame compression method does not process different areas of key frames in the same shot, so that the intra-frame compression is low in compression ratio.

The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.

Disclosure of Invention

The invention mainly aims to provide a video processing method, a video processing device, video processing equipment and a computer program product, and aims to solve the technical problem that the compression ratio of key frame intraframe compression in the existing video compression is low.

In order to achieve the above object, the present invention provides a video processing method, including the steps of:

performing shot segmentation operation on key frames of a video to be processed to obtain a plurality of image groups, wherein each frame in the image groups belongs to the same shot;

acquiring color differences of people and objects in each image group, respectively determining the target similarity of each frame based on the similarity between two adjacent frames in the image group, respectively determining a first frame with the target similarity being greater than or equal to a preset threshold value in each image group, and respectively determining each reference frame and a frame to be processed based on the first frame;

performing structure disassembly operation on the frame to be processed to respectively obtain color value ranges corresponding to the part areas in the frame to be processed, and respectively obtaining color value corresponding relations between the part areas and the color value ranges of the frame to be processed in each image group;

and acquiring a first low color gamut image frame corresponding to each frame to be processed, and determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame.

Further, the step of performing a structure disassembly operation on the frame to be processed to obtain color value ranges corresponding to the part regions in the frame to be processed respectively includes:

respectively mapping the to-be-processed frames based on a preset rgb color partition to obtain mapped to-be-processed frames, obtaining second low color gamut image frames corresponding to the mapped to-be-processed frames, determining boundary lines in each second low color gamut image frame based on a color value continuity rule, respectively determining part areas in the to-be-processed frames based on the boundary lines, and obtaining color value ranges corresponding to the part areas based on the to-be-processed frames; alternatively, the first and second electrodes may be,

and identifying the frame image to be processed to obtain part areas in the frame to be processed, and acquiring color value ranges corresponding to the part areas based on the frame to be processed, wherein the part areas comprise different parts of people or objects in the frame to be processed.

Further, the step of obtaining the first low color gamut image frame corresponding to each frame to be processed includes:

converting the frame to be processed into a black-and-white image frame or a wire frame image frame to obtain the first low color gamut image frame; alternatively, the first and second electrodes may be,

determining an object contour in each frame to be processed based on an image edge detection algorithm, and performing Fourier transform operation on contour coordinates of the object contour to obtain the first low color gamut image frame.

Further, the step of determining the target similarity of each frame based on the similarity between two adjacent frames in the image group comprises:

respectively mapping the image frames in each image group based on a preset rgb color partition to obtain a mapped image group;

acquiring a color histogram vector of each image frame in the mapped image group, determining the similarity between two adjacent frames in each image group based on the color histogram vector, and determining the target similarity of each frame based on the determined similarity; alternatively, the first and second electrodes may be,

normalizing the mapped image group to obtain a normalized image group, acquiring pixel pairs corresponding to a differentiation pixel point between two adjacent image frames in the normalized image group, and determining the pixel similarity between each pixel pair; and determining the similarity between two adjacent frames in each image group based on the pixel similarity, and determining the target similarity of each frame based on the similarity.

Further, the step of determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value correspondence, and the reference frame includes:

obtaining a low color gamut picture and a color picture comprising UV components corresponding to each frame to be processed, taking the low color gamut picture as the input of a GAN model to be trained, and taking the color picture as the output of the GAN model to be trained for model training to obtain a pre-trained GAN model, wherein the GAN model to be trained is a model GAN corresponding to the style of the video to be processed;

if a second frame with the target similarity smaller than a preset threshold exists in each image group, determining a compressed video corresponding to the video to be processed based on the color difference, a pre-trained GAN model, the low color gamut picture, the second frame, the color value corresponding relation and the reference frame;

and if the second frame with the target similarity smaller than the preset threshold value does not exist in each image group, determining the compressed video corresponding to the video to be processed based on the color difference, the pre-trained GAN model, the low color gamut picture, the color value corresponding relation and the reference frame.

Further, after the step of determining the compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value correspondence, and the reference frame, the video processing method further includes:

acquiring a pre-trained GAN model corresponding to a compressed video, and inputting video frames in the compressed video into the pre-trained GAN model for model training to obtain trained video frames;

processing each image group in the trained video frame based on the Y value of the reference frame in each image group of the compressed video to obtain a processed video frame;

determining whether each image group corresponding to the processed video frame has a color difference image group or not based on the color difference and color value corresponding relation in the compressed video;

and if the target video does not exist, the processed video frame is taken as the decoded target video.

Further, after the step of determining whether there is a color difference image group in each image group of the processed video frame, the method further comprises:

if the color difference image group exists, acquiring a first color distribution corresponding to a difference image frame in the color difference image group and a second color distribution corresponding to a reference frame in the difference image group;

matching the first color distribution based on the second color distribution to obtain a processed difference image frame, and replacing the difference image frame in the color difference image group with the processed difference image frame to obtain a processed color difference image group;

and determining the target video based on the processed color difference image group and the processed normal image group in the video frame.

Further, to achieve the above object, the present invention also provides a video processing apparatus comprising:

the device comprises a segmentation module, a processing module and a processing module, wherein the segmentation module is used for carrying out lens segmentation operation on key frames of a video to be processed so as to obtain a plurality of image groups, and each frame in the image groups belongs to the same lens;

the first acquisition module is used for acquiring color differences of people and objects in each image group, respectively determining the target similarity of each frame based on the similarity between two adjacent frames in the image group, respectively determining a first frame with the target similarity being greater than or equal to a preset threshold in each image group, and respectively determining each reference frame and each frame to be processed based on the first frame;

the disassembling module is used for performing structure disassembling operation on the frame to be processed so as to respectively obtain color value ranges corresponding to the part areas in the frame to be processed and respectively obtain color value corresponding relations between the part areas and the color value ranges of the frame to be processed in each image group;

and the second acquisition module is used for acquiring a first low color gamut image frame corresponding to each frame to be processed and determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame.

Further, to achieve the above object, the present invention also provides a video processing apparatus comprising: a memory, a processor and a video processing program stored on the memory and executable on the processor, the video processing program when executed by the processor implementing the steps of the video processing method as described above.

Furthermore, to achieve the above object, the present invention also provides a computer program product comprising a computer program, which when executed by a processor, implements the steps of the aforementioned video processing method.

The method comprises the steps of carrying out lens segmentation operation on key frames of a video to be processed to obtain a plurality of image groups, then obtaining color differences of people and objects in each image group, respectively determining the target similarity of each frame based on the similarity between two adjacent frames in the image groups, respectively determining a first frame with the target similarity being greater than or equal to a preset threshold value in each image group, and respectively determining each reference frame and a frame to be processed based on the first frame; then, performing structure disassembly operation on the frame to be processed to respectively obtain color value ranges corresponding to the part areas in the frame to be processed, and respectively obtaining color value corresponding relations between the part areas and the color value ranges of the frame to be processed in each image group; and then acquiring a first low color gamut image frame corresponding to each frame to be processed, determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame, and compressing the color value range of different part areas by performing structure disassembly on a key frame, so that the compression ratio of intraframe compression is improved. Meanwhile, the reference frame is selected according to the similarity of each frame in each shot, so that the accuracy of the reference frame is improved.

Drawings

Fig. 1 is a schematic structural diagram of a video processing device in a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a video processing method according to a first embodiment of the present invention;

FIG. 3 is a diagram illustrating a structure of a frame to be processed according to an embodiment of a video processing method of the present invention;

fig. 4 is a functional block diagram of a video processing apparatus according to an embodiment of the invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As shown in fig. 1, fig. 1 is a schematic structural diagram of a video processing device in a hardware operating environment according to an embodiment of the present invention.

The video processing device in the embodiment of the present invention may be a PC, or may be a mobile terminal device having a display function, such as a smart phone, a tablet computer, an electronic book reader, an MP3(Moving Picture Experts Group Audio Layer III, motion video Experts compression standard Audio Layer 3) player, an MP4(Moving Picture Experts Group Audio Layer IV, motion video Experts compression standard Audio Layer 4) player, a portable computer, or the like.

As shown in fig. 1, the video processing apparatus may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002. The communication bus 1002 is used to implement connection communication among these components. The user interface 1003 may include a Display screen (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.

Optionally, the video processing device may further include a camera, a Radio Frequency (RF) circuit, a sensor, an audio circuit, a WiFi module, and the like. Such as light sensors, motion sensors, and other sensors, among others. In particular, the light sensor may include an ambient light sensor that adjusts the brightness of the display screen according to the brightness of ambient light, and a proximity sensor that turns off the display screen and/or the backlight when the video processing device is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when the mobile terminal is stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration), vibration recognition related functions (such as pedometer and tapping) and the like for recognizing the attitude of the mobile terminal; of course, the video processing device may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and so on, which are not described herein again.

Those skilled in the art will appreciate that the terminal architecture shown in fig. 1 does not constitute a limitation of video processing devices and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.

As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a video processing program.

In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting a background server and communicating data with the background server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be used to invoke a video processing program stored in the memory 1005.

In this embodiment, a video processing apparatus includes: the system comprises a memory 1005, a processor 1001 and a video processing program stored on the memory 1005 and capable of running on the processor 1001, wherein when the processor 1001 calls the video processing program stored in the memory 1005, the steps of the video processing method in the following embodiments are executed.

Referring to fig. 2, fig. 2 is a flowchart illustrating a video processing method according to a first embodiment of the present invention.

In this embodiment, the video processing method includes the steps of:

step S101, performing shot segmentation operation on a key frame of a video to be processed to obtain a plurality of image groups, wherein each frame in the image groups belongs to the same shot;

in this embodiment, when a video to be processed is acquired, a shot segmentation technology is used to perform a shot segmentation operation on a key frame of the video to be processed to obtain a plurality of image groups, so that each frame in each image group belongs to the same shot, specifically, existing shot segmentation technologies such as feature comparison and pixel comparison may be used.

Step S102, acquiring color differences of people and objects in each image group, determining the target similarity of each frame based on the similarity between two adjacent frames in the image group, respectively determining a first frame with the target similarity being greater than or equal to a preset threshold in each image group, and respectively determining each reference frame and a frame to be processed based on the first frame;

in this embodiment, an image recognition algorithm is first adopted to perform image recognition on each frame of each image group to determine a person and an object in each frame of each image group, and obtain a maximum area, a minimum area, a maximum area of the object, and a minimum area of the object in each image group, where the maximum area and the minimum area of the person include a maximum area and a minimum area of each person in a key frame of the image group, the maximum area and the minimum area of the object include a maximum area and a minimum area of each object in the key frame of the image group, and for each image group, a maximum frame of the person to which the maximum area of each person belongs and a minimum frame of the person to which the minimum area belongs, and a maximum frame of the object to which the maximum area of the object belongs and a minimum frame of the object to which the minimum area of the object belongs are obtained respectively. And then calculating the color difference of the character based on the pixel value of the character corresponding area in the maximum character frame and the pixel value of the character corresponding area in the minimum character frame, calculating the color difference of the object based on the pixel value of the object corresponding area in the maximum object frame and the pixel value of the object corresponding area in the minimum object frame, and taking the color difference of the character and the color difference of the object as the color difference.

Then, the similarity between two adjacent frames in the image group is obtained, specifically, the existing similarity may be used, and the target similarity of each frame in the image group is determined based on the similarity, for example, if a frame is a start frame or an end frame of the image group, the similarity between the start frame and the next frame is taken as the target similarity of the start frame, the similarity between the end frame and the previous frame is taken as the target similarity of the end frame, and for an intermediate frame, the target similarity is determined based on the similarity between the frame and the next frame and the similarity between the frame and the previous frame, for example, the target similarity is an average of the similarity between the frame and the next frame and the similarity between the frame and the previous frame.

When the target similarity of each frame is obtained, determining a first frame in each image group, where the target similarity is greater than or equal to a preset threshold, specifically, for each image group, determining whether the target similarity of each frame is greater than or equal to the preset threshold, taking the frame with the target similarity greater than or equal to the preset threshold as the first frame, and determining each reference frame and a frame to be processed based on the first frame, respectively, where any one of the first frames may be selected as the reference frame, or taking the frame with the maximum target similarity in the first frame as the reference frame, and taking other frames except the reference frame in the first frame as the frames to be processed.

It should be noted that, for each frame in the image group of the same shot, if the target similarity of the frame is greater than or equal to the preset threshold, the color of the object or person in the frame is changed only due to the environment and light, and belongs to the normal range, compared with the previous frame and the next frame, and therefore the frame is taken as the first frame. Of course, if the target similarity of the frame is smaller than the preset threshold, the color of the object or person in the frame changes due to its own cause, for example, the color of the light of the traffic light changes, and when compressing the frame, the frame is not compressed.

Step S103, performing structure dismantling operation on the frame to be processed to respectively obtain color value ranges corresponding to the part areas in the frame to be processed, and respectively obtaining color value corresponding relations between the part areas and the color value ranges of the frame to be processed in each image group;

in this embodiment, after the frame to be processed of each image group is obtained, the frame to be processed of each image group is subjected to structure disassembly, and the frame to be processed is subjected to structure disassembly to obtain color value ranges corresponding to the part regions in the frame to be processed, respectively, where the part regions in the frame to be processed include each part of a person or an object or a plurality of regions with discontinuous color values, and then, a color value range is determined based on rgb values of pixels in each part region, and the color value range may be from a minimum rgb value of a pixel in the part region to a maximum rgb value of a pixel in the part region, and a color value correspondence relationship between the part region and the color value range of the frame to be processed in each image group is recorded.

Step S104, obtaining a first low color gamut image frame corresponding to each frame to be processed, and determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame.

In this embodiment, after determining the color value correspondence relationship, a first low color gamut image frame corresponding to each frame to be processed is obtained, for example, each frame to be processed is converted into a black-and-white picture or a wire frame picture to obtain a first low color gamut image frame corresponding to each frame to be processed, and according to the color difference, the first low color gamut image frame, the color value correspondence relationship and the reference frame, a compressed video corresponding to the video to be processed is determined, that is, the color difference, the first low color gamut image frame, the color value correspondence relationship and the reference frame are video-encapsulated according to the sequence of the image group to obtain a compressed video,

for example, if the compressed video is used for network transmission, the color value corresponding relationship of the frame to be processed in each image group can be obtained according to the color value corresponding relationship, and the color value corresponding relationship of the frame to be processed in the image group is encapsulated in front of the image group.

In the video processing method provided by this embodiment, a plurality of image groups are obtained by performing a shot segmentation operation on a key frame of a video to be processed, then a color difference between a person and an object in each image group is obtained, a target similarity of each frame is determined based on a similarity between two adjacent frames in each image group, a first frame with a target similarity greater than or equal to a preset threshold in each image group is determined, and a reference frame and a frame to be processed are determined based on the first frame; then, performing structure disassembly operation on the frame to be processed to respectively obtain color value ranges corresponding to the part areas in the frame to be processed, and respectively obtaining color value corresponding relations between the part areas and the color value ranges of the frame to be processed in each image group; and then acquiring a first low color gamut image frame corresponding to each frame to be processed, determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame, and compressing the color value range of different part areas by performing structure disassembly on a key frame, so that the compression ratio of intraframe compression is improved. Meanwhile, the reference frame is selected according to the similarity of each frame in each shot, so that the accuracy of the reference frame is improved.

Based on the first embodiment, a second embodiment of the video processing method of the present invention is proposed, in this embodiment, step S103 includes:

step S201, respectively mapping the to-be-processed frames based on a preset rgb color partition to obtain mapped to-be-processed frames, obtaining second low color gamut image frames corresponding to the mapped to-be-processed frames, determining boundary lines in each second low color gamut image frame based on a color value continuity rule, respectively determining part areas in the to-be-processed frames based on the boundary lines, and obtaining color value ranges corresponding to the part areas based on the to-be-processed frames; alternatively, the first and second liquid crystal display panels may be,

step S202, performing image recognition operation on the frame to be processed to obtain part areas in the frame to be processed, and acquiring color value ranges corresponding to the part areas based on the frame to be processed, wherein the part areas comprise different parts of people or objects in the frame to be processed.

In one way, to reduce the amount of computation, the image frames in each image group are mapped based on the pre-rgb color partition to obtain the mapped image group, for example, the pre-rgb color partition may be set to 8 or 16 partitions, or certainly, more partitions, and when the pre-rgb color partition includes 8 partitions, the color value ranges of r, g, and b are respectively: 0 to 31, 32 to 63, 64 to 95, 95 to 127, 128 to 159, 160 to 191, 192 to 223, 224 to 255.

Then, a second low color gamut image frame corresponding to the mapped frame to be processed is obtained, that is, the mapped frame to be processed is converted into a black-white image (or a gray image), boundary lines in each second low color gamut image frame are determined based on a color value continuity rule, part areas in the frame to be processed are respectively determined based on the boundary lines, and when it is determined that the differentiation of adjacent pixel points in the second low color gamut image frame is large and the differentiated color presents continuity according to the color value continuity rule, the continuous pixel is regarded as a boundary line of different parts. Specifically, the pixel similarity between adjacent pixel points in each second low color gamut image frame may be calculated, when the pixel similarity between adjacent pixel points is smaller than a preset pixel similarity, it is determined that the difference between the adjacent pixel points is large, and the adjacent pixel points are used as differentiated pixel points, or an average value of the pixel similarities of the current pixel point and other adjacent pixel points is used as the current similarity of the current pixel point, and the pixel points in the second low color gamut image frame, of which the current similarity is smaller than the preset pixel similarity, are used as differentiated pixel points, and if the differentiated pixel points exhibit continuity, a connecting line between consecutive differentiated pixel points is used as a boundary line in the corresponding second low color gamut image frame. And then respectively determining part areas in the frame to be processed based on the boundary lines, wherein the part areas are all areas surrounded by the boundary lines in the second low color gamut image frame, and acquiring color value ranges corresponding to the part areas based on the frame to be processed.

Referring to fig. 3, in fig. 3, there are 4 color values, wherein 3 of the color values are compared, the difference is small, and the other 1 color value is greatly different from the other 3 color values, so that the upper, left and right color values form a boundary.

In another mode, the image of the frame to be processed is identified to obtain part areas in the frame to be processed, the part areas include different parts of the person or the object in the frame to be processed, for example, for the person in the frame to be processed, the part areas include five sense organs, hair, clothes, shoes, decorations and the like of the person. And then acquiring color value ranges corresponding to the part areas based on the frame to be processed.

In the video processing method provided in this embodiment, image frames in each image group are mapped based on a preset rgb color partition to obtain a mapped image group, a second low color gamut image frame corresponding to each frame in the mapped image group is obtained, a boundary line in each second low color gamut image frame is determined based on a color value continuity rule, a part region in a frame to be processed is determined based on the boundary line, and a color value range corresponding to each part region is obtained based on the frame to be processed; or, performing image recognition operation on the frame to be processed of each image group to obtain a part region in the frame to be processed, and acquiring a color value range corresponding to each part region based on the frame to be processed, wherein the part region comprises different parts of people or objects in the frame to be processed, and the part region in the frame to be processed can be accurately obtained by performing image recognition according to the second low color gamut image frame or the frame to be processed, so that accurate structure disassembly operation is performed on the frame to be processed, and then the color value ranges of different part regions are compressed, thereby improving the compression ratio of intra-frame compression.

Based on the first embodiment, a third embodiment of the video processing method of the present invention is proposed, in this embodiment, step S104 includes:

step S301, converting the frame to be processed into a black-and-white image frame or a wire frame image frame to obtain the first low color gamut image frame; alternatively, the first and second electrodes may be,

step S302, based on an image edge detection algorithm, determining an object contour in each frame to be processed, and performing Fourier transform operation on contour coordinates of the object contour to obtain the first low color gamut image frame.

In this embodiment, the frame to be processed may be directly converted into a black-and-white image frame or a wire frame image frame to obtain the first low color gamut image frame, where the black-and-white image frame may be a binary image or a grayscale image in the above embodiment.

Or determining the object contour in each frame to be processed based on an image edge detection algorithm, specifically detecting the object contour in each frame to be processed by using the existing image edge detection algorithm, then splitting the object contour into a plurality of points, obtaining the coordinates of each point to obtain the contour coordinates of the object contour, and performing fourier transform operation on the contour coordinates to obtain the first low color gamut image frame.

In the video processing method provided in this embodiment, the frame to be processed is converted into a black-and-white image frame or a wire frame image frame, so as to obtain the first low color gamut image frame; or determining the object contour in each frame to be processed based on an image edge detection algorithm, performing Fourier transform operation on the contour coordinates of the object contour to obtain the first low color gamut image frame, accurately obtaining the first low color gamut image frame, and further improving the compression ratio of intraframe compression through the low color gamut image frame.

Based on the first embodiment, a fourth embodiment of the video processing method of the present invention is proposed, in this embodiment, step S102 includes:

step S401, based on a preset rgb color partition, respectively mapping the image frames in each image group to obtain a mapped image group;

step S402, acquiring a color histogram vector of each image frame in the mapped image group, determining the similarity between two adjacent frames in each image group based on the color histogram vector, and determining the target similarity of each frame based on the determined similarity; alternatively, the first and second electrodes may be,

step S403, performing normalization operation on the mapped image group to obtain a normalized image group, acquiring a pixel pair corresponding to a differentiation pixel point between two adjacent image frames in the normalized image group, and determining the pixel similarity between each pixel pair; and determining the similarity between two adjacent frames in each image group based on the pixel similarity, and determining the target similarity of each frame based on the determined similarity.

In this embodiment, to reduce the amount of calculation, the image frames in each image group are mapped based on the preset rgb color partition, so as to obtain the mapped image group, for example, 8 or 16 partitions may be set for the preset rgb color partition, and certainly, more partitions may be set, and when the preset rgb color partition includes 8 partitions, the color value ranges of r, g, and b are respectively: 0 to 31, 32 to 63, 64 to 95, 95 to 127, 128 to 159, 160 to 191, 192 to 223, 224 to 255.

In an embodiment, a color histogram vector of each image frame in the mapped image group is obtained, that is, an existing color histogram vector algorithm is used to obtain the color histogram vector, and the similarity between two adjacent frames in each image group is determined based on the color histogram vector, specifically, a cosine value of the color histogram vector corresponding to the two adjacent frames is used as the similarity.

In another embodiment, the mapped image group is normalized to obtain a normalized image group, a pixel pair corresponding to a differentiated pixel point between two adjacent image frames in the normalized image group is obtained, the pixel similarity between each pixel pair is determined, specifically, a color partition list of the two adjacent image frames is obtained, differentiated pixel points (pixel points with different rgb values) are found according to uniform sequencing, the pixel pair corresponding to the differentiated pixel point is obtained, and the pixel similarity between each pixel pair is calculated through the cosine of an rgb space vector. Then, the similarity between two adjacent frames in each image group is determined based on the pixel similarity, specifically, the similarity is (pixel similarity + number of differentiated pixels × 1)/total number of pixels.

In the video processing method provided by this embodiment, a color histogram vector of each image frame in a mapped image group is obtained, a similarity between two adjacent frames in each image group is determined based on the color histogram vector, and a target similarity of each frame is determined based on the color histogram vector; or, performing normalization operation on the mapped image group to obtain a normalized image group, acquiring a pixel pair corresponding to a differentiated pixel point between two adjacent image frames in the normalized image group, and determining the pixel similarity between each pixel pair; and based on the pixel similarity, determining the similarity between two adjacent frames in each image group, and based on the target similarity of each frame, accurately obtaining the target similarity according to a color histogram vector or a pixel pair corresponding to a differential pixel point, so as to improve the accuracy of the first frame and further improve the video compression ratio.

Based on the above respective embodiments, a fifth embodiment of the video processing method of the present invention is proposed, in which the step S104 includes:

step S501, obtaining a low color gamut picture and a color picture including UV components corresponding to each frame to be processed, using the low color gamut picture as the input of a GAN model to be trained, and performing model training by using the color picture as the output of the GAN model to be trained to obtain a pre-trained GAN model, wherein the GAN model to be trained is a model GAN corresponding to the style of the video to be processed;

step S502, if a second frame with the target similarity smaller than a preset threshold exists in each image group, determining a compressed video corresponding to the video to be processed based on the color difference, a pre-trained GAN model, the low color gamut picture, the second frame, the color value corresponding relation and the reference frame;

step S503, if there is no second frame in each image group whose target similarity is smaller than the preset threshold, determining a compressed video corresponding to the video to be processed based on the color difference, the pre-trained GAN model, the low color gamut picture, the color value correspondence, and the reference frame.

In this embodiment, after the first low color gamut image frame is obtained, a corresponding to-be-trained GAN model is determined according to a style of a to-be-processed video, and a low color gamut picture and a color picture including a UV component corresponding to each to-be-processed frame are obtained, specifically, RGB of the to-be-processed frame is converted into a color space YUV (where Y represents brightness, that is, a gray value, and U and V represent chromaticity) by using an existing algorithm. And then, taking the low color gamut picture as the input of the GAN model to be trained, and taking the color picture as the output of the GAN model to be trained for model training to obtain a pre-trained GAN model.

Then, judging whether a second frame with the target similarity smaller than a preset threshold exists in each image group, if so, taking the second frame as a reserved frame, not coding the second frame, and determining a compressed video based on the color difference, the pre-trained GAN model, the low color gamut image, the second frame, the corresponding relation of the color values and the reference frame; namely, the color difference, the pre-trained GAN model, the low color gamut picture, the second frame, the color value corresponding relation and the reference frame are subjected to video encapsulation according to the sequence of the image group to obtain the compressed video.

And if not, performing video encapsulation on the color difference, the pre-trained GAN model, the low color gamut picture, the color value corresponding relation and the reference frame according to the sequence of the image group to obtain a compressed video.

In this embodiment, if the compressed video is used locally, only the model number corresponding to the pre-trained GAN model is encapsulated during video encapsulation, otherwise, the pre-trained GAN model may be directly encapsulated.

In the video processing method provided in this embodiment, a low color gamut picture corresponding to each frame to be processed and a color picture including a UV component are obtained, the low color gamut picture is used as an input of a GAN model to be trained, and the color picture is used as an output of the GAN model to be trained to perform model training, so as to obtain a pre-trained GAN model, where the GAN model to be trained is a model GAN corresponding to the style of the video to be processed; then if a second frame with the target similarity smaller than a preset threshold exists in each image group, determining a compressed video corresponding to the video to be processed based on the color difference, a pre-trained GAN model, the low color gamut picture, the second frame, the color value corresponding relation and the reference frame; if each image group does not have a second frame with the target similarity smaller than a preset threshold, determining a compressed video corresponding to the video to be processed based on the color difference, the pre-trained GAN model, the low color gamut picture, the color value corresponding relation and the reference frame, packaging information of the pre-trained GAN model into the compressed video, facilitating subsequent decoding and coloring of the compressed video, and meanwhile, reserving the second frame to ensure the authenticity of the second frame and avoid coding the second frame to cause serious distortion of the decoded image.

Based on the fifth embodiment, a sixth embodiment of the video processing method of the present invention is proposed, in this embodiment, after step S104, the video processing method further includes:

step S601, acquiring a pre-trained GAN model corresponding to a compressed video, and inputting video frames in the compressed video into the pre-trained GAN model for model training to obtain trained video frames;

step S602, processing each image group in the trained video frame based on the Y value of the reference frame in each image group of the compressed video to obtain a processed video frame;

step S603, determining whether each image group corresponding to the processed video frame has a color difference image group or not based on the color difference and color value corresponding relation in the compressed video;

in step S604, if the target video does not exist, the processed video frame is used as the decoded target video.

In this embodiment, when the compressed video needs to be decoded, a pre-trained GAN model corresponding to the compressed video is obtained first, for example, the pre-trained GAN model is locally obtained through a model number in the compressed video, or the pre-trained GAN model is obtained in the compressed video. And inputting the video frame in the compressed video into a pre-trained GAN model for model training to obtain a trained video frame. And the reference frame and the second frame do not need model training.

Then, acquiring a reference frame in each image group of the compressed video, obtaining a Y value (brightness value) of each reference frame, respectively processing each image group in the trained video frame based on each Y value to obtain a processed video frame, and replacing the Y value of the video frame in the trained image group with the Y value of the corresponding reference frame in any trained image group in the trained video frame to obtain the trained video frame.

Then, determining whether each image group corresponding to the processed video frame has a color difference image group or not based on the color difference and color value corresponding relation in the compressed video; the method comprises the steps of acquiring current color differences of people and objects in all image groups corresponding to a processed video frame based on a processing mode the same as the color differences in a compressed video, then respectively determining whether difference values between the current color differences and the color differences of the people and the objects are smaller than a preset color difference value, if the difference values are larger than or equal to the preset color difference value, determining whether color values of a part area of the processed video frame are in a corresponding color value range based on a color value corresponding relation, if yes, determining whether color difference image groups exist in all image groups corresponding to the processed video frame, and further taking the processed video frame as a decoded target video.

Further, in an embodiment, after step S604, the method further includes:

step a, if the color difference image group exists, acquiring a first color distribution corresponding to a difference image frame in the color difference image group and a second color distribution corresponding to a reference frame in the difference image frame;

b, matching the first color distribution based on the second color distribution to obtain a processed difference image frame, and replacing the difference image frame in the color difference image group with the processed difference image frame to obtain a processed color difference image group;

and c, determining the target video based on the processed color difference image group and the processed normal image group in the video frame.

If the difference value exists, the video frame of the current color difference with the difference value smaller than the preset color difference value in the processed video frame is used as a difference image frame, and/or the video frame with the color value of the position area in the processed video frame outside the corresponding color value range is used as the difference image frame, a reference frame is determined according to the difference image group to which the difference image frame belongs, and a first color distribution corresponding to the difference image frame in the color difference image group and a second color distribution corresponding to the reference frame in the difference image group are obtained, wherein the first color distribution is the color distribution of pixels corresponding to different characters, objects or parts, and the second color distribution is the color distribution of pixels corresponding to different characters, objects or parts in the reference frame.

Then, the first color distribution is matched based on the second color distribution to obtain a processed differential image frame, specifically, each pixel point in the second color distribution is matched with each pixel point in the first color distribution, the pixel value of each pixel point in the first color distribution is adjusted to the pixel value of each pixel point in the second color distribution, or a certain fault tolerance range can be set, a pixel adjustment range corresponding to the pixel value of each pixel point in the second color distribution is obtained first, and the pixel value of each pixel point in the first color distribution is adjusted to the closest pixel value in the corresponding pixel adjustment range.

For example, the pixel values of the pixels in the second color distribution are sequentially: #8F0000, # D70000, # F00000, and the pixel values of the respective pixel points in the first color distribution are: and #3F0000, # D80000, # F00000, the proportion of the pixel adjustment range is 10%, and #3F0000 exceeds #8F0000, so that #3F0000 needs to match with the nearest color value #8F0000, the similarity between # D70000 and # D80000 is high, and within the fault tolerance range, # F00000 and # F00000 are consistent.

Finally, replacing the difference image frame in the color difference image group with the processed difference image frame to obtain a processed color difference image group, and determining the target video based on the processed color difference image group and the normal image group in the processed video frame

In the video processing method provided in this embodiment, a pre-trained GAN model is obtained based on a model number in the compressed video, and a video frame in the compressed video is input to the pre-trained GAN model for model training, so as to obtain a trained video frame; processing each image group in the trained video frame based on the Y value of the reference frame in each image group in the compressed video to obtain a processed video frame; then acquiring a second color difference of a person or an object in each image group of the processed video frame, and determining whether each image group of the processed video frame has a color difference image group or not based on the color difference and the second color difference; and if the target video does not exist, the processed video frame is used as the decoded target video, and the target video can be accurately decoded according to the compressed video.

The present invention also provides a video processing apparatus, and referring to fig. 4, the video processing apparatus includes:

a segmentation module 10, configured to perform a shot segmentation operation on a key frame of a video to be processed to obtain a plurality of image groups, where each frame in the image groups belongs to the same shot;

the first acquiring module 20 is configured to acquire color differences of people and objects in each image group, determine a target similarity of each frame based on a similarity between two adjacent frames in the image group, determine a first frame in each image group, where the target similarity is greater than or equal to a preset threshold, and determine each reference frame and each frame to be processed based on the first frame;

a disassembling module 30, configured to perform structure disassembling operation on the frame to be processed, so as to obtain color value ranges corresponding to the part regions in the frame to be processed, and obtain color value correspondence between the part regions and the color value ranges of the frame to be processed in each image group;

a second obtaining module 40, configured to obtain a first low color gamut image frame corresponding to each frame to be processed, and determine a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value correspondence, and the reference frame.

The methods executed by the program units can refer to various embodiments of the video processing method of the present invention, and are not described herein again.

The invention also provides a computer readable storage medium.

The computer-readable storage medium of the present invention has stored thereon a video processing program which, when executed by a processor, implements the steps of the video processing method as described above.

The method implemented when the video processing program running on the processor is executed may refer to each embodiment of the video processing method of the present invention, and details thereof are not repeated herein.

Furthermore, an embodiment of the present invention further provides a computer program product, which includes a video processing program, and when the video processing program is executed by a processor, the steps of the video processing method described above are implemented.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or system in which the element is included.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A video processing method, characterized in that the video processing method comprises the steps of:

acquiring color differences corresponding to each image group, determining target similarity of each frame based on the similarity between two adjacent frames in the image group, determining a first frame with the target similarity being larger than or equal to a preset threshold value in each image group, determining a reference frame and a frame to be processed based on the first frame, wherein the color differences comprise character color differences and object color differences, determining the character color differences for each image group based on a character maximum frame to which the maximum area of each character in the image group belongs and a character minimum frame to which the minimum area belongs, and determining the object color differences based on an object maximum frame to which the maximum area of each object belongs and an object minimum frame to which the minimum area belongs;

performing structure disassembly operation on the frame to be processed to respectively obtain color value ranges corresponding to part regions in the frame to be processed, and respectively obtaining color value corresponding relations between the part regions and the color value ranges of the frame to be processed in each image group, wherein the part regions comprise all parts of people and objects, or the part regions comprise a plurality of regions with discontinuous color values in the frame to be processed;

acquiring a first low color gamut image frame corresponding to each frame to be processed, and determining a compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relationship and the reference frame, wherein the color difference, the first low color gamut image frame, the color value corresponding relationship and the reference frame are subjected to video encapsulation according to the sequence of an image group to obtain the compressed video, and the first low color gamut image frame is a black-and-white image frame or a wire frame image frame corresponding to the frame to be processed.

2. The video processing method according to claim 1, wherein the step of performing a structure decomposition operation on the frame to be processed to obtain color value ranges corresponding to the portion regions in the frame to be processed respectively comprises:

respectively mapping the to-be-processed frames based on preset rgb color partitions to obtain mapped to-be-processed frames, obtaining second low color gamut image frames corresponding to the mapped to-be-processed frames, determining boundary lines in each second low color gamut image frame based on a color value continuity rule, respectively determining part areas in the to-be-processed frames based on the boundary lines, and obtaining color value ranges corresponding to the part areas based on the to-be-processed frames, wherein the second low color gamut image frames are black-and-white images or grayscale images; alternatively, the first and second electrodes may be,

3. The video processing method of claim 1, wherein the step of obtaining a first low color gamut image frame corresponding to each of the frames to be processed comprises:

4. The video processing method of claim 1, wherein the step of determining the object similarity of each frame based on the similarity between two adjacent frames in the image group comprises:

5. The video processing method according to any one of claims 1 to 4, wherein the step of determining the compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value correspondence, and the reference frame comprises:

6. The video processing method according to claim 5, wherein after the step of determining the compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value correspondence, and the reference frame, the video processing method further comprises:

acquiring a pre-trained GAN model corresponding to a compressed video, and inputting a video frame in the compressed video into the pre-trained GAN model for model training to obtain a trained video frame;

determining whether each image group corresponding to the processed video frame has a color difference image group or not based on the color difference and the color value corresponding relation in the compressed video, and specifically, acquiring the current color difference of a person and the current color difference of an object in each image group corresponding to the processed video frame; if the difference value between the current color difference value of the person and the color difference value of the corresponding image group in the compressed video is larger than or equal to the preset color difference value, and the difference value between the current color difference value of the object and the color difference value of the corresponding image group in the compressed video is larger than or equal to the preset color difference value, determining whether the color value of the part area of the processed video frame is in the corresponding color value range or not based on the color value corresponding relation, wherein if the color value is in the corresponding color value range, determining that the color difference image group does not exist in each image group corresponding to the processed video frame;

7. The video processing method of claim 6, wherein the step of determining whether the color difference image group exists in each image group of the processed video frame further comprises:

8. A video processing apparatus, characterized in that the video processing apparatus comprises:

the first acquisition module is used for acquiring color differences corresponding to each image group, determining the target similarity of each frame based on the similarity between two adjacent frames in the image group, determining the first frame with the target similarity being larger than or equal to a preset threshold value in each image group, determining each reference frame and each frame to be processed based on the first frame, wherein the color differences comprise character color differences and object color differences, determining the character color differences for each image group based on the maximum character frame to which the maximum area of each character in the image group belongs and the minimum character frame to which the minimum area belongs, and determining the object color differences based on the maximum object frame to which the maximum area of each object belongs and the minimum object frame to which the minimum area belongs;

a disassembling module, configured to perform structure disassembling operation on the frame to be processed to obtain color value ranges corresponding to part regions in the frame to be processed, and obtain color value correspondence between the part regions and the color value ranges of the frame to be processed in each image group, where the part regions include parts of people and objects, or the part regions include multiple regions with discontinuous color values in the frame to be processed;

the second acquisition module is used for acquiring each first low color gamut image frame corresponding to the frame to be processed, and determining the compressed video corresponding to the video to be processed based on the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame, wherein the color difference, the first low color gamut image frame, the color value corresponding relation and the reference frame are subjected to video packaging according to the sequence of the image group to obtain the compressed video, and the first low color gamut image frame is a black-and-white image frame or a wire frame image frame corresponding to the frame to be processed.

9. A video processing apparatus, characterized in that the video processing apparatus comprises: memory, processor and video processing program stored on the memory and executable on the processor, which when executed by the processor implements the steps of the video processing method according to any of claims 1 to 7.

10. A computer-readable storage medium, characterized in that a video processing program is stored thereon, which when executed by a processor implements the steps of the video processing method according to any one of claims 1 to 7.