CN111107414A - Video processing method and device - Google Patents

Video processing method and device

Info

Publication number
CN111107414A
CN111107414A
Authority
CN
China
Prior art keywords
region
area
rolling
current frame
sequence
Prior art date
Legal status
Pending
Application number
CN201811245795.7A
Other languages
Chinese (zh)
Inventor
刘晓玲
陈云海
林立宇
张萍
Current Assignee
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN201811245795.7A priority Critical patent/CN111107414A/en
Publication of CN111107414A publication Critical patent/CN111107414A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/278Subtitling

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Systems (AREA)

Abstract

The present disclosure provides a video processing method and apparatus. The video processing apparatus detects a video picture to obtain a scrolling-caption region and a non-scrolling-caption region, performs deinterlacing matched to scrolling captions on the scrolling-caption region, and performs deinterlacing matched to non-scrolling content on the non-scrolling-caption region. By partitioning the video picture into a scrolling-caption region and a non-scrolling-caption region and applying matched deinterlacing to each region, an overall optimal picture quality is obtained.

Description

Video processing method and device
Technical Field
The present disclosure relates to the field of information processing, and in particular, to a video processing method and apparatus.
Background
Conventional analog signals (e.g., television signals) are interlaced, while internet video files tend to be progressive. Converting interlaced material to progressive scanning requires deinterlacing. A variety of deinterlacing algorithms exist in the industry, each suited to different motion characteristics and each with its own advantages and disadvantages. For example, the YADIF algorithm produces good image quality with sharp edges but leaves serious trailing on scrolling captions, while the LI sub-filter of the PP (postprocessing) filter provided by FFmpeg restores scrolling captions well, although it blurs picture edges.
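The edge-blurring trade-off can be illustrated with a minimal sketch of linear-interpolation deinterlacing, the idea behind the LI sub-filter. This is an illustrative reimplementation for explanation only, not FFmpeg's actual code:

```python
def li_deinterlace(top_field):
    """Rebuild a full frame from the top field (the even scan lines) by
    linear interpolation: each missing odd line is the average of the
    field lines above and below it. This removes interlace combing but
    softens vertical detail, which is why LI blurs picture edges while
    handling scrolling captions cleanly."""
    frame = []
    for i, row in enumerate(top_field):
        frame.append(row[:])  # keep the original field line
        if i + 1 < len(top_field):
            nxt = top_field[i + 1]
            # interpolated line: per-pixel average of the two neighbors
            frame.append([(a + b) // 2 for a, b in zip(row, nxt)])
        else:
            frame.append(row[:])  # no line below: duplicate the last one
    return frame
```

A hard edge between two field lines (pixel values 0 then 10) becomes a 0-5-10 ramp in the output, showing the vertical softening the background section describes.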
Disclosure of Invention
The inventor has found through research that a video picture often contains regions with different interlacing characteristics; for example, the main picture and the scrolling captions superimposed on the video signal exhibit different motion characteristics. Applying a single deinterlacing algorithm to the whole picture forces a trade-off: if one region is handled well, the other suffers, and overall optimal picture quality cannot be obtained.
To this end, the present disclosure provides a scheme that applies matched deinterlacing to image regions having different motion characteristics.
In accordance with an aspect of one or more embodiments of the present disclosure, there is provided a video processing method including: detecting a video picture to obtain a scrolling-caption region and a non-scrolling-caption region; performing, for the scrolling-caption region, deinterlacing matched to scrolling captions; and performing, for the non-scrolling-caption region, deinterlacing matched to non-scrolling captions.
In some embodiments, detecting the video picture comprises: taking a designated frame as the current frame and equally dividing the region to be determined in the video picture of the current frame; extracting a corresponding feature sequence s0 for a designated region a0; estimating, according to all possible moving directions and moving speeds of a scrolling caption, the region a1 in which the designated region a0 would be located in the video picture of the frame following the current frame; extracting a corresponding feature sequence s1 for the region a1 in that video picture; and, in the case where the feature sequence s1 contains the feature sequence s0, determining the region a0 to be a scrolling-caption region.
In some embodiments, in the case where the feature sequence s1 contains the feature sequence s0, determining the region a0 to be a scrolling-caption region comprises: determining that the region a0 has moved; taking the frame following the current frame as the current frame, and repeating the step of estimating the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption; and, if all n consecutive detections indicate that the region a0 has moved, determining the region a0 to be a scrolling-caption region, n being greater than a predetermined threshold.
In some embodiments, the moving direction and moving speed of the region a0 are determined according to the moving state of the feature sequence s0.
In some embodiments, when adjacent regions are both scrolling-caption regions and the deviation between their moving directions is within a predetermined range, the adjacent regions are merged.
In some embodiments, the feature sequence is a sequence of corner points.
In accordance with another aspect of one or more embodiments of the present disclosure, there is provided a video processing apparatus including: a region detection module configured to detect a video picture to obtain a scrolling-caption region and a non-scrolling-caption region; and a deinterlacing module configured to perform deinterlacing matched to scrolling captions for the scrolling-caption region and deinterlacing matched to non-scrolling captions for the non-scrolling-caption region.
In some embodiments, the region detection module takes a designated frame as the current frame, equally divides the region to be determined in the video picture of the current frame, extracts a corresponding feature sequence s0 for a designated region a0, estimates the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption, extracts a corresponding feature sequence s1 for the region a1 in that video picture, and determines the region a0 to be a scrolling-caption region in the case where the feature sequence s1 contains the feature sequence s0.
In some embodiments, the region detection module is configured to determine that the region a0 has moved in the case where the feature sequence s1 contains the feature sequence s0, take the next frame as the current frame, repeat the operation of estimating the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption, and, if all n consecutive detections indicate that the region a0 has moved, determine the region a0 to be a scrolling-caption region, n being greater than a predetermined threshold.
In some embodiments, the region detection module is further configured to determine the moving direction and moving speed of the region a0 according to the moving state of the feature sequence s0.
In some embodiments, the apparatus further includes a region merging module configured to merge adjacent regions when the adjacent regions are both scrolling-caption regions and the deviation between their moving directions is within a predetermined range.
In some embodiments, the feature sequence is a sequence of corner points.
In accordance with another aspect of one or more embodiments of the present disclosure, there is provided a video processing apparatus including: a memory configured to store instructions; a processor coupled to the memory, the processor configured to perform a method according to any of the embodiments described above based on instructions stored in the memory.
According to another aspect of one or more embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, which when executed by a processor, implement a method as described above in relation to any one of the embodiments.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an exemplary flow diagram of a video processing method according to an embodiment of the present disclosure;
fig. 2 is an exemplary flow diagram of a video processing method of another embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a screen display according to an embodiment of the disclosure;
FIG. 4 is a schematic view of a screen display according to another embodiment of the present disclosure;
fig. 5 is an exemplary flow chart of a video processing method of yet another embodiment of the present disclosure;
fig. 6 is an exemplary block diagram of a video processing apparatus of one embodiment of the present disclosure;
fig. 7 is an exemplary block diagram of a video processing apparatus of one embodiment of the present disclosure;
FIG. 8 is a diagram of an original video frame;
FIG. 9 is a schematic diagram of YADIF deinterlacing of an original video frame;
FIG. 10 is a diagram illustrating an original video frame after LI deinterlacing;
fig. 11 is a schematic diagram of an original video picture after partitioned deinterlacing.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Fig. 1 is an exemplary flowchart of a video processing method according to an embodiment of the present disclosure. In some embodiments, the method steps of the present embodiment may be performed by a video processing device.
In step 101, a video picture is detected to obtain a scrolling-caption region and a non-scrolling-caption region.
In step 102, deinterlacing matched to scrolling captions is performed for the scrolling-caption region, and deinterlacing matched to non-scrolling captions is performed for the non-scrolling-caption region.
For example, the YADIF algorithm is applied to the non-scrolling-caption region, and the LI algorithm, a sub-filter of the PP filter provided by FFmpeg, is applied to the scrolling-caption region.
In the video processing method provided by the above embodiment of the present disclosure, the video picture is partitioned into a scrolling-caption region and a non-scrolling-caption region, and matched deinterlacing is then applied to the different regions. An overall optimal picture quality is thereby obtained.
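One way to realize this per-region processing with FFmpeg's real filters (`split`, `yadif`, `crop`, `pp=li`, `overlay`) is a filtergraph that deinterlaces the whole frame with YADIF, deinterlaces a cropped scrolling-caption region with LI, and overlays the latter back onto the former. The sketch below only builds the filtergraph string; the region coordinates are hypothetical placeholders that would come from the detection step, and this construction is the author's own illustration, not one given in the disclosure:

```python
def region_filtergraph(crawl_x, crawl_y, crawl_w, crawl_h):
    """Build an FFmpeg filtergraph string: YADIF on the full frame,
    LI (pp=li) on the cropped scrolling-caption rectangle, then overlay
    the LI result back at the same position."""
    return (
        "[0:v]split=2[bg][fg];"
        "[bg]yadif[bgd];"
        f"[fg]crop=w={crawl_w}:h={crawl_h}:x={crawl_x}:y={crawl_y},pp=li[fgd];"
        f"[bgd][fgd]overlay=x={crawl_x}:y={crawl_y}[out]"
    )

# hypothetical 1280x80 ticker region at the bottom of a 720p frame
graph = region_filtergraph(0, 640, 1280, 80)
# would be passed as: ffmpeg -i in.ts -filter_complex "<graph>" -map "[out]" out.mp4
```

The graph is a plain string, so it can be generated per video once the detector has located the scrolling-caption rectangle.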
Fig. 2 is an exemplary flowchart of a video processing method according to another embodiment of the present disclosure. In some embodiments, the method steps of the present embodiment may be performed by a video processing device.
In step 201, a designated frame is taken as the current frame, and the region to be determined in the video picture of the current frame is equally divided.
In step 202, a corresponding feature sequence s0 is extracted for the designated region a0.
In some embodiments, the feature sequence may be a sequence of corner points.
In step 203, the region a1 in which the designated region a0 is located in the video picture of the next frame is estimated according to all possible moving directions and moving speeds of the scrolling caption.
In step 204, a corresponding feature sequence s1 is extracted for the region a1 in the video picture of the next frame.
In step 205, in the case where the feature sequence s1 contains the feature sequence s0, the region a0 is determined to be a scrolling-caption region.
Fig. 3 is a schematic view of a screen display according to an embodiment of the disclosure. As shown in fig. 3, the region to be determined in the video picture of the designated frame is equally divided into a plurality of regions; for simplicity, only one region a is shown. For region a, the corresponding corner-point sequence s0 is extracted. The region a1 in which region a is located in the video picture of the next frame is then estimated based on all possible moving directions and moving speeds of the scrolling caption. Since region a may move up, down, left, or right, the positions where it may appear in the next frame are indicated by dotted boxes in fig. 3.
In the video picture of the frame following the designated frame, as shown in fig. 4, a corresponding corner-point sequence s1 is extracted for the region a1, and it is determined whether the corner-point sequence s1 contains the corner-point sequence s0. If s1 contains s0, region a has moved, so region a can be determined to be a scrolling-caption region. In the example shown in fig. 4, region a has moved to the left.
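The containment test on corner sequences can be sketched as follows. Corner extraction itself (e.g. a Harris-style detector) is abstracted away, and representing a feature sequence as a tuple of corner coordinates relative to the region origin is an assumption for illustration, not something the disclosure specifies:

```python
def contains(s1, s0):
    """True if feature sequence s0 appears as a contiguous run inside s1."""
    n, m = len(s1), len(s0)
    return m <= n and any(list(s1[i:i + m]) == list(s0)
                          for i in range(n - m + 1))

def detect_motion(s0, candidates):
    """candidates maps each (direction, speed) hypothesis to the feature
    sequence s1 extracted from the corresponding estimated region a1 in
    the next frame. Returns the first hypothesis whose s1 contains s0
    (i.e. the inferred motion of region a0), or None if no candidate
    matches, meaning the region did not move like a scrolling caption."""
    for hypothesis, s1 in candidates.items():
        if contains(s1, s0):
            return hypothesis
    return None
```

When a hypothesis matches, it directly yields the moving direction and speed of the region, as the later embodiments describe.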
Fig. 5 is an exemplary flowchart of a video processing method according to still another embodiment of the present disclosure. In some embodiments, the method steps of the present embodiment may be performed by a video processing device.
In step 501, the region to be determined in the video picture of the i-th frame is equally divided.
In step 502, a corresponding feature sequence s0 is extracted for the designated region a0.
In some embodiments, the feature sequence may be a sequence of corner points.
In step 503, the region a1 in which the designated region a0 is located in the video picture of the (i+1)-th frame is estimated according to all possible moving directions and moving speeds of the scrolling caption.
In step 504, a corresponding feature sequence s1 is extracted for the region a1 in the video picture of the (i+1)-th frame.
In step 505, it is determined whether the feature sequence s1 contains the feature sequence s0.
If the feature sequence s1 contains the feature sequence s0, step 506 is performed; otherwise, step 510 is performed.
In step 506, the region a0 is determined to have moved, and the number of consecutive movements of the region a0 is counted.
In step 507, it is determined whether the number of consecutive movements of the region a0 is greater than a threshold.
If the number of consecutive movements of the region a0 is not greater than the threshold, step 508 is performed; otherwise, step 509 is performed.
In step 508, i is set to i+1, and step 503 is then repeated.
In step 509, the region a0 is determined to be a scrolling-caption region.
In step 510, the region a0 is determined to be a non-scrolling-caption region.
To avoid erroneous determinations, a plurality of frames are detected continuously: if the region a0 moves in n consecutive frames, where n is greater than the threshold, the region a0 is determined to be a scrolling-caption region. Detection accuracy is thereby improved.
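The loop of steps 503-509 amounts to requiring more than a threshold number of consecutive positive detections before labeling a region. A minimal sketch, where the per-frame `moved` flags stand in for the feature-sequence test of step 505:

```python
def classify_region(moved_flags, threshold):
    """Scan per-frame motion results in order. The region becomes a
    scrolling-caption region once it has moved in more than `threshold`
    consecutive frames (step 509), and a non-scrolling-caption region
    as soon as a single check fails (step 510)."""
    consecutive = 0
    for moved in moved_flags:
        if not moved:
            return "non-scrolling"   # step 510: containment test failed
        consecutive += 1
        if consecutive > threshold:
            return "scrolling"       # step 509: enough consecutive motion
    return "undetermined"            # ran out of frames before deciding
```

A single negative frame resets the decision to non-scrolling, which is what makes the multi-frame check robust against one-off false matches.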
In some embodiments, the moving direction and moving speed of the region a0 may be determined according to the moving state of the feature sequence s0.
In some embodiments, when adjacent regions are both scrolling-caption regions and the deviation between their moving directions is within a predetermined range, the adjacent regions are merged. The efficiency of the deinterlacing process is thereby improved.
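Merging of adjacent scrolling-caption regions could look like the following sketch. Representing each region as an `(is_crawl, angle_deg)` pair and expressing the predetermined deviation range as `max_dev` degrees are both assumptions made for illustration:

```python
def merge_adjacent(regions, max_dev):
    """regions: list of (is_crawl, angle_deg) tuples for adjacent blocks,
    in spatial order. Consecutive crawl regions whose motion directions
    differ by at most max_dev degrees are merged into one group, so the
    merged area can be deinterlaced in a single pass."""
    groups = []
    for region in regions:
        is_crawl, angle = region
        if (groups and is_crawl and groups[-1][0][0]
                and abs(angle - groups[-1][-1][1]) <= max_dev):
            groups[-1].append(region)   # extend the current merged group
        else:
            groups.append([region])     # start a new group
    return groups
```

Two neighboring ticker blocks drifting left at nearly the same angle collapse into one group, while a static block or a caption moving in a different direction starts a new one.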
Fig. 6 is an exemplary block diagram of a video processing apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the video processing apparatus includes a region detection module 61 and a deinterlacing module 62.
The region detection module 61 is configured to detect a video picture to obtain a scrolling-caption region and a non-scrolling-caption region.
The deinterlacing module 62 is configured to perform deinterlacing matched to scrolling captions for the scrolling-caption region and deinterlacing matched to non-scrolling captions for the non-scrolling-caption region.
In the video processing apparatus provided by the above embodiment of the present disclosure, the video picture is partitioned into a scrolling-caption region and a non-scrolling-caption region, and matched deinterlacing is then applied to the different regions. An overall optimal picture quality is thereby obtained.
For example, the YADIF algorithm is applied to the non-scrolling-caption region, and the LI algorithm, a sub-filter of the PP filter provided by FFmpeg, is applied to the scrolling-caption region.
In some embodiments, the region detection module 61 takes a designated frame as the current frame, equally divides the region to be determined in the video picture of the current frame, extracts a corresponding feature sequence s0 for a designated region a0, estimates the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption, extracts a corresponding feature sequence s1 for the region a1 in that video picture, and determines the region a0 to be a scrolling-caption region in the case where the feature sequence s1 contains the feature sequence s0.
In some embodiments, the sequence of features is a sequence of angular points.
In some embodiments, the region detection module 61 is further configured to determine that the region a0 has moved in the case where the feature sequence s1 contains the feature sequence s0, take the next frame as the current frame, repeat the operation of estimating the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption, and determine the region a0 to be a scrolling-caption region if all n consecutive detections indicate that the region a0 has moved, n being greater than the predetermined threshold.
If the region a0 moves in n consecutive frames, where n is greater than the threshold, the region a0 is determined to be a scrolling-caption region. Detection accuracy is thereby improved.
In some embodiments, the region detection module is further configured to determine the moving direction and moving speed of the region a0 according to the moving state of the feature sequence s0.
In some embodiments, as shown in fig. 6, the video processing apparatus further comprises a region merging module 63. The region merging module 63 is configured to merge adjacent regions when the adjacent regions are both scrolling-caption regions and the deviation between their moving directions is within a predetermined range. The efficiency of the deinterlacing process is thereby improved.
Fig. 7 is an exemplary block diagram of a video processing apparatus according to an embodiment of the present disclosure. As shown in fig. 7, the video processing apparatus includes a memory 71 and a processor 72.
The memory 71 is used for storing instructions, the processor 72 is coupled to the memory 71, and the processor 72 is configured to execute the method according to any one of the embodiments in fig. 1, 2 and 5 based on the instructions stored in the memory.
As shown in fig. 7, the video processing apparatus further includes a communication interface 73 for information interaction with other devices, as well as a bus 74, through which the processor 72, the communication interface 73, and the memory 71 communicate with one another.
The memory 71 may comprise high-speed RAM and may further comprise non-volatile memory, such as at least one disk storage. The memory 71 may also be a memory array, or may be partitioned into blocks that are combined into virtual volumes according to certain rules.
Further, the processor 72 may be a central processing unit CPU, or may be an application specific integrated circuit ASIC, or one or more integrated circuits configured to implement embodiments of the present disclosure.
The present disclosure also relates to a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the instructions, when executed by a processor, implement the method according to any one of the embodiments in fig. 1, 2, and 5.
The present disclosure is illustrated below by way of a specific example.
FIG. 8 is a diagram of an original video picture. In fig. 8, the "news live room" logo, the "return trip passenger flow peak after festival" and "spring return trip, civil aviation" captions, and the "14:07" clock are non-scrolling-caption regions, while the scrolling caption reading "department space 2050 years planning" is the scrolling-caption region.
Fig. 9 is a schematic diagram of the original video picture after YADIF deinterlacing. As shown in fig. 9, the image is sharp, but the scrolling caption still shows jagged artifacts.
Fig. 10 is a schematic diagram of the original video picture after LI deinterlacing. As shown in fig. 10, after deinterlacing by the LI algorithm of the PP filter provided by FFmpeg, the scrolling caption is clear, but the image is slightly blurred.
Fig. 11 is a schematic diagram of the original video picture after partitioned deinterlacing. As shown in fig. 11, the non-scrolling-caption region is YADIF-deinterlaced, resulting in a sharper picture, while the scrolling-caption region is processed by the LI sub-filter of the PP filter provided by FFmpeg, so the caption shows no trailing. An overall optimal picture quality is thereby obtained.
In some embodiments, the functional unit modules described above may be implemented as a general purpose Processor, a Programmable Logic Controller (PLC), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable Logic device, discrete gate or transistor Logic, discrete hardware components, or any suitable combination thereof for performing the functions described in this disclosure.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (14)

1. A video processing method, comprising:
detecting a video picture to obtain a scrolling-caption region and a non-scrolling-caption region;
performing, for the scrolling-caption region, deinterlacing matched to scrolling captions; and
performing, for the non-scrolling-caption region, deinterlacing matched to non-scrolling captions.
2. The method of claim 1, wherein detecting a video picture comprises:
taking a designated frame as the current frame, and equally dividing the region to be determined in the video picture of the current frame;
extracting a corresponding feature sequence s0 for a designated region a0;
estimating, according to all possible moving directions and moving speeds of a scrolling caption, the region a1 in which the designated region a0 is located in the video picture of the next frame;
extracting a corresponding feature sequence s1 for the region a1 in the video picture of the next frame; and
determining the region a0 to be a scrolling-caption region in the case where the feature sequence s1 contains the feature sequence s0.
3. The method of claim 2, wherein, in the case where the feature sequence s1 contains the feature sequence s0, determining the region a0 to be a scrolling-caption region comprises:
determining that the region a0 has moved in the case where the feature sequence s1 contains the feature sequence s0;
taking the next frame as the current frame, and repeating the step of estimating the region a1 in which the designated region a0 is located in the video picture of the next frame according to all possible moving directions and moving speeds of the scrolling caption; and
determining the region a0 to be a scrolling-caption region if all n consecutive detections indicate that the region a0 has moved, n being greater than a predetermined threshold.
4. The method of claim 3, wherein
the moving direction and moving speed of the region a0 are determined according to the moving state of the feature sequence s0.
5. The method of claim 4, wherein
adjacent regions are merged when the adjacent regions are both scrolling-caption regions and the deviation between their moving directions is within a predetermined range.
6. The method of claim 2, wherein
the feature sequence is a sequence of corner points.
7. A video processing apparatus comprising:
a region detection module configured to detect a video picture to obtain a rolling caption region and a non-rolling caption region;
and a de-interlacing processing module configured to perform de-interlacing processing matched to rolling captions for the rolling caption region, and de-interlacing processing matched to non-rolling captions for the non-rolling caption region.
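The motivation for matched processing is that in an interlaced frame the two fields of a fast-moving caption show it at different positions, so weaving them produces combing. A minimal sketch (an illustration, not the patented module: "bob" here is simple line doubling, whereas a production de-interlacer would interpolate):

```python
import numpy as np

def deinterlace_region(region, is_rolling):
    """Matched de-interlacing sketch: 'weave' (keep both fields) for
    static content, 'bob' (duplicate the top field over the bottom one)
    for rolling-caption regions where the two fields would comb."""
    if not is_rolling:
        return region.copy()          # weave: interleaved fields already match
    out = region.copy()
    out[1::2] = region[0::2]          # bob: overwrite bottom-field lines
    return out
```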
8. The apparatus of claim 7, wherein,
the region detection module takes a designated frame as a current frame, equally divides a region to be determined in a video picture of the current frame, extracts a corresponding feature sequence s0 for a designated region a0, estimates a region a1 where the designated region a0 is located in a video picture of a next frame of the current frame according to all possible motion directions and motion speeds of the rolling caption, extracts a corresponding feature sequence s1 for the region a1 in the video picture of the next frame of the current frame, and determines the region a0 to be a rolling caption region when the feature sequence s1 contains the feature sequence s0.
9. The apparatus of claim 8, wherein,
the region detection module is configured to: in the case where the feature sequence s1 contains the feature sequence s0, determine that the region a0 moves; take the next frame of the current frame as the current frame, and then repeatedly perform the operation of estimating the region a1 where the designated region a0 is located in the video picture of the next frame of the current frame according to all possible motion directions and motion speeds of the rolling caption; and if the region a0 is determined to move in n consecutive detections, determine that the region a0 is a rolling caption region, where n is greater than a predetermined threshold.
10. The apparatus of claim 9, wherein,
the region detection module is further configured to determine a motion direction and a motion speed of the region a0 according to the motion state of the feature sequence s0.
11. The apparatus of claim 10, further comprising:
a region merging module configured to merge adjacent regions when the adjacent regions are both rolling caption regions and the deviation between their motion directions is within a predetermined range.
12. The apparatus of claim 8, wherein,
the feature sequence is a sequence of corner points.
13. A video processing apparatus comprising:
a memory configured to store instructions;
a processor coupled to the memory, the processor configured to implement the method of any one of claims 1-6 based on the instructions stored in the memory.
14. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method of any one of claims 1-6.
CN201811245795.7A 2018-10-25 2018-10-25 Video processing method and device Pending CN111107414A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811245795.7A CN111107414A (en) 2018-10-25 2018-10-25 Video processing method and device

Publications (1)

Publication Number Publication Date
CN111107414A true CN111107414A (en) 2020-05-05

Family

ID=70417538

Country Status (1)

Country Link
CN (1) CN111107414A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102572289A (en) * 2011-08-02 2012-07-11 上海高清数字科技产业有限公司 Method and device for detecting and processing movie mode
CN102572290A (en) * 2011-12-09 2012-07-11 上海高清数字科技产业有限公司 Detection and processing method for 2-2 film mode
CN104735521A (en) * 2015-03-30 2015-06-24 北京奇艺世纪科技有限公司 Method and device for detecting rolling captions
CN107666560A (en) * 2016-07-28 2018-02-06 北京数码视讯科技股份有限公司 A kind of video interlace-removing method and device


Similar Documents

Publication Publication Date Title
US8107750B2 (en) Method of generating motion vectors of images of a video sequence
US9129401B2 (en) Image processing apparatus, imaging apparatus, and image processing method, configured to process reduced-size images
US8243194B2 (en) Method and apparatus for frame interpolation
US20110211233A1 (en) Image processing device, image processing method and computer program
CN1192104A (en) Histogram equalization method and device in contrast extension apparatus for image processing system
US20120063639A1 (en) Information processing device, recognition method thereof and non-transitory computer-readable storage medium
US9332185B2 (en) Method and apparatus for reducing jitters of video frames
WO2010079559A1 (en) Credit information segment detection method, credit information segment detection device, and credit information segment detection program
CN109729298B (en) Image processing method and image processing apparatus
EP2107523B1 (en) Method for detecting scene change in a video picture sequence
EP1863283B1 (en) A method and apparatus for frame interpolation
US8311269B2 (en) Blocker image identification apparatus and method
US20150172705A1 (en) Window detection device and method on multi-media system
US20160146602A1 (en) Distance detection device
KR101822443B1 (en) Video Abstraction Method and Apparatus using Shot Boundary and caption
CN111107414A (en) Video processing method and device
TW201913352A (en) Method and electronic apparatus for wave detection
US7548280B2 (en) Method and apparatus for Y/C separation
JP2001061151A (en) Motion vector detection method and recording medium
WO2018034304A1 (en) Image processing system
CN103763501B (en) A kind of anti-alternate algorithm of adaptive video and device thereof
CN110418195B (en) Real-time video abbreviating method based on optimal cutting
EP2237559A1 (en) Background motion estimate based halo reduction
CN113487646A (en) Moving target detection method, device, equipment and storage medium
JP2006237716A (en) Image processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200505