CN102521805B - Video word processing method based on interframe information - Google Patents

Video word processing method based on interframe information

Info

Publication number
CN102521805B
CN102521805B; CN201110391472A; CN 201110391472
Authority
CN
China
Prior art keywords
frame
new
current image
text region
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110391472
Other languages
Chinese (zh)
Other versions
CN102521805A (en)
Inventor
田岩
许毅平
文灏
陈柱
孙福生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN 201110391472 priority Critical patent/CN102521805B/en
Publication of CN102521805A publication Critical patent/CN102521805A/en
Application granted granted Critical
Publication of CN102521805B publication Critical patent/CN102521805B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video text processing method based on inter-frame information. The method comprises the following steps: detecting the text region R(x, y, t) of the current image; verifying the text region R(x, y, t) of the current image f(x, y, t); performing background repair on the text region R(x, y, t) from the first frame to the last frame; and performing background repair from the last frame back to the first frame. The method guarantees the accuracy of detection and localization, the correctness of the repair result, and the temporal continuity of the video result.

Description

Video text processing method based on inter-frame information
Technical field
The invention belongs to the field of video image processing applications, and specifically relates to a video text processing method based on inter-frame information.
Background art
In video image processing, both text extraction and image information repair have broad application prospects and have therefore attracted increasing attention in recent years.
However, existing methods for locating text regions in images still find it difficult to achieve complete automatic detection and extraction of text against complex backgrounds. Current image repair algorithms fall mainly into two broad classes, PDE-based repair methods and texture-based repair methods, which mainly embody two different ideas, inference and optimization, respectively. These methods are usually based on single-image processing and ignore the temporal correlation between video frames, which leads to inaccurate repair results and to artifacts such as flicker and jumps when the repaired result is played back.
Summary of the invention
The object of the present invention is to provide a video text processing method based on inter-frame information, which solves the problems in existing methods of inaccurate repair results and of flicker and jumps that easily appear when the repaired result is played back.
The present invention is achieved through the following technical solution:
A video text processing method based on inter-frame information comprises the following steps:
(1) Detect the text region R(x, y, t) of the current image. Let the current image be f(x, y, t), and let the N preceding images be f(x, y, t-1), ..., f(x, y, t-N); the text regions corresponding to these N images are R(x, y, t-1), ..., R(x, y, t-N), where x denotes the horizontal coordinate of the current image, y denotes the vertical coordinate of the current image, and t denotes the frame number of the current image;
(2) Verify the text region R(x, y, t) of the current image f(x, y, t), which specifically comprises the following substeps (a code sketch of this verification step follows the step list):
(21) determine whether a sub-region of the text region R(x, y, t) has appeared in the text regions R(x, y, t-1), ..., R(x, y, t-N);
(22) if it has not appeared, the sub-region can be determined to be a false-alarm region; exclude the sub-region and update the text region R(x, y, t) to R_new(x, y, t), where R_new(x, y, t) is the region to be repaired in the current image;
(23) if it has appeared, continue to verify the other sub-regions in the text region R(x, y, t);
(3) Perform background repair on the text region R(x, y, t) from the first frame to the last frame, which specifically comprises the following substeps:
(31) analyze the motion offset between the current image f(x, y, t) and the preceding N frames; if the motion is large, directly proceed to the next frame; if the motion is small, go to step (32);
(32) compare the text region R_new(x, y, t) of the current image with the text regions R(x, y, t-1), ..., R(x, y, t-N) of the preceding N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N);
(33) for the regions R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N) in the current image f(x, y, t), search for the best matching block in the preceding N frames to complete the repair, and update R_new(x, y, t);
(34) repeat step (3) for the next frame, until the entire video has been processed.
(4) Starting from the last frame, perform background repair moving toward the first frame, which specifically comprises the following substeps:
(41) analyze the motion offset between the current image f(x, y, t) and the following N frames; if the motion is large, directly proceed to the previous frame; if the motion is small, go to step (42);
(42) compare the text region R_new(x, y, t) of the current image with the text regions R(x, y, t+1), ..., R(x, y, t+N) of the following N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N);
(43) for the regions R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N) in the current image f(x, y, t), search for the best matching block in the following N frames to complete the repair, and update R_new(x, y, t);
(44) if R_new(x, y, t) still contains remaining text regions at this point, search for the best matching block within the current frame to repair them; otherwise, repeat step (4) for the previous frame, until the entire video has been processed.
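To make step (2) concrete, the following Python/NumPy sketch verifies the detected text region against the text regions of the preceding N frames and drops sub-regions that never appeared before as false alarms. It is only an illustration under assumptions the patent does not spell out: text regions are represented as binary masks, connected components are used as the sub-regions, and a sub-region counts as "having appeared" when its overlap with an earlier text mask exceeds a threshold; the function name and the threshold value are hypothetical.

```python
import numpy as np
from scipy import ndimage  # used only to split the mask into connected sub-regions


def verify_text_region(R_t, prev_R, overlap_thresh=0.5):
    """Step (2) sketch: remove false-alarm sub-regions from the detected text mask.

    R_t    : H x W bool mask, detected text region R(x, y, t) of the current frame
    prev_R : list of H x W bool masks, text regions R(x, y, t-1) ... R(x, y, t-N)
    returns: R_new(x, y, t), the verified mask (the region still to be repaired)
    """
    labels, num = ndimage.label(R_t)           # split R(x, y, t) into sub-regions
    R_new = np.zeros_like(R_t, dtype=bool)
    for k in range(1, num + 1):
        sub = labels == k                      # one detected sub-region
        area = sub.sum()
        # step (21): has this sub-region appeared in any of the previous N text regions?
        appeared = any((sub & R_prev).sum() >= overlap_thresh * area for R_prev in prev_R)
        if appeared:
            R_new |= sub                       # step (23): keep it and verify the rest
        # step (22): otherwise it is treated as a false alarm and excluded
    return R_new
```

The overlap criterion and the use of connected components as sub-regions are design choices made for the sketch; the patent only requires that each sub-region be checked against the text regions of the earlier frames.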
The method of the present invention has the following advantages: (1) based on the continuity of video text captions, the method uses inter-frame information to verify the detected regions, which improves the accuracy of region localization; (2) based on the continuity of video content, the method uses the available inter-frame information to complete the repair, so that the best matching information can be found quickly while the correctness of the repair result is guaranteed; (3) the method uses each repair result as available information for the next repair, continuously accumulating and extending the continuity of the repair results, which greatly reduces the inter-frame flicker and jumps produced by purely image-based processing methods. In summary, the present invention can guarantee the accuracy of detection and localization, the correctness of the repair result, and the temporal continuity of the video result.
Description of drawings
Fig. 1(a) is an original image.
Fig. 1(b) shows the image after processing by a prior-art method.
Fig. 1(c) shows the image after processing by the method of the present invention.
Fig. 2 is a flowchart of the video text processing method based on inter-frame information of the present invention.
Fig. 3 is a detailed flowchart of step (2) of the method.
Fig. 4 is a detailed flowchart of step (3) of the method.
Fig. 5 is a detailed flowchart of step (4) of the method.
Embodiment
The present invention is further described below in conjunction with the accompanying drawings and a specific implementation case.
First, some of the terms used in the present invention are explained:
Text region of an image: the region of an image that contains text.
False-alarm region: a non-text region of the image that has been detected as a text region.
Background repair: removing the text region and repairing and restoring the background occluded by the text region.
Motion offset: the change in position of a given region between adjacent frames.
Best matching block: the block within the search range that is closest, under a given similarity measure, to the region to be repaired (illustrated by the sketch below).
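To make the terms "motion offset" and "best matching block" concrete, the following Python/NumPy sketch finds the best matching block for a block of the current frame by exhaustive search over a window in a reference frame, using the sum of squared differences (SSD) as the similarity measure. The block size, search radius, and choice of SSD are illustrative assumptions, not requirements of the patent; when the block to be matched contains text pixels, the SSD would in practice be computed only over the known background pixels, which is omitted here for brevity.

```python
import numpy as np


def best_matching_block(frame, ref_frame, top, left, size=8, radius=16):
    """Find the block in ref_frame most similar to frame[top:top+size, left:left+size].

    Similarity is measured by the sum of squared differences (SSD); the search is
    restricted to a (2*radius+1)^2 window around the same position, so the returned
    displacement also serves as the motion offset of the block between the two frames.
    """
    block = frame[top:top + size, left:left + size].astype(np.float64)
    h, w = ref_frame.shape[:2]
    best_ssd, best_off = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > h or x + size > w:
                continue                      # candidate block falls outside the frame
            cand = ref_frame[y:y + size, x:x + size].astype(np.float64)
            ssd = np.sum((block - cand) ** 2)
            if ssd < best_ssd:
                best_ssd, best_off = ssd, (dy, dx)
    return best_off, best_ssd                 # best_off is the motion offset of the best match
```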
As shown in Figs. 2, 3, 4 and 5, the video text processing method based on inter-frame information of the present invention comprises the following steps (a code sketch of the forward and backward repair passes follows the step list):
(1) Detect the text region R(x, y, t) of the current image. Let the current image be f(x, y, t), and let the N preceding images be f(x, y, t-1), ..., f(x, y, t-N); the text regions corresponding to these N images are R(x, y, t-1), ..., R(x, y, t-N), where x denotes the horizontal coordinate of the current image, y denotes the vertical coordinate of the current image, and t denotes the frame number of the current image;
(2) Verify the text region R(x, y, t) of the current image f(x, y, t), which specifically comprises the following substeps:
(21) determine whether a sub-region of the text region R(x, y, t) has appeared in the text regions R(x, y, t-1), ..., R(x, y, t-N);
(22) if it has not appeared, the sub-region can be determined to be a false-alarm region; exclude the sub-region and update the text region R(x, y, t) to R_new(x, y, t), where R_new(x, y, t) is the region to be repaired in the current image;
(23) if it has appeared, continue to verify the other sub-regions in the text region R(x, y, t);
(3) Perform background repair on the text region R(x, y, t) from the first frame to the last frame, which specifically comprises the following substeps:
(31) analyze the motion offset between the current image f(x, y, t) and the preceding N frames; if the motion is large, directly proceed to the next frame; if the motion is small, go to step (32);
(32) compare the text region R_new(x, y, t) of the current image with the text regions R(x, y, t-1), ..., R(x, y, t-N) of the preceding N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N);
(33) for the regions R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N) in the current image f(x, y, t), search for the best matching block in the preceding N frames to complete the repair, and update R_new(x, y, t);
(34) repeat step (3) for the next frame, until the entire video has been processed.
(4) Starting from the last frame, perform background repair moving toward the first frame, which specifically comprises the following substeps:
(41) analyze the motion offset between the current image f(x, y, t) and the following N frames; if the motion is large, directly proceed to the previous frame; if the motion is small, go to step (42);
(42) compare the text region R_new(x, y, t) of the current image with the text regions R(x, y, t+1), ..., R(x, y, t+N) of the following N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N);
(43) for the regions R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N) in the current image f(x, y, t), search for the best matching block in the following N frames to complete the repair, and update R_new(x, y, t);
(44) if R_new(x, y, t) still contains remaining text regions at this point, search for the best matching block within the current frame to repair them; otherwise, repeat step (4) for the previous frame, until the entire video has been processed.
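The following Python/NumPy sketch shows one way to organize the forward pass of step (3) and the backward pass of step (4). It assumes the text regions are kept as binary masks and relies on three helper functions, motion_is_large, repair_from_reference, and repair_within_frame, which stand in for the motion analysis of steps (31)/(41), the inter-frame block matching of steps (33)/(43), and the intra-frame repair of step (44); these helper names, their signatures, and the mask set-difference used for R_new(x, y, t)-R(x, y, t±k) are illustrative assumptions, not part of the patent.

```python
import numpy as np


def forward_backward_repair(frames, masks, N,
                            motion_is_large, repair_from_reference, repair_within_frame):
    """Sketch of steps (3) and (4): a forward and a backward repair pass over the video.

    frames : list of H x W images f(x, y, t)
    masks  : list of H x W bool masks R_new(x, y, t) left after the verification of step (2)
    Assumed helpers (hypothetical signatures):
      motion_is_large(frames, t, refs)                     -> bool
      repair_from_reference(frame, region_mask, ref_frame) -> repaired frame
      repair_within_frame(frame, region_mask)              -> repaired frame
    """
    T = len(frames)

    # Step (3): forward pass, from the first frame to the last frame.
    for t in range(T):
        refs = list(range(max(0, t - N), t))               # the preceding N frames
        if not refs or motion_is_large(frames, t, refs):   # step (31): large motion
            continue                                       # -> go straight to the next frame
        for k in refs:
            usable = masks[t] & ~masks[k]                  # step (32): R_new(t) - R(t-k)
            if usable.any():
                frames[t] = repair_from_reference(frames[t], usable, frames[k])  # step (33)
                masks[t] &= ~usable                        # repaired pixels leave R_new(x, y, t)
        # step (34): the loop itself moves on to the next frame

    # Step (4): backward pass, from the last frame back to the first frame.
    for t in range(T - 1, -1, -1):
        refs = list(range(t + 1, min(T, t + N + 1)))       # the following N frames
        if refs and not motion_is_large(frames, t, refs):  # step (41)
            for k in refs:
                usable = masks[t] & ~masks[k]              # step (42): R_new(t) - R(t+k)
                if usable.any():
                    frames[t] = repair_from_reference(frames[t], usable, frames[k])  # step (43)
                    masks[t] &= ~usable
        if masks[t].any():                                 # step (44): text region still left
            frames[t] = repair_within_frame(frames[t], masks[t])
            masks[t] = np.zeros_like(masks[t])
    return frames
```

In this sketch each repaired frame immediately becomes a reference for later frames, which mirrors the patent's idea of accumulating repair results as available information for subsequent repairs.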
To analyze the effectiveness of the video text processing method based on inter-frame information, the inventors created video test sequences that vary in four aspects: text type, background complexity, background motion speed, and whether the caption text length changes. The text type represents differences in character features and mainly affects the generality of the detection part of the algorithm; the background complexity mainly affects the accuracy of the detection part and the correctness of the false-alarm elimination mechanism; the background motion speed affects whether the repair part of the algorithm can use inter-frame information; and whether the caption text length changes affects how much inter-frame information the repair part of the algorithm can use.
Independent component analysis shows that these four factors are mutually independent in video; they respectively reflect the performance of each stage of detection and repair, and the four characteristics can be distinguished visually, which makes them very useful for selecting and creating video test sequences.
In the experiments, seven video test sequences were created from different combinations of the four factors. According to statistical analysis of a large number of videos, the independent factors in these seven videos follow the combination logic of ordinary video, and the combinations cover most kinds of common video; the combinations are shown in Table 1. Each test sequence is 10 minutes long with a frame rate of 24 frames per second.
Table 1. Characteristics of the seven video test sequences
Fig. 1(a) is an original image, Fig. 1(b) is the result obtained by the Criminisi method, and Fig. 1(c) is the result obtained by the present processing method. As can be seen in Fig. 1(a), the texture of the part occluded by the text is complex and rich in detail. Because Fig. 1(b) was obtained with the single-frame Criminisi repair method, the detailed information of the occluded part cannot be fully recovered, the texture information is missing, and the result is clearly blurred. Compared with Fig. 1(b), the details in Fig. 1(c) are fully recovered and there are almost no repair artifacts.
To better illustrate the advantages of this processing method, a quantitative evaluation using objective evaluation parameters is further given below. The experimental results are shown in Tables 2 and 3; they show that the method of the present invention can guarantee the accuracy of detection and localization, the correctness of the repair result, and the temporal continuity of the video result.
Table 2. Quality evaluation of the text detection results
Table 3. Quality evaluation of the background repair results
Explanation of the evaluation indices:
To quantitatively analyze the text detection performance, the present invention adopts the following two objective quality evaluation indices:
(1) False alarm rate
Refers to the ratio of non-text regions to the total detected text regions in the image detection result:
false alarm rate = (number of non-text regions detected as text regions) / (total number of detected text regions)
The false alarm rate of a video is obtained by averaging the false alarm rates of its frames;
(2) Miss rate
Refers to the ratio of the text regions that are not detected to all text regions in the image, i.e. miss rate = (number of undetected text regions) / (total number of text regions).
The miss rate of a video is obtained by averaging the miss rates of its frames.
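As a concrete reading of the two detection indices, the following Python sketch computes the per-frame false alarm rate and miss rate from region counts and then averages them over all frames of the video; representing each frame's detection result as four counts is an assumption made purely for illustration.

```python
def detection_metrics(per_frame_counts):
    """per_frame_counts: list of (false_detections, total_detections, missed_text, total_text).

    Returns the video-level false alarm rate and miss rate, each obtained by
    averaging the corresponding per-frame ratio, as described above.
    """
    fa_rates, miss_rates = [], []
    for false_det, total_det, missed, total_text in per_frame_counts:
        fa_rates.append(false_det / total_det if total_det else 0.0)
        miss_rates.append(missed / total_text if total_text else 0.0)
    n = len(per_frame_counts)
    return sum(fa_rates) / n, sum(miss_rates) / n
```

For example, detection_metrics([(1, 5, 0, 4), (0, 4, 1, 5)]) returns an averaged false alarm rate of 0.1 and an averaged miss rate of 0.1.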
To quantitatively analyze the video repair performance, the present invention adopts the following four objective quality evaluation indices:
(1) Signal-to-noise ratio (Snr)
Reflects the quality of the image repair effect and is generally calculated with the following formula:
Snr(f_0, f) = \sum_{i,j} ( f_0(i,j) - \mu(f_0) )^2 \Big/ \sum_{i,j} ( f_0(i,j) - f(i,j) )^2
where f is the original image, f_0 is the repaired image, and \mu denotes the mean of an image.
(2) Squared error (Rmse)
The sum of squared differences between corresponding points of the original image and the repaired image; it is an objective measure of the fidelity of the image repair:
Rmse(f_0, f) = \sum_{i,j} ( f_0(i,j) - f(i,j) )^2
(3) Universal quality index (UQI)
Reflects the quality of the image repair by comparing the original image with the repaired image:
UQI(f_0, f) = \frac{4 \, \sigma_{f_0 f} \, \mu(f_0) \, \mu(f)}{( \sigma_{f_0}^2 + \sigma_f^2 ) \, [ \mu(f_0)^2 + \mu(f)^2 ]}
where f is the original image, f_0 is the repaired image, \mu denotes the mean of an image, \sigma^2 denotes the variance of an image, and \sigma_{f_0 f} denotes the covariance of f_0 and f.
(4) Temporal correlation (Corr)
The correlation measures the degree of correlation between the gray values at corresponding positions in consecutive frames of the repaired video; a larger correlation indicates better temporal continuity of the repaired video:
Corr(f_t, f_{t+1}) = \frac{\sum_{i,j} ( f_t(i,j) - \mu(f_t) ) ( f_{t+1}(i,j) - \mu(f_{t+1}) )}{\sqrt{ \sum_{i,j} ( f_t(i,j) - \mu(f_t) )^2 \, \sum_{i,j} ( f_{t+1}(i,j) - \mu(f_{t+1}) )^2 }}
where f_t and f_{t+1} are the repaired images at times t and t+1, respectively, and \mu denotes the mean of an image.
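The four repair-quality indices can be implemented directly from the formulas above. The following Python/NumPy sketch follows the definitions as written (in particular, Rmse here is the plain sum of squared differences, and Corr is computed between two consecutive repaired frames); it is an illustrative implementation, not code taken from the patent.

```python
import numpy as np


def snr(f0, f):
    """Snr(f0, f): repaired image f0 against original image f."""
    f0, f = f0.astype(np.float64), f.astype(np.float64)
    return np.sum((f0 - f0.mean()) ** 2) / np.sum((f0 - f) ** 2)


def rmse(f0, f):
    """Rmse(f0, f): sum of squared differences at corresponding points, as defined above."""
    f0, f = f0.astype(np.float64), f.astype(np.float64)
    return np.sum((f0 - f) ** 2)


def uqi(f0, f):
    """UQI(f0, f): universal quality index between the repaired and original images."""
    f0, f = f0.astype(np.float64), f.astype(np.float64)
    mu0, mu = f0.mean(), f.mean()
    var0, var = f0.var(), f.var()
    cov = np.mean((f0 - mu0) * (f - mu))
    return 4 * cov * mu0 * mu / ((var0 + var) * (mu0 ** 2 + mu ** 2))


def corr(ft, ft1):
    """Corr(f_t, f_{t+1}): temporal correlation between consecutive repaired frames."""
    ft, ft1 = ft.astype(np.float64), ft1.astype(np.float64)
    a, b = ft - ft.mean(), ft1 - ft1.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
```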

Claims (2)

1. A video text processing method based on inter-frame information, comprising the following steps:
(1) detecting the text region R(x, y, t) of the current image, wherein the current image is f(x, y, t), the N preceding images are f(x, y, t-1), ..., f(x, y, t-N), and the text regions corresponding to the N images are R(x, y, t-1), ..., R(x, y, t-N), respectively, where x denotes the horizontal coordinate of the current image, y denotes the vertical coordinate of the current image, and t denotes the frame number of the current image;
(2) verifying the text region R(x, y, t) of the current image f(x, y, t), which specifically comprises the following substeps:
(21) determining whether a sub-region of the text region R(x, y, t) has appeared in the text regions R(x, y, t-1), ..., R(x, y, t-N);
(22) if it has not appeared, determining that the sub-region is a false-alarm region, excluding the sub-region, and updating the text region R(x, y, t) to R_new(x, y, t), wherein R_new(x, y, t) is the region to be repaired in the current image;
(23) if it has appeared, continuing to verify the other sub-regions in the text region R(x, y, t);
(3) performing background repair on the text region R(x, y, t) from the first frame to the last frame, which specifically comprises the following substeps:
(31) analyzing the motion offset between the current image f(x, y, t) and the preceding N frames; if the motion is large, directly processing the next frame; if the motion is small, going to step (32);
(32) comparing the text region R_new(x, y, t) of the current image with the text regions R(x, y, t-1), ..., R(x, y, t-N) of the preceding N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N);
(33) for the regions R_new(x, y, t)-R(x, y, t-1), ..., R_new(x, y, t)-R(x, y, t-N) in the current image f(x, y, t), searching for the best matching block in the preceding N frames to complete the repair, and updating R_new(x, y, t);
(34) repeating step (3) for the next frame, until the entire video has been processed;
(4) starting from the last frame, performing background repair moving toward the first frame, which specifically comprises the following substeps:
(41) analyzing the motion offset between the current image f(x, y, t) and the following N frames; if the motion is large, directly processing the previous frame; if the motion is small, going to step (42);
(42) comparing the text region R_new(x, y, t) of the current image with the text regions R(x, y, t+1), ..., R(x, y, t+N) of the following N frames to obtain the regions of the current image that can be repaired using inter-frame information, namely R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N);
(43) for the regions R_new(x, y, t)-R(x, y, t+1), ..., R_new(x, y, t)-R(x, y, t+N) in the current image f(x, y, t), searching for the best matching block in the following N frames to complete the repair, and updating R_new(x, y, t);
(44) if R_new(x, y, t) still contains remaining text regions at this point, searching for the best matching block within the current frame to repair them; otherwise, repeating step (4) for the previous frame, until the entire video has been processed.
2. The video text processing method according to claim 1, characterized in that inter-frame information is used to complete the region verification and the background repair in steps (2) to (4).
CN 201110391472 2011-11-30 2011-11-30 Video word processing method based on interframe information Expired - Fee Related CN102521805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110391472 CN102521805B (en) 2011-11-30 2011-11-30 Video word processing method based on interframe information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110391472 CN102521805B (en) 2011-11-30 2011-11-30 Video word processing method based on interframe information

Publications (2)

Publication Number Publication Date
CN102521805A CN102521805A (en) 2012-06-27
CN102521805B true CN102521805B (en) 2013-07-24

Family

ID=46292710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110391472 Expired - Fee Related CN102521805B (en) 2011-11-30 2011-11-30 Video word processing method based on interframe information

Country Status (1)

Country Link
CN (1) CN102521805B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105069766B (en) * 2015-07-24 2017-12-08 Beihang University A tablet inscription restoration method based on contour feature description of Chinese character images
CN109872277A (en) * 2017-12-04 2019-06-11 北京京东尚科信息技术有限公司 Information processing method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101572778B (en) * 2008-04-28 2011-11-30 北大方正集团有限公司 Method and system for broadcasting programs
CN101360193A (en) * 2008-09-04 2009-02-04 北京中星微电子有限公司 Video subtitle processing apparatus and method
MX2011003076A (en) * 2009-06-17 2011-04-19 Panasonic Corp Information recording medium for reproducing 3d video, and reproduction device.

Also Published As

Publication number Publication date
CN102521805A (en) 2012-06-27


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130724

Termination date: 20141130

EXPY Termination of patent right or utility model