CN102802074A - Method for extracting and displaying text messages from television signal and television

Info

Publication number: CN102802074A
Application number: CN2012102883679A
Authority: CN
Inventors: 张文军
Original assignee: Hisense Group Co Ltd
Current assignee: Hisense Group Co Ltd
Priority date: 2012-08-14
Filing date: 2012-08-14
Publication date: 2012-11-28
Anticipated expiration: 2032-08-14
Also published as: CN102802074B

Abstract

The invention discloses a method for extracting and displaying text information from a television signal, and a television set. The method comprises the steps of: receiving at least one frame of a television signal; obtaining a still image from the television signal; cropping a detection image from a set region on the still image, wherein the set region covers the text information on the still image; recognizing the detection image and extracting the text information; and displaying the extracted text information. The invention is aimed specifically at the characteristics of text on a television picture: by cropping the part of the still image that contains text, it forms a detection image with a smaller detection area, so that, compared with the prior art, it lowers the computing requirements and raises recognition efficiency. In addition, exploiting the particular nature of the television signal, comparing two frames of images allows the image background to be removed so that the text information can be extracted conveniently.

Description

Method for extracting and displaying text information from a television signal, and television set

Technical field

The present invention relates to the field of video image processing, and in particular to a method for extracting and displaying text information from a television signal and a television set that uses the method.

Background art

To better meet users' needs, many television programs carry subtitles, which are a great help to users in noisy environments and when translation is needed. With the spread of digital television, many television display screens now on the market have a 16:9 aspect ratio, while the pictures of programs currently broadcast by television stations use the standard 4:3 aspect ratio, so the aspect ratio of the television picture does not match that of the display screen. For this reason, many television sets offer several display modes for the user to choose from:

1) 16:9 full-screen mode.

The 4:3 television picture is stretched to a 16:9 aspect ratio so that it fills the screen. In this mode, however, the picture is widened and therefore distorted; people in the picture, for example, look short and stout, which degrades the visual effect of the program.

2) 4:3 ratio mode.

The program picture has the same height as the television screen. The picture is displayed in its normal proportions in this mode, but because the screen is wider than the picture at equal height, the picture cannot fill the width of the screen. Black bars appear on both sides of the screen, the screen is underused, and the picture looks small.

3) Zoom mode.

The program picture has the same width as the television screen. The aspect ratio is normal in this mode, but because the picture is taller than the screen at equal width, the top and bottom edges of the picture cannot be shown. If the parts that cannot be shown matter little to viewing as a whole, this is a good way to make full use of a widescreen television.

However, the subtitles of television programs are usually located at the bottom of the picture, and in most television signals they are embedded in the picture rather than carried separately, so extracting them is somewhat difficult. If the user selects zoom mode, the subtitles at the bottom of the picture cannot be seen, or cannot be seen in full, so the user has to switch display modes whenever the subtitles are needed, which makes viewing inconvenient.

Chinese invention patent application No. 95119442.9 proposes a "device for controlling subtitle display on a wide-aspect-ratio screen", which can detect and extract a subtitle signal from a television signal so that the subtitles can be displayed at a suitable position on the screen. However, that invention applies only to "television signals carrying subtitle information"; that is, the subtitle information it extracts is not embedded in the television picture but is carried separately in the television signal. This limits its application, and in particular its device is not applicable to television program signals in China that do not contain separate subtitle information.

Chinese invention patent application No. 02801652.1 proposes a "method and apparatus for detecting subtitles in a video signal", Chinese patent application No. 201110315054.3 proposes a "method for extracting and recognizing video subtitle text", and Chinese application No. 200710178831.8 proposes a "video subtitle information extraction method". All three schemes can detect and extract text information from a video signal, but all three must search for the text region across the full frame of the video image, and their computation is complex and heavy. This places a considerable burden on the hardware, which is especially noticeable now that high-definition video signals are widespread.

Summary of the invention

The object of the present invention is to provide a method for extracting and displaying text information from a television signal, and a television set, so as to solve the problem that subtitles cannot be displayed when the aspect ratio of the television signal does not match that of the display screen.

The present invention proposes a method for extracting and displaying text information from a television signal, comprising the following steps:

receiving at least one frame of a television signal;

obtaining a still image from the television signal;

cropping a detection image from a set region on the still image, the set region being located at the bottom of the still image;

recognizing the detection image and extracting the text information in it;

displaying the extracted text information.

In the method according to a preferred embodiment of the present invention, the height of the set region on the still image is (1-r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the television screen, and B is the height of the television screen.

In the method according to a preferred embodiment of the present invention, the step of cropping the detection image from the set region on the still image comprises:

extracting the gray level of the still image to obtain the black-and-white picture corresponding to the still image;

detecting the region of the black-and-white picture whose gray level is higher than a preset first threshold, and defining that region as the set region of the still image;

cropping the detection image from the extent of the set region on the still image.

In the method according to a preferred embodiment of the present invention, before the step of recognizing the detection image and extracting the text information in it, the method further comprises:

removing the background part of the detection image.

In the method according to a preferred embodiment of the present invention, the step of removing the background part of the detection image comprises:

comparing the gray values of corresponding pixels on the detection images of two adjacent frames or of two frames several frames apart;

removing, as the background part, the region occupied by pixels whose gray-value change on the detection image is less than a preset second threshold.

In the method according to a preferred embodiment of the present invention, before the step of recognizing the detection image, the method comprises:

examining the detection image to determine whether it contains text information.

In the method according to a preferred embodiment of the present invention, the step of examining the detection image to determine whether it contains text information comprises:

subtracting the gray values of adjacent pixels on the detection image;

counting the transitions in each row of pixels on the detection image;

determining from the transition counts of the rows of the detection image whether the detection image contains text information.

In the method according to a preferred embodiment of the present invention, the step of recognizing the detection image and extracting the text information in it comprises:

detecting the starting point and end point of the text information on the detection image to determine the region of the text information;

performing line segmentation and character segmentation on the region of the text information on the detection image to form single-character images;

extracting the statistical features or structural features of the single-character images;

recognizing the text information on the detection image according to the extracted statistical or structural features of the single-character images.

In the method according to a preferred embodiment of the present invention, after the step of recognizing the text information on the detection image according to the extracted statistical or structural features of the single-character images, the method further comprises:

correcting the recognition result;

recording the size and/or spacing and/or relative position of the text information.

The step of displaying the extracted text information comprises:

displaying, according to the recorded size and/or spacing and/or relative position of the text information, text information with the same shape as the original text.

The present invention also proposes a television set comprising a display screen and a text information extraction module, the text information extraction module comprising:

a still image acquiring unit, configured to obtain a still image from a television program signal;

a detection image acquiring unit, configured to crop a detection image from a set region on the still image, the set region being located at the bottom of the still image;

a text information recognition unit, configured to recognize the detection image and extract the text information in it, so that the extracted text information is displayed on the display screen.

In the television set according to a preferred embodiment of the present invention, the height of the set region on the still image is (1-r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the display screen, and B is the height of the display screen.

In the television set according to a preferred embodiment of the present invention, the detection image acquiring unit further comprises:

a gray-level extraction subunit, configured to extract the gray level of the still image and obtain the black-and-white picture corresponding to the still image;

a set region definition subunit, configured to detect the region of the black-and-white picture whose gray level is higher than a preset first threshold and to define that region as the set region of the still image;

a cropping subunit, configured to crop the detection image according to the extent of the set region on the still image.

In the television set according to a preferred embodiment of the present invention, the text information extraction module further comprises a background removal unit, configured to remove the background part of the detection image.

In the television set according to a preferred embodiment of the present invention, the background removal unit further comprises:

a gray-level comparison subunit, configured to compare the gray values of corresponding pixels on the detection images of two adjacent frames or of two frames several frames apart;

a background removal subunit, configured to remove, as the background part, the region occupied by pixels whose gray-value change on the detection image is less than a preset second threshold.

In the television set according to a preferred embodiment of the present invention, the text information extraction module further comprises a text information confirmation unit, configured to examine the detection image to determine whether it contains text information.

In the television set according to a preferred embodiment of the present invention, the text information confirmation unit further comprises:

a gray-level computation subunit, configured to subtract the gray values of adjacent pixels on the detection image;

a counting subunit, configured to count the transitions in each row of pixels on the detection image, so that whether the detection image contains text information is determined from the transition counts of its rows.

In the television set according to a preferred embodiment of the present invention, the text information recognition unit further comprises:

a region detection subunit, configured to detect the starting point and end point of the text information on the detection image to determine the region of the text information;

a segmentation subunit, configured to perform line segmentation and character segmentation on the region of the text information on the detection image to form single-character images;

a feature extraction subunit, configured to extract the statistical features or structural features of the single-character images, so that the text information on the detection image is recognized according to the extracted statistical or structural features.

In the television set according to a preferred embodiment of the present invention, the text information extraction module further comprises a correction unit, configured to correct the recognition result of the text information recognition unit.

In the television set according to a preferred embodiment of the present invention, the text information extraction module further comprises a recording unit, configured to record the size and/or spacing and/or relative position of the text information, so that the display screen shows text information with the same shape as the original text.

Compared with the prior art, the beneficial effects of the present invention are as follows. The present invention is aimed specifically at the characteristics of text on a television picture: by cropping the part of the still image that contains text, it forms a detection image with a smaller detection area, and therefore lowers the computing requirements and improves recognition efficiency relative to the prior art. In addition, exploiting the particular nature of the television signal, the present invention can remove the image background by comparing two frames of images, so the text information can be extracted conveniently.

Of course, a product implementing the present invention need not achieve all of the above advantages at the same time.

The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present invention are easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings

Fig. 1 is a flow chart of a method for extracting and displaying text information from a television signal according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the set region in which the subtitle information lies on a television still image;

Fig. 3 is a flow chart of cropping a detection image from the set region on the still image according to an embodiment of the present invention;

Fig. 4 is a flow chart of extracting text information from the detection image according to an embodiment of the present invention;

Fig. 5 is another flow chart of a method for extracting and displaying text information from a television signal according to an embodiment of the present invention;

Fig. 6 is a flow chart of determining whether the detection image contains text information according to an embodiment of the present invention;

Fig. 7 is a flow chart of removing the background part of the detection image according to an embodiment of the present invention;

Fig. 8 is a structural diagram of a television set according to an embodiment of the present invention;

Fig. 9 is a structural diagram of another television set according to an embodiment of the present invention;

Fig. 10 is a more detailed structural diagram of a further television set according to an embodiment of the present invention.

Detailed description of the embodiments

To further explain the technical means adopted by the present invention to achieve its intended object, and their effects, the embodiments, methods, steps, and effects of the method for extracting and displaying text information from a television signal and of the television set proposed by the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments.

The foregoing and other technical content, features, and effects of the present invention will be clearly presented in the following detailed description of preferred embodiments with reference to the drawings. Through the description of the embodiments, the technical means adopted by the present invention to achieve its intended object, and their effects, can be understood more deeply and concretely; the accompanying drawings, however, are provided only for reference and illustration and are not intended to limit the present invention.

Referring to Fig. 1, which is a flow chart of a method for extracting and displaying text information from a television signal according to an embodiment of the present invention, the method comprises the following steps:

S10, receive at least one frame of a television signal.

S11, obtain a still image from the television signal. That is, the continuous picture stream of the television signal is separated into individual frames, and each single frame, the smallest unit of the picture stream, is a still image.

S12, crop a detection image from a set region on the still image. The set region is the region of the still image that covers the text information.

Unlike existing text detection and recognition techniques, the present invention performs text detection and recognition only on a small part of the still image (namely the image within the set region), which reduces the amount of computation and improves recognition efficiency.

The text information to be extracted by the present invention is the subtitle information of a television program. Because the subtitles of television programs generally appear at the bottom of the picture, when subtitle information is extracted from a television program signal the set region is placed at the bottom of the still image.

Specifically, suppose that the aspect ratio of the television display screen is R, the aspect ratio of the still image of the television signal is r, the width of the display screen is A, and its height is B. When the displayed picture is stretched until its width reaches the width A of the display screen, let the displayed height of the picture be B'.

From A/B' = r and A/B = R, it follows that B' = RB/r.

When the television signal is shown on the display screen in zoom mode, a strip of the picture of total height B' - B is not displayed. If the picture is displayed symmetrically top and bottom (that is, the undisplayed parts at the top and bottom are equal), a strip of height (B' - B)/2 is not displayed at the bottom. Since B' = RB/r, this strip is (B' - B)/(2B') = (1 - r/R)/2 of the picture height.

Therefore, when the present invention is used to extract and display the subtitle information in a television program signal, the set region is located at the bottom of the still image, and the height of the set region is preferably (1-r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the television screen, and B is the height of the television screen. Take as an example a 4:3 picture shown enlarged on a 16:9 display screen: the undisplayed parts at the top and bottom each account for 1/8 of the whole picture, so the present invention only needs to take the bottom 1/8 of the picture as the set region and crop the detection image from it to extract the subtitles. Of course, the set region may be enlarged as appropriate (for example, when only half of the text is cut off, the set region must be enlarged to extract the complete text), but in general its height need not exceed (1-r/R) * B. See Fig. 2.
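The 1/8 figure in the example can be checked numerically. The following is a minimal sketch, not part of the patent: the function names are hypothetical, and it assumes the set region is measured in pixel rows of the still image itself, so the hidden share (1 - r/R)/2 is applied to the still image's own height.

```python
from fractions import Fraction

def hidden_bottom_fraction(r, R):
    """Share of the picture height cut off at the bottom in zoom mode.

    r: aspect ratio (width/height) of the still image
    R: aspect ratio (width/height) of the display screen
    """
    if r >= R:
        return Fraction(0)  # picture is at least as wide as the screen: nothing is cut off
    return (1 - Fraction(r) / Fraction(R)) / 2

def set_region_rows(image_height, r, R):
    """Rows (top, bottom) of the set region, counted in rows of the still image.

    Assumption: the region is exactly the hidden bottom strip; bottom is exclusive.
    """
    hidden = int(image_height * hidden_bottom_fraction(r, R))
    return image_height - hidden, image_height

print(hidden_bottom_fraction(Fraction(4, 3), Fraction(16, 9)))  # 1/8
print(set_region_rows(576, Fraction(4, 3), Fraction(16, 9)))    # (504, 576)
```

For a 576-row 4:3 frame on a 16:9 screen, only the last 72 rows would need to be examined under these assumptions.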

To crop a suitable detection image, the set region in which the text information lies must be detected. In general, text on a still image has a relatively high gray level, so it suffices to extract the gray level of the still image; the text information can then be located from the gray-level signal. Referring to Fig. 3, cropping the detection image from the set region on the still image may comprise the following steps:

S121, extract the gray level of the still image. This is equivalent to obtaining the black-and-white picture corresponding to the still image.

S122, detect the region whose gray level is higher than a preset first threshold, and define that region as the set region of the still image. Because text has a relatively high gray level on the still image, the first threshold is set according to actual needs or experience and used to find the text region (namely the set region), which improves the efficiency of text detection.

S123, crop the detection image from the extent of the set region on the still image.
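Steps S121 to S123 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the patented implementation: the image is modelled as rows of (R, G, B) tuples, the gray level uses the common BT.601 luma weights, the set region is taken as the bounding box of all pixels above the threshold, and the threshold value 200 is arbitrary. All function names are hypothetical.

```python
def to_gray(rgb_image):
    """S121: convert rows of (R, G, B) tuples to rows of 0-255 gray levels."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_image]

def bright_region(gray, first_threshold):
    """S122: bounding box (top, bottom, left, right) of pixels above the threshold.

    bottom and right are exclusive; returns None if no pixel exceeds the threshold.
    """
    rows = [y for y, row in enumerate(gray) if any(v > first_threshold for v in row)]
    if not rows:
        return None
    cols = [x for row in gray for x, v in enumerate(row) if v > first_threshold]
    return rows[0], rows[-1] + 1, min(cols), max(cols) + 1

def crop(image, box):
    """S123: cut the detection image out of the image."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]

# A 4x4 black frame with two white "text" pixels in row 2.
frame = [[(0, 0, 0)] * 4 for _ in range(4)]
frame[2][1] = frame[2][2] = (255, 255, 255)
gray = to_gray(frame)
box = bright_region(gray, first_threshold=200)
print(box)              # (2, 3, 1, 3)
print(crop(gray, box))  # [[255, 255]]
```

In practice the search would be limited to the bottom strip of the frame, so that bright picture content elsewhere does not enlarge the box.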

S13, recognize the detection image and extract the text information in it.

After the detection image has been cropped, the text information it contains must be recognized. Referring to Fig. 4, this step may further comprise the following steps:

S131, detect the starting point and end point of the text information on the detection image to determine the region of the text information.

Because television subtitles generally occupy only one line, and at most two, the starting point and end point are easy to determine.

S132, perform line segmentation and character segmentation on the region of the text information on the detection image to form single-character images.

For the subsequent recognition, the text information must also be segmented. The purpose of this step is to extract the individual characters from the detection image: line segmentation first cuts the detection image into lines, and character segmentation then isolates the image of each single character within a line. There are many methods of line and character segmentation; for example, horizontal and vertical segmentation may be used, in which the coordinates of each character are found from the projections in the horizontal and vertical directions.
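The projection approach mentioned above can be illustrated as follows. This is a minimal sketch that assumes the detection image has already been binarized (1 for a text pixel, 0 for background); the function names are hypothetical.

```python
def runs(profile):
    """Return (start, end) pairs for consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i
        elif not count and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment(binary):
    """Split a binary image (rows of 0/1) into (top, bottom, left, right) character boxes."""
    boxes = []
    row_profile = [sum(row) for row in binary]           # horizontal projection
    for top, bottom in runs(row_profile):                # line segmentation
        line = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]   # vertical projection
        for left, right in runs(col_profile):            # character segmentation
            boxes.append((top, bottom, left, right))
    return boxes

binary = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment(binary))  # [(1, 3, 0, 2), (1, 3, 4, 5)]
```

A plain projection splits any character whose strokes are separated by an empty column, so a practical system would likely merge narrow neighbouring boxes using the expected character width.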

S133, extract the statistical features or structural features of the single-character images.

Feature extraction from a single-character image includes steps such as thinning and normalization. The extraction of statistical or structural features from single-character images is existing, fairly mature technology and is not described further here. It should be noted that a character pattern can be represented in many forms, each with its own way of building the dictionary; each form can use different features, and each feature can be extracted by different methods. Consequently the discrimination methods, criteria, and mathematical tools all differ, giving rise to a wide variety of character recognition methods. In general, the choice of feature extraction and classifier design determines the processing approach of a recognition system, and the approaches are usually divided into structural pattern recognition, statistical pattern recognition, methods combining statistics and structure, artificial neural networks, and so on.

S134, recognize the text information on the detection image according to the extracted statistical or structural features of the single-character images. Recognizing the text is the process of finding, in an existing feature library, the character class most similar to the character to be recognized.
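The patent leaves the choice of features and classifier open. As one possible illustration of "finding the most similar character class in a feature library", the sketch below uses a simple statistical feature (the share of text pixels in each cell of a 2 x 2 grid) and a nearest-neighbour match by squared distance. The feature, the distance, the toy library, and all names are assumptions made for this example.

```python
def zone_density(glyph, zones=2):
    """Statistical feature: share of text pixels in each cell of a zones x zones grid."""
    h, w = len(glyph), len(glyph[0])
    feature = []
    for zy in range(zones):
        for zx in range(zones):
            ys = range(zy * h // zones, (zy + 1) * h // zones)
            xs = range(zx * w // zones, (zx + 1) * w // zones)
            cells = [glyph[y][x] for y in ys for x in xs]
            feature.append(sum(cells) / len(cells) if cells else 0.0)
    return feature

def classify(glyph, library):
    """Return the label in library whose feature vector is closest to the glyph's."""
    f = zone_density(glyph)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(f, library[label]))

    return min(library, key=distance)

# Toy feature library built from two reference glyphs.
bar = [[1, 0, 0, 0]] * 4
dash = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
library = {"|": zone_density(bar), "-": zone_density(dash)}

noisy_bar = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
print(classify(noisy_bar, library))  # |
```

A real subtitle recognizer would need far richer features and a library covering thousands of character classes; the sketch only shows the matching step.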

S14, display the extracted text information. The text information may be displayed at a position specified by the user or at a default position. To ensure that the user can see the extracted text, the default position must lie in the image area outside the set region of the still image. Of course, if the user chooses not to display the text, it need not be displayed.

Referring to Fig. 5, which is another flow chart of a method for extracting and displaying text information from a television signal according to an embodiment of the present invention, the method comprises the following steps:

S50, receive at least one frame of a television signal.

S51, obtain a still image from the television signal. That is, the continuous picture stream of the television signal is separated into individual frames, and each single frame, the smallest unit of the picture stream, is a still image.

S52, crop a detection image from a set region on the still image. The set region is located at the bottom of the still image.

S53, examine the detection image to determine whether it contains text information.

To improve recognition efficiency and avoid unnecessary computation, the detection image can be examined after it is obtained to determine whether it contains text information. Because there is a gray-level difference between text and the background image, if the gray levels of adjacent pixels on the image are subtracted, the difference jumps wherever there is text. If this subtraction is carried out for every row of pixels and the number of transitions in each row is counted, rows containing text show clearly more transitions than rows without text. Specifically, referring to Fig. 6, step S53 may further comprise the following steps:

S531, subtract the gray values of adjacent pixels on the detection image.

S532, count the transitions in each row of pixels on the detection image, a transition being a difference between adjacent pixel gray values that exceeds a third threshold. The third threshold can be set and adjusted according to actual conditions or experience.

S533, determine from the transition counts of the rows of the detection image whether the detection image contains text information. If the transition counts of all rows of a detection image are low and roughly uniform, it can be judged that the image contains no text information; otherwise, it is judged that it does.
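Steps S531 to S533 amount to counting sharp gray-level changes along each row. The sketch below is one way to express that; the decision rule (at least one row with a minimum number of transitions) and the default values 50 and 4 are assumptions chosen for illustration, since the patent only says the thresholds are set from experience.

```python
def transitions_per_row(gray, third_threshold):
    """S531/S532: per row, count adjacent pixel pairs differing by more than the threshold."""
    return [sum(1 for a, b in zip(row, row[1:]) if abs(a - b) > third_threshold)
            for row in gray]

def has_text(gray, third_threshold=50, min_transitions=4):
    """S533 (simplified): text is assumed present if some row has many transitions."""
    counts = transitions_per_row(gray, third_threshold)
    return max(counts, default=0) >= min_transitions

flat = [[10] * 6 for _ in range(3)]
striped = [[10] * 6, [0, 255, 0, 255, 0, 255], [10] * 6]
print(transitions_per_row(striped, 50))  # [0, 5, 0]
print(has_text(flat), has_text(striped))  # False True
```

Frames for which `has_text` returns False can skip background removal and recognition entirely, which is where the saving in computation comes from.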

S54, remove the background part of the detection image.

The television signal itself changes continuously, whereas the text information generally changes only once every few seconds. If two adjacent frames, or two frames several frames apart, are compared, and the part that changes is removed while the part that stays constant is kept, the background can be removed to a great extent, and the accuracy of text recognition is correspondingly higher. Specifically, referring to Fig. 7, step S54 may further comprise the following steps:

S541, compare the gray values of corresponding pixels on the detection images of two adjacent frames or of two frames several frames apart.

S542, remove, as the background part, the region occupied by pixels whose gray-value change on the detection image is less than a preset second threshold. The second threshold can be set or adjusted according to actual conditions or experience.

For example, suppose the detection image currently being recognized is frame N. Frame N is compared with frame N-n, the frame n frames before it, by comparing the gray levels of corresponding pixels. If the absolute value of the gray-level difference of a pair of corresponding pixels is greater than a particular value i, that pixel is considered to belong to the background picture, so the background region of the image can be determined and removed. In this way the background part of the detection image can be filtered out further, making the text easier to recognize. For the present invention, n is preferably less than 5, and i may be set to 10.
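The frame comparison in the example can be sketched as below. The sketch follows the worked example, treating a pixel whose gray level changes by more than i between frame N-n and frame N as background; the function name and the choice to blank background pixels to 0 are assumptions.

```python
def remove_background(current, earlier, i=10, fill=0):
    """Blank out pixels whose gray level changed by more than i between two frames.

    current: detection image of frame N (rows of gray levels)
    earlier: detection image of frame N-n, same size
    """
    return [[fill if abs(c - e) > i else c for c, e in zip(cur_row, old_row)]
            for cur_row, old_row in zip(current, earlier)]

earlier = [[100, 250, 30]]
current = [[180, 250, 35]]
print(remove_background(current, earlier))  # [[0, 250, 35]]
```

Pixels of the moving picture are blanked, while the static subtitle pixel (250) and any nearly static pixel (35) survive. Static picture areas therefore remain and still have to be rejected by the later recognition steps.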

S55, recognize the detection image and extract the text information in it. Once the processed detection image has been obtained, the text information it contains must be recognized. The recognition process is the same as in step S13 above and is not repeated here.

S56, correct the recognition result. Correction is the process of verifying the extracted text information using word meaning, word frequency, grammar rules, a corpus, or the like.
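The patent does not specify how the correction is carried out. One simple possibility, based on word frequency, is sketched below: the recognizer is assumed to return several candidate characters per position, and the combination that forms the most frequent known word is chosen. The candidate lists, the frequency table, and the function name are all hypothetical.

```python
from itertools import product

def correct(candidates, word_freq):
    """Pick the most frequent known word among combinations of per-character candidates.

    candidates: one list of candidate characters per position, best guess first
    word_freq: hypothetical table mapping known words to frequency counts
    """
    best_guess = "".join(chars[0] for chars in candidates)
    options = ("".join(chars) for chars in product(*candidates))
    known = [word for word in options if word in word_freq]
    return max(known, key=word_freq.get) if known else best_guess

word_freq = {"cat": 50, "cot": 20}
print(correct([["c"], ["1", "a", "o"], ["t"]], word_freq))  # cat
print(correct([["x"], ["1"], ["z"]], word_freq))            # x1z
```

The number of combinations grows quickly with word length, so this brute-force form only suits short words or short candidate lists.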

S57, record the size and/or spacing and/or relative position of the text information.

When recognizing the text information, the present invention can calculate the position of the text and record information such as its character spacing and/or size and/or relative position, so that the original shape of the text is preserved when it is displayed. Alternatively, the user can use the recorded size and/or spacing and/or relative position to adjust the text size, color, border, and so on of the text information to be displayed.
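What such a record might contain can be sketched with a small data structure. The field names and the single-line assumption are illustrative only; the boxes are assumed to be (top, bottom, left, right) tuples in detection-image pixels, with bottom and right exclusive.

```python
from dataclasses import dataclass

@dataclass
class GlyphLayout:
    char: str
    width: int
    height: int
    gap_before: int   # horizontal gap to the previous character, 0 for the first
    rel_x: float      # left edge as a share of the detection image width
    rel_y: float      # top edge as a share of the detection image height

def record_layout(chars, boxes, image_width, image_height):
    """Build one layout record per recognized character on a single subtitle line."""
    records, prev_right = [], None
    for char, (top, bottom, left, right) in zip(chars, boxes):
        gap = 0 if prev_right is None else left - prev_right
        records.append(GlyphLayout(char, right - left, bottom - top, gap,
                                   left / image_width, top / image_height))
        prev_right = right
    return records

for rec in record_layout("AB", [(1, 3, 0, 2), (1, 3, 4, 5)], image_width=8, image_height=4):
    print(rec)
```

Storing positions as shares of the image size rather than absolute pixels makes it straightforward to redraw the text at a different scale or location.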

S58, display the extracted text information.

The text information may be displayed at a position specified by the user or at a default position. To ensure that the user can see the extracted text, the default position must lie in the image area outside the set region of the still image. Of course, if the user chooses not to display the text, it need not be displayed.

The present invention is aimed specifically at the characteristics of text on a television picture: by cropping the part of the still image that contains text, it forms a detection image with a smaller detection area, and therefore lowers the computing requirements and improves recognition efficiency relative to the prior art. In addition, exploiting the particular nature of the television signal, the present invention can remove the image background by comparing two frames of images, so the text information can be extracted conveniently.

The present invention further proposes a television set. Referring to Fig. 8, it comprises a display screen 81 and a text information extraction module 82, which extracts text information from the image of the television signal and displays it on the display screen 81. In zoom mode in particular, the text information extraction module 82 can extract subtitles at the bottom of the television picture that cannot be displayed and show them within the effective display area of the display screen 81, so that the user does not have to switch the television's display mode frequently in order to see the subtitles.

The text information extraction module 82 further comprises a still image acquiring unit 821, a detection image acquiring unit 822, and a text information recognition unit 823. The detection image acquiring unit 822 is connected to the still image acquiring unit 821 and to the text information recognition unit 823, and the text information recognition unit 823 is connected to the display screen 81.

In operation, the still image acquiring unit 821 first obtains a still image from the television program signal: the continuous picture of the television signal is separated into individual frames, and each single frame, the smallest unit, is a still image.

Next, the detection image acquiring unit 822 crops a detection image from the set region of the still image, the set region being the area that covers the text information on the still image. Because subtitles in television programs generally appear at the bottom of the picture, the set region is placed at the bottom of the still image when subtitle information is to be extracted from the television program signal.

Specifically, let the aspect ratio (width to height) of the display screen 81 be R, the aspect ratio of the still image of the television signal be r, and the width and height of the display screen 81 be A and B. When the displayed image of the television signal is stretched so that its width reaches the screen width A, the displayed height of the television signal is B'.

From A/B' = r and A/B = R, it follows that B' = RB/r.

When the television signal is shown on the display screen 81 in zoom mode, a portion of the television picture of height B' - B is not displayed. If the picture is displayed symmetrically in the vertical direction (that is, the undisplayed portions at the top and bottom are equal), a portion of height (B' - B)/2 at the bottom is not displayed. Substituting B' = RB/r, this hidden bottom portion is a fraction (B' - B)/(2B') = (1 - r/R)/2 of the full picture height B'.

Therefore, the height of the set region is preferably (1 - r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the display screen 81, and B is the height of the display screen 81. Taking as an example a 4:3 image shown enlarged on a 16:9 display screen 81, the undisplayed portions at the top and bottom each account for 1/8 of the whole image, so the present invention only needs to take the bottom 1/8 of the television picture as the set region and crop the detection image from it to extract the subtitles. Of course, the set region may be enlarged as appropriate (for example, when only half of the text information is hidden, the set region must be enlarged to extract the complete text), although its height generally need not exceed (1 - r/R) * B. See Fig. 2.
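
The formula above can be checked numerically with a short sketch. The function name `set_region_height` and the 1080-pixel height are assumptions chosen for illustration; exact fractions are used so that the 4:3 on 16:9 example reproduces the value 1/8 without rounding.

```python
from fractions import Fraction


def set_region_height(r, R, B):
    """Preferred set-region height (1 - r/R) * B / 2.

    r -- aspect ratio (width/height) of the still image
    R -- aspect ratio (width/height) of the display screen
    B -- height to which the formula is applied
    """
    return (1 - Fraction(r) / Fraction(R)) * B / 2


r = Fraction(4, 3)    # 4:3 television picture
R = Fraction(16, 9)   # 16:9 display screen

# With B = 1 the result is the fraction of the height taken by the set region.
fraction_hidden = set_region_height(r, R, 1)

# Hypothetical height of 1080 pixels.
height_1080 = set_region_height(r, R, 1080)
```

With these inputs the set region takes 1/8 of the height, which is 135 pixels for a height of 1080.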

Finally, the text information recognition unit 823 recognizes the detection image and extracts the text information in it, so that the extracted text information can be displayed on the display screen 81. The text information may be displayed at a position specified by the user or at a default position. To ensure that the user can see the extracted text information, the default position must lie in the image area outside the set region of the still image. Of course, if the user has set the text not to be displayed, it need not be displayed.

Referring to Fig. 9, which shows the structure of another television set according to an embodiment of the present invention, the television set comprises a display screen 81 and a text information extraction module 82. The text information extraction module 82 further comprises a still image acquiring unit 821, a detection image acquiring unit 822, a text information recognition unit 823, a text information confirmation unit 824, a background removal unit 825, a correcting unit 826, and a recording unit 827. The detection image acquiring unit 822 is connected to the still image acquiring unit 821 and to the text information recognition unit 823; the text information recognition unit 823 is connected to the display screen 81; the text information confirmation unit 824 and the background removal unit 825 are both connected to the detection image acquiring unit 822; and the correcting unit 826 and the recording unit 827 are both connected to the text information recognition unit 823.

The television set of this embodiment adds the text information confirmation unit 824, the background removal unit 825, the correcting unit 826, and the recording unit 827 to the structure of Fig. 8. After cropping the detection image, the detection image acquiring unit 822 first passes it to the text information confirmation unit 824, which checks whether the detection image contains text information. Once text information has been confirmed, the detection image acquiring unit 822 passes the detection image to the background removal unit 825, which removes the background portion of the detection image to improve the efficiency of text detection. The detection image with its background removed is then sent to the text information recognition unit 823 for text recognition.

After the text information recognition unit 823 has recognized the text information on the detection image, the correcting unit 826 corrects the recognition result, that is, it verifies the extracted text information by means of word meaning, word frequency, grammar rules, a corpus, or the like. Once verification has passed, the recording unit 827 records the size and/or spacing and/or relative position of the text information, so that the original appearance of the text is preserved when it is displayed. Alternatively, the user can use the recorded size and/or spacing and/or relative position to adjust the text size, color, border, and so on of the text information to be displayed.

Referring to Fig. 10, which shows a more detailed structure of a television set according to an embodiment of the present invention, the television set comprises a display screen 81 and a text information extraction module 82. The text information extraction module 82 comprises a still image acquiring unit 821, a detection image acquiring unit 822, a text information recognition unit 823, a text information confirmation unit 824, a background removal unit 825, a correcting unit 826, and a recording unit 827. The detection image acquiring unit 822 further comprises a gray extraction subunit 8221, a set region definition subunit 8222, and a cropping subunit 8223. The text information recognition unit 823 further comprises a region detection subunit 8231, a segmentation subunit 8232, and a feature extraction subunit 8233. The text information confirmation unit 824 further comprises a gray calculation subunit 8241 and a statistics subunit 8242. The background removal unit 825 comprises a gray comparison subunit 8251 and a background removal subunit 8252.

After the still image acquiring unit 821 obtains a still image from the television program signal, the still image is first sent to the gray extraction subunit 8221, which extracts its gray values; this is equivalent to obtaining a black-and-white picture corresponding to the still image. The gray extraction subunit 8221 then sends the black-and-white picture to the set region definition subunit 8222, which detects the area of the picture whose gray value is higher than a preset first threshold and defines that area as the set region of the still image. Because text has a relatively high gray value on the still image, the first threshold can be set according to actual needs or experience in order to locate the text area (that is, the set region), which improves the efficiency of text detection. Once the set region has been defined, the cropping subunit 8223 crops the detection image from the still image according to the extent of the defined set region.
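
The thresholding step above can be sketched as follows. This is a simplified illustration under stated assumptions: the gray picture is a 2-D list of 8-bit values, and the set region is taken to be the smallest rectangle enclosing all pixels brighter than the first threshold. The text does not say how the bright area is turned into a region, so the bounding-rectangle choice and the names `find_set_region` and `crop` are hypothetical.

```python
def find_set_region(gray, first_threshold):
    """Bounding box (top, left, bottom, right) of pixels above the threshold.

    bottom and right are exclusive. Returns None if no pixel qualifies.
    """
    rows = [y for y, row in enumerate(gray)
            if any(v > first_threshold for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(gray[0]))
            if any(row[x] > first_threshold for row in gray)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(gray, box):
    """Cut the detection image out of the gray picture."""
    top, left, bottom, right = box
    return [row[left:right] for row in gray[top:bottom]]


# Hypothetical 4x5 gray picture with bright text pixels near the bottom.
gray = [
    [10, 20, 15, 12, 11],
    [12, 18, 20, 14, 10],
    [11, 240, 235, 16, 13],
    [10, 230, 14, 245, 12],
]
box = find_set_region(gray, first_threshold=200)
detection_image = crop(gray, box)
```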

After the cropping subunit 8223 has cropped the detection image, the gray calculation subunit 8241 obtains the detection image and subtracts the gray values of adjacent pixels. The statistics subunit 8242 then uses the results from the gray calculation subunit 8241 to count the jumps in each row of pixels on the detection image, where a jump means that the difference between the gray values of adjacent pixels is greater than a third threshold (which can be set and adjusted according to actual conditions or experience). The number of jumps in each row is used to determine whether the detection image contains text information. If every row of a detection image has few jumps and the counts are evenly distributed, it can be judged that the image contains no text information, and subsequent text recognition for this detection image is abandoned. Otherwise, text information is judged to be present.
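
A minimal sketch of this jump-counting check is given below. The text gives no numeric values, so the default third threshold of 50 and the rule that at least one row needs four or more jumps are assumptions for illustration only, as are the function names.

```python
def count_transitions(row, third_threshold):
    """Number of adjacent-pixel gray differences larger than the threshold."""
    return sum(1 for a, b in zip(row, row[1:]) if abs(a - b) > third_threshold)


def has_text(image, third_threshold=50, min_transitions=4):
    """Judge text to be present if any row has enough jumps.

    third_threshold and min_transitions are illustrative values only.
    """
    counts = [count_transitions(row, third_threshold) for row in image]
    return any(c >= min_transitions for c in counts)


# Hypothetical detection images: one flat, one with alternating strokes.
flat = [[10] * 8, [12] * 8]
striped = [[10] * 8, [0, 255, 0, 255, 0, 255, 0, 0]]
flat_result = has_text(flat)
striped_result = has_text(striped)
```

A stricter rule, for example comparing the spread of the per-row counts, would follow the "few and evenly distributed" wording more closely.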

Meanwhile, the gray comparison subunit 8251 also obtains this detection image and compares the gray values of corresponding pixels on the detection images of two adjacent frames or of frames several frames apart. According to the comparison result from the gray comparison subunit 8251, the background removal subunit 8252 then removes, as the background portion, the region occupied by pixels of the detection image whose gray-value change is less than a preset second threshold (the second threshold can likewise be set or adjusted according to actual conditions or experience). For example, suppose the detection image currently being recognized is frame N. Frame N is compared with frame N-n, the frame n frames before it, by comparing the gray values of corresponding pixels. If the absolute difference in gray value at a pixel is greater than a particular value i, that pixel is regarded as background, so the background region of the image can be identified and removed. In this way the background portion of the detection image is filtered out further, which makes text recognition easier. For the present invention, n is preferably less than 5, and i may be 10.

Afterwards, the region detection subunit 8231 detects the start point and end point of the text information on the detection image in order to determine the area occupied by the text. For television program images, subtitles generally occupy only one line, or two at most, so the start and end points are relatively easy to determine. Once they have been determined, the segmentation subunit 8232 performs line segmentation and character segmentation on the text area of the detection image to form single-character images. The purpose of segmentation is to extract individual characters from the detection image: line segmentation first cuts the detection image into lines, and character segmentation then isolates the image of each single character within a line image. There are many methods for line and character segmentation; for example, horizontal and vertical division can be used, in which the coordinates of each character are found from the projections in the horizontal and vertical directions. After segmentation, the feature extraction subunit 8233 extracts the statistical or structural features of the single-character images in order to recognize the text information on the detection image. Feature extraction on a single-character image includes steps such as thinning and normalization; the extraction of statistical or structural features from single-character images is a mature existing technique and is not described further here. It should be noted that there are many ways of representing character patterns and of building the corresponding dictionaries; each representation can use different features, and each feature can be extracted in different ways. As a result, the discrimination methods, criteria, and mathematical tools differ, giving rise to a wide variety of character recognition methods. In general, the design of the feature extraction and of the classifier determines the processing method a recognition system adopts, and these methods are usually divided into structural pattern recognition, statistical pattern recognition, methods combining statistics and structure, artificial neural network methods, and so on.
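
The projection-based segmentation mentioned above can be sketched as follows. The sketch assumes the detection image has already been binarized into a 2-D list in which 1 marks a text pixel and 0 marks everything else; the binarization step, the function names, and the sample image are hypothetical. Rows are first split into lines using the horizontal projection, and each line is then split into characters using the vertical projection.

```python
def split_runs(profile):
    """Return (start, end) pairs for consecutive non-zero runs in a profile."""
    runs, start = [], None
    for idx, value in enumerate(profile):
        if value and start is None:
            start = idx                      # a run begins
        elif not value and start is not None:
            runs.append((start, idx))        # a run ends (end is exclusive)
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_characters(binary):
    """Boxes (top, left, bottom, right) of characters, line by line."""
    boxes = []
    row_profile = [sum(row) for row in binary]           # horizontal projection
    for top, bottom in split_runs(row_profile):          # line segmentation
        line = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]   # vertical projection
        for left, right in split_runs(col_profile):      # character segmentation
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical binarized detection image containing two small glyphs.
binary = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = segment_characters(binary)
```

Characters that touch or that contain internal gaps would need additional handling, which this sketch does not attempt.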

Finally, the display screen 81 receives the text information extracted from the television picture and displays it in a visible area of the display screen 81.

The above are merely preferred embodiments of the present invention and do not limit the invention in any form. Although the invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the technical content disclosed above to make minor changes or modifications that amount to equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments in accordance with the technical essence of the invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the invention.

Claims

1. A method for extracting and displaying text information from a television signal, characterized by comprising:

receiving at least one frame of television signal;

obtaining a still image from the television signal;

cropping a detection image from a set region on the still image, the set region covering the text information on the still image;

recognizing the detection image and extracting the text information in it;

displaying the extracted text information.

2. The method for extracting and displaying text information from a television signal as claimed in claim 1, characterized in that the height of the set region on the still image is (1 - r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the television screen, and B is the height of the television screen.

3. The method for extracting and displaying text information from a television signal as claimed in claim 1, characterized in that the step of cropping the detection image from the set region on the still image comprises:

extracting the gray values of the still image;

detecting the area whose gray value is higher than a preset first threshold, and defining the area whose gray value is higher than the preset first threshold as the set region of the still image;

4. The method for extracting and displaying text information from a television signal as claimed in claim 1, characterized in that, before the step of recognizing the detection image and extracting the text information in it, the method further comprises:

removing the background portion of the detection image.

5. The method for extracting and displaying text information from a television signal as claimed in claim 4, characterized in that the step of removing the background portion of the detection image comprises:

6. A television set comprising a display screen, characterized by further comprising a text information extraction module, the text information extraction module comprising:

7. The television set as claimed in claim 6, characterized in that the height of the set region on the still image is (1 - r/R) * B/2, where r is the aspect ratio of the still image, R is the aspect ratio of the display screen, and B is the height of the display screen.

8. The television set as claimed in claim 6, characterized in that the detection image acquiring unit further comprises:

a gray extraction subunit, configured to extract the gray values of the still image;

a set region definition subunit, configured to detect the area whose gray value is higher than a preset first threshold and to define the area whose gray value is higher than the first threshold as the set region of the still image;

9. The television set as claimed in claim 6, characterized in that the text information extraction module further comprises:

a background removal unit, configured to remove the background portion of the detection image.

10. The television set as claimed in claim 9, characterized in that the background removal unit further comprises: