CN102810155A - Method and device for extracting text stroke images from image - Google Patents


Info

Publication number
CN102810155A
Authority
CN
China
Prior art keywords
image
stroke
data
directions
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011101576734A
Other languages
Chinese (zh)
Other versions
CN102810155B (en)
Inventor
桂天宜
皆川明洋
胜山裕
孙俊
堀田悦伸
直井聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN201110157673.4A priority Critical patent/CN102810155B/en
Priority to JP2012110573A priority patent/JP5939023B2/en
Publication of CN102810155A publication Critical patent/CN102810155A/en
Application granted granted Critical
Publication of CN102810155B publication Critical patent/CN102810155B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a device for extracting text stroke images from an image. The method includes: acquiring edge information and gradient information of the image; subjecting the acquired edge information and gradient information to predetermined enhancement processing, so as to enhance the edge information and gradient information related to text in the image; and acquiring the text stroke images corresponding to the enhanced edge information and gradient information.

Description

Method and apparatus for extracting text stroke images from an image
Technical field
The present invention relates to the field of image processing, and more particularly to a method and apparatus for extracting text stroke images from an image.
Background art
In the current field of information processing there are a large number of video files, and a need exists to retrieve them effectively. For tasks such as video annotation and video search, the text information in a video is a clue that is both accurate and simple. Accurately extracting and recognizing the text information contained in a video is therefore extremely important for subsequent video annotation and video retrieval.
Some known stroke extraction techniques suffer from drawbacks such as slow speed, heavy noise, and insensitivity to stroke scale.
A method and apparatus for extracting text stroke images from an image that can solve the above problems is needed.
Summary of the invention
A brief summary of the invention is provided below in order to give a basic understanding of some aspects of the invention. It should be appreciated that this summary is not an exhaustive overview of the invention. It is not intended to identify key or critical elements of the invention, nor to delimit the scope of the invention. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that follows.
An object of the present invention is to provide a method and apparatus for extracting text stroke images from an image.
According to one aspect of the invention, a method for extracting text stroke images from an image is provided, comprising: acquiring edge information and gradient information of the image; performing predetermined enhancement processing on the acquired edge information and gradient information, so as to enhance the edge information and gradient information related to text in the image; and acquiring the text stroke images corresponding to the enhanced edge information and gradient information.
According to another aspect of the invention, a device for extracting text stroke images from an image is provided, comprising: an information acquiring unit for acquiring edge information and gradient information of the image; an enhancing unit for performing predetermined enhancement processing on the acquired edge information and gradient information, so as to enhance the edge information and gradient information related to text in the image; and a stroke image acquiring unit for acquiring the text stroke images corresponding to the enhanced edge information and gradient information.
In addition, embodiments of the invention also provide a computer program for implementing the above method.
In addition, embodiments of the invention also provide a computer program product in at least the form of a computer-readable medium, on which computer program code for implementing the above method is recorded.
These and other advantages of the invention will become more apparent from the following detailed description of preferred embodiments of the invention in conjunction with the accompanying drawings.
Brief description of the drawings
The above and other objects, features and advantages of the invention will be more readily understood with reference to the following description of embodiments of the invention in conjunction with the accompanying drawings. The components in the drawings are only intended to illustrate the principles of the invention. In the drawings, identical or similar technical features or components are denoted by identical or similar reference numerals.
Fig. 1 is a flowchart illustrating a method for extracting text stroke images from an image according to an embodiment of the invention;
Fig. 1A schematically illustrates a step signal, a pulse signal, the signals corresponding to the Sobel operator, the result of extracting the step signal with the Sobel operator, the result of extracting the pulse signal with the Sobel operator, the signal obtained after taking the absolute value of the extraction result of the step signal, the signal obtained after taking the absolute value of the extraction result of the pulse signal, and the correspondingly shifted signals;
Fig. 2 is a flowchart illustrating a method for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention;
Fig. 3 illustrates an original image to be processed;
Figs. 4A-4H illustrate the images obtained after convolution with the Sobel operator;
Figs. 5A-5D illustrate the four images obtained by shifting the Sobel-processed images toward each other and combining the shifted images;
Figs. 6A-6D illustrate the four images obtained by shifting the Sobel-processed images away from each other and combining the shifted images;
Fig. 7 illustrates the thin-stroke image obtained after integration;
Fig. 8 illustrates the thick-stroke image obtained after integration;
Fig. 9 illustrates the thick-stroke image obtained through filtering;
Fig. 10 is a block diagram illustrating a device for extracting text stroke images from an image according to an embodiment of the invention;
Fig. 11 is a block diagram illustrating a device for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention; and
Fig. 12 shows an exemplary structural diagram of a computing device that can be used to implement the method and apparatus for extracting text stroke images from an image according to embodiments of the invention.
Detailed description of embodiments
Embodiments of the invention are described below with reference to the accompanying drawings. Elements and features described in one drawing or embodiment of the invention may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that, for the sake of clarity, representations and descriptions of components and processes that are unrelated to the invention or well known to those of ordinary skill in the art are omitted from the drawings and the description.
A method for extracting text stroke images from an image according to an embodiment of the invention is described below with reference to Fig. 1.
Fig. 1 is a flowchart illustrating a method for extracting text stroke images from an image according to an embodiment of the invention. As shown in Fig. 1, in step S102, edge information and gradient information of the image can be acquired.
Preferably, the step signals or pulse signals representing the edge information and gradient information of the image can be analyzed, and the edge information and gradient information can be extracted according to the analysis result.
In an image, the image data of a thin stroke appears as a pulse signal, while the image data of a thick stroke, and of a large-scale object resembling a thick stroke, appears as a step signal. The pulse signals can be analyzed, and the edge information and gradient information of thin strokes extracted according to the analysis result. Likewise, the step signals can be analyzed, and the edge information and gradient information of thick strokes, as well as of large-scale objects resembling thick strokes, extracted according to the analysis result.
The process of extracting a step signal with the Sobel operator, the process of extracting a pulse signal with the Sobel operator, and the enhancement processing performed through shifting and combining are described below with reference to Fig. 1A.
In Fig. 1A, graph i shows a step signal; graph ii shows a pulse signal; graphs iii and iv show the signals corresponding to the Sobel operator; graph v shows the result of extracting the step signal with the Sobel operator; graph vi shows the result of extracting the pulse signal with the Sobel operator; graph vii shows the signal obtained after taking the absolute value of the extraction result of the step signal; graph viii shows the signal obtained after taking the absolute value of the extraction result of the pulse signal; and graphs ix and x show the correspondingly shifted signals.
It should be understood that the coordinates and sizes in Fig. 1A are only schematic and non-restrictive, and are only intended to illustrate the principles of the relevant signal processing.
As shown in Fig. 1A, extracting the step signal (graph i) with the Sobel operator (graph iii) yields a single valley-shaped signal (graph v), while extracting the pulse signal (graph ii) with the Sobel operator (graph iv) yields a signal in which a valley-shaped signal and a peak-shaped signal are combined (graph vi). It can thus be seen that extracting a step signal and a pulse signal with the Sobel operator yields two markedly different signals (as shown in graphs v and vi).
Then, through the enhancement processing described later (for example, comprising shifting and combining), the step signal and the pulse signal can be enhanced, so that the image data of thin strokes corresponding to pulse signals and the image data of thick strokes corresponding to step signals can be extracted.
For example, the absolute value of the valley-shaped part of the pulse-signal response extracted by the Sobel operator (graph vi) can be taken (that is, the magnitude of the valley-shaped signal is taken), yielding another peak-shaped signal (graph viii). This new peak-shaped signal and the original peak-shaped signal (i.e., the peak-shaped signal extracted by the Sobel operator) are then shifted toward each other, and the shifted new peak-shaped signal is superimposed on the original peak-shaped signal (graph x), thereby further enhancing the image data of the thin stroke corresponding to the pulse signal. Similarly, the absolute value of the valley-shaped signal in the step-signal response extracted by the Sobel operator (graph v) can be taken to obtain another peak-shaped signal (graph vii), which is then shifted (graph ix).
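The one-dimensional behaviour shown in graphs ii, vi, viii and x can be sketched in a few lines of code. This is an illustrative toy only: the derivative kernel, the signal values and the stroke width are our own choices, not values taken from the patent.

```python
# 1-D sketch of the pulse-signal enhancement of Fig. 1A.
# A thin stroke of width w appears as a pulse; a Sobel-like derivative
# kernel turns it into a peak at one edge and a valley at the other.

def convolve1d(signal, kernel):
    """Same-size 1-D correlation with zero padding (an assumption)."""
    k = len(kernel)
    pad = k // 2
    padded = [0] * pad + signal + [0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal))]

w = 3                                   # assumed stroke width
pulse = [0] * 5 + [1] * w + [0] * 5     # thin stroke -> pulse signal
deriv = [-1, 0, 1]                      # 1-D cross-section of a derivative kernel

resp = convolve1d(pulse, deriv)             # peak at left edge, valley at right
positive = [max(v, 0) for v in resp]        # peak-shaped part
negative = [abs(min(v, 0)) for v in resp]   # |valley-shaped part|

# Shift the two peaks toward each other by w//2 and superimpose,
# consistent with formula (1) later in the text:
n = len(resp)
enhanced = [(positive[max(i - w // 2, 0)] +
             negative[min(i + w // 2, n - 1)]) / 2 for i in range(n)]

centre = 5 + w // 2
print(enhanced[centre])   # strongest response lands on the stroke centre
```

Shifting the two peaks toward each other by half the stroke width makes them coincide on the stroke centre, which is the enhancement effect exploited from step S206 onward.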
Alternatively, the edge information and gradient information of large-scale objects resembling thick strokes can be filtered out based on predetermined filtering conditions, leaving only the edge information and gradient information of thick strokes. The specific filtering process is described later.
Optionally, the original image shown in Fig. 3 can be filtered, depending on the clarity of the image or where needed. For example, a low-pass filter can be applied to the original image to suppress noise in the image. The low-pass filter can be, for example, a Gaussian filter, but it is not limited thereto and can be any suitable low-pass filter known to those skilled in the art.
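The optional low-pass filtering can be sketched as follows. This is a minimal pure-Python sketch under assumed choices: a 3x3 Gaussian kernel and zero padding at the border; the patent itself only requires some suitable low-pass filter.

```python
# Minimal 3x3 Gaussian low-pass filter. Kernel values and zero
# padding at the border are our own choices, not the patent's.

GAUSS = [[1, 2, 1],
         [2, 4, 2],
         [1, 2, 1]]          # weights sum to 16

def gaussian_smooth(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * GAUSS[dy + 1][dx + 1]
            out[y][x] = acc / 16
    return out

noisy = [[0, 0, 0],
         [0, 255, 0],        # a single "salt" noise pixel
         [0, 0, 0]]
print(gaussian_smooth(noisy)[1][1])   # 255*4/16 = 63.75: the spike is damped
```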
In step S104, predetermined enhancement processing can be performed on the acquired edge information and gradient information, so as to enhance the edge information and gradient information related to text in the image.
The edge information and gradient information can be enhanced based on various methods. Preferably, they can be enhanced through binarization and integration. The integration here can be an intersection operation, a maximum operation, or an averaging operation. Preferably, the integration is an intersection operation, which is described in detail later.
In step S106, the text stroke images corresponding to the enhanced edge information and gradient information can be acquired.
A method for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention is described below with reference to Figs. 2-9.
Fig. 2 is a flowchart illustrating a method for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention; Fig. 3 illustrates an original image to be processed; Figs. 4A-4H illustrate the images obtained after convolution with the Sobel operator; Figs. 5A-5D illustrate the four images obtained by shifting the images toward each other and combining the shifted images; Figs. 6A-6D illustrate the four images obtained by shifting the images away from each other and combining the shifted images; Fig. 7 illustrates the thin-stroke image obtained after integration; Fig. 8 illustrates the thick-stroke image obtained after integration; and Fig. 9 illustrates the thick-stroke image obtained through filtering.
As shown in Fig. 2, in step S202, the image to be processed (the original image shown in Fig. 3) can be smoothed (for example, low-pass filtered). Step S202 is optional; in other words, if the image is sufficiently clear, or as required, the image shown in Fig. 3 need not be smoothed.
Here, the original image shown in Fig. 3 serves as the image to be processed. In Fig. 3, the text strokes to be extracted comprise the word FUJITSU in the upper right corner of the image and the Japanese text at the bottom of the image, where FUJITSU is thin-stroke text and the Japanese text is thick-stroke text. Note that Fig. 3 is only an example; in practice, some images may contain only thin-stroke text or only thick-stroke text.
In step S204, the Sobel operator can be convolved with the smoothed image. If the smoothing of step S202 is not performed, the Sobel operator can be convolved directly with the image to be processed (the original image shown in Fig. 3).
Specifically, the image to be processed is convolved with the Sobel operator in multiple directions. In other words, multiple Sobel convolution kernels, which are used to obtain the edge information and gradient information of the image in multiple directions, are each convolved with the image data of the image.
For example, the image to be processed is convolved with the Sobel operator in four directions. Preferably, these four directions comprise the horizontal direction, the vertical direction, and the two diagonal directions. The convolved images are shown in Figs. 4A-4H. These four directions are chosen because they correspond to the directions of the four common stroke types: the horizontal stroke (一), the vertical stroke (竖), the left-falling stroke (撇), and the right-falling stroke (捺).
For example, the Sobel convolution kernels used are the four kernels shown in the corresponding figure of the original disclosure (disclosed there only as an image), where S_h is the Sobel convolution kernel associated with the horizontal direction, S_v is the kernel associated with the vertical direction, S_rd is the kernel associated with the first diagonal, and S_ld is the kernel associated with the second diagonal. The first diagonal runs from the upper right corner to the lower left corner, and the second diagonal runs from the upper left corner to the lower right corner.
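Since the patent discloses its four kernels only as an image, the exact values below are the standard directional Sobel forms from the literature, assumed rather than copied from the original:

```python
# The four directional Sobel kernels as commonly defined in the
# literature. The patent discloses its kernels only as an image,
# so these exact values are assumed, not copied from it.

S_h = [[-1, -2, -1],
       [ 0,  0,  0],
       [ 1,  2,  1]]      # horizontal edges (vertical gradient)

S_v = [[-1, 0, 1],
       [-2, 0, 2],
       [-1, 0, 1]]        # vertical edges (horizontal gradient)

S_rd = [[ 0,  1, 2],
        [-1,  0, 1],
        [-2, -1, 0]]      # diagonal from upper right to lower left

S_ld = [[-2, -1, 0],
        [-1,  0, 1],
        [ 0,  1, 2]]      # diagonal from upper left to lower right

# Each kernel sums to zero, so flat image regions produce no response.
for k in (S_h, S_v, S_rd, S_ld):
    print(sum(sum(row) for row in k))   # 0 for every kernel
```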
Because the result of the convolution can be positive or negative, two image layers can be used to store the convolution result: one layer holds the positive-response image, and the other holds the negative-response image. In other words, the result of the convolution calculation for each direction is divided into positive-response image data and negative-response image data for that direction.
Only an example with four directions is given herein, but this does not mean that the convolution calculation can only be carried out in four directions. Where high precision is needed, or according to other needs, Sobel convolution kernels in more directions, for example eight directions, can be used to convolve the image; convolution in more directions can bring more accurate results. The definition of the Sobel operator and the process of convolving it with image data can be based on any suitable prior art.
By convolving the original image shown in Fig. 3, the images convolved with the Sobel convolution kernels shown in Figs. 4A-4H can be obtained.
Specifically, Fig. 4A shows the image of the positive-response image data for the horizontal direction, and Fig. 4B shows the image of the negative-response image data for the horizontal direction. Fig. 4C shows the image of the positive-response image data for the vertical direction, and Fig. 4D shows the image of the negative-response image data for the vertical direction. Fig. 4E shows the image of the positive-response image data for the diagonal from the upper right corner to the lower left corner, and Fig. 4F shows the corresponding negative-response image data. Fig. 4G shows the image of the positive-response image data for the diagonal from the upper left corner to the lower right corner, and Fig. 4H shows the corresponding negative-response image data.
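The two-layer storage of step S204 can be sketched as follows; the zero padding at the image border and the tiny test image are our own choices:

```python
# Convolve an image with one Sobel kernel and store the result in two
# layers: positive responses in one, magnitudes of negative responses
# in the other. Zero padding at the border is an assumption.

S_h = [[-1, -2, -1],
       [ 0,  0,  0],
       [ 1,  2,  1]]   # horizontal-direction kernel (assumed standard form)

def convolve_split(img, kernel):
    h, w = len(img), len(img[0])
    pos = [[0] * w for _ in range(h)]
    neg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * kernel[dy + 1][dx + 1]
            if acc > 0:
                pos[y][x] = acc          # positive-response layer
            else:
                neg[y][x] = -acc         # magnitude of negative response
    return pos, neg

# A horizontal step edge: dark above, bright below.
img = [[0, 0, 0],
       [0, 0, 0],
       [9, 9, 9]]
pos, neg = convolve_split(img, S_h)
print(pos[1][1], neg[1][1])   # the edge responds in only one of the layers
```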
Next, in step S206, the positive-response image data and the negative-response image data are shifted by a pre-estimated stroke width, and the shifted images are combined, as shown in Figs. 5A-5D and Figs. 6A-6D.
For the positive-response image data and negative-response image data of each direction, a shift toward each other followed by addition can be calculated to obtain first composite image data for each direction. In addition, for the positive-response image data and negative-response image data of each direction, a shift away from each other followed by addition can be calculated to obtain second composite image data for each direction.
For example, the shift toward each other and addition can be based on the following formula (1):
I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2)) / 2,
I_v(x, y) = (I_v-positive(x - w/2, y) + I_v-negative(x + w/2, y)) / 2,
I_rd(x, y) = (I_rd-positive(x + w/2, y - w/2) + I_rd-negative(x - w/2, y + w/2)) / 2, and
I_ld(x, y) = (I_ld-positive(x - w/2, y - w/2) + I_ld-negative(x + w/2, y + w/2)) / 2.
In addition, the shift away from each other and addition can be based on the following formula (2):
I_h'(x, y) = (I_h-positive(x, y + w/2) + I_h-negative(x, y - w/2)) / 2,
I_v'(x, y) = (I_v-positive(x + w/2, y) + I_v-negative(x - w/2, y)) / 2,
I_rd'(x, y) = (I_rd-positive(x - w/2, y + w/2) + I_rd-negative(x + w/2, y - w/2)) / 2, and
I_ld'(x, y) = (I_ld-positive(x + w/2, y + w/2) + I_ld-negative(x - w/2, y - w/2)) / 2.
Here, x denotes the abscissa of a pixel and y denotes its ordinate. I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y) denote the first composite image data for the horizontal, vertical and two diagonal directions of the pixel, respectively; the images comprising the first composite image data are shown in Figs. 5A-5D.
Fig. 5A shows the image comprising the data I_h(x, y), Fig. 5B the image comprising I_v(x, y), Fig. 5C the image comprising I_rd(x, y), and Fig. 5D the image comprising I_ld(x, y).
I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y) denote the second composite image data for the horizontal, vertical and two diagonal directions of the pixel, respectively; the images comprising the second composite image data are shown in Figs. 6A-6D.
Fig. 6A shows the image comprising the data I_h'(x, y), Fig. 6B the image comprising I_v'(x, y), Fig. 6C the image comprising I_rd'(x, y), and Fig. 6D the image comprising I_ld'(x, y).
I_h-positive(x, y), I_v-positive(x, y), I_rd-positive(x, y) and I_ld-positive(x, y) denote the positive-response image data for the horizontal, vertical and two diagonal directions of the pixel, respectively; the images comprising the positive-response image data are shown in Figs. 4A, 4C, 4E and 4G.
I_h-negative(x, y), I_v-negative(x, y), I_rd-negative(x, y) and I_ld-negative(x, y) denote the negative-response image data for the horizontal, vertical and two diagonal directions of the pixel, respectively; the images comprising the negative-response image data are shown in Figs. 4B, 4D, 4F and 4H.
The w involved in the above processing is the pre-estimated stroke width. For example, the value of w can be estimated empirically, or it can be estimated by evaluating the image. In addition, the above formulas (1) and (2) merely give one specific example of combining the image data for the horizontal, vertical and two diagonal directions of a pixel, and are not intended to limit the invention. In fact, any suitable combining process for enhancing strokes is applicable.
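Formula (1) for the horizontal direction can be sketched as follows; clamping coordinates at the image border is our own choice, since the patent does not specify border handling:

```python
# I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2)) / 2
# applied to every pixel, following formula (1) for the horizontal
# direction. Border coordinates are clamped (an assumption).

def shift_and_add(pos, neg, stroke_width):
    h, w = len(pos), len(pos[0])
    s = stroke_width // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y_up = max(y - s, 0)          # sample the positive layer above
            y_down = min(y + s, h - 1)    # sample the negative layer below
            out[y][x] = (pos[y_up][x] + neg[y_down][x]) / 2
    return out

# A stroke of width 2 centred on row 2: its top edge responds in the
# positive layer (row 1), its bottom edge in the negative layer (row 3).
pos = [[0], [8], [0], [0], [0]]
neg = [[0], [0], [0], [8], [0]]
print(shift_and_add(pos, neg, 2)[2][0])   # 8.0: both edges meet on the stroke
```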
It should be understood that the reason for shifting both toward each other and away from each other is that, for an arbitrary image, it is not necessarily easy to determine in advance which shift direction achieves the stroke enhancement effect. For this reason, shifts toward each other and away from each other are both carried out, so that the stroke enhancement effect is always achieved by one of the two attempts. If it can be determined in advance which shift direction achieves the stroke enhancement effect, the two shifts need not both be carried out; that is, only the shift toward each other or the shift away from each other is needed. In the example image given herein, the shift toward each other achieves the stroke enhancement effect, as shown in Figs. 5A-5D, whereas the shift away from each other does not, as shown in Figs. 6A-6D.
In step S208, high-threshold binarization and integration can be performed on the images combined in step S206.
For example, a predefined high threshold can be used to perform high-threshold binarization on the first composite image data and on the second composite image data for each direction. Integration is then performed on the high-threshold-binarized first composite image data for all directions, and likewise on the high-threshold-binarized second composite image data for all directions.
The integration here can be one of an intersection operation, a maximum operation, and an averaging operation.
If the integration is an intersection operation, it can be carried out through the following formulas:
I_output-thin = I_h-binarized-high ∩ I_v-binarized-high ∩ I_rd-binarized-high ∩ I_ld-binarized-high
I'_output-thin = I'_h-binarized-high ∩ I'_v-binarized-high ∩ I'_rd-binarized-high ∩ I'_ld-binarized-high
Here, I_output-thin is the first thin-stroke image data after intersection, and I_h-binarized-high, I_v-binarized-high, I_rd-binarized-high and I_ld-binarized-high are the binary image data obtained by high-threshold binarization of the first composite image data I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y), respectively. Likewise, I'_output-thin is the second thin-stroke image data after intersection, and I'_h-binarized-high, I'_v-binarized-high, I'_rd-binarized-high and I'_ld-binarized-high are the binary image data obtained by high-threshold binarization of the second composite image data I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y), respectively.
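Steps S208-S210 for one set of composite images can be sketched as follows; the threshold value and the tiny images are illustrative only, since the patent leaves the actual thresholds to the implementer:

```python
# High-threshold binarization of the four directional composite
# images, followed by their intersection (step S208). The threshold
# value is illustrative only.

HIGH = 5

def binarize(img, threshold):
    return [[1 if v >= threshold else 0 for v in row] for row in img]

def intersect(images):
    h, w = len(images[0]), len(images[0][0])
    return [[1 if all(img[y][x] for img in images) else 0
             for x in range(w)] for y in range(h)]

# Four 1x3 composite images; only the centre pixel is strong in all
# four directions, so only it survives the intersection.
I_h  = [[6, 9, 2]]
I_v  = [[1, 7, 8]]
I_rd = [[6, 6, 1]]
I_ld = [[0, 8, 9]]

binarized = [binarize(i, HIGH) for i in (I_h, I_v, I_rd, I_ld)]
print(intersect(binarized))   # [[0, 1, 0]]
```

The low-threshold pass of step S212 follows the same pattern with a smaller threshold.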
In step S210, the thin-stroke image data can be obtained; that is, I_output-thin and I'_output-thin are obtained.
In step S212, low-threshold binarization and integration can be performed on the combined images.
For example, a predefined low threshold, smaller than the above high threshold, can be used to perform low-threshold binarization on the first composite image data and on the second composite image data for each direction. Integration is then performed on the low-threshold-binarized first composite image data for all directions, and likewise on the low-threshold-binarized second composite image data for all directions.
The integration here can be one of an intersection operation, a maximum operation, and an averaging operation.
If the integration is an intersection operation, it can for example be carried out through the following formulas:
I_output-thick = I_h-binarized-low ∩ I_v-binarized-low ∩ I_rd-binarized-low ∩ I_ld-binarized-low
I'_output-thick = I'_h-binarized-low ∩ I'_v-binarized-low ∩ I'_rd-binarized-low ∩ I'_ld-binarized-low
Here, I_output-thick is the first thick-stroke image data after intersection, and I_h-binarized-low, I_v-binarized-low, I_rd-binarized-low and I_ld-binarized-low are the binary image data obtained by low-threshold binarization of the first composite image data I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y), respectively. Likewise, I'_output-thick is the second thick-stroke image data after intersection, and I'_h-binarized-low, I'_v-binarized-low, I'_rd-binarized-low and I'_ld-binarized-low are the binary image data obtained by low-threshold binarization of the second composite image data I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y), respectively.
It is easy to understand, as noted above, that if it can be determined in advance which shift direction achieves the stroke enhancement effect, the shifts toward each other and away from each other need not both be carried out in order to obtain the first composite image data and the second composite image data. That is, only the shift in the direction that achieves the stroke enhancement effect (for example, toward each other or away from each other) needs to be performed to obtain the corresponding composite image data; accordingly, the above binarization and integration also need only be performed on that composite image data.
In step S214, optionally, the data processed in step S212 is filtered according to predetermined filtering conditions.
Specifically, predetermined filtering conditions are used to filter the data related to connected domains in the first thick-stroke image data and in the second thick-stroke image data, so as to obtain first connected-domain image data and second connected-domain image data that satisfy the predetermined filtering conditions.
The reason for filtering the first thick-stroke image data and the second thick-stroke image data with predetermined filtering conditions is that, besides thick strokes, the thick-stroke image data obtained in step S212 may also contain the edges of other objects, for example the edges of a human body in the image or other edges resembling thick strokes. By filtering the first thick-stroke image data and the second thick-stroke image data with predetermined filtering conditions, the connected-domain image data corresponding to thick strokes can be obtained more accurately.
The predetermined filtering condition can be one of the following conditions: (1) the gray variance of the pixels in a connected domain is less than a predetermined variance threshold; (2) the pixel polarity from the inner edge to the outer edge of a connected domain is consistent; and (3) the size of a connected domain is within a predetermined size threshold. Alternatively, the filtering condition can be any combination of two or more of conditions (1), (2) and (3).
Here, a connected domain refers to a closed region formed by the edges represented by the thick-stroke image data obtained through the processing of step S212.
For filter condition (1), i.e., the gray-level variance of the pixels within the connected domain being less than the predetermined variance threshold: specifically, the coordinate information of the connected domain is first found in the image processed in step S212. Then, based on the coordinate information and on the original image shown in Fig. 3, the gray-level variance of the connected domain is determined and compared with the predetermined variance threshold. If the gray-level variance within the connected domain is less than the predetermined variance threshold, the connected domain is kept, i.e., it is considered to be associated with a thick stroke; otherwise, the connected domain is filtered out.

For filter condition (2), i.e., the pixel polarity from the inner edge to the outer edge of the connected domain being consistent: specifically, the coordinate information of the connected domain is first found in the image processed in step S212. Then, based on the coordinate information and on the original image shown in Fig. 3, it is determined whether the pixel polarity from the inner edge to the outer edge of the connected domain is consistent. If the inner-edge and outer-edge pixel polarities of the connected domain are consistent, the connected domain is kept, i.e., it is considered to be associated with a thick stroke; otherwise, the connected domain is filtered out.

The polarity of a pixel here may be understood as the relation between the gray level of an inner-edge pixel and the gray level of the outer-edge pixel adjacent to that inner-edge pixel. For example, if the gray levels of pixels in a binary image are set to "0" or "255", then two pixels are considered to have consistent polarity if their gray values are both "0" or both "255", and inconsistent polarity otherwise. Of course, other suitable conditions for polarity consistency may be set according to actual needs; they are not repeated here.
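As an illustration of condition (2), the polarity comparison can be sketched as a simple comparison of paired gray values; the function name and the paired-list representation of inner- and outer-edge pixels are assumptions made for this sketch, not part of the patent:

```python
def polarity_consistent(inner_vals, outer_vals):
    """Condition (2): every inner-edge pixel must share the gray value
    ('polarity') of its adjacent outer-edge pixel.  Gray values are assumed
    to be 0 or 255, as in the binary-image example above."""
    return all(i == o for i, o in zip(inner_vals, outer_vals))
```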
For filter condition (3), i.e., the size of the connected domain being within the predetermined size threshold: if the size of the connected domain is within the predetermined threshold, the connected domain is kept, i.e., it is considered to be associated with a thick stroke; otherwise, the connected domain is filtered out. It is easy to understand that, although a thick stroke is large relative to a thin stroke, it generally has a common upper limit, according to which the above predetermined threshold is set. If a connected domain is excessively large, it can be considered to correspond not to a thick stroke but to a non-stroke object; for example, in most cases a thick stroke does not occupy half or more of the entire image. Of course, if in special circumstances the thick strokes are indeed very large, the predetermined threshold only needs to be adjusted to reflect this situation.

The above-mentioned predetermined variance threshold and predetermined size threshold may be determined from empirical values obtained through prior experiments, or by visual inspection of image samples; of course, they may also be determined by any other suitable method.
Filtering may also be performed using a combination of any two, or all, of conditions (1), (2) and (3).
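The variance and size conditions above can be sketched as follows, assuming the connected domain is given as a boolean mask over the original grayscale image. The threshold values are illustrative assumptions only, and condition (2) is omitted here because it additionally requires tracing paired inner- and outer-edge pixels:

```python
import numpy as np

def keep_component(mask, gray, var_thresh=900.0, max_frac=0.5):
    """Apply filter conditions (1) and (3) to one connected domain.
    mask: boolean array marking the domain's pixels; gray: the original
    grayscale image; var_thresh and max_frac are illustrative thresholds
    (assumed values, not taken from the patent)."""
    pixels = gray[mask]
    # (1) gray-level variance inside the domain must stay below the threshold
    if pixels.var() >= var_thresh:
        return False
    # (3) the domain must not exceed a predetermined size (here: image fraction)
    if mask.sum() > max_frac * mask.size:
        return False
    return True
```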
Next, in step S216, first thick-stroke edge data and second thick-stroke edge data corresponding to the first connected-domain image data and the second connected-domain image data, respectively, are obtained.

In step S218, a first thin-stroke image (shown in Fig. 7) and a second thin-stroke image (not shown) corresponding to the first thin-stroke image data and the second thin-stroke image data, respectively, are obtained, and a first thick-stroke image (shown in Fig. 9) and a second thick-stroke image (not shown) corresponding to the first thick-stroke edge data and the second thick-stroke edge data, respectively, are obtained.

It should be understood that the thick-stroke images can also be obtained without performing step S214. In other words, the first thick-stroke image (shown in Fig. 8) and the second thick-stroke image can be obtained based on I_output-thick and I′_output-thick acquired in step S212.

Although the second thin-stroke image and the second thick-stroke image can be obtained through the above steps, the shift away from each other does not achieve the stroke-enhancing effect in this example; therefore, the second thin-stroke image and the second thick-stroke image so obtained are not specifically shown herein.

Optionally, one of the first thin-stroke image and the second thin-stroke image may be selected as the final thin-stroke image based on the precision of the text strokes, the recall of the text strokes, or a trade-off between the precision and the recall of the text strokes. Likewise, one of the first thick-stroke image and the second thick-stroke image may be selected as the final thick-stroke image on the same basis. In the example herein, since the shift toward each other achieves the stroke-enhancing effect, the first thin-stroke image and the first thick-stroke image are obtained as the final stroke extraction result based on the precision, the recall, or a trade-off between the two.

The precision of text strokes mentioned herein may be understood as the ratio of the number of correct strokes actually detected to the total number of strokes actually detected; the recall of text strokes may be understood as the ratio of the number of correct strokes actually detected to the number of correct strokes actually present.
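Under the assumption that detected and ground-truth strokes can be represented as sets of identifiers (a representation chosen for this sketch, not specified in the patent), the two measures can be sketched as:

```python
def stroke_precision_recall(detected, ground_truth):
    """Precision: correctly detected strokes / all detected strokes.
    Recall: correctly detected strokes / all strokes actually present."""
    correct = len(detected & ground_truth)
    precision = correct / len(detected) if detected else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```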
The above describes the method for extracting text stroke images from an image taking the Sobel operator as an example. However, this is only a description by way of example; the usable operator is not limited thereto and may be the Roberts operator, the Prewitt operator, the Laplacian operator, the LoG operator, the Canny operator, or any other suitable operator.

In addition, although steps S208 and S212 have been described as performing binarization processing first and integration processing afterwards, the integration processing may in fact be performed first and the binarization processing afterwards.

The above-mentioned high threshold and low threshold may be determined empirically, or through several tests. However, this is only an example; the high threshold and the low threshold may be determined by any appropriate method.
A device 400 for extracting text stroke images from an image according to an embodiment of the invention is described below with reference to Fig. 10.

Fig. 10 is a block diagram illustrating the device 400 for extracting text stroke images from an image according to an embodiment of the invention. The device 400 can, for example, perform the above-described method for extracting text stroke images from an image described with reference to Fig. 1.

Specifically, the device 400 comprises: an information acquisition unit 402, operable to acquire the edge information and gradient information of the image; an enhancement unit 404, operable to perform predetermined enhancement processing on the acquired edge information and gradient information so as to strengthen the edge information and gradient information related to the text in the image; and a stroke image acquisition unit 406, operable to acquire text stroke images corresponding to the enhanced edge information and gradient information.

Optionally, the device 400 may comprise a screening unit (not shown), which may be configured to screen the plurality of extracted text stroke images.

The information acquisition unit 402 may be configured to analyze step signals and pulse signals representing the edge information and gradient information of the image, and to extract the edge information and gradient information according to the analysis result.

In an image, the image data of thin strokes appears as pulse signals, while the image data of thick strokes and of large objects resembling thick strokes appears as step signals. The pulse signals can be analyzed, and the edge information and gradient information of thin strokes extracted according to the analysis result; likewise, the step signals can be analyzed, and the edge information and gradient information of thick strokes, and of large objects resembling thick strokes, extracted according to the analysis result.
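The pulse/step distinction can be illustrated on a 1-D intensity profile; the widths and gray values below are arbitrary illustrative choices:

```python
import numpy as np

# A thin stroke is a narrow bright pulse; a thick stroke is a wide one whose
# two edges are far apart, so each edge locally resembles an isolated step.
thin = np.zeros(20)
thin[9:11] = 255.0    # 2-pixel-wide stroke
thick = np.zeros(20)
thick[5:15] = 255.0   # 10-pixel-wide stroke

grad_thin, grad_thick = np.diff(thin), np.diff(thick)

# Positions of the rising (positive) and falling (negative) gradient responses:
rise_thin = np.flatnonzero(grad_thin > 0)[0]
fall_thin = np.flatnonzero(grad_thin < 0)[0]
rise_thick = np.flatnonzero(grad_thick > 0)[0]
fall_thick = np.flatnonzero(grad_thick < 0)[0]
```

The thin stroke's positive and negative responses lie only two pixels apart (a pulse-like pair), while the thick stroke's responses lie ten pixels apart, so each of its edges locally behaves like an isolated step.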
Optionally, the device 400 may further comprise a noise-reduction unit (not shown). The noise-reduction unit may be configured to filter the original image for noise reduction, depending on the clarity of the image or where otherwise necessary. For example, a low-pass filter may be used to filter the image so as to suppress noise in the original image. The low-pass filter may, for example, be a Gaussian filter, but is not limited thereto and may be any suitable low-pass filter known to those skilled in the art.

The enhancement unit 404 may be configured to enhance the edge information and gradient information based on various methods. Preferably, the edge information and gradient information are enhanced through binarization processing and integration processing. The integration processing here may be one of intersection processing, maximization processing, and averaging processing; preferably, the integration processing is intersection processing, which will be described in detail later.
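A minimal sketch of binarization followed by intersection-based integration, assuming the directional composite images are given as arrays and using an illustrative threshold value:

```python
import numpy as np

def binarize_and_intersect(direction_images, threshold):
    """Binarize each directional composite image with a common threshold, then
    integrate across directions by intersection (logical AND), the preferred
    integration in the text; maximum or mean would be the other options."""
    binary = [(img >= threshold).astype(np.uint8) for img in direction_images]
    out = binary[0]
    for b in binary[1:]:
        out &= b  # keep only pixels that respond in every direction
    return out
```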
A device 400′ for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention is described below with reference to Fig. 11.

Fig. 11 is a block diagram illustrating the device 400′ for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention.

The device 400′ can perform the method, described above with reference to Figs. 2-9, for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention.

Specifically, similarly to the device 400, the device 400′ comprises an information acquisition unit 402′, an enhancement unit 404′, and a stroke image acquisition unit 406′. Optionally, the device 400′ may also comprise a screening unit and a noise-reduction unit (neither shown).

The screening unit may be configured to select, based on at least one of the precision of the text strokes and the recall of the text strokes, one of the first thin-stroke image and the second thin-stroke image as the final thin-stroke image, and/or one of the first thick-stroke image and the second thick-stroke image as the final thick-stroke image. The specific screening processing has been described with reference to Figs. 2-9 and is not repeated here.

The noise-reduction unit may be configured to filter the original image (i.e., the image to be processed) for noise reduction. The specific noise-reduction processing has been described with reference to Figs. 2-9 and is not repeated here.

The information acquisition unit 402′ may comprise a convolution sub-unit 4022 and a dividing sub-unit 4024. The convolution sub-unit 4022 is configured to perform convolution calculations between the image data of the image and a plurality of Sobel operator convolution kernels used to obtain the edge information and gradient information of the image in a plurality of directions, respectively. The dividing sub-unit 4024 is configured to divide the result of the convolution calculation for each direction into positive-pulse response image data and negative-pulse response image data for that direction.
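A sketch of the processing of sub-units 4022 and 4024: the patent names Sobel operator convolution kernels for four directions, but the exact coefficient values below are an assumption of this sketch:

```python
import numpy as np

# Standard Sobel-style kernels for the horizontal, vertical, and two
# diagonal directions (assumed coefficient values).
KERNELS = {
    "h":  np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float),
    "v":  np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float),
    "rd": np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], float),
    "ld": np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], float),
}

def conv3x3(image, k):
    """'Same'-size 3x3 convolution with zero padding (minimal helper)."""
    p = np.pad(image.astype(float), 1)
    out = np.zeros(image.shape, float)
    h, w = image.shape
    for dy in range(3):
        for dx in range(3):
            # true convolution flips the kernel
            out += k[2 - dy, 2 - dx] * p[dy:dy + h, dx:dx + w]
    return out

def directional_responses(image):
    """Convolve with each directional kernel and split every signed result
    into a positive-pulse part and the magnitude of the negative-pulse part,
    as the dividing sub-unit 4024 is described to do."""
    out = {}
    for name, k in KERNELS.items():
        resp = conv3x3(image, k)
        out[name] = (np.maximum(resp, 0.0), np.maximum(-resp, 0.0))
    return out
```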
The enhancement unit 404′ may comprise a first composite-image acquisition sub-unit 4042 and/or a second composite-image acquisition sub-unit 4044. The first composite-image acquisition sub-unit 4042 may be configured to perform, for the positive-pulse response image data and the negative-pulse response image data for each direction, the calculation of shifting them toward each other and adding them, to obtain the first composite image data for each direction. The second composite-image acquisition sub-unit 4044 may be configured to perform, for the positive-pulse response image data and the negative-pulse response image data for each direction, the calculation of shifting them away from each other and adding them, to obtain the second composite image data for each direction.
The stroke image acquisition unit 406′ may comprise: a first binarization sub-unit 4062, a first thin-stroke image acquisition sub-unit 4064, a second thin-stroke image acquisition sub-unit 4066, a second binarization sub-unit 4068, a first thick-stroke image acquisition sub-unit 40610, and a second thick-stroke image acquisition sub-unit 40612.

The first binarization sub-unit 4062 may be configured to perform first binarization processing on the first composite image data and the second composite image data for each direction, respectively, using a predefined first threshold.

The first thin-stroke image acquisition sub-unit 4064 may be configured to perform integration processing on the first composite image data for all directions that have undergone the first binarization processing, to obtain first thin-stroke image data, thereby obtaining a first thin-stroke image corresponding to the first thin-stroke image data.

The second thin-stroke image acquisition sub-unit 4066 may be configured to perform integration processing on the second composite image data for all directions that have undergone the first binarization processing, to obtain corresponding second thin-stroke image data, thereby obtaining a second thin-stroke image corresponding to the second thin-stroke image data.

The second binarization sub-unit 4068 may be configured to perform second binarization processing on the first composite image data and the second composite image data for each direction, respectively, using a predefined second threshold smaller than the first threshold.

The first thick-stroke image acquisition sub-unit 40610 may be configured to perform integration processing on the first composite image data for all directions that have undergone the second binarization processing, to obtain first thick-stroke edge data, thereby obtaining a first thick-stroke image corresponding to the first thick-stroke edge data.

The second thick-stroke image acquisition sub-unit 40612 may be configured to perform integration processing on the second composite image data for all directions that have undergone the second binarization processing, to obtain corresponding second thick-stroke edge data, thereby obtaining a second thick-stroke image corresponding to the second thick-stroke edge data.

As required, the stroke image acquisition unit 406′ may include only the first binarization sub-unit 4062, the first thin-stroke image acquisition sub-unit 4064 and the second thin-stroke image acquisition sub-unit 4066, or only the second binarization sub-unit 4068, the first thick-stroke image acquisition sub-unit 40610 and the second thick-stroke image acquisition sub-unit 40612. In other words, as required, the stroke image acquisition unit 406′ may be configured to obtain only thin-stroke images or only thick-stroke images, for example where the image to be processed contains only text with thick strokes or only text with thin strokes, or where only the text information of thick strokes or of thin strokes needs to be extracted.

Furthermore, as described above, if it can be determined in advance which direction of shifting achieves the stroke-enhancing effect, the enhancement unit 404′ may comprise only the one of the first composite-image acquisition sub-unit 4042 and the second composite-image acquisition sub-unit 4044 that corresponds to the processing in that direction; correspondingly, the stroke image acquisition unit 406′ may include only the corresponding one of the first thin-stroke image acquisition sub-unit 4064 and the second thin-stroke image acquisition sub-unit 4066, and only the corresponding one of the first thick-stroke image acquisition sub-unit 40610 and the second thick-stroke image acquisition sub-unit 40612.

The device 400′ may, for example, be configured to perform the method, described above with reference to Figs. 2-9, for extracting text stroke images from an image by using the Sobel operator according to an embodiment of the invention. For the sake of simplicity, the specific processing of the first binarization sub-unit 4062, the first thin-stroke image acquisition sub-unit 4064, the second thin-stroke image acquisition sub-unit 4066, the second binarization sub-unit 4068, the first thick-stroke image acquisition sub-unit 40610 and the second thick-stroke image acquisition sub-unit 40612 is not described again here.
In addition, optionally, the stroke image acquisition unit 406′ may further comprise a filtering sub-unit (not shown). The filtering sub-unit may be configured to perform filtering processing on the data related to connected domains in the first thick-stroke edge data and the second thick-stroke edge data, respectively, using a predetermined filter condition, so as to obtain first connected-domain image data and second connected-domain image data that satisfy the predetermined filter condition.

The predetermined filter condition comprises at least one of the following conditions: the gray-level variance of the pixels within the connected domain is less than a predetermined variance threshold; the pixel polarity from the inner edge to the outer edge of the connected domain is consistent; and the size of the connected domain is within a predetermined size threshold. The predetermined filter condition has been described with reference to Figs. 2-9 and is not repeated here.

The method and device for extracting text stroke images from an image according to embodiments of the invention can achieve at least one of the following technical effects: high speed; suppression of noise during stroke extraction; insensitivity to stroke scale; and applicability even when the video being processed is of poor quality. In addition, the method and device according to embodiments of the invention require no prior knowledge such as the color of the text or the contrast of the text background.
The basic principles of the present invention have been described above in conjunction with specific embodiments. It should be noted, however, that those of ordinary skill in the art will understand that all or any of the steps or components of the methods and devices of the present invention may be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including processors, storage media, etc.) or network of computing devices; this can be accomplished by those skilled in the art using their basic programming skills after having read the description of the present invention.

Therefore, the object of the present invention may also be realized by running a program, or a set of programs, on any computing device; the computing device may be a well-known general-purpose device. Accordingly, the object of the present invention may also be realized merely by providing a program product containing program code that implements the method or device. That is to say, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. Obviously, the storage medium may be any known storage medium or any storage medium developed in the future.

Where the embodiments of the present invention are implemented by software and/or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure, for example the general-purpose computer 1200 shown in Fig. 12, which is capable of performing various functions when various programs are installed thereon.
In Fig. 12, a central processing unit (CPU) 1201 performs various processing according to a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage section 1208 into a random-access memory (RAM) 1203. Data required when the CPU 1201 performs various processing is also stored in the RAM 1203 as needed. The CPU 1201, the ROM 1202 and the RAM 1203 are linked to one another via a bus 1204, to which an input/output interface 1205 is also linked.

The following components are linked to the input/output interface 1205: an input section 1206 (including a keyboard, a mouse, etc.), an output section 1207 (including a display, such as a cathode-ray tube (CRT) or a liquid-crystal display (LCD), and a loudspeaker, etc.), the storage section 1208 (including a hard disk, etc.), and a communication section 1209 (including a network interface card such as a LAN card, a modem, etc.). The communication section 1209 performs communication processing via a network such as the Internet. A drive 1210 may also be linked to the input/output interface 1205 as required. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1210 as required, so that a computer program read therefrom is installed into the storage section 1208 as needed.

Where the above-described series of processing is implemented by software, the program constituting the software is installed from a network such as the Internet, or from a storage medium such as the removable medium 1211.

Those skilled in the art will understand that such a storage medium is not limited to the removable medium 1211 shown in Fig. 12, which stores the program and is distributed separately from the device so as to provide the program to the user. Examples of the removable medium 1211 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disk (including a MiniDisc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 1202, a hard disk contained in the storage section 1208, or the like, in which the program is stored and which is distributed to the user together with the device containing it.
The present invention also proposes a program product storing machine-readable instruction codes. When the instruction codes are read and executed by a machine, the above-described method according to embodiments of the invention can be performed.

Accordingly, a storage medium carrying such a program product storing machine-readable instruction codes is also included in the disclosure of the present invention. The storage medium includes, but is not limited to, a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, and the like.

Those of ordinary skill in the art should understand that what is exemplified here is illustrative, and that the present invention is not limited thereto.

In this specification, expressions such as "first", "second" and "Nth" are used to distinguish the described features literally, so as to describe the present invention clearly; they should not be regarded as having any limiting meaning.

As an example, each step of the above method and each constituent module and/or unit of the above device may be implemented as software, firmware, hardware, or a combination thereof, as part of the corresponding device. The specific means or manners by which the modules and units may be configured through software, firmware, hardware or a combination thereof are well known to those skilled in the art and are not repeated here.

As an example, where the implementation is by software or firmware, a program constituting the software may be installed from a storage medium or a network onto a computer having a dedicated hardware structure (for example the general-purpose computer 1200 shown in Fig. 12), which is capable of performing various functions when various programs are installed thereon.

In the above description of specific embodiments of the invention, features described and/or illustrated for one embodiment may be used in one or more other embodiments in the same or a similar manner, combined with features in other embodiments, or substituted for features in other embodiments.

It should be emphasized that the term "comprises/comprising", when used herein, refers to the presence of features, elements, steps or components, but does not exclude the presence or addition of one or more other features, elements, steps or components.

In addition, the methods of the present invention are not limited to being performed in the chronological order described in the specification; they may also be performed in other chronological orders, in parallel, or independently. The order in which the methods are described in this specification therefore does not limit the technical scope of the present invention.
Regarding embodiments including the above embodiments, the following remarks are also disclosed:

Remarks

Remark 1. A method for extracting text stroke images from an image, comprising:

acquiring edge information and gradient information of the image;

performing predetermined enhancement processing on the acquired edge information and gradient information, so as to strengthen the edge information and gradient information related to the text in the image; and

acquiring text stroke images corresponding to the enhanced edge information and gradient information.

Remark 2. The method according to Remark 1, wherein the step of acquiring the edge information and gradient information of the image comprises:

analyzing step signals or pulse signals representing the edge information and gradient information of the image, and extracting the edge information and the gradient information according to the analysis result.
Remark 3. The method according to Remark 2, wherein:

the step of acquiring the edge information and gradient information of the image comprises:

performing convolution calculations between the image data of the image and a plurality of Sobel operator convolution kernels used to obtain the edge information and gradient information of the image in a plurality of directions, respectively, and

dividing the result of the convolution calculation for each direction into positive-pulse response image data and negative-pulse response image data for that direction;

the step of strengthening the edge information and gradient information related to the text in the image comprises:

performing, for the positive-pulse response image data and the negative-pulse response image data for each direction, a calculation of shifting them toward each other and adding them, to obtain first composite image data for each direction, and/or

performing, for the positive-pulse response image data and the negative-pulse response image data for each direction, a calculation of shifting them away from each other and adding them, to obtain second composite image data for each direction;

the step of acquiring the text stroke images corresponding to the enhanced edge information and gradient information comprises:

performing first binarization processing on the first composite image data and/or the second composite image data for each direction, respectively, using a predefined first threshold; performing integration processing on the first composite image data for all directions having undergone the first binarization processing, to obtain first thin-stroke image data, thereby obtaining a first thin-stroke image corresponding to the first thin-stroke image data; and/or performing integration processing on the second composite image data for all directions having undergone the first binarization processing, to obtain corresponding second thin-stroke image data, thereby obtaining a second thin-stroke image corresponding to the second thin-stroke image data; and/or

performing second binarization processing on the first composite image data and/or the second composite image data for each direction, respectively, using a predefined second threshold smaller than the first threshold; performing integration processing on the first composite image data for all directions having undergone the second binarization processing, to obtain first thick-stroke edge data, thereby obtaining a first thick-stroke image corresponding to the first thick-stroke edge data; and/or performing integration processing on the second composite image data for all directions having undergone the second binarization processing, to obtain corresponding second thick-stroke edge data, thereby obtaining a second thick-stroke image corresponding to the second thick-stroke edge data.

Remark 4. The method according to Remark 3, wherein the integration processing comprises one of intersection processing, maximization processing, and averaging processing.

Remark 5. The method according to Remark 3 or 4, wherein the step of performing convolution calculations between the image data of the image and a plurality of Sobel operator convolution kernels used to obtain the edge information and gradient information of the image in a plurality of directions comprises:

performing convolution calculations between the image data of the image and four Sobel operator convolution kernels used to obtain the edge information and gradient information of the image in the four directions of horizontal, vertical, and the two diagonals, respectively.
Remark 6. The method according to any one of Remarks 3 to 5, wherein the calculation of shifting toward each other and adding is based on the following formulas:
I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2))/2,
I_v(x, y) = (I_v-positive(x - w/2, y) + I_v-negative(x + w/2, y))/2,
I_rd(x, y) = (I_rd-positive(x + w/2, y - w/2) + I_rd-negative(x - w/2, y + w/2))/2, and
I_ld(x, y) = (I_ld-positive(x - w/2, y - w/2) + I_ld-negative(x + w/2, y + w/2))/2;
and wherein the calculation of shifting away from each other and adding is based on the following formulas:
I_h'(x, y) = (I_h-positive(x, y + w/2) + I_h-negative(x, y - w/2))/2,
I_v'(x, y) = (I_v-positive(x + w/2, y) + I_v-negative(x - w/2, y))/2,
I_rd'(x, y) = (I_rd-positive(x - w/2, y + w/2) + I_rd-negative(x + w/2, y - w/2))/2, and
I_ld'(x, y) = (I_ld-positive(x + w/2, y + w/2) + I_ld-negative(x - w/2, y - w/2))/2;
wherein x denotes the abscissa of a pixel and y denotes its ordinate; I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y) respectively denote the first composite image data for the pixel in the four directions, namely the horizontal, vertical and two diagonal directions; I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y) respectively denote the second composite image data for the pixel in those four directions; I_h-positive(x, y), I_v-positive(x, y), I_rd-positive(x, y) and I_ld-positive(x, y) respectively denote the positive pulse response image data for the pixel in those four directions; I_h-negative(x, y), I_v-negative(x, y), I_rd-negative(x, y) and I_ld-negative(x, y) respectively denote the negative pulse response image data for the pixel in those four directions; and w is a pre-estimated stroke width.
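The shift-toward-each-other synthesis for the horizontal direction can be sketched as follows. This is an illustrative version under two assumptions: image arrays are indexed as [y, x], and w/2 is taken as an integer shift with zero padding at the frame border; neither detail is specified in the text above:

```python
import numpy as np

def shift(img, dy, dx):
    """Return an array whose (r, c) entry is img[r + dy, c + dx], zero outside."""
    h, w = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[0:h, 0:w]
    sy, sx = ys + dy, xs + dx
    ok = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[ok] = img[sy[ok], sx[ok]]
    return out

def first_composite_h(pos, neg, w):
    """I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2)) / 2."""
    s = w // 2
    return (shift(pos, -s, 0) + shift(neg, +s, 0)) / 2.0
```

With a positive edge pulse at row 2 and a negative edge pulse at row 6 (a stroke of width w = 4), both shifted terms align at the stroke center, row 4, so the composite response peaks there.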
Remark 7. The method according to any one of Remarks 3 to 6, wherein the step of obtaining the first thick-stroke image and the second thick-stroke image corresponding respectively to the first thick-stroke edge data and the second thick-stroke edge data comprises:
performing filtering processing on connected-domain data in the first thick-stroke edge data and the second thick-stroke edge data using a predetermined filtering condition, to obtain first connected-domain image data and second connected-domain image data that satisfy the predetermined filtering condition; and
obtaining the first thick-stroke image and the second thick-stroke image corresponding respectively to the first connected-domain image data and the second connected-domain image data.
Remark 8. The method according to Remark 7, wherein the predetermined filtering condition comprises at least one of the following conditions:
the gray-level variance of the pixels within the connected domain is smaller than a predetermined variance threshold;
the pixel polarity from the inner edge to the outer edge of the connected domain is consistent; and
the size of the connected domain is within a predetermined size threshold.
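Two of the three filtering conditions (the size bound and the gray-level variance bound) can be sketched as below. The labeling routine and all parameter names are illustrative, not taken from the patent:

```python
import numpy as np
from collections import deque

def connected_components(mask):
    """4-connected labeling of a binary mask; returns (labels, count)."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue
        count += 1
        labels[r, c] = count
        q = deque([(r, c)])
        while q:
            y, x = q.popleft()
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if (0 <= yy < mask.shape[0] and 0 <= xx < mask.shape[1]
                        and mask[yy, xx] and not labels[yy, xx]):
                    labels[yy, xx] = count
                    q.append((yy, xx))
    return labels, count

def filter_components(mask, gray, min_size, max_size, max_var):
    """Keep components whose pixel count lies in [min_size, max_size]
    and whose gray-level variance is below max_var."""
    labels, n = connected_components(mask)
    keep = np.zeros_like(mask, dtype=bool)
    for i in range(1, n + 1):
        comp = labels == i
        size = comp.sum()
        if min_size <= size <= max_size and gray[comp].var() < max_var:
            keep |= comp
    return keep
```

A lone noise pixel fails the size bound, and a component straddling a strong intensity change fails the variance bound, so both are removed from the thick-stroke edge data.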
Remark 9. The method according to any one of Remarks 3 to 8, further comprising:
screening, based on at least one of the precision of text strokes and the recall of text strokes, one of the first thin-stroke image and the second thin-stroke image as a final thin-stroke image, and/or one of the first thick-stroke image and the second thick-stroke image as a final thick-stroke image.
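The precision/recall screening can be illustrated with a simple pixel-level computation. This is a hypothetical sketch; the patent does not specify how precision and recall are measured, so treating them as pixel counts against a reference mask is an assumption:

```python
def precision_recall(extracted, truth):
    """Pixel-level precision and recall of an extracted stroke mask
    versus a reference mask (both given as flat 0/1 sequences)."""
    tp = sum(1 for e, t in zip(extracted, truth) if e and t)
    fp = sum(1 for e, t in zip(extracted, truth) if e and not t)
    fn = sum(1 for e, t in zip(extracted, truth) if t and not e)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

The candidate image scoring higher on the chosen criterion would then be kept as the final thin-stroke or thick-stroke image.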
Remark 10. A device for extracting a text stroke image from an image, comprising:
an information obtaining unit for obtaining edge information and gradient information of the image;
an enhancing unit for performing predetermined enhancement processing on the obtained edge information and gradient information, thereby enhancing the edge information and gradient information relevant to text in the image; and
a stroke image obtaining unit for obtaining a text stroke image corresponding to the enhanced edge information and gradient information.
Remark 11. The device according to Remark 10, wherein the information obtaining unit is configured to analyze step signals or pulse signals representing the edge information and gradient information of the image, and to extract the edge information and the gradient information according to the analysis result.
Remark 12. The device according to Remark 11, wherein:
the information obtaining unit comprises:
a convolution operation subunit for performing convolution calculations on the image data of the image using a plurality of Sobel operator convolution kernels for obtaining edge information and gradient information of the image in a plurality of directions, and
a dividing subunit for dividing the results of the convolution calculations for the respective directions into positive pulse response image data and negative pulse response image data for the respective directions;
the enhancing unit comprises:
a first composite image obtaining subunit for performing, on the positive pulse response image data and the negative pulse response image data for the respective directions, a calculation of shifting toward each other and adding, to obtain first composite image data for the respective directions, and/or
a second composite image obtaining subunit for performing, on the positive pulse response image data and the negative pulse response image data for the respective directions, a calculation of shifting away from each other and adding, to obtain second composite image data for the respective directions;
and the stroke image obtaining unit comprises:
a first binarization subunit for performing first binarization processing on the first composite image data and/or the second composite image data for the respective directions using a predefined first threshold; a first thin-stroke image obtaining subunit for performing integration processing on the first-binarized first composite image data for the respective directions to obtain first thin-stroke image data, thereby obtaining a first thin-stroke image corresponding to the first thin-stroke image data; and/or a second thin-stroke image obtaining subunit for performing integration processing on the first-binarized second composite image data for the respective directions to obtain corresponding second thin-stroke image data, thereby obtaining a second thin-stroke image corresponding to the second thin-stroke image data; and/or
a second binarization subunit for performing second binarization processing on the first composite image data and/or the second composite image data for the respective directions using a predefined second threshold smaller than the first threshold; a first thick-stroke image obtaining subunit for performing integration processing on the second-binarized first composite image data for the respective directions to obtain first thick-stroke edge data, thereby obtaining a first thick-stroke image corresponding to the first thick-stroke edge data; and/or a second thick-stroke image obtaining subunit for performing integration processing on the second-binarized second composite image data for the respective directions to obtain corresponding second thick-stroke edge data, thereby obtaining a second thick-stroke image corresponding to the second thick-stroke edge data.
Remark 13. The device according to Remark 12, wherein the integration processing comprises one of an intersection operation, a maximum-value operation, and an averaging operation.
Remark 14. The device according to Remark 12 or 13, wherein the convolution operation subunit is configured to perform convolution calculations on the image data of the image using four Sobel operator convolution kernels for obtaining edge information and gradient information of the image in four directions, namely the horizontal, vertical, and two diagonal directions.
Remark 15. The device according to any one of Remarks 12 to 14, wherein the calculation of shifting toward each other and adding is based on the following formulas:
I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2))/2,
I_v(x, y) = (I_v-positive(x - w/2, y) + I_v-negative(x + w/2, y))/2,
I_rd(x, y) = (I_rd-positive(x + w/2, y - w/2) + I_rd-negative(x - w/2, y + w/2))/2, and
I_ld(x, y) = (I_ld-positive(x - w/2, y - w/2) + I_ld-negative(x + w/2, y + w/2))/2;
and wherein the calculation of shifting away from each other and adding is based on the following formulas:
I_h'(x, y) = (I_h-positive(x, y + w/2) + I_h-negative(x, y - w/2))/2,
I_v'(x, y) = (I_v-positive(x + w/2, y) + I_v-negative(x - w/2, y))/2,
I_rd'(x, y) = (I_rd-positive(x - w/2, y + w/2) + I_rd-negative(x + w/2, y - w/2))/2, and
I_ld'(x, y) = (I_ld-positive(x + w/2, y + w/2) + I_ld-negative(x - w/2, y - w/2))/2;
wherein x denotes the abscissa of a pixel and y denotes its ordinate; I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y) respectively denote the first composite image data for the pixel in the four directions, namely the horizontal, vertical and two diagonal directions; I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y) respectively denote the second composite image data for the pixel in those four directions; I_h-positive(x, y), I_v-positive(x, y), I_rd-positive(x, y) and I_ld-positive(x, y) respectively denote the positive pulse response image data for the pixel in those four directions; I_h-negative(x, y), I_v-negative(x, y), I_rd-negative(x, y) and I_ld-negative(x, y) respectively denote the negative pulse response image data for the pixel in those four directions; and w is a pre-estimated stroke width.
Remark 16. The device according to any one of Remarks 12 to 15, wherein the stroke image obtaining unit further comprises:
a filtering subunit for performing filtering processing on connected-domain data in the first thick-stroke edge data and the second thick-stroke edge data using a predetermined filtering condition, to obtain first connected-domain image data and second connected-domain image data that satisfy the predetermined filtering condition.
Remark 17. The device according to Remark 16, wherein the predetermined filtering condition comprises at least one of the following conditions:
the gray-level variance of the pixels within the connected domain is smaller than a predetermined variance threshold;
the pixel polarity from the inner edge to the outer edge of the connected domain is consistent; and
the size of the connected domain is within a predetermined size threshold.
Remark 18. The device according to any one of Remarks 12 to 17, further comprising:
a screening unit for screening, based on at least one of the precision of text strokes and the recall of text strokes, one of the first thin-stroke image and the second thin-stroke image as a final thin-stroke image, and/or one of the first thick-stroke image and the second thick-stroke image as a final thick-stroke image.
Remark 19. A program product storing machine-readable instruction code which, when read and executed by a machine, carries out the method for extracting a text stroke image from an image according to any one of Remarks 1 to 9.
Remark 20. A storage medium carrying the program product according to Remark 19.
Although the present invention has been disclosed above through the description of specific embodiments thereof, it should be understood that those skilled in the art may devise various modifications, improvements, or equivalents of the present invention within the spirit and scope of the appended claims. Such modifications, improvements, or equivalents should also be considered to fall within the scope of protection of the present invention.

Claims (10)

1. A method for extracting a text stroke image from an image, comprising:
obtaining edge information and gradient information of the image;
performing predetermined enhancement processing on the obtained edge information and gradient information, thereby enhancing the edge information and gradient information relevant to text in the image; and
obtaining a text stroke image corresponding to the enhanced edge information and gradient information.
2. The method according to claim 1, wherein the step of obtaining edge information and gradient information of the image comprises:
analyzing step signals or pulse signals representing the edge information and gradient information of the image, and extracting the edge information and the gradient information according to the analysis result.
3. The method according to claim 2, wherein:
the step of obtaining edge information and gradient information of the image comprises:
performing convolution calculations on the image data of the image using a plurality of Sobel operator convolution kernels for obtaining edge information and gradient information of the image in a plurality of directions, and
dividing the results of the convolution calculations for the respective directions into positive pulse response image data and negative pulse response image data for the respective directions;
the step of enhancing the edge information and gradient information relevant to text in the image comprises:
performing, on the positive pulse response image data and the negative pulse response image data for the respective directions, a calculation of shifting toward each other and adding, to obtain first composite image data for the respective directions, and/or
performing, on the positive pulse response image data and the negative pulse response image data for the respective directions, a calculation of shifting away from each other and adding, to obtain second composite image data for the respective directions;
and the step of obtaining a text stroke image corresponding to the enhanced edge information and gradient information comprises:
performing first binarization processing on the first composite image data and/or the second composite image data for the respective directions using a predefined first threshold; performing integration processing on the first-binarized first composite image data for the respective directions to obtain first thin-stroke image data, thereby obtaining a first thin-stroke image corresponding to the first thin-stroke image data; and/or performing integration processing on the first-binarized second composite image data for the respective directions to obtain corresponding second thin-stroke image data, thereby obtaining a second thin-stroke image corresponding to the second thin-stroke image data; and/or
performing second binarization processing on the first composite image data and/or the second composite image data for the respective directions using a predefined second threshold smaller than the first threshold; performing integration processing on the second-binarized first composite image data for the respective directions to obtain first thick-stroke edge data, thereby obtaining a first thick-stroke image corresponding to the first thick-stroke edge data; and/or performing integration processing on the second-binarized second composite image data for the respective directions to obtain corresponding second thick-stroke edge data, thereby obtaining a second thick-stroke image corresponding to the second thick-stroke edge data.
4. The method according to claim 3, wherein the integration processing comprises one of an intersection operation, a maximum-value operation, and an averaging operation.
5. The method according to claim 3 or 4, wherein the step of performing convolution calculations on the image data of the image using a plurality of Sobel operator convolution kernels for obtaining edge information and gradient information of the image in a plurality of directions comprises:
performing convolution calculations on the image data of the image using four Sobel operator convolution kernels for obtaining edge information and gradient information of the image in four directions, namely the horizontal, vertical, and two diagonal directions.
6. The method according to any one of claims 3 to 5, wherein the calculation of shifting toward each other and adding is based on the following formulas:
I_h(x, y) = (I_h-positive(x, y - w/2) + I_h-negative(x, y + w/2))/2,
I_v(x, y) = (I_v-positive(x - w/2, y) + I_v-negative(x + w/2, y))/2,
I_rd(x, y) = (I_rd-positive(x + w/2, y - w/2) + I_rd-negative(x - w/2, y + w/2))/2, and
I_ld(x, y) = (I_ld-positive(x - w/2, y - w/2) + I_ld-negative(x + w/2, y + w/2))/2;
and wherein the calculation of shifting away from each other and adding is based on the following formulas:
I_h'(x, y) = (I_h-positive(x, y + w/2) + I_h-negative(x, y - w/2))/2,
I_v'(x, y) = (I_v-positive(x + w/2, y) + I_v-negative(x - w/2, y))/2,
I_rd'(x, y) = (I_rd-positive(x - w/2, y + w/2) + I_rd-negative(x + w/2, y - w/2))/2, and
I_ld'(x, y) = (I_ld-positive(x + w/2, y + w/2) + I_ld-negative(x - w/2, y - w/2))/2;
wherein x denotes the abscissa of a pixel and y denotes its ordinate; I_h(x, y), I_v(x, y), I_rd(x, y) and I_ld(x, y) respectively denote the first composite image data for the pixel in the four directions, namely the horizontal, vertical and two diagonal directions; I_h'(x, y), I_v'(x, y), I_rd'(x, y) and I_ld'(x, y) respectively denote the second composite image data for the pixel in those four directions; I_h-positive(x, y), I_v-positive(x, y), I_rd-positive(x, y) and I_ld-positive(x, y) respectively denote the positive pulse response image data for the pixel in those four directions; I_h-negative(x, y), I_v-negative(x, y), I_rd-negative(x, y) and I_ld-negative(x, y) respectively denote the negative pulse response image data for the pixel in those four directions; and w is a pre-estimated stroke width.
7. The method according to any one of claims 3 to 6, wherein the step of obtaining the first thick-stroke image and the second thick-stroke image corresponding respectively to the first thick-stroke edge data and the second thick-stroke edge data comprises:
performing filtering processing on connected-domain data in the first thick-stroke edge data and the second thick-stroke edge data using a predetermined filtering condition, to obtain first connected-domain image data and second connected-domain image data that satisfy the predetermined filtering condition; and
obtaining the first thick-stroke image and the second thick-stroke image corresponding respectively to the first connected-domain image data and the second connected-domain image data.
8. The method according to claim 7, wherein the predetermined filtering condition comprises at least one of the following conditions:
the gray-level variance of the pixels within the connected domain is smaller than a predetermined variance threshold;
the pixel polarity from the inner edge to the outer edge of the connected domain is consistent; and
the size of the connected domain is within a predetermined size threshold.
9. The method according to any one of claims 3 to 8, further comprising:
screening, based on at least one of the precision of text strokes and the recall of text strokes, one of the first thin-stroke image and the second thin-stroke image as a final thin-stroke image, and/or one of the first thick-stroke image and the second thick-stroke image as a final thick-stroke image.
10. A device for extracting a text stroke image from an image, comprising:
an information obtaining unit for obtaining edge information and gradient information of the image;
an enhancing unit for performing predetermined enhancement processing on the obtained edge information and gradient information, thereby enhancing the edge information and gradient information relevant to text in the image; and
a stroke image obtaining unit for obtaining a text stroke image corresponding to the enhanced edge information and gradient information.
CN201110157673.4A 2011-05-31 2011-05-31 Method and device for extracting text stroke images from image Active CN102810155B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201110157673.4A CN102810155B (en) 2011-05-31 2011-05-31 Method and device for extracting text stroke images from image
JP2012110573A JP5939023B2 (en) 2011-05-31 2012-05-14 Computer program and image extraction apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110157673.4A CN102810155B (en) 2011-05-31 2011-05-31 Method and device for extracting text stroke images from image

Publications (2)

Publication Number Publication Date
CN102810155A true CN102810155A (en) 2012-12-05
CN102810155B CN102810155B (en) 2015-04-15

Family

ID=47233859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110157673.4A Active CN102810155B (en) 2011-05-31 2011-05-31 Method and device for extracting text stroke images from image

Country Status (2)

Country Link
JP (1) JP5939023B2 (en)
CN (1) CN102810155B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761520A (en) * 2013-12-05 2014-04-30 南京理工大学 Document image non-parameter binaryzation method based on stroke width
CN104036253A (en) * 2014-06-20 2014-09-10 智慧城市系统服务(中国)有限公司 Lane line tracking method and lane line tracking system
CN104112135A (en) * 2013-04-18 2014-10-22 富士通株式会社 Text image extraction device and method
CN107967459A (en) * 2017-12-07 2018-04-27 北京小米移动软件有限公司 convolution processing method, device and storage medium
CN116916047A (en) * 2023-09-12 2023-10-20 北京点聚信息技术有限公司 Intelligent storage method for layout file identification data
WO2024120107A1 (en) * 2022-12-09 2024-06-13 蔚来移动科技有限公司 Method for acquiring foreground contour of text, method for acquiring watermark image, system, apparatus and medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111557692B (en) * 2020-04-26 2022-11-22 深圳华声医疗技术股份有限公司 Automatic measurement method, ultrasonic measurement device and medium for target organ tissue

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101615252A (en) * 2008-06-25 2009-12-30 中国科学院自动化研究所 A kind of method for extracting text information from adaptive images
JP2011076302A (en) * 2009-09-30 2011-04-14 Ntt Comware Corp Device, contour extraction method program, and contour extraction

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0650534B2 (en) * 1988-11-30 1994-06-29 株式会社東芝 Mark detection method
JP4587698B2 (en) * 2004-04-21 2010-11-24 オムロン株式会社 Character component extractor
JP4420877B2 (en) * 2005-09-22 2010-02-24 シャープ株式会社 Image processing method, image processing apparatus, and image output apparatus
JP4701144B2 (en) * 2006-09-26 2011-06-15 富士通株式会社 Image processing apparatus, image processing method, and image processing program
JP5010627B2 (en) * 2009-02-19 2012-08-29 三菱重工業株式会社 Character recognition device and character recognition method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101615252A (en) * 2008-06-25 2009-12-30 中国科学院自动化研究所 A kind of method for extracting text information from adaptive images
JP2011076302A (en) * 2009-09-30 2011-04-14 Ntt Comware Corp Device, contour extraction method program, and contour extraction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Yiding et al.: "News caption segmentation algorithm based on gradient enhancement", Journal of Computer-Aided Design & Computer Graphics, vol. 21, no. 8, 31 August 2009 (2009-08-31), pages 1170-1173 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112135A (en) * 2013-04-18 2014-10-22 富士通株式会社 Text image extraction device and method
CN104112135B (en) * 2013-04-18 2017-06-06 富士通株式会社 Text image extraction element and method
CN103761520A (en) * 2013-12-05 2014-04-30 南京理工大学 Document image non-parameter binaryzation method based on stroke width
CN103761520B (en) * 2013-12-05 2016-09-21 南京理工大学 File and picture based on stroke width is without ginseng binarization method
CN104036253A (en) * 2014-06-20 2014-09-10 智慧城市系统服务(中国)有限公司 Lane line tracking method and lane line tracking system
CN107967459A (en) * 2017-12-07 2018-04-27 北京小米移动软件有限公司 convolution processing method, device and storage medium
CN107967459B (en) * 2017-12-07 2021-08-24 北京小米移动软件有限公司 Convolution processing method, convolution processing device and storage medium
WO2024120107A1 (en) * 2022-12-09 2024-06-13 蔚来移动科技有限公司 Method for acquiring foreground contour of text, method for acquiring watermark image, system, apparatus and medium
CN116916047A (en) * 2023-09-12 2023-10-20 北京点聚信息技术有限公司 Intelligent storage method for layout file identification data
CN116916047B (en) * 2023-09-12 2023-11-10 北京点聚信息技术有限公司 Intelligent storage method for layout file identification data

Also Published As

Publication number Publication date
JP5939023B2 (en) 2016-06-22
CN102810155B (en) 2015-04-15
JP2012252691A (en) 2012-12-20

Similar Documents

Publication Publication Date Title
CN102810155A (en) Method and device for extracting text stroke images from image
Ngo et al. Improved color attenuation prior for single-image haze removal
CN106875546B (en) A kind of recognition methods of VAT invoice
CN107784301A (en) Method and apparatus for identifying character area in image
CN103686194B (en) Video denoising method and device based on non-local mean value
CN101453575B (en) Video subtitle information extracting method
CN104299008B (en) Vehicle type classification method based on multi-feature fusion
US10297029B2 (en) Method and device for image segmentation
CN102054271B (en) Text line detection method and device
CN108108731B (en) Text detection method and device based on synthetic data
CN104299009B (en) License plate character recognition method based on multi-feature fusion
CN105096347B (en) Image processing apparatus and method
CN109558908B (en) Method for determining optimal edge of given area
CN106446952A (en) Method and apparatus for recognizing score image
CN103971361A (en) Image processing device and method
CN107133933B (en) Mammary X-ray image enhancement method based on convolutional neural network
Tahseen et al. Binarization Methods in Multimedia Systems when Recognizing License Plates of Cars
CN103543277A (en) Blood type result recognition algorithm based on grey level analysis and type recognition
CN104580829A (en) Terahertz image enhancing method and system
CN101930597A (en) Mathematical morphology-based image edge detection method
CN110517273B (en) Cytology image segmentation method based on dynamic gradient threshold
CN103226824B (en) Maintain the video Redirectional system of vision significance
Kim et al. Sredgenet: Edge enhanced single image super resolution using dense edge detection network and feature merge network
CN104517270A (en) Terahertz image processing method and system
CN107563963A (en) A kind of method based on individual depth map super-resolution rebuilding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant