Method for extracting raw data from an image obtained by a camera
Technical field
The present invention relates to a method of extracting raw data from an image obtained by a camera.
More particularly, but not exclusively, it relates to a method of presenting, along a desired viewing angle, data extracted from a digital image taken at an arbitrary angle of incidence by a camera or video camera, whether or not the camera is integrated into a communication device.
Background art
The invention applies in particular to the transmission and storage of text data and digital photographs, where the text data and photographs are first captured by a camera at an arbitrary angle of incidence and then processed by correcting the projection distortion and/or, optionally, by enhancing the resolution, so as to obtain a corrected image of higher readability that can be viewed along an angle of incidence different from that of the camera, for example along the normal or along any predetermined angle of incidence. Naturally, with the present invention, the useful information can be extracted either before or after correction.
This method is particularly suitable for transmitting text and/or picture information captured by a camera mounted on a mobile terminal, such as a cellular radio transmitter/receiver.
In order to extract the raw data relating to printed or handwritten information in the image, and to deduce therefrom the regions to be corrected, the applicant has already proposed a solution in patent application FR 2 840 093. That solution extracts the information by computing a difference image D(C, L) (in practice, the contrast between the light level of the background and the light level of the data to be extracted) as the extracted raw data. A threshold is used to select, from the difference image, the values to be extracted. This threshold V_s can also serve as a gradient threshold for removing a grid of lines (squared pattern). This method, however, has the following drawbacks:
- If the original image contains no grid of lines, the value V_s corresponds to a threshold used for removing noise. It has been found that this threshold is difficult to obtain by conventional histogram techniques, which do not give satisfactory results.
- If grid lines are present, a threshold suitable for finding the pattern can be determined, but this threshold cannot always be used as the threshold for extracting the information. Indeed, because the image contrast varies unpredictably with random lighting conditions, such as diffuse saturation, and with unsharp image areas, this threshold does not always remove the grid lines or the noise completely.
- In the case of a color image, the three channels (red, green and blue) must be taken into account, and it is not known whether one threshold should be used per channel or a single threshold for all channels.
Moreover, it is well known that reading and/or understanding a text reproduced from information supplied by a camera photographing an original document or picture presupposes that the photograph was taken at, or close to, normal incidence, so as to allow recognition of the characters making up the text and interpretation of the picture (which usually requires observing shapes and proportions).
In fact, when the camera captures the text at an arbitrary angle of incidence, the resulting image exhibits projection distortion: beyond a certain distance from the camera, the details needed for character recognition, and then for understanding the document, disappear.
To eliminate these drawbacks, the applicant has proposed a solution comprising: extracting identifiable context-related data present in the image captured by the camera; correcting, by means of these context-related data, the raw data supplied by the camera or the extracted data; and then storing the corrected data in a memory and/or transmitting them to an addressee so that they can be displayed for reading.
The context-related data used to correct the raw data may relate to a reliable or previously known pattern initially present in the image (a physical, drawn or printed outline), some parameters of which are known in advance.
The correction process can then comprise the following steps:
- searching for this pattern in the original image captured by the camera,
- computing the projection distortion exhibited by the original image from the distortions of the pattern caused by variations of the aforesaid parameters,
- determining, from the projection distortion, the correction to be applied to the raw data or to the extracted data,
- generating an image comprising the corrected data, taking the previously determined correction into account.
The pattern search step is then performed by a first search sequence comprising:
- detecting the borders appearing in the image,
- extracting the borders whose length exceeds a predetermined value, and
- detecting the regions defined by the borders found, having a sufficient surface area (greater than a predetermined value) and not touching the edges of the image.
For each region found, the processing comprises a computation step for determining the main axis of the region, seeking a point outside the region on said main axis, constructing an outer cone originating from this external point, extracting the edge points whose outward normal is opposite to the vector connecting them to the external point, computing the lines along which the main axes of the extracted points extend and, when four lines are found, computing the four vertices of the quadrilateral derived from these four lines; then, when the surface area of the quadrilateral is close to the surface area of the region, computing the correspondence that deforms this quadrilateral into a rectangle of predetermined proportions.
One drawback of this method is precisely that it requires the proportions of the object to be known in advance. Indeed, if the preset proportions are not the original ones, the correspondence transformation applied to the image causes a change in the proportions of the objects contained in the corrected image.
Moreover, the correspondence computations used so far have been found to be particularly complex. In fact, for each pixel of the final image, a region of the initial image has to be determined whose luminance and chrominance values are read at the position obtained from the correspondence, so that these values can then be assigned to that pixel in the final image.
Now, it can be observed that the handwritten text portion of an image generally does not comprise more than 20% of the pixels of that image, so that the remaining 80% of the image pixels are of no interest.
Summary of the invention
The object of the present invention is therefore, in particular, to solve these problems.
To this end, it first provides a method for accurately determining the noise context-related data used to correct the extracted raw data and, in particular, for accurately determining a threshold V_s for extracting printed or handwritten information regardless of whether grid lines are present, and whatever the pattern. In addition, this threshold can be used as the gradient threshold for seeking the pattern, so that the processing requirements are reduced to a single pattern search step. If the information of a color image is to be extracted, each color component of the image is taken into account and used to compute a single threshold for extracting the color information.
In the following, an image with gray levels is considered; it may consist of one of the three color channels (R-G-B) of the image, or of a combination of these channels.
More particularly, the invention provides a method of extracting raw data from an image obtained by a camera, characterized in that it comprises the following steps:
a) for each point of the image, located by its column C and its row L, determining a value V_0[C, L] formed from the color components of the image, expressed as follows:
V_0[C, L] = α·Red[C, L] + β·Green[C, L] + γ·Blue[C, L]
where α, β, γ are coefficients which may, for example, satisfy the relations:
α + β + γ = 1 and α, β, γ ≥ 0
b) computing, for each point of the image, a background value V_back[C, L],
c) computing, for each point of the image, a difference D[C, L]:
D[C, L] = V_back[C, L] − V_0[C, L] (dark data on a bright background)
or
D[C, L] = V_0[C, L] − V_back[C, L] (bright data on a dark background)
d) computing, from the contrast histogram and/or from the probability q that a regional maximum of the raw data D[C, L] contains noise, a threshold V_s constituting the noise context-related data used to correct the extracted raw data,
e) correcting the raw data D[C, L] by means of the noise context-related data V_s, so as to obtain the extracted data D*[C, L],
f) computing, for each point of the image, a corrected value I*[C, L], taking the corrected raw data D*[C, L] into account,
g) optionally, presenting the extracted data, or an image comprising these extracted data, at a desired angle.
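As an illustration, steps a) and c) can be sketched as follows in Python/NumPy. The luma-style coefficient values and the clipping of negative contrast to zero are assumptions made for the sketch; the patent only requires α + β + γ = 1 with α, β, γ ≥ 0.

```python
import numpy as np

def extract_difference(red, green, blue, background,
                       alpha=0.299, beta=0.587, gamma=0.114,
                       dark_on_bright=True):
    """Steps a) and c): weighted gray level V_0, then contrast D[C, L]."""
    v0 = alpha * red + beta * green + gamma * blue   # step a): V_0[C, L]
    if dark_on_bright:
        d = background - v0        # dark data on a bright background
    else:
        d = v0 - background        # bright data on a dark background
    return np.clip(d, 0, None)     # negative contrast carries no data (assumption)
```

The `background` array stands for V_back[C, L], whose computation is described below.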
Advantageously:
- the background value V_back can be determined by a sequence of operations comprising the following steps:
- computing, for each point of the image, a value V_{N+1}[C, L] equal to the maximum (dark data on a bright background) or to the minimum (bright data on a dark background) of different mean values of V_N computed on symmetric structuring elements centered on [C, L],
- iterating this computation a predetermined number of times (N_final), the value V_{N_final} of the final image then being taken as the value V_back of the background image.
- the background image V_back can also be determined by a sequence of operations comprising the following steps:
- generating an image V_{N+1} four times smaller than V_N, by computing, for each point of the image, a value V_{N+1}[C, L] equal to the maximum (dark data on a bright background) or to the minimum (bright data on a dark background) between a local mean of V_N centered on the point [2C+1/2, 2L+1/2] (four neighbors here) and at least one local mean comprising a larger number of pixels (sixteen neighbors here); the image V_{N+1} is thus four times smaller than the image V_N,
- iterating this computation a predetermined number of times N_final,
- interpolating the image values V_{N_final} so as to obtain a background value V_back of the same size as the initial image V_0.
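The pyramid variant above can be sketched as follows, assuming dark data on a bright background, image dimensions divisible by 2^N_final, and nearest-neighbour upsampling in place of a more elaborate interpolation:

```python
import numpy as np

def background_pyramid(v0, n_final=3):
    """Pyramid estimate of V_back: each level is four times smaller (2x per
    axis).  For each coarse pixel, take the max (dark data on a bright
    background) between the 2x2 local mean and the 4x4 local mean around it."""
    v = v0.astype(float)
    for _ in range(n_final):
        h2, w2 = v.shape[0] // 2, v.shape[1] // 2
        # 2x2 local means (four neighbours)
        m4 = v[:h2*2, :w2*2].reshape(h2, 2, w2, 2).mean(axis=(1, 3))
        # 4x4 local means (sixteen neighbours), edge-padded at the border
        p = np.pad(v[:h2*2, :w2*2], 1, mode='edge')
        m16 = np.zeros_like(m4)
        for i in range(h2):
            for j in range(w2):
                m16[i, j] = p[2*i:2*i+4, 2*j:2*j+4].mean()
        v = np.maximum(m4, m16)     # dark data on a bright background
    # nearest-neighbour upsampling back to the initial size (assumption)
    for _ in range(n_final):
        v = np.kron(v, np.ones((2, 2)))
    return v[:v0.shape[0], :v0.shape[1]]
```

A small dark mark on a uniform bright page is averaged away level by level, so the returned array approximates the page luminance everywhere.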
Since the camera is in an arbitrary position in front of the supporting medium, the raw data D[C, L] are generally affected by projection distortion. The projection distortion of the extracted raw data can be corrected by known methods for extracting geometric context-related data. Likewise, these extracted raw data are also affected by luminance and/or electronic noise, which can be eliminated by a threshold, as follows:
after computing the noise context-related data V_s for each point of the image, the value D[C, L] is compared with the threshold V_s in order to determine the value D*[C, L] to be extracted, in the following manner:
- if D[C, L] < V_s, then D*[C, L] = 0,
- if D[C, L] ≥ V_s, then the value D[C, L] is kept, i.e. D*[C, L] = D[C, L], or else it is replaced by D[C, L] − V_s, i.e. D*[C, L] = D[C, L] − V_s.
The image I* comprising the extracted data is then generated according to a subtractive principle, either by computing I*(p) = I_max − f·D*(p) (dark data on a bright background), where I_max is the value of the bright background, which may for example equal 255, or by computing I*(p) = I_min + f·D*(p) (bright data on a dark background), where I_min may equal 0.
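The thresholding and the subtractive rendering just described can be sketched as follows (the function names and the clipping to [0, 255] are assumptions of the sketch):

```python
import numpy as np

def apply_noise_threshold(d, v_s, soft=False):
    """D*[C, L]: zero below V_s; above, keep D or subtract V_s (soft variant)."""
    return np.where(d >= v_s, d - v_s if soft else d, 0.0)

def render_extracted(d_star, f=1.0, i_max=255.0, dark_on_bright=True, i_min=0.0):
    """Subtractive principle: I*(p) = I_max - f.D*(p), or I_min + f.D*(p)."""
    if dark_on_bright:
        return np.clip(i_max - f * d_star, 0, 255)
    return np.clip(i_min + f * d_star, 0, 255)
```

With a bright background, pixels whose contrast falls below V_s are rendered at I_max, i.e. as clean background.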
The threshold V_s constitutes the noise context-related data used to correct the raw data D[C, L]. It can be computed by a method based on the probability q that any regional maximum of the raw data contains noise. In the case of dark data on a bright background, this method comprises the following operating phases (the particular processing for the search for bright data on a dark background is indicated in brackets):
- first step: for each pixel p of the gray-level image I (or of a color channel, or of the luminance), the following steps are carried out:
a) for each direction d, with 0 < |d| < D:
if the following condition is satisfied:
- I is convex on [p − d, p + d], that is, for any 0 ≤ λ ≤ 1, I(p + (1 − 2λ)d) ≤ λ·I(p − d) + (1 − λ)·I(p + d) (dark data on a bright background),
or
- I is concave on [p − d, p + d], that is, for any 0 ≤ λ ≤ 1, I(p + (1 − 2λ)d) ≥ λ·I(p − d) + (1 − λ)·I(p + d) (bright data on a dark background),
then G(p, d) = (I(p + d) + I(p − d))/2,
otherwise G(p, d) = 0,
b) computing the value S(p), equal to the maximum of G(p, d) over all directions d with 0 < |d| < D.
As an alternative to this computation of S(p), S(p) may be replaced by D(p), corresponding to the raw data.
- second step: computing the value S_max, equal to the maximum of S(p) over all pixels p,
- third step: for all values s between 0 and S_max, the histogram H(s) is set to zero,
- fourth step: computing the contrast histogram over the regional-maximum pixels containing the noise to be eliminated, this computation possibly comprising:
• for each pixel p of the image S(p), if S(p) is a regional maximum, incrementing H(S(p)) according to the relation H(S(p)) ← H(S(p)) + 1,
• setting S = S_max and N = 1/q and, as long as H(S) is less than N, replacing S by S − 1; the final value of S is called S_min; N is the minimum number of regional-maximum pixels such that the mathematical expectation of the number of pixels containing noise is greater than or equal to 1,
• computing the value V_s according to the formula:
V_s = r·S_min + (1 − r)·S_max, with 1/2 ≤ r ≤ 1.
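The convexity test of phase a) can be sketched in one dimension as follows (a 1-D profile stands in for one direction d; the float tolerance and the truncation of the direction range at the profile ends are assumptions of the sketch):

```python
import numpy as np

def contrast_s(row, d_max=3):
    """S(p) for a 1-D profile: for each offset d, if the profile is convex
    on [p-d, p+d] (sampled at the integer points), G(p, d) is the mean of
    the two end values; S(p) is the maximum of G over the admissible d."""
    row = np.asarray(row, dtype=float)
    n = len(row)
    s = np.zeros(n)
    for p in range(n):
        for d in range(1, d_max):
            if p - d < 0 or p + d >= n:
                break                      # direction leaves the profile
            # chord lambda*I(p-d) + (1-lambda)*I(p+d), lambda = 0 .. 1
            lam = np.linspace(0.0, 1.0, 2 * d + 1)
            chord = lam * row[p - d] + (1 - lam) * row[p + d]
            # samples I(p + (1-2*lambda)*d): from I(p+d) down to I(p-d)
            samples = row[p - d:p + d + 1][::-1]
            if np.all(samples <= chord + 1e-9):   # convex: dark data, bright bg
                s[p] = max(s[p], (row[p + d] + row[p - d]) / 2.0)
    return s
```

A pit (dark value between two bright values) passes the convexity test and receives the surrounding background level as G; a monotone edge fails it and receives 0.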
The threshold V_s can also be computed according to the following method:
1) first step: computing the histogram of the pits, H_pits, comprising the following operating phases:
a) for each pixel p of the image I, the following operations are carried out:
i. for each direction d, with 0 < |d| < D:
if I is convex on [p − d, p + d], that is, for any 0 ≤ λ ≤ 1, I(p + (1 − 2λ)d) ≤ λ·I(p − d) + (1 − λ)·I(p + d) (dark data on a bright background),
then G(p, d) = (I(p + d) + I(p − d))/2,
otherwise G(p, d) = 0,
ii. for all directions d with 0 < |d| < D, computing S(p) as the maximum of G(p, d);
as an alternative to this computation of S(p), S(p) may be replaced by D(p), corresponding to the raw data,
b) computing the maximum of the pits, S_pits_max, equal to the maximum of S(p) over all pixels p,
c) for all values s between 0 and the maximum of the pits S_pits_max, the pits histogram H_pits is set to zero,
d) for each pixel p of the image S(p), if S(p) is a regional maximum, incrementing H_pits(S(p)) in the following manner:
H_pits(S(p)) ← H_pits(S(p)) + 1
2) second step: computing the histogram of the bumps, H_bumps, comprising the following operating phases:
a) for each pixel p of the image I, the following operations are carried out:
i. for each direction d, with 0 < |d| < D:
if I is concave on [p − d, p + d], that is, for any 0 ≤ λ ≤ 1, I(p + (1 − 2λ)d) ≥ λ·I(p − d) + (1 − λ)·I(p + d) (bright data on a dark background),
then G(p, d) = (I(p + d) + I(p − d))/2,
otherwise G(p, d) = 0,
ii. for all directions d with 0 < |d| < D, computing S(p) as the maximum of G(p, d);
as before, as an alternative to this computation of S(p), S(p) may be replaced by the value D(p) corresponding to the raw data (bright data on a dark background),
b) computing the maximum of the bumps, S_bumps_max, equal to the maximum of S(p) over all pixels p,
c) for all values s between 0 and the maximum of the bumps S_bumps_max, the bumps histogram H_bumps(s) is set to zero,
d) for each pixel p of the image S(p), if S(p) is a regional maximum, incrementing H_bumps(S(p)) in the following manner:
H_bumps(S(p)) ← H_bumps(S(p)) + 1
3) third step: superposing the pits histogram H_pits and the bumps histogram H_bumps, comprising the following phases:
a) computing S_max according to the expression:
S_max = Max(maximum of the pits S_pits_max, maximum of the bumps S_bumps_max)
b) computing H_max according to the formula:
H_max = maximum of H_pits(s) and H_bumps(s) over all values s.
c) computing s0 according to the formula:
s0 = the largest value of s such that H_pits(s) = H_max or H_bumps(s) = H_max.
d) setting s = s0 + 1, choosing α such that 0 < α < 1/2, and, as long as:
|ln(1 + H_pits(s)) − ln(1 + H_bumps(s))| < α·ln(1 + H_max)
performing s ← s + 1 (where ln is the Napierian logarithm function);
finally, S_min is determined by adding 1 to the final value of s.
4) computing the extraction threshold V_s according to the relation:
V_s = r·S_min + (1 − r)·S_max, with 1/2 < r < 1.
It can be seen that step b) of the method used to extract the raw data is iterated many times, which makes the computation of S(p) by the two methods described above costly, so that correcting the extracted raw data by means of the previously described threshold is not efficient.
This drawback can be overcome by using an alternative method, which comprises replacing S(p) by D(p).
Thus, in this case, when the probability q that any regional maximum of the raw data contains noise is known, the process of extracting the noise context-related data can comprise the following steps:
- first step: computing the value S_max, equal to the maximum of D(p) over all pixels p = [C, L], D being the image of the raw data to be corrected,
- second step: for all values S between 0 and S_max, resetting the histogram: H(S) = 0,
- third step: for each pixel p of the image D(p), if D(p) is a regional maximum, incrementing H(D(p)) according to the relation:
H(D(p)) ← H(D(p)) + 1
- fourth step: setting S = S_max and N = 1/q and, as long as H(S) is less than N, replacing S by S − 1; the final value of S is called S_min,
- fifth step: computing the value of the noise context-related data V_s according to the formula:
V_s = r·S_min + (1 − r)·S_max, with 1/2 ≤ r ≤ 1.
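The five steps above can be sketched as follows, assuming 8-connectivity for the regional maxima (the patent does not fix the neighbourhood) and a guard stopping the descent at S = 0:

```python
import numpy as np

def noise_threshold(d, q, r=0.75):
    """Threshold V_s from the regional maxima of the raw data D, for a known
    probability q that a regional maximum contains noise; r in [1/2, 1]."""
    d = np.asarray(d, dtype=float)
    # a pixel is a regional maximum if it is >= its 8 neighbours
    p = np.pad(d, 1, mode='constant', constant_values=-np.inf)
    neigh = np.stack([p[i:i + d.shape[0], j:j + d.shape[1]]
                      for i in range(3) for j in range(3) if (i, j) != (1, 1)])
    is_max = d >= neigh.max(axis=0)
    s_max = int(d.max())
    hist = np.zeros(s_max + 1)
    for v in d[is_max].astype(int):        # histogram of regional maxima
        hist[v] += 1
    s, n = s_max, 1.0 / q
    while s > 0 and hist[s] < n:           # descend while H(S) < N = 1/q
        s -= 1
    s_min = s
    return r * s_min + (1 - r) * s_max     # V_s = r.S_min + (1-r).S_max
```

With q close to 1 (almost every regional maximum is noise), S_min stays near S_max and the threshold is high; a smaller q lets S_min descend and lowers V_s.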
If the probability q that any regional maximum of the raw data contains noise is unknown, the process of extracting the noise context-related data V_s can comprise the following steps:
1) first step: computing the pits histogram H_pits, comprising the following operating phases:
a) computing the maximum of the pits, S_pits_max, equal to the maximum of D(p) over all pixels p, D being the image of the extracted raw data that are dark on a bright background,
b) for each value s between 0 and the maximum of the pits S_pits_max, the pits histogram H_pits is set to zero,
c) for each pixel p of the image D(p), if D(p) is a regional maximum, incrementing H_pits(D(p)) in the following manner:
H_pits(D(p)) ← H_pits(D(p)) + 1
2) second step: computing the bumps histogram H_bumps, comprising the following operating phases:
a) computing the maximum of the bumps, S_bumps_max, equal to the maximum of D(p) over all pixels p, D being the image of the extracted raw data that are bright on a dark background,
b) for each value s between 0 and the maximum of the bumps S_bumps_max, the histogram H_bumps is set to zero,
c) for each pixel p of the image D(p), if D(p) is a regional maximum, incrementing H_bumps(D(p)) in the following manner:
H_bumps(D(p)) ← H_bumps(D(p)) + 1
3) third step: superposing the H_pits and H_bumps histograms, comprising the following operating phases:
a) computing S_max according to the formula:
S_max = Max(maximum of the pits S_pits_max, maximum of the bumps S_bumps_max)
b) computing H_max according to the formula:
H_max = maximum of H_pits(s) and H_bumps(s) over all values s.
c) computing s0 according to the formula:
s0 = the largest value of s such that H_pits(s) = H_max or H_bumps(s) = H_max.
d) setting s = s0 + 1, choosing α such that 0 < α < 1/2, and, as long as:
|ln(1 + H_pits(s)) − ln(1 + H_bumps(s))| < α·ln(1 + H_max)
performing s ← s + 1 (where ln is the Napierian logarithm function);
finally, S_min is determined by adding 1 to the final value of s.
4) computing the value of the noise context-related data V_s according to the relation:
V_s = r·S_min + (1 − r)·S_max, with 1/2 < r < 1.
Of course, the information contained in a color image having red, green and blue channels also needs to be extracted. The steps of the method described above can be applied to each color channel by determining a threshold for each channel. The color information can then be extracted from the red, green and blue channels by taking the red, green and blue values of each pixel that exceeds its threshold, and reassembled into the final color image.
Furthermore, in order to eliminate the drawbacks of the method of seeking the pattern (a physical, drawn or printed outline), interpreting certain context-related data and introducing preset proportions, the invention proposes to determine the true height/width ratio of the quadrilateral formed by four determinable points of the pattern present in the image, these points forming part of the context-related data determined in the image, so as to reconstruct a document having the same proportions.
To this end, the applicant has proposed a method of presenting, along a desired viewing angle, information extracted from an image of the aforesaid type taken by a camera at an arbitrary angle of incidence, this method comprising:
- searching, in the image captured by the camera, for at least four identifiable characteristic points of the pattern present in the image, used for defining the context-related data,
- optionally, extracting the data according to predetermined criteria,
- computing, from the relative positions of the four points with respect to relative reference positions, the geometric deformation to be applied to the original image or to the extracted data,
- determining, from the geometric deformation, the correction to be applied to the original image or to the extracted data,
- generating an image comprising the extracted data, taking the geometric correction thus determined into account.
This method is characterized in that, in order to obtain an image comprising the extracted data with the same proportions as the object, it comprises determining the true height/width ratio of the quadrilateral defined by the aforesaid points, and taking this ratio r into account when generating the corrected image.
More specifically, the determination of the ratio of the quadrilateral (rectangle) is carried out according to a process comprising the following steps:
- searching the image for four determinable characteristic points of the pattern,
- determining the vanishing points of the sides of the quadrilateral delimited by these four points, and determining the horizon line connecting the vanishing points,
- determining the coordinates of the projection point F of the optical center O of the camera on the horizon line,
- computing the base point of the camera (the orthogonal projection of the optical center of the camera on the plane of the pattern) from the distance between the vanishing points and the projection point F, and from the distance between this projection point F and the optical center O,
- computing the focal length from the distances between the optical center, the projection point F and the base point of the camera,
- computing the coordinates of the points O1, O2, P1, P2, located on the vanishing lines at a conventional (elliptical) distance from the base point of the camera, and of the points of intersection M1, N1, M2, N2 between the vanishing lines and the lines connecting the base point of the camera to the vanishing points,
- computing the ratio of the sides of the initial pattern from the previously computed coordinates, by considering that the rectangle O1, O2, P1, P2 is the projection of a square on the plane of the pattern.
If one pair of vanishing lines intersects at a vanishing point while the other vanishing lines are parallel (vanishing point projected to infinity), the computation of the ratio r is carried out starting from a predefined focal length of the camera.
If all the vanishing points are projected to infinity, the ratio r is equal to the ratio of the lengths of the adjacent sides of the quadrilateral.
A significant advantage of this method is that it is insensitive to a lack of orthogonality between adjacent sides of the quadrilateral, a phenomenon which frequently occurs when the quadrilateral is a hand-drawn pattern. Conventional schemes are particularly sensitive to this defect (instability in the event of a lack of orthogonality).
Another advantage of this scheme is that it can reproduce texts in which the characters are not aligned.
In order to lighten the correspondence computations significantly, by avoiding unnecessary computations, by applying them only to the pixels relevant to the handwritten text in the image, and by reusing completed computations as much as possible, the applicant proposes a sequence of operations comprising the following phases:
- creating an initial (deformed) binary mask of the region to be corrected, by isolating the useful part of the initial image comprising the extracted data and assigning the same binary value (0 or 1) to the pixels of this useful part,
- computing the ideal binary mask directly from the initial mask by a change of coordinates (based on the transformation of an arbitrary polygon into a reference polygon),
- for each pixel (u, v) of the useful part of the ideal binary mask, computing, by inverse correspondence, the position (x, y) in the initial image, and determining the value of the final image at pixel (u, v) from the interpolated value of the initial image at (x, y).
Advantageously, the computation of the inverse correspondence can comprise a preliminary computation, by inverse correspondence, of the row and column lines of each pixel of the ideal mask. The position of a given pixel in the initial image can then be deduced by computing the point of intersection of the two lines.
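A minimal sketch of the inverse-correspondence phase, using a four-point projective transform computed by direct linear solution; the per-pixel loop with bilinear interpolation is illustrative and does not reproduce the row/column-line precomputation, and all function names are assumptions of the sketch:

```python
import numpy as np

def homography(src, dst):
    """3x3 projective transform mapping the four src points onto dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(a, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def rectify(img, quad, out_shape, mask=None):
    """Inverse correspondence: for each pixel (u, v) of the ideal (output)
    grid, find (x, y) in the initial image and bilinearly interpolate."""
    h_out, w_out = out_shape
    dst = [(0, 0), (w_out - 1, 0), (w_out - 1, h_out - 1), (0, h_out - 1)]
    h_inv = np.linalg.inv(homography(quad, dst))   # output -> input mapping
    out = np.zeros(out_shape)
    for v in range(h_out):
        for u in range(w_out):
            if mask is not None and not mask[v, u]:
                continue                            # skip irrelevant pixels
            x, y, w = h_inv @ (u, v, 1.0)
            x, y = x / w, y / w
            x0, y0 = int(np.floor(x)), int(np.floor(y))
            if 0 <= x0 < img.shape[1] - 1 and 0 <= y0 < img.shape[0] - 1:
                fx, fy = x - x0, y - y0
                out[v, u] = ((1-fx)*(1-fy)*img[y0, x0] + fx*(1-fy)*img[y0, x0+1]
                             + (1-fx)*fy*img[y0+1, x0] + fx*fy*img[y0+1, x0+1])
    return out
```

Passing the ideal binary mask as `mask` restricts the interpolation to the useful part of the image, in the spirit of the 20%/80% observation made earlier.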
Description of the drawings
Embodiments of the invention are described below, by way of non-limiting examples, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a camera photographing a document, with which the main parameters used in the method according to the invention can be shown;
Fig. 2 is a projection of the view of Fig. 1 on the plane of the document image;
Fig. 3 is a projection of the same type as in Fig. 2, but in which one of the two vanishing points is projected to infinity;
Fig. 4 is a chart relating to the characteristics of the inside of the quadrilateral;
Fig. 5 illustrates the invariance properties of projective geometry;
Fig. 6 is a schematic diagram of the operating steps of an image processing process according to the invention for obtaining a corrected image;
Figs. 7 to 11 are charts illustrating the computations performed according to the process shown in Fig. 6;
Fig. 12 shows an example of a pair of histograms H_pits and H_bumps in a logarithmic coordinate frame;
Fig. 13 is a schematic diagram showing the main geometric parameters of the camera;
Fig. 14 is a chart illustrating the construction of a rectangular pattern with the aforesaid physical height/width ratio in the case where there is no vanishing point.
Embodiments
In the example shown in Fig. 1, the original document photographed by the camera is placed horizontally on a planar supporting medium.
The camera is set at a certain height above the plane of the supporting medium, and is therefore located above the plane of the document, the axis of the camera pointing towards the document being inclined (here, at an angle of incidence of about 30°).
The document image captured by the camera lies in the image plane, which extends perpendicularly to the optical axis of the camera.
The orthogonal projection of the optical center C of the camera on the plane of the supporting medium is called the base of the camera.
The plane passing through the point C and parallel to the plane of the supporting medium defines the visible horizon of the supporting medium.
In the image, the rectangular pattern of the document DT gives a quadrilateral A, B, C, D (Fig. 2), the segments DC and AB of the quadrilateral being produced by two lines (vanishing lines) meeting at a point F1, and the segments DA and CB being produced by two lines (vanishing lines) meeting at a point F2. The line generated by the segment F1F2 is the horizon line.
As shown in Fig. 2:
- X is the base of the camera (projection of the optical center C on the plane of the document),
- M1 is the point of intersection of the lines (AD) and (F1X),
- N1 is the point of intersection of the lines (BC) and (F1X),
- M2 is the point of intersection of the lines (AB) and (F2X),
- N2 is the point of intersection of the lines (CD) and (F2X),
- δ is a positive constant representing the conventional distance measured from the point X on the axes (F1, X) and (F2, X),
- i is the angle of incidence,
- E is the ellipse whose major axis is parallel to (FX); the length of its major axis is δ/cos(i) and the length of its minor axis is δ,
- O1 and P1 are the points of intersection of (F1, X) with the ellipse E,
- O2 and P2 are the points of intersection of (F2, X) with the ellipse E,
- O is the center of the image,
- F is the orthogonal projection of the optical center O of the camera on the line (F1, F2).
The method according to this invention, calculate the physics aspect ratio r (r=horizontal length/vertical length) of the rectangle that forms master pattern according to one of following three kinds of operation stepss:
1) as a F
1And F
1When existing, line segment AB, BC, the uneven situation of CD, DA.In this case, operation steps comprised with the next stage:
-the phase one, project to local horizon (F by center O with image
1, F
2), the coordinate of calculation level F.
-subordinate phase, by means of following relation, by base to some F (X F), calculates the position of camera based apart from dist.
This comes from following demonstration, divides for three steps:
A) base of camera and the angle between the local horizon are 90 degree, and release:
With
Ii. therefore
B) F
1With F
2Between the angle also be 90 degree, and release:
With
G=OF/cons (i) wherein, j is F
1And F
2Between angle
We obtain thus
Ii. therefore
C) the final expression formula of XF is by synthetic relationship a) ii. and b) ii. obtains.
-the phase III, use following relation to calculate focal distance f:
(releasing) by top a) i.
-Di quadravalence section is calculated incident angle i, is expressed as follows:
(releasing) by top a) i.
- a fifth stage, in which the coordinates of the points M1, N1, O1 and P1 are determined from the previously calculated values;
- a sixth stage, in which the coordinates of the points M2, N2, O2 and P2 are determined from the previously calculated values;
- a seventh stage, in which the physical aspect ratio r is calculated from the following relation, using cross-ratios and the fact that O1, P1, O2, P2 is the projection of a square extending in the plane of the pattern: [formula omitted in the source]
This results from the fact that [O1, P1] and [O2, P2] are the projections of two segments of equal length; taking [O1, P1] and [O2, P2] as reference segments, the relative lengths of the segments [M1, N1] and [M2, N2] can be measured by means of the cross-ratio, and r can be deduced from them.
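The first stage of this procedure, projecting the image centre O onto the horizon line (F1, F2), is a plain orthogonal projection. A minimal Python sketch, purely for illustration (the function name and coordinate conventions are ours, not the patent's):

```python
def project_point_onto_line(o, a, b):
    """Orthogonal projection of the point o onto the line through a and b.

    o, a, b are (x, y) pairs; a and b play the role of F1 and F2,
    o plays the role of the image centre O."""
    ax, ay = a
    bx, by = b
    ox, oy = o
    dx, dy = bx - ax, by - ay
    # Parameter t of the foot of the perpendicular along the direction (a, b).
    t = ((ox - ax) * dx + (oy - ay) * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)
```

For example, projecting (3, 4) onto the horizontal line through (0, 0) and (10, 0) yields (3.0, 0.0), the point F directly below O on the horizon line.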
2) The case where two of the segments are parallel, so that their intersection projects to infinity (Fig. 3).
In this case the ratio r is obtained from the following relation: [formula omitted in the source]
where f is the focal length of the camera (this focal length f is assumed to have been calculated beforehand).
3) The case where there is no vanishing point (the segments AB, BC, CD, DA are parallel in pairs).
In this case the ratio is expressed simply as: [formula omitted in the source]
These relations are based on the invariants of projective geometry, in particular on the cross-ratio of four points. The relationship between these four points is shown in Fig. 5, which represents a camera with optical centre O* taking two views, A*B*C*D* and A1*B1*C1*D1*, of the same object at two different angles of incidence.
From the points A*, B*, C* and D*, a first cross-ratio can be obtained: [formula omitted in the source]
Similarly, from the points A1*, B1*, C1* and D1*, a second cross-ratio is obtained: [formula omitted in the source]
The conservation of the cross-ratio is then expressed as: [formula omitted in the source]
In the case where one of these points, for example the point A, is projected to infinity, the ratio A*B*/A*D* is taken to be equal to 1.
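The invariance invoked here can be checked numerically. In the sketch below, purely illustrative, the four sample points and the projective map of the line (x → (ax + b)/(cx + d)) are arbitrary choices of ours:

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (AC/BC) / (AD/BD) of four collinear points,
    given by their abscissae along the line."""
    return ((c - a) / (c - b)) / ((d - a) / (d - b))

def line_homography(x, m=(2.0, 1.0, 1.0, 3.0)):
    """A projective map of the line, x -> (a*x + b) / (c*x + d)."""
    a, b, c, d = m
    return (a * x + b) / (c * x + d)
```

Applying `line_homography` to four points and recomputing the cross-ratio returns the same value, which is exactly the conservation property used in the seventh stage above.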
As mentioned above, the present invention also provides a method for reshaping the image which reduces the complexity of the corresponding computation, in particular when the image contains text.
Fig. 6 illustrates the different steps of this reshaping model, which comprise:
a) In a first step, a binary mask of the deformed frame (or page) is computed, in which the detected text (writing) has been extracted. This step sets to zero all the pixels outside the quadrilateral delimiting the useful part of the image, as well as the pixels that do not correspond to writing.
Whether a point lies inside or outside the quadrilateral can be determined by the method shown in Fig. 4.
This figure shows a quadrilateral A'B'C'D' in an x, y coordinate reference of the plane, together with two points P and G inside the quadrilateral, with respective coordinates (xp, yp) and (xo, yo). The point G may be the barycentre of A'B'C'D', or more simply the centre of one of the diagonals of the quadrilateral, for example the midpoint of the segment B'D'.
The segments A'B', B'C', C'D' and D'A' are carried by the lines D1, D2, D3 and D4 respectively.
These lines, and more generally any line Di, where i = 1, 2, 3, 4, have an expression of the following type:
ai.x + bi.y + ci = 0
where ai, bi and ci are constants.
The point P then lies inside the quadrilateral if and only if, with respect to each boundary (each line D1 to D4 divides the plane into two half-planes), P is always on the same side as G. In practice this is expressed as follows:
ai.xp + bi.yp + ci and ai.xo + bi.yo + ci have the same sign for every i belonging to the set {1, 2, 3, 4}, which is written in the form:
∀i ∈ {1, 2, 3, 4}: (ai.xp + bi.yp + ci).(ai.xo + bi.yo + ci) ≥ 0
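This same-side test can be sketched directly in Python; the function name and the representation of the lines as (a, b, c) triples are ours:

```python
def same_side_inside(p, g, lines):
    """True iff p lies inside the convex quadrilateral.

    p      -- the point (xp, yp) to test
    g      -- a reference point (xo, yo) known to be inside
    lines  -- the four boundary lines as (a, b, c) with a*x + b*y + c = 0

    p is inside iff, for every boundary line, p lies on the same
    side as g (the product of the two signed values is >= 0)."""
    xp, yp = p
    xo, yo = g
    return all((a * xp + b * yp + c) * (a * xo + b * yo + c) >= 0
               for a, b, c in lines)
```

For a unit square the four lines are x = 0, x = 1, y = 0 and y = 1, and G can be taken as the centre (0.5, 0.5).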
b) In a second step, the ideal mask is computed by direct correspondence.
Reference is made here to Fig. 7, which illustrates the principle used to calculate the image of a point by correspondence. This figure shows the quadrilateral P1, P2, P3, P4 determined by the method described previously, and a point with coordinates (u, v) inside this quadrilateral.
The point O, if it exists, is the intersection of the lines (P1, P2) and (P3, P4). The point Q is the intersection of the lines (P1, P4) and (P2, P3). The point I is the intersection of the segment OP with a side of the quadrilateral, and J is the intersection of the segment QP with another side, as shown in Fig. 7 (the designations of these sides are omitted in the source).
As is known, the correspondence here provides the transformation of the quadrilateral (P1-P4) into the rectangle H(P1), H(P2), H(P3), H(P4) visible in Fig. 8.
Fig. 8 also shows the points H(I) and H(J), a point with coordinates (x, y), and the length Dx and the width Dy of the rectangle.
The conservation of the cross-ratio is written as follows: [formulas omitted in the source]
from which the coordinates of H(P) can be deduced.
Since the image of a line reduces simply to the segment joining the images of two of its points, the calculation of the image of a line by correspondence follows directly from this calculation.
The ideal mask is then calculated as follows:
Suppose that (i, j) is a pixel corresponding to writing in the distorted binary mask, and that it has four sub-pixels around it (Fig. 9).
Let A, B, C and D be the images of these sub-pixels obtained by direct correspondence (Fig. 10). A, B, C, D is therefore a quadrilateral. We consider the smallest rectangle containing this quadrilateral, and all the pixels included in this rectangle are set to "true", for example to 1.
The ideal binary mask can thus be obtained. A mechanism must then be set up for computing the images of the points with coordinates of the form (u ± 1/2, v ± 1/2), where (u, v) is a pixel.
To this end, consider a point P with coordinates (u ± 1/2, v ± 1/2). This point is determined as the intersection of two half-pixel lines: the vertical line of abscissa u ± 1/2 and the horizontal line of ordinate v ± 1/2. The image of the point P is therefore at the intersection of the images of this horizontal line and this vertical line obtained by correspondence.
Accordingly, the images of these intermediate rows (and intermediate columns) are calculated in advance. Once these images have been precomputed, the image of a sub-pixel is obtained as the intersection of the two precomputed images of the intermediate lines.
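The rectangle-filling operation described above (marking every pixel of the smallest upright rectangle containing the quadrilateral A, B, C, D) can be sketched as follows; the mask layout and the function name are ours, and the correspondence that produced the four points is assumed to have been applied beforehand:

```python
import math

def fill_quad_bbox(mask, quad):
    """Set to True every pixel of `mask` (a list of rows) lying inside
    the smallest upright rectangle containing the quadrilateral `quad`,
    given as four (x, y) points (the images A, B, C, D of the sub-pixels)."""
    xs = [q[0] for q in quad]
    ys = [q[1] for q in quad]
    x0, x1 = math.floor(min(xs)), math.ceil(max(xs))
    y0, y1 = math.floor(min(ys)), math.ceil(max(ys))
    h, w = len(mask), len(mask[0])
    for y in range(max(y0, 0), min(y1 + 1, h)):
        for x in range(max(x0, 0), min(x1 + 1, w)):
            mask[y][x] = True
    return mask
```

Marking the whole bounding rectangle rather than the exact quadrilateral slightly over-covers, which is acceptable for a binary "writing present" mask.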
c) In a third step, the inverse correspondence is applied.
In order to calculate the final image, an intensity value must be assigned to each pixel of the binary template; this intensity is calculated by finding the position of that pixel in the distorted image. To this end an inverse correspondence calculation must be carried out.
Taking up the notation of Figs. 7 and 8 again, (x, y) is regarded as a pixel of the ideal mask. This pixel is the intersection of the row y and the column x. Its position in the distorted image is therefore obtained by taking the intersection of the images of this row and this column under the inverse correspondence.
The parameters of the lines (QJ) and (OI) are then sought in order to calculate their intersection point P. The positions of the points I and J must therefore be calculated; this result is easily obtained, for example, from the distances JP3 and IP1.
This is made possible by using the following forms of the cross-ratio: [formulas omitted in the source]
The position of the point P can then be calculated.
In practice, the inverse correspondences of the rows and columns of the ideal mask are calculated in advance. The position of a given pixel in the original image is then deduced by calculating the intersection of two lines (in this example, the lines (OI) and (QJ)).
Of course, the invention is not restricted to this single method.
d) In a fourth step, the final image is created.
Suppose that (u, v) is a pixel of the ideal mask. Its position in the distorted initial image is calculated as the intersection of the precomputed inverse images of the row v and the column u. The point found is called (x, y). An intensity value is then assigned to the pixel (u, v): the intensity interpolated at the point (x, y) of the initial image. To perform this operation, bilinear interpolation is used, for example.
If all the pixels (i, j), (i+1, j), (i, j+1), (i+1, j+1) surrounding the point (x, y) are considered, as shown in Fig. 11, the interpolated intensity is given by:
I(x, y) = (y - j)[(i+1-x) I(i, j+1) + (x-i) I(i+1, j+1)]
+ (j+1-y)[(i+1-x) I(i, j) + (x-i) I(i+1, j)]
The pixel (u, v) of the final image then receives the intensity I(x, y), the grey levels of the final image being quantized.
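The bilinear interpolation formula above can be sketched as follows (the image is indexed as `img[j][i]`, a convention of ours):

```python
def bilinear(img, x, y):
    """Bilinear interpolation of the intensity at a non-integer point (x, y);
    img[j][i] holds the intensity I(i, j) of the pixel (i, j)."""
    i, j = int(x), int(y)
    fx, fy = x - i, y - j          # fx = x - i, fy = y - j
    return ((1 - fy) * ((1 - fx) * img[j][i]     + fx * img[j][i + 1])
            + fy      * ((1 - fx) * img[j + 1][i] + fx * img[j + 1][i + 1]))
```

With `fy = y - j` weighting the row j+1 and `1 - fy = j + 1 - y` weighting the row j, this is term for term the formula given in the text.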
Advantageously, the corrected image containing the data extracted from the noise can be calculated according to a subtraction principle.
As is known, the luminance is equal to a combination of the intensities of the primary colours (red, green, blue), for example: L = 0.5G + 0.3R + 0.2B.
The method according to the invention therefore comprises, for each pixel, a sequential processing consisting of: extracting the luminance, extracting the raw data D(p), calculating the noise-context data Vs, extracting the noise-corrected raw data D*(p) by means of the noise-context data, and then generating the luminance image by the following calculation: [formula omitted in the source]
Advantageously, in the case of a colour image, the subtraction principle can be used to obtain the sought-after colour of the final image by removing from the background colour, as by a filter, the chrominance contrast determined.
For example, the noise-context data Vs can be extracted on the basis of the luminance image; then the raw data DR, DG, DB of the channels, and the noise-corrected raw data (D*R, D*G, D*B) of the colour channels, can be calculated, representing the contrast between the observed chrominance RGB and the chrominance of the background (VR_back, VG_back, VB_back), with Vs used as the threshold; finally, the corrected chrominance image is generated.
As an example, suppose that, on a given pixel, the chrominance estimated for the background of a white area of the support medium is (160, 140, 110), and that this pixel represents a blue written area with chrominance (120, 115, 105). Suppose then that the noise-corrected white/blue optical contrast is (D*R, D*G, D*B) = (160-120, 140-115, 110-105) = (40, 25, 5). In our final image, the chrominance of the pixels representing the white areas of the support medium is set to (RB, GB, BB) = (255, 255, 255). The corrected chrominance of this pixel of the final image is then determined by subtracting from white the contrast weighted beforehand by a factor f, so that the corrected chrominance (R*, G*, B*) of this pixel of the final image will be, if f = 1:
(R*, G*, B*) = (RB - D*R, GB - D*G, BB - D*B) = (255-40, 255-25, 255-5) = (215, 230, 250)
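This subtraction can be sketched in a few lines of Python; the function name, the clamping to [0, 255] and the rounding are additions of ours:

```python
def corrected_color(background_rgb, contrast_rgb, f=1.0):
    """Subtract the noise-corrected contrast, weighted by the factor f,
    from the final-image background colour, clamping each channel to [0, 255]."""
    return tuple(max(0, min(255, round(b - f * c)))
                 for b, c in zip(background_rgb, contrast_rgb))

# The numerical example from the text: white background (255, 255, 255),
# corrected white/blue contrast (40, 25, 5), weight f = 1.
```

With these values the function reproduces the corrected chrominance (215, 230, 250) computed above.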
The above factor f can advantageously be used to calibrate (align) the colours obtained with reference colours, for example displayed on a test pattern.
Of course, the present invention is not limited to the embodiments described above.
Thus, it can clearly be seen that the commonly used method for determining the threshold Vs at which handwritten or typed information can be extracted from each pixel of the difference image D(p), based on knowing in advance the probability q that a regional maximum of the raw data D(p) is produced by noise, has the following two drawbacks:
- first, the empirical probability q must be known for each camera module in order to extract the information. This prevents any information extracted from an image taken by an unknown camera module from being considered trustworthy (for example, information extracted from an image received by a server in order to forward it to an addressee by fax);
- secondly, it must be known in advance whether the information is dark on a bright background or the opposite.
The invention therefore provides an improvement to this method by which these two drawbacks can be avoided. This improvement provides in particular a precise determination of the threshold Vs at which printed or handwritten information can be extracted from the difference image D(p) (analogous to D[C, L]), and also a precise determination of whether the information is dark on a bright background or, on the contrary, bright on a dark background.
Considering a grayscale image I(p), formed from one of the three colour channels of the image (red, blue, green) or from a combination of these three channels, the method according to the invention comprises the following steps, with reference to Fig. 12:
1) A first step calculates the pits histogram H_pits, comprising the following operational stages:
a) for each pixel p of the image I, the following operation is carried out:
i. for each direction d, where 0 < |d| < D:
if one of the following conditions is satisfied:
- I is convex on [p-d, p+d], that is:
for any 0 ≤ λ ≤ 1, I(p + (1-2λ)d) ≤ λ I(p-d) + (1-λ) I(p+d)
(bright data/dark background)
or
- I is concave on [p-d, p+d], that is:
for any 0 ≤ λ ≤ 1, I(p + (1-2λ)d) ≥ λ I(p-d) + (1-λ) I(p+d)
(bright data/dark background)
then G(p, d) = (I(p+d) + I(p-d))/2 is calculated;
otherwise G(p, d) = 0.
ii. for all directions d, where 0 < |d| < D, S(p) is calculated as the maximum of G(p, d).
b) The pits maximum S_pits_max is calculated, equal to the maximum of S(p) over all pixels p.
c) For each value s between 0 and the pits maximum S_pits_max, the pits histogram H_pits(s) is set to zero.
d) For each pixel p of the image S(p), the following calculation is carried out:
i. if S(p) is a regional maximum, H_pits(S(p)) is incremented as follows:
H_pits(S(p)) ← H_pits(S(p)) + 1
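Stages a) i. and a) ii. can be sketched in one dimension, testing the convexity/concavity condition at integer offsets along a single direction. This is only our reading of the text (the names and the discretisation of λ are ours):

```python
def g_value(signal, p, d):
    """G(p, d) as given in the text: the chord mid-value (I(p+d)+I(p-d))/2
    if the signal is convex or concave on [p-d, p+d], checked at the
    integer offsets of the interval; otherwise 0."""
    a, b = signal[p - d], signal[p + d]

    def chord(k):
        # Chord value at offset k in [-d, d]: k = (1 - 2*lam) * d.
        lam = (d - k) / (2 * d)
        return lam * a + (1 - lam) * b

    offsets = range(-d + 1, d)
    convex = all(signal[p + k] <= chord(k) for k in offsets)
    concave = all(signal[p + k] >= chord(k) for k in offsets)
    return (a + b) / 2 if (convex or concave) else 0

def s_value(signal, p, d_max):
    """S(p): the maximum of G(p, d) over all d with 0 < d < d_max (D in the text)."""
    return max(g_value(signal, p, d) for d in range(1, d_max))
```

A dip in a bright profile (for example 10, 10, 2, 10, 10) satisfies the convexity condition and yields G equal to the chord mid-value 10; an oscillating profile satisfies neither condition and yields 0.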
2) A second step calculates the bumps histogram H_bumps, comprising the following operational stages:
a) for each pixel p of the image I, the following operation is carried out:
i. for each direction d, where 0 < |d| < D:
if one of the following conditions is satisfied:
- I is convex on [p-d, p+d], that is:
for any 0 ≤ λ ≤ 1, I(p + (1-2λ)d) ≤ λ I(p-d) + (1-λ) I(p+d)
(dark data/bright background)
or
- I is concave on [p-d, p+d], that is:
for any 0 ≤ λ ≤ 1, I(p + (1-2λ)d) ≥ λ I(p-d) + (1-λ) I(p+d)
(dark data/bright background)
then G(p, d) = (I(p+d) + I(p-d))/2 is calculated;
otherwise G(p, d) = 0.
ii. for all directions d, where 0 < |d| < D, S(p) is calculated as the maximum of G(p, d).
b) The bumps maximum S_bumps_max is calculated, equal to the maximum of S(p) over all pixels p.
c) For each value s between 0 and the bumps maximum S_bumps_max, the bumps histogram H_bumps(s) is set to zero.
d) For each pixel p of the image S(p), the following calculation is carried out:
i. if S(p) is a regional maximum, H_bumps(S(p)) is incremented as follows:
H_bumps(S(p)) ← H_bumps(S(p)) + 1
3) A third step superposes the pits histogram H_pits and the bumps histogram H_bumps, comprising the following stages:
a) S_max is calculated according to the formula:
S_max = Max(the pits maximum S_pits_max, the bumps maximum S_bumps_max)
b) H_max is calculated according to the formula:
H_max = the maximum, over all values of s, of H_pits(s) and H_bumps(s).
c) s0 is calculated according to the formula:
s0 = the largest s such that H_pits(s) = H_max or H_bumps(s) = H_max.
d) s = s0 + 1 is set, a value α is chosen such that 0 < α < 1/2, and as long as:
|ln(1 + H_pits(s)) - ln(1 + H_bumps(s))| < α.ln(1 + H_max)
s = s + 1 is carried out (where ln is the Napierian logarithm function).
Finally, the value S_min is determined as the final value of s plus 1.
4) The value of the extraction threshold Vs is calculated according to the following formula:
Vs = r.S_min + (1 - r).S_max, where 1/2 < r ≤ 1
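Steps 3 and 4 reduce to a short histogram computation. A sketch under our naming assumptions (the histograms are plain lists indexed by s):

```python
import math

def extraction_threshold(h_pits, h_bumps, alpha=0.2, r=0.85):
    """Superpose the two histograms, search S_min as in step 3 d),
    and return Vs = r*S_min + (1 - r)*S_max (step 4)."""
    s_max = max(len(h_pits), len(h_bumps)) - 1
    hp = lambda s: h_pits[s] if s < len(h_pits) else 0
    hb = lambda s: h_bumps[s] if s < len(h_bumps) else 0
    h_max = max(max(h_pits), max(h_bumps))
    # s0: the largest s where either histogram reaches H_max.
    s0 = max(s for s in range(s_max + 1) if hp(s) == h_max or hb(s) == h_max)
    s = s0 + 1
    while (s <= s_max and
           abs(math.log(1 + hp(s)) - math.log(1 + hb(s)))
           < alpha * math.log(1 + h_max)):
        s += 1
    s_min = s + 1          # final value of s plus 1
    return r * s_min + (1 - r) * s_max
```

The guard `s <= s_max` (ours) simply stops the scan at the end of the histograms; the example parameters α = 20% and r = 85% are those given later in the text.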
5) H_pits and H_bumps are compared, with β > 0, in the following operational stages:
a) the value N_pits is calculated from the relation:
N_pits = the sum, for s = S_min to s = S_pits_max, of H_pits(s)^β.
b) the value N_bumps is calculated from the relation:
N_bumps = the sum, for s = S_min to s = S_bumps_max, of H_bumps(s)^β.
c) if N_pits is less than N_bumps, dark information on a bright background should be extracted; otherwise, bright information on a dark background is extracted.
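The comparison of step 5 is a one-liner per histogram; a sketch with names of our choosing:

```python
def dark_on_bright(h_pits, h_bumps, s_min, beta=1.0):
    """Step 5: compare the histogram masses at and above S_min.
    Returns True when dark information on a bright background
    should be extracted (N_pits < N_bumps)."""
    n_pits = sum(h ** beta for h in h_pits[s_min:])
    n_bumps = sum(h ** beta for h in h_bumps[s_min:])
    return n_pits < n_bumps
```

The exponent β > 0 weights the contribution of the stronger histogram bins; β = 1 gives a plain sum.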
6) The luminance information L(p) is extracted in the following operational stages:
a) D is calculated according to the known method.
b) For each pixel p of the difference image D(p):
If D(p) > Vs, then D(p) is considered relevant and is extracted:
i. if the information is dark information on a bright background, the value L(p) = I_max - f.D(p) is calculated, where I_max may equal 255;
ii. otherwise the value L(p) = I_min + f.D(p) is calculated, where I_min may equal 0.
If D(p) is not considered relevant:
i. if the information is dark information on a bright background, the value of L(p) equals I_max (bright background);
ii. otherwise the value of L(p) equals I_min (dark background).
For example, satisfactory results can be obtained with the following parameters:
D = 3
α = 20%
r = 85% for the extraction
f = 5
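The per-pixel mapping of step 6 can be sketched as follows; the clamping of the amplified value to [I_min, I_max] is an addition of ours (with f = 5 the product f.D(p) can otherwise exceed the grey-level range):

```python
def extract_luminance(d, v_s, dark_on_bright=True, f=5, i_max=255, i_min=0):
    """Step 6: map one difference value D(p) to an output luminance L(p)."""
    if d > v_s:                      # D(p) considered relevant
        if dark_on_bright:
            return max(i_min, i_max - f * d)
        return min(i_max, i_min + f * d)
    # D(p) not relevant: plain background.
    return i_max if dark_on_bright else i_min
```

With the example parameters, a relevant contrast of 20 on a bright background yields L(p) = 255 - 100 = 155, and an irrelevant pixel is set to the background value 255.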
The invention also relates to the simulation of the image of a rectangle (A, B, C, D), where the rectangle (A, B, C, D) has a specified physical aspect ratio r = CD/AD; where a point of the rectangle projected in the image is specified (for example the point D); where a projected distance is known (for example CD); and where the camera has a specified focal length (f), a tilt angle (π/2) - i, i being the angle of incidence, a rotation angle α about the axis of the camera and, if i ≠ 0, a specified offset angle β with respect to one of the existing vanishing points (for example F1). These different parameters are shown in Fig. 13, which represents the camera, with its optical axis and its focus, in an ox, oy, oz coordinate reference tied to the focus.
The solution to this problem comprises the following three steps, with reference to Figs. 2, 3 and 14:
- a first step, in which the positions of the three unknown points A, B and C (the point D being specified) are calculated in the new image to be generated. These points must be consistent with the physical aspect ratio r of the pattern to be projected onto this new image and with the position of the camera to be simulated (focal length, tilt angle, rotation angle, offset angle);
- a second step, in which the correspondence is calculated, so as to project the information contained in the pattern of the original image onto the calculated pattern of the simulated image;
- a third step, in which this correspondence is used to calculate the brightness and the chrominance of the new image within the contour determined from the original image.
The calculation of the three unknown points of the pattern distinguishes the following three cases:
If i ≠ 0 (there is at least one vanishing point), the calculation comprises the following four operations:
1. OX = f.tan(i)
2. OF = f/tan(i)
3. the points X and F are placed on the straight line passing through the image centre O and making an angle α with the vertical;
4. the point F1 is placed such that:
FF1 = f.tan(β)/sin(i)
a) if β ≠ 0 (two vanishing points):
i) the point F2 is placed such that FF2 = (OF.XF)/FF1
ii) from the points X, F1, F2, the point D and the distance DC, the points M1, C, N1, O1, P1, O2, P2 and N2 are deduced;
iii) the point M2 is placed so as to obtain the relation: [formula omitted in the source]
b) if β = 0 (only one vanishing point: F1 = F) (Fig. 3):
i) the point A is placed on the line (DF) such that: [formula omitted in the source]
ii) the point B is placed on the line (FC) such that BF = CF.(AF/DF)
c) if i = 0 (no vanishing point) (Fig. 14):
1) the point C is placed using the point D, the distance DC and the rotation angle α;
2) the points A and B are placed such that (A, B, C, D) is a rectangle.
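The distances used in the construction for i ≠ 0 can be sketched numerically. The relations OX, OF, FF1 and FF2 are those of the text; the reading XF = OX + OF (X and F on opposite sides of the image centre O along the same line) is an assumption of ours based on Fig. 13:

```python
import math

def vanishing_geometry(f, i, beta):
    """Distances of the i != 0 construction:
    OX = f*tan(i), OF = f/tan(i), FF1 = f*tan(beta)/sin(i) and,
    when beta != 0, FF2 = (OF * XF) / FF1 with XF = OX + OF (assumed)."""
    ox = f * math.tan(i)
    of = f / math.tan(i)
    ff1 = f * math.tan(beta) / math.sin(i)
    xf = ox + of                     # assumption: X and F straddle O
    ff2 = (of * xf) / ff1 if beta != 0 else float('inf')
    return ox, of, ff1, ff2
```

Note that by construction FF1.FF2 = OF.XF, the relation used to place the second vanishing point F2.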