CN107977593A - Image processing apparatus and image processing method - Google Patents
- Publication number
- CN107977593A (application CN201610921297.4A)
- Authority
- CN
- China
- Prior art keywords
- connected domain
- connecting line
- heading character
- character connected
- title
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
Abstract
The present invention relates to an image processing apparatus and an image processing method. The image processing apparatus according to the invention includes: a connected domain acquiring unit for obtaining multiple connected domains of a newspaper image; a character connected domain determination unit for merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains; a heading character determination unit for determining multiple heading character connected domains from the multiple character connected domains; a connecting line determination unit for determining one or more title connecting lines according to the multiple heading character connected domains; and a title area acquiring unit for combining the heading character connected domains located on the same title connecting line to obtain one or more title areas of the newspaper image. With the image processing apparatus and image processing method according to the invention, title extraction from a newspaper image can be completed automatically, which saves a large amount of manual labor and improves the efficiency of labeling digitized newspaper titles.
Description
Technical field
Embodiments of the present invention relate to the field of image processing, and more particularly to an image processing apparatus and an image processing method capable of obtaining the title areas of a newspaper image.
Background art
This section provides background information related to the present invention, which is not necessarily prior art.
With the development of information technology, many libraries have digitized the newspapers and periodicals in their collections, both to protect the originals and to make storage and lookup more convenient, and they store the content in the form of microfilm for consultation. Retrieval of these digitized newspapers often involves layout information, such as the publication date, issue and edition name, and article information, such as keywords and headlines. However, because layout information takes different forms in different newspapers and title information is laid out freely, extracting these contents automatically is rather difficult. At present, therefore, layout information and title information are mostly labeled manually. Manual labeling consumes a large amount of manpower and material resources, and it is slow and inefficient.
In view of the above technical problem, the present invention aims to propose a scheme that can automatically complete title extraction from a newspaper image, thereby saving a large amount of manual labor and improving the efficiency of labeling digitized newspaper titles.
Summary of the invention
This section provides a general summary of the invention, rather than a comprehensive disclosure of its full scope or all of its features.
An object of the invention is to provide an image processing apparatus and an image processing method that can automatically complete title extraction from a newspaper image, thereby saving a large amount of manual labor and improving the efficiency of labeling digitized newspaper titles.
According to one aspect of the invention, there is provided an image processing apparatus, including: a connected domain acquiring unit for obtaining multiple connected domains of a newspaper image; a character connected domain determination unit for merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains; a heading character determination unit for determining multiple heading character connected domains from the multiple character connected domains; a connecting line determination unit for determining one or more title connecting lines according to the multiple heading character connected domains; and a title area acquiring unit for combining the heading character connected domains located on the same title connecting line to obtain one or more title areas of the newspaper image.
According to another aspect of the invention, there is provided an image processing method, including: obtaining multiple connected domains of a newspaper image; merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains; determining multiple heading character connected domains from the multiple character connected domains; determining one or more title connecting lines according to the multiple heading character connected domains; and combining the heading character connected domains located on the same title connecting line to obtain one or more title areas of the newspaper image.
According to another aspect of the invention, there is provided a program product including machine-readable instruction code stored therein, wherein the instruction code, when read and executed by a computer, causes the computer to perform the image processing method according to the invention.
According to another aspect of the invention, there is provided a machine-readable storage medium carrying the program product according to the invention.
With the image processing apparatus and image processing method according to the invention, the character connected domains of a newspaper image can be obtained by merging overlapping connected domains and adjacent connected domains, heading character connected domains can be determined from the character connected domains, and one or more title connecting lines can be determined according to the heading character connected domains, so that the heading character connected domains located on the same title connecting line can be combined to obtain the title areas of the newspaper image. In this way, the title areas of a newspaper image can be extracted automatically, quickly and reliably, which saves a large amount of manpower and material resources and improves the efficiency of labeling digitized newspaper titles.
The description and specific examples in this summary are for illustration only and are not intended to limit the scope of the invention.
Brief description of the drawings
The drawings described here illustrate selected embodiments only, not all possible implementations, and are not intended to limit the scope of the invention. In the drawings:
Fig. 1 is a schematic diagram of a newspaper image to be processed according to an embodiment of the invention;
Fig. 2 is a block diagram of the structure of an image processing apparatus according to an embodiment of the invention;
Fig. 3 is a schematic diagram of the newspaper image after multiple connected domains have been obtained according to an embodiment of the invention;
Fig. 4 is a block diagram of the structure of a character connected domain determination unit according to an embodiment of the invention;
Fig. 5 is a schematic diagram of merging overlapping connected domains according to an embodiment of the invention;
Fig. 6 is a schematic diagram of merging adjacent connected domains according to an embodiment of the invention;
Fig. 7 is a block diagram of the structure of a heading character determination unit according to an embodiment of the invention;
Fig. 8 is a schematic diagram of the newspaper image after heading character connected domains have been determined according to an embodiment of the invention;
Fig. 9 is a block diagram of the structure of a heading character determination unit according to another embodiment of the invention;
Fig. 10 is a curve of the number of character connected domains counted with size as the variable according to an embodiment of the invention;
Fig. 11 is a block diagram of the structure of a connecting line determination unit according to an embodiment of the invention;
Fig. 12 is a block diagram of the structure of a connecting line determination unit according to another embodiment of the invention;
Fig. 13 is a block diagram of the structure of a connecting line determination unit according to yet another embodiment of the invention;
Fig. 14 is a schematic diagram of the process of determining a title connecting line according to an embodiment of the invention;
Fig. 15 is a schematic diagram of the newspaper image after title areas have been obtained according to an embodiment of the invention;
Fig. 16 is a block diagram of the structure of an image processing apparatus according to another embodiment of the invention;
Fig. 17 is a schematic diagram of the newspaper image after title areas have been obtained according to another embodiment of the invention;
Fig. 18 is a flow chart of an image processing method according to an embodiment of the invention; and
Fig. 19 is a block diagram of an exemplary structure of a general-purpose personal computer in which the image processing method according to the invention can be implemented.
Although the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are described in detail here. It should be understood, however, that the description of specific embodiments is not intended to limit the invention to the particular forms disclosed; on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the invention. Note that corresponding reference numerals indicate corresponding parts throughout the several drawings.
Detailed description of the embodiments
Examples of the invention are now described more fully with reference to the drawings. The following description is merely exemplary in nature and is not intended to limit the invention, its application or its uses.
Example embodiments are provided so that this disclosure will be thorough and will fully convey its scope to those skilled in the art. Numerous specific details, such as examples of specific units, apparatuses and methods, are set forth to provide a thorough understanding of embodiments of the invention. It will be apparent to those skilled in the art that the specific details need not be employed, that example embodiments can be implemented in many different forms, and that none of them should be construed as limiting the scope of the invention. In some example embodiments, well-known processes, well-known structures and well-known technologies are not described in detail.
Fig. 1 is a schematic diagram of a newspaper image to be processed according to an embodiment of the invention. As shown in Fig. 1, the newspaper image contains information such as the name of the newspaper, the date, body text and titles. In Fig. 1, the title areas of the newspaper image are marked with black boxes. An object of the invention is to provide an image processing apparatus and an image processing method capable of extracting the title areas of a newspaper image such as the one shown in Fig. 1.
An image processing apparatus 200 according to an embodiment of the invention is described below with reference to Fig. 2.
The image processing apparatus 200 according to the invention includes a connected domain acquiring unit 210, a character connected domain determination unit 220, a heading character determination unit 230, a connecting line determination unit 240 and a title area acquiring unit 250.
According to an embodiment of the invention, the connected domain acquiring unit 210 can obtain multiple connected domains of a newspaper image. Connected domain (also called connected region) detection is a common method in image processing and pattern recognition, and it is widely used in object segmentation, edge detection and region detection. Many methods for detecting connected domains exist in these fields, and the invention does not limit the choice. That is, the connected domain acquiring unit 210 can use any known method to obtain the multiple connected domains of a newspaper image. Further, the connected domain acquiring unit 210 can transmit the obtained connected domains to the character connected domain determination unit 220.
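Since any known detection method may be used, the following is only a minimal sketch of one such method, not the method of the invention: a breadth-first flood fill over a binarized image. The function name `connected_domains`, the use of 8-connectivity, and the representation of each connected domain by its bounding rectangle `(left, top, right, bottom)` with inclusive pixel coordinates are all assumptions made for illustration.

```python
from collections import deque


def connected_domains(binary):
    """Return one bounding rectangle (left, top, right, bottom) per
    8-connected foreground region of a 2-D list of 0/1 values.
    Hypothetical helper; coordinates are inclusive pixel indices."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Start a new region and grow it with a breadth-first search.
            seen[r][c] = True
            queue = deque([(r, c)])
            left, top, right, bottom = c, r, c, r
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes
```

In practice a library routine for connected component labeling would normally replace this loop; the sketch only shows what the unit 210 is expected to hand on to the unit 220.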
According to an embodiment of the invention, the character connected domain determination unit 220 can merge overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains. The character connected domain determination unit 220 can obtain the multiple connected domains of the newspaper image from the connected domain acquiring unit 210, and can transmit the obtained character connected domains to the heading character determination unit 230. According to an embodiment of the invention, the characters may include Chinese characters, and may also include Korean characters, Japanese characters and the like. A feature of such characters is that one character comprises one or more connected domains, so character connected domains can be obtained by merging overlapping connected domains and adjacent connected domains.
According to an embodiment of the invention, the heading character determination unit 230 can determine multiple heading character connected domains from the multiple character connected domains. The heading character determination unit 230 can obtain the multiple character connected domains of the newspaper image from the character connected domain determination unit 220, determine multiple heading character connected domains among them, and transmit these to the connecting line determination unit 240.
According to an embodiment of the invention, the connecting line determination unit 240 can determine one or more title connecting lines according to the multiple heading character connected domains. Here, a title connecting line is a line connecting the heading character connected domains that belong to the same title area. The connecting line determination unit 240 can obtain the multiple heading character connected domains from the heading character determination unit 230, determine one or more title connecting lines, and transmit the determined title connecting lines to the title area acquiring unit 250.
According to an embodiment of the invention, the title area acquiring unit 250 can combine the heading character connected domains located on the same title connecting line to obtain one or more title areas of the newspaper image. Here, the title area acquiring unit 250 can obtain the one or more title connecting lines from the connecting line determination unit 240, and thereby obtain the one or more title areas of the newspaper image.
The image processing apparatus according to the invention can determine multiple heading character connected domains from a newspaper image, and can determine the lines connecting the heading character connected domains that belong to the same title area. Once a title connecting line has been determined, the title area corresponding to that title connecting line is easily obtained. In this way, the titles of a newspaper image can be extracted automatically, quickly and reliably.
Fig. 3 is a schematic diagram of the newspaper image after multiple connected domains have been obtained according to an embodiment of the invention. As mentioned above, the connected domain acquiring unit 210 can obtain multiple connected domains of the newspaper image. Each black box in Fig. 3 marks one connected domain, and Fig. 3 shows multiple connected domains. As shown in Fig. 3, a connected domain may contain a whole character of the newspaper image, such as the character "anti-", or only a part of a character, such as the left "power" part of the character "adding". That is, a character in the newspaper image may consist of one or more connected domains. Therefore, to obtain the character connected domains of the newspaper image, the multiple connected domains obtained by the connected domain acquiring unit 210 need to be merged appropriately.
Fig. 4 is a block diagram of the structure of the character connected domain determination unit 220 according to an embodiment of the invention. As shown in Fig. 4, the character connected domain determination unit 220 can include an overlapping connected domain determination unit 221, an adjacent connected domain determination unit 222 and a merging unit 223.
According to an embodiment of the invention, when the bounding rectangles of two connected domains have an overlapping region, the overlapping connected domain determination unit 221 can determine that the two connected domains are overlapping connected domains. Here, the overlapping connected domain determination unit 221 can obtain all connected domains of the newspaper image from the connected domain acquiring unit 210 and judge whether overlapping connected domains exist among them. Further, the overlapping connected domain determination unit 221 can transmit the overlapping connected domains to the merging unit 223.
Fig. 5 is a schematic diagram of merging overlapping connected domains according to an embodiment of the invention. As shown in the picture to the left of the arrow in Fig. 5, the character "ginseng" comprises two connected domains, each marked with a black box. The two connected domains shown in Fig. 5 are rectangular, but in practice a connected domain can have any other polygonal shape. According to an embodiment of the invention, when the bounding rectangles of two connected domains have an overlapping region, that is, when one or more corners of the bounding rectangle of one connected domain lie inside the bounding rectangle of the other connected domain, the overlapping connected domain determination unit 221 can determine that the two connected domains are overlapping connected domains. As shown in the picture to the left of the arrow in Fig. 5, the two connected domains of the character "ginseng" have an overlapping region, so the overlapping connected domain determination unit 221 judges them to be overlapping connected domains.
According to an embodiment of the invention, the overlapping connected domain determination unit 221 can also handle the case of more than two overlapping connected domains. For example, if the bounding rectangles of every two connected domains in a group of connected domains have an overlapping region, the overlapping connected domain determination unit 221 can determine that the connected domains in the group are overlapping connected domains.
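The overlap judgment described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the unit 221: rectangles are `(left, top, right, bottom)` tuples with inclusive coordinates, and the function names are hypothetical. The test used here is the usual interval-intersection test, which is true whenever a corner of one rectangle lies inside the other and additionally covers cross-shaped overlaps in which no corner lies inside the other rectangle.

```python
def rectangles_overlap(a, b):
    """True if bounding rectangles a and b share at least one pixel.
    a, b: (left, top, right, bottom), inclusive coordinates (assumed)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def all_pairs_overlap(rects):
    """Group case: every two rectangles in the group must overlap."""
    return all(
        rectangles_overlap(rects[i], rects[j])
        for i in range(len(rects))
        for j in range(i + 1, len(rects))
    )
```

For instance, `rectangles_overlap((0, 0, 4, 4), (3, 3, 8, 8))` holds because the corner (3, 3) of the second rectangle lies inside the first.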
According to an embodiment of the invention, when the distance between the two closest sides of the bounding rectangles of two connected domains is less than a first threshold, and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the two connected domains is less than a second threshold, the adjacent connected domain determination unit 222 determines that the two connected domains are adjacent connected domains. According to an embodiment of the invention, the adjacent connected domain determination unit 222 can judge not only whether two connected domains are adjacent connected domains, but also whether a group of more than two connected domains are adjacent connected domains. For example, when the distance between the two closest sides of the bounding rectangles of every two connected domains in a group is less than the first threshold, and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the group is less than the second threshold, the adjacent connected domain determination unit 222 can determine that the connected domains in the group are adjacent connected domains.
Here, the adjacent connected domain determination unit 222 can obtain all connected domains of the newspaper image from the connected domain acquiring unit 210 and judge whether adjacent connected domains exist among them. Further, the adjacent connected domain determination unit 222 can transmit the adjacent connected domains to the merging unit 223.
Fig. 6 is a schematic diagram of merging adjacent connected domains according to an embodiment of the invention. As shown in the picture to the left of the arrow in Fig. 6, the character "adding" comprises two connected domains, each marked with a black box. The two connected domains shown in Fig. 6 are rectangular, but in practice a connected domain can have any other polygonal shape. Taking two connected domains as an example: when the distance between the two closest sides of their bounding rectangles is less than the first threshold, the two connected domains are close to each other; and when the difference between 1 and the aspect ratio of the bounding rectangle of the merged connected domain is less than the second threshold, the merged connected domain is close to a square. When both conditions are met, the adjacent connected domain determination unit 222 can determine that the two connected domains are adjacent connected domains. As shown in the picture to the left of the arrow in Fig. 6, the two connected domains of the character "adding" are close to each other and the connected domain obtained by merging them is very nearly square, so the adjacent connected domain determination unit 222 judges them to be adjacent connected domains.
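The two adjacency conditions can be sketched as follows for a pair of connected domains. Several details are assumptions made for illustration and are not specified in the text: rectangles are `(left, top, right, bottom)` tuples with inclusive coordinates, the distance between the closest sides is taken as the larger of the horizontal and vertical gaps in pixels, the aspect ratio is width divided by height, and the threshold values passed by the caller are arbitrary.

```python
def side_gap(a, b):
    """Gap in pixels between the closest sides of rectangles a and b
    (0 if they touch or overlap along an axis). Assumed definition."""
    gap_x = max(b[0] - a[2] - 1, a[0] - b[2] - 1, 0)
    gap_y = max(b[1] - a[3] - 1, a[1] - b[3] - 1, 0)
    return max(gap_x, gap_y)


def merged_rect(a, b):
    """Bounding rectangle of the union of a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def are_adjacent(a, b, first_threshold, second_threshold):
    """Both conditions from the text: the rectangles are close, and the
    merged rectangle is close to a square (aspect ratio close to 1)."""
    m = merged_rect(a, b)
    width = m[2] - m[0] + 1
    height = m[3] - m[1] + 1
    aspect_ratio = width / height
    close = side_gap(a, b) < first_threshold
    nearly_square = abs(aspect_ratio - 1) < second_threshold
    return close and nearly_square
```

The second condition is what keeps two neighboring characters in a line of text from being merged: their merged rectangle is roughly twice as wide as it is tall, so its aspect ratio is far from 1.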
According to an embodiment of the invention, the merging unit 223 can merge the overlapping connected domains, merge the adjacent connected domains, and take the connected domains obtained after merging as character connected domains. Here, the merging unit 223 can obtain the overlapping connected domains from the overlapping connected domain determination unit 221 and the adjacent connected domains from the adjacent connected domain determination unit 222, merge each of them, and take the merged connected domains as character connected domains. According to an embodiment of the invention, the merging unit 223 can also obtain the multiple connected domains of the newspaper image from the connected domain acquiring unit 210, so that a connected domain that is neither an overlapping connected domain nor an adjacent connected domain is taken directly as a character connected domain. Further, the merging unit 223 can transmit the determined character connected domains to the heading character determination unit 230.
As shown in the picture to the right of the arrow in Fig. 5, the bounding rectangle of the connected domain obtained by merging the two overlapping connected domains of the character "ginseng" contains the whole character. As shown in the picture to the right of the arrow in Fig. 6, the bounding rectangle of the connected domain obtained by merging the two adjacent connected domains of the character "adding" contains the whole character.
According to an embodiment of the invention, the character connected domain determination unit 220 can determine character connected domains from the multiple connected domains of the newspaper image by merging overlapping connected domains and merging adjacent connected domains. Character connected domains obtained in this way are accurate and lay the foundation for determining heading character connected domains in the next step.
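The merging step can be sketched as a grouping problem. The sketch below is an assumption-laden illustration rather than the implementation of the merging unit 223: it links every pair of rectangles accepted by a caller-supplied predicate using a union-find structure and returns one bounding rectangle per group. This is a simplification of the text, which requires every two connected domains in a group to satisfy the condition, whereas union-find also joins domains that are linked only through a chain of pairs. The names `merge_connected_domains` and `overlaps` are hypothetical.

```python
def overlaps(a, b):
    """Example predicate: bounding rectangles share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_connected_domains(rects, should_merge):
    """Group rectangles linked by should_merge(a, b) and return one
    bounding rectangle per group. Unlinked rectangles pass through
    unchanged, matching domains that are neither overlapping nor adjacent."""
    parent = list(range(len(rects)))

    def find(i):
        # Find the group representative, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if should_merge(rects[i], rects[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, (left, top, right, bottom) in enumerate(rects):
        root = find(i)
        if root in groups:
            gl, gt, gr, gb = groups[root]
            groups[root] = (min(gl, left), min(gt, top),
                            max(gr, right), max(gb, bottom))
        else:
            groups[root] = (left, top, right, bottom)
    return list(groups.values())
```

A caller could run the function once with the overlap predicate and once with an adjacency predicate, or pass a predicate that accepts either condition; the text does not prescribe an order.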
Fig. 7 is a block diagram of the structure of the heading character determination unit 230 according to an embodiment of the invention.
As shown in Fig. 7, the heading character determination unit 230 can include a comparing unit 231. According to an embodiment of the invention, the comparing unit 231 can determine the character connected domains whose size is greater than a third threshold, among the multiple character connected domains, to be heading character connected domains.
According to an embodiment of the invention, the heading character determination unit 230 can obtain the multiple character connected domains of the newspaper image from the character connected domain determination unit 220, compare the size of every character connected domain with the third threshold, and determine the character connected domains whose size is greater than the third threshold to be heading character connected domains. Further, the heading character determination unit 230 can transmit the determined heading character connected domains to the connecting line determination unit 240.
As shown in Fig. 1, heading characters in a newspaper image are normally larger than body text characters. Therefore, according to an embodiment of the invention, the third threshold can be set from practical experience, so that character connected domains whose size is greater than the third threshold are determined to be heading character connected domains.
According to an embodiment of the invention, the size of a character connected domain can be measured by various parameters, such as the area of the character connected domain, the mean of the length and width of its bounding rectangle, or the larger of the length and width of its bounding rectangle. A different reference threshold can be set for each measure. For example, when size is measured by area, the comparing unit 231 can determine the character connected domains whose area is greater than the third threshold set for area to be heading character connected domains. When size is measured by the mean of the length and width of the bounding rectangle, the comparing unit 231 can determine the character connected domains for which this mean is greater than the third threshold set for that mean to be heading character connected domains. When size is measured by the larger of the length and width of the bounding rectangle, the comparing unit 231 can determine the character connected domains for which this larger value is greater than the third threshold set for that value to be heading character connected domains.
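The three size measures and the comparison with the third threshold can be sketched as follows. The sketch assumes that each character connected domain is represented by its bounding rectangle `(left, top, right, bottom)` with inclusive coordinates, and that "area" means the area of that rectangle rather than a pixel count; the text leaves both points open. The function names and the measure labels are hypothetical.

```python
def domain_size(rect, measure="area"):
    """Size of a character connected domain under one of three measures."""
    width = rect[2] - rect[0] + 1
    height = rect[3] - rect[1] + 1
    if measure == "area":
        return width * height          # bounding-rectangle area (assumed)
    if measure == "mean_side":
        return (width + height) / 2    # mean of length and width
    if measure == "max_side":
        return max(width, height)      # larger of length and width
    raise ValueError(f"unknown size measure: {measure}")


def heading_character_domains(rects, third_threshold, measure="area"):
    """Keep the domains whose size exceeds the third threshold.
    The threshold must be chosen for the same measure."""
    return [r for r in rects if domain_size(r, measure) > third_threshold]
```

Because the three measures have different units and ranges, a threshold tuned for one of them cannot be reused for another, which is why the text sets a separate reference threshold per measure.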
Fig. 8 is a schematic diagram of the newspaper image after heading character connected domains have been determined according to an embodiment of the invention. In Fig. 8 the heading character connected domains are marked with black boxes. That is, after the heading character determination unit 230 has determined the heading character connected domains, the body text character connected domains of the newspaper image are removed and only the heading character connected domains remain.
As mentioned above, the third threshold can be set from practical experience. According to an embodiment of the invention, the third threshold can also be calculated from the sizes of all character connected domains of the newspaper image.
Fig. 9 is a block diagram of the structure of the heading character determination unit 230 according to another embodiment of the invention.
As shown in Fig. 9, the heading character determination unit 230 can include a size determination unit 232, a statistic unit 233, a threshold determination unit 234 and the comparing unit 231.
According to an embodiment of the invention, the size determination unit 232 can determine the size of every character connected domain among the multiple character connected domains. Here, the size determination unit 232 can obtain all character connected domains of the newspaper image from the character connected domain determination unit 220 and calculate their sizes. Further, the size determination unit 232 can transmit the sizes of all character connected domains to the statistic unit 233.
According to an embodiment of the invention, the statistic unit 233 can count the number of character connected domains with the size of the character connected domain as the variable. Here, the statistic unit 233 can obtain the sizes of all character connected domains from the size determination unit 232, and can transmit the counted number of character connected domains of each size to the threshold determination unit 234.
According to an embodiment of the invention, the threshold determination unit 234 can determine the third threshold according to the size that has the largest number of character connected domains. Here, the threshold determination unit 234 can obtain the number of character connected domains of each size from the statistic unit 233, determine the size that has the largest number of character connected domains, and multiply that size by an empirical coefficient to obtain the third threshold. Further, the threshold determination unit 234 can transmit the determined third threshold to the comparing unit 231, which uses it to determine the heading character connected domains.
As described above, the size of a character connected domain can be measured with various parameters, such as the area of the character connected domain, the average of the length and width of its bounding rectangle, or the larger of the length and width of its bounding rectangle. Here, the size parameter used by the size determination unit 232, the statistic unit 233 and the threshold determination unit 234 is the same as the one used by the comparing unit 231. Taking area as the size measure, for example, the size determination unit 232 can determine the area of every character connected domain, the statistic unit 233 can count the character connected domains having each area, and the threshold determination unit 234 can determine the third threshold from the area shared by the largest number of character connected domains.
Fig. 10 is a graph, according to an embodiment of the invention, of the number of character connected domains plotted against size.
The statistic unit 233 counts the character connected domains with size as the variable and obtains the graph shown in Fig. 10. Here, the size at which the count reaches its maximum N is L, so the threshold determination unit 234 can determine the third threshold from the size L.
According to an embodiment of the invention, the threshold determination unit 234 can determine the third threshold T as:
$$T = k_1 \times L$$
where $L$ is the size shared by the largest number of character connected domains, and $k_1$ is an empirical coefficient with $k_1 > 1$.
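As an illustration only, the threshold rule above can be sketched in Python. The function name `third_threshold`, the rounding of sizes to integers before counting, and the default value of `k1` are assumptions made for this sketch; the text only requires that the coefficient exceed 1 and does not say how sizes are binned.

```python
from collections import Counter

def third_threshold(sizes, k1=2.0):
    """Return T = k1 * L, where L is the most frequent (rounded) size.

    `sizes` is assumed to be a non-empty list of size measurements.
    """
    if k1 <= 1:
        raise ValueError("k1 is an empirical coefficient and must exceed 1")
    # Assumption: sizes are rounded to integers so near-equal sizes share a bin.
    counts = Counter(round(s) for s in sizes)
    mode_size, _count = counts.most_common(1)[0]
    return k1 * mode_size

# Hypothetical sizes: mostly body text around 10, two larger characters.
sizes = [10, 10.2, 9.8, 11, 30, 31]
T = third_threshold(sizes, k1=2.0)
```

Because body text usually dominates a newspaper page, the most frequent size tends to be the body-text size, and a coefficient above 1 places the threshold above it.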
According to an embodiment of the invention, the heading character determination unit 230 can determine the third threshold from all character connected domains of the newspaper image and then use the third threshold to determine all heading character connected domains among the character connected domains, so that the heading character connected domains are determined more accurately.
According to an embodiment of the invention, after the heading character determination unit 230 determines the multiple heading character connected domains, the connecting line determination unit 240 can determine one or more title connecting lines from them. Fig. 11 is a block diagram of the connecting line determination unit 240 according to an embodiment of the invention.
As shown in Fig. 11, the connecting line determination unit 240 can include a finding unit 241 and a determination unit 242.
According to an embodiment of the invention, the finding unit 241 can traverse the heading character connected domains that do not belong to any title connecting line already found, taking each as a starting heading character connected domain from which to find a title connecting line.
According to an embodiment of the invention, the determination unit 242 can determine all title connecting lines found as the one or more title connecting lines.
According to an embodiment of the invention, the finding unit 241 can first determine one starting heading character connected domain and begin looking for a title connecting line from it. In an embodiment of the invention, at most one title connecting line can be found from one starting heading character connected domain. After a title connecting line has been found from one starting heading character connected domain, the finding unit 241 can choose the next starting heading character connected domain from among the heading character connected domains that neither belong to a title connecting line already found nor have already served as a starting heading character connected domain, and begin looking for a title connecting line from it. In this way, the finding unit 241 can determine one or more starting heading character connected domains, and each time one is determined it looks for a title connecting line from that domain, until every heading character connected domain either belongs to a title connecting line or has served as a starting heading character connected domain.
According to an embodiment of the invention, the finding unit 241 can determine the starting heading character connected domains by traversal. That is, following an order of the heading character connected domains (for example, the order in which they appear on the newspaper image, or the order in which they are stored), it successively chooses a heading character connected domain that does not belong to any title connecting line already found as the starting heading character connected domain.
According to an embodiment of the invention, the finding unit 241 can also determine the starting heading character connected domain at random. That is, the finding unit 241 randomly selects, from the heading character connected domains that neither belong to a title connecting line already found nor have already served as a starting heading character connected domain, one heading character connected domain as the starting heading character connected domain.
Next, the determination unit 242 can take all title connecting lines found as the one or more title connecting lines of the newspaper image.
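The traversal variant of this outer loop can be sketched as follows. This is a minimal sketch under stated assumptions: `find_all_title_lines` and the callback `find_line_from` are hypothetical names, and the callback stands in for whatever procedure finds at most one title connecting line from a given starting heading character connected domain.

```python
def find_all_title_lines(domain_ids, find_line_from):
    """Traverse domains in order; start a search from each one not yet on a line.

    find_line_from(start_id) is a hypothetical callback returning a list of
    domain ids (the title connecting line found) or None if nothing is found.
    """
    on_some_line = set()   # domains already belonging to a found title line
    title_lines = []
    for start in domain_ids:          # display order or storage order
        if start in on_some_line:
            continue                  # already covered, not a valid start
        line = find_line_from(start)  # at most one line per starting domain
        if line:
            title_lines.append(line)
            on_some_line.update(line)
    return title_lines

# Toy callback: only domains 0 and 2 yield a line.
toy = {0: [0, 1, 4], 2: [2, 5]}
result = find_all_title_lines(range(6), lambda s: toy.get(s))
```

The `for` loop visits each domain once, so a domain that has already served as a start is never chosen again, which matches the stopping rule described in the text.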
Next, the finding unit 241 according to an embodiment of the invention is described in detail with reference to Fig. 12 and Fig. 13.
Fig. 12 is a block diagram of the connecting line determination unit according to another embodiment of the invention. As shown in Fig. 12, the finding unit 241 can include a stable state connecting line set determination unit 2411 and an output unit 2412.
According to an embodiment of the invention, the stable state connecting line set determination unit 2411 can repeat the following operations for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the starting heading character connected domain, until no end heading character connected domain of any stable state connecting line in the set has a neighbor heading character connected domain: take each connecting line that connects the end heading character connected domain to one of its neighbor heading character connected domains as a transient state connecting line; when the neighbor heading character connected domain meets a predetermined condition, join the transient state connecting line containing that neighbor to the stable state connecting line containing the end heading character connected domain, thereby updating that stable state connecting line as stored in the set; and when the neighbor heading character connected domain does not meet the predetermined condition, store the transient state connecting line containing that neighbor in the set as a new stable state connecting line.
According to an embodiment of the invention, the output unit 2412 can determine the stable state connecting line in the set that contains the most heading character connected domains as a title connecting line.
According to an embodiment of the invention, when two or more stable state connecting lines in the set contain the same, largest number of heading character connected domains, the output unit 2412 can randomly select one of them as the title connecting line, which ensures that at most one title connecting line is found from one starting heading character connected domain.
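The selection rule of the output unit can be illustrated with a short sketch. The name `pick_title_line` and the representation of each stable state connecting line as a list of domain numbers are assumptions made here for illustration.

```python
import random

def pick_title_line(stable_lines, rng=random):
    """Return the stable line with the most domains; break ties at random."""
    if not stable_lines:
        return None
    longest = max(len(line) for line in stable_lines)
    candidates = [line for line in stable_lines if len(line) == longest]
    return rng.choice(candidates)  # a single candidate is returned unchanged

# Stable state connecting line set taken from the Fig. 14 walkthrough.
lines = [[1, 2], [1, 3], [2, 5], [3, 7], [4, 5],
         [0, 1, 4, 6, 8, 9], [5, 6], [6, 7]]
best = pick_title_line(lines)
```

Passing a seeded `random.Random` instance as `rng` makes the tie-break reproducible, which can help when comparing runs.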
Fig. 13 is a block diagram of the connecting line determination unit according to yet another embodiment of the invention. As shown in Fig. 13, the finding unit 241 can also include a starting heading character connected domain determination unit 2413.
According to an embodiment of the invention, the starting heading character connected domain determination unit 2413 can traverse the heading character connected domains that do not belong to any title connecting line already found, taking each as a starting heading character connected domain.
According to an embodiment of the invention, once the starting heading character connected domain determination unit 2413 determines a starting heading character connected domain, the stable state connecting line set determination unit 2411 can determine the stable state connecting line set of that starting heading character connected domain using the method of the embodiment described above. According to an embodiment of the invention, each starting heading character connected domain has its own stable state connecting line set, which stores the one or more connecting lines found starting from that heading character connected domain. Next, the output unit 2412 can determine one title connecting line from the stable state connecting line set. According to an embodiment of the invention, because the starting heading character connected domain determination unit 2413 can determine multiple starting heading character connected domains by traversal, the output unit 2412 can correspondingly output multiple title connecting lines.
According to an embodiment of the invention, the stable state connecting line set determination unit 2411 can also include an initialization unit (not shown). The initialization unit can initialize the stable state connecting line set of the starting heading character connected domain to contain the following stable state connecting lines: each connecting line that connects the starting heading character connected domain to one of its neighbor heading character connected domains.
According to an embodiment of the invention, the initialization unit first initializes the stable state connecting line set of the starting heading character connected domain. The stable state connecting line set determination unit 2411 can then update this set as described in the embodiments of the invention, thereby determining the final stable state connecting line set.
The function of the finding unit 241, namely determining the stable state connecting line set for a specific starting heading character connected domain and determining one title connecting line from that set, is described in detail below with reference to Fig. 14.
Fig. 14 is a schematic diagram of the process of determining one title connecting line according to an embodiment of the invention.
As shown in Fig. 14, the numbers 0 to 9 denote 10 heading character connected domains; for convenience of description, the No. 0 heading character connected domain is taken as the starting heading character connected domain.
First, the initialization unit can initialize the stable state connecting line set of the starting heading character connected domain to contain the following stable state connecting lines: each connecting line that connects the starting heading character connected domain to one of its neighbor heading character connected domains.
In the invention, the neighbor heading character connected domains of a heading character connected domain can be defined. When two heading character connected domains meet a first group of predetermined conditions, each is called a neighbor heading character connected domain of the other. According to an embodiment of the invention, the first group of predetermined conditions can include a constraint on the distance between the two heading character connected domains. As a specific example, the first group of predetermined conditions is: the distance between the centers of the two heading character connected domains is less than twice the smaller of the following two values: the larger of the height and width of one heading character connected domain, and the larger of the height and width of the other heading character connected domain, i.e.:
$$d_{ij} < 2 \cdot \min\bigl(\max(i_w, i_h),\ \max(j_w, j_h)\bigr)$$
where $d_{ij}$ denotes the distance between the centers of the i-th and j-th heading character connected domains, $i_w$ and $i_h$ denote the width and height of the i-th heading character connected domain, and $j_w$ and $j_h$ denote the width and height of the j-th heading character connected domain.
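The neighbor test in this specific example translates directly into code. In the sketch below, the function name `are_neighbors` and the argument layout (center as an `(x, y)` tuple, then width and height) are assumptions; the inequality itself is the one stated above.

```python
import math

def are_neighbors(center_i, w_i, h_i, center_j, w_j, h_j):
    """First predetermined condition: d_ij < 2 * min(max(iw, ih), max(jw, jh))."""
    d_ij = math.dist(center_i, center_j)          # Euclidean centre distance
    limit = 2 * min(max(w_i, h_i), max(w_j, h_j))
    return d_ij < limit

# Two characters of similar size, 15 px apart: limit is 2 * min(10, 12) = 20.
close = are_neighbors((0, 0), 10, 10, (15, 0), 8, 12)
# The same two characters 25 px apart are too far to be neighbors.
far = are_neighbors((0, 0), 10, 10, (25, 0), 8, 12)
```

Taking the minimum of the two characters' larger dimensions means a large character does not pull in a distant small one, since the smaller character sets the limit.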
As shown in Fig. 14, the No. 1 heading character connected domain is a neighbor heading character connected domain of the No. 0 heading character connected domain. The initialization unit can therefore initialize the stable state connecting line set of the No. 0 heading character connected domain to contain the following stable state connecting line: the stable state connecting line 0-1 connecting the No. 0 and No. 1 heading character connected domains. The figure shows only the case in which the No. 0 heading character connected domain has a single neighbor heading character connected domain. In practice, a starting heading character connected domain may have multiple neighbor heading character connected domains, in which case its initialized stable state connecting line set contains multiple stable state connecting lines.
Next, the stable state connecting line set determination unit 2411 can perform the following operation for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the starting heading character connected domain: take each connecting line that connects the end heading character connected domain to one of its neighbor heading character connected domains as a transient state connecting line.
In the invention, the end heading character connected domain of a stable state connecting line is the heading character connected domain most recently connected to that stable state connecting line. In the example of Fig. 14, the end heading character connected domain of stable state connecting line 0-1 is the No. 1 heading character connected domain. Because the stable state connecting line set may contain multiple stable state connecting lines, there may also be multiple end heading character connected domains, and the same operation is performed for each of them. The neighbor heading character connected domains of an end heading character connected domain can be judged by a method similar to that used for the starting heading character connected domain, which is not repeated here.
As shown in Fig. 14, the No. 1 heading character connected domain, as an end heading character connected domain, has three neighbor heading character connected domains: No. 2, No. 3 and No. 4. The stable state connecting line set determination unit 2411 takes the connecting line 1-2 connecting the No. 1 and No. 2 heading character connected domains, the connecting line 1-3 connecting No. 1 and No. 3, and the connecting line 1-4 connecting No. 1 and No. 4 as transient state connecting lines.
Next, the stable state connecting line set determination unit 2411 can judge whether each neighbor heading character connected domain of the end heading character connected domain meets a second group of predetermined conditions. When the second group is met, the transient state connecting line containing the neighbor is joined to the stable state connecting line containing the end heading character connected domain, updating that stable state connecting line as stored in the set; when it is not met, the transient state connecting line containing the neighbor is stored in the set as a new stable state connecting line.
According to an embodiment of the invention, the second group of predetermined conditions can include constraints on the length and width of the neighbor heading character connected domain of the end heading character connected domain, on the slope of the transient state connecting line containing it, and on its distance to the stable state connecting line containing the end heading character connected domain.
As a specific example, the second group of predetermined conditions can include:
a. the difference between the length of the neighbor heading character connected domain of the end heading character connected domain and the length of the end heading character connected domain (or the median length of all heading character connected domains on the stable state connecting line containing the end heading character connected domain) is less than a fourth threshold;
b. the difference between the width of the neighbor heading character connected domain and the width of the end heading character connected domain (or the median width of all heading character connected domains on the stable state connecting line containing the end heading character connected domain) is less than a fifth threshold;
c. the difference between the slope of the transient state connecting line containing the neighbor heading character connected domain and the slope of the stable state connecting line containing the end heading character connected domain is less than a sixth threshold; and
d. the distance from the center of the neighbor heading character connected domain to the stable state connecting line containing the end heading character connected domain is less than a seventh threshold.
According to an embodiment of the invention, the fourth, fifth, sixth and seventh thresholds can be set according to actual requirements or experience, or according to some criterion. For example, the seventh threshold can be set to $k_2 \cdot \max(c_w, c_h)$, where $k_2$ is an empirical coefficient with $k_2 < 1$, $c_w$ is the width of the neighbor heading character connected domain of the end heading character connected domain, and $c_h$ is its height.
In the second group of predetermined conditions, meeting conditions a and b indicates that the neighbor heading character connected domain is close in size to the end heading character connected domain, and meeting conditions c and d indicates that the transient state connecting line containing the neighbor and the stable state connecting line containing the end heading character connected domain lie almost on one straight line. Therefore, when a neighbor heading character connected domain meets the second group of predetermined conditions, it can be connected onto the stable state connecting line containing the end heading character connected domain.
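A possible check for conditions a to d is sketched below, under several assumptions that the text does not fix. "Length" is mapped to the bounding-box height; the stable state connecting line is approximated by the straight line through the centers of its first and last domains; and slopes are compared as direction angles rather than raw slopes so that vertical titles do not produce infinite values, which is a substitution made for this sketch. The names `Box` and `meets_second_condition`, and the threshold parameters `t4`, `t5`, `t6`, `k2`, are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Box:
    cx: float  # centre x
    cy: float  # centre y
    w: float   # bounding-box width
    h: float   # bounding-box height (used here as the "length")

def direction(a, b):
    """Undirected angle of the segment a -> b, folded into [0, pi)."""
    return math.atan2(b.cy - a.cy, b.cx - a.cx) % math.pi

def angle_gap(t1, t2):
    """Smallest difference between two undirected angles."""
    d = abs(t1 - t2)
    return min(d, math.pi - d)

def distance_to_line(p, a, b):
    """Distance from the centre of p to the line through centres of a and b."""
    dx, dy = b.cx - a.cx, b.cy - a.cy
    length = math.hypot(dx, dy)
    if length == 0:  # degenerate line: fall back to point distance
        return math.hypot(p.cx - a.cx, p.cy - a.cy)
    return abs(dx * (p.cy - a.cy) - dy * (p.cx - a.cx)) / length

def meets_second_condition(line, nb, t4, t5, t6, k2=0.5):
    """line: list of Box on the stable line (len >= 2); nb: neighbour Box."""
    head, end = line[0], line[-1]
    cond_a = abs(nb.h - end.h) < t4                      # similar length
    cond_b = abs(nb.w - end.w) < t5                      # similar width
    cond_c = angle_gap(direction(end, nb), direction(head, end)) < t6
    cond_d = distance_to_line(nb, head, end) < k2 * max(nb.w, nb.h)
    return cond_a and cond_b and cond_c and cond_d

stable = [Box(0, 0, 8, 8), Box(10, 0, 8, 8)]     # horizontal stable line
aligned = Box(20, 0.5, 8, 8)                     # continues the line
below = Box(10, 10, 8, 8)                        # sits under the end domain
ok = meets_second_condition(stable, aligned, t4=3, t5=3, t6=0.2)
rejected = meets_second_condition(stable, below, t4=3, t5=3, t6=0.2)
```

In this toy case the aligned box passes all four conditions, while the box below the end domain has the right size but fails the direction and distance checks, which mirrors the role of conditions c and d described in the text.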
In the example of Fig. 14, the stable state connecting line set determination unit 2411 can judge whether the neighbor heading character connected domains of the No. 1 heading character connected domain (the end heading character connected domain), namely No. 2, No. 3 and No. 4, meet the second group of predetermined conditions. As shown in Fig. 14, although No. 2 and No. 3 meet conditions a and b, they may fail conditions c and d and therefore do not meet the second group of predetermined conditions, whereas No. 4 meets it. According to an embodiment of the invention, the stable state connecting line set determination unit 2411 joins the transient state connecting line 1-4 containing No. 4 to the stable state connecting line 0-1 containing No. 1, updating the stable state connecting line 0-1 stored in the set. That is, the stored stable state connecting line 0-1 becomes 0-1-4. According to an embodiment of the invention, the stable state connecting line set determination unit 2411 stores the transient state connecting line 1-2 containing No. 2 and the transient state connecting line 1-3 containing No. 3 in the set as new stable state connecting lines. After these steps, the stable state connecting line set holds three stable state connecting lines: 0-1-4, 1-2 and 1-3.
Next, the stable state connecting line set determination unit 2411 can judge whether the end heading character connected domain of each stable state connecting line in the set still has a neighbor heading character connected domain. If the end heading character connected domain of any stable state connecting line in the set still has a neighbor heading character connected domain, the stable state connecting line set determination unit 2411 can repeat the operations described above for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the starting heading character connected domain, until no end heading character connected domain of any stable state connecting line in the set has a neighbor heading character connected domain.
In the example of Fig. 14, for the No. 2 heading character connected domain, which is the end heading character connected domain of stable state connecting line 1-2 in the set: the connecting line 2-5 connecting No. 2 to its neighbor No. 5 is taken as a transient state connecting line; No. 5 is judged not to meet the second group of predetermined conditions; and the transient state connecting line 2-5 is stored in the set as a new stable state connecting line. For the No. 3 heading character connected domain, which is the end heading character connected domain of stable state connecting line 1-3: the connecting line 3-7 connecting No. 3 to its neighbor No. 7 is taken as a transient state connecting line; No. 7 is judged not to meet the second group of predetermined conditions; and the transient state connecting line 3-7 is stored in the set as a new stable state connecting line. For the No. 4 heading character connected domain, which is the end heading character connected domain of stable state connecting line 0-1-4: the connecting line 4-5 connecting No. 4 to its neighbor No. 5 and the connecting line 4-6 connecting No. 4 to its neighbor No. 6 are taken as transient state connecting lines; No. 5 is judged not to meet the second group of predetermined conditions, while No. 6 meets it; the transient state connecting line 4-5 is stored in the set as a new stable state connecting line, and the stable state connecting line 0-1-4 is updated to 0-1-4-6. After these steps, the set therefore holds the following stable state connecting lines: 1-2, 1-3, 2-5, 3-7, 4-5 and 0-1-4-6.
In this manner, the stable state connecting line set determination unit 2411 can repeat the operations described above for the end heading character connected domain of each stable state connecting line in the set, until no end heading character connected domain of any stable state connecting line in the set has a neighbor heading character connected domain. Finally, the stable state connecting line set for the No. 0 heading character connected domain, as the starting heading character connected domain, is determined to be: 1-2, 1-3, 2-5, 3-7, 4-5, 0-1-4-6-8-9, 5-6 and 6-7.
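The growth of the stable state connecting line set in the Fig. 14 walkthrough can be reproduced with the sketch below. It rests on assumptions that the text leaves open: the neighbor relation is given as a forward-only map, each end domain is expanded at most once, and at most one neighbor extends a given line per step. The function name `grow_stable_lines`, the `neighbors` map and the `meets` predicate are hypothetical stand-ins; in particular, `meets` is hard-coded from the outcomes narrated for Fig. 14 rather than computed from geometry.

```python
from collections import deque

def grow_stable_lines(start, neighbors, meets):
    """Build the stable line set for one starting domain.

    neighbors: dict id -> list of neighbour ids (assumed forward-only).
    meets(line, nb): stand-in for the second group of predetermined conditions.
    """
    # Initialisation: one stable line per neighbour of the starting domain.
    lines = [[start, nb] for nb in neighbors.get(start, [])]
    expanded = {start}                 # assumption: expand each end once
    queue = deque(range(len(lines)))   # indices of lines still to examine
    while queue:
        idx = queue.popleft()
        end = lines[idx][-1]
        if end in expanded:
            continue
        expanded.add(end)
        base = list(lines[idx])        # the line before this step
        extended = False
        for nb in neighbors.get(end, []):
            if not extended and meets(base, nb):
                lines[idx] = base + [nb]       # join transient onto stable
                extended = True
                queue.append(idx)
            elif [end, nb] not in lines:
                lines.append([end, nb])        # transient becomes new stable
                queue.append(len(lines) - 1)
    return lines

# Neighbour map and accepted links read off the Fig. 14 description.
neighbors = {0: [1], 1: [2, 3, 4], 2: [5], 3: [7], 4: [5, 6],
             5: [6], 6: [7, 8], 8: [9]}
accepted = {(1, 4), (4, 6), (6, 8), (8, 9)}
result = grow_stable_lines(0, neighbors, lambda line, nb: (line[-1], nb) in accepted)
```

Under these assumptions the sketch yields the same eight lines listed in the text. Without the expand-once rule, line 5-6 would be examined again at domain 6 and could produce additional lines that the walkthrough does not show, so some such rule appears to be implied.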
Next, the output unit 2412 can choose from the stable state connecting line set the stable state connecting line containing the most heading character connected domains, 0-1-4-6-8-9, as the title connecting line found from the No. 0 heading character connected domain.
According to an embodiment of the invention, the finding unit 241 can traverse the heading character connected domains that do not belong to any title connecting line already found, taking each as a starting heading character connected domain from which to find a title connecting line. For example, after finding title connecting line 0-1-4-6-8-9, the finding unit 241 can take the No. 2 heading character connected domain as the next starting heading character connected domain and look for a title connecting line from it; the determination unit 242 can thereby determine the multiple title connecting lines of the newspaper image. Next, the title area acquiring unit 250 obtains the multiple title areas of the newspaper image by combining the heading character connected domains on each title connecting line. The title areas of the newspaper image can thus be obtained automatically, quickly and reliably.
Fig. 15 is a schematic diagram of a newspaper image after title areas have been obtained according to an embodiment of the invention. As shown in Fig. 15, two title areas of the newspaper image are marked with black boxes.
According to an embodiment of the invention, the title area acquiring unit 250 obtains the multiple title areas of the newspaper image by combining the heading character connected domains on each title connecting line. Under normal circumstances, the title areas obtained in this way are accurate. However, because of punctuation marks and other small connected domains, an extracted title may be incomplete. As shown in Fig. 15, "Acheng six" should in fact also be part of the title area. When the connected domains of the newspaper image were extracted, however, the character "six" was split into three small connected domains, and merging overlapping and adjacent connected domains did not yield a character connected domain that properly encloses "six". As a result, when the connecting line determination unit 240 determined the connecting lines, the three character connected domains of "Acheng six" were not connected to the title connecting line below them.
To solve the above technical problem, the invention proposes an image processing apparatus according to another embodiment. Fig. 16 is a block diagram of the image processing apparatus 200 according to another embodiment of the invention.
As shown in Fig. 16, the image processing apparatus 200 can include a connected domain acquiring unit 210, a character connected domain determination unit 220, a heading character determination unit 230, a connecting line determination unit 240, a connecting line updating unit 260 and a title area acquiring unit 250. The connected domain acquiring unit 210, character connected domain determination unit 220, heading character determination unit 230, connecting line determination unit 240 and title area acquiring unit 250 shown here can be the corresponding units described previously and are not described again. The connecting line updating unit 260 is described below.
According to an embodiment of the invention, when a remaining heading character connected domain meets a third group of predetermined conditions, the connecting line updating unit 260 can connect that remaining heading character connected domain to the head-end heading character connected domain or the end heading character connected domain of a title connecting line, thereby updating that title connecting line. Here, a remaining heading character connected domain is a heading character connected domain that does not belong to any of the one or more title connecting lines.
According to an embodiment of the invention, the connecting line updating unit 260 can obtain all title connecting lines of the newspaper image from the connecting line determination unit 240 and all heading character connected domains of the newspaper image from the heading character determination unit 230, and thereby identify the heading character connected domains that belong to no title connecting line as the remaining heading character connected domains. According to an embodiment of the invention, the connecting line updating unit 260 can determine all remaining heading character connected domains of the newspaper image.
Next, the connecting line updating unit 260 can judge whether each of the remaining heading character connected domains of the newspaper image meets the third group of predetermined conditions. Specifically, it can judge whether a given remaining heading character connected domain meets the third group of predetermined conditions with respect to a given title connecting line.
According to an embodiment of the invention, the third group of predetermined conditions can include constraints on the size and stroke width of the remaining heading character connected domain, on the slope of the connecting line joining it to a heading character connected domain on the title connecting line, on its distance to the title connecting line, and on the smallest distance between it and the heading character connected domains on the title connecting line.
As a specific example, the third group of predetermined conditions can include:
a. the difference between the length of the remaining heading character connected domain and the length of the head end or end heading character connected domain of the title connecting line (or the median length of all heading character connected domains on the title connecting line) is less than an eighth threshold;
b. the difference between the width of the remaining heading character connected domain and the width of the head end or end heading character connected domain of the title connecting line (or the median width of all heading character connected domains on the title connecting line) is less than a ninth threshold;
c. the difference between the slope of the connecting line joining the remaining heading character connected domain to any heading character connected domain on the title connecting line and the slope of the title connecting line is less than a tenth threshold;
d. the distance from the center of the remaining heading character connected domain to the title connecting line is less than an eleventh threshold;
e. the minimum distance between the remaining heading character connected domain and the heading character connected domains on the title connecting line is less than a twelfth threshold; and
f. the difference between the stroke width of the remaining heading character connected domain and the stroke width of the head end or end heading character connected domain of the title connecting line (or the median stroke width of all heading character connected domains on the title connecting line) is less than a thirteenth threshold.
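Conditions a to f can be sketched as a single predicate. The sketch below rests on several assumptions that are not part of the embodiment: the `Comp` record and all function names are hypothetical, the median variant is used as the reference for conditions a, b and f, the title connecting line is approximated by the straight line through the centers of its head end and end components, the slope comparison in condition c is done on angles (to avoid division by zero for vertical lines), and the minimum distance in condition e is measured as the gap between bounding boxes.

```python
import math
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Comp:
    cx: float      # center x of the bounding box
    cy: float      # center y of the bounding box
    w: float       # bounding-box width
    h: float       # bounding-box height ("length" in the text)
    stroke: float  # estimated stroke width


def angle(p, q):
    """Direction of the line through two centers, folded into [0, pi)."""
    return math.atan2(q.cy - p.cy, q.cx - p.cx) % math.pi


def angle_diff(a, b):
    d = abs(a - b)
    return min(d, math.pi - d)


def box_gap(p, q):
    """Gap between two bounding boxes (0 if they touch or overlap)."""
    dx = max(0.0, abs(p.cx - q.cx) - (p.w + q.w) / 2)
    dy = max(0.0, abs(p.cy - q.cy) - (p.h + q.h) / 2)
    return math.hypot(dx, dy)


def point_line_distance(r, a, b):
    """Distance from the center of r to the line through the centers of a and b."""
    num = abs((b.cx - a.cx) * (a.cy - r.cy) - (a.cx - r.cx) * (b.cy - a.cy))
    return num / math.hypot(b.cx - a.cx, b.cy - a.cy)


def meets_third_group(rem, line, t):
    """line: components ordered from head end to end (at least two, distinct centers);
    t: dict with keys 't8' .. 't13' holding the eighth to thirteenth thresholds."""
    head, tail = line[0], line[-1]
    nearest = min(line, key=lambda c: box_gap(rem, c))
    return (
        abs(rem.h - median(c.h for c in line)) < t["t8"]                          # a
        and abs(rem.w - median(c.w for c in line)) < t["t9"]                      # b
        and angle_diff(angle(rem, nearest), angle(head, tail)) < t["t10"]         # c
        and point_line_distance(rem, head, tail) < t["t11"]                       # d
        and box_gap(rem, nearest) < t["t12"]                                      # e
        and abs(rem.stroke - median(c.stroke for c in line)) < t["t13"]           # f
    )
```

A component of matching size that continues a horizontal line after a small gap passes all six checks, while the same component displaced far above the line fails condition d.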
According to an embodiment of the invention, each of the above thresholds can be set according to actual needs or experience, or according to some criterion. For condition e of the third group of predetermined conditions, a punctuation mark may lie between the remaining heading character connected domain and the corresponding title connecting line, so condition e should be judged more loosely than when judging neighbour heading character connected domains. That is, the twelfth threshold should be greater than 2 * min(max(p_w, p_h), max(q_w, q_h)), where p_w and p_h denote the width and height of the remaining heading character connected domain, and q_w and q_h denote the width and height of the heading character connected domain on the title connecting line that is nearest to the remaining heading character connected domain.
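The lower bound on the twelfth threshold can be written out directly. In the sketch below the function names and the `slack` factor are assumptions; the text only requires the threshold to exceed the bound and does not say by how much.

```python
def twelfth_threshold_floor(pw, ph, qw, qh):
    """Lower bound 2 * min(max(pw, ph), max(qw, qh)) stated for the twelfth threshold."""
    return 2 * min(max(pw, ph), max(qw, qh))


def twelfth_threshold(pw, ph, qw, qh, slack=1.25):
    """One possible choice strictly above the bound; slack > 1 is a hypothetical tuning factor."""
    if slack <= 1:
        raise ValueError("slack must be greater than 1 so the threshold exceeds the bound")
    return slack * twelfth_threshold_floor(pw, ph, qw, qh)
```

For a remaining component of 10 x 12 pixels and a nearest component of 8 x 9 pixels, the bound is 2 * min(12, 9) = 18 pixels.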
According to an embodiment of the invention, when a remaining heading character connected domain meets the third group of predetermined conditions, the connecting line updating block 260 can connect that remaining heading character connected domain to the heading character connected domain on the title connecting line that is nearest to it, thereby updating the title connecting line.
Under normal circumstances, the heading character connected domain on a title connecting line that is nearest to a remaining heading character connected domain is the head end or end heading character connected domain of that title connecting line. Therefore, when a remaining heading character connected domain meets the third group of predetermined conditions, the connecting line updating block 260 can connect it to the head end or end heading character connected domain of the title connecting line, thereby updating the title connecting line.
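The update can be sketched as attaching the accepted component to whichever end of the line is nearer. The representation of a line as an ordered list of center points and the function name are assumptions for illustration; the check against the third group of predetermined conditions is assumed to have been done beforehand.

```python
import math


def attach_to_nearest_end(line, rem):
    """Return a new line with `rem` attached to the nearer of the two ends.

    line: list of (x, y) centers ordered from head end to end
    rem:  (x, y) center of a remaining component that already passed the checks
    """
    d_head = math.dist(rem, line[0])
    d_tail = math.dist(rem, line[-1])
    # Prepend when the head end is strictly nearer, otherwise append at the end.
    return [rem] + line if d_head < d_tail else line + [rem]
```

Returning a new list rather than mutating the input keeps the original title connecting line available if the update has to be compared or undone.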
According to an embodiment of the invention, after the connecting line updating block 260 has updated the title connecting lines, the Title area acquiring unit 250 can combine the heading character connected domains located on the same updated title connecting line to obtain one or more Title areas of the newspaper image.
According to an embodiment of the invention, the title connecting lines can be post-processed in this way to eliminate the influence of punctuation marks and small connected domains, so that the title connecting lines obtained are more accurate.
Figure 17 is a schematic diagram of a newspaper image after the Title area has been obtained according to another embodiment of the invention. As shown in Figure 17, after post-processing, the Title area obtained for the newspaper image includes "Acheng six", so that the Title area of the newspaper image is obtained more accurately.
The image processing apparatus according to the present invention has been described above. The image processing method according to the present invention is described below.
Figure 18 is a flowchart of an image processing method according to an embodiment of the present invention.
As shown in Figure 18, in step S1810, multiple connected domains of a newspaper image are obtained.
Next, in step S1820, overlapping connected domains and adjacent connected domains among the multiple connected domains are merged to obtain multiple character connected domains.
Next, in step S1830, multiple heading character connected domains are determined from the multiple character connected domains.
Next, in step S1840, one or more title connecting lines are determined according to the multiple heading character connected domains.
Next, in step S1850, the heading character connected domains located on the same title connecting line are combined to obtain one or more Title areas of the newspaper image.
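Step S1850 can be illustrated by taking the union of the bounding rectangles of the components on one title connecting line. Treating a Title area as an axis-aligned bounding rectangle, and the `(x0, y0, x1, y1)` box layout, are assumptions of this sketch; the embodiment does not fix the representation here.

```python
def title_area(boxes):
    """Bounding rectangle enclosing all component boxes on one title connecting line.

    boxes: non-empty list of (x0, y0, x1, y1) rectangles
    """
    x0s, y0s, x1s, y1s = zip(*boxes)
    return (min(x0s), min(y0s), max(x1s), max(y1s))


def title_areas(lines):
    """One Title area per title connecting line; `lines` is a list of box lists."""
    return [title_area(boxes) for boxes in lines]
```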
Preferably, merging the overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains includes: when the bounding rectangles of two connected domains have an overlapping region, determining that the two connected domains are overlapping connected domains; when the distance between the two nearest sides of the bounding rectangles of two connected domains is less than a first threshold, and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the two is less than a second threshold, determining that the two connected domains are adjacent connected domains; and merging the overlapping connected domains, merging the adjacent connected domains, and taking the connected domains obtained after merging as character connected domains.
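The two tests on bounding rectangles can be sketched as follows. The `(x0, y0, x1, y1)` box layout, the function names, and the reading of "distance between the two nearest sides" as the larger of the horizontal and vertical separations are assumptions made for illustration; boxes are assumed to have positive width and height.

```python
def axis_gaps(a, b):
    """Signed separation along x and y; a negative value means the projections overlap."""
    dx = max(a[0], b[0]) - min(a[2], b[2])
    dy = max(a[1], b[1]) - min(a[3], b[3])
    return dx, dy


def overlaps(a, b):
    """True when the two bounding rectangles share a region of positive area."""
    dx, dy = axis_gaps(a, b)
    return dx < 0 and dy < 0


def union(a, b):
    """Bounding rectangle of the merged connected domain."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def adjacent(a, b, t1, t2):
    """Side gap below the first threshold and merged aspect ratio within t2 of 1."""
    if overlaps(a, b):
        return False  # overlapping pairs are handled by the overlap rule
    dx, dy = axis_gaps(a, b)
    gap = max(dx, dy, 0)
    x0, y0, x1, y1 = union(a, b)
    aspect = (x1 - x0) / (y1 - y0)
    return gap < t1 and abs(aspect - 1) < t2
```

The aspect-ratio test reflects the idea that the two halves of a single character, once merged, should form a roughly square box, whereas two neighbouring characters would not.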
Preferably, determining multiple heading character connected domains from the multiple character connected domains includes: determining the character connected domains whose size is greater than a third threshold as heading character connected domains.
Preferably, determining multiple heading character connected domains from the multiple character connected domains further includes: determining the size of every character connected domain among the multiple character connected domains; counting the number of character connected domains as a function of character connected domain size; and determining the third threshold according to the size at which the number of character connected domains is largest.
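One way to read this step is as a histogram of character sizes whose peak corresponds to body text, with the third threshold placed some margin above the peak. The bin width, the multiplying factor, and the function names below are assumptions; the text only says that the threshold is determined according to the most frequent size.

```python
from collections import Counter


def third_threshold(sizes, bin_width=2, factor=1.5):
    """Threshold derived from the most frequent character size.

    sizes:     character sizes in pixels (for example, bounding-box heights)
    bin_width: histogram bin width in pixels (hypothetical)
    factor:    margin above the dominant size (hypothetical)
    """
    bins = Counter(int(s // bin_width) for s in sizes)
    # Pick the fullest bin; ties are broken in favour of the smaller size.
    mode_bin, _ = max(bins.items(), key=lambda kv: (kv[1], -kv[0]))
    body_size = (mode_bin + 0.5) * bin_width  # center of the dominant bin
    return factor * body_size


def title_chars(sizes, threshold):
    """Indices of the character components whose size exceeds the threshold."""
    return [i for i, s in enumerate(sizes) if s > threshold]
```

With sizes [10, 10, 11, 10, 11, 30, 32, 10], the dominant bin covers sizes 10 to 11, giving a body size of 11.0 and a threshold of 16.5, so only the components of size 30 and 32 are kept as heading characters.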
Preferably, determining one or more title connecting lines according to the multiple heading character connected domains includes: traversing the heading character connected domains that do not belong to any title connecting line already found, and using each as a start heading character connected domain from which to find a title connecting line; and determining all title connecting lines found as the one or more title connecting lines.
Preferably, finding a title connecting line includes: repeating the following operations for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the start heading character connected domain, until no end heading character connected domain of any stable state connecting line in the set has a neighbour heading character connected domain: taking the connecting line that joins the end heading character connected domain to a neighbour heading character connected domain of that end heading character connected domain as a transient state connecting line; when the neighbour heading character connected domain meets a predetermined condition, joining the transient state connecting line on which the neighbour lies to the stable state connecting line on which the end heading character connected domain lies, thereby updating that stable state connecting line as stored in the set; and when the neighbour heading character connected domain does not meet the predetermined condition, storing the transient state connecting line on which the neighbour lies in the set as a new stable state connecting line; and determining the stable state connecting line in the set that contains the most heading character connected domains as a title connecting line.
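The search described above can be sketched as a worklist over stable state connecting lines. This is a simplified illustration under stated assumptions, not the embodiment itself: the `neighbours` and `accepts` callables are hypothetical stand-ins for the neighbour search and the predetermined condition, lines are lists of component identifiers, and a `seen` set is added so that each component is placed on at most one line and the loop terminates.

```python
def find_title_line(start, neighbours, accepts):
    """Grow stable state lines from `start`; return the one with the most components.

    neighbours(c)     -> iterable of neighbour component ids of c (hypothetical)
    accepts(line, nb) -> True if nb meets the predetermined condition for `line`
    """
    seen = {start}
    # Initialize the set with one stable line per neighbour of the start component.
    stable = []
    for nb in neighbours(start):
        if nb not in seen:
            seen.add(nb)
            stable.append([start, nb])
    if not stable:
        return [start]

    active = list(range(len(stable)))  # indices of lines whose end may still grow
    while active:
        idx = active.pop()
        tail = stable[idx][-1]
        for nb in neighbours(tail):
            if nb in seen:
                continue
            seen.add(nb)
            # The pair (tail, nb) is the transient state connecting line.
            if stable[idx][-1] == tail and accepts(stable[idx], nb):
                stable[idx].append(nb)          # join transient line to stable line
                active.append(idx)
            else:
                stable.append([tail, nb])       # keep it as a new stable line
                active.append(len(stable) - 1)
    return max(stable, key=len)
```

In a row of five components with one outlier hanging off the middle component, the outlier ends up on its own two-component stable line and the five-component line is returned as the title connecting line.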
Preferably, finding a title connecting line further includes: initializing the stable state connecting line set of the start heading character connected domain to include the following stable state connecting line: the connecting line joining the start heading character connected domain to a neighbour heading character connected domain of the start heading character connected domain.
Preferably, the predetermined condition includes constraints on the length and width of the neighbour heading character connected domain of the end heading character connected domain, on the slope of the transient state connecting line on which it lies, and on its distance to the stable state connecting line on which the end heading character connected domain lies.
Preferably, the image processing method further includes: when a remaining heading character connected domain meets a predetermined condition, connecting that remaining heading character connected domain to the head end or end heading character connected domain of a title connecting line to update that title connecting line, the remaining heading character connected domain being a heading character connected domain that does not belong to the one or more title connecting lines.
Preferably, the predetermined condition includes constraints on the size and stroke width of the remaining heading character connected domain, on the slope of the connecting line between it and a heading character connected domain on the title connecting line, on its distance to the title connecting line, and on the minimum distance between it and the heading character connected domains on the title connecting line.
The image processing method described above can be implemented by the image processing apparatus 200 according to an embodiment of the present invention. The various embodiments of the image processing apparatus 200 described above therefore apply here as well and are not repeated.
It can be seen from the above that, with the image processing apparatus and image processing method according to the present invention, the character connected domains of a newspaper image can be obtained by merging overlapping connected domains and adjacent connected domains, heading character connected domains can be determined from the character connected domains, and one or more title connecting lines can be determined according to the heading character connected domains, so that the heading character connected domains located on the same title connecting line can be combined to obtain the Title areas of the newspaper image. In this way, the Title areas of a newspaper image can be extracted automatically, quickly and reliably, which saves considerable manpower and material resources and improves the labeling efficiency for digital newspaper titles.
Obviously, each operation of the image processing method according to the present invention can be implemented as a computer-executable program stored in any of various machine-readable storage media.
Moreover, the object of the present invention can also be achieved as follows: a storage medium storing the above executable program code is supplied, directly or indirectly, to a system or device, and a computer or central processing unit (CPU) in the system or device reads and executes the program code. In this case, as long as the system or device is capable of executing programs, the embodiments of the present invention are not limited to a particular kind of program, and the program can take any form, for example an object program, a program executed by an interpreter, or a script supplied to an operating system.
The above machine-readable storage media include, but are not limited to: various memories and storage units, semiconductor devices, disk units such as optical, magnetic and magneto-optical disks, and other media suitable for storing information.
In addition, the technical solution of the present invention can also be realized by a computer that connects to a corresponding website on the Internet, downloads and installs the computer program code according to the present invention, and then executes the program.
Figure 19 is a block diagram of an exemplary structure of a general-purpose personal computer in which the image processing method according to the present invention can be implemented.
As shown in Figure 19, a CPU 1901 performs various processing according to a program stored in a read-only memory (ROM) 1902 or a program loaded from a storage section 1908 into a random access memory (RAM) 1903. The RAM 1903 also stores, as needed, the data required when the CPU 1901 performs the various processing. The CPU 1901, the ROM 1902 and the RAM 1903 are connected to one another via a bus 1904. An input/output interface 1905 is also connected to the bus 1904.
The following components are connected to the input/output interface 1905: an input section 1906 (including a keyboard, a mouse and the like), an output section 1907 (including a display such as a cathode-ray tube (CRT) or liquid crystal display (LCD), a loudspeaker and the like), the storage section 1908 (including a hard disk and the like), and a communication section 1909 (including a network interface card such as a LAN card, a modem and the like). The communication section 1909 performs communication processing via a network such as the Internet. A drive 1910 can also be connected to the input/output interface 1905 as needed. A removable medium 1911 such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory is mounted on the drive 1910 as needed, so that a computer program read from it is installed into the storage section 1908 as needed.
When the above series of processing is implemented in software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1911.
Those skilled in the art will understand that this storage medium is not limited to the removable medium 1911 shown in Figure 19, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 1911 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disk (including a MiniDisc (MD) (registered trademark)) and a semiconductor memory. Alternatively, the storage medium can be the ROM 1902, a hard disk included in the storage section 1908, or the like, which stores the program and is distributed to the user together with the device containing it.
In the system and method of the present invention, each unit or each step can obviously be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalents of the present invention. Also, the steps of the above series of processing can naturally be performed chronologically in the order described, but they need not be performed in that order. Some steps can be performed in parallel or independently of one another.
Although embodiments of the present invention have been described in detail above with reference to the accompanying drawings, it should be understood that the embodiments described above are only intended to illustrate the present invention and do not limit it. Those skilled in the art can make various changes and modifications to the above embodiments without departing from the spirit and scope of the invention. The scope of the present invention is therefore limited only by the appended claims and their equivalents.
Regarding the embodiments including the above examples, the following notes are also disclosed:
Note 1. An image processing apparatus, including:
a connected domain acquiring unit, for obtaining multiple connected domains of a newspaper image;
a character connected domain determination unit, for merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains;
a heading character determination unit, for determining multiple heading character connected domains from the multiple character connected domains;
a connecting line determination unit, for determining one or more title connecting lines according to the multiple heading character connected domains; and
a Title area acquiring unit, for combining the heading character connected domains located on the same title connecting line to obtain one or more Title areas of the newspaper image.
Note 2. The image processing apparatus according to note 1, wherein the character connected domain determination unit includes:
an overlapping connected domain determination unit, for determining that two connected domains are overlapping connected domains when their bounding rectangles have an overlapping region;
an adjacent connected domain determination unit, for determining that two connected domains are adjacent connected domains when the distance between the two nearest sides of their bounding rectangles is less than a first threshold and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the two is less than a second threshold; and
a combining unit, for merging the overlapping connected domains, merging the adjacent connected domains, and taking the connected domains obtained after merging as character connected domains.
Note 3. The image processing apparatus according to note 1, wherein the heading character determination unit includes:
a comparing unit, for determining the character connected domains whose size is greater than a third threshold among the multiple character connected domains as heading character connected domains.
Note 4. The image processing apparatus according to note 3, wherein the heading character determination unit further includes:
a size determination unit, for determining the size of every character connected domain among the multiple character connected domains;
a statistic unit, for counting the number of character connected domains as a function of character connected domain size; and
a threshold determination unit, for determining the third threshold according to the size at which the number of character connected domains is largest.
Note 5. The image processing apparatus according to note 1, wherein the connecting line determination unit includes:
a finding unit, for traversing the heading character connected domains that do not belong to any title connecting line already found, and using each as a start heading character connected domain from which to find a title connecting line; and
a determination unit, for determining all title connecting lines found as the one or more title connecting lines.
Note 6. The image processing apparatus according to note 5, wherein the finding unit includes:
a stable state connecting line set determination unit, for repeating the following operations for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the start heading character connected domain, until no end heading character connected domain of any stable state connecting line in the set has a neighbour heading character connected domain:
taking the connecting line that joins the end heading character connected domain to a neighbour heading character connected domain of the end heading character connected domain as a transient state connecting line;
when the neighbour heading character connected domain of the end heading character connected domain meets a predetermined condition, joining the transient state connecting line on which the neighbour heading character connected domain lies to the stable state connecting line on which the end heading character connected domain lies, thereby updating that stable state connecting line as stored in the stable state connecting line set; and
when the neighbour heading character connected domain of the end heading character connected domain does not meet the predetermined condition, storing the transient state connecting line on which the neighbour heading character connected domain lies in the stable state connecting line set as a new stable state connecting line; and
an output unit, for determining the stable state connecting line in the stable state connecting line set that contains the most heading character connected domains as a title connecting line.
Note 7. The image processing apparatus according to note 6, wherein the stable state connecting line set determination unit further includes:
an initialization unit, for initializing the stable state connecting line set of the start heading character connected domain to include the following stable state connecting line: the connecting line joining the start heading character connected domain to a neighbour heading character connected domain of the start heading character connected domain.
Note 8. The image processing apparatus according to note 6, wherein the predetermined condition includes constraints on the length and width of the neighbour heading character connected domain of the end heading character connected domain, on the slope of the transient state connecting line on which it lies, and on its distance to the stable state connecting line on which the end heading character connected domain lies.
Note 9. The image processing apparatus according to note 1, wherein the image processing apparatus further includes:
a connecting line updating block, for connecting, when a remaining heading character connected domain meets a predetermined condition, that remaining heading character connected domain to the head end or end heading character connected domain of a title connecting line to update that title connecting line, the remaining heading character connected domain being a heading character connected domain that does not belong to the one or more title connecting lines.
Note 10. The image processing apparatus according to note 9, wherein the predetermined condition includes constraints on the size and stroke width of the remaining heading character connected domain, on the slope of the connecting line between it and a heading character connected domain on the title connecting line, on its distance to the title connecting line, and on the minimum distance between it and the heading character connected domains on the title connecting line.
Note 11. An image processing method, including:
obtaining multiple connected domains of a newspaper image;
merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains;
determining multiple heading character connected domains from the multiple character connected domains;
determining one or more title connecting lines according to the multiple heading character connected domains; and
combining the heading character connected domains located on the same title connecting line to obtain one or more Title areas of the newspaper image.
Note 12. The image processing method according to note 11, wherein merging the overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains includes:
when the bounding rectangles of two connected domains have an overlapping region, determining that the two connected domains are overlapping connected domains;
when the distance between the two nearest sides of the bounding rectangles of two connected domains is less than a first threshold, and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the two is less than a second threshold, determining that the two connected domains are adjacent connected domains; and
merging the overlapping connected domains, merging the adjacent connected domains, and taking the connected domains obtained after merging as character connected domains.
Note 13. The image processing method according to note 11, wherein determining multiple heading character connected domains from the multiple character connected domains includes:
determining the character connected domains whose size is greater than a third threshold among the multiple character connected domains as heading character connected domains.
Note 14. The image processing method according to note 13, wherein determining multiple heading character connected domains from the multiple character connected domains further includes:
determining the size of every character connected domain among the multiple character connected domains;
counting the number of character connected domains as a function of character connected domain size; and
determining the third threshold according to the size at which the number of character connected domains is largest.
Note 15. The image processing method according to note 11, wherein determining one or more title connecting lines according to the multiple heading character connected domains includes:
traversing the heading character connected domains that do not belong to any title connecting line already found, and using each as a start heading character connected domain from which to find a title connecting line; and
determining all title connecting lines found as the one or more title connecting lines.
Note 16. The image processing method according to note 15, wherein finding the title connecting line includes:
repeating the following operations for the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the start heading character connected domain, until no end heading character connected domain of any stable state connecting line in the set has a neighbour heading character connected domain:
taking the connecting line that joins the end heading character connected domain to a neighbour heading character connected domain of the end heading character connected domain as a transient state connecting line;
when the neighbour heading character connected domain of the end heading character connected domain meets a predetermined condition, joining the transient state connecting line on which the neighbour heading character connected domain lies to the stable state connecting line on which the end heading character connected domain lies, thereby updating that stable state connecting line as stored in the stable state connecting line set; and
when the neighbour heading character connected domain of the end heading character connected domain does not meet the predetermined condition, storing the transient state connecting line on which the neighbour heading character connected domain lies in the stable state connecting line set as a new stable state connecting line; and
determining the stable state connecting line in the stable state connecting line set that contains the most heading character connected domains as a title connecting line.
Note 17. The image processing method according to note 16, wherein finding the title connecting line further includes:
initializing the stable state connecting line set of the start heading character connected domain to include the following stable state connecting line: the connecting line joining the start heading character connected domain to a neighbour heading character connected domain of the start heading character connected domain.
Note 18. The image processing method according to note 16, wherein the predetermined condition includes constraints on the length and width of the neighbour heading character connected domain of the end heading character connected domain, on the slope of the transient state connecting line on which it lies, and on its distance to the stable state connecting line on which the end heading character connected domain lies.
Note 19. The image processing method according to note 11, further including:
when a remaining heading character connected domain meets a predetermined condition, connecting that remaining heading character connected domain to the head end or end heading character connected domain of a title connecting line to update that title connecting line, the remaining heading character connected domain being a heading character connected domain that does not belong to the one or more title connecting lines.
Note 20. A machine-readable storage medium carrying a program product that includes machine-readable instruction code stored therein, wherein the instruction code, when read and executed by a computer, causes the computer to perform the image processing method according to any one of notes 11 to 19.
Claims (10)
1. An image processing apparatus, including:
a connected domain acquiring unit, for obtaining multiple connected domains of a newspaper image;
a character connected domain determination unit, for merging overlapping connected domains and adjacent connected domains among the multiple connected domains to obtain multiple character connected domains;
a heading character determination unit, for determining multiple heading character connected domains from the multiple character connected domains;
a connecting line determination unit, for determining one or more title connecting lines according to the multiple heading character connected domains; and
a Title area acquiring unit, for combining the heading character connected domains located on the same title connecting line to obtain one or more Title areas of the newspaper image.
2. The image processing apparatus according to claim 1, wherein the character connected domain determination unit comprises:
an overlapping connected domain determination unit for determining that two connected domains are overlapping connected domains when the bounding rectangles of the two connected domains have an overlapping region;
an adjacent connected domain determination unit for determining that two connected domains are adjacent connected domains when the distance between the two closest sides of the bounding rectangles of the two connected domains is less than a first threshold, and the difference between 1 and the aspect ratio of the bounding rectangle of the connected domain obtained by merging the two connected domains is less than a second threshold; and
a merging unit for merging the overlapping connected domains, merging the adjacent connected domains, and taking the connected domains obtained after merging as the character connected domains.
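The two tests in claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Box` type, the function names, the way the gap between closest sides is measured, and the single merge loop that handles overlapping and adjacent pairs together are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Bounding rectangle of a connected domain (hypothetical representation)."""
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: Box, b: Box) -> bool:
    # The rectangles share a region of non-zero area.
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def union(a: Box, b: Box) -> Box:
    # Bounding rectangle of the merged connected domain.
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def side_gap(a: Box, b: Box) -> float:
    # Assumed reading of "distance between the two closest sides":
    # the larger of the horizontal and vertical gaps, 0 if there is none.
    dx = max(a.left - b.right, b.left - a.right, 0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0)
    return max(dx, dy)


def is_adjacent(a: Box, b: Box, first_threshold: float,
                second_threshold: float) -> bool:
    if side_gap(a, b) >= first_threshold:
        return False
    m = union(a, b)
    height = m.bottom - m.top
    if height <= 0:
        return False
    aspect = (m.right - m.left) / height
    # The merged rectangle must be close to square, like a single character.
    return abs(aspect - 1.0) < second_threshold


def merge_components(boxes: List[Box], first_threshold: float,
                     second_threshold: float) -> List[Box]:
    # Simplification: merge any overlapping or adjacent pair, then restart,
    # until no pair qualifies.
    result = list(boxes)
    merged_any = True
    while merged_any:
        merged_any = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                a, b = result[i], result[j]
                if overlaps(a, b) or is_adjacent(a, b, first_threshold,
                                                 second_threshold):
                    result[i] = union(a, b)
                    del result[j]
                    merged_any = True
                    break
            if merged_any:
                break
    return result
```

With a first threshold of 3 and a second threshold of 0.2 (arbitrary illustrative values), two narrow halves of one square character separated by a one-pixel gap are merged, while a distant character is left alone.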
3. The image processing apparatus according to claim 1, wherein the heading character determination unit comprises:
a comparing unit for determining, as heading character connected domains, those character connected domains among the multiple character connected domains whose size is greater than a third threshold.
4. The image processing apparatus according to claim 3, wherein the heading character determination unit further comprises:
a size determination unit for determining the size of every character connected domain among the multiple character connected domains;
a statistic unit for counting the number of character connected domains with the size of the character connected domain as the variable; and
a threshold determination unit for determining the third threshold according to the size shared by the largest number of character connected domains.
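Claims 3 and 4 amount to taking the most frequent character size, which on a newspaper page is presumably the body-text size, and deriving the third threshold from it. The sketch below assumes sizes are integers (for example heights in pixels) and that the threshold is the most frequent size times a factor `scale`; the claims do not state how the threshold is derived from that size, so `scale` and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def heading_threshold(sizes: List[int], scale: float = 1.5) -> float:
    # Histogram of character sizes: size -> number of character connected domains.
    counts = Counter(sizes)
    # The size shared by the largest number of character connected domains.
    most_frequent_size, _ = counts.most_common(1)[0]
    # Assumed rule: the third threshold is a multiple of that size.
    return most_frequent_size * scale


def heading_sizes(sizes: List[int], scale: float = 1.5) -> List[int]:
    threshold = heading_threshold(sizes, scale)
    # Characters larger than the third threshold are treated as heading characters.
    return [s for s in sizes if s > threshold]
```

For a page with eight characters of size 10, three of size 11 and two large characters of sizes 30 and 32, the threshold under these assumptions is 15.0 and only the two large characters are kept.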
5. The image processing apparatus according to claim 1, wherein the connecting line determination unit comprises:
a finding unit for traversing the heading character connected domains that do not belong to any title connecting line already found and, taking each of them as a beginning heading character connected domain, finding a title connecting line; and
a determination unit for determining all the title connecting lines found as the one or more title connecting lines.
6. The image processing apparatus according to claim 5, wherein the finding unit comprises:
a stable state connecting line set determination unit for repeating the step of performing the following operations on the end heading character connected domain of each stable state connecting line in the stable state connecting line set of the beginning heading character connected domain, until no end heading character connected domain of any stable state connecting line in the stable state connecting line set has a neighbour heading character connected domain:
taking the connecting line that connects the end heading character connected domain with a neighbour heading character connected domain of the end heading character connected domain as a transient state connecting line;
when the neighbour heading character connected domain of the end heading character connected domain meets a predetermined condition, connecting the transient state connecting line on which the neighbour heading character connected domain lies to the stable state connecting line on which the end heading character connected domain lies, so as to update that stable state connecting line as stored in the stable state connecting line set; and
when the neighbour heading character connected domain of the end heading character connected domain does not meet the predetermined condition, storing the transient state connecting line on which the neighbour heading character connected domain lies in the stable state connecting line set as a new stable state connecting line; and
an output unit for determining the stable state connecting line in the stable state connecting line set that contains the most heading character connected domains as one title connecting line.
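One possible reading of the search in claim 6 is sketched below in Python. Several things are assumptions rather than statements of the claims: connected domains are represented by hashable identifiers, `neighbours` maps each identifier to its neighbour identifiers and contains no cycles (for example only right-hand or lower neighbours are listed), each line end is expanded once, and `meets_condition` stands in for the predetermined condition. The initial set follows the initialization described in claim 7.

```python
from typing import Callable, Dict, Hashable, List

Line = List[Hashable]


def find_title_line(start: Hashable,
                    neighbours: Dict[Hashable, List[Hashable]],
                    meets_condition: Callable[[Line, Hashable], bool]) -> Line:
    # Initial stable state set: one line from the start to each of its neighbours.
    stable: List[Line] = [[start, n] for n in neighbours.get(start, [])]
    if not stable:
        return [start]  # isolated heading character
    finished: List[Line] = []
    while stable:
        line = stable.pop()
        end = line[-1]
        extended = False
        for n in neighbours.get(end, []):
            # The segment (end, n) plays the role of the transient state line.
            if meets_condition(line, n):
                # Join the transient line to the stable line it continues.
                stable.append(line + [n])
                extended = True
            else:
                # Keep the transient line as a new stable line of its own.
                stable.append([end, n])
        if not extended:
            finished.append(line)
    # The line containing the most heading characters becomes the title line.
    return max(finished, key=len)
```

Passing the condition as a callable keeps the search independent of the concrete constraints (length, width, slope, distance), which a caller would supply.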
7. The image processing apparatus according to claim 6, wherein the stable state connecting line set determination unit further comprises:
an initialization unit for initializing the stable state connecting line set of the beginning heading character connected domain to include the following stable state connecting lines: the connecting lines that connect the beginning heading character connected domain with each neighbour heading character connected domain of the beginning heading character connected domain.
8. The image processing apparatus according to claim 6, wherein the predetermined condition comprises constraints on the length and the width of the neighbour heading character connected domain of the end heading character connected domain, on the slope of the transient state connecting line on which that neighbour lies, and on the distance from that neighbour to the stable state connecting line on which the end heading character connected domain lies.
9. The image processing apparatus according to claim 1, further comprising:
a connecting line updating unit for, when a remaining heading character connected domain meets a predetermined condition, connecting the remaining heading character connected domain that meets the predetermined condition to the head-end heading character connected domain or the end heading character connected domain of one title connecting line so as to update that title connecting line, a remaining heading character connected domain being a heading character connected domain that does not belong to the one or more title connecting lines.
10. An image processing method, comprising:
obtaining multiple connected domains of a newspaper image;
merging the overlapping connected domains and the adjacent connected domains among the multiple connected domains to obtain multiple character connected domains;
determining multiple heading character connected domains from the multiple character connected domains;
determining one or more title connecting lines according to the multiple heading character connected domains; and
combining the heading character connected domains located on the same title connecting line to obtain one or more title areas of the newspaper image.
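The final combining step can be pictured as taking the bounding rectangle that encloses every heading character on one title connecting line. The Python sketch below assumes each heading character is given as a `(left, top, right, bottom)` tuple and that the title area is the union rectangle; the claims do not fix the shape of the title area, and the function name is hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def title_area(boxes: List[Rect]) -> Rect:
    # Split the rectangles into four coordinate sequences.
    lefts, tops, rights, bottoms = zip(*boxes)
    # Smallest rectangle that encloses all characters on the same title line.
    return (min(lefts), min(tops), max(rights), max(bottoms))
```

Calling `title_area` once per title connecting line yields one rectangle per title.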
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610921297.4A CN107977593A (en) | 2016-10-21 | 2016-10-21 | Image processing apparatus and image processing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107977593A (en) | 2018-05-01 |
Family
ID=62003866
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610921297.4A Pending CN107977593A (en) | 2016-10-21 | 2016-10-21 | Image processing apparatus and image processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107977593A (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090316219A1 (en) * | 2008-06-18 | 2009-12-24 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method and computer-readable storage medium |
CN102855264A (en) * | 2011-07-01 | 2013-01-02 | 富士通株式会社 | Method and device for document processing |
CN103839060A (en) * | 2012-11-26 | 2014-06-04 | 阿里巴巴集团控股有限公司 | Single-word region combination method and device |
CN103034842A (en) * | 2012-12-05 | 2013-04-10 | 上海合合信息科技发展有限公司 | Professional notebook computer facilitating electronization and electronic thumbnail photo display method thereof |
CN103093228A (en) * | 2013-01-17 | 2013-05-08 | 上海交通大学 | Chinese detection method in natural scene image based on connected domain |
US20160086026A1 (en) * | 2014-09-23 | 2016-03-24 | Konica Minolta Laboratory U.S.A., Inc. | Removal of graphics from document images using heuristic text analysis and text recovery |
WO2016069005A1 (en) * | 2014-10-31 | 2016-05-06 | Hewlett-Packard Development Company, L.P. | Text line detection |
CN105844207A (en) * | 2015-01-15 | 2016-08-10 | 富士通株式会社 | Text line extraction method and text line extraction equipment |
CN104573685A (en) * | 2015-01-29 | 2015-04-29 | 中南大学 | Natural scene text detecting method based on extraction of linear structures |
CN105844275A (en) * | 2016-03-25 | 2016-08-10 | 北京云江科技有限公司 | Method for positioning text lines in text image |
Non-Patent Citations (2)
Title |
---|
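WEIJUAN WEN et al.: "An Efficient Method for Text Location and Segmentation", 2009 WRI WORLD CONGRESS ON SOFTWARE ENGINEERING * |
张文杰 (ZHANG Wenjie): "Newspaper layout analysis and recognition based on mobile terminals", China Master's Theses Full-text Database, Information Science and Technology series * |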
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558876A (en) * | 2018-11-20 | 2019-04-02 | 浙江口碑网络技术有限公司 | Character recognition processing method and device |
CN109948413A (en) * | 2018-12-29 | 2019-06-28 | 禾多科技(北京)有限公司 | Method for detecting lane lines based on the fusion of high-precision map |
CN109948413B (en) * | 2018-12-29 | 2021-06-04 | 禾多科技(北京)有限公司 | Lane line detection method based on high-precision map fusion |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180501 |