CN110097059A - Document image binarization method, system, and device based on a generative adversarial network - Google Patents
- Publication number
- CN110097059A (application CN201910222323.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- picture
- image block
- binary
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
Abstract
The invention belongs to the field of image processing and relates in particular to a document image binarization method, system, and device based on a generative adversarial network. It aims to solve the problem that existing binarization methods have unstable accuracy and poor robustness when the quality of document images is uneven. The method comprises: cutting the original document image into image blocks; binarizing, with a first convolutional neural network, both the cut image blocks and the original document image normalized to the block size; stitching the binarized blocks into one map and scaling the binarized normalized image back to the size of the original document image, then merging both with the grayscale image of the original document image; cutting the merged image into blocks and binarizing them with a second convolutional neural network; and merging the resulting binarized blocks to obtain the final binarized map. For photographed document images of multiple document types, the invention obtains binarized images of higher accuracy, with high stability and strong robustness.
Description
Technical field
The invention belongs to the field of image processing and relates in particular to a document image binarization method, system, and device based on a generative adversarial network.
Background
In recent years, with the rapid development of network technology, humanity has entered the information age. Traditional information carriers such as books, newspapers, and periodicals are inconvenient to carry, need a large amount of storage space, and are hard to edit, organize, and distribute. People increasingly prefer to store information on electronic media such as disks, so entering the text of paper materials into a computer quickly is of great importance, and OCR (Optical Character Recognition) technology arose to meet this need. OCR enables high-speed, automatic input of text information, saves a great deal of human labor, and is now widely used.
The success of OCR depends on the preprocessing of the text image. Good binarization of the image can greatly improve OCR recognition accuracy, so binarization has considerable research value. In practical applications, the quality of text images varies widely and may suffer from unclear printing, noise, and other interference. Existing binarization methods have unstable accuracy and poor robustness when the quality of document images is uneven.
Summary of the invention
To solve the above problem in the prior art, namely that existing binarization methods have unstable accuracy and poor robustness when the quality of document images is uneven, a first aspect of the invention provides a document image binarization method based on a generative adversarial network, the method comprising:
Step S10: obtain, from the input original document image and according to a set step size, multiple image blocks of a preset first size, as a first image block set;
Step S20: for the first image block set, obtain the binarized map of each image block through a first convolutional neural network, yielding a second image block set; normalize the original document image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map;
Step S30: stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image;
Step S40: cut the three-channel image by the method of step S10 to obtain a third image block set, and obtain the binary map of each image block through a second convolutional neural network, as a fourth image block set;
Step S50: stitch the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
In some preferred embodiments, the discriminator of the adversarial network is a patch-based fully convolutional neural network;
the first convolutional neural network and the second convolutional neural network are two semantic segmentation networks of identical structure; the first convolutional neural network generates a binarized image from the contextual information of local regions; the second convolutional neural network corrects the output of the first convolutional neural network according to the difference in contextual information between text and background.
In some preferred embodiments, the loss function $L_{loss}$ used when training the adversarial network combines the following two terms, weighted by $\gamma$:
$$L_{cGAN}(G, D) = E_{x,y}[\log D(x, y)] + E_{x}[\log(1 - D(x, G(x, z)))]$$
$$L_{L1}(G) = E_{x,y}[\| y - G(x, z) \|_1]$$
where $G$ and $D$ denote the generator and the discriminator of the adversarial network; $L_{cGAN}(G, D)$ is the adversarial loss of generator and discriminator training; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the ground-truth binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result image that the generator produces from the input image $x$ and the random noise $z$; $y$ is the ground-truth binary image; $\gamma$ is the weight coefficient balancing the two losses; and $D(x, y)$ is the discriminator output for the input image paired with the ground-truth binarized sample.
In some preferred embodiments, the first convolutional neural network and the second convolutional neural network each comprise five convolutional layers and five deconvolution layers.
In some preferred embodiments, for each image block in the first image block set, the second-size region at the center of the block does not overlap with the other image blocks in the first image block set.
In some preferred embodiments, the first size is A*A and the second size is B*B;
given the top-left point [a, b] of an image block, the top-left points of its four adjacent image blocks are determined as follows:
the top-left point of the left adjacent image block is [a-A+(B/2), b];
the top-left point of the right adjacent image block is [a+A-(B/2), b];
the top-left point of the upper adjacent image block is [a, b-A+(B/2)];
the top-left point of the lower adjacent image block is [a, b+A-(B/2)].
In some preferred embodiments, the first size is 256*256 and the second size is 128*128.
A second aspect of the invention provides a document image binarization system based on a generative adversarial network. The system comprises a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binarized map acquisition module;
the cutting module is configured to obtain, from the input text image and according to a set step size, multiple image blocks of a preset first size, and to build an image block set;
the first convolutional neural network processing module is configured to take the first image block set that the cutting module obtains from the original document image and obtain the binarized map of each image block through the first convolutional neural network, yielding a second image block set; and to normalize the original document image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map;
the three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; and merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image;
the second convolutional neural network processing module is configured to take the third image block set that the cutting module obtains from the three-channel image and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set;
the final binarized map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
A third aspect of the invention provides a storage device storing a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement the above document image binarization method based on a generative adversarial network.
A fourth aspect of the invention provides a processing device comprising a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; the programs are adapted to be loaded and executed by the processor to implement the above document image binarization method based on a generative adversarial network.
Beneficial effects of the invention:
For photographed document images of multiple document types, the invention obtains binarized images of higher accuracy, with high stability and strong robustness. In addition, by using two convolutional neural networks, the invention adapts well to extracting text from document images and can overcome interference from non-text noise.
Brief description of the drawings
Other features, objects, and advantages of the application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the drawings:
Fig. 1 is a flow diagram of the document image binarization method based on a generative adversarial network according to an embodiment of the invention;
Fig. 2 is a schematic diagram of cutting the original document image in an embodiment of the invention;
Fig. 3 is a schematic diagram of the generator part of the generative adversarial network structure in an embodiment of the invention;
Fig. 4 is a schematic diagram of the discriminator in the generative adversarial network structure in an embodiment of the invention;
Fig. 5 shows example results obtained through the first convolutional neural network in an embodiment of the invention;
Fig. 6 shows example input images of the second convolutional neural network in an embodiment of the invention;
Fig. 7 shows example final binarized maps of original document images obtained by an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
The application is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain the invention and do not limit it. Note also that, for ease of description, the drawings show only the parts relevant to the invention.
Where no conflict arises, the embodiments of the application and the features in them may be combined with one another.
To describe the invention more clearly, each part of an embodiment is described in detail below with reference to Figs. 1-7.
The invention performs binarization with two cascaded convolutional neural networks. For clarity, the composition and training of the two networks are described first, and the document image binarization method based on a generative adversarial network is then described on the basis of the two trained networks.
1. Composition and training of the two convolutional neural networks
The first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and the adversarial network is built on this basis.
(1) Generator
In the designed generative adversarial network, the first and second convolutional neural networks are two cascaded semantic segmentation networks (U-Net) of identical structure. Each U-Net comprises five convolutional layers and five deconvolution layers, which ensures that the input and output images have the same size. The two U-Nets play different roles. The first U-Net mainly generates a binarized image from the contextual information of local regions while preserving as much text detail as possible. The second U-Net corrects the result image produced by the first, based on the difference in contextual information between text and background at different scales, to further eliminate background noise. The generator structure is shown in the boxed part in the middle of Fig. 3, where G1 is the first convolutional neural network and G2 is the second convolutional neural network.
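The claim that five convolutional layers followed by five deconvolution layers preserve the image size can be checked with simple shape arithmetic. The sketch below is illustrative only: the source gives the layer counts but not the kernel size, stride, or padding, so the values used here (kernel 4, stride 2, padding 1) are assumptions.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one strided convolution (assumed hyperparameters)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one transposed convolution without output padding."""
    return (size - 1) * stride - 2 * pad + kernel


def unet_sizes(size, depth=5):
    """Sizes through `depth` convolutions followed by `depth` deconvolutions."""
    sizes = [size]
    for _ in range(depth):
        sizes.append(conv_out(sizes[-1]))
    for _ in range(depth):
        sizes.append(deconv_out(sizes[-1]))
    return sizes


print(unet_sizes(256))
```

Under these assumed settings a 256*256 block shrinks to 8*8 at the bottleneck and returns to 256*256 at the output, which is consistent with the stated requirement that input and output sizes match.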
(2) Discriminator
The discriminator is a patch-based fully convolutional neural network. Its purpose is to judge which of the two, the binarized image produced by the generator or the original binarized image, is the more standard. The specific network structure is shown in Fig. 4. The discriminator compares a binarized image against the input sample: the pairing of the generator's binary map with the input image should be judged false, and the pairing of the standard binary map of the original image with the input image should be judged true.
(3) Loss function
The loss function $L_{loss}$ used when training the adversarial network combines the following two terms, weighted by $\gamma$:
$$L_{cGAN}(G, D) = E_{x,y}[\log D(x, y)] + E_{x}[\log(1 - D(x, G(x, z)))]$$
$$L_{L1}(G) = E_{x,y}[\| y - G(x, z) \|_1]$$
where $G$ and $D$ denote the generator and the discriminator of the adversarial network; $L_{cGAN}(G, D)$ is the adversarial loss of generator and discriminator training; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the ground-truth binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result image that the generator produces from the input image $x$ and the random noise $z$; $y$ is the ground-truth binary image; $\gamma$ is the weight coefficient balancing the two losses ($\gamma = 1$ in some embodiments); and $D(x, y)$ is the discriminator output for the input image paired with the ground-truth binarized sample.
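The two loss terms can be evaluated numerically as in the following NumPy sketch. The extracted text does not show how the terms are combined into the total loss, so `total_loss` assumes the common conditional-GAN form (adversarial term plus $\gamma$ times the L1 term). The function names, the `EPS` clipping constant, and the use of a per-pixel mean in place of the raw L1 norm are likewise illustrative choices, not details from the source.

```python
import numpy as np

EPS = 1e-7  # keeps log() finite if the discriminator outputs exactly 0 or 1


def cgan_loss(d_real, d_fake):
    """E[log D(x, y)] + E[log(1 - D(x, G(x, z)))] from discriminator scores in (0, 1)."""
    d_real = np.clip(d_real, EPS, 1.0 - EPS)
    d_fake = np.clip(d_fake, EPS, 1.0 - EPS)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))


def l1_loss(y_true, y_gen):
    """Mean absolute difference: the L1 norm of y - G(x, z) divided by the pixel count."""
    return float(np.mean(np.abs(y_true - y_gen)))


def total_loss(d_real, d_fake, y_true, y_gen, gamma=1.0):
    """Assumed combination: adversarial term plus gamma times the L1 term."""
    return cgan_loss(d_real, d_fake) + gamma * l1_loss(y_true, y_gen)
```

In the usual adversarial setup the discriminator is trained to increase the adversarial term while the generator is trained to decrease the total; whether the patent follows exactly this schedule is not stated in the extracted text.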
2. The method of the invention
The document image binarization method based on a generative adversarial network according to an embodiment of the invention, as shown in Fig. 1, comprises:
Step S10: obtain, from the input original document image and according to a set step size, multiple image blocks of a preset first size, as a first image block set.
In this embodiment, for each image block in the first image block set, the second-size region at the center of the block does not overlap with the other image blocks in the set.
For example, let the preset first size be A*A (for example 256*256) and the second size be B*B (for example 128*128). The original photographed document image is cut into image blocks of size A*A according to a fixed step, such that the B*B region at the center of each image block is not overlapped. To achieve this, the positions of adjacent image blocks can be determined as follows.
Given the top-left point [a, b] of an image block, the top-left points of its four adjacent image blocks are:
the top-left point of the left adjacent image block is [a-A+(B/2), b];
the top-left point of the right adjacent image block is [a+A-(B/2), b];
the top-left point of the upper adjacent image block is [a, b-A+(B/2)];
the top-left point of the lower adjacent image block is [a, b+A-(B/2)].
For example, in one embodiment the first size is 256*256 and the second size is 128*128. If the top-left point of an image block is [a, b], the top-left point of the left adjacent image block is [a-256+64, b], that of the right adjacent image block is [a+256-64, b], that of the upper adjacent image block is [a, b-256+64], and that of the lower adjacent image block is [a, b+256-64].
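The neighbor formulas above amount to stepping the block origin by A - B/2 along each axis. The following sketch computes the four neighbors and, as an extension, lists block origins along one image axis. The function names are hypothetical, and the border handling in `block_origins` (the image is assumed to be padded so that the last block fits) is an assumption, since the source does not say how image borders are treated.

```python
def neighbor_top_lefts(a, b, A=256, B=128):
    """Top-left points of the four blocks adjacent to the block at [a, b]."""
    step = A - B // 2  # offset between adjacent blocks, as given in the text
    return {
        "left": (a - step, b),
        "right": (a + step, b),
        "above": (a, b - step),
        "below": (a, b + step),
    }


def block_origins(length, A=256, B=128):
    """Block origins along one axis, assuming the image is padded to fit."""
    step = A - B // 2
    count = max(0, -(-(length - A) // step)) + 1  # ceiling division
    return [i * step for i in range(count)]


print(neighbor_top_lefts(192, 192))
print(block_origins(700))
```

With A = 256 and B = 128 the step is 192. The central region of a block at origin a spans a+64 up to a+192, and the right neighbor begins at a+192, so the central region lies outside every neighboring block.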
Fig. 2 shows an example of cutting an original document image; the lines in the figure indicate the correspondence between positions in the original document image and the image blocks after cutting.
Step S20: for the first image block set, obtain the binarized map of each image block through the first convolutional neural network, yielding a second image block set; normalize the original document image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map.
In this embodiment, each image block in the first image block set is fed into the trained first convolutional neural network to obtain the initial binarization result of that block, yielding the second image block set. At the same time, the whole original document image is normalized to A*A (for example 256*256) and its binarization result is obtained through the first convolutional neural network, as the first binary map.
Fig. 5 shows example results after the binarized image blocks generated by the first convolutional neural network in an embodiment of the invention are stitched back to the original image size; five examples (a), (b), (c), (d), and (e) are given.
Step S30: stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image.
In this embodiment, the input image of the second convolutional neural network consists of three channels, so this step first builds the three-channel image as follows:
the image blocks in the second image block set obtained in step S20 are combined and stitched using the cutting information from step S10, restoring a preliminary binarization result of the original document image as the second binary map, which is the first channel of the input image of the second convolutional neural network;
the first binary map obtained in step S20 is scaled to the size of the original document image as the third binary map, which is the second channel of the input image of the second convolutional neural network;
the grayscale image of the original document image is obtained as the third channel of the input image of the second convolutional neural network;
the second binary map, the third binary map, and the grayscale image of the original document image are merged into the three-channel image.
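The channel merge described above can be sketched in NumPy as follows. The source does not name an interpolation method for scaling the first binary map, so nearest-neighbor scaling is used here as an assumption. The function names are hypothetical, and all maps are assumed to be 2-D float arrays with values in [0, 1].

```python
import numpy as np


def resize_nearest(img, height, width):
    """Nearest-neighbor resize of a 2-D array (assumed interpolation method)."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[rows][:, cols]


def build_three_channel(second_map, first_map, gray):
    """Stack the stitched map, the rescaled whole-image map, and the grayscale image."""
    h, w = gray.shape
    third_map = resize_nearest(first_map, h, w)  # first binary map at original size
    return np.stack([second_map, third_map, gray], axis=-1)
```

The channel order follows the text: the stitched block result first, the rescaled whole-image result second, and the grayscale image third.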
Fig. 6 shows two example three-channel images.
Step S40: cut the three-channel image by the method of step S10 to obtain a third image block set, and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set.
Step S50: stitch the image blocks in the fourth image block set to obtain the final binarized map of the original document image.
In this embodiment, the image blocks in the fourth image block set obtained in step S40 are combined and stitched using the cutting information from step S10, restoring the binarization result image corresponding to the original document image, which is taken as the final binarized map of the original document image.
The binarization process of the invention can also be seen in Fig. 3. The input image (the original document image) is cut to obtain the set of image blocks, is scaled to obtain the normalized original image, and is converted to grayscale to obtain the grayscale image of the original document image. The binarized blocks obtained by passing the cut image block set through G1 are merged into picture (1); the normalized original image is binarized by G1 into picture (2); picture (1), picture (2), and the grayscale image of the original document image are merged and cut again; the resulting blocks are passed through G2 to obtain multiple binarized blocks, which are merged into the final binarized map.
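The flow of Fig. 3 can be sketched end to end as below, with the two trained networks replaced by trivial stand-in functions. Several details are assumptions not stated in the source: edge padding at image borders, averaging of overlapping block outputs during stitching, nearest-neighbor scaling, the final 0.5 threshold, and the convention that text pixels map to 1. `g1_stub` and `g2_stub` are hypothetical placeholders, not the patent's networks.

```python
import numpy as np

A, B = 256, 128       # first and second sizes from the embodiment
STEP = A - B // 2     # offset between adjacent blocks


def resize_nearest(img, h, w):
    rows = np.arange(h) * img.shape[0] // h
    cols = np.arange(w) * img.shape[1] // w
    return img[rows][:, cols]


def origins(length):
    count = max(0, -(-(length - A) // STEP)) + 1  # ceiling division
    return [i * STEP for i in range(count)]


def run_tiled(net, image):
    """Cut an (H, W, C) image into A*A blocks, apply net, stitch by averaging overlaps."""
    h, w = image.shape[:2]
    ys, xs = origins(h), origins(w)
    ph, pw = ys[-1] + A, xs[-1] + A
    padded = np.pad(image, ((0, ph - h), (0, pw - w), (0, 0)), mode="edge")
    acc, cnt = np.zeros((ph, pw)), np.zeros((ph, pw))
    for y in ys:
        for x in xs:
            acc[y:y + A, x:x + A] += net(padded[y:y + A, x:x + A])
            cnt[y:y + A, x:x + A] += 1
    return (acc / cnt)[:h, :w]


def binarize(gray, g1, g2):
    """Two-stage flow: G1 on blocks and on the whole image, then G2 on the merge."""
    h, w = gray.shape
    x = gray[..., None]
    second = run_tiled(g1, x)                                  # picture (1)
    third = resize_nearest(g1(resize_nearest(x, A, A)), h, w)  # picture (2), rescaled
    three = np.stack([second, third, gray], axis=-1)
    return run_tiled(g2, three) > 0.5


def g1_stub(block):
    return (block[..., 0] < 0.5).astype(float)  # dark pixels treated as text


def g2_stub(block):
    return block[..., :2].mean(axis=-1)         # average the two G1 estimates


page = np.ones((300, 500))
page[100:120, 50:400] = 0.0                      # one dark text line
result = binarize(page, g1_stub, g2_stub)
print(result.shape, int(result.sum()))
```

Replacing the stubs with trained networks that map an A*A block to an A*A score map would leave the surrounding cut, merge, and stitch logic unchanged.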
Fig. 7 shows example final binarized maps of original document images obtained by an embodiment of the invention, comprising five result examples (a), (b), (c), (d), and (e) that correspond to the respective examples in Fig. 5.
The document image binarization system based on a generative adversarial network according to an embodiment of the invention comprises a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binarized map acquisition module.
The cutting module is configured to obtain, from the input text image and according to a set step size, multiple image blocks of a preset first size, and to build an image block set.
The first convolutional neural network processing module is configured to take the first image block set that the cutting module obtains from the original text image and obtain the binarized map of each image block through the first convolutional neural network, yielding a second image block set; and to normalize the original text image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map.
The three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original text image to obtain a third binary map; obtain the grayscale image of the original text image; and merge the second binary map, the third binary map, and the grayscale image of the original text image into a three-channel image.
The second convolutional neural network processing module is configured to take the third image block set that the cutting module obtains from the three-channel image and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set.
The final binarized map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binarized map of the original text image.
The first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the system described above and the related explanations may be found in the corresponding process of the foregoing method embodiment and are not repeated here.
Note that the document image binarization system based on a generative adversarial network provided by the above embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be assigned to different functional modules as needed; that is, the modules or steps in the embodiments of the invention may be further decomposed or combined. For example, the modules of the above embodiment may be merged into one module or further split into multiple submodules to complete all or part of the functions described above. The names of the modules and steps in the embodiments of the invention serve only to distinguish the modules or steps and are not to be regarded as an improper limitation of the invention.
A storage device according to an embodiment of the invention stores a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement the above document image binarization method based on a generative adversarial network.
A processing device according to an embodiment of the invention comprises a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; the programs are adapted to be loaded and executed by the processor to implement the above document image binarization method based on a generative adversarial network.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the storage device and the processing device described above and the related explanations may be found in the corresponding process of the foregoing method embodiment and are not repeated here.
Those skilled in the art will recognize that the modules and method steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. The programs corresponding to software modules and method steps may be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field. To illustrate the interchangeability of electronic hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in electronic hardware or in software depends on the specific application and design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the invention.
The terms "first", "second", and so on are used to distinguish similar objects and not to describe or indicate a specific order or sequence.
The term "comprise" and any similar term are intended to cover a non-exclusive inclusion, so that a process, method, article, or device/apparatus comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device/apparatus.
The technical solution of the invention has thus been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the scope of protection of the invention is clearly not limited to these specific embodiments. Without departing from the principle of the invention, those skilled in the art may make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions fall within the scope of protection of the invention.
Claims (10)
1. A document image binarization method based on a generative adversarial network, characterized in that the method comprises:
Step S10: obtaining, according to a set stride, a plurality of image blocks of a preset first size from an input original document image, as a first image block set;
Step S20: for the first image block set, obtaining a binarized map of each image block through a first convolutional neural network to obtain a second image block set; normalizing the original document image to the first size and obtaining its binarized map through the first convolutional neural network, as a first binary map;
Step S30: stitching the image blocks in the second image block set to obtain a second binary map; scaling the first binary map to the size of the original document image, as a third binary map; obtaining a grayscale image of the original document image; and merging the second binary map, the third binary map, and the grayscale image of the original document image to obtain a three-channel image;
Step S40: cutting the three-channel image by the method of step S10 to obtain a third image block set, and obtaining a binary map of each image block through a second convolutional neural network, as a fourth image block set;
Step S50: stitching the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
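The cutting in step S10, the stitching in steps S30 and S50, and the channel merge in step S30 can be sketched as follows. This is a minimal pure-Python illustration, not the patented implementation: all function names are hypothetical, images are nested lists, the last block on each axis is clamped to the image border, and overlapping pixels are averaged when stitching. The claim does not specify border handling or how overlaps are resolved, so both are assumptions.

```python
def block_origins(length, size, stride):
    """Top-left offsets along one axis (hypothetical helper).

    Assumption: the last block is clamped so that it ends at the image border.
    """
    if stride > size:
        raise ValueError("a stride larger than the block size would leave gaps")
    if length <= size:
        return [0]
    origins = list(range(0, length - size, stride))
    origins.append(length - size)
    return origins


def cut_blocks(image, size, stride):
    """Step S10 sketch: cut size x size blocks from a 2-D nested-list image."""
    height, width = len(image), len(image[0])
    blocks = []
    for top in block_origins(height, size, stride):
        for left in block_origins(width, size, stride):
            block = [row[left:left + size] for row in image[top:top + size]]
            blocks.append(((top, left), block))
    return blocks


def stitch_blocks(blocks, height, width):
    """Steps S30/S50 sketch: paste blocks back, averaging overlaps (assumption)."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for (top, left), block in blocks:
        for dy, row in enumerate(block):
            for dx, value in enumerate(row):
                total[top + dy][left + dx] += value
                count[top + dy][left + dx] += 1
    return [[t / c for t, c in zip(t_row, c_row)]
            for t_row, c_row in zip(total, count)]


def merge_three_channel(second_map, third_map, gray):
    """Step S30 sketch: combine three same-sized maps into per-pixel triples."""
    return [list(zip(r2, r3, rg))
            for r2, r3, rg in zip(second_map, third_map, gray)]


# Round trip on a small synthetic image: cutting then stitching restores it.
image = [[float(y * 12 + x) for x in range(12)] for y in range(10)]
blocks = cut_blocks(image, size=4, stride=3)
restored = stitch_blocks(blocks, height=10, width=12)
```

In the method itself, each block would pass through a convolutional neural network between cutting and stitching; the round trip above only checks that the tiling covers every pixel.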
2. The document image binarization method based on a generative adversarial network according to claim 1, characterized in that the discriminator of the adversarial network is a patch-based fully convolutional neural network;
the first convolutional neural network and the second convolutional neural network are two semantic segmentation networks of identical structure; the first convolutional neural network is used to generate a binarized image according to the context information of a local region; and the second convolutional neural network is used to correct the output of the first convolutional neural network according to the difference between text and context information.
3. The document image binarization method based on a generative adversarial network according to claim 2, characterized in that the loss function $L_{loss}$ used when training the adversarial network combines the following two terms, weighted by the coefficient $\gamma$:

$$L_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x}[\log(1 - D(x, G(x, z)))]$$

$$L_{L1}(G) = \mathbb{E}_{x,y}[\lVert y - G(x, z) \rVert_1]$$

wherein $G$ and $D$ denote the generator and the discriminator of the adversarial network, respectively; $L_{cGAN}(G, D)$ is the adversarial loss for training the generator and the discriminator; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the true binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result that the generator produces from the input image $x$ and the random noise $z$; $y$ is the true binary image; $\gamma$ is the weight coefficient of the two losses; and $D(x, y)$ is the discriminator output for the input image paired with the true binarized sample.
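The two loss terms of claim 3 can be estimated on a batch as sketched below. This is an illustration under stated assumptions rather than the patented training code: the function names are hypothetical, discriminator outputs are taken to be probabilities strictly between 0 and 1, and images are flattened lists. The text above does not show how the two terms are combined, so `weighted_total` assumes the common form $L_{cGAN} + \gamma L_{L1}$.

```python
import math


def cgan_loss(d_real, d_fake):
    """Batch estimate of L_cGAN.

    d_real: discriminator outputs D(x, y) for true pairs.
    d_fake: discriminator outputs D(x, G(x, z)) for generated pairs.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def l1_loss(targets, outputs):
    """Batch estimate of L_L1: mean over samples of ||y - G(x, z)||_1."""
    per_sample = [sum(abs(y - g) for y, g in zip(target, output))
                  for target, output in zip(targets, outputs)]
    return sum(per_sample) / len(per_sample)


def weighted_total(adversarial, l1, gamma):
    """ASSUMPTION: the terms combine as L_cGAN + gamma * L_L1."""
    return adversarial + gamma * l1


adv = cgan_loss(d_real=[0.5], d_fake=[0.5])
rec = l1_loss(targets=[[1.0, 0.0, 1.0]], outputs=[[0.5, 0.0, 1.0]])
total = weighted_total(adv, rec, gamma=100.0)  # gamma value is illustrative only
```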
4. The document image binarization method based on a generative adversarial network according to claim 2, characterized in that the first convolutional neural network and the second convolutional neural network each comprise five convolutional layers and five deconvolution layers.
5. The document image binarization method based on a generative adversarial network according to any one of claims 1-4, characterized in that, for each image block in the first image block set, a region of a second size at the center of the image block does not overlap with the other image blocks in the first image block set.
6. The document image binarization method based on a generative adversarial network according to claim 5, characterized in that the first size is A*A and the second size is B*B;
based on the upper-left point [a, b] of an image block, the upper-left points of the four image blocks adjacent to that image block are determined as follows:
the upper-left point of the left adjacent image block is [a-A+(B/2), b];
the upper-left point of the right adjacent image block is [a+A-(B/2), b];
the upper-left point of the upper adjacent image block is [a, b-A+(B/2)];
the upper-left point of the lower adjacent image block is [a, b+A-(B/2)].
7. The document image binarization method based on a generative adversarial network according to claim 6, characterized in that the first size is 256*256 and the second size is 128*128.
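The neighbor rule of claim 6 amounts to shifting the upper-left point by A-(B/2) along one axis. The sketch below evaluates it for the sizes in claim 7; the function name is hypothetical, and integer division is assumed for B/2 (B is even in claim 7).

```python
def neighbor_origins(a, b, first_size, second_size):
    """Upper-left points of the four blocks adjacent to the block at [a, b].

    first_size is A and second_size is B in claim 6; B/2 is computed with
    integer division, which is an assumption for odd B.
    """
    step = first_size - second_size // 2
    return {
        "left": (a - step, b),
        "right": (a + step, b),
        "above": (a, b - step),
        "below": (a, b + step),
    }


# With A = 256 and B = 128, adjacent blocks are offset by 256 - 64 = 192.
neighbors = neighbor_origins(192, 192, first_size=256, second_size=128)
```

For the block at [192, 192], this gives [0, 192] and [384, 192] for the left and right neighbors, and [192, 0] and [192, 384] for the upper and lower neighbors.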
8. A document image binarization system based on a generative adversarial network, characterized in that the system comprises a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binarized map acquisition module;
the cutting module is configured to obtain, according to a set stride, a plurality of image blocks of a preset first size from an input text image, and to construct an image block set;
the first convolutional neural network processing module is configured to obtain, for a first image block set obtained by the cutting module from an original document image, a binarized map of each image block through a first convolutional neural network to obtain a second image block set; and to normalize the original document image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map;
the three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; to scale the first binary map to the size of the original document image, as a third binary map; to obtain a grayscale image of the original document image; and to merge the second binary map, the third binary map, and the grayscale image of the original document image to obtain a three-channel image;
the second convolutional neural network processing module is configured to obtain a third image block set from the three-channel image through the cutting module, and to obtain a binary map of each image block through a second convolutional neural network, as a fourth image block set;
the final binarized map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
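One way to picture how the five modules of claim 8 hand data to each other is the wiring sketch below. It is a hypothetical skeleton, not the patented system: the class name, field names, and signatures are all assumptions, and every operation (cutting, stitching, resizing, grayscale conversion, channel merging, and both networks) is injected as a callable so the data flow can be traced with simple stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Block = Tuple[Any, Any]  # (origin, pixels); hypothetical representation


@dataclass
class TwoStageBinarizer:
    """Hypothetical wiring of the modules; all callables are assumptions."""
    cut: Callable[[Any], List[Block]]            # cutting module
    stitch: Callable[[List[Block], Any], Any]    # blocks + reference image -> map
    to_first_size: Callable[[Any], Any]          # normalize image to first size
    to_original_size: Callable[[Any, Any], Any]  # scale map back to image size
    to_gray: Callable[[Any], Any]
    merge: Callable[[Any, Any, Any], Any]        # build the three-channel image
    net1: Callable[[Any], Any]                   # first convolutional network
    net2: Callable[[Any], Any]                   # second convolutional network

    def run(self, original):
        # First convolutional neural network processing module
        second_blocks = [(o, self.net1(b)) for o, b in self.cut(original)]
        first_map = self.net1(self.to_first_size(original))
        # Three-channel image acquisition module
        second_map = self.stitch(second_blocks, original)
        third_map = self.to_original_size(first_map, original)
        three_channel = self.merge(second_map, third_map, self.to_gray(original))
        # Second convolutional neural network processing module
        fourth_blocks = [(o, self.net2(b)) for o, b in self.cut(three_channel)]
        # Final binarized map acquisition module
        return self.stitch(fourth_blocks, original)


# Trace the data flow with string stand-ins instead of real images and networks.
demo = TwoStageBinarizer(
    cut=lambda img: [((0, 0), img)],
    stitch=lambda blocks, ref: blocks[0][1],
    to_first_size=lambda img: img,
    to_original_size=lambda m, ref: m,
    to_gray=lambda img: img,
    merge=lambda a, b, c: (a, b, c),
    net1=lambda block: "net1(" + block + ")",
    net2=lambda block: "net2" + repr(block),
)
result = demo.run("doc")
```

The trace shows the second network receiving the two first-stage maps together with the grayscale image, which matches the cascade described in the claim.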
9. A storage device in which a plurality of programs are stored, characterized in that the programs are adapted to be loaded and executed by a processor to implement the document image binarization method based on a generative adversarial network according to any one of claims 1-7.
10. A processing device comprising a processor and a storage device, the processor being adapted to execute programs and the storage device being adapted to store a plurality of programs, characterized in that the programs are adapted to be loaded and executed by the processor to implement the document image binarization method based on a generative adversarial network according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910222323.8A CN110097059B (en) | 2019-03-22 | 2019-03-22 | Document image binarization method, system and device based on generation countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910222323.8A CN110097059B (en) | 2019-03-22 | 2019-03-22 | Document image binarization method, system and device based on generation countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110097059A true CN110097059A (en) | 2019-08-06 |
CN110097059B CN110097059B (en) | 2021-04-02 |
Family
ID=67443030
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910222323.8A Active CN110097059B (en) | 2019-03-22 | 2019-03-22 | Document image binarization method, system and device based on generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110097059B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110516202A (en) * | 2019-08-20 | 2019-11-29 | Oppo广东移动通信有限公司 | Acquisition methods, document structure tree method, apparatus and the electronic equipment of document generator |
CN110717523A (en) * | 2019-09-20 | 2020-01-21 | 湖北工业大学 | D-LinkNet-based low-quality document image binarization method |
CN110895828A (en) * | 2019-12-03 | 2020-03-20 | 武汉纺织大学 | Model and method for generating MR (magnetic resonance) image simulating heterogeneous flexible biological tissue |
CN111695596A (en) * | 2020-04-30 | 2020-09-22 | 华为技术有限公司 | Neural network for image processing and related equipment |
CN112837329A (en) * | 2021-03-01 | 2021-05-25 | 西北民族大学 | Tibetan ancient book document image binarization method and system |
WO2022178949A1 (en) * | 2021-02-26 | 2022-09-01 | 平安科技(深圳)有限公司 | Semantic segmentation method and apparatus for electron microtomography data, device, and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101021905A (en) * | 2006-02-15 | 2007-08-22 | 中国科学院自动化研究所 | File image binaryzation method |
CN106203434A (en) * | 2016-07-08 | 2016-12-07 | 中国科学院自动化研究所 | Based on the symmetric file and picture binary coding method of stroke structure |
CN108986067A (en) * | 2018-05-25 | 2018-12-11 | 上海交通大学 | Pulmonary nodule detection method based on cross-module state |
CN109190684A (en) * | 2018-08-15 | 2019-01-11 | 西安电子科技大学 | SAR image sample generating method based on sketch and structural generation confrontation network |
CN109190722A (en) * | 2018-08-06 | 2019-01-11 | 大连民族大学 | Font style based on language of the Manchus character picture migrates transform method |
US20190065880A1 (en) * | 2017-08-28 | 2019-02-28 | Abbyy Development Llc | Reconstructing document from series of document images |
CN109460735A (en) * | 2018-11-09 | 2019-03-12 | 中国科学院自动化研究所 | Document binary processing method, system, device based on figure semi-supervised learning |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101021905A (en) * | 2006-02-15 | 2007-08-22 | 中国科学院自动化研究所 | File image binaryzation method |
CN106203434A (en) * | 2016-07-08 | 2016-12-07 | 中国科学院自动化研究所 | Based on the symmetric file and picture binary coding method of stroke structure |
US20190065880A1 (en) * | 2017-08-28 | 2019-02-28 | Abbyy Development Llc | Reconstructing document from series of document images |
CN108986067A (en) * | 2018-05-25 | 2018-12-11 | 上海交通大学 | Pulmonary nodule detection method based on cross-module state |
CN109190722A (en) * | 2018-08-06 | 2019-01-11 | 大连民族大学 | Font style based on language of the Manchus character picture migrates transform method |
CN109190684A (en) * | 2018-08-15 | 2019-01-11 | 西安电子科技大学 | SAR image sample generating method based on sketch and structural generation confrontation network |
CN109460735A (en) * | 2018-11-09 | 2019-03-12 | 中国科学院自动化研究所 | Document binary processing method, system, device based on figure semi-supervised learning |
Non-Patent Citations (3)
Title |
---|
A. SAKILA等: ""A hybrid approach for document image binarization"", 《2017 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTING AND INFORMATICS (ICICI)》 * |
JINYUAN ZHAO等: ""An Effective Binarization Method for Disturbed Camera-Captured Document Images"", 《2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR)》 * |
童立靖等: ""文档图像二值化算法VFCM"", 《文档图像二值化算法VFCM》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110516202A (en) * | 2019-08-20 | 2019-11-29 | Oppo广东移动通信有限公司 | Acquisition methods, document structure tree method, apparatus and the electronic equipment of document generator |
CN110717523A (en) * | 2019-09-20 | 2020-01-21 | 湖北工业大学 | D-LinkNet-based low-quality document image binarization method |
CN110895828A (en) * | 2019-12-03 | 2020-03-20 | 武汉纺织大学 | Model and method for generating MR (magnetic resonance) image simulating heterogeneous flexible biological tissue |
CN110895828B (en) * | 2019-12-03 | 2023-04-18 | 武汉纺织大学 | Model and method for generating MR (magnetic resonance) image simulating heterogeneous flexible biological tissue |
CN111695596A (en) * | 2020-04-30 | 2020-09-22 | 华为技术有限公司 | Neural network for image processing and related equipment |
WO2022178949A1 (en) * | 2021-02-26 | 2022-09-01 | 平安科技(深圳)有限公司 | Semantic segmentation method and apparatus for electron microtomography data, device, and medium |
CN112837329A (en) * | 2021-03-01 | 2021-05-25 | 西北民族大学 | Tibetan ancient book document image binarization method and system |
CN112837329B (en) * | 2021-03-01 | 2022-07-19 | 西北民族大学 | Tibetan ancient book document image binarization method and system |
Also Published As
Publication number | Publication date |
---|---|
CN110097059B (en) | 2021-04-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110097059A (en) | Document image binarization method, system, and device based on a generative adversarial network | |
Crimmins | Geometric filter for speckle reduction | |
CN108805889A (en) | The fining conspicuousness method for segmenting objects of margin guide and system, equipment | |
FR2534400A1 (en) | GRAPHIC DISPLAY METHODS AND APPARATUS | |
CN113160062A (en) | Infrared image target detection method, device, equipment and storage medium | |
CN109523558A (en) | A kind of portrait dividing method and system | |
CN115205636B (en) | Image target detection method, system, equipment and storage medium | |
WO2016175785A1 (en) | Topic identification based on functional summarization | |
CN113361432A (en) | Video character end-to-end detection and identification method based on deep learning | |
Giridhar et al. | A novel approach to ocr using image recognition based classification for ancient tamil inscriptions in temples | |
CN111461211A (en) | Feature extraction method for lightweight target detection and corresponding detection method | |
Kovanen et al. | A layered method for determining manga text bubble reading order | |
de Meulenaer et al. | Deriving physical parameters of unresolved star clusters-V. M 31 PHAT star clusters | |
EP1553508A1 (en) | Method for realizing an electrical wiring diagram | |
CN102663715A (en) | Super-resolution method and device | |
CN112800259A (en) | Image generation method and system based on edge closure and commonality detection | |
Zimmermann | Chemical Structure Reconstruction with chemoCR. | |
Oriot et al. | Building extraction from stereoscopic aerial images | |
JP2012160002A (en) | Layout template generation device and image layout device | |
CN111260659A (en) | Image interactive segmentation method based on initial annotation point guidance | |
CN111145178A (en) | High-resolution remote sensing image multi-scale segmentation method | |
Sreedevi et al. | Enhancement of inscription images | |
Bejinariu et al. | Recovery of old dialectal materials and maps through image processing | |
Charrada et al. | Development of a database with ground truth for old documents analysis | |
CN114627292B (en) | Industrial shielding target detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||