CN110097059A - Document image binarization method, system, and device based on a generative adversarial network - Google Patents

Document image binarization method, system, and device based on a generative adversarial network

Info

Publication number
CN110097059A
Authority
CN
China
Prior art keywords
image
picture
image block
binary
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910222323.8A
Other languages
Chinese (zh)
Other versions
CN110097059B (en)
Inventor
肖柏华
赵晋媛
贾馥溪
王春恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201910222323.8A
Publication of CN110097059A
Application granted
Publication of CN110097059B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns

Abstract

The invention belongs to the field of image processing and in particular relates to a document image binarization method, system, and device based on a generative adversarial network. It aims to solve the problem that existing binarization methods have unstable accuracy and poor robustness when the quality of document images is uneven. The method comprises: cutting the original document image into image blocks; binarizing both the image blocks and the size-normalized original document image with a first convolutional neural network; stitching the binarized image blocks into one binary map and scaling the binarized normalized image to the size of the original document image, then merging these two maps with the grayscale image of the original document image; and cutting the merged image into blocks, binarizing them with a second convolutional neural network, and merging the resulting binarized blocks into the final binary map. For photographed document images of many document types, the invention obtains binarized images of higher accuracy, with high stability and strong robustness.

Description

Document image binarization method, system, and device based on a generative adversarial network
Technical Field
The invention belongs to the field of image processing, and in particular relates to a document image binarization method, system, and device based on a generative adversarial network.
Background
In recent years, with the rapid development of network technology, humanity has entered the information age. Traditional information carriers such as books, newspapers, and periodicals are inconvenient to carry, require a large amount of storage space, and are hard to edit, organize, and disseminate. People increasingly prefer to store information on electronic media such as disks, so entering the text of paper materials into a computer quickly is of great significance, and OCR (Optical Character Recognition) technology arose for this purpose. OCR enables high-speed, automatic input of text information, saves a great deal of human labor, and is now widely used.
The success of OCR depends on the preprocessing of the text image. Good binarization of the image can greatly improve OCR recognition accuracy, so binarization has considerable research value. In practical applications, the quality of text images varies widely, and images may be affected by unclear printing, noise, and other interference. When document image quality is uneven, existing binarization methods have unstable accuracy and poor robustness.
Summary of the Invention
To solve the above problem in the prior art, namely that existing binarization methods have unstable accuracy and poor robustness when document image quality is uneven, a first aspect of the invention proposes a document image binarization method based on a generative adversarial network. The method comprises:
Step S10: obtain multiple image blocks of a preset first size from the input original document image according to a set step length, as a first image block set;
Step S20: for the first image block set, obtain the binary map of each image block through a first convolutional neural network, yielding a second image block set; normalize the original document image to the first size and obtain its binary map through the first convolutional neural network, as a first binary map;
Step S30: stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image;
Step S40: cut the three-channel image into a third image block set using the method of step S10, and obtain the binary map of each image block through a second convolutional neural network, as a fourth image block set;
Step S50: stitch the image blocks in the fourth image block set to obtain the final binary map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized by training.
In some preferred embodiments, the discriminator of the adversarial network is a patch-based fully convolutional neural network;
the first convolutional neural network and the second convolutional neural network are two semantic segmentation networks of identical structure; the first convolutional neural network generates a binarized image from the contextual information of local regions; the second convolutional neural network corrects the output of the first convolutional neural network based on the difference in contextual information between text and background.
In some preferred embodiments, the loss function $L_{loss}$ used when training the adversarial network combines the following two terms, weighted by the coefficient $\gamma$:
$$L_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x}[\log(1 - D(x, G(x, z)))]$$
$$L_{L1}(G) = \mathbb{E}_{x,y}[\lVert y - G(x, z) \rVert_1]$$
where $G$ and $D$ denote the generator and the discriminator of the adversarial network; $L_{cGAN}(G, D)$ is the adversarial loss used to train the generator and the discriminator; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the true binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result image that the generator produces from the input image $x$ and the random noise $z$; $y$ is the true binary image; $\gamma$ is the weight coefficient between the two losses; and $D(x, y)$ is the discriminator output for the pair formed by the input image and the true binarized sample.
In some preferred embodiments, the first convolutional neural network and the second convolutional neural network each include five convolutional layers and five deconvolution layers.
In some preferred embodiments, for each image block in the first image block set, the region of the second size at the center of the image block does not overlap any other image block in the first image block set.
In some preferred embodiments, the first size is A*A and the second size is B*B;
given the top-left point [a, b] of an image block, the top-left points of its four adjacent image blocks are determined as follows:
the top-left point of the left adjacent image block is [a-A+(B/2), b];
the top-left point of the right adjacent image block is [a+A-(B/2), b];
the top-left point of the upper adjacent image block is [a, b-A+(B/2)];
the top-left point of the lower adjacent image block is [a, b+A-(B/2)].
In some preferred embodiments, the first size is 256*256 and the second size is 128*128.
A second aspect of the invention proposes a document image binarization system based on a generative adversarial network. The system includes a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binary map acquisition module;
the cutting module is configured to obtain multiple image blocks of a preset first size from the input original document image according to a set step length and to build an image block set;
the first convolutional neural network processing module is configured to take the first image block set obtained from the original document image by the cutting module and obtain the binary map of each image block through the first convolutional neural network, yielding a second image block set; and to normalize the original document image to the first size and obtain its binary map through the first convolutional neural network, as a first binary map;
the three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; and merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image;
the second convolutional neural network processing module is configured to take the third image block set obtained from the three-channel image by the cutting module and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set;
the final binary map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binary map of the original document image;
wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized by training.
A third aspect of the invention proposes a storage device storing a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement the above document image binarization method based on a generative adversarial network.
A fourth aspect of the invention proposes a processing device including a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; the programs are adapted to be loaded and executed by the processor to implement the above document image binarization method based on a generative adversarial network.
Beneficial effects of the invention:
For photographed document images of many document types, the invention obtains binarized images of higher accuracy, with high stability and strong robustness. In addition, by using two convolutional neural networks, the invention adapts well to extracting text from document images and can overcome interference from non-text noise.
Brief Description of the Drawings
Other features, objects, and advantages of the application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a schematic flow diagram of the document image binarization method based on a generative adversarial network in an embodiment of the invention;
Fig. 2 is a schematic diagram of cutting the original document image in an embodiment of the invention;
Fig. 3 is a schematic diagram of the generator part of the generative adversarial network structure in an embodiment of the invention;
Fig. 4 is a schematic diagram of the discriminator of the generative adversarial network structure in an embodiment of the invention;
Fig. 5 shows examples of results obtained through the first convolutional neural network in an embodiment of the invention;
Fig. 6 shows examples of input images of the second convolutional neural network in an embodiment of the invention;
Fig. 7 shows examples of the final binary maps obtained for original document images in an embodiment of the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the invention clearer, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative work fall within the protection scope of the invention.
The application is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain the invention and do not limit it. For ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
To describe the invention more clearly, each part of one embodiment is described in detail below with reference to Fig. 1 to Fig. 7.
The invention performs binarization with two cascaded convolutional neural networks. For clarity, the composition and training of the two convolutional neural networks are described first, and the document image binarization method based on a generative adversarial network is then described on the basis of the two trained networks.
1. Composition and training of the two convolutional neural networks
The first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and the adversarial network is built on this basis.
(1) Generator
In the designed generative adversarial network, the first and second convolutional neural networks are two cascaded semantic segmentation networks (U-Net) of identical structure. Each U-Net includes five convolutional layers and five deconvolution layers, which ensures that the input and output images have the same size. The two U-Nets play different roles. The first U-Net generates a binarized image mainly from the contextual information of local regions and preserves text detail as far as possible. The second U-Net corrects the result image generated by the first, based on the differences in contextual information between text and background at different scales, to further eliminate background noise. The generator structure is shown in the boxed part in the middle of Fig. 3, where G1 is the first convolutional neural network and G2 is the second convolutional neural network.
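As a rough check of the statement that five convolutional layers followed by five deconvolution layers return an image of the input size, the following sketch traces the spatial size through one such network. The kernel size 4, stride 2, and padding 1 are assumptions chosen for illustration; the text does not state the layer hyperparameters, and the function names are hypothetical.

```python
def conv_out(n, k=4, s=2, p=1):
    """Output size of a strided convolution (standard size formula)."""
    return (n + 2 * p - k) // s + 1


def deconv_out(n, k=4, s=2, p=1):
    """Output size of a transposed convolution with no output padding."""
    return (n - 1) * s - 2 * p + k


def trace_sizes(n, depth=5):
    """Spatial size after each of `depth` conv layers and `depth` deconv layers."""
    sizes = [n]
    for _ in range(depth):
        sizes.append(conv_out(sizes[-1]))    # each conv halves the size
    for _ in range(depth):
        sizes.append(deconv_out(sizes[-1]))  # each deconv doubles it again
    return sizes


sizes = trace_sizes(256)
print(sizes)
```

Under these assumed hyperparameters a 256*256 block shrinks to 8*8 at the bottleneck and is restored to 256*256 at the output.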
(2) Discriminator
The discriminator is a patch-based fully convolutional neural network. Its purpose is to distinguish the binarized image generated by the generator from the ground-truth binarized image. The specific network structure is shown in Fig. 4. The discriminator judges a binarized image paired with the input image: the pairing of a generator-produced binary map with the input image is judged false, and the pairing of the ground-truth binary map of the original image with the input image is judged true.
(3) Loss function
The loss function $L_{loss}$ used when training the adversarial network combines the following two terms, weighted by the coefficient $\gamma$:
$$L_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x}[\log(1 - D(x, G(x, z)))]$$
$$L_{L1}(G) = \mathbb{E}_{x,y}[\lVert y - G(x, z) \rVert_1]$$
where $G$ and $D$ denote the generator and the discriminator of the adversarial network; $L_{cGAN}(G, D)$ is the adversarial loss used to train the generator and the discriminator; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the true binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result image that the generator produces from the input image $x$ and the random noise $z$; $y$ is the true binary image; $\gamma$ is the weight coefficient between the two losses ($\gamma = 1$ in some embodiments); and $D(x, y)$ is the discriminator output for the pair formed by the input image and the true binarized sample.
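The two loss terms can be evaluated numerically as in the minimal NumPy sketch below. It assumes that the expectations are approximated by batch means and that the two terms are combined as L_cGAN + γ·L_L1, which is a common conditional-GAN formulation; the text only says that γ weights the two losses, so the combination, the function names, and the `eps` guard are all assumptions.

```python
import numpy as np


def cgan_loss(d_real, d_fake, eps=1e-12):
    """L_cGAN: mean of log D(x, y) plus mean of log(1 - D(x, G(x, z))).

    d_real, d_fake: discriminator outputs in (0, 1) for real and generated pairs.
    eps guards against log(0); it is an implementation detail, not from the text.
    """
    d_real = np.asarray(d_real, dtype=np.float64)
    d_fake = np.asarray(d_fake, dtype=np.float64)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))


def l1_loss(y_true, y_gen):
    """L_L1: batch mean of the per-image L1 norm ||y - G(x, z)||_1."""
    diff = np.abs(np.asarray(y_true, dtype=np.float64)
                  - np.asarray(y_gen, dtype=np.float64))
    return np.mean(diff.reshape(diff.shape[0], -1).sum(axis=1))


def total_loss(d_real, d_fake, y_true, y_gen, gamma=1.0):
    """Assumed combination of the two terms: L_cGAN + gamma * L_L1."""
    return cgan_loss(d_real, d_fake) + gamma * l1_loss(y_true, y_gen)


# Toy batch of two 4x4 binary images; the generated batch differs in one pixel.
y_true = np.zeros((2, 4, 4))
y_gen = y_true.copy()
y_gen[0, 0, 0] = 1.0
print(total_loss([0.5, 0.5], [0.5, 0.5], y_true, y_gen, gamma=1.0))
```

In training, the discriminator would seek to increase the adversarial term while the generator seeks to decrease the total; that alternating optimisation is not shown here.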
2. Method of the invention
The document image binarization method based on a generative adversarial network of an embodiment of the invention, as shown in Fig. 1, comprises:
Step S10: obtain multiple image blocks of a preset first size from the input original document image according to a set step length, as a first image block set.
In this embodiment, for each image block in the first image block set, the region of the second size at the center of the image block does not overlap any other image block in the first image block set.
For example, let the preset first size be A*A (for example 256*256) and the second size be B*B (for example 128*128). The original photographed document image is cut into image blocks of size A*A with a fixed step length such that the B*B region at the center of each image block is not overlapped. To achieve this, the positions of adjacent image blocks can be determined as follows:
given the top-left point [a, b] of an image block, the top-left points of its four adjacent image blocks are:
the top-left point of the left adjacent image block is [a-A+(B/2), b];
the top-left point of the right adjacent image block is [a+A-(B/2), b];
the top-left point of the upper adjacent image block is [a, b-A+(B/2)];
the top-left point of the lower adjacent image block is [a, b+A-(B/2)].
For example, in one embodiment the first size is 256*256 and the second size is 128*128. If the top-left point of an image block is [a, b], the top-left point of the left adjacent image block is [a-256+64, b], that of the right adjacent image block is [a+256-64, b], that of the upper adjacent image block is [a, b-256+64], and that of the lower adjacent image block is [a, b+256-64].
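The neighbor rule above amounts to a step of A - B/2 between adjacent blocks. The short sketch below computes the four neighbor origins and checks along one axis that the central B-wide region of a block is not covered by its neighbors. The function names are hypothetical, and treating the first coordinate as the horizontal one is an assumption.

```python
def neighbor_origins(a, b, A=256, B=128):
    """Top-left points of the four blocks adjacent to the block at [a, b]."""
    step = A - B // 2  # a + A - B/2 is the same as a + step
    return {
        "left": (a - step, b),
        "right": (a + step, b),
        "above": (a, b - step),
        "below": (a, b + step),
    }


def central_interval(origin, A=256, B=128):
    """Half-open interval covered by the central B-wide region along one axis."""
    start = origin + (A - B) // 2
    return start, start + B


n = neighbor_origins(192, 192)
lo, hi = central_interval(192)
# The left neighbour ends where the central region starts,
# and the right neighbour starts where the central region ends.
print(n, (lo, hi))
print(n["left"][0] + 256 <= lo, n["right"][0] >= hi)
```

With A = 256 and B = 128, adjacent blocks overlap by 64 pixels, and that overlap lies entirely outside each block's central 128-pixel region.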
Fig. 2 shows an exemplary cutting of an original document image; the lines in the figure indicate the correspondence between positions in the original document image and the image blocks after cutting.
Step S20: for the first image block set, obtain the binary map of each image block through the first convolutional neural network, yielding a second image block set; normalize the original document image to the first size and obtain its binary map through the first convolutional neural network, as a first binary map.
In this embodiment, each image block in the first image block set is fed into the trained first convolutional neural network to obtain the initial binarization result of that block, which yields the second image block set. At the same time, the whole original document image is normalized to A*A (for example 256*256) and passed through the first convolutional neural network to obtain its binarization result, as the first binary map.
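Steps S10 and S20 can be sketched as follows. This is an illustration under several assumptions: the image is already grayscale and at least A*A in size, a last block is placed flush with the border when the step does not divide the image evenly, nearest-neighbour scaling stands in for the unspecified normalization, and `placeholder_net` is a hypothetical stand-in for the trained first network.

```python
import numpy as np


def tile_origins(length, A=256, B=128):
    """Block origins along one axis, spaced A - B/2 apart and covering `length`."""
    step = A - B // 2
    origins = list(range(0, max(length - A, 0) + 1, step))
    if origins[-1] + A < length:       # assumed border handling
        origins.append(length - A)
    return origins


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (assumed scaling method)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols]


def placeholder_net(block):
    """Hypothetical stand-in for the trained first network: a fixed threshold."""
    return (block > 127).astype(np.uint8)


def step_s20(gray, net=placeholder_net, A=256, B=128):
    """Return {(top, left): binarized block} and the first binary map."""
    h, w = gray.shape
    assert h >= A and w >= A, "this sketch assumes the image is at least A x A"
    blocks = {}
    for top in tile_origins(h, A, B):
        for left in tile_origins(w, A, B):
            blocks[(top, left)] = net(gray[top:top + A, left:left + A])
    first_binary_map = net(resize_nearest(gray, A, A))
    return blocks, first_binary_map


gray = np.zeros((600, 500), dtype=np.uint8)
gray[:, 250:] = 255
blocks, first_binary_map = step_s20(gray)
print(len(blocks), first_binary_map.shape)
```

Keeping each block's origin alongside its result is what later allows the blocks to be stitched back at their original positions.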
Fig. 5 shows examples of the result obtained in an embodiment of the invention when the binarized image blocks generated by the first convolutional neural network are stitched back to the original image size; five examples (a), (b), (c), (d), and (e) are given in the figure.
Step S30: stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image.
In this embodiment, the input image of the second convolutional neural network consists of three channels, so this step first builds the three-channel image as follows:
The image blocks of the second image block set obtained in step S20 are combined and stitched using the cutting information from step S10, restoring a preliminary binarization result of the original document image as the second binary map; this map is the first channel of the input image of the second convolutional neural network;
The first binary map obtained in step S20 is scaled to the size of the original document image to give the third binary map; this map is the second channel of the input image of the second convolutional neural network;
The grayscale image of the original document image is obtained as the third channel of the input image of the second convolutional neural network;
The second binary map, the third binary map, and the grayscale image of the original document image are merged to obtain the three-channel image.
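The channel merge in step S30 can be sketched as below. Nearest-neighbour scaling, the channels-last layout, and leaving pixel value ranges as passed in are assumptions; the text fixes only the channel order (second binary map, third binary map, grayscale image). The function names are hypothetical.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (assumed scaling method)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols]


def merge_three_channels(second_binary_map, first_binary_map, gray):
    """Build the H x W x 3 input of the second network."""
    h, w = gray.shape
    assert second_binary_map.shape == (h, w)
    # Third binary map: the first binary map scaled back to the original size.
    third_binary_map = resize_nearest(first_binary_map, h, w)
    return np.stack([second_binary_map, third_binary_map, gray], axis=-1)


gray = np.full((300, 400), 200, dtype=np.uint8)
second = np.zeros((300, 400), dtype=np.uint8)   # stitched block results
first = np.ones((256, 256), dtype=np.uint8)     # whole-image result at A x A
merged = merge_three_channels(second, first, gray)
print(merged.shape)
```

The first channel carries detail from block-level binarization, the second carries the coarser whole-image result, and the third keeps the original intensities for the second network to consult.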
Fig. 6 shows two example three-channel images.
Step S40: cut the three-channel image into a third image block set using the method of step S10, and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set.
Step S50: stitch the image blocks in the fourth image block set to obtain the final binary map of the original document image.
In this embodiment, the image blocks in the fourth image block set obtained in step S40 are combined and stitched using the cutting information from step S10, restoring the binarization result image corresponding to the original document image; this image is taken as the final binary map of the original document image.
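Stitching can be sketched as placing each block back at its recorded origin. The text does not say how pixels covered by more than one block are resolved; the sketch below averages the overlapping predictions and thresholds at 0.5, which is purely an assumption, as is the `stitch_blocks` name.

```python
import numpy as np


def stitch_blocks(blocks, height, width):
    """Stitch {(top, left): 2-D array of 0/1 values} into one binary map.

    Overlapping pixels are averaged and thresholded at 0.5 (assumed rule).
    """
    total = np.zeros((height, width), dtype=np.float64)
    count = np.zeros((height, width), dtype=np.float64)
    for (top, left), block in blocks.items():
        bh, bw = block.shape
        total[top:top + bh, left:left + bw] += block
        count[top:top + bh, left:left + bw] += 1
    covered = count > 0
    out = np.zeros((height, width), dtype=np.uint8)
    out[covered] = (total[covered] / count[covered] >= 0.5).astype(np.uint8)
    return out


# Two 4x4 blocks overlapping in columns 2 and 3 of a 4x6 image.
toy_blocks = {(0, 0): np.ones((4, 4)), (0, 2): np.zeros((4, 4))}
stitched = stitch_blocks(toy_blocks, 4, 6)
print(stitched)
```

An alternative consistent with the tiling rule would be to trust each block most in its central region, since that region is covered by no other block.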
The binarization process of the invention is also shown in Fig. 3. The input image (the original document image) is cut to obtain a set of image blocks, scaled to obtain the normalized original image, and converted to grayscale to obtain the grayscale image of the original document image. The image block set is passed through G1 and the resulting binary blocks are merged to obtain picture (1); the normalized original image is binarized by G1 to obtain picture (2). Picture (1), picture (2), and the grayscale image of the original document image are merged and then cut again; the blocks are passed through G2 to obtain multiple binarized blocks, which are merged to obtain the final binary map.
Fig. 7 shows examples of the final binary maps obtained for original document images in an embodiment of the invention, including five result examples (a), (b), (c), (d), and (e), which correspond to the respective panels of Fig. 5.
The document image binarization system based on a generative adversarial network of an embodiment of the invention includes a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binary map acquisition module.
The cutting module is configured to obtain multiple image blocks of a preset first size from the input original document image according to a set step length and to build an image block set.
The first convolutional neural network processing module is configured to take the first image block set obtained from the original document image by the cutting module and obtain the binary map of each image block through the first convolutional neural network, yielding a second image block set; and to normalize the original document image to the first size and obtain its binary map through the first convolutional neural network, as a first binary map.
The three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image to obtain a third binary map; obtain the grayscale image of the original document image; and merge the second binary map, the third binary map, and the grayscale image of the original document image into a three-channel image.
The second convolutional neural network processing module is configured to take the third image block set obtained from the three-channel image by the cutting module and obtain the binary map of each image block through the second convolutional neural network, as a fourth image block set.
The final binary map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binary map of the original document image.
The first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized by training.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the system described above and the related explanations may refer to the corresponding processes in the foregoing method embodiment and are not repeated here.
It should be noted that the document image binarization system based on a generative adversarial network provided by the above embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be assigned to different functional modules as needed; that is, the modules or steps in the embodiments of the invention may be further decomposed or combined. For example, the modules of the above embodiment may be merged into one module or further split into multiple submodules to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the invention serve only to distinguish the modules or steps and are not to be regarded as improper limitations of the invention.
The storage device of an embodiment of the invention stores a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement the above document image binarization method based on a generative adversarial network.
The processing device of an embodiment of the invention includes a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; the programs are adapted to be loaded and executed by the processor to implement the above document image binarization method based on a generative adversarial network.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the storage device and processing device described above and the related explanations may refer to the corresponding processes in the foregoing method embodiment and are not repeated here.
Those skilled in the art will recognize that the modules and method steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. Programs corresponding to software modules and method steps can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field. To illustrate the interchangeability of electronic hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are executed in electronic hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the invention.
The terms "first", "second", and so on are used to distinguish similar objects and are not used to describe or indicate a specific order or sequence.
The term "include" and any similar term are intended to cover non-exclusive inclusion, so that a process, method, article, or device/apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device/apparatus.
The technical solution of the invention has been described above with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the protection scope of the invention is clearly not limited to these specific embodiments. Without departing from the principle of the invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions fall within the protection scope of the invention.

Claims (10)

1. A document image binarization method based on a generative adversarial network, characterized in that the method comprises:
Step S10: obtaining, from the input original document image, a plurality of image blocks of a preset first size according to a set step length, as a first image block set;
Step S20: for the first image block set, obtaining a binarized map of each image block through a first convolutional neural network, to obtain a second image block set; normalizing the original document image to the first size and obtaining its binarized map through the first convolutional neural network, as a first binary map;
Step S30: stitching the image blocks in the second image block set to obtain a second binary map; scaling the first binary map to the size of the original document image, as a third binary map; obtaining a grayscale map of the original document image; merging the second binary map, the third binary map, and the grayscale map of the original document image to obtain a three-channel image;
Step S40: cutting the three-channel image by the method of step S10 to obtain a third image block set, and obtaining a binary map of each image block through a second convolutional neural network, as a fourth image block set;
Step S50: stitching the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
Wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
2. The document image binarization method based on a generative adversarial network according to claim 1, characterized in that the discriminator of the adversarial network is a patch-based fully convolutional neural network;
The first convolutional neural network and the second convolutional neural network are two semantic segmentation networks with identical structure; the first convolutional neural network is used to generate a binarized image from the contextual information of a local region; the second convolutional neural network is used to correct the output of the first convolutional neural network according to the difference between text and context information.
3. The document image binarization method based on a generative adversarial network according to claim 2, characterized in that the loss function $L_{loss}$ used when training the adversarial network combines the following two losses, weighted by the coefficient $\gamma$:
$$L_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x}[\log(1 - D(x, G(x, z)))]$$
$$L_{L1}(G) = \mathbb{E}_{x,y}[\lVert y - G(x, z) \rVert_1]$$
Wherein $G$ and $D$ respectively denote the generator and the discriminator of the adversarial network; $L_{cGAN}(G, D)$ is the adversarial loss used in training the generator and the discriminator; $L_{L1}(G)$ is the L1 loss between the image produced by the generator and the true binary image; $x$ is the input image; $z$ is the random noise in the generator; $G(x, z)$ is the binarization result image that the generator produces from the input image $x$ and the random noise $z$; $y$ is the true binary image; $\gamma$ is the weight coefficient of the two losses; and $D(x, y)$ is the discriminator output for the pair formed by the input image and the true binarized sample.
4. The document image binarization method based on a generative adversarial network according to claim 2, characterized in that the first convolutional neural network and the second convolutional neural network each comprise five convolutional layers and five deconvolution layers.
5. The document image binarization method based on a generative adversarial network according to any one of claims 1-4, characterized in that, for each image block in the first image block set, a region of a second size at the center of the image block does not overlap with the other image blocks in the first image block set.
6. The document image binarization method based on a generative adversarial network according to claim 5, characterized in that the first size is A*A and the second size is B*B;
Based on the top-left point [a, b] of an image block, the top-left points of the four image blocks adjacent to that image block are determined as follows:
The top-left point coordinates of the left adjacent image block are [a-A+(B/2), b];
The top-left point coordinates of the right adjacent image block are [a+A-(B/2), b];
The top-left point coordinates of the upper adjacent image block are [a, b-A+(B/2)];
The top-left point coordinates of the lower adjacent image block are [a, b+A-(B/2)].
7. The document image binarization method based on a generative adversarial network according to claim 6, characterized in that the first size is 256*256 and the second size is 128*128.
8. A document image binarization system based on a generative adversarial network, characterized in that the system comprises a cutting module, a first convolutional neural network processing module, a three-channel image acquisition module, a second convolutional neural network processing module, and a final binarized map acquisition module;
The cutting module is configured to obtain, from the input text image, a plurality of image blocks of a preset first size according to a set step length, and to construct an image block set;
The first convolutional neural network processing module is configured to, for the first image block set obtained by the cutting module from the original document image, obtain a binarized map of each image block through the first convolutional neural network, to obtain a second image block set; and to normalize the original document image to the first size and obtain its binarized map through the first convolutional neural network, as a first binary map;
The three-channel image acquisition module is configured to stitch the image blocks in the second image block set to obtain a second binary map; scale the first binary map to the size of the original document image, as a third binary map; obtain a grayscale map of the original document image; and merge the second binary map, the third binary map, and the grayscale map of the original document image to obtain a three-channel image;
The second convolutional neural network processing module is configured to obtain a third image block set from the three-channel image through the cutting module, and to obtain a binary map of each image block through the second convolutional neural network, as a fourth image block set;
The final binarized map acquisition module is configured to stitch the image blocks in the fourth image block set to obtain the final binarized map of the original document image;
Wherein the first convolutional neural network and the second convolutional neural network are cascaded to form the generator of the generative adversarial network, and their parameters are optimized through training.
9. A storage device in which a plurality of programs are stored, characterized in that the programs are adapted to be loaded and executed by a processor to implement the document image binarization method based on a generative adversarial network according to any one of claims 1-7.
10. A processing device comprising a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; characterized in that the programs are adapted to be loaded and executed by the processor to implement the document image binarization method based on a generative adversarial network according to any one of claims 1-7.
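The neighbor coordinates of claim 6 and the non-overlap condition of claim 5 can be checked with a little arithmetic. The following is only an illustrative sketch, not the patented implementation: the function names are hypothetical, [a, b] is assumed to be (horizontal, vertical) pixel coordinates with the origin at the top-left, B is assumed to be even, and the default sizes are those of claim 7.

```python
def block_extent(a, b, A=256):
    """Half-open extent (x0, y0, x1, y1) of the A*A block whose top-left point is [a, b]."""
    return (a, b, a + A, b + A)


def central_region(a, b, A=256, B=128):
    """Half-open extent of the B*B region at the center of the block at [a, b]."""
    margin = (A - B) // 2
    return (a + margin, b + margin, a + margin + B, b + margin + B)


def neighbor_top_lefts(a, b, A=256, B=128):
    """Top-left points of the four adjacent blocks, following the formulas of claim 6."""
    offset = A - B // 2  # distance between adjacent top-left points (assumes even B)
    return {
        "left": (a - offset, b),
        "right": (a + offset, b),
        "above": (a, b - offset),
        "below": (a, b + offset),
    }


def overlaps(r, s):
    """True if two half-open rectangles share at least one pixel."""
    return r[0] < s[2] and s[0] < r[2] and r[1] < s[3] and s[1] < r[3]


# Check the central region of one block against each of its four neighbors.
center = central_region(192, 192)
for name, (na, nb) in neighbor_top_lefts(192, 192).items():
    print(name, (na, nb), "overlaps center:", overlaps(center, block_extent(na, nb)))
```

With the sizes of claim 7, adjacent top-left points are 192 pixels apart, so neighboring blocks share a 64-pixel band while the central 128*128 region of each block stays clear of every neighbor.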
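The two loss terms of claim 3 can be evaluated numerically on a small batch. The sketch below uses NumPy and hypothetical function names. The final combination `L_cGAN + gamma * L_L1` is an assumption borrowed from the usual conditional-GAN formulation, since the text above only states that the coefficient weights the two losses. The `eps` term is added purely for numerical safety.

```python
import numpy as np


def cgan_loss(d_real, d_fake, eps=1e-12):
    """E[log D(x, y)] + E[log(1 - D(x, G(x, z)))] over discriminator outputs in (0, 1)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))


def l1_loss(y, g):
    """E[||y - G(x, z)||_1]: per-sample L1 norm, averaged over the batch."""
    y = np.asarray(y, dtype=float)
    g = np.asarray(g, dtype=float)
    return np.mean(np.abs(y - g).reshape(len(y), -1).sum(axis=1))


def combined_loss(d_real, d_fake, y, g, gamma):
    """ASSUMED combination of the two terms; the exact form is not given in the text."""
    return cgan_loss(d_real, d_fake) + gamma * l1_loss(y, g)


# Toy batch: one 2x2 ground-truth map, one generated map, and scalar discriminator scores.
y_true = np.zeros((1, 2, 2))
y_fake = np.ones((1, 2, 2))
print(combined_loss([0.5], [0.5], y_true, y_fake, gamma=10.0))
```

In training, the discriminator would try to increase the adversarial term while the generator tries to decrease the total; that alternation is omitted here.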
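The merging part of step S30 can be sketched with NumPy arrays. This is a minimal illustration with hypothetical helper names. Nearest-neighbor scaling is an assumption, since the claims do not name an interpolation method, and the channel order simply follows the order in which claim 1 lists the three maps.

```python
import numpy as np


def resize_nearest(img, height, width):
    """Scale a 2-D map to (height, width) by nearest-neighbor sampling (assumed method)."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[rows[:, None], cols]


def merge_three_channels(second_binary, first_binary, gray):
    """Stack the stitched local map, the rescaled global map, and the grayscale map."""
    h, w = gray.shape
    third_binary = resize_nearest(first_binary, h, w)  # first binary map at original size
    return np.stack([second_binary, third_binary, gray], axis=-1)


# Toy example: a 4x4 "document", a stitched 4x4 local map, and a 2x2 global map.
gray = np.arange(16, dtype=np.uint8).reshape(4, 4)
second = np.ones((4, 4), dtype=np.uint8)
first = np.array([[0, 1], [1, 0]], dtype=np.uint8)
merged = merge_three_channels(second, first, gray)
print(merged.shape)
```

The resulting array is what step S40 would cut into blocks again, so the second network sees the local result, the global result, and the original intensities for every pixel.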
CN201910222323.8A 2019-03-22 2019-03-22 Document image binarization method, system and device based on generation countermeasure network Active CN110097059B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910222323.8A CN110097059B (en) 2019-03-22 2019-03-22 Document image binarization method, system and device based on generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910222323.8A CN110097059B (en) 2019-03-22 2019-03-22 Document image binarization method, system and device based on generation countermeasure network

Publications (2)

Publication Number Publication Date
CN110097059A true CN110097059A (en) 2019-08-06
CN110097059B CN110097059B (en) 2021-04-02

Family

ID=67443030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910222323.8A Active CN110097059B (en) 2019-03-22 2019-03-22 Document image binarization method, system and device based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN110097059B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101021905A (en) * 2006-02-15 2007-08-22 中国科学院自动化研究所 File image binaryzation method
CN106203434A (en) * 2016-07-08 2016-12-07 中国科学院自动化研究所 Based on the symmetric file and picture binary coding method of stroke structure
CN108986067A (en) * 2018-05-25 2018-12-11 上海交通大学 Pulmonary nodule detection method based on cross-module state
CN109190684A (en) * 2018-08-15 2019-01-11 西安电子科技大学 SAR image sample generating method based on sketch and structural generation confrontation network
CN109190722A (en) * 2018-08-06 2019-01-11 大连民族大学 Font style based on language of the Manchus character picture migrates transform method
US20190065880A1 (en) * 2017-08-28 2019-02-28 Abbyy Development Llc Reconstructing document from series of document images
CN109460735A (en) * 2018-11-09 2019-03-12 中国科学院自动化研究所 Document binary processing method, system, device based on figure semi-supervised learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A. SAKILA et al.: "A hybrid approach for document image binarization", 《2017 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTING AND INFORMATICS (ICICI)》 *
JINYUAN ZHAO et al.: "An Effective Binarization Method for Disturbed Camera-Captured Document Images", 《2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR)》 *
童立靖 (Tong Lijing) et al.: "Document image binarization algorithm VFCM" (文档图像二值化算法VFCM), 《文档图像二值化算法VFCM》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516202A (en) * 2019-08-20 2019-11-29 Oppo广东移动通信有限公司 Acquisition methods, document structure tree method, apparatus and the electronic equipment of document generator
CN110717523A (en) * 2019-09-20 2020-01-21 湖北工业大学 D-LinkNet-based low-quality document image binarization method
CN110895828A (en) * 2019-12-03 2020-03-20 武汉纺织大学 Model and method for generating MR (magnetic resonance) image simulating heterogeneous flexible biological tissue
CN110895828B (en) * 2019-12-03 2023-04-18 武汉纺织大学 Model and method for generating MR (magnetic resonance) image simulating heterogeneous flexible biological tissue
CN111695596A (en) * 2020-04-30 2020-09-22 华为技术有限公司 Neural network for image processing and related equipment
WO2022178949A1 (en) * 2021-02-26 2022-09-01 平安科技(深圳)有限公司 Semantic segmentation method and apparatus for electron microtomography data, device, and medium
CN112837329A (en) * 2021-03-01 2021-05-25 西北民族大学 Tibetan ancient book document image binarization method and system
CN112837329B (en) * 2021-03-01 2022-07-19 西北民族大学 Tibetan ancient book document image binarization method and system

Also Published As

Publication number Publication date
CN110097059B (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN110097059A (en) Based on file and picture binary coding method, system, the device for generating confrontation network
Crimmins Geometric filter for speckle reduction
CN108805889A (en) The fining conspicuousness method for segmenting objects of margin guide and system, equipment
FR2534400A1 (en) GRAPHIC DISPLAY METHODS AND APPARATUS
CN113160062A (en) Infrared image target detection method, device, equipment and storage medium
CN109523558A (en) A kind of portrait dividing method and system
CN115205636B (en) Image target detection method, system, equipment and storage medium
WO2016175785A1 (en) Topic identification based on functional summarization
CN113361432A (en) Video character end-to-end detection and identification method based on deep learning
Giridhar et al. A novel approach to ocr using image recognition based classification for ancient tamil inscriptions in temples
CN111461211A (en) Feature extraction method for lightweight target detection and corresponding detection method
Kovanen et al. A layered method for determining manga text bubble reading order
de Meulenaer et al. Deriving physical parameters of unresolved star clusters-V. M 31 PHAT star clusters
EP1553508A1 (en) Method for realizing an electrical wiring diagram
CN102663715A (en) Super-resolution method and device
CN112800259A (en) Image generation method and system based on edge closure and commonality detection
Zimmermann Chemical Structure Reconstruction with chemoCR.
Oriot et al. Building extraction from stereoscopic aerial images
JP2012160002A (en) Layout template generation device and image layout device
CN111260659A (en) Image interactive segmentation method based on initial annotation point guidance
CN111145178A (en) High-resolution remote sensing image multi-scale segmentation method
Sreedevi et al. Enhancement of inscription images
Bejinariu et al. Recovery of old dialectal materials and maps through image processing
Charrada et al. Development of a database with ground truth for old documents analysis
CN114627292B (en) Industrial shielding target detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant