CN106157284B - The localization method and device of character area in image - Google Patents
- Publication number
- CN106157284B (application CN201510151823.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- parameter
- text
- space distribution
- unit
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Character Input (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for locating text regions in an image, comprising: building a parametric regression model; generating, by the parametric regression model, text spatial distribution parameters corresponding to the image; and reconstructing, according to the text spatial distribution parameters, a text/non-text binary image corresponding to the image. An embodiment of the invention also discloses a device for locating text regions in an image. Compared with the prior art, the technical solution of the embodiments abandons the traditional approach of locating text regions from image contours or region features. Instead, it locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This avoids interference from image size, font, color, language and the like, making localization more accurate and more robust. Because the method is based on the most basic semantic features of the image, it is applicable to images of various formats and is therefore general-purpose.
Description
Technical field
The present invention relates to the field of image and text processing, and more specifically to a method and device for locating text regions in an image.
Background art
In online transactions, consumers cannot inspect goods directly, so product images have become an important means for merchants to describe goods to consumers. However, to attract more attention to their goods, some merchants embed false promotional text in product images, which not only causes malicious competition but also degrades the consumer experience. E-commerce websites therefore audit the text content of product images in order to monitor them.
In general, the existing process for auditing text in product images is as follows. First, the text regions in the product image are located to determine where the text is. Then, based on the located text regions, the text content is refined to obtain clear text. The technique currently used to locate text in product images usually proceeds in three steps: first, text-salient regions in the product image are coarsely located so as to exclude as much background as possible; second, information such as edges and color is analyzed further to filter and merge text regions into candidate text lines; finally, a classifier verifies the candidate text-line regions to obtain the true text-line regions.
However, the size, fonts, colors and languages of product images are all uncertain, and text in a product image is easily confused with the image's complex background. This greatly interferes with locating text regions, so that the text regions cannot be located, clear text content cannot be obtained, and product images cannot be audited automatically.
Summary of the invention
To overcome the problems in the prior art, the present invention provides a method and device for locating text regions in an image.
In a first aspect, the present invention provides a method for locating text regions in an image, comprising: building a parametric regression model; generating, by the parametric regression model, text spatial distribution parameters corresponding to an image; and reconstructing, according to the text spatial distribution parameters, the text/non-text binary image corresponding to the image.
In a first possible implementation of the first aspect, building the parametric regression model comprises: obtaining target text spatial distribution parameters of the parametric regression model; inputting a test image into the parametric regression model to generate test text spatial distribution parameters; computing a current error from the target text spatial distribution parameters and the test text spatial distribution parameters; calculating the difference between the current error and a baseline error, where the baseline error is the error obtained in the previous computation; judging whether the difference is less than a first preset threshold; if the difference is greater than or equal to the first preset threshold, adjusting the unknown parameters of the parametric regression model according to the current error, setting the current error as the baseline error, and repeating from the step of inputting the test image into the parametric regression model to generate test text spatial distribution parameters, until the difference is less than the first preset threshold; and if the difference is less than the first preset threshold, taking the current values of the unknown parameters of the parametric regression model as the model parameters.
With reference to the first aspect, in a second possible implementation, reconstructing the text/non-text binary image corresponding to the image according to the text spatial distribution parameters comprises: setting parameters in the text spatial distribution parameters that are less than a second preset threshold to 0; setting parameters in the text spatial distribution parameters that are greater than the second preset threshold to 1; converting parameter 0 and parameter 1 into binarized pixel gray values; and constructing the text/non-text binary image from the binarized pixel gray values.
With reference to the first aspect, in a third possible implementation, before setting parameters in the text spatial distribution parameters that are less than the preset threshold to 0 and setting parameters that are greater than the preset threshold to 1, the method further comprises: building a dimensionality reduction model; inputting the text spatial distribution parameters into the dimensionality reduction model; and performing dimensionality reduction on the text spatial distribution parameters by parameter reconstruction.
With reference to the first aspect, in a fourth possible implementation, building the dimensionality reduction model comprises: obtaining the text spatial distribution parameters of a pre-labeled binarized image as calibration text spatial distribution parameters; inputting the pixel gray values of the binarized image into the dimensionality reduction model to generate reconstructed text spatial distribution parameters; computing a current error from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters; calculating the difference between the current error and a baseline error, where the baseline error is the error obtained in the previous computation; judging whether the difference is less than a third preset threshold; if the difference is greater than or equal to the third preset threshold, adjusting the unknown parameters of the dimensionality reduction model according to the current error, setting the current error as the baseline error, and repeating from the step of inputting the pixel gray values of the binarized image into the dimensionality reduction model to generate reconstructed text spatial distribution parameters, until the difference is less than the third preset threshold; and if the difference is less than the third preset threshold, taking the current values of the unknown parameters of the dimensionality reduction model as the model parameters.
With reference to the first aspect, in a fifth possible implementation, obtaining the target text spatial distribution parameters of the parametric regression model comprises: reading the output data of the last layer of the dimensionality reduction model; and taking the output data of the last layer of the dimensionality reduction model as the target text spatial distribution parameters.
In a second aspect, the present invention provides a device for locating text regions in an image, comprising: a building module configured to build a parametric regression model; a generating module configured to generate, by the parametric regression model built by the building module, text spatial distribution parameters corresponding to an image; and a reconstruction module configured to reconstruct, according to the text spatial distribution parameters generated by the generating module, the text/non-text binary image corresponding to the image.
In a first possible implementation of the second aspect, the building module includes an acquiring unit, a generating unit, a computing unit, a judging unit, an adjusting unit and a determining unit. The acquiring unit is configured to obtain the target text spatial distribution parameters of the parametric regression model. The generating unit is configured to input a test image into the parametric regression model to generate test text spatial distribution parameters. The computing unit is configured to compute a current error from the target text spatial distribution parameters and the test text spatial distribution parameters, and also to calculate the difference between the current error and a baseline error, where the baseline error is the error obtained in the previous computation. The judging unit is configured to judge whether the difference is less than the first preset threshold. The adjusting unit is configured to adjust the unknown parameters of the parametric regression model according to the current error when the difference is greater than or equal to the first preset threshold. The determining unit is configured to set the current error as the baseline error when the difference is greater than or equal to the first preset threshold, and to take the current values of the unknown parameters of the parametric regression model as the model parameters when the difference is less than the first preset threshold.
With reference to the second aspect, in a second possible implementation, the reconstruction module includes a binarization unit, a conversion unit and a construction unit. The binarization unit is configured to set parameters in the text spatial distribution parameters that are less than the second preset threshold to 0 and to set parameters that are greater than the second preset threshold to 1. The conversion unit is configured to convert parameter 0 and parameter 1 into binarized pixel gray values. The construction unit is configured to construct the text/non-text binary image from the binarized pixel gray values.
With reference to the second aspect, in a third possible implementation, the device further includes an input unit and a dimensionality reduction unit. The building module is also configured to build a dimensionality reduction model. The input unit is configured to input the text spatial distribution parameters into the dimensionality reduction model. The dimensionality reduction unit is configured to perform dimensionality reduction on the text spatial distribution parameters by parameter reconstruction.
With reference to the second aspect, in a fourth possible implementation, the acquiring unit is also configured to obtain the text spatial distribution parameters of a pre-labeled binarized image as calibration text spatial distribution parameters. The generating unit is also configured to input the pixel gray values of the binarized image into the dimensionality reduction model to generate reconstructed text spatial distribution parameters. The computing unit is also configured to compute a current error from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters, and to calculate the difference between the current error and the baseline error. The judging unit is also configured to judge whether the difference is less than a third preset threshold. When the difference is greater than or equal to the third preset threshold, the adjusting unit is also configured to adjust the unknown parameters of the dimensionality reduction model according to the current error, and the determining unit is also configured to set the current error as the baseline error. When the difference is less than the third preset threshold, the determining unit is also configured to take the current values of the unknown parameters of the dimensionality reduction model as the model parameters.
With reference to the second aspect, in a fifth possible implementation, the acquiring unit includes a reading subunit configured to read the output data of the last layer of the dimensionality reduction model; the determining unit is also configured to take the output data of the last layer of the dimensionality reduction model as the target text spatial distribution parameters.
As can be seen from the above technical solutions, when locating text regions in an image, the embodiment of the present invention first builds a parametric regression model and uses it to generate text spatial distribution parameters corresponding to the image, and then constructs a text/non-text binary image from the text spatial distribution parameters so that the text and non-text parts of the image are represented explicitly. That is, the image is parameterized, and the text regions in the image are located explicitly by processing the parameters corresponding to the image. The technical solution of the embodiment therefore abandons the traditional approach of locating text regions from image contours or region features, and instead locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This avoids interference from image size, font, color, language and the like, making localization more accurate and more robust. Because the method is based on the most basic semantic features of the image, it is applicable to images of various formats and is general-purpose.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the technical solution of the present invention.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort. The above and other objects, features and advantages of the present invention will become clearer from the drawings. Identical reference numerals denote identical parts throughout the drawings. The drawings are deliberately not drawn to scale; the emphasis is on showing the gist of the present invention.
Fig. 1 is a flowchart of a method for locating text regions in an image provided by an embodiment of the present invention;
Fig. 2 is a flowchart of another method for locating text regions in an image provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of an image to be located provided by the present invention;
Fig. 4 is the text/non-text binary image corresponding to the image shown in Fig. 3;
Fig. 5 is a structural schematic diagram of a device for locating text regions in an image provided by an embodiment of the present invention;
Fig. 6 is a structural schematic diagram of another device for locating text regions in an image provided by an embodiment of the present invention.
Detailed description of embodiments
Existing ways of locating text regions in images include text region localization methods based on region feature extraction (Maximally Stable Extremal Regions, MSER) or on the stroke width transform (Stroke Width Transform, SWT). Most existing methods rely on hand-designed features and rules, which generalize poorly. They work reasonably well for regular, uniform text regions, for example text of a single color with consistent columns, lines and spacing, but when detecting text regions in complex and varied images they easily produce false detections and are not robust. The technical solution of the present invention is proposed to solve these technical problems.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a flowchart of a method for locating text regions in an image provided by an embodiment of the present invention. The method includes the following steps.
Step S101: build a parametric regression model.
The embodiment of the present invention converts an image into text spatial distribution parameters by parametric regression. So that the text spatial distribution parameters of an image can be obtained accurately, the technical solution of the embodiment may build the parametric regression model by learning from labeled samples. In this embodiment, the parametric regression model may be a deep convolutional neural network (DCNN), a deep neural network (DNN), a support vector machine (SVM), AdaBoost, or the like.
Specifically, this embodiment illustrates building the parametric regression model using DCNN learning and optimization as an example. The parametric regression model is determined first; it may take the form of formulas (1) to (4) below, where S is the target text spatial distribution parameters of the parametric regression model and x is the input text image. S and x satisfy the mapping relation F, as shown in formula (1). In the embodiment of the present invention F is a nonlinear mapping function whose mapping relation is given by formula (2). f_i in formula (2) is the mapping function of each layer, whose form is given by formula (3). σ in formula (3) is the activation function; for example, formula (4) shows the activation function of the last layer.
$F: S \leftarrow x$ (1)
$f_i(a_{i-1}) = \sigma(W_i a_{i-1} + b_i) \triangleq a_i,\quad i = 1, \ldots, k-1$ (3)
$f_k(a_{k-1}) = W_k a_{k-1} + b_k \triangleq S_0$ (4)
In this embodiment, in the initial state the values of W_i, b_i are preset to arbitrary non-zero parameters, which may be random natural numbers. Because the initial values of W_i, b_i are arbitrary, the parametric regression model may not be the best model, and the text spatial distribution parameters computed from an input sample image may differ considerably from the target text spatial distribution parameters. Building the parametric regression model is therefore the process of optimizing W_i, b_i in the model. Since the degree of optimization of the model is reflected by the error of the text spatial distribution parameters, in the embodiment of the present invention the values of W_i, b_i can be adjusted according to the error so as to optimize the model.
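The layer-wise mapping of formulas (3) and (4) can be sketched as a forward pass. This is only an illustration under stated assumptions: it uses fully connected layers instead of the convolutions of a real DCNN, takes the sigmoid as the activation σ, treats the input as a_0 = x, assumes the overall mapping is the composition of the per-layer maps, and picks hypothetical layer widths. The initial W_i, b_i are drawn as random non-zero values, as the text describes.

```python
import numpy as np

def sigmoid(z):
    """Assumed activation function sigma; the source does not name one."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Apply layers 1..k-1 with activation (formula 3), then a linear last layer (formula 4)."""
    a = x  # assumption: a_0 is the input x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = sigmoid(W @ a + b)               # a_i = sigma(W_i a_{i-1} + b_i)
    return weights[-1] @ a + biases[-1]      # S_0 = W_k a_{k-1} + b_k

rng = np.random.default_rng(0)
sizes = [12, 8, 6, 4]  # hypothetical widths: input, two hidden layers, output
# Arbitrary non-zero initial values for W_i, b_i.
weights = [rng.uniform(0.1, 1.0, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.uniform(0.1, 1.0, size=m) for m in sizes[1:]]

x = rng.random(sizes[0])          # stand-in for a flattened input image
S0 = forward(x, weights, biases)  # untrained output, one value per output node
```

With untrained weights, `S0` will generally be far from any target S, which is why the optimization of W_i, b_i is needed.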
Specifically, a binary image in which the text regions have been labeled is used as a sample. The text spatial distribution parameters of the sample serve as the target text spatial distribution parameters of the parametric regression function, i.e. S in formula (1), and the RGB image of the sample serves as the input sample image, i.e. x in formula (1). The text spatial distribution parameters computed by the current parametric regression model are taken as the test text spatial distribution parameters. The error between the test and target text spatial distribution parameters is computed, and then the difference between the current error and the previous error. If the difference is less than the first preset threshold, W_i, b_i are considered to have converged to their optimal values and can be used as the model parameters of the parametric regression model. If the difference is greater than or equal to the first preset threshold, the current W_i, b_i have not converged to their optimal values; W_i, b_i can then be adjusted according to the current error to reduce the error between the test and target text spatial distribution parameters. The RGB image of the sample is then input into the parametric regression model again as the input sample image to generate new test text spatial distribution parameters, a new error is computed, and the difference between the new error and the previous error is calculated, until the difference is less than the first preset threshold.
It should be noted that, in this embodiment, when the sample image is input into the parametric regression model for the first time, no previous error exists; therefore, when calculating the difference between the current error and the previous error, the previous error is set to 0.
In addition, the first preset threshold may be set according to the specific functional relation of the parametric regression model and empirical values; it is not a fixed value, and the present invention does not elaborate on it here.
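The stopping rule described above (compare the current error with the previous one, stop once the change falls below the first preset threshold, and start with a previous error of 0) can be sketched as follows. This is a minimal sketch, not the patented training procedure: it assumes a single linear layer in place of the DCNN, mean squared error as the error measure, plain gradient descent as the adjustment rule, and the absolute value of the difference in the comparison. The names `train`, `lr` and `threshold` are hypothetical.

```python
import numpy as np

def train(x, target, lr=0.1, threshold=1e-8, max_iter=10000):
    """Adjust W, b until the error changes by less than `threshold` between passes."""
    rng = np.random.default_rng(0)
    W = rng.uniform(0.1, 1.0, size=(target.size, x.size))  # arbitrary non-zero start
    b = rng.uniform(0.1, 1.0, size=target.size)
    baseline_error = 0.0  # no previous error exists on the first pass
    for _ in range(max_iter):
        test = W @ x + b                        # test parameters from the current model
        residual = test - target
        error = float(np.mean(residual ** 2))   # current error (assumed: MSE)
        if abs(error - baseline_error) < threshold:
            break                               # converged: keep the current W, b
        grad = 2.0 * residual / target.size     # gradient of the MSE w.r.t. the output
        W -= lr * np.outer(grad, x)             # adjust unknown parameters by the error
        b -= lr * grad
        baseline_error = error                  # current error becomes the baseline
    return W, b, error

rng = np.random.default_rng(1)
x = rng.random(8)        # stand-in for a flattened sample image
target = rng.random(4)   # stand-in for the target parameters S
W, b, error = train(x, target)
```

The threshold trades training time against accuracy, which matches the remark that it is chosen from the model's functional form and experience.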
In this embodiment, the parametric regression model is trained by machine learning, which not only ensures that the parameters output by the model in use are highly accurate but also avoids hand-designed features, giving wider applicability.
Step S102: generate text spatial distribution parameters corresponding to the image by the parametric regression model.
If the image is regarded as a two-dimensional space, each pixel in the image corresponds to one position in that space, and positions in the two-dimensional space can be represented by text spatial distribution parameters. Each pixel in the image therefore maps to one text spatial distribution parameter, and the position of the pixel can be represented by that parameter. As described above, the parametric regression model can be built by repeated learning from labeled samples; inputting an image into the parametric regression model generates the text spatial distribution parameters of that image, and the parameters obtained are relatively accurate.
It should be pointed out that, to determine the position of text regions in the image, the color value of each pixel must be obtained along with its position. Therefore, after an image is input, the model computes the text spatial distribution parameter of each pixel by reading the pixel's R, G and B color values. The generated text spatial distribution parameters are a series of floating-point numbers that represent pixel position and color, for example 0.5, 0.8 and so on, where each floating-point number corresponds to one pixel in the image.
In addition, when building the parametric regression model, to reduce the amount of computation and improve the efficiency with which the model processes images, the image may be normalized by the nearest-neighbor interpolation algorithm to reduce its dimensions; for example, an image whose original size is 1024*1024 is normalized to 256*256 by nearest-neighbor interpolation. It should be pointed out that, to ensure the accuracy of the generated text spatial distribution parameters, the images input when locating with the parametric regression model should have the same size as the sample images input when the model was built. For example, if the sample images input when building the parametric regression model are 256*256, the images input when using the model are also 256*256. Of course, this is only a preferred embodiment of the present invention; the size after image normalization differs for different parametric regression models, and the present invention places no limit on it. Moreover, processing images by nearest-neighbor interpolation is a technique well known to those skilled in the art and is not described again here.
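As an illustration of the normalization step, the following sketch resizes a 1024*1024 RGB image to 256*256 with nearest-neighbor sampling in NumPy. The function name `resize_nearest` and the floor-based index mapping are assumptions chosen for brevity; image libraries may use a slightly different pixel-center convention.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize: each output pixel copies one source pixel."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return image[rows[:, None], cols]         # works for grayscale and RGB arrays

# Stand-in for a 1024*1024 RGB product image.
original = np.random.default_rng(0).integers(0, 256, size=(1024, 1024, 3), dtype=np.uint8)
normalized = resize_nearest(original, 256, 256)
```

Because no pixel values are averaged, the output contains only colors that occur in the input, at one sixteenth of the original pixel count in this example.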
The arrangement of this embodiment not only yields accurate text spatial distribution parameters for the image pixels, providing an accurate data basis for locating text regions, but, by normalizing the image, also greatly reduces the computation of the parametric regression model.
Step S103: reconstruct the text/non-text binary image corresponding to the image according to the text spatial distribution parameters.
In product images, promotional or descriptive text is mostly eye-catching in order to attract consumers' attention. Even though text of different colors, positions and sizes may appear in an image, text within a small region is usually highly consistent: the pixel gray values within the region are close to one another and differ from the pixel gray values of other regions. The text regions in an image can therefore be located by analyzing the pixel gray values and positions in the image.
As described above, the text spatial distribution parameters represent the pixel gray values and positions of the image, and this embodiment detects the text regions of the image by processing the text spatial distribution parameters. To distinguish text regions from non-text regions explicitly, the embodiment of the present invention renders the text and non-text regions as a binary image of two colors.
For example, text regions are set to white and non-text regions to black. Specifically, because the generated text spatial distribution parameters are numerical values of different magnitudes, the parameters must first be binarized; the binarized parameters are then converted into binarized pixel gray values, from which the text/non-text binary image is constructed. Binarizing the text spatial distribution parameters comprises setting a second preset threshold, setting parameters smaller than the second preset threshold to 0, and setting parameters greater than the second preset threshold to 1, so that the text spatial distribution parameters take only two values. To construct a black-and-white binary image, the binarized text spatial distribution parameters are multiplied by 255 to produce the two gray values for black and white, and the text/non-text binary image is constructed from these gray values.
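The binarization and gray-value conversion can be sketched in a few lines of NumPy. The threshold value 0.5 and the function name `to_binary_image` are hypothetical; the source also does not say how a parameter exactly equal to the threshold is treated, so this sketch assumes it maps to 0.

```python
import numpy as np

def to_binary_image(params, threshold=0.5):
    """Map per-pixel parameters to 0 (black, non-text) or 255 (white, text)."""
    binary = (params > threshold).astype(np.uint8)  # 1 above the threshold, else 0
    return binary * 255                             # convert 0/1 to gray values 0/255

# Hypothetical 2*2 grid of text spatial distribution parameters.
params = np.array([[0.5, 0.8],
                   [0.1, 0.9]])
mask = to_binary_image(params)
```

Here the two parameters above the threshold (0.8 and 0.9) become white pixels and the other two become black.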
It should be pointed out that the constructed binary image has the same size as the image input into the parametric regression model. Because that input image may already have been normalized and is not the original size, the constructed binary image is not the original size either. Therefore, after the text/non-text binary image is obtained, it must also be judged whether the binary image is smaller than the original image; if so, the binary image is normalized to the original image size by nearest-neighbor interpolation.
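The size check and restoration step might look like the sketch below. The function name `restore_size` is hypothetical, and the rule used here (resize whenever the mask is smaller than the original in either dimension) is one reading of "smaller than the original image".

```python
import numpy as np

def restore_size(mask, orig_h, orig_w):
    """Scale a binary mask back up to the original image size if it is smaller."""
    h, w = mask.shape
    if h >= orig_h and w >= orig_w:
        return mask                            # already at least the original size
    rows = np.arange(orig_h) * h // orig_h     # nearest source row per output row
    cols = np.arange(orig_w) * w // orig_w     # nearest source column per output column
    return mask[np.ix_(rows, cols)]

small_mask = np.array([[0, 255],
                       [255, 0]], dtype=np.uint8)
restored = restore_size(small_mask, 4, 4)
```

Nearest-neighbor sampling suits this step because it never introduces gray values other than 0 and 255, so the restored image stays binary.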
As can be seen from the above embodiment, the method for locating text regions in an image described in the embodiment of the present invention parameterizes the image and locates the text regions in the image explicitly by processing the parameters corresponding to the image. The technical solution of the embodiment therefore abandons the traditional approach of locating text regions from image contours or region features, and instead locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This avoids interference from image size, font, color, language and the like, making localization more accurate and more robust. Because the method is based on the most basic semantic features of the image, it is applicable to images of various formats and is general-purpose.
The above embodiment describes the localization method of the embodiment of the present invention from one perspective. To make the technical solution clearer and more complete, on the basis of the above embodiment the embodiment of the present invention also describes the technical solution from another perspective. Since this embodiment supplements the above embodiment, for the parts it shares with the above embodiment, refer to the description of the above embodiment; they are not repeated here.
Referring to Fig. 2, Fig. 2 is a flowchart of another method for locating text regions in an image provided by an embodiment of the present invention. The localization method includes the following steps.
Step S201: build a parametric regression model.
In this embodiment, the parametric regression model is assumed to be a DCNN, and the sample image size used when building the DCNN is 256*256. The process of building the DCNN is detailed in the description of the above embodiment and is not repeated here.
Step S202: generate text spatial distribution parameters corresponding to the image by the parametric regression model.
Referring to Fig. 3, Fig. 3 shows an image to be located according to an embodiment of the present invention. In this image, region 01, region 02 and region 03 are text regions, and the other regions are background. Assume that the image is 1024*1024 in size. Because the sample images used to construct the DCNN are 256*256, the image in Fig. 3 must be normalized to 256*256 by nearest-neighbor interpolation before it is input to the DCNN model. The normalized image is then input to the DCNN, which reads the R, G and B values of each pixel, performs its computation, and generates one text spatial distribution parameter for each pixel.
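The normalization step can be illustrated with a minimal sketch. It assumes a NumPy array as the image container and a floor-based index mapping, which is only one of several nearest-neighbor conventions; the function name `resize_nearest` and the synthetic test image are hypothetical and do not come from the patent.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an H x W (x C) array by nearest-neighbor sampling (floor mapping)."""
    in_h, in_w = image.shape[:2]
    # For each output row/column, pick the source index it maps onto.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting rows against cols selects one source pixel per output pixel.
    return image[rows[:, None], cols]

# A synthetic 1024 x 1024 RGB image is normalized to the 256 x 256 sample size.
image = np.zeros((1024, 1024, 3), dtype=np.uint8)
image[400:500, 200:800] = 255  # bright band standing in for a text region
normalized = resize_nearest(image, 256, 256)
```

Nearest-neighbor sampling only copies existing pixel values, so the R, G and B values read by the model are values that occur in the original image.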
Step S203: construct a dimensionality reduction model.
To reduce the amount of data processed when the binary image is constructed from the text spatial distribution parameters, the parameters may be subjected to dimensionality reduction. Because this reduction is performed by a dimensionality reduction model, such a model must be constructed. The dimensionality reduction model has multiple network layers and multiple nodes. The first layer receives the input data and performs its computation, during which the nodes are merged a first time. The output data of the first layer serves as the input data of the second layer, where the nodes are merged a second time; the output data of the second layer serves as the input data of the third layer, and so on until the output data of the last layer is obtained. Dimensionality reduction is thus accomplished by merging nodes layer by layer. In this embodiment, the dimensionality reduction model may be a deep Boltzmann machine (DBM), a deep belief network (DBN), a restricted Boltzmann machine (RBM), or the like. To avoid hand-designed features, the dimensionality reduction model, like the parametric regression model, may also be constructed by learning from labeled samples.
This embodiment describes the construction of the dimensionality reduction model in detail, taking a DBM as an example. First, a three-layer DBM model is constructed; see formula (4), where v denotes the visible variables, h1 and h2 are the hidden-layer variables of the second and third layers respectively, w is the weight of the edges connecting node units, and b and c are the biases of the node units. As when constructing the parametric regression model, the above unknown parameters are set to arbitrary non-zero values in the initial state, and their optimal values are determined by training on samples.
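Formula (4) itself is not reproduced in this text. For orientation only, the energy function commonly used for a DBM with one visible layer and two hidden layers is shown below. It is an assumption that the patent's formula takes this standard form; the layer-indexed symbols W^{(1)}, W^{(2)}, c^{(1)} and c^{(2)} are introduced here for illustration and are not notation from the source.

```latex
E(v, h_1, h_2) = -\, v^{\top} W^{(1)} h_1 \;-\; h_1^{\top} W^{(2)} h_2
                 \;-\; b^{\top} v \;-\; c^{(1)\top} h_1 \;-\; c^{(2)\top} h_2,
\qquad
P(v) = \frac{1}{Z} \sum_{h_1, h_2} \exp\bigl(-E(v, h_1, h_2)\bigr)
```

Here W^{(1)} and W^{(2)} play the role of the edge weights w, b is the bias of the visible units, c^{(1)} and c^{(2)} are the biases of the two hidden layers, and Z is the normalizing constant.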
A binary image labeled in advance is used as the sample. The text spatial distribution parameters of this binary image are obtained and serve as the calibration text spatial distribution parameters for training the DBM model, and the pixel gray values of the sample are input to the dimensionality reduction model to generate reconstructed text spatial distribution parameters. Because the reconstructed parameters are produced by the dimensionality reduction model, the quality of the values of the unknown parameters in the model is reflected directly in the error between the reconstructed and the calibration text spatial distribution parameters. As with the parametric regression model, the dimensionality reduction model can therefore be optimized on the basis of this error.
Specifically, the current error is computed from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters, and the difference between the current error and the previous error is calculated. If the difference is less than a third preset threshold, the unknown parameters are considered to have converged to their optimal values and can be used as the model parameters of the dimensionality reduction model. If the difference is greater than or equal to the third preset threshold, the current unknown parameters have not converged to their optimal values; they are adjusted according to the current error so as to reduce the error between the reconstructed and the calibration text spatial distribution parameters. The pixel gray values of the sample are then input to the dimensionality reduction model again to generate new reconstructed text spatial distribution parameters, a new error is computed, and the difference between this new error and the previous error is calculated. This is repeated until the difference is less than the third preset threshold.
Note that, in this embodiment, when the pixel gray values of the sample are input to the dimensionality reduction model for the first time, no previous error exists. The previous error is therefore set to 0 when the difference between the current error and the previous error is calculated.
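The stopping rule described above can be sketched as a generic loop. This is an illustration under stated assumptions, not the patent's training procedure: the "difference" is taken as an absolute difference, the helper `train_until_converged` is hypothetical, and a one-weight least-squares toy model with a gradient step stands in for the DBM and its parameter adjustment.

```python
from typing import Callable, Tuple

import numpy as np

def train_until_converged(compute_error: Callable[[], float],
                          adjust: Callable[[float], None],
                          threshold: float,
                          max_iters: int = 10_000) -> Tuple[float, int]:
    """Repeat error computation and adjustment until the error stops changing."""
    previous_error = 0.0  # no previous error exists on the first pass
    for iteration in range(1, max_iters + 1):
        current_error = compute_error()
        # Assumption: "difference" means the absolute difference of the two errors.
        if abs(current_error - previous_error) < threshold:
            return current_error, iteration
        adjust(current_error)           # reduce the reconstruction error
        previous_error = current_error  # current error becomes the previous error
    raise RuntimeError("no convergence within max_iters")

# Toy stand-in for the model: a single weight w reconstructs target from x.
x = np.array([0.0, 1.0, 1.0, 0.0])
target = np.array([0.0, 0.5, 0.5, 0.0])
state = {"w": 0.1}  # arbitrary non-zero initial value

def compute_error() -> float:
    return float(np.mean((state["w"] * x - target) ** 2))

def adjust(_error: float) -> None:
    gradient = float(np.mean(2.0 * (state["w"] * x - target) * x))
    state["w"] -= 0.5 * gradient

final_error, iterations = train_until_converged(compute_error, adjust, threshold=1e-8)
```

One consequence of initializing the previous error to 0 is that a first error already smaller than the threshold ends training immediately, so the threshold should be small relative to the expected initial error.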
In addition, inputting the pixel gray values of the sample to the dimensionality reduction model to generate the reconstructed text spatial distribution parameters specifically includes the following. The pixel gray values of the binary image are input to the first layer of the DBM model in a preset order. The output data computed by the first layer is used as the input data of the second layer, and the output data of the second layer is used as the input data of the third layer. Starting from the first layer, the DBM model thus passes the output data of each layer to the next layer as input data until the output data of the last layer is obtained. An inverse operation is then performed on the output data of the last layer to obtain the reconstructed text spatial distribution parameters of the binary image.
Note that the binary image labeled in advance is a two-dimensional image, whereas the data input when training the DBM model should be one-dimensional. The data is therefore read in a preset row-wise or column-wise order.
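The read order and the layer-by-layer pass described above can be sketched as follows. This is a simplified, deterministic illustration: it uses sigmoid units and reuses the transposed weights for the inverse pass, which is closer to an autoencoder-style pass than to full DBM inference. The layer sizes, random weights and function names are assumptions chosen for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(v, weights, hidden_biases):
    """Upward pass: the output of each layer is the input of the next layer."""
    h = v
    for w, c in zip(weights, hidden_biases):
        h = sigmoid(h @ w + c)
    return h

def decode(h, weights, lower_biases):
    """Inverse pass: apply the transposed weights from the top layer downward."""
    v = h
    for w, b in zip(reversed(weights), reversed(lower_biases)):
        v = sigmoid(v @ w.T + b)
    return v

rng = np.random.default_rng(0)
image = np.zeros((8, 8))
image[2:4, 1:6] = 1.0  # small labeled binary image

row_order = image.ravel(order="C")  # read row by row
col_order = image.ravel(order="F")  # read column by column

sizes = [64, 16, 4]  # illustrative sizes, not the 1024 and 256 nodes of the embodiment
weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
hidden_biases = [np.zeros(n) for n in sizes[1:]]
lower_biases = [np.zeros(n) for n in sizes[:-1]]

code = encode(row_order, weights, hidden_biases)      # output of the last layer
reconstruction = decode(code, weights, lower_biases)  # reconstructed parameters
```

Whichever read order is chosen, the same order has to be used when the one-dimensional reconstruction is reshaped back into a two-dimensional image.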
In addition, the DBM model in this embodiment has three network layers; the second layer may have 1024 nodes and the third layer may have 256 nodes. Of course, this embodiment is only a preferred example of the present invention. When the network is designed, a different number of network layers and a different number of nodes per layer may be set as required, and the present invention is not limited in this respect.
Step S204: input the text spatial distribution parameters to the dimensionality reduction model.
Step S205: perform dimensionality reduction on the text spatial distribution parameters by means of parameter reconstruction.
The text spatial distribution parameters that the DCNN generates for the image to be located are input to the DBM, which reconstructs the parameters layer by layer using the computation described in the above steps. The data output by the last layer of the DBM are the text spatial distribution parameters after dimensionality reduction. Parameter reconstruction is a technique commonly used by those skilled in the art and is not described in detail here.
As can be seen from the above embodiment, the text spatial distribution parameters output by the DCNN are floating-point numbers. The DBM reduces the dimensionality of the text spatial distribution parameters by parameter reconstruction and does not change the parameter values, so the text spatial distribution parameters after dimensionality reduction are still floating-point numbers.
Note that the DCNN and the DBM process the same image and that, as described above, the DBM uses the output data of its last hidden layer as the extracted features. To improve the stability and robustness of text region localization, the output data of the last layer of the DBM may therefore be used as the target text spatial distribution parameters of the DCNN when the models are constructed. With this arrangement, first, the DCNN and the DBM are trained on the same samples and are trained and used jointly, which can greatly improve localization performance. Second, because the output data of the last layer of the DBM are the extracted features, they are both representative and small in volume; using them as the target text spatial distribution parameters for training the DCNN can significantly reduce the amount of computation while maintaining training accuracy.
In this embodiment, combining the parametric regression model with the dimensionality reduction model can greatly improve localization performance and make the processing results more robust.
Step S206: set the parameters in the text spatial distribution parameters that are less than a preset threshold to 0, and set the parameters in the text spatial distribution parameters that are greater than the preset threshold to 1.
Specifically, in this embodiment, the text spatial distribution parameters after DBM dimensionality reduction are binarized.
Step S207: convert parameter 0 and parameter 1 to binarized pixel gray values.
In this embodiment, to convert Fig. 3 into a black-and-white binary image, the binarized parameters are multiplied by 255 to obtain the pixel gray values 0 and 255, where a pixel gray value of 255 indicates that the pixel is white and a pixel gray value of 0 indicates that the pixel is black. Of course, this is only a preferred example of the present invention. The binarized parameters may also be converted into another color and white, as long as region 01, region 02, region 03 and the background region can be clearly distinguished, and the present invention is not limited in this respect.
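Steps S206 and S207 amount to a threshold followed by a scaling, as in the sketch below. The threshold value 0.5 and the function names are assumptions. The text does not say how a parameter exactly equal to the threshold is treated; the sketch maps it to 0.

```python
import numpy as np

def binarize(params, threshold=0.5):
    """Map parameters above the threshold to 1 and all others to 0."""
    # Assumption: a value exactly equal to the threshold is mapped to 0.
    return (params > threshold).astype(np.uint8)

def to_gray(bits):
    """Multiply the binarized parameters by 255 to get pixel gray values."""
    return (bits * 255).astype(np.uint8)

params = np.array([[0.1, 0.9],
                   [0.5, 0.7]])  # floating-point text spatial distribution parameters
bits = binarize(params)
gray = to_gray(bits)
```

With this convention a parameter of 1 becomes gray value 255 (white) and a parameter of 0 becomes gray value 0 (black).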
Step S208: reconstruct the text/non-text binary image corresponding to the image according to the text spatial distribution parameters.
Fig. 4 shows the black-and-white binary image corresponding to Fig. 3, constructed from the pixel gray values. Region 01, region 02 and region 03 are white and the background region is black, so that the three text regions in Fig. 3 are explicitly located.
Furthermore, note that Fig. 3 is normalized to 256*256 before it is input to the DCNN, and that the text spatial distribution parameters corresponding to an image of this size also represent a 256*256 two-dimensional space. The binary image generated after the DBM is therefore also 256*256, whereas the original image in Fig. 3 is 1024*1024. After the binary image is generated, it must therefore be resized to 1024*1024 by nearest-neighbor interpolation to obtain the image shown in Fig. 4.
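Because 1024 is an integer multiple of 256, the final resizing can be sketched by repeating each pixel along both axes, which gives the same result as nearest-neighbor interpolation for integer scale factors. The function name `upscale_nearest` and the synthetic mask are hypothetical.

```python
import numpy as np

def upscale_nearest(mask, factor):
    """Enlarge a 2-D mask by an integer factor by repeating every pixel."""
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

small_mask = np.zeros((256, 256), dtype=np.uint8)
small_mask[100:125, 50:200] = 255  # white block standing in for a located text region
full_mask = upscale_nearest(small_mask, 4)  # 256*256 -> 1024*1024
```

Pixel repetition introduces no intermediate gray values, so the enlarged image still contains only 0 and 255.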
As can be seen from the above technical solution, when locating a text region in an image, the embodiment of the present invention first constructs a parametric regression model, then generates text spatial distribution parameters corresponding to the image by means of the parametric regression model, and then constructs a text/non-text binary image from the text spatial distribution parameters, so that the text and non-text parts of the image are explicitly represented. That is, the image is parameterized, and the text regions in the image are explicitly located by processing the parameters corresponding to the image. The technical solution of the embodiment of the present invention thus completely abandons the traditional approach of locating text regions by image contours or region features. Instead, it locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This not only avoids interference from image size, font, color, language and the like, making localization more accurate and more robust; because the method is based on the most basic semantic features of an image, it is also applicable to images in various formats and is therefore versatile.
Corresponding to the above method, an embodiment of the present invention further provides a device for locating a text region in an image. Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a device for locating a text region in an image according to an embodiment of the present invention. The device includes a building module 11, a generation module 12 and a reconstruction module 13. The building module 11 is configured to construct a parametric regression model. The generation module 12 is configured to generate text spatial distribution parameters corresponding to an image by means of the parametric regression model constructed by the building module 11. The reconstruction module 13 is configured to reconstruct the text/non-text binary image corresponding to the image according to the text spatial distribution parameters generated by the generation module 12.
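The division of labor among the three modules can be sketched as three small classes. The class and method names are hypothetical, and the "model" is a placeholder that returns per-pixel brightness rather than a trained DCNN; the sketch only shows how the output of one module feeds the next.

```python
import numpy as np

class BuildingModule:
    """Stands in for building module 11: returns a model as a callable."""
    def build(self):
        # Placeholder model (assumption): per-pixel brightness scaled to [0, 1].
        return lambda rgb: rgb.astype(np.float64).mean(axis=2) / 255.0

class GenerationModule:
    """Stands in for generation module 12: applies the built model to an image."""
    def __init__(self, model):
        self.model = model

    def generate(self, rgb):
        return self.model(rgb)

class ReconstructionModule:
    """Stands in for reconstruction module 13: parameters -> binary image."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def reconstruct(self, params):
        return ((params > self.threshold) * 255).astype(np.uint8)

rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 1] = 255  # one bright pixel
model = BuildingModule().build()
params = GenerationModule(model).generate(rgb)
mask = ReconstructionModule(threshold=0.5).reconstruct(params)
```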
For the implementation of the functions and roles of each unit in the device, see the corresponding implementation in the above method; details are not repeated here.
As can be seen from this embodiment, the method for locating a text region in an image according to the embodiment of the present invention parameterizes the image and explicitly locates the text regions in the image by processing the parameters corresponding to the image. The technical solution of the embodiment of the present invention thus completely abandons the traditional approach of locating text regions by image contours or region features. Instead, it locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This not only avoids interference from image size, font, color, language and the like, making localization more accurate and more robust; because the method is based on the most basic semantic features of an image, it is also applicable to images in various formats and is therefore versatile.
On the basis of the above embodiment, in this embodiment the building module 11 includes an acquiring unit, a generation unit, a computing unit, a judging unit, an adjustment unit and a determination unit. The acquiring unit is configured to obtain the target text spatial distribution parameters of the parametric regression model. The generation unit is configured to input a test image to the parametric regression model to generate test text spatial distribution parameters. The computing unit is configured to compute the current error from the target text spatial distribution parameters and the test text spatial distribution parameters, and is further configured to calculate the difference between the current error and a reference error, where the reference error is the error obtained in the previous computation. The judging unit is configured to judge whether the difference is less than a first preset threshold. When the difference is greater than or equal to the first preset threshold, the adjustment unit is configured to adjust the unknown parameters of the parametric regression model according to the current error, and the determination unit is configured to set the current error as the reference error. When the difference is less than the first preset threshold, the determination unit is further configured to determine the current values of the unknown parameters of the parametric regression model as the model parameters.
The reconstruction module 13 includes a binarization unit, a converting unit and a construction unit. The binarization unit is configured to set the parameters in the text spatial distribution parameters that are less than a second preset threshold to 0, and to set the parameters in the text spatial distribution parameters that are greater than the second preset threshold to 1. The converting unit is configured to convert parameter 0 and parameter 1 to binarized pixel gray values. The construction unit is configured to construct the text/non-text binary image according to the binarized pixel gray values.
To describe the technical solution of the present invention in further detail, an embodiment of the present invention provides another device for locating a text region in an image. Referring to Fig. 6, Fig. 6 is a schematic structural diagram of another device for locating a text region in an image according to an embodiment of the present invention. The device includes a building module 21, a generation module 22, an input unit 23, a dimensionality reduction unit 24 and a reconstruction module 25. The functions and roles of the building module 21, the generation module 22 and the reconstruction module 25 are similar to those in the foregoing embodiment and are not repeated here. In this embodiment, the building module 21 is further configured to construct a dimensionality reduction model; the input unit 23 is configured to input the text spatial distribution parameters to the dimensionality reduction model; and the dimensionality reduction unit 24 is configured to perform dimensionality reduction on the text spatial distribution parameters by means of parameter reconstruction.
In this embodiment, the acquiring unit in the building module 21 is further configured to obtain the text spatial distribution parameters of a binary image labeled in advance as the calibration text spatial distribution parameters. The generation unit is further configured to input the pixel gray values of the binary image to the dimensionality reduction model to generate reconstructed text spatial distribution parameters. The computing unit is further configured to compute the current error from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters, and to calculate the difference between the current error and the reference error. The judging unit is further configured to judge whether the difference is less than a third preset threshold. When the difference is greater than or equal to the third preset threshold, the adjustment unit is further configured to adjust the unknown parameters of the dimensionality reduction model according to the current error, and the determination unit is further configured to set the current error as the reference error. When the difference is less than the third preset threshold, the determination unit is further configured to determine the current values of the unknown parameters of the dimensionality reduction model as the model parameters.
In combination with the above embodiment, in this embodiment the acquiring unit includes a reading subunit configured to read the output data of the last layer of the dimensionality reduction model, and the determination unit is further configured to determine the output data of the last layer of the dimensionality reduction model as the target text spatial distribution parameters.
For the implementation of the functions and roles of each unit in the device, see the corresponding implementation in the above method; details are not repeated here.
In summary, when locating a text region in an image, the embodiment of the present invention first constructs a parametric regression model, then generates text spatial distribution parameters corresponding to the image by means of the parametric regression model, and then constructs a text/non-text binary image from the text spatial distribution parameters, so that the text and non-text parts of the image are explicitly represented. That is, the image is parameterized, and the text regions in the image are explicitly located by processing the parameters corresponding to the image. The technical solution of the embodiment of the present invention thus completely abandons the traditional approach of locating text regions by image contours or region features. Instead, it locates text regions by analyzing a deeper semantic feature of the image, namely its text spatial distribution parameters. This not only avoids interference from image size, font, color, language and the like, making localization more accurate and more robust; because the method is based on the most basic semantic features of an image, it is also applicable to images in various formats and is therefore versatile.
Other embodiments of the present invention will readily occur to those skilled in the art after they consider the specification and practice the invention disclosed here. The present invention is intended to cover any variations, uses or adaptations of the invention that follow its general principles and include common knowledge or conventional techniques in the art that are not disclosed by the present invention. The specification and embodiments are to be regarded as illustrative only; the true scope and spirit of the invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (8)
1. A method for locating a text region in an image, characterized by comprising:
constructing a parametric regression model;
generating text spatial distribution parameters corresponding to an image by means of the parametric regression model;
reconstructing a text/non-text binary image corresponding to the image according to the text spatial distribution parameters;
wherein the reconstructing a text/non-text binary image corresponding to the image according to the text spatial distribution parameters comprises:
setting the parameters in the text spatial distribution parameters that are less than a second preset threshold to 0, and setting the parameters in the text spatial distribution parameters that are greater than the second preset threshold to 1;
converting parameter 0 and parameter 1 to binarized pixel gray values;
constructing the text/non-text binary image according to the binarized pixel gray values;
and wherein, before the setting of the parameters in the text spatial distribution parameters that are less than the preset threshold to 0 and the setting of the parameters in the text spatial distribution parameters that are greater than the preset threshold to 1, the method further comprises:
constructing a dimensionality reduction model, the dimensionality reduction model being a three-layer DBM model constructed using the following formula [formula not reproduced in this text], where v denotes the visible variables, h1 and h2 are the hidden-layer variables of the second and third layers respectively, w is the weight of the edges connecting node units, and b and c are the biases of the node units;
inputting the text spatial distribution parameters to the dimensionality reduction model;
performing dimensionality reduction on the text spatial distribution parameters by means of parameter reconstruction.
2. The method for locating a text region in an image according to claim 1, characterized in that the constructing a parametric regression model comprises:
obtaining target text spatial distribution parameters of the parametric regression model;
inputting a test image to the parametric regression model to generate test text spatial distribution parameters;
computing a current error from the target text spatial distribution parameters and the test text spatial distribution parameters;
calculating the difference between the current error and a reference error, wherein the reference error is the error obtained in the previous computation;
judging whether the difference is less than a first preset threshold;
if the difference is greater than or equal to the first preset threshold, adjusting the unknown parameters of the parametric regression model according to the current error, setting the current error as the reference error, and repeating the step of inputting the test image to the parametric regression model to generate test text spatial distribution parameters, until the difference is less than the first preset threshold;
if the difference is less than the first preset threshold, determining the current values of the unknown parameters of the parametric regression model as the model parameters.
3. The method for locating a text region in an image according to claim 1, characterized in that the constructing a dimensionality reduction model comprises:
obtaining the text spatial distribution parameters of a binary image labeled in advance as calibration text spatial distribution parameters;
inputting the pixel gray values of the binary image to the dimensionality reduction model to generate reconstructed text spatial distribution parameters;
computing a current error from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters;
calculating the difference between the current error and a reference error, wherein the reference error is the error obtained in the previous computation;
judging whether the difference is less than a third preset threshold;
if the difference is greater than or equal to the third preset threshold, adjusting the unknown parameters of the dimensionality reduction model according to the current error, setting the current error as the reference error, and repeating the step of inputting the pixel gray values of the binary image to the dimensionality reduction model to generate reconstructed text spatial distribution parameters, until the difference is less than the third preset threshold;
if the difference is less than the third preset threshold, determining the current values of the unknown parameters of the dimensionality reduction model as the model parameters.
4. The method for locating a text region in an image according to any one of claims 2 to 3, characterized in that the obtaining target text spatial distribution parameters of the parametric regression model comprises:
reading the output data of the last layer of the dimensionality reduction model;
determining the output data of the last layer of the dimensionality reduction model as the target text spatial distribution parameters.
5. A device for locating a text region in an image, characterized by comprising:
a building module, configured to construct a parametric regression model;
a generation module, configured to generate text spatial distribution parameters corresponding to an image by means of the parametric regression model constructed by the building module;
a reconstruction module, configured to reconstruct a text/non-text binary image corresponding to the image according to the text spatial distribution parameters generated by the generation module;
wherein the reconstruction module comprises a binarization unit, a converting unit and a construction unit, wherein
the binarization unit is configured to set the parameters in the text spatial distribution parameters that are less than a second preset threshold to 0, and to set the parameters in the text spatial distribution parameters that are greater than the second preset threshold to 1;
the converting unit is configured to convert parameter 0 and parameter 1 to binarized pixel gray values;
the construction unit is configured to construct the text/non-text binary image according to the binarized pixel gray values;
the device further comprises an input unit and a dimensionality reduction unit, wherein
the building module is further configured to construct a dimensionality reduction model, the dimensionality reduction model being a three-layer DBM model constructed using the following formula [formula not reproduced in this text], where v denotes the visible variables, h1 and h2 are the hidden-layer variables of the second and third layers respectively, w is the weight of the edges connecting node units, and b and c are the biases of the node units;
the input unit is configured to input the text spatial distribution parameters to the dimensionality reduction model;
the dimensionality reduction unit is configured to perform dimensionality reduction on the text spatial distribution parameters by means of parameter reconstruction.
6. The device according to claim 5, characterized in that the building module comprises an acquiring unit, a generation unit, a computing unit, a judging unit, an adjustment unit and a determination unit, wherein
the acquiring unit is configured to obtain target text spatial distribution parameters of the parametric regression model;
the generation unit is configured to input a test image to the parametric regression model to generate test text spatial distribution parameters;
the computing unit is configured to compute a current error from the target text spatial distribution parameters and the test text spatial distribution parameters, and is further configured to calculate the difference between the current error and a reference error, wherein the reference error is the error obtained in the previous computation;
the judging unit is configured to judge whether the difference is less than a first preset threshold;
when the difference is greater than or equal to the first preset threshold, the adjustment unit is configured to adjust the unknown parameters of the parametric regression model according to the current error, and the determination unit is configured to set the current error as the reference error;
when the difference is less than the first preset threshold, the determination unit is further configured to determine the current values of the unknown parameters of the parametric regression model as the model parameters.
7. The device according to claim 6, characterized in that
the acquiring unit is further configured to obtain the text spatial distribution parameters of a binary image labeled in advance as calibration text spatial distribution parameters;
the generation unit is further configured to input the pixel gray values of the binary image to the dimensionality reduction model to generate reconstructed text spatial distribution parameters;
the computing unit is further configured to compute a current error from the calibration text spatial distribution parameters and the reconstructed text spatial distribution parameters, and to calculate the difference between the current error and the reference error;
the judging unit is further configured to judge whether the difference is less than a third preset threshold;
when the difference is greater than or equal to the third preset threshold, the adjustment unit is further configured to adjust the unknown parameters of the dimensionality reduction model according to the current error, and the determination unit is further configured to set the current error as the reference error;
when the difference is less than the third preset threshold, the determination unit is further configured to determine the current values of the unknown parameters of the dimensionality reduction model as the model parameters.
8. The device according to claim 6 or 7, characterized in that the acquiring unit comprises:
a reading subunit, configured to read the output data of the last layer of the dimensionality reduction model;
and the determination unit is further configured to determine the output data of the last layer of the dimensionality reduction model as the target text spatial distribution parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510151823.9A CN106157284B (en) | 2015-04-01 | 2015-04-01 | The localization method and device of character area in image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510151823.9A CN106157284B (en) | 2015-04-01 | 2015-04-01 | The localization method and device of character area in image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106157284A CN106157284A (en) | 2016-11-23 |
CN106157284B true CN106157284B (en) | 2019-10-11 |
Family
ID=57337830
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510151823.9A Active CN106157284B (en) | 2015-04-01 | 2015-04-01 | The localization method and device of character area in image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106157284B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444903B (en) * | 2020-03-23 | 2022-12-09 | 西安交通大学 | Method, device and equipment for positioning characters in cartoon bubbles and readable storage medium |
CN111401347B (en) * | 2020-06-05 | 2020-11-10 | 支付宝(杭州)信息技术有限公司 | Information positioning method and device based on picture |
CN112668657B (en) * | 2020-12-30 | 2023-08-29 | 中山大学 | Attention-enhanced out-of-distribution image detection method based on uncertainty prediction of classifier |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101299239A (en) * | 2008-06-06 | 2008-11-05 | 北京中星微电子有限公司 | Method and device for acquiring character area image and character recognition system |
CN103679168A (en) * | 2012-08-30 | 2014-03-26 | 北京百度网讯科技有限公司 | Detection method and detection device for character region |
CN103839062A (en) * | 2014-03-11 | 2014-06-04 | 东方网力科技股份有限公司 | Image character positioning method and device |
CN104281850A (en) * | 2013-07-09 | 2015-01-14 | 腾讯科技(深圳)有限公司 | Character area identification method and device |
CN104298982A (en) * | 2013-07-16 | 2015-01-21 | 深圳市腾讯计算机系统有限公司 | Text recognition method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7720675B2 (en) * | 2003-10-27 | 2010-05-18 | Educational Testing Service | Method and system for determining text coherence |
US7436994B2 (en) * | 2004-06-17 | 2008-10-14 | Destiny Technology Corporation | System of using neural network to distinguish text and picture in images and method thereof |
Non-Patent Citations (8)
Title |
---|
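Neural Network-based Text Location for News Video Indexing;Ki-Young Jeong et al.;《Image Processing, 1999. ICIP 99. Proceedings. 1999 International Conference on》;19991028;full text * |
A text-image filtering model based on high-level semantics;宋广为 et al.;《现代电子技术》;20131101;Vol. 36 (No. 21);full text * |
Research on text detection and recognition methods for complex scenes in images and video;颜建强;《中国博士学位论文全文数据库 (信息科技辑)》;20150115;full text * |
An intrusion detection model based on deep belief networks;杨昆朋;《研究与开发》;20150131;full text * |
Real-time interactive browsing of large-scale mesh models;武宪;《中国优秀硕士学位论文全文数据库 (信息科技辑)》;20141215 (No. 12);pp. 10-13 * |
Research and implementation of text localization in print media;李晨轩;《中国优秀硕士学位论文全文数据库 (信息科技辑)》;20101015 (No. 10);pp. 8, 36 * |
A survey of support vector machines and their applications;祁亨年;《计算机工程》;20040531;Vol. 30 (No. 10);full text * |
Research on video text detection and recognition techniques;朱成军 et al.;《计算机工程》;20070531;Vol. 33 (No. 10);full text * |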
Also Published As
Publication number | Publication date |
---|---|
CN106157284A (en) | 2016-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10643130B2 (en) | Systems and methods for polygon object annotation and a method of training and object annotation system | |
CN109840531B (en) | Method and device for training multi-label classification model | |
WO2019100724A1 (en) | Method and device for training multi-label classification model | |
WO2022213879A1 (en) | Target object detection method and apparatus, and computer device and storage medium | |
CN105320965B (en) | Joint spatial-spectral hyperspectral image classification method based on deep convolutional neural networks | |
Robins et al. | Theory and algorithms for constructing discrete Morse complexes from grayscale digital images | |
Cohen-Steiner et al. | Extending persistence using Poincaré and Lefschetz duality | |
JP2019514123A (en) | Remote determination of the quantity stored in containers in geographical areas | |
Zhao et al. | Recognition of building group patterns using graph convolutional network | |
Li et al. | A hybrid method combining pixel-based and object-oriented methods and its application in Hungary using Chinese HJ-1 satellite images | |
Cao et al. | A new difference image creation method based on deep neural networks for change detection in remote-sensing images | |
JP6612486B1 (en) | Learning device, classification device, learning method, classification method, learning program, and classification program | |
Ge et al. | Multiple-point simulation-based method for extraction of objects with spatial structure from remotely sensed imagery | |
CN106157284B (en) | The localization method and device of character area in image | |
Xiao et al. | Building segmentation and modeling from airborne LiDAR data | |
CN110458166A (en) | A kind of hazardous material detection method, device and equipment based on deformable convolution | |
CN109635714A (en) | Correction method and device for scanned document images | |
Wang et al. | Multi-feature sea–land segmentation based on pixel-wise learning for optical remote-sensing imagery | |
JP2019185787A (en) | Remote determination of containers in geographical region | |
CN103136760A (en) | Multi sensor image matching method based on fast and daisy | |
Wan et al. | A geometry-aware attention network for semantic segmentation of MLS point clouds | |
CN107358244B (en) | A fast local invariant feature extraction and description method | |
CN111144466B (en) | Image sample self-adaptive depth measurement learning method | |
Chaudhuri et al. | Attention-driven cross-modal remote sensing image retrieval | |
Moreno-García et al. | Obtaining the consensus of multiple correspondences between graphs through online learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||