WO2017071064A1 - Region extraction method, model training method and apparatus (区域提取方法、模型训练方法及装置) - Google Patents

Region extraction method, model training method and apparatus

Info

Publication number
WO2017071064A1
WO2017071064A1 (PCT/CN2015/099300)
Authority
WO
WIPO (PCT)
Prior art keywords
sample image
digital
area
image
recognition model
Prior art date
Application number
PCT/CN2015/099300
Other languages
English (en)
French (fr)
Inventor
龙飞
张涛
陈志军
Original Assignee
小米科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 小米科技有限责任公司 filed Critical 小米科技有限责任公司
Priority to RU2016110914A priority Critical patent/RU2016110914A/ru
Priority to MX2016003753A priority patent/MX2016003753A/es
Priority to KR1020167005383A priority patent/KR101763891B1/ko
Priority to JP2017547047A priority patent/JP2018503201A/ja
Publication of WO2017071064A1 publication Critical patent/WO2017071064A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/10 - Image acquisition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/758 - Involving statistics of pixels or of feature values, e.g. histogram matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/146 - Aligning or centring of the image pick-up or image-field
    • G06V30/1475 - Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 - Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 - Document-oriented image-based pattern recognition
    • G06V30/41 - Analysis of document content
    • G06V30/413 - Classification of content, e.g. text, photographs or tables

Definitions

  • the present disclosure relates to the field of image processing, and in particular, to a region extraction method, a model training method, and an apparatus.
  • Digital area extraction is a technique for extracting digital areas from an image.
  • the area extraction method for numbers generally recognizes only an area in which numbers of a predetermined size and a predetermined digit count are located in the image; if the numbers in the image differ in font style, font size, or digit count, it is difficult to extract the digital area from the image efficiently.
  • the present disclosure provides a region extraction method, a model training method, and corresponding apparatuses.
  • the technical solution is as follows:
  • a region extraction method comprising:
  • the recognition model is obtained by training a preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number;
  • performing area cutting on the digital area to obtain at least one single-digit area.
  • the image to be identified is identified according to the recognition model, and at least one digital area is obtained, including:
  • the image features of the candidate window regions are input into the recognition model for classification, and a classification result is obtained;
  • when the classification result is a positive result, the candidate window area is recognized as a digital area.
  • when there are at least two digital areas, the method further includes:
  • the n digital regions in which the intersection region exists are combined to obtain a combined digital region, including:
  • the leftmost edge among the n left edges of the n digital regions is determined as the merged left edge;
  • the rightmost edge among the n right edges of the n digital regions is determined as the merged right edge;
  • the merged digital area is obtained according to the upper edge, the lower edge, the merged left edge, and the merged right edge.
  • the digital area is area cut to obtain at least one single digit area, including:
  • the histogram includes: an abscissa of each column of pixels and an accumulated value of foreground pixels in each column of pixels;
  • n single digital regions are identified according to continuous column sets consisting of columns in which the accumulated value of foreground pixels in the histogram is greater than a preset threshold.
  • a model training method comprising: acquiring a predetermined sample image, the predetermined sample image including a positive sample image and a negative sample image, the positive sample image including at least one number, and the negative sample image including zero numbers or an incomplete number;
  • a classification algorithm is used to train the preset sample image to obtain a recognition model.
  • the preset sample image is trained by using a classification algorithm to obtain a recognition model, including:
  • the image features of the positive sample image and a first label for representing a positive result are input into an initial model constructed by the classification algorithm, and the image features of the negative sample image and a second label for representing a negative result are input into the initial model, to obtain the recognition model.
  • the classification algorithm includes at least one of Adaboost, support vector machine SVM, artificial neural network, genetic algorithm, naive Bayes, decision tree, and nearest neighbor KNN algorithm.
  • an area extracting apparatus comprising:
  • An acquisition module configured to acquire a recognition model obtained by training a preset sample image by a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image including at least one number, and each negative sample image including zero numbers or incomplete numbers;
  • An identification module configured to identify an image to be identified according to the recognition model to obtain at least one digital area
  • a cutting module configured to perform area cutting of the digital area to obtain at least one single digital area.
  • the identification module comprises:
  • a window sweeping sub-module configured to extract a candidate window region from the image to be identified according to a predetermined window sweeping strategy using a preset window
  • a classification sub-module configured to input the image features of the candidate window regions into the recognition model for classification to obtain classification results;
  • the confirmation sub-module is configured to identify the candidate window area as a digital area when the classification result is a positive result.
  • when there are at least two digital areas, the apparatus further includes:
  • a lookup module configured to find n digital regions in which an intersection region exists
  • the merging module is configured to combine the n digital regions in which the intersection region exists to obtain a combined digital region.
  • the merging module includes:
  • a first determining submodule configured to determine the leftmost edge among the n left edges of the n digital regions as the merged left edge;
  • a second determining submodule configured to determine the rightmost edge among the n right edges of the n digital regions as the merged right edge;
  • the third determining sub-module is configured to obtain the merged digital area according to the upper edge, the lower edge, the merged left edge, and the merged right edge.
  • the cutting module comprises:
  • a binarization sub-module configured to binarize the digital region to obtain a binarized digital region;
  • the calculation submodule is configured to calculate a histogram according to a vertical direction of the binarized digital region, the histogram comprising: an abscissa of each column of pixels and an accumulated value of foreground pixels in each column of pixels;
  • the digital identification sub-module is configured to identify n single-digit regions according to a continuous set of columns consisting of columns in which the accumulated value of the foreground pixels in the histogram is greater than a preset threshold.
  • a model training apparatus comprising:
  • a sample acquisition module configured to acquire a predetermined sample image, the predetermined sample image including a positive sample image and a negative sample image, the positive sample image including at least one number; the negative sample image including zero digits or a missing number;
  • the training module is configured to train the preset sample image by using a classification algorithm to obtain a recognition model.
  • the training module includes:
  • An extraction submodule configured to extract an image feature of the positive sample image and an image feature of the negative sample image
  • An input sub-module configured to input the image features of the positive sample image and a first label for representing a positive result into an initial model constructed using a classification algorithm, and to input the image features of the negative sample image and a second label for representing a negative result into the initial model, to obtain the recognition model.
  • the classification algorithm includes at least one of Adaboost, support vector machine SVM, artificial neural network, genetic algorithm, naive Bayes, decision tree, and nearest neighbor KNN algorithm.
  • an area extracting apparatus comprising:
  • a memory for storing processor executable instructions
  • a processor configured to:
  • the recognition model is obtained by training a preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number;
  • performing area cutting on the digital area to obtain at least one single-digit area.
  • a model training apparatus comprising:
  • a memory for storing processor executable instructions
  • a processor configured to:
  • acquiring a predetermined sample image, the predetermined sample image including a positive sample image and a negative sample image, the positive sample image including at least one number, and the negative sample image including zero numbers or an incomplete number;
  • a classification algorithm is used to train the preset sample image to obtain a recognition model.
  • the recognition model is obtained by training a preset sample image by a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number; the image to be identified is identified according to the recognition model to obtain at least one digital area; and the digital area is area-cut to obtain at least one single-digit area.
  • This solves the problem in digital area extraction methods that digit positions of different font sizes or digit counts cannot be accurately extracted, and achieves the effect that digit positions of different font styles, font sizes, or digit counts in the image are accurately located and cut for extraction by the recognition model.
  • FIG. 1 is a flowchart of a model training method according to an exemplary embodiment
  • FIG. 2 is a flowchart of a region extraction method according to an exemplary embodiment
  • FIG. 3A is a flowchart of a model training method according to another exemplary embodiment
  • FIG. 3B is a schematic diagram of an original sample image, according to an exemplary embodiment
  • FIG. 3C is a schematic diagram of a positive sample image, according to an exemplary embodiment
  • FIG. 3D is a schematic diagram of a negative sample image, according to an exemplary embodiment
  • FIG. 4 is a flowchart of a region extraction method according to another exemplary embodiment
  • FIG. 5 is a flowchart of a region extraction method according to another exemplary embodiment
  • FIG. 6A is a flowchart of a region extraction method according to another exemplary embodiment
  • FIG. 6B is a schematic diagram of a left edge of a region, according to an exemplary embodiment
  • FIG. 6C is a schematic diagram of a right edge of a region, according to an exemplary embodiment
  • FIG. 6D is a schematic diagram of a region extraction according to an exemplary embodiment
  • FIG. 7A is a flowchart of a region extraction method according to another exemplary embodiment.
  • FIG. 7B is a schematic diagram of region binarization according to an exemplary embodiment
  • FIG. 7C is a schematic diagram of a region binarization histogram according to an exemplary embodiment
  • FIG. 7D is a schematic diagram of a region binarized continuous column set, according to an exemplary embodiment
  • FIG. 8 is a block diagram of an area extracting apparatus according to an exemplary embodiment
  • FIG. 9 is a block diagram of an area extracting apparatus according to another exemplary embodiment.
  • FIG. 10 is a block diagram of an area extracting apparatus according to another exemplary embodiment.
  • FIG. 11 is a block diagram of an area extracting apparatus according to another exemplary embodiment.
  • FIG. 12 is a block diagram of a model training apparatus according to an exemplary embodiment
  • FIG. 13 is a block diagram of a model training apparatus according to another exemplary embodiment
  • FIG. 14 is a block diagram of an area extracting apparatus, according to an exemplary embodiment.
  • Embodiments of the present disclosure include two processes: a first process of training a recognition model, and a second process of performing recognition using the recognition model.
  • the two processes may be implemented on the same terminal; alternatively, the first process may be performed by a first terminal and the second process by a second terminal.
  • This embodiment of the present disclosure does not limit this.
  • the first process and the second process are separately described below using different embodiments.
  • FIG. 1 is a flowchart of a model training method according to an exemplary embodiment, the model training method includes the following steps:
  • a predetermined sample image is acquired, the predetermined sample image including a positive sample image and a negative sample image, the positive sample image including at least one number, and the negative sample image including zero numbers or incomplete numbers;
  • step 102 the preset sample image is trained by using a classification algorithm to obtain a recognition model.
  • In summary, in the model training method of this embodiment, a predetermined sample image is acquired, the predetermined sample image including a positive sample image containing at least one number and a negative sample image containing zero numbers or an incomplete number, and the preset sample image is trained by using a classification algorithm to obtain a recognition model. This solves the problem that digital region extraction methods place certain restrictions on font size and digit count, so that digit positions of different font sizes or digit counts cannot be accurately extracted, and achieves the effect that the recognition model obtained by training can accurately locate digit positions of different font styles, font sizes, or digit counts in an image.
  • FIG. 2 is a flowchart of a region extraction method according to an exemplary embodiment, where the region extraction method includes the following steps.
  • a recognition model is obtained, and the recognition model is obtained by training a preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or incomplete numbers;
  • step 202 identifying an image to be identified according to the recognition model to obtain at least one digital area
  • step 203 the digital area is area cut to obtain at least one single digital area.
  • In summary, in the region extraction method of this embodiment, the recognition model is obtained by training the preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number; the image to be recognized is identified according to the recognition model to obtain at least one digital area; and the digital area is area-cut to obtain at least one single-digit area. This solves the problem that digital area extraction methods place certain limitations on font size and digit count, so that digit positions of different font sizes or digit counts cannot be accurately extracted, and achieves the effect that digit positions of different font styles, font sizes, or digit counts in the image are accurately located and cut for extraction by the recognition model.
  • FIG. 3A is a flowchart of a model training method according to another exemplary embodiment, the model training method includes the following steps.
  • a predetermined sample image is obtained, the predetermined sample image includes a positive sample image and a negative sample image, the positive sample image includes at least one number; the negative sample image includes zero numbers or a missing number;
  • the predetermined sample image is selected from an image library or from directly captured images; it refers to the images used for training the recognition model, and includes two kinds of images: positive sample images and negative sample images.
  • the positive sample image may be a digital image containing a single number, or a digital image containing a single line of numbers with no limit on the number of digits.
  • the numbers in the positive sample image are not limited in font size, font style, or digit count; likewise, the number of digital images among the positive sample images is not limited.
  • the negative sample image may be an image containing zero numbers or an image containing partially broken numbers.
  • the positive sample image may be an image formed by extracting one or more digital regions from the same image
  • the negative sample image may be an image formed from easily confused regions near the digital region in the same image, an image intercepted from the same image that contains only part of a number, or an image formed from other regions intercepted from the same image; FIG. 3B shows an original image,
  • FIG. 3C is a positive sample image formed from the original image of FIG. 3B.
  • FIG. 3D is a negative sample image formed from the original image of FIG. 3B.
  • step 302 an image feature of the positive sample image and an image feature of the negative sample image are extracted
  • feature extraction is performed on the positive sample image and the negative sample image, respectively, to obtain image features of the positive sample image and image features of the negative sample image.
  • step 303 the image features of the positive sample image and the first label for representing the positive result are input into an initial model constructed using the classification algorithm, and the image features of the negative sample image and the second label for representing the negative result are input into the initial model, so that the recognition model is obtained.
  • the image features of the obtained positive sample image are input into an initial model constructed using a classification algorithm, and the first label of the positive result corresponding to the positive sample image is also input into the initial model. For example: set the first label indicating a positive result to 1.
  • the image features of the obtained negative sample image are input into the initial model constructed using the classification algorithm, and the second label of the negative result corresponding to the negative sample image is also input into the initial model. For example: set the second label indicating a negative result to -1.
  • the classification algorithm includes at least one of Adaboost, SVM (Support Vector Machine), artificial neural network, genetic algorithm, naive Bayes, decision tree, and KNN (k-Nearest Neighbor) algorithms.
  • For example, the original positive sample image is 256*256 pixels, and haar features are extracted from the original positive sample image.
  • A haar feature is obtained for each positive sample image, and all the image features extracted from the positive sample images are input into the initial model constructed by the classification algorithm; similarly, the image features extracted from the negative sample images are input into the initial model, and after training, the recognition model is obtained.
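The training step above can be sketched in miniature. The snippet below is an illustrative assumption, not the patent's implementation: it computes one simple Haar-like feature (left-half minus right-half pixel sum) per sample and fits a single decision stump using the +1/-1 labels described above; a real system would extract many Haar features and train one of the listed classification algorithms such as Adaboost or SVM.

```python
# Hypothetical sketch: one Haar-like feature plus a one-feature decision
# stump, trained with label +1 (positive result) and -1 (negative result).

def haar_feature(img):
    """Left-half minus right-half pixel sum of a 2-D grayscale image."""
    w = len(img[0]) // 2
    left = sum(px for row in img for px in row[:w])
    right = sum(px for row in img for px in row[w:])
    return left - right

def train_stump(samples, labels):
    """Pick the feature threshold and sign that minimise training errors."""
    feats = [haar_feature(s) for s in samples]
    best = None
    for t in sorted(feats):
        for sign in (1, -1):
            errs = sum(1 for f, y in zip(feats, labels)
                       if (1 if sign * (f - t) >= 0 else -1) != y)
            if best is None or errs < best[0]:
                best = (errs, t, sign)
    _, t, sign = best
    return lambda img: 1 if sign * (haar_feature(img) - t) >= 0 else -1

# Toy data: "digit" windows are dark on the left, background is uniform.
pos = [[[0, 0, 9, 9], [0, 0, 9, 9]], [[1, 0, 8, 9], [0, 1, 9, 8]]]  # label +1
neg = [[[5, 5, 5, 5], [5, 5, 5, 5]], [[4, 5, 5, 4], [5, 4, 4, 5]]]  # label -1
model = train_stump(pos + neg, [1, 1, -1, -1])
print([model(s) for s in pos + neg])  # → [1, 1, -1, -1]
```

In practice Adaboost would combine many such stumps, each built on a different Haar feature, but the label convention (+1 / -1) is exactly the one the description sets out.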
  • In summary, the model training method of this embodiment acquires a predetermined sample image including a positive sample image (containing at least one number) and a negative sample image (containing zero numbers or an incomplete number); extracts the image features of the positive sample image and of the negative sample image; and inputs the image features of the positive sample image with a first label representing a positive result, and the image features of the negative sample image with a second label representing a negative result, into an initial model constructed using a classification algorithm, to obtain the recognition model.
  • This solves the problem that digital region extraction methods are limited in font size and digit count, so that digit positions of different font sizes or digit counts cannot be accurately extracted, and achieves the effect of accurately locating digit positions of different font styles, font sizes, or digit counts in an image by means of the recognition model.
  • FIG. 4 is a flowchart of a region extraction method according to another exemplary embodiment, the region extraction method includes the following steps.
  • a recognition model is obtained, and the recognition model is obtained by training a preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, and each positive sample image includes at least one number, each negative The sample image includes zero numbers or missing numbers;
  • a recognition model is obtained, which is a model obtained by training an initial model, constructed using a classification algorithm, on the positive sample images and negative sample images in the embodiment shown in FIG. 3A.
  • a candidate window area is extracted from the image to be identified according to a predetermined window scanning policy using a preset window
  • the predetermined window sweeping strategy comprises: performing a window sweeping of the images to be recognized in order from top to bottom and from left to right.
  • the predetermined window sweeping strategy includes: performing multiple window sweeps over the same image to be identified using preset windows of different sizes.
  • the predetermined window sweeping strategy comprises: when the image to be recognized is swept with a preset window of fixed size, the preset window positions of two adjacent movements have an overlapping area.
  • For example, the preset window size is 16*16 pixels and the size of the image to be recognized is 256*256 pixels. Starting from the upper left corner of the image to be recognized, the 16*16-pixel preset window sweeps the image pixel by pixel from top to bottom and from left to right according to the predetermined window sweeping strategy, and the preset windows of two adjacent movements have an overlapping region.
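The window sweeping strategy above can be sketched as follows; the function name and tuple layout are illustrative assumptions, and the default stride of 1 matches the pixel-by-pixel movement of the example:

```python
# Enumerate candidate window positions: the preset window moves over the
# image top-to-bottom and left-to-right; with stride < window size, two
# adjacent positions always share an overlapping region.

def sweep_windows(height, width, win=16, stride=1):
    """Yield (top, left) positions of each candidate window region."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield (top, left)

positions = list(sweep_windows(256, 256))
print(len(positions))  # → 58081 (241 * 241 candidate windows)
```

Each yielded position would then be cropped out as a candidate window region and passed to the recognition model for classification.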
  • step 403 the image features of the candidate window regions are input into the recognition model for classification, and the classification result is obtained;
  • image feature extraction is performed on the candidate window region obtained in step 402, using the same feature extraction method as in training the recognition model in the embodiment shown in FIG. 3A.
  • the image features extracted from the candidate window region are input into the recognition model acquired in the embodiment shown in FIG. 3A for classification; the recognition model matches the image features extracted from the candidate window region against the templates in the recognition model, and detects whether the candidate window area is a digital area.
  • that is, the digital regions among the candidate window regions are identified by the recognition model's detection of the image features in the candidate window regions.
  • step 404 if the classification result is a positive result, the candidate window area is identified as a digital area;
  • a positive result means that the candidate window region belongs to the class obtained by training on positive sample images, which corresponds to the first label indicating the positive result in the recognition model; when the classification result is a positive result, the candidate window area is marked with the first label after classification.
  • step 405 if the classification result is a negative result, the candidate window area is identified as a non-numeric area;
  • a negative result means that the candidate window region belongs to the class obtained by training on negative sample images, which corresponds to the second label indicating the negative result in the recognition model; when the classification result is a negative result, the candidate window area is marked with the second label after classification.
  • step 406 the digital area is area cut to obtain at least one single digital area.
  • the candidate window regions whose classification result is a positive result are subjected to region cutting to obtain the single-digit areas in the candidate window regions, where each such candidate window region includes at least one single-digit area.
  • In summary, in the region extraction method of this embodiment, the recognition model is obtained by training the preset sample image by using a classification algorithm, where the preset sample image includes a positive sample image and a negative sample image, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number; a candidate window region is extracted from the image to be recognized using a preset window according to a predetermined window sweeping strategy; the image features of the candidate window region are input into the recognition model for classification to obtain a classification result; if the classification result is a positive result, the candidate window area is identified as a digital area; and the digital area is area-cut to obtain at least one single-digit area. This solves the problem that digital area extraction methods place limitations on font size and digit count, so that digit positions of different font sizes or digit counts cannot be accurately extracted, and achieves the effect that digit positions of different font styles, font sizes, or digit counts in the image are accurately located and cut for extraction by the recognition model.
  • an intersection region may exist between the at least two digital regions, and a plurality of digital regions having an intersection region need to be merged.
  • when there are at least two digital areas, the following steps may also be included after step 405, as shown in FIG. 5:
  • step 501 the n digital regions in which the intersection region exists are found
  • the n digital regions in which intersection regions exist among all the digital regions are found by simple rules;
  • for example, the digital areas in which an intersection area exists are found by detecting overlap between the digital areas, or by the mutual inclusion relationship of the overlapping areas in the digital areas.
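One possible "simple rule" for detecting an intersection region between two digital areas is the standard axis-aligned box-overlap test; the (left, top, right, bottom) tuple representation below is an assumption for illustration:

```python
# Two axis-aligned boxes intersect exactly when they overlap on both axes.

def intersects(a, b):
    """Boxes are (left, top, right, bottom); edge-touching does not count."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def find_overlapping(boxes):
    """Return the boxes that share an intersection region with another box."""
    return [b for i, b in enumerate(boxes)
            if any(intersects(b, o) for j, o in enumerate(boxes) if i != j)]

boxes = [(0, 0, 20, 16), (15, 0, 35, 16), (100, 0, 120, 16)]
print(find_overlapping(boxes))  # → [(0, 0, 20, 16), (15, 0, 35, 16)]
```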
  • step 502 the n digital regions in which the intersection region exists are combined to obtain a combined digital region
  • the n digital regions in which an intersection region exists are merged to determine the final digital region.
  • in this embodiment, the n digital regions in which an intersection region exists are found and merged into a combined digital region, so that the finally determined digital region is more accurate, which facilitates subsequent recognition and extraction of the digital region.
  • step 502 can be replaced by the following steps 502a through 502c, as shown in FIG. 6A:
  • step 502a the leftmost of the n left edges of the n digital regions is determined as the merged left edge;
  • the n left edges of the n digital regions are obtained, and the leftmost of them is determined as the merged left edge m1 of the n digital regions, as shown in FIG. 6B.
  • step 502b the rightmost of the n right edges of the n digital regions is determined as the merged right edge;
  • with the n digital regions arranged in one row, the n right edges are obtained, and the rightmost of them is determined as the merged right edge m2 of the n digital regions, as shown in FIG. 6C.
  • step 502c the merged digital region is obtained according to the upper edge, the lower edge, the merged left edge, and the merged right edge.
  • the final merged digital region is obtained, as shown in FIG. 6D.
  • in this embodiment, the leftmost of the n left edges of the n digital regions is determined as the merged left edge, the rightmost of the n right edges is determined as the merged right edge, and the merged digital region is obtained according to the upper edge, the lower edge, the merged left edge, and the merged right edge, so that the merged digital region is more accurate, which facilitates subsequent cutting and extraction of the digital region.
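Steps 502a through 502c can be sketched as follows, under the same hypothetical `(left, top, right, bottom)` representation; the function name is illustrative. Since the embodiment assumes the upper and lower edges of the n regions coincide, the merged box keeps those edges and spans from the leftmost left edge (m1) to the rightmost right edge (m2).

```python
def merge_regions(regions):
    """Merge n row-aligned digital regions into one combined region.

    Assumes the regions' upper and lower edges (approximately) coincide;
    the merged region spans from the leftmost left edge (m1) to the
    rightmost right edge (m2).
    """
    left = min(r[0] for r in regions)     # step 502a: merged left edge m1
    right = max(r[2] for r in regions)    # step 502b: merged right edge m2
    top = min(r[1] for r in regions)      # (shared) upper edge
    bottom = max(r[3] for r in regions)   # (shared) lower edge
    return (left, top, right, bottom)     # step 502c: combined digital region
```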
  • step 406 may be replaced by the following steps 406a through 406e, as shown in FIG. 7A:
  • step 406a the digital region is binarized to obtain a binarized digital region
  • the digital region (the merged digital region from step 502c) is first pre-processed, where the pre-processing may include denoising, filtering, edge extraction, and the like; the pre-processed digital region is then binarized.
  • binarization refers to comparing the gray value of each pixel in the digital region with a preset gray threshold, dividing the pixels in the region into two groups: those whose gray value is greater than the preset gray threshold and those whose gray value is smaller, and rendering the two groups in two different colors, black and white, in the digital region.
  • the binarized digital area is obtained as shown in Fig. 7B.
  • pixels of the foreground color are referred to as foreground pixels, that is, the white pixels in FIG. 7B;
  • pixels of the background color are referred to as background pixels, that is, the black pixels in FIG. 7B.
  • a histogram is calculated in the vertical direction for the binarized digital region, the histogram including the abscissa of each column of pixels and the accumulated count of foreground pixels in each column;
  • n single-digit regions are identified according to the contiguous column sets consisting of columns in which the accumulated count of foreground pixels in the histogram is greater than a preset threshold.
  • from the histogram, the accumulated count of foreground pixels in each column can be obtained and compared with the preset threshold; a contiguous set of columns whose accumulated counts all exceed the preset threshold is determined to be the columns in which a single-digit region is located.
  • a contiguous column set means that the columns whose accumulated foreground-pixel count exceeds the preset threshold form a run of p consecutive columns; this set of p consecutive columns appears as a continuous white area in the histogram, as shown in FIG. 7D.
  • the accumulated counts of the foreground pixels in the lower histogram are all greater than the preset threshold, and the p columns of pixels correspond to the digit "3" in the digital image.
  • each contiguous column set is identified as one single-digit region, so n contiguous column sets are identified as n single-digit regions.
  • by identifying the single-digit region corresponding to each number, the accuracy of identifying single-digit regions within the digital region can be improved.
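The cutting procedure above (binarize, project vertically, split on contiguous column runs) can be sketched as follows. This is a minimal illustration assuming the digital region arrives as a 2-D list of gray values; the function names and threshold values are hypothetical, not taken from the patent.

```python
def binarize(gray, gray_threshold=128):
    """Split the region's pixels into foreground (1) and background (0)
    by comparing each gray value against a preset gray threshold."""
    return [[1 if px > gray_threshold else 0 for px in row] for row in gray]

def cut_single_digits(binary, run_threshold=0):
    """Cut a binarized digital region into single-digit column ranges.

    Computes the vertical histogram (accumulated foreground pixels per
    column) and returns the (start, end) column indices of each
    contiguous run of columns whose count exceeds the threshold.
    """
    histogram = [sum(col) for col in zip(*binary)]
    runs, start = [], None
    for x, count in enumerate(histogram):
        if count > run_threshold and start is None:
            start = x                      # a new contiguous column set begins
        elif count <= run_threshold and start is not None:
            runs.append((start, x - 1))    # the set ends: one single-digit region
            start = None
    if start is not None:
        runs.append((start, len(histogram) - 1))
    return runs
```

Each returned `(start, end)` pair corresponds to one contiguous column set, i.e. one single-digit region such as the digit "3" of FIG. 7D.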
  • FIG. 8 is a block diagram of a region extraction apparatus according to an exemplary embodiment. As shown in FIG. 8, the region extraction apparatus includes, but is not limited to:
  • the obtaining module 810 is configured to acquire a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, where the preset sample images include positive sample images and negative sample images, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number;
  • the identification module 820 is configured to identify the image to be identified according to the recognition model to obtain at least one digital area
  • the cutting module 830 is configured to perform area cutting on the digital area to obtain at least one single digital area.
  • in summary, the region extraction apparatus acquires a recognition model obtained by training preset sample images with a classification algorithm, where the preset sample images include positive sample images and negative sample images, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number; identifies the image to be recognized according to the recognition model to obtain at least one digital region; and cuts the digital region into at least one single-digit region. This solves the problem that digital region extraction methods place restrictions on digit size and digit count and cannot accurately extract the positions of numbers with different font sizes or different digit counts, and achieves accurate localization, cutting, and extraction of numbers with different font styles, font sizes, or digit counts in the image by means of the recognition model.
  • FIG. 9 is a block diagram of an area extracting apparatus according to another exemplary embodiment. As shown in FIG. 9, the area extracting apparatus includes, but is not limited to:
  • the obtaining module 810 is configured to acquire a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, where the preset sample images include positive sample images and negative sample images, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number;
  • the obtaining module 810 acquires a recognition model that is trained on the positive sample images and the negative sample images by an initial model constructed using a classification algorithm.
  • the identification module 820 is configured to identify the image to be identified according to the recognition model to obtain at least one digital area
  • the identification module 820 further includes the following submodules:
  • the window sweeping sub-module 821 is configured to extract a candidate window region from the image to be identified according to a predetermined window sweeping strategy using a preset window;
  • the window sweeping sub-module 821 sets a preset window of a fixed size, sweeps the image to be recognized with the preset window according to the predetermined window sweeping strategy, and extracts candidate window regions from the swept image; a candidate window region may correspond to one or more digital regions.
  • the predetermined window sweeping strategy comprises: sweeping the image to be recognized with the window in order from top to bottom and from left to right.
  • the predetermined window sweeping strategy comprises: multiple window sweeping using different preset windows for the same image to be identified.
  • the predetermined window sweeping strategy comprises: when the image to be recognized is swept with a fixed-size preset window, the preset window has overlapping areas between two adjacent moves.
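A minimal sketch of such a sweeping strategy follows, assuming a fixed-size window moved with a step smaller than the window so that adjacent positions overlap; the function name and parameters are illustrative, and multi-scale sweeping (different preset windows) would simply call this once per window size.

```python
def sweep_windows(img_w, img_h, win_w, win_h, step):
    """Enumerate candidate window positions over an image of size
    img_w x img_h, in order from top to bottom and left to right.
    A step smaller than the window size makes adjacent windows overlap."""
    candidates = []
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            candidates.append((left, top, left + win_w, top + win_h))
    return candidates
```

Each candidate `(left, top, right, bottom)` box would then have its image features extracted and classified by the recognition model.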
  • the classification sub-module 822 is configured to classify the image feature of the candidate window region into the recognition model to obtain a classification result
  • image feature extraction is performed on the candidate window regions obtained by the window sweeping sub-module 821.
  • the image features extracted from a candidate window region are input into the recognition model acquired by the obtaining module 810 for classification; the recognition model matches the extracted image features against its templates and detects whether the candidate window region is a digital region.
  • in this way, the classification sub-module 822 identifies the digital regions among the candidate window regions through the recognition model's detection of the image features in the candidate window regions.
  • the confirmation sub-module 823 is configured to recognize the candidate window area as a digital area when the classification result is a positive result.
  • when the classification result is a positive result, the confirmation sub-module 823 identifies the candidate window region as a digital region.
  • a positive result means that the image features of the candidate window region match those learned by the model from the positive sample images.
  • the confirmation sub-module 823 is further configured to identify the candidate window region as a non-numeric region when the classification result is a negative result.
  • when the classification result is a negative result, the confirmation sub-module 823 identifies the candidate window region as a non-digital region.
  • a negative result means that the image features of the candidate window region match those learned by the model from the negative sample images.
  • the cutting module 830 is configured to perform area cutting on the digital area to obtain at least one single digital area.
  • the cutting module 830 performs region cutting on the candidate window region confirmed by the confirmation sub-module 823 to obtain the single-digit regions in it, where the candidate window region includes at least one single-digit region.
  • in summary, the region extraction apparatus acquires a recognition model obtained by training preset sample images with a classification algorithm, where the preset sample images include positive sample images and negative sample images, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number; extracts candidate window regions from the image to be recognized using a preset window according to a predetermined window sweeping strategy; inputs the image features of each candidate window region into the recognition model for classification to obtain a classification result; identifies a candidate window region as a digital region if the classification result is a positive result; and cuts the digital region into at least one single-digit region. This solves the problem that digital region extraction methods place restrictions on digit size and digit count and cannot accurately extract the positions of numbers with different font sizes or different digit counts, and achieves accurate localization, cutting, and extraction of numbers with different font styles, font sizes, or digit counts in the image by means of the recognition model.
  • the apparatus may also include the following modules, as shown in Figure 10:
  • the searching module 1010 is configured to find out n digital areas where the intersection area exists
  • the searching module 1010 finds the n digital regions that have an intersection region among all the digital regions by simple rules.
  • the merging module 1020 is configured to combine the n digital regions in which the intersection region exists to obtain a combined digital region.
  • after the n digital regions in which an intersection region exists are found, the merging module 1020 merges them to determine the final digital region.
  • the merging module 1020 may include the following sub-modules as an optional implementation manner:
  • the first determining sub-module 1021 is configured to determine the leftmost of the n left edges of the n digital regions as the merged left edge;
  • the n left edges of the n digital regions are acquired, and the first determining sub-module 1021 determines the leftmost of them as the merged left edge of the n digital regions.
  • the second determining sub-module 1022 is configured to determine the rightmost of the n right edges of the n digital regions as the merged right edge;
  • with the n digital regions arranged in one row, the n right edges are acquired, and the second determining sub-module 1022 determines the rightmost of them as the merged right edge of the n digital regions.
  • the third determining sub-module 1023 is configured to obtain the merged digital area according to the upper edge, the lower edge, the merged left edge, and the merged right edge.
  • the third determining sub-module 1023 obtains the final merged digital region.
  • in this embodiment, the leftmost of the n left edges of the n digital regions is determined as the merged left edge, the rightmost of the n right edges is determined as the merged right edge, and the merged digital region is obtained according to the upper edge, the lower edge, the merged left edge, and the merged right edge, so that the merged digital region is more accurate, which facilitates subsequent cutting and extraction of the digital region.
  • the cutting module 830 may further include the following sub-modules, as shown in FIG. 11:
  • the binarization sub-module 831 is configured to binarize the digital area to obtain a binarized digital area
  • the binarization sub-module 831 first performs pre-processing on the digital region, where the pre-processing may include operations such as denoising, filtering, and edge extraction; the pre-processed digital region is then binarized.
  • binarization refers to comparing the gray value of each pixel in the digital region with a preset gray threshold, dividing the pixels in the digital region into two groups: those whose gray value is greater than the preset gray threshold and those whose gray value is smaller; the two groups are rendered in two different colors, black and white, in the digital region, yielding the binarized digital region.
  • the calculation sub-module 832 is configured to calculate a histogram in the vertical direction for the binarized digital region, the histogram comprising: an abscissa of each column of pixels and an accumulated value of foreground pixels in each column of pixels;
  • the calculation sub-module 832 calculates a histogram in the vertical direction for the digital region processed by the binarization sub-module 831; the horizontal axis of the histogram represents the abscissa of each column of pixels, and the vertical axis represents the accumulated count of foreground pixels in each column.
  • the digital identification sub-module 833 is configured to identify n single-digit regions according to a continuous set of columns consisting of columns in which the accumulated value of the foreground pixels in the histogram is greater than a preset threshold.
  • from the histogram, the accumulated count of foreground pixels in each column can be obtained; the digital identification sub-module 833 compares the accumulated count of foreground pixels in each column with a preset threshold, and determines a contiguous set of columns whose accumulated counts all exceed the preset threshold to be the columns in which a single-digit region is located.
  • a contiguous column set means that the columns whose accumulated foreground-pixel count exceeds the preset threshold form a run of p consecutive columns.
  • Each successive set of columns is identified as a digital region, and n consecutive sets of columns are identified as n single-digit regions.
  • in this embodiment, by binarizing the digital region, calculating the vertical-direction histogram of the binarized digital region, and identifying the single-digit region corresponding to each number, a method of accurately cutting out and identifying single-digit regions is provided.
  • FIG. 12 is a block diagram of a model training apparatus according to an exemplary embodiment. As shown in FIG. 12, the model training apparatus includes, but is not limited to:
  • the sample obtaining module 1210 is configured to acquire a predetermined sample image, where the predetermined sample image includes a positive sample image and a negative sample image, the positive sample image includes at least one number; the negative sample image includes zero numbers or a missing number;
  • the training module 1220 is configured to train the preset sample image by using a classification algorithm to obtain a recognition model.
  • in summary, the model training apparatus acquires predetermined sample images, where the predetermined sample images include positive sample images and negative sample images, the positive sample images include at least one number, and the negative sample images include zero numbers or incomplete numbers; and trains the preset sample images with a classification algorithm to obtain a recognition model. This solves the problem that digital region extraction methods place restrictions on digit size and digit count and cannot accurately extract the positions of numbers with different font sizes or different digit counts, and achieves, through the trained recognition model, accurate localization of numbers with different font styles, font sizes, or digit counts in the image.
  • FIG. 13 is a block diagram of a model training apparatus according to another exemplary embodiment. As shown in FIG. 13, the model training apparatus includes, but is not limited to:
  • the sample obtaining module 1210 is configured to acquire a predetermined sample image, where the predetermined sample image includes a positive sample image and a negative sample image, the positive sample image includes at least one number; the negative sample image includes zero numbers or a missing number;
  • the sample obtaining module 1210 selects predetermined sample images from an image library or from directly captured images; the predetermined sample images are the images used for training the recognition model, and include two kinds of images: positive sample images and negative sample images.
  • a positive sample image may be a digital image containing a single number, or a digital image containing a single line of numbers with no limit on the digit count.
  • the numbers in the positive sample images are not limited in font size, font style, or digit count; likewise, the number of digital images used as positive sample images is not limited.
  • a negative sample image may be an image containing zero numbers or an image containing a partially broken number.
  • the training module 1220 is configured to train the preset sample image by using a classification algorithm to obtain a recognition model.
  • the classification algorithm includes at least one of Adaboost, support vector machine SVM, artificial neural network, genetic algorithm, naive Bayes, decision tree, and nearest neighbor KNN algorithm.
  • the training model 1220 may include the following sub-modules:
  • An extraction sub-module 1221 configured to extract an image feature of the positive sample image and an image feature of the negative sample image
  • after the sample obtaining module 1210 obtains the positive sample images and the negative sample images, the extraction sub-module 1221 performs feature extraction on them respectively to obtain the image features of the positive sample images and the image features of the negative sample images.
  • the input sub-module 1222 is configured to input the image features of the positive sample images together with a first label representing a positive result into an initial model constructed using a classification algorithm, and to input the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
  • the input sub-module 1222 inputs the extracted image features of the positive sample images into the initial model constructed using the classification algorithm, while also inputting the first label corresponding to the positive sample images into the initial model.
  • the input sub-module 1222 inputs the extracted image features of the negative sample images into the initial model constructed using the classification algorithm, while also inputting the second label corresponding to the negative sample images into the initial model.
  • the classification algorithm includes at least one of Adaboost, SVM, artificial neural network, genetic algorithm, naive Bayes, decision tree, and KNN algorithm.
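As an illustration of how labeled features and an initial model fit together, the sketch below uses the simplest of the listed algorithms, nearest neighbor (KNN, with k = 1); the class, the feature vectors, and the label constants are all hypothetical stand-ins for the extracted image features and the first/second labels, not the patent's actual model.

```python
class NearestNeighborModel:
    """Minimal 1-NN recognition model: stores labeled feature vectors and
    classifies a query by the label of its nearest training sample."""

    def __init__(self):
        self.samples = []  # list of (feature_vector, label) pairs

    def train(self, features, label):
        self.samples.append((tuple(features), label))

    def classify(self, features):
        def sq_dist(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], features))
        return min(self.samples, key=sq_dist)[1]

POSITIVE, NEGATIVE = 1, 0  # first label (positive result), second label (negative result)

model = NearestNeighborModel()
for feat in [(0.9, 0.8), (0.8, 0.9)]:   # features from positive sample images (illustrative)
    model.train(feat, POSITIVE)
for feat in [(0.1, 0.2), (0.2, 0.1)]:   # features from negative sample images (illustrative)
    model.train(feat, NEGATIVE)
```

A candidate window region's features would then be passed to `classify`, and a positive label would mark the window as a digital region.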
  • in summary, the model training apparatus acquires predetermined sample images, where the predetermined sample images include positive sample images and negative sample images, the positive sample images include at least one number, and the negative sample images include zero numbers or incomplete numbers; extracts the image features of the positive sample images and the image features of the negative sample images; and inputs the image features of the positive sample images with a first label representing a positive result, and the image features of the negative sample images with a second label representing a negative result, into an initial model constructed using a classification algorithm to obtain the recognition model. This solves the problem that digital region extraction methods place restrictions on digit size and digit count and cannot accurately extract the positions of numbers with different font sizes or different digit counts, and achieves accurate localization of numbers with different font styles, font sizes, or digit counts in the image by means of the recognition model.
  • the present disclosure also provides a region extracting device, the device comprising: a processor;
  • a memory for storing processor executable instructions
  • processor is configured to:
  • acquire a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, where the preset sample images include positive sample images and negative sample images, each positive sample image includes at least one number, and each negative sample image includes zero numbers or an incomplete number;
  • identify the image to be recognized according to the recognition model to obtain at least one digital region;
  • perform region cutting on the digital region to obtain at least one single-digit region.
  • the present disclosure also provides a model training device, the device comprising: a processor;
  • a memory for storing processor executable instructions
  • processor is configured to:
  • acquire predetermined sample images, the predetermined sample images including positive sample images and negative sample images, the positive sample images including at least one number and the negative sample images including zero numbers or incomplete numbers;
  • train the preset sample images with a classification algorithm to obtain a recognition model.
  • FIG. 14 is a block diagram of an apparatus for performing a region extraction method, according to an exemplary embodiment.
  • device 1400 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • apparatus 1400 can include one or more of the following components: processing component 1402, memory 1404, power component 1406, multimedia component 1408, audio component 1410, input/output (I/O) interface 1412, sensor component 1414, and Communication component 1416.
  • Processing component 1402 typically controls the overall operation of device 1400, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 1402 can include one or more processors 1418 to execute instructions to perform all or part of the steps described above.
  • processing component 1402 can include one or more modules to facilitate interaction between component 1402 and other components.
  • processing component 1402 can include a multimedia module to facilitate interaction between multimedia component 1408 and processing component 1402.
  • Memory 1404 is configured to store various types of data to support operation at device 1400. Examples of such data include instructions for any application or method operating on device 1400, contact data, phone book data, messages, pictures, videos, and the like.
  • the memory 1404 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
  • Power component 1406 provides power to various components of device 1400.
  • Power component 1406 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 1400.
  • the multimedia component 1408 includes a screen between the device 1400 and the user that provides an output interface.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor can sense not only the boundaries of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 1408 includes a front camera and/or a rear camera. When the device 1400 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 1410 is configured to output and/or input an audio signal.
  • the audio component 1410 includes a microphone (MIC) that is configured to receive an external audio signal when the device 1400 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 1404 or transmitted via communication component 1416.
  • the audio component 1410 also includes a speaker for outputting an audio signal.
  • the I/O interface 1412 provides an interface between the processing component 1402 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 1414 includes one or more sensors for providing a status assessment of various aspects to device 1400.
  • the sensor component 1414 can detect the open/closed state of the device 1400 and the relative positioning of components, such as the display and keypad of the device 1400; the sensor component 1414 can also detect a change in position of the device 1400 or of a component of the device 1400, the presence or absence of user contact with the device 1400, the orientation or acceleration/deceleration of the device 1400, and temperature changes of the device 1400.
  • Sensor assembly 1414 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 1414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 1414 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 1416 is configured to facilitate wired or wireless communication between device 1400 and other devices.
  • the device 1400 can access a wireless network based on a communication standard, such as Wi-Fi, 2G or 3G, or a combination thereof.
  • communication component 1416 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • the communication component 1416 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • the device 1400 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above-described region extraction method.
  • non-transitory computer readable storage medium comprising instructions, such as a memory 1404 comprising instructions executable by processor 1418 of apparatus 1400 to perform the region extraction method described above.
  • the non-transitory computer readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.


Abstract

The present disclosure discloses a region extraction method, a model training method, and devices therefor, belonging to the field of image processing. The region extraction method includes: obtaining a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; recognizing an image to be recognized according to the recognition model to obtain at least one digit region; and segmenting the digit region to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.

Description

Region Extraction Method, Model Training Method, and Devices
This application is based on and claims priority to Chinese Patent Application No. 201510727932.0, filed on October 30, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of image processing, and in particular to a region extraction method, a model training method, and devices.
Background
Digit region extraction is a technique for extracting digit regions from an image.
Digit region extraction methods in the related art can usually only recognize regions containing digits of a predetermined size and a predetermined number of digits. If the digits in an image differ in font style, font size, or digit count, it is difficult to extract the digit regions from the image effectively.
Summary
To solve the problems in the related art, the present disclosure provides a region extraction method, a model training method, and devices. The technical solutions are as follows:
According to a first aspect of the embodiments of the present disclosure, a region extraction method is provided, the method including:
obtaining a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit, and each negative sample image containing zero digits or an incomplete digit;
recognizing an image to be recognized according to the recognition model to obtain at least one digit region; and
segmenting the digit region to obtain at least one single-digit region.
In an optional embodiment, recognizing the image to be recognized according to the recognition model to obtain at least one digit region includes:
extracting candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy;
inputting image features of a candidate window region into the recognition model for classification to obtain a classification result; and
if the classification result is a positive result, recognizing the candidate window region as a digit region.
In an optional embodiment, there are at least two digit regions, and the method further includes:
finding n digit regions that share an intersection region; and
merging the n digit regions sharing the intersection region to obtain a merged digit region.
In an optional embodiment, merging the n digit regions sharing the intersection region to obtain the merged digit region includes:
when the upper edges and lower edges of the n digit regions sharing the intersection region coincide,
determining the leftmost of the n left edges of the n digit regions as the merged left edge;
determining the rightmost of the n right edges of the n digit regions as the merged right edge; and
obtaining the merged digit region from the upper edge, the lower edge, the merged left edge, and the merged right edge.
In an optional embodiment, segmenting the digit region to obtain at least one single-digit region includes:
binarizing the digit region to obtain a binarized digit region;
computing a histogram of the binarized digit region in the vertical direction, the histogram including the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column; and
recognizing n single-digit regions from the sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
According to a second aspect of the embodiments of the present disclosure, a model training method is provided, the method including: obtaining predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; and
training the preset sample images with a classification algorithm to obtain a recognition model.
In an optional embodiment, training the preset sample images with the classification algorithm to obtain the recognition model includes:
extracting image features of the positive sample images and image features of the negative sample images; and
inputting the image features of the positive sample images together with a first label representing a positive result into an initial model constructed with the classification algorithm, and inputting the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
In an optional embodiment, the classification algorithm includes at least one of Adaboost, support vector machine (SVM), artificial neural network, genetic algorithm, naive Bayes, decision tree, and k-nearest neighbor (KNN) algorithms.
According to a third aspect of the embodiments of the present disclosure, a region extraction device is provided, the device including:
an acquisition module configured to obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit, and each negative sample image containing zero digits or an incomplete digit;
a recognition module configured to recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
a segmentation module configured to segment the digit region to obtain at least one single-digit region.
In an optional embodiment, the recognition module includes:
a window-scanning submodule configured to extract candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy;
a classification submodule configured to input image features of a candidate window region into the recognition model for classification to obtain a classification result; and
a confirmation submodule configured to recognize the candidate window region as a digit region when the classification result is a positive result.
In an optional embodiment, there are at least two digit regions, and the device further includes:
a finding module configured to find n digit regions sharing an intersection region; and
a merging module configured to merge the n digit regions sharing the intersection region to obtain a merged digit region.
In an optional embodiment, the merging module includes, for the case where the upper edges and lower edges of the n digit regions sharing the intersection region coincide:
a first determination submodule configured to determine the leftmost of the n left edges of the n digit regions as the merged left edge;
a second determination submodule configured to determine the rightmost of the n right edges of the n digit regions as the merged right edge; and
a third determination submodule configured to obtain the merged digit region from the upper edge, the lower edge, the merged left edge, and the merged right edge.
In an optional embodiment, the segmentation module includes:
a binarization submodule configured to binarize the digit region to obtain a binarized digit region;
a computation submodule configured to compute a histogram of the binarized digit region in the vertical direction, the histogram including the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column; and
a digit recognition submodule configured to recognize n single-digit regions from the sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
According to a fourth aspect of the embodiments of the present disclosure, a model training device is provided, the device including:
a sample acquisition module configured to obtain predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; and
a training module configured to train the preset sample images with a classification algorithm to obtain a recognition model.
In an optional embodiment, the training module includes:
an extraction submodule configured to extract image features of the positive sample images and image features of the negative sample images; and
an input submodule configured to input the image features of the positive sample images together with a first label representing a positive result into an initial model constructed with the classification algorithm, and to input the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
In an optional embodiment, the classification algorithm includes at least one of Adaboost, support vector machine (SVM), artificial neural network, genetic algorithm, naive Bayes, decision tree, and k-nearest neighbor (KNN) algorithms.
According to a fifth aspect of the embodiments of the present disclosure, a region extraction device is provided, the device including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit, and each negative sample image containing zero digits or an incomplete digit;
recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
segment the digit region to obtain at least one single-digit region.
According to a sixth aspect of the embodiments of the present disclosure, a model training device is provided, the device including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
obtain predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; and
train the preset sample images with a classification algorithm to obtain a recognition model.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
A recognition model is obtained, the recognition model being trained from preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; an image to be recognized is recognized according to the recognition model to obtain at least one digit region; and the digit region is segmented to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
Fig. 1 is a flowchart of a model training method according to an exemplary embodiment;
Fig. 2 is a flowchart of a region extraction method according to an exemplary embodiment;
Fig. 3A is a flowchart of a model training method according to another exemplary embodiment;
Fig. 3B is a schematic diagram of an original sample image according to an exemplary embodiment;
Fig. 3C is a schematic diagram of a positive sample image according to an exemplary embodiment;
Fig. 3D is a schematic diagram of a negative sample image according to an exemplary embodiment;
Fig. 4 is a flowchart of a region extraction method according to another exemplary embodiment;
Fig. 5 is a flowchart of a region extraction method according to another exemplary embodiment;
Fig. 6A is a flowchart of a region extraction method according to another exemplary embodiment;
Fig. 6B is a schematic diagram of a region's left edge according to an exemplary embodiment;
Fig. 6C is a schematic diagram of a region's right edge according to an exemplary embodiment;
Fig. 6D is a schematic diagram of region extraction according to an exemplary embodiment;
Fig. 7A is a flowchart of a region extraction method according to another exemplary embodiment;
Fig. 7B is a schematic diagram of region binarization according to an exemplary embodiment;
Fig. 7C is a schematic diagram of a binarized-region histogram according to an exemplary embodiment;
Fig. 7D is a schematic diagram of consecutive column sets of a binarized region according to an exemplary embodiment;
Fig. 8 is a block diagram of a region extraction device according to an exemplary embodiment;
Fig. 9 is a block diagram of a region extraction device according to another exemplary embodiment;
Fig. 10 is a block diagram of a region extraction device according to another exemplary embodiment;
Fig. 11 is a block diagram of a region extraction device according to another exemplary embodiment;
Fig. 12 is a block diagram of a model training device according to an exemplary embodiment;
Fig. 13 is a block diagram of a model training device according to another exemplary embodiment;
Fig. 14 is a block diagram of a region extraction device according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The embodiments of the present disclosure involve two processes: a first process of training the recognition model, and a second process of performing recognition with the recognition model. The two processes may be carried out on the same terminal, or the first process may be performed by a first terminal and the second process by a second terminal; the embodiments of the present disclosure do not limit this. The first and second processes are described below in separate embodiments.
Fig. 1 is a flowchart of a model training method according to an exemplary embodiment. The model training method includes the following steps:
In step 101, predetermined sample images are obtained, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
In step 102, the preset sample images are trained with a classification algorithm to obtain a recognition model.
In summary, the model training method provided in this embodiment of the present disclosure obtains predetermined sample images including positive sample images containing at least one digit and negative sample images containing zero digits or an incomplete digit, and trains the preset sample images with a classification algorithm to obtain a recognition model. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts; the recognition model obtained by this training process can accurately locate the positions of digits with different font styles, font sizes, or digit counts in an image.
Fig. 2 is a flowchart of a region extraction method according to an exemplary embodiment. The region extraction method includes the following steps.
In step 201, a recognition model is obtained, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
In step 202, an image to be recognized is recognized according to the recognition model to obtain at least one digit region.
In step 203, the digit region is segmented to obtain at least one single-digit region.
In summary, the region extraction method provided in this embodiment of the present disclosure obtains a recognition model trained from positive and negative sample images with a classification algorithm, recognizes the image to be recognized according to the recognition model to obtain at least one digit region, and segments the digit region to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.
Fig. 3A is a flowchart of a model training method according to another exemplary embodiment. The model training method includes the following steps.
In step 301, predetermined sample images are obtained, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
The predetermined sample images are selected from an image library or from directly captured images; they are the images used for the recognition model and comprise two kinds of images: positive sample images and negative sample images. A positive sample image may be a digit image containing a single digit, or a digit image containing a single line with any number of digits; the font size, font style, and digit count of the digits in a positive sample image are not limited, nor is the number of digit images it contains. A negative sample image may be an image containing zero digits, or an image containing a partially incomplete digit.
Optionally, a positive sample image may be formed by cropping one or more digit regions from an image, while a negative sample image may be formed by cropping an easily confused area near a digit region of the same image, an area of the same image containing only a small part of a digit, or another area of the same image. Fig. 3B shows an original image; Fig. 3C shows a positive sample image cropped from the original image of Fig. 3B; Fig. 3D shows a negative sample image cropped from the original image of Fig. 3B.
In step 302, image features of the positive sample images and image features of the negative sample images are extracted.
After the positive and negative sample images are obtained, feature extraction is performed on them separately to obtain the image features of the positive sample images and of the negative sample images.
In step 303, the image features of the positive sample images together with a first label representing a positive result are input into an initial model constructed with a classification algorithm, and the image features of the negative sample images together with a second label representing a negative result are input into the initial model, to obtain a recognition model.
The image features extracted from the positive sample images are input into the initial model constructed with the classification algorithm, together with the first label corresponding to the positive result, for example a first label set to 1.
The image features extracted from the negative sample images are input into the initial model constructed with the classification algorithm, together with the second label corresponding to the negative result, for example a second label set to -1.
The classification algorithm includes at least one of Adaboost, SVM (Support Vector Machine), artificial neural network, genetic algorithm, naive Bayes, decision tree, and KNN (k-Nearest Neighbor) algorithms.
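As a minimal illustration of one of the listed alternatives, the sketch below implements a k-nearest-neighbor classifier over feature vectors using the +1/-1 labeling convention from step 303. The toy feature values are invented for illustration and are not from the patent; any of the other listed algorithms could stand in the same place.

```python
def knn_classify(train, query, k=1):
    """Classify `query` by the label vote of its k nearest training
    samples (Euclidean distance). `train` is a list of
    (feature_vector, label) pairs labeled +1 (positive) or -1 (negative)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda s: dist(s[0], query))[:k]
    vote = sum(label for _, label in nearest)
    return 1 if vote >= 0 else -1

# Toy "image features": positive samples cluster near (1, 1),
# negative samples near (0, 0).
samples = [([1.0, 0.9], 1), ([0.9, 1.1], 1),
           ([0.1, 0.0], -1), ([0.0, 0.2], -1)]
print(knn_classify(samples, [0.95, 1.0]))  # 1  (classified as digit region)
print(knn_classify(samples, [0.05, 0.1]))  # -1 (classified as non-digit)
```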
For example, if an original positive sample image is 256*256 pixels, haar feature extraction is performed on it so that each positive sample image yields a haar feature. The image features extracted from all positive sample images are input into the initial model constructed with the classification algorithm, and likewise the image features extracted from the negative sample images; after training, the recognition model is obtained.
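Haar-like features of the kind mentioned above are conventionally computed from an integral image (summed-area table), so that the sum over any rectangle costs four lookups. The sketch below is a simplified illustration, not the patent's exact feature set: it computes a single two-rectangle horizontal feature as the left-half sum minus the right-half sum.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle (x0, y0)-(x1, y1)."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total

def haar_two_rect(img):
    """Two-rectangle haar feature: left-half sum minus right-half sum."""
    h, w = len(img), len(img[0])
    ii = integral_image(img)
    mid = w // 2
    return rect_sum(ii, 0, 0, mid - 1, h - 1) - rect_sum(ii, mid, 0, w - 1, h - 1)

img = [[1, 1, 0, 0],
       [1, 1, 0, 0]]
print(haar_two_rect(img))  # 4: bright left half, dark right half
```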
In summary, the model training method provided in this embodiment of the present disclosure obtains predetermined sample images including positive sample images containing at least one digit and negative sample images containing zero digits or an incomplete digit; extracts the image features of the positive and negative sample images; and inputs the image features of the positive sample images with a first label representing a positive result, and the image features of the negative sample images with a second label representing a negative result, into an initial model constructed with a classification algorithm to obtain a recognition model. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, by means of the recognition model, of digits with different font styles, font sizes, or digit counts in an image.
Fig. 4 is a flowchart of a region extraction method according to another exemplary embodiment. The region extraction method includes the following steps.
In step 401, a recognition model is obtained, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
The recognition model obtained here is the model trained in the embodiment shown in Fig. 3 from the positive and negative sample images using an initial model constructed with a classification algorithm.
In step 402, candidate window regions are extracted from the image to be recognized using a preset window according to a predetermined window-scanning strategy.
After the recognition model is obtained, a preset window of fixed size is set, and the image to be recognized is scanned with the preset window according to the predetermined window-scanning strategy; the scan extracts from the image multiple candidate window regions that may be digit regions.
Optionally, the predetermined window-scanning strategy includes: scanning the image to be recognized row by row, from top to bottom and from left to right.
Optionally, the predetermined window-scanning strategy includes: scanning the same image to be recognized multiple times with preset windows of different sizes.
Optionally, the predetermined window-scanning strategy includes: when scanning the image to be recognized with a preset window of fixed size, having the preset window positions of two adjacent moves overlap each other.
For example, with a preset window of 16*16 pixels and an image to be recognized of 256*256 pixels, the 16*16 preset window starts from the top-left corner of the image and scans every pixel of the image from top to bottom and from left to right according to the predetermined window-scanning strategy; as the preset window moves from left to right, the window positions of two adjacent moves overlap.
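The scanning strategy in the example above can be sketched as follows. The window size and image size are the values from the example; the stride of 8 pixels is an assumed value chosen so that adjacent window positions overlap by half a window, as the strategy requires.

```python
def scan_windows(img_w, img_h, win, stride):
    """Yield (x, y, w, h) candidate window positions, scanning
    top-to-bottom and left-to-right; a stride smaller than the
    window size makes adjacent windows overlap."""
    for y in range(0, img_h - win + 1, stride):
        for x in range(0, img_w - win + 1, stride):
            yield (x, y, win, win)

# 16x16 window over a 256x256 image, half-window stride.
windows = list(scan_windows(256, 256, win=16, stride=8))
print(len(windows))            # 31 * 31 = 961 candidate windows
print(windows[0], windows[1])  # (0, 0, 16, 16) (8, 0, 16, 16)
```

Each yielded tuple would then be cropped from the image, have its features extracted, and be fed to the recognition model as described in step 403 below.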
In step 403, image features of a candidate window region are input into the recognition model for classification to obtain a classification result.
Image features are extracted from the candidate window regions obtained in step 402, in the same way as the image features were extracted for the recognition model in the embodiment of Fig. 3A. The features extracted from a candidate window region are input into the recognition model obtained in the embodiment of Fig. 3A for classification; the recognition model matches the extracted features against its templates to detect whether the candidate window region is a digit region. By detecting the image features of the candidate window regions with the recognition model, the digit regions among the candidate window regions are recognized.
In step 404, if the classification result is a positive result, the candidate window region is recognized as a digit region.
If the classification result produced by the recognition model is positive, the candidate window region is recognized as a digit region. A positive result means the candidate window region matches the model trained from the positive sample images.
According to the first label representing a positive result in the recognition model, when the classification of a candidate window region yields a positive result, the candidate window region is marked with the first label after classification.
In step 405, if the classification result is a negative result, the candidate window region is recognized as a non-digit region.
If the classification result produced by the recognition model is negative, the candidate window region is recognized as a non-digit region. A negative result means the candidate window region matches the model trained from the negative sample images.
According to the second label representing a negative result in the recognition model, when the classification of a candidate window region yields a negative result, the candidate window region is marked with the second label after classification.
In step 406, the digit region is segmented to obtain at least one single-digit region.
The candidate window regions whose classification result is positive are segmented to obtain the single-digit regions they contain; each such candidate window region contains at least one single-digit region.
In summary, the region extraction method provided in this embodiment of the present disclosure obtains a recognition model trained from positive and negative sample images, extracts candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy, inputs the image features of the candidate window regions into the recognition model for classification, recognizes a candidate window region as a digit region when the classification result is positive, and segments the digit region to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.
When at least two digit regions are obtained from the recognition of the candidate window regions, intersection regions may exist between them, and digit regions that share an intersection region need to be merged.
In an optional embodiment based on Fig. 4, there are at least two digit regions, and the following steps may follow step 405, as shown in Fig. 5:
In step 501, n digit regions sharing an intersection region are found.
When there are at least two digit regions, simple rules are used to find the n digit regions among them that share an intersection region;
for example, the digit regions sharing an intersection region may be found by detecting the number of overlaps among the digit regions, or through the mutual containment relations of their overlapping areas.
In step 502, the n digit regions sharing the intersection region are merged to obtain a merged digit region.
After the n digit regions sharing an intersection region are found, they are merged to determine the final digit region.
In summary, by finding the n digit regions that share an intersection region and merging them into a merged digit region, this embodiment makes the finally determined digit region more accurate, which benefits the recognition and extraction of the digit region.
In an optional embodiment based on Fig. 5, step 502 may be replaced by the following steps 502a to 502c, as shown in Fig. 6A:
when the upper edges and lower edges of the n digit regions sharing the intersection region coincide,
In step 502a, the leftmost of the n left edges of the n digit regions is determined as the merged left edge.
When the n digit regions are arranged in one row, the n left edges of the n digit regions are obtained, and the leftmost of them is determined as the merged left edge m1 of the n digit regions, as shown in Fig. 6B.
In step 502b, the rightmost of the n right edges of the n digit regions is determined as the merged right edge.
When the n digit regions are arranged in one row, the n right edges of the n digit regions are obtained, and the rightmost of them is determined as the merged right edge m2 of the n digit regions, as shown in Fig. 6C.
In step 502c, the merged digit region is obtained from the upper edge, the lower edge, the merged left edge, and the merged right edge.
The final merged digit region is obtained from the upper edge, lower edge, merged left edge, and merged right edge determined in the preceding steps, as shown in Fig. 6D.
In summary, by determining the leftmost of the n left edges of the n digit regions as the merged left edge, the rightmost of the n right edges as the merged right edge, and obtaining the merged digit region from the upper edge, lower edge, merged left edge, and merged right edge, this embodiment makes the merged digit region more accurate, which benefits the segmentation and extraction of the digit region.
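Under the stated condition that the intersecting regions share the same upper and lower edges, the merge in steps 502a to 502c reduces to taking the leftmost left edge and the rightmost right edge. A minimal sketch follows, with boxes represented as (left, top, right, bottom) tuples; this tuple layout is an assumption for illustration, not a representation defined in the patent.

```python
def merge_row_boxes(boxes):
    """Merge n boxes whose top and bottom edges coincide: keep the
    shared top/bottom, take the leftmost left edge (step 502a) and
    the rightmost right edge (step 502b)."""
    lefts = [b[0] for b in boxes]
    rights = [b[2] for b in boxes]
    top, bottom = boxes[0][1], boxes[0][3]
    return (min(lefts), top, max(rights), bottom)

# Three overlapping detections of the same row of digits.
merged = merge_row_boxes([(10, 5, 40, 25), (30, 5, 60, 25), (55, 5, 80, 25)])
print(merged)  # (10, 5, 80, 25)
```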
In an optional embodiment based on Fig. 6A, step 406 may be replaced by the following steps 406a to 406c, as shown in Fig. 7A:
In step 406a, the digit region is binarized to obtain a binarized digit region.
Optionally, the digit region merged in step 502c is preprocessed, where preprocessing may include operations such as denoising, filtering, and edge extraction; the preprocessed digit region is then binarized.
Binarization compares the gray value of each pixel in the digit region with a preset gray threshold and divides the pixels of the digit region into two groups: the pixels above the preset gray threshold and the pixels below it. The two groups of pixels are rendered in the digit region in the two colors black and white, yielding the binarized digit region, as shown in Fig. 7B. The pixels of the foreground color are called foreground pixels, i.e., the white pixels in Fig. 7B; the pixels of the background color are called background pixels, i.e., the black pixels in Fig. 7B.
In step 406b, a histogram of the binarized digit region is computed in the vertical direction, the histogram including the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column.
A histogram of the binarized digit region is computed in the vertical direction; its horizontal axis represents the horizontal coordinate of each column of pixels and its vertical axis represents the accumulated count of foreground pixels in each column, as shown in Fig. 7C.
In step 406c, n single-digit regions are recognized from the sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
From the histogram, the accumulated foreground-pixel count of each column of pixels is obtained and compared with the preset threshold; the sets of consecutive columns in the histogram whose accumulated foreground-pixel counts exceed the preset threshold are determined as the columns where single-digit regions lie.
A consecutive column set is a set of p consecutive columns whose accumulated foreground-pixel counts all exceed the preset threshold, as shown in Fig. 7D; the consecutive column set p corresponds to the continuous white area formed in the histogram. For the p columns of pixels in the figure, the accumulated foreground-pixel counts in the lower histogram all exceed the preset threshold, and these p columns correspond to the digit region "3" in the digit image.
Each consecutive column set is recognized as one digit region, and n consecutive column sets are recognized as n single-digit regions.
In summary, by binarizing the digit region and computing a histogram of the binarized digit region in the vertical direction to recognize the single-digit region corresponding to each digit, this embodiment improves the accuracy of recognizing the single-digit regions within a digit region.
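The segmentation just described — project the foreground counts onto columns, then cut at maximal runs of columns above a threshold — can be sketched as below. The binary grid and threshold are invented toy values; in practice the grid would come from binarizing a detected digit region as in step 406a.

```python
def segment_digits(binary, threshold=0):
    """Split a binarized digit region into single-digit column spans.
    `binary` is a list of rows of 0/1 pixels (1 = foreground).
    A digit is a maximal run of consecutive columns whose foreground
    count exceeds `threshold`; returns (first_col, last_col) spans."""
    width = len(binary[0])
    # Vertical-projection histogram: foreground count per column.
    counts = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, c in enumerate(counts):
        if c > threshold and start is None:
            start = x                      # a run of digit columns begins
        elif c <= threshold and start is not None:
            spans.append((start, x - 1))   # the run ends before this column
            start = None
    if start is not None:                  # run extends to the last column
        spans.append((start, width - 1))
    return spans

# Two "digits" separated by one blank column.
grid = [[1, 1, 0, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 0, 1, 1]]
print(segment_digits(grid))  # [(0, 1), (3, 4)]
```

Each returned span, paired with the region's top and bottom edges, gives one single-digit region ready for digit recognition.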
The following are device embodiments of the present disclosure, which may be used to carry out the method embodiments of the present disclosure. For details not disclosed in the device embodiments, refer to the method embodiments of the present disclosure.
Fig. 8 is a block diagram of a region extraction device according to an exemplary embodiment. As shown in Fig. 8, the region extraction device includes but is not limited to:
an acquisition module 810 configured to obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit;
a recognition module 820 configured to recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
a segmentation module 830 configured to segment the digit region to obtain at least one single-digit region.
In summary, the region extraction device provided in this embodiment of the present disclosure obtains a recognition model trained from positive and negative sample images with a classification algorithm, recognizes the image to be recognized according to the recognition model to obtain at least one digit region, and segments the digit region to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.
Fig. 9 is a block diagram of a region extraction device according to another exemplary embodiment. As shown in Fig. 9, the region extraction device includes but is not limited to:
an acquisition module 810 configured to obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
The acquisition module 810 obtains the recognition model, which is trained from the positive and negative sample images using an initial model constructed with a classification algorithm.
The device further includes a recognition module 820 configured to recognize an image to be recognized according to the recognition model to obtain at least one digit region.
In this embodiment, the recognition module 820 further includes the following submodules:
a window-scanning submodule 821 configured to extract candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy.
After the acquisition module 810 obtains the recognition model, the window-scanning submodule 821 sets a preset window of fixed size and scans the image to be recognized with the preset window according to the predetermined window-scanning strategy, extracting from the image multiple candidate window regions that may be digit regions.
Optionally, the predetermined window-scanning strategy includes: scanning the image to be recognized row by row, from top to bottom and from left to right.
Optionally, the predetermined window-scanning strategy includes: scanning the same image to be recognized multiple times with preset windows of different sizes.
Optionally, the predetermined window-scanning strategy includes: when scanning the image to be recognized with a preset window of fixed size, having the preset window positions of two adjacent moves overlap each other.
The recognition module 820 further includes a classification submodule 822 configured to input image features of a candidate window region into the recognition model for classification to obtain a classification result.
Image features are extracted from the candidate window regions obtained by the window-scanning submodule 821 and input into the recognition model obtained by the acquisition module 810 for classification; the recognition model matches the extracted features against its templates to detect whether the candidate window region is a digit region. Through this detection, the classification submodule 822 recognizes the digit regions among the candidate window regions.
The recognition module 820 further includes a confirmation submodule 823 configured to recognize the candidate window region as a digit region when the classification result is a positive result.
If the classification result produced by the recognition model is positive, the confirmation submodule 823 recognizes the candidate window region as a digit region. A positive result means the candidate window region matches the model trained from the positive sample images.
The confirmation submodule 823 is further configured to recognize the candidate window region as a non-digit region when the classification result is a negative result.
If the classification result produced by the recognition model is negative, the confirmation submodule 823 recognizes the candidate window region as a non-digit region. A negative result means the candidate window region matches the model trained from the negative sample images.
The device further includes a segmentation module 830 configured to segment the digit region to obtain at least one single-digit region.
The segmentation module 830 segments the candidate window regions confirmed by the confirmation submodule 823 to obtain the single-digit regions they contain; each such candidate window region contains at least one single-digit region.
In summary, the region extraction device provided in this embodiment of the present disclosure obtains a recognition model trained from positive and negative sample images, extracts candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy, inputs the image features of the candidate window regions into the recognition model for classification, recognizes a candidate window region as a digit region when the classification result is positive, and segments the digit region to obtain at least one single-digit region. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, segmentation, and extraction, by means of the recognition model, of digits of different font styles, font sizes, or digit counts in an image.
In an optional embodiment based on Fig. 9, the device may further include the following modules, as shown in Fig. 10:
a finding module 1010 configured to find n digit regions sharing an intersection region.
When there are at least two digit regions, the finding module 1010 uses simple rules to find the n digit regions among them that share an intersection region.
The device further includes a merging module 1020 configured to merge the n digit regions sharing the intersection region to obtain a merged digit region.
After the n digit regions sharing an intersection region are found, the merging module 1020 merges them to determine the final digit region.
As an optional implementation, the merging module 1020 may include the following submodules, for the case where the upper edges and lower edges of the n digit regions sharing the intersection region coincide:
a first determination submodule 1021 configured to determine the leftmost of the n left edges of the n digit regions as the merged left edge.
When the n digit regions are arranged in one row, the n left edges of the n digit regions are obtained, and the first determination submodule 1021 determines the leftmost of them as the merged left edge of the n digit regions.
A second determination submodule 1022 is configured to determine the rightmost of the n right edges of the n digit regions as the merged right edge.
When the n digit regions are arranged in one row, the n right edges of the n digit regions are obtained, and the second determination submodule 1022 determines the rightmost of them as the merged right edge of the n digit regions.
A third determination submodule 1023 is configured to obtain the merged digit region from the upper edge, the lower edge, the merged left edge, and the merged right edge.
From the upper edge, lower edge, merged left edge, and merged right edge determined by the first determination submodule 1021 and the second determination submodule 1022, the third determination submodule 1023 obtains the final merged digit region.
In summary, by determining the leftmost of the n left edges of the n digit regions as the merged left edge, the rightmost of the n right edges as the merged right edge, and obtaining the merged digit region from the upper edge, lower edge, merged left edge, and merged right edge, this embodiment makes the merged digit region more accurate, which benefits the segmentation and extraction of the digit region.
In an optional embodiment based on Fig. 8, the segmentation module 830 may further include the following submodules, as shown in Fig. 11:
a binarization submodule 831 configured to binarize the digit region to obtain a binarized digit region.
Optionally, the binarization submodule 831 preprocesses the digit region determined by the third determination submodule 1023, where preprocessing may include operations such as denoising, filtering, and edge extraction, and then binarizes the preprocessed digit region.
Binarization compares the gray value of each pixel in the digit region with a preset gray threshold and divides the pixels of the digit region into two groups, the pixels above the preset gray threshold and the pixels below it, rendering the two groups in the two colors black and white to obtain the binarized digit region.
The segmentation module 830 further includes a computation submodule 832 configured to compute a histogram of the binarized digit region in the vertical direction, the histogram including the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column.
The computation submodule 832 computes the histogram of the digit region processed by the binarization submodule 831 in the vertical direction; the histogram's horizontal axis represents the horizontal coordinate of each column of pixels and its vertical axis represents the accumulated count of foreground pixels in each column.
The segmentation module 830 further includes a digit recognition submodule 833 configured to recognize n single-digit regions from the sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
From the histogram, the accumulated foreground-pixel count of each column of pixels is obtained; the digit recognition submodule 833 compares each column's count with the preset threshold and determines the sets of consecutive columns whose counts exceed the threshold as the columns where single-digit regions lie.
A consecutive column set is a set of p consecutive columns whose accumulated foreground-pixel counts all exceed the preset threshold.
Each consecutive column set is recognized as one digit region, and n consecutive column sets are recognized as n single-digit regions.
In summary, by binarizing the digit region and computing a histogram of the binarized digit region in the vertical direction to recognize the single-digit region corresponding to each digit, this embodiment provides an accurate way to segment and recognize single-digit regions.
Fig. 12 is a block diagram of a model training device according to an exemplary embodiment. As shown in Fig. 12, the model training device includes but is not limited to:
a sample acquisition module 1210 configured to obtain predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; and
a training module 1220 configured to train the preset sample images with a classification algorithm to obtain a recognition model.
In summary, the model training device provided in this embodiment of the present disclosure obtains predetermined sample images including positive sample images containing at least one digit and negative sample images containing zero digits or an incomplete digit, and trains the preset sample images with a classification algorithm to obtain a recognition model. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts; the recognition model obtained by this training process can accurately locate the positions of digits with different font styles, font sizes, or digit counts in an image.
Fig. 13 is a block diagram of a model training device according to another exemplary embodiment. As shown in Fig. 13, the model training device includes but is not limited to:
a sample acquisition module 1210 configured to obtain predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit.
The sample acquisition module 1210 selects the predetermined sample images from an image library or from directly captured images; they are the images used for the recognition model and comprise two kinds of images: positive sample images and negative sample images. A positive sample image may be a digit image containing a single digit, or a digit image containing a single line with any number of digits; the font size, font style, and digit count of the digits in a positive sample image are not limited, nor is the number of digit images it contains. A negative sample image may be an image containing zero digits, or an image containing a partially incomplete digit.
The device further includes a training module 1220 configured to train the preset sample images with a classification algorithm to obtain a recognition model.
The classification algorithm includes at least one of Adaboost, support vector machine (SVM), artificial neural network, genetic algorithm, naive Bayes, decision tree, and k-nearest neighbor (KNN) algorithms.
In this embodiment, the training module 1220 may include the following submodules:
an extraction submodule 1221 configured to extract image features of the positive sample images and image features of the negative sample images.
After the sample acquisition module 1210 obtains the positive and negative sample images, the extraction submodule 1221 performs feature extraction on them separately to obtain the image features of the positive sample images and of the negative sample images.
The training module 1220 further includes an input submodule 1222 configured to input the image features of the positive sample images together with a first label representing a positive result into an initial model constructed with the classification algorithm, and to input the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
The input submodule 1222 inputs the image features extracted from the positive sample images into the initial model constructed with the classification algorithm, together with the first label corresponding to the positive result; it likewise inputs the image features extracted from the negative sample images into the initial model, together with the second label corresponding to the negative result.
In summary, the model training device provided in this embodiment of the present disclosure obtains predetermined sample images including positive sample images containing at least one digit and negative sample images containing zero digits or an incomplete digit; extracts the image features of the positive and negative sample images; and inputs the image features of the positive sample images with a first label representing a positive result, and the image features of the negative sample images with a second label representing a negative result, into an initial model constructed with a classification algorithm to obtain a recognition model. This solves the problem that digit region extraction methods impose restrictions on digit size and digit count and cannot accurately extract the positions of digits with different font sizes or digit counts, achieving accurate locating, by means of the recognition model, of digits with different font styles, font sizes, or digit counts in an image.
The present disclosure further provides a region extraction device, the device including: a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit;
recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
segment the digit region to obtain at least one single-digit region.
The present disclosure further provides a model training device, the device including: a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
obtain predetermined sample images, the predetermined sample images including positive sample images and negative sample images, each positive sample image containing at least one digit and each negative sample image containing zero digits or an incomplete digit; and
train the preset sample images with a classification algorithm to obtain a recognition model.
With respect to the devices in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related methods and will not be elaborated here.
Fig. 14 is a block diagram of a device for performing the region extraction method according to an exemplary embodiment. For example, the device 1400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 14, the device 1400 may include one or more of the following components: a processing component 1402, a memory 1404, a power component 1406, a multimedia component 1408, an audio component 1410, an input/output (I/O) interface 1412, a sensor component 1414, and a communication component 1416.
The processing component 1402 generally controls the overall operation of the device 1400, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 1402 may include one or more processors 1418 to execute instructions so as to perform all or part of the steps of the above methods. In addition, the processing component 1402 may include one or more modules that facilitate interaction between the processing component 1402 and other components; for example, it may include a multimedia module to facilitate interaction between the multimedia component 1408 and the processing component 1402.
The memory 1404 is configured to store various types of data to support operation of the device 1400. Examples of such data include instructions for any application or method operated on the device 1400, contact data, phonebook data, messages, pictures, videos, and so on. The memory 1404 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 1406 provides power to the various components of the device 1400. The power component 1406 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 1400.
The multimedia component 1408 includes a screen providing an output interface between the device 1400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel; the touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it. In some embodiments, the multimedia component 1408 includes a front camera and/or a rear camera, which may receive external multimedia data when the device 1400 is in an operating mode such as a shooting mode or a video mode. Each of the front and rear cameras may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 1410 is configured to output and/or input audio signals. For example, the audio component 1410 includes a microphone (MIC) configured to receive external audio signals when the device 1400 is in an operating mode such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 1404 or sent via the communication component 1416. In some embodiments, the audio component 1410 also includes a speaker for outputting audio signals.
The I/O interface 1412 provides an interface between the processing component 1402 and peripheral interface modules such as a keyboard, a click wheel, or buttons. The buttons may include but are not limited to a home button, volume buttons, a start button, and a lock button.
The sensor component 1414 includes one or more sensors for providing status assessments of various aspects of the device 1400. For example, the sensor component 1414 may detect the open/closed state of the device 1400 and the relative positioning of components, such as the display and keypad of the device 1400; it may also detect a change in position of the device 1400 or one of its components, the presence or absence of user contact with the device 1400, the orientation or acceleration/deceleration of the device 1400, and changes in its temperature. The sensor component 1414 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, and may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1416 is configured to facilitate wired or wireless communication between the device 1400 and other devices. The device 1400 may access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 1416 also includes a near field communication (NFC) module to facilitate short-range communication; for example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above region extraction method.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 1404 comprising instructions executable by the processor 1418 of the device 1400 to perform the above region extraction method. The non-transitory computer-readable storage medium may be, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed by the present disclosure. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (18)

  1. A region extraction method, characterized in that the method comprises:
    obtaining a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images comprising positive sample images and negative sample images, each of the positive sample images containing at least one digit, and each of the negative sample images containing zero digits or an incomplete digit;
    recognizing an image to be recognized according to the recognition model to obtain at least one digit region; and
    segmenting the digit region to obtain at least one single-digit region.
  2. The method according to claim 1, characterized in that recognizing the image to be recognized according to the recognition model to obtain at least one digit region comprises:
    extracting candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy;
    inputting image features of a candidate window region into the recognition model for classification to obtain a classification result; and
    if the classification result is a positive result, recognizing the candidate window region as the digit region.
  3. The method according to claim 2, characterized in that there are at least two digit regions, and the method further comprises:
    finding n digit regions sharing an intersection region; and
    merging the n digit regions sharing the intersection region to obtain a merged digit region.
  4. The method according to claim 3, characterized in that merging the n digit regions sharing the intersection region to obtain the merged digit region comprises:
    when upper edges and lower edges of the n digit regions sharing the intersection region coincide,
    determining the leftmost of the n left edges of the n digit regions as a merged left edge;
    determining the rightmost of the n right edges of the n digit regions as a merged right edge; and
    obtaining the merged digit region from the upper edge, the lower edge, the merged left edge, and the merged right edge.
  5. The method according to any one of claims 1 to 4, characterized in that segmenting the digit region to obtain at least one single-digit region comprises:
    binarizing the digit region to obtain a binarized digit region;
    computing a histogram of the binarized digit region in the vertical direction, the histogram comprising the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column; and
    recognizing n single-digit regions from sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
  6. A model training method, characterized in that the method comprises:
    obtaining predetermined sample images, the predetermined sample images comprising positive sample images and negative sample images, the positive sample images containing at least one digit and the negative sample images containing zero digits or an incomplete digit; and
    training the preset sample images with a classification algorithm to obtain a recognition model.
  7. The method according to claim 6, characterized in that training the preset sample images with the classification algorithm to obtain the recognition model comprises:
    extracting image features of the positive sample images and image features of the negative sample images; and
    inputting the image features of the positive sample images together with a first label representing a positive result into an initial model constructed with the classification algorithm, and inputting the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
  8. The method according to claim 6 or 7, characterized in that the classification algorithm comprises at least one of Adaboost, support vector machine (SVM), artificial neural network, genetic algorithm, naive Bayes, decision tree, and k-nearest neighbor (KNN) algorithms.
  9. A region extraction device, characterized in that the device comprises:
    an acquisition module configured to obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images comprising positive sample images and negative sample images, each of the positive sample images containing at least one digit, and each of the negative sample images containing zero digits or an incomplete digit;
    a recognition module configured to recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
    a segmentation module configured to segment the digit region to obtain at least one single-digit region.
  10. The device according to claim 9, characterized in that the recognition module comprises:
    a window-scanning submodule configured to extract candidate window regions from the image to be recognized using a preset window according to a predetermined window-scanning strategy;
    a classification submodule configured to input image features of a candidate window region into the recognition model for classification to obtain a classification result; and
    a confirmation submodule configured to recognize the candidate window region as the digit region when the classification result is a positive result.
  11. The device according to claim 10, characterized in that there are at least two digit regions, and the device further comprises:
    a finding module configured to find n digit regions sharing an intersection region; and
    a merging module configured to merge the n digit regions sharing the intersection region to obtain a merged digit region.
  12. The device according to claim 11, characterized in that the merging module comprises, for the case where upper edges and lower edges of the n digit regions sharing the intersection region coincide:
    a first determination submodule configured to determine the leftmost of the n left edges of the n digit regions as a merged left edge;
    a second determination submodule configured to determine the rightmost of the n right edges of the n digit regions as a merged right edge; and
    a third determination submodule configured to obtain the merged digit region from the upper edge, the lower edge, the merged left edge, and the merged right edge.
  13. The device according to any one of claims 9 to 12, characterized in that the segmentation module comprises:
    a binarization submodule configured to binarize the digit region to obtain a binarized digit region;
    a computation submodule configured to compute a histogram of the binarized digit region in the vertical direction, the histogram comprising the horizontal coordinate of each column of pixels and the accumulated count of foreground pixels in each column; and
    a digit recognition submodule configured to recognize n single-digit regions from sets of consecutive columns in the histogram whose accumulated foreground-pixel count is greater than a preset threshold.
  14. A model training device, characterized in that the device comprises:
    a sample acquisition module configured to obtain predetermined sample images, the predetermined sample images comprising positive sample images and negative sample images, the positive sample images containing at least one digit and the negative sample images containing zero digits or an incomplete digit; and
    a training module configured to train the preset sample images with a classification algorithm to obtain a recognition model.
  15. The device according to claim 14, characterized in that the training module comprises:
    an extraction submodule configured to extract image features of the positive sample images and image features of the negative sample images; and
    an input submodule configured to input the image features of the positive sample images together with a first label representing a positive result into an initial model constructed with the classification algorithm, and to input the image features of the negative sample images together with a second label representing a negative result into the initial model, to obtain the recognition model.
  16. The device according to claim 14 or 15, characterized in that the classification algorithm comprises at least one of Adaboost, support vector machine (SVM), artificial neural network, genetic algorithm, naive Bayes, decision tree, and k-nearest neighbor (KNN) algorithms.
  17. A region extraction device, characterized in that the device comprises:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    obtain a recognition model, the recognition model being obtained by training preset sample images with a classification algorithm, the preset sample images comprising positive sample images and negative sample images, each of the positive sample images containing at least one digit, and each of the negative sample images containing zero digits or an incomplete digit;
    recognize an image to be recognized according to the recognition model to obtain at least one digit region; and
    segment the digit region to obtain at least one single-digit region.
  18. A model training device, characterized in that the device comprises:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    obtain predetermined sample images, the predetermined sample images comprising positive sample images and negative sample images, the positive sample images containing at least one digit and the negative sample images containing zero digits or an incomplete digit; and
    train the preset sample images with a classification algorithm to obtain a recognition model.
PCT/CN2015/099300 2015-10-30 2015-12-29 区域提取方法、模型训练方法及装置 WO2017071064A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
RU2016110914A RU2016110914A (ru) 2015-10-30 2015-12-29 Способ извлечения области, способ обучения модели и устройства для их осуществления
MX2016003753A MX2016003753A (es) 2015-10-30 2015-12-29 Metodo para extraccion de region, metodo para entrenamiento de modelo y dispositivos de los mismos.
KR1020167005383A KR101763891B1 (ko) 2015-10-30 2015-12-29 영역 추출 방법, 모델 트레이닝 방법 및 장치
JP2017547047A JP2018503201A (ja) 2015-10-30 2015-12-29 領域抽出方法、モデル訓練方法及び装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510727932.0A CN105528607B (zh) 2015-10-30 2015-10-30 区域提取方法、模型训练方法及装置
CN201510727932.0 2015-10-30

Publications (1)

Publication Number Publication Date
WO2017071064A1 true WO2017071064A1 (zh) 2017-05-04

Family

ID=55770821

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099300 WO2017071064A1 (zh) 2015-10-30 2015-12-29 区域提取方法、模型训练方法及装置

Country Status (8)

Country Link
US (1) US20170124719A1 (zh)
EP (1) EP3163509A1 (zh)
JP (1) JP2018503201A (zh)
KR (1) KR101763891B1 (zh)
CN (1) CN105528607B (zh)
MX (1) MX2016003753A (zh)
RU (1) RU2016110914A (zh)
WO (1) WO2017071064A1 (zh)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10133948B2 (en) * 2014-07-10 2018-11-20 Sanofi-Aventis Deutschland Gmbh Device and method for performing optical character recognition
CN106373160B (zh) * 2016-08-31 2019-01-11 清华大学 一种基于深度强化学习的摄像机主动目标定位方法
CN107784301B (zh) * 2016-08-31 2021-06-11 百度在线网络技术(北京)有限公司 用于识别图像中文字区域的方法和装置
CN107886102B (zh) * 2016-09-29 2020-04-07 北京君正集成电路股份有限公司 Adaboost分类器训练方法及系统
CN106991418B (zh) * 2017-03-09 2020-08-04 上海小蚁科技有限公司 飞虫检测方法、装置及终端
WO2019055849A1 (en) 2017-09-14 2019-03-21 Chevron U.S.A. Inc. CLASSIFICATION OF CHAINS OF CHARACTER USING MACHINE LEARNING
KR102030768B1 (ko) 2018-05-08 2019-10-10 숭실대학교산학협력단 영상을 이용한 가금류 무게 측정 방법, 이를 수행하기 위한 기록매체 및 장치
CN108846795A (zh) * 2018-05-30 2018-11-20 北京小米移动软件有限公司 图像处理方法及装置
CN109002846B (zh) * 2018-07-04 2022-09-27 腾讯医疗健康(深圳)有限公司 一种图像识别方法、装置和存储介质
CN111325228B (zh) * 2018-12-17 2021-04-06 上海游昆信息技术有限公司 一种模型训练方法及装置
CN113366390B (zh) * 2019-01-29 2024-02-20 Asml荷兰有限公司 半导体制造过程中的决定方法
CN111814514A (zh) 2019-04-11 2020-10-23 富士通株式会社 号码识别装置、方法以及电子设备
CN110119725B (zh) * 2019-05-20 2021-05-25 百度在线网络技术(北京)有限公司 用于检测信号灯的方法及装置
CN110533003B (zh) * 2019-09-06 2022-09-20 兰州大学 一种穿线法车牌数字识别方法及设备
CN110781877B (zh) * 2019-10-28 2024-01-23 京东方科技集团股份有限公司 一种图像识别方法、设备及存储介质
CN111275011B (zh) * 2020-02-25 2023-12-19 阿波罗智能技术(北京)有限公司 移动红绿灯检测方法、装置、电子设备和存储介质
CN111753851B (zh) * 2020-07-01 2022-06-07 中国铁路设计集团有限公司 基于图像处理的铁路雪深及风雪运移轨迹监测方法及系统
CN112330619B (zh) * 2020-10-29 2023-10-10 浙江大华技术股份有限公司 一种检测目标区域的方法、装置、设备及存储介质
CN115862045B (zh) * 2023-02-16 2023-05-26 中国人民解放军总医院第一医学中心 基于图文识别技术的病例自动识别方法、系统、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130182910A1 (en) * 2012-01-18 2013-07-18 Xerox Corporation License plate optical character recognition method and system
CN104298976A (zh) * 2014-10-16 2015-01-21 电子科技大学 基于卷积神经网络的车牌检测方法
CN104346628A (zh) * 2013-08-01 2015-02-11 天津天地伟业数码科技有限公司 基于多尺度多方向Gabor特征的车牌汉字识别方法
CN104899587A (zh) * 2015-06-19 2015-09-09 四川大学 一种基于机器学习的数字式表计识别方法
CN104966107A (zh) * 2015-07-10 2015-10-07 安徽清新互联信息科技有限公司 一种基于机器学习的信用卡卡号识别方法

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2917353B2 (ja) * 1990-01-22 1999-07-12 松下電器産業株式会社 文字切り出し装置
JP3442847B2 (ja) * 1994-02-17 2003-09-02 三菱電機株式会社 文字読取装置
JP4078045B2 (ja) * 2001-07-02 2008-04-23 キヤノン株式会社 画像処理装置、方法、プログラム、及び記憶媒体
US7715640B2 (en) * 2002-11-05 2010-05-11 Konica Minolta Business Technologies, Inc. Image processing device, image processing method, image processing program and computer-readable recording medium on which the program is recorded
JP2004287671A (ja) * 2003-03-20 2004-10-14 Ricoh Co Ltd 手書き文字認識装置、情報入出力システム、プログラム及び記憶媒体
CN101498592B (zh) * 2009-02-26 2013-08-21 北京中星微电子有限公司 指针式仪表的读数方法及装置
KR101183211B1 (ko) * 2012-04-30 2012-09-14 주식회사 신아시스템 계량기 영상 정보의 세그멘테이션 처리장치
EP2920743A4 (en) * 2012-11-19 2017-01-04 IMDS America Inc. Method and system for the spotting of arbitrary words in handwritten documents
CN104156704A (zh) * 2014-08-04 2014-11-19 胡艳艳 一种新的车牌识别方法及系统
CN104598885B (zh) * 2015-01-23 2017-09-22 西安理工大学 街景图像中的文字标牌检测与定位方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130182910A1 (en) * 2012-01-18 2013-07-18 Xerox Corporation License plate optical character recognition method and system
CN104346628A (zh) * 2013-08-01 2015-02-11 天津天地伟业数码科技有限公司 基于多尺度多方向Gabor特征的车牌汉字识别方法
CN104298976A (zh) * 2014-10-16 2015-01-21 电子科技大学 基于卷积神经网络的车牌检测方法
CN104899587A (zh) * 2015-06-19 2015-09-09 四川大学 一种基于机器学习的数字式表计识别方法
CN104966107A (zh) * 2015-07-10 2015-10-07 安徽清新互联信息科技有限公司 一种基于机器学习的信用卡卡号识别方法

Also Published As

Publication number Publication date
RU2016110914A (ru) 2017-09-28
MX2016003753A (es) 2017-05-30
CN105528607A (zh) 2016-04-27
CN105528607B (zh) 2019-02-15
KR20170061628A (ko) 2017-06-05
KR101763891B1 (ko) 2017-08-01
US20170124719A1 (en) 2017-05-04
JP2018503201A (ja) 2018-02-01
EP3163509A1 (en) 2017-05-03

Similar Documents

Publication Publication Date Title
WO2017071064A1 (zh) 区域提取方法、模型训练方法及装置
JP6401873B2 (ja) 領域認識方法及び装置
US10127471B2 (en) Method, device, and computer-readable storage medium for area extraction
JP6392468B2 (ja) 領域認識方法及び装置
WO2017071061A1 (zh) 区域识别方法及装置
US10007841B2 (en) Human face recognition method, apparatus and terminal
JP6400226B2 (ja) 領域認識方法及び装置

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 20167005383

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2017547047

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: MX/A/2016/003753

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2016110914

Country of ref document: RU

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15907125

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15907125

Country of ref document: EP

Kind code of ref document: A1