WO2021109775A1 - Training sample generation, model training, and character recognition methods and devices - Google Patents

Training sample generation, model training, and character recognition methods and devices

Info

Publication number
WO2021109775A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
image
training
contained
character image
Prior art date
Application number
PCT/CN2020/126197
Other languages
English (en)
French (fr)
Inventor
翟新刚
张楠赓
Original Assignee
嘉楠明芯(北京)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 嘉楠明芯(北京)科技有限公司
Priority to US17/782,677 (US20230007989A1)
Publication of WO2021109775A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1914Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries, e.g. user dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Definitions

  • the invention belongs to the technical field of image recognition, and specifically relates to methods and devices for training sample generation, model training, and character recognition.
  • The character recognition function of a character wheel type meter directly determines the quality of the system.
  • A character recognition model is usually used to recognize the dial readings of the character wheel type meter.
  • Character recognition includes the recognition of single whole characters and the recognition of double half characters.
  • In the training process of a character recognition model for the double half character type, the training samples used usually carry only category labels.
  • For example, for the character image in Figure 2b, the label is generally "0", "1", or "0-1".
  • Such labels ignore the character bias of the double half character type. For example, the actual reading in Figure 2b is obviously biased toward the character "0". This may lead to training samples that do not conform to the real situation, so that the character recognition model trained on those samples has low recognition accuracy for double-half-character images.
  • the present invention provides the following solutions.
  • A method for generating training samples includes: acquiring a character image and determining each character contained in the character image; using a projection method to determine the weight value of each character contained in the character image; and labeling the character image according to the weight value of each character contained in the character image to form a training sample.
  • Acquiring the character image and determining each character contained in it further includes: collecting a dial image of a character wheel type meter; performing character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determining each character contained in the character image from the rotation position of the preset character wheel.
  • Using the projection method to determine the weight value of each character contained in the character image includes: using the projection method to determine the total character area of the character image and the local character area corresponding to each character within the total character area; determining the projection ratio of each local character area relative to the total character area; and determining the weight value of each character contained in the character image according to the projection ratio.
  • Labeling the character image according to the weight value of each character contained in the character image further includes: updating a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and labeling the character image according to the target weight sequence. The preset weight sequence is constructed in advance from the preset weight values of multiple candidate characters in a preset arrangement order, and the preset weight value of each candidate character is 0.
  • A model training method includes: obtaining a training set that includes training samples generated by the method provided in the first aspect; and training a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial readings of a character wheel type meter.
  • A character recognition method includes: acquiring a dial image of a character wheel type meter, and performing character segmentation processing on the dial image to obtain a to-be-recognized image corresponding to each character wheel of the meter; and inputting the to-be-recognized image into the character recognition model trained by the method provided in the second aspect to obtain the dial reading of the character wheel type meter.
  • A training sample generating device includes: an acquisition module for acquiring a character image and determining each character contained in the character image; and a labeling module for determining the weight value of each character contained in the character image using a projection method, and labeling the character image according to the weight value of each character contained in the character image to form a training sample.
  • The acquisition module is also used to: collect a dial image of a character wheel type meter; perform character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determine each character contained in the character image from the rotation position of the preset character wheel.
  • The labeling module is further used to: use the projection method to determine the total character area of the character image and the local character area corresponding to each character within the total character area; determine the projection ratio of each local character area relative to the total character area; and determine the weight value of each character contained in the character image according to the projection ratio.
  • The labeling module is further configured to: update a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and label the character image according to the target weight sequence. The preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset arrangement order, and the preset weight value of each candidate character is 0.
  • A model training device includes: a training set acquisition module for acquiring a training set, the training set including training samples generated by the device provided in the fourth aspect; and a model training module for training a character recognition model according to the training set, where the character recognition model is used to recognize the dial readings of a character wheel type meter.
  • A character recognition device includes: a to-be-recognized image acquisition module, used to obtain a dial image of a character wheel type meter and perform character segmentation processing on the dial image to obtain a to-be-recognized image corresponding to each character wheel of the meter; and a character recognition module, used to input the to-be-recognized image into the character recognition model trained by the device provided in the fifth aspect to obtain the dial reading of the character wheel type meter.
  • A training sample generation device includes: one or more multi-core processors; and a memory for storing one or more programs. When the one or more programs are executed by the one or more multi-core processors, the processors are caused to: obtain a character image and determine each character contained in the character image; use the projection method to determine the weight value of each character contained in the character image; and label the character image according to the weight value of each character contained in the character image to form a training sample.
  • a computer-readable storage medium stores a program, and when the program is executed by a multi-core processor, the multi-core processor is caused to execute the method provided in the above-mentioned first aspect.
  • The character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image.
  • The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the weight value of each character contained in the character image. The training sample is therefore more in line with objective reality.
  • A character recognition model trained with such training samples not only recognizes character images of the single character type well, but also recognizes character images of the double half character type accurately. For a character image of the double half character type, it can give a biased recognition result with higher recognition accuracy.
  • FIG. 1 is a schematic flowchart of a method for generating training samples according to an embodiment of the present invention
  • FIG. 2a is a schematic character image of a single integer character "1" in an embodiment of the present invention
  • FIG. 2b is a schematic character image of a double half character "0-1" in an embodiment of the present invention
  • Fig. 3a is the binary image of Fig. 2a
  • Fig. 3b is the binary image of Fig. 2b
  • Fig. 4a is a projection histogram formed after performing projection on Fig. 3a
  • Fig. 4b is a projection histogram formed after performing projection on Fig. 3b;
  • FIG. 5 is a schematic flowchart of a model training method according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of a character recognition method according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a dial image in an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of an image to be recognized in an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a training sample generating device according to an embodiment of the present invention.
  • Fig. 10 is a schematic structural diagram of a model training device according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention.
  • Fig. 12 is a schematic structural diagram of a training sample generating apparatus according to another embodiment of the present invention.
  • The character wheel type meter refers to a meter in which one or more character wheels are driven to rotate so that the numbers marked on the wheels are displayed in the reading frame of the dial for the user to read, such as the water meters and gas meters common in daily life.
  • the training sample generation method provided in this application can be applied to any processing device with graphics processing capabilities.
  • the processing device may be a terminal, server, etc. including a central processing unit (CPU) and/or a graphics processing unit (GPU).
  • terminals include desktop terminals, mobile smart terminals such as mobile phones/tablets, vehicle-mounted terminals, wearable terminals, and so on.
  • FIG. 1 is a schematic flowchart of a training sample generating method 10 according to an embodiment of the present application. The training sample generating method 10 is used to generate training samples for training a character recognition model, which is used to recognize the dial readings of a character wheel type meter.
  • the execution subject can be one or more electronic devices; from the perspective of programs, the execution subject can correspondingly be the programs carried on these electronic devices.
  • the process in FIG. 1 may include the following steps 11 to 12.
  • Step 11 Obtain a character image, and determine each character contained in the character image
  • the character image is a partial image corresponding to a certain character wheel in the dial image of the character wheel type meter.
  • the dial image can be captured by a camera set up above the character wheel type meter, and the character image can be obtained by performing segmentation processing on the dial image.
  • The character image can also come from other sources, such as other equipment, or it can be a ready-made image; the present invention does not limit this.
  • The character image may contain one or more of the candidate characters, and the candidate characters may include: "0", "1", "2", "3", "4", "5", "6", "7", "8", "9".
  • each character contained in the character image can be determined by pre-setting a character image collection strategy or performing preliminary template matching on the character image, or it can be manually judged.
  • The character images displayed in the dial image can include two types: single integer characters and double half characters. Single integer characters include 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and indicate that only a single character is present in a character image.
  • Figure 2a shows a character image of a single integer character "1".
  • Double half characters include 0-1, 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-0 and indicate that two characters are present in the character image at the same time.
  • Figure 2b shows a character image of the double half character "0-1". This embodiment does not limit the number of characters contained in the character image.
  • Three or more characters may also exist in the character image at the same time.
  • Step 12 Use the projection method to determine the weight value of each character contained in the character image, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  • The weight value of each character contained in the character image indicates the distribution of characters in the character image: a character with a higher weight value occupies a higher proportion of the character image, and the actual reading of the character wheel type meter should also be more biased toward characters with higher weight values.
  • In a character wheel type meter, a number of characters are usually evenly distributed on the surface of the character wheel, and they are displayed in turn in the character frame of the dial as the character wheel rotates. By projecting the character image along the direction of the character wheel, the distribution ratio of each character contained in the character image can be obtained easily and accurately, and this distribution ratio can be used as the weight value.
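  • The projection step can be illustrated with a minimal sketch. It assumes the character image has already been binarized into a nested list of 0/1 values; the function names `projection_profile` and `foreground_runs` are hypothetical, and which image axis corresponds to the wheel's direction of travel depends on how the meter is mounted, so it is left as a parameter.

```python
from typing import List, Tuple

def projection_profile(binary: List[List[int]], axis: str = "rows") -> List[int]:
    """Count foreground pixels (value 1) in each row or each column."""
    if axis == "rows":
        return [sum(row) for row in binary]
    if axis == "cols":
        return [sum(col) for col in zip(*binary)]
    raise ValueError("axis must be 'rows' or 'cols'")

def foreground_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of consecutive non-zero bins."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a new run of foreground begins
        elif value == 0 and start is not None:
            runs.append((start, i - 1))    # the run ended on the previous bin
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile) - 1))
    return runs

# Toy 4x4 binary image (illustrative data, not taken from the figures).
image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
profile = projection_profile(image, axis="rows")
runs = foreground_runs(profile)
```

  • Each run in the profile is a candidate local character area; the gap between runs separates the two partial characters of a double half character.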
  • A projection method can be used to determine the respective weight values of the characters "0" and "1" contained in the character image. It can be seen that although the character image contains the characters "0" and "1" at the same time, it is obviously biased toward the character "1". If the traditional marking method is adopted and the character image is marked with "0", "1" or "0-1", none of these labels can truly express the character reading represented by the character image.
  • The label marked on the character image in this application carries the weight values of the character "0" and the character "1" at the same time, where the character "0" has a smaller weight value and the character "1" has a higher weight value. This makes it possible to generate training samples that are more in line with objective reality.
  • The character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image. The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the weight value, that is, the proportion, of each character contained in the character image. The training sample is therefore more in line with objective reality.
  • A character recognition model trained with such training samples not only recognizes character images of the single character type well, but also recognizes character images of the double half character type accurately. For a character image of the double half character type, it can give a biased recognition result with higher recognition accuracy.
  • some embodiments of the present application also provide some specific implementation schemes and extension schemes of the method, which will be described below.
  • The above step 11 may further include: collecting a dial image of a character wheel type meter; performing character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determining each character contained in the character image from the rotation position of the preset character wheel.
  • the character wheel type meter includes one or more character wheels, and any one of them can be selected as the preset character wheel.
  • The preset character wheel can be rotated in advance according to a preset rotation rule, and the dial image can be collected by a camera device installed above the character wheel type meter. A segmentation algorithm such as the global threshold method, the edge detection method or the contour detection method performs character segmentation processing on the dial image to obtain the character image corresponding to the preset character wheel. The rotation position of the preset character wheel can then be calculated from the preset rotation rule and the collection time of the dial image, and the current reading of the preset character wheel on the dial can be calculated from that rotation position, so as to determine each character contained in the character image. In this way, the cumbersome steps of determining characters from the character image are avoided, and the efficiency of generating training samples is further improved.
  • When the rotation position of the preset character wheel is fixed, the weight value corresponding to each character presented on the dial is also fixed, so the weight value of each character contained in the character image can also be determined directly from the rotation position of the preset character wheel.
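  • As a hedged illustration of deriving the characters from the rotation position, the sketch below assumes one particular preset rotation rule that the source does not specify: the wheel starts with "0" centred at time zero, turns at a constant angular speed, and carries ten evenly spaced digits. The function name `characters_at` and its parameters are hypothetical.

```python
from typing import List

def characters_at(elapsed_s: float, deg_per_s: float, tol_deg: float = 1e-6) -> List[str]:
    """Return the characters visible on a ten-digit wheel after elapsed_s seconds.

    Assumed rule (not from the source): digit 0 is centred at t = 0 and the
    wheel advances toward higher digits at a constant deg_per_s.
    """
    step = 360.0 / 10                          # angular pitch between adjacent digits
    angle = (elapsed_s * deg_per_s) % 360.0    # rotation position within one turn
    index = int(angle // step)                 # digit most recently centred
    offset = angle - index * step              # how far the wheel has moved past it
    if offset < tol_deg:                       # exactly centred: single integer character
        return [str(index)]
    if step - offset < tol_deg:                # next digit is centred within tolerance
        return [str((index + 1) % 10)]
    return [str(index), str((index + 1) % 10)] # between digits: double half character
```

  • For example, with `deg_per_s=1.0` an image collected at 18 seconds would be labeled as containing "0" and "1", and one collected at 342 seconds as containing "9" and "0".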
  • Using the projection method to determine the weight value of each character contained in the character image may further include: using the projection method to determine the total character area of the character image and the local character area corresponding to each character within the total character area; determining the projection ratio of each local character area relative to the total character area; and determining the weight value of each character contained in the character image according to the projection ratio.
  • FIG. 2a shows a character image of a single integer character "1".
  • The binary image of the single integer character "1" shown in FIG. 3a is obtained after binarization is performed on FIG. 2a.
  • The projection histogram shown in Fig. 4a is obtained after projection is performed on Fig. 3a. From the projection histogram shown in Fig. 4a, it can be seen that the total character area of the character image is the X coordinate interval "8~35", and the local character area corresponding to the character "1" is also the X coordinate interval "8~35". The projection ratio of the local character area of the character "1" to the total character area is therefore 100%, and the weight value of the character "1" is 1 (100%).
  • FIG. 2b shows a character image of the double-half character "0-1".
  • the binary image of the double-half character "0-1" shown in FIG. 3b is obtained.
  • the projection histogram shown in figure 4b is obtained.
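  • The projection ratio can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each local character area is given as an inclusive coordinate interval on the projection axis, and the total character area is taken as the sum of the local areas. The function name `weights_from_intervals` is hypothetical. The interval 8~35 for "1" comes from the Fig. 4a description; the intervals used for "0" and "1" in the second call are hypothetical values chosen only so that the ratios match the 61.29% and 38.71% quoted for Fig. 2b.

```python
from typing import Dict, Tuple

def weights_from_intervals(local: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each character to the share of the total character area it occupies.

    local maps a character to the inclusive (start, end) interval of its
    local character area on the projection axis.
    """
    lengths = {ch: end - start + 1 for ch, (start, end) in local.items()}
    total = sum(lengths.values())          # assumption: total area = sum of local areas
    if total <= 0:
        raise ValueError("no character area found")
    return {ch: n / total for ch, n in lengths.items()}

# Single integer character "1" occupying the whole character area (Fig. 4a).
single = weights_from_intervals({"1": (8, 35)})

# Double half character "0-1" with hypothetical intervals of 19 and 12 bins.
double = weights_from_intervals({"0": (0, 18), "1": (24, 35)})
rounded = {ch: round(w, 4) for ch, w in double.items()}
```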
  • Labeling the character image according to the weight value of each character contained in the character image may further include: updating the preset weight sequence according to the weight value of each character contained in the character image to obtain the target weight sequence, and labeling the character image according to the target weight sequence.
  • the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters according to a preset arrangement sequence, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
  • The preset weight sequence may be [R0, R1, R2, R3, R4, R5, R6, R7, R8, R9], where the candidate characters include 0, 1, 2, ..., 9, R0 refers to the preset weight value corresponding to the candidate character "0", R1 refers to the preset weight value corresponding to the candidate character "1", and so on. Since the preset weight value of each candidate character in this embodiment is 0, the preset weight sequence may be [0,0,0,0,0,0,0,0,0,0]. The preset weight sequence is then updated according to the weight value of each character contained in the character image to obtain the target weight sequence.
  • Figure 2a shows a character image of a single character "1". If the preset weight sequence is updated according to the weight value of each character contained in the character image shown in Figure 2a, then, since that image contains the character "1" and the weight value of the character "1" is 1 (100%), the target weight sequence obtained can be [0,1,0,0,0,0,0,0,0,0].
  • FIG. 2b shows a character image of the double half character "0-1". If the preset weight sequence is updated according to the weight value of each character contained in the character image shown in FIG. 2b, then, since that image contains the double half character "0-1", the weight value of the character "0" is 61.29% and the weight value of the character "1" is 38.71%, the target weight sequence obtained can be [0.6129,0.3871,0,0,0,0,0,0,0,0].
  • By constructing a weight sequence in a fixed arrangement order, the weight values can be used directly as the label without additionally specifying character categories such as "0" or "1". A label in this sequence format is also more convenient for computing sample coverage statistics over the training samples in the training set.
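  • The label construction described above can be sketched in a few lines. The names `CANDIDATES`, `target_weight_sequence` and `coverage` are hypothetical; the coverage statistic shown (the element-wise sum of labels over a set of samples) is one simple interpretation of the sample coverage mentioned in the text, not something the source defines.

```python
from typing import Dict, List

# Candidate characters in the preset arrangement order "0" to "9".
CANDIDATES = [str(d) for d in range(10)]

def target_weight_sequence(weights: Dict[str, float]) -> List[float]:
    """Start from the all-zero preset sequence and write in each character's weight."""
    sequence = [0.0] * len(CANDIDATES)     # preset weight sequence, every entry 0
    for ch, w in weights.items():
        sequence[CANDIDATES.index(ch)] = w
    return sequence

def coverage(labels: List[List[float]]) -> List[float]:
    """Sum labels element-wise to see how much weight each candidate receives."""
    return [sum(column) for column in zip(*labels)]

label_2a = target_weight_sequence({"1": 1.0})
label_2b = target_weight_sequence({"0": 0.6129, "1": 0.3871})
totals = coverage([label_2a, label_2b])
```

  • A candidate whose total stays near zero across the training set is under-represented, which is easy to spot when every label shares the same ten-slot layout.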
  • FIG. 5 is a schematic flowchart of a model training method 50 according to an embodiment of the present application. As shown in FIG. 5, the method includes the following steps :
  • Step 51 Obtain the training set
  • the above-mentioned training set includes training samples, and the training samples are obtained according to the above-mentioned training sample generation method.
  • Step 52 Train the character recognition model according to the training set
  • the above-mentioned character recognition model is used to recognize the dial readings of the character wheel type meter.
  • the existing method is used to train the model to be trained, so that the trained character recognition model can output at least one character and its corresponding weight value according to the input character image.
  • the training method is not specifically limited here, and the model to be trained used in this embodiment may be a model such as a deep learning model or a convolutional neural network model.
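  • The source leaves the training method open. One common way to train against weighted labels of this kind is a cross-entropy loss between the model's softmax output and the target weight sequence; the sketch below shows that loss in plain Python as an assumption, not as the method the application prescribes. The function names are hypothetical.

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(logits: List[float], target: List[float]) -> float:
    """Cross-entropy between a weighted (soft) label and the predicted distribution."""
    probs = softmax(logits)
    # Terms with zero target weight contribute nothing and are skipped.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

target = [0.6129, 0.3871, 0, 0, 0, 0, 0, 0, 0, 0]   # label for the "0-1" example
uninformed = soft_cross_entropy([0.0] * 10, target)
informed = soft_cross_entropy([2.0, 1.5, 0, 0, 0, 0, 0, 0, 0, 0], target)
```

  • The loss is lower when the predicted distribution leans toward "0" and "1" in roughly the labeled proportions, which is what lets the trained model output a weight value per character rather than a single class.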
  • The label carried by the training sample used to train the character recognition model is not a single classification label, such as "0", "1" or "2", but a biased label set according to the weight value, that is, the proportion, of each character contained in the image. The trained character recognition model therefore not only recognizes character images of the single character type well, but also recognizes character images of the double half character type and can give a biased recognition result for them, so that a character recognition model with higher recognition accuracy can be trained.
  • FIG. 6 is a schematic flowchart of a character recognition method 60 according to an embodiment of the present application. As shown in FIG. 6, the character recognition method includes the following steps:
  • Step 61 Obtain the dial image of the character wheel type meter, perform character segmentation processing on the dial image, and obtain the to-be-recognized image corresponding to each character wheel of the character wheel type meter;
  • the dial image can be captured by a camera device erected above the character wheel type meter.
  • Step 62 Input the to-be-recognized image into the character recognition model to obtain the dial reading of the character wheel type meter.
  • FIG. 7 shows the dial image of a schematic character wheel type meter.
  • The dial image shown in FIG. 7 can be binarized and character-segmented to obtain the 5 to-be-recognized images shown in FIG. 8: "001.png", "002.png", "003.png", "004.png", "005.png". It should be understood that the 5 to-be-recognized images correspond to the character wheels of the character wheel type meter.
  • The five to-be-recognized images can then be input into the trained character recognition model one by one, and the following recognition results are output (assuming that the label used is the above [R0, R1, R2, R3, R4, R5, R6, R7, R8, R9]):
  • The recognized characters output by the character recognition model for "001.png", "002.png", "003.png" and "004.png" are "0", "0", "0" and "3" respectively, each with a weight value of 100%, so the first to fourth digits are the single characters 0, 0, 0, 3. For the to-be-recognized image "005.png", the recognition result output by the character recognition model is [0,0,0,0,0,0,0.79,0.21,0,0], so the recognized characters are "6" and "7", and the weight value corresponding to "6" is higher. The to-be-recognized image "005.png" is therefore a double half character "6-7" that is biased toward 6.
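  • Interpreting a recognition result in the [R0, ..., R9] layout can be sketched as below. The function name `decode` and the `threshold` used to ignore negligible weights are hypothetical; the source does not state how small outputs are treated.

```python
from typing import List, Tuple

def decode(result: List[float], threshold: float = 0.05) -> Tuple[str, str]:
    """Return (character type, dominant character) for one recognition result.

    The first element is a single character such as "3" or a double half
    character such as "6-7"; the second is the character with the highest weight.
    """
    present = [(i, w) for i, w in enumerate(result) if w >= threshold]
    if not present:
        raise ValueError("no character exceeds the threshold")
    present.sort(key=lambda item: item[1], reverse=True)   # highest weight first
    dominant = str(present[0][0])
    if len(present) == 1:
        return dominant, dominant
    # Write the two strongest characters in wheel order, e.g. "6-7" or "9-0".
    low, high = sorted(i for i, _ in present[:2])
    if (low, high) == (0, 9):
        low, high = 9, 0                                    # wrap-around from 9 to 0
    return f"{low}-{high}", dominant

wheel_5 = decode([0, 0, 0, 0, 0, 0, 0.79, 0.21, 0, 0])
wheel_4 = decode([0, 0, 0, 1.0, 0, 0, 0, 0, 0, 0])
```

  • For the "005.png" result quoted above this yields the double half character "6-7" with "6" as the dominant character, matching the interpretation given in the text.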
  • The character recognition model used to perform character recognition not only recognizes character images of the single integer character type well, but also recognizes character images of the double half character type accurately. For a character image of the double half character type, it can give a biased recognition result, so that it has higher recognition accuracy.
  • FIG. 9 is a schematic structural diagram of a training sample generating device 90 according to an embodiment of the present application. As shown in FIG. 9, it includes:
  • the obtaining module 91 is used to obtain a character image and determine each character contained in the character image
  • the labeling module 92 is configured to determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  • The acquisition module is further used to: collect the dial image of the character wheel type meter; perform character segmentation processing on the dial image to obtain a character image corresponding to the preset character wheel; and determine each character contained in the character image from the rotation position of the preset character wheel.
  • the labeling module is further used to: determine the total character area of the character image and the local character area corresponding to each character in the total character area by using the projection method; determine the projection ratio of the local character area to the total character area, And according to the projection ratio, the weight value of each character contained in the character image is determined.
  • the labeling module is further configured to: update the preset weight sequence according to the weight value of each character contained in the character image to obtain the target weight sequence, and label the character image according to the target weight sequence; wherein, the preset The weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters according to a preset arrangement sequence, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
  • In this embodiment, the character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image. The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the proportional weight value of each character contained in the character image, which makes the training sample closer to objective reality. A character recognition model trained with such training samples not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result with higher recognition accuracy.
  • FIG. 10 is a schematic structural diagram of a model training device 100 according to an embodiment of the present application. As shown in FIG. 10, it includes:
  • the training set acquisition module 101 is configured to acquire a training set, and the training set includes training samples generated by the device provided in the foregoing fourth aspect;
  • the model training module 102 is used to train a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial readings of the character wheel type meter.
  • In this embodiment, the label carried by the training sample used to train the character recognition model is not a single classification label, such as "0", "1" or "2", but a biased label set according to the proportional weight value of each character contained in the character image. The trained character recognition model therefore not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that a character recognition model with higher recognition accuracy can be trained.
  • FIG. 11 is a schematic structural diagram of a character recognition device 110 according to an embodiment of the present application. As shown in Figure 11, it includes:
  • the to-be-recognized image acquisition module 111 is used to acquire the dial image of the character wheel type meter, perform character segmentation processing on the dial image, and obtain the to-be-recognized image corresponding to each character wheel of the character wheel type meter;
  • the character recognition module 112 is used to input the image to be recognized into the character recognition model trained by the device provided in the fifth aspect to obtain the dial reading of the character wheel type meter.
  • In the character recognition device of this embodiment, the character recognition model used to perform character recognition not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that it has higher recognition accuracy.
  • It should be noted that the training sample generating device, the model training device, and the character recognition device in the embodiments of the present application can respectively implement the processes of the foregoing training sample generating method, model training method, and character recognition method, and achieve the same effects and functions, which will not be repeated here.
  • FIG. 12 is a schematic diagram of a training sample generating device according to an embodiment of the present application, which is used to execute the training sample generating method shown in FIG. 1, and the device includes:
  • at least one processor; and
  • a memory communicatively connected with the at least one processor; wherein
  • the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to:
  • obtain a character image and determine each character contained in the character image; and
  • determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  • According to some embodiments of the present application, a non-volatile computer storage medium corresponding to the above training sample generation method is provided. Computer-executable instructions are stored on it, and the computer-executable instructions are configured, when run by a processor, to:
  • obtain a character image and determine each character contained in the character image; and
  • determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  • The apparatus, equipment, and computer-readable storage medium provided in the embodiments of the present application correspond one-to-one to the method. Therefore, the apparatus, equipment, and computer-readable storage medium also have beneficial technical effects similar to those of their corresponding methods. Since the beneficial technical effects of the method have been described in detail above, the beneficial technical effects of the apparatus, equipment, and computer-readable storage medium are not repeated here.
  • the embodiments of the present invention can be provided as a method, a system, or a computer program product. Therefore, the present invention may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device. The instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing. The instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory in computer readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer readable media.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology.
  • The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible to computing devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

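A training sample generation method, a model training method, a character recognition method, and devices thereof. The training sample generation method includes: obtaining a character image and determining each character contained in the character image (step 11); and determining the weight value of each character contained in the character image by using a projection method, and labeling the character image according to the weight value of each character contained in the character image to form a training sample (step 12). The model training method includes training a character recognition model with the training sample. The character recognition method includes performing character recognition with the character recognition model. With these methods and devices, character images of the double-half character type on a character wheel type meter can be recognized accurately, and a biased recognition result with higher recognition accuracy can be given.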

Description

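Training sample generation method, model training method, character recognition method, and devices thereof
Technical Field
The present invention belongs to the technical field of image recognition, and specifically relates to a training sample generation method, a model training method, a character recognition method, and devices thereof.
Background
This section is intended to provide background or context for the embodiments of the present invention recited in the claims. The description here is not admitted to be prior art merely because it is included in this section.
With the continuing development and improvement of intelligent systems, remote meter reading technology has removed the difficulty of manual meter reading and statistics, and has become an important part of modern management systems. Meters with a wireless reading function, such as water meters, electricity meters, and gas meters, are gradually coming into use in residential areas and high-end campuses.
As the foundation and core of an automatic meter reading system for character wheel type meters, the character recognition function directly determines the quality of the system. At present, a character recognition model is usually used to recognize the dial reading of a character wheel type meter.
However, the existing solution has the following problem. For a character wheel type meter, character recognition includes recognition of single-full characters and recognition of double-half characters. During training of the character recognition model, the training samples used for the double-half character type usually carry only a class label. For the character image in FIG. 2b, for example, the label is generally "0", "1", or "0-1", which ignores the character bias that the image has because it is of the double-half character type: the actual reading in FIG. 2b clearly leans toward the character "0". Training samples that do not match the real situation may therefore be generated, and a character recognition model trained on such samples has low recognition accuracy for double-half character images.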
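Summary
In view of the above problem in the prior art, namely that training samples that do not match the real situation are easily generated and that a character recognition model trained on such samples has low recognition accuracy for double-half character images, a training sample generation method, a model training method, a character recognition method, devices thereof, and a computer-readable storage medium are proposed, with which the above problem can be solved.
The present invention provides the following solutions.
In a first aspect, a training sample generation method is provided, including: obtaining a character image and determining each character contained in the character image; and determining the weight value of each character contained in the character image by using a projection method, and labeling the character image according to the weight value of each character contained in the character image to form a training sample.
Preferably, obtaining the character image and determining each character contained in the character image further includes: collecting a dial image of a character wheel type meter; performing character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determining each character contained in the character image from the rotation position of the preset character wheel.
Preferably, determining the weight value of each character contained in the character image by using the projection method includes: determining, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area; and determining the projection ratio of each local character area to the total character area, and determining the weight value of each character contained in the character image according to the projection ratio.
Preferably, labeling the character image according to the weight value of each character contained in the character image further includes: updating a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and labeling the character image according to the target weight sequence; wherein the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
In a second aspect, a model training method is provided, including: obtaining a training set, the training set including training samples generated by the method provided in the first aspect; and training a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial reading of a character wheel type meter.
In a third aspect, a character recognition method is provided, including: obtaining a dial image of a character wheel type meter, and performing character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter; and inputting the image to be recognized into a character recognition model trained by the method provided in the second aspect, to obtain the dial reading of the character wheel type meter.
In a fourth aspect, a training sample generating device is provided, including: an obtaining module, used to obtain a character image and determine each character contained in the character image; and a labeling module, used to determine the weight value of each character contained in the character image by using a projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
Preferably, the obtaining module is further used to: collect a dial image of a character wheel type meter; perform character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determine each character contained in the character image from the rotation position of the preset character wheel.
Preferably, the labeling module is further used to: determine, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area; and determine the projection ratio of each local character area to the total character area, and determine the weight value of each character contained in the character image according to the projection ratio.
Preferably, the labeling module is further used to: update a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and label the character image according to the target weight sequence; wherein the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
In a fifth aspect, a model training device is provided, including: a training set obtaining module, used to obtain a training set, the training set including training samples generated by the device provided in the fourth aspect; and a model training module, used to train a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial reading of a character wheel type meter.
In a sixth aspect, a character recognition device is provided, including: a to-be-recognized image obtaining module, used to obtain a dial image of a character wheel type meter and perform character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter; and a character recognition module, used to input the image to be recognized into a character recognition model trained by the device provided in the fifth aspect, to obtain the dial reading of the character wheel type meter.
In a seventh aspect, a training sample generating device is provided, including: one or more multi-core processors; and a memory for storing one or more programs; wherein the one or more programs, when executed by the one or more multi-core processors, cause the one or more multi-core processors to: obtain a character image and determine each character contained in the character image; and determine the weight value of each character contained in the character image by using a projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
In an eighth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a program which, when executed by a multi-core processor, causes the multi-core processor to execute the method provided in the first aspect.
At least one of the above technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects. In this embodiment, the character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image. The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the proportional weight value of each character contained in the character image, which makes the training sample closer to objective reality. A character recognition model trained with such training samples not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result with higher recognition accuracy.
It should be understood that the above description is only an overview of the technical solutions of the present invention, given so that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the specification. To make the above and other objects, features, and advantages of the present invention more apparent, specific embodiments of the present invention are described below.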
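Brief Description of the Drawings
By reading the following detailed description of the exemplary embodiments, those of ordinary skill in the art will understand the advantages and benefits described herein as well as other advantages and benefits. The drawings are only for the purpose of illustrating exemplary embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
FIG. 1 is a schematic flowchart of a training sample generation method according to an embodiment of the present invention;
FIG. 2a is a schematic character image of the single-full character "1" in an embodiment of the present invention, and FIG. 2b is a schematic character image of the double-half character "0-1" in an embodiment of the present invention;
FIG. 3a is the binary image of FIG. 2a, and FIG. 3b is the binary image of FIG. 2b;
FIG. 4a is the projection histogram formed by projecting FIG. 3a, and FIG. 4b is the projection histogram formed by projecting FIG. 3b;
FIG. 5 is a schematic flowchart of a model training method according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of a character recognition method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a dial image in an embodiment of the present invention;
FIG. 8 is a schematic diagram of images to be recognized in an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a training sample generating device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a model training device according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a training sample generating device according to another embodiment of the present invention.
In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.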
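Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
In the present invention, it should be understood that terms such as "including" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof are present.
The terms used in this context are briefly introduced below:
A character wheel type meter is a metering device that drives one or more character wheels to rotate so that the digits marked on the wheels are shown in the reading frame of the dial for the user to read, such as the water meters and gas meters common in daily life.
It can be understood that the training sample generation method provided in the present application can be applied to any processing device with graphics processing capability. Specifically, the processing device may be a terminal, a server, or a similar device that includes a central processing unit (CPU) and/or a graphics processing unit (GPU). Terminals include desktop terminals, mobile smart terminals such as mobile phones and tablet computers, in-vehicle terminals, wearable terminals, and so on.
It should also be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments can be combined with each other. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.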
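FIG. 1 is a schematic flowchart of a training sample generation method 10 according to an embodiment of the present application. The training sample generation method 10 is used to generate training samples for training a character recognition model, and the character recognition model is used to recognize the dial reading of a character wheel type meter. In this flow, from the device perspective, the executing entity may be one or more electronic devices; from the program perspective, the executing entity may correspondingly be a program running on these electronic devices.
The flow in FIG. 1 may include the following steps 11 and 12.
Step 11: obtain a character image and determine each character contained in the character image.
Specifically, the character image is a partial image, corresponding to one character wheel, of the dial image of the character wheel type meter. The dial image may be captured by a camera mounted above the character wheel type meter, and the character image obtained by segmenting the dial image. The character image may of course come from other sources, for example from another device, or it may be an existing image; the present invention places no limitation on this. Further, the character image may contain one or more of the candidate characters, which may include "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9". Optionally, each character contained in the character image may be determined by presetting a character image collection strategy or by performing preliminary template matching on the character image, or it may be judged manually.
For example, for any character wheel of a character wheel type meter, the character image shown in the dial image may be of one of two types: single-full character or double-half character. Single-full characters include 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, and indicate that only one character is present in the character image; FIG. 2a, for example, shows a character image of the single-full character "1". Double-half characters include 0-1, 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-0, and indicate that two characters are present in the character image at the same time; FIG. 2b, for example, shows a character image of the double-half character "0-1". This embodiment does not limit the number of characters contained in a character image; with other types of character wheel type meter, three or more characters may be present in the character image at the same time.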
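The two image types described above can be enumerated with a few lines of Python. This is only an illustrative sketch: the names `SINGLE_FULL` and `DOUBLE_HALF` are hypothetical and do not come from the patent, and the sketch assumes a ten-digit wheel on which "9" is followed by "0".

```python
# Illustrative enumeration of the character image types for a ten-digit wheel.
# SINGLE_FULL and DOUBLE_HALF are hypothetical names, not taken from the patent.
CANDIDATES = [str(d) for d in range(10)]

# Ten single-full classes: exactly one digit is visible.
SINGLE_FULL = list(CANDIDATES)

# Ten double-half classes: a digit and its successor are both visible.
# The modulo makes "9" wrap around to "0", giving the class "9-0".
DOUBLE_HALF = [f"{d}-{(d + 1) % 10}" for d in range(10)]

print(SINGLE_FULL)
print(DOUBLE_HALF)
```

Together these give 20 nominal classes; the weight-based labels of step 12 refine the double-half classes by recording how much of each digit is visible.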
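Step 12: determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
Specifically, the weight value of each character contained in the character image indicates how the characters are distributed in the character image: a character with a higher weight value occupies a larger share of the character image, and the actual reading of the character wheel type meter should lean more toward the character with the higher weight value. Further, in a character wheel type meter, multiple characters are usually distributed evenly over the surface of the character wheel, and rotating the wheel shows these characters in turn in the character frame of the dial. Projecting the character image in the direction perpendicular to the character wheel therefore gives the distribution share of each character contained in the character image conveniently and accurately, and this share can be used as the weight value.
For example, for the character image of the double-half character "0-1" shown in FIG. 2b, the projection method can be used to determine the weight values corresponding to the characters "0" and "1". Although the character image contains both "0" and "1", it clearly leans toward the character "0". If the traditional labeling approach were used and the character image were labeled "0", "1", or "0-1", none of these labels would express the true reading that the character image represents. In the present application, the label of the character image carries the weight values of both characters, with "0" having the higher weight value and "1" the lower weight value, so that a training sample closer to objective reality can be generated.
In this embodiment, the character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image. The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the proportional weight value of each character contained in the character image, which makes the training sample closer to objective reality. A character recognition model trained with such training samples not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result with higher recognition accuracy.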
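Based on the training sample generation method of FIG. 1, some embodiments of the present application also provide specific implementations and extensions of the method, which are described below.
In an embodiment, step 11 may further include: collecting a dial image of a character wheel type meter; performing character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determining each character contained in the character image from the rotation position of the preset character wheel.
Specifically, the character wheel type meter includes one or more character wheels, and any one of them may be selected as the preset character wheel. Further, the preset character wheel may be made to rotate in advance according to a preset rotation rule, and the dial image captured by a camera mounted above the character wheel type meter. Character segmentation is performed on the dial image with a segmentation algorithm such as global thresholding, edge detection, or contour detection, to obtain the character image corresponding to the preset character wheel. The rotation position of the preset character wheel can then be derived from the preset rotation rule and the capture time of the dial image, and the reading that the preset character wheel currently shows on the dial can be calculated from that rotation position, which determines each character contained in the character image. This avoids the tedious step of determining the characters from the character image itself and further improves the efficiency of generating training samples.
Optionally, because the weight value of each of the characters shown on the dial is fixed once the rotation position of the preset character wheel is fixed, the weight value of each character contained in the character image can also be determined directly from the rotation position of the preset character wheel.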
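As a rough illustration of deriving characters and weights from the wheel's rotation position, the sketch below assumes a hypothetical encoding that the patent does not specify: the position is a number in [0, 10), an integer value means that digit is fully centered, and the fractional part is taken as the linear share of the next digit. The function name and the linear model are assumptions made only for this example.

```python
import math


def weights_from_wheel_position(position):
    """Map a wheel position to {character: weight}.

    Assumed (hypothetical) encoding: `position` counts digits, so 3.0 means
    "3" is fully shown and 3.25 means 75% of "3" and 25% of "4" are shown.
    """
    position = position % 10.0           # ten digits per full turn
    lower = int(math.floor(position))    # digit that is leaving the frame
    frac = position - lower              # how far the wheel has advanced
    if frac == 0.0:
        return {str(lower): 1.0}         # single-full character
    upper = (lower + 1) % 10             # "9" is followed by "0"
    return {str(lower): 1.0 - frac, str(upper): frac}


print(weights_from_wheel_position(3.0))
print(weights_from_wheel_position(9.25))
```

On a real meter the visible share of each digit also depends on the size of the reading frame, so a measured calibration curve would likely replace the linear assumption.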
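In an embodiment, determining the weight value of each character contained in the character image by using the projection method in step 12 may further include: determining, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area; and determining the projection ratio of each local character area to the total character area, and determining the weight value of each character contained in the character image according to the projection ratio.
Specifically, binarization must be performed on the character image before projection. For example, FIG. 2a shows a character image of the single-full character "1". Binarizing FIG. 2a gives the binary image of the single-full character "1" shown in FIG. 3a, and projecting FIG. 3a gives the projection histogram shown in FIG. 4a. From the projection histogram in FIG. 4a it can be seen that the total character area of the character image is the X coordinate interval 8 to 35, and the local character area corresponding to the character "1" is also the X coordinate interval 8 to 35. The projection ratio of the local character area of the character "1" to the total character area is therefore 100%, and the weight value of the character "1" is 1 (100%).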
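The binarize-then-project step can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the image is a list of rows of gray values, characters are brighter than the background, a fixed threshold of 128 is used, and the digits are stacked along the row axis so that the projection is a per-row count of foreground pixels. All function names are hypothetical.

```python
def binarize(gray, threshold=128):
    """Return a 0/1 image; pixels at or above `threshold` count as character."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def row_projection(binary):
    """Project onto the wheel's rotation axis: foreground pixels per row."""
    return [sum(row) for row in binary]


def foreground_runs(profile):
    """Return inclusive (start, end) intervals where the projection is non-zero."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a character region begins
        elif count == 0 and start is not None:
            runs.append((start, i - 1))    # the region ended on the previous row
            start = None
    if start is not None:                  # region touches the last row
        runs.append((start, len(profile) - 1))
    return runs


# Synthetic stand-in for FIG. 2a: one bright block covering rows 8..35.
gray = [[0] * 20 for _ in range(44)]
for r in range(8, 36):
    gray[r][5:15] = [255] * 10

runs = foreground_runs(row_projection(binarize(gray)))
print(runs)
```

For this synthetic image the only run is the interval 8 to 35, which mirrors the single-full example: one local character area equal to the total character area, hence a weight of 1.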
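As another example, FIG. 2b shows a character image of the double-half character "0-1". Binarizing FIG. 2b gives the binary image of the double-half character "0-1" shown in FIG. 3b, and projecting FIG. 3b gives the projection histogram shown in FIG. 4b. From the projection histogram in FIG. 4b it can be seen that the total character area of the character image consists of the X coordinate intervals 2 to 20 and 28 to 39, the local character area corresponding to the character "0" is the X coordinate interval 2 to 20, and the local character area corresponding to the character "1" is the X coordinate interval 28 to 39. The projection ratio of the local character area of the character "0" to the total character area is therefore (20-2+1)/[(20-2+1)+(39-28+1)] = 19/31 ≈ 0.6129, so the weight value of the character "0" is 61.29%. The projection ratio of the local character area of the character "1" to the total character area is (39-28+1)/[(20-2+1)+(39-28+1)] = 12/31 ≈ 0.3871, so the weight value of the character "1" is 38.71%. The weight values of the characters contained in the character image thus sum to 1. In this way, the weight value of each character contained in the character image can be obtained more conveniently.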
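The arithmetic of the FIG. 4b example can be reproduced with a short Python sketch. The function name `weights_from_runs` is hypothetical, and the sketch assumes that the characters determined in step 11 are supplied in the same order as the projection intervals.

```python
def weights_from_runs(runs, chars):
    """Turn inclusive projection intervals into per-character weights.

    `runs` holds one (start, end) interval per character; `chars` lists the
    characters in the same order (assumed known from step 11).
    """
    lengths = [end - start + 1 for start, end in runs]   # local character areas
    total = sum(lengths)                                 # total character area
    return {ch: length / total for ch, length in zip(chars, lengths)}


# Intervals read off the projection histogram of FIG. 4b.
weights = weights_from_runs([(2, 20), (28, 39)], ["0", "1"])
print({ch: round(w, 4) for ch, w in weights.items()})
```

The interval lengths are 19 and 12, so the weights are 19/31 and 12/31, which round to the 0.6129 and 0.3871 given in the text.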
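In an embodiment, labeling the character image according to the weight value of each character contained in the character image in step 12 may further include: updating a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and labeling the character image according to the target weight sequence.
Specifically, the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
For example, the preset weight sequence may be [R0, R1, R2, R3, R4, R5, R6, R7, R8, R9], where the plurality of candidate characters are 0, 1, 2, ..., 9, R0 is the preset weight value corresponding to the candidate character "0", R1 is the preset weight value corresponding to the candidate character "1", and so on. Further, because the preset weight value of each candidate character is 0 in this embodiment, the preset weight sequence may be [0,0,0,0,0,0,0,0,0,0]. The preset weight sequence is then updated according to the weight value of each character contained in the character image to obtain the target weight sequence.
For example, FIG. 2a shows a character image of the single-full character "1". If the preset weight sequence is updated according to the weight value of each character contained in the character image shown in FIG. 2a, then, because that character image contains the character "1" with a weight value of 1 (100%), the resulting target weight sequence may be [0,1,0,0,0,0,0,0,0,0].
As another example, FIG. 2b shows a character image of the double-half character "0-1". If the preset weight sequence is updated according to the weight value of each character contained in the character image shown in FIG. 2b, then, because that character image contains the double-half character "0-1" with a weight value of 61.29% for the character "0" and 38.71% for the character "1", the resulting target weight sequence may be [0.6129,0.3871,0,0,0,0,0,0,0,0].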
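Building the target weight sequence from the per-character weights can be sketched as below. The names `CANDIDATES` and `target_weight_sequence` are hypothetical; the candidate order "0" to "9" follows the [R0, ..., R9] example in the text.

```python
CANDIDATES = "0123456789"   # preset order of the candidate characters


def target_weight_sequence(char_weights):
    """Start from the all-zero preset sequence and fill in the known weights."""
    sequence = [0.0] * len(CANDIDATES)           # preset weight sequence
    for ch, weight in char_weights.items():
        sequence[CANDIDATES.index(ch)] = weight  # update the matching slot
    return sequence


print(target_weight_sequence({"1": 1.0}))                   # FIG. 2a
print(target_weight_sequence({"0": 0.6129, "1": 0.3871}))   # FIG. 2b
```

The result always has one entry per candidate character, so every label has the same length of ten regardless of how many characters the image contains.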
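Because the order of the candidate characters is set in advance, a weight sequence built from the weight values in that order can be used directly as the label, without additionally carrying the specific character classes such as "0" or "1". Labels in this sequence format also make it easier to compute the sample coverage of the training samples in a training set.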
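The patent does not define "sample coverage", so the sketch below takes one plausible reading as an assumption: the fraction of the 20 nominal classes (ten single-full and ten double-half) that appear at least once in the training labels. All names are hypothetical.

```python
from collections import Counter

# The 20 nominal classes of a ten-digit wheel, e.g. "3" or "9-0".
ALL_CLASSES = [str(d) for d in range(10)] + [f"{d}-{(d + 1) % 10}" for d in range(10)]


def class_key(label):
    """Derive the nominal class from the non-zero slots of a weight sequence."""
    present = [i for i, w in enumerate(label) if w > 0]
    if present == [0, 9]:          # wrap-around pair is written "9-0"
        return "9-0"
    return "-".join(str(i) for i in present)


def coverage(labels):
    """Return per-class sample counts and the fraction of classes covered."""
    counts = Counter(class_key(label) for label in labels)
    covered = [c for c in ALL_CLASSES if counts[c] > 0]
    return counts, len(covered) / len(ALL_CLASSES)


labels = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0.6129, 0.3871, 0, 0, 0, 0, 0, 0, 0, 0],
    [0.3, 0, 0, 0, 0, 0, 0, 0, 0, 0.7],
]
counts, ratio = coverage(labels)
print(dict(counts), ratio)
```

A finer-grained statistic, such as a histogram of the weight values within each double-half class, could be computed from the same labels.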
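Based on the above training sample generation method, an embodiment of the present application also provides a model training method. FIG. 5 is a schematic flowchart of a model training method 50 according to an embodiment of the present application. As shown in FIG. 5, the method includes the following steps:
Step 51: obtain a training set.
The training set includes training samples obtained by the above training sample generation method.
Step 52: train a character recognition model according to the training set.
The character recognition model is used to recognize the dial reading of a character wheel type meter.
In this embodiment, the model to be trained is trained with an existing method, so that the resulting character recognition model can output at least one character and its corresponding weight value for an input character image. The training method is not specifically limited here; the model to be trained in this embodiment may be, for example, a deep learning model or a convolutional neural network model.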
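The patent leaves the training method open. One common way to train against weight-sequence labels is a softmax output with a cross-entropy loss that accepts soft targets; the sketch below shows only that loss in plain Python and is an assumption, not the method claimed in the patent.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw model outputs."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target (weight sequence) and the prediction."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


target = [0.6129, 0.3871, 0, 0, 0, 0, 0, 0, 0, 0]      # label for FIG. 2b
uniform_logits = [0.0] * 10                             # untrained guess
matched_logits = [math.log(0.6129), math.log(0.3871)] + [-50.0] * 8

print(soft_cross_entropy(uniform_logits, target))
print(soft_cross_entropy(matched_logits, target))
```

The loss is lower when the predicted distribution matches the weight sequence than when it is uniform, which is the property that pushes a model toward outputting the per-character weights.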
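In this embodiment, the label carried by the training samples used to train the character recognition model is not a single classification label, such as "0", "1" or "2", but a biased label set according to the proportional weight value of each character contained in the character image. The trained character recognition model therefore not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that a character recognition model with higher recognition accuracy can be trained.
Based on the above model training method, an embodiment of the present application also provides a character recognition method. FIG. 6 is a schematic flowchart of a character recognition method 60 according to an embodiment of the present application. As shown in FIG. 6, the character recognition method includes the following steps:
Step 61: obtain a dial image of a character wheel type meter, and perform character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter.
Specifically, the dial image may be captured by a camera mounted above the character wheel type meter.
Step 62: input the image to be recognized into the character recognition model to obtain the dial reading of the character wheel type meter.
Specifically, the character recognition model is trained by the model training method shown in FIG. 5.
For example, FIG. 7 shows a schematic dial image of a character wheel type meter. Binarization and character segmentation can be performed on the dial image shown in FIG. 7 to obtain the five images to be recognized shown in FIG. 8: "001.png", "002.png", "003.png", "004.png", and "005.png". It should be understood that these five images to be recognized correspond to the five character wheels of the character wheel type meter. Each of the five images can then be input into the trained character recognition model, which outputs the following recognition results (assuming the label format [R0, R1, R2, R3, R4, R5, R6, R7, R8, R9] described above):
"001.png": [1,0,0,0,0,0,0,0,0,0]
"002.png": [1,0,0,0,0,0,0,0,0,0]
"003.png": [1,0,0,0,0,0,0,0,0,0]
"004.png": [0,0,0,1,0,0,0,0,0,0]
"005.png": [0,0,0,0,0,0,0.79,0.21,0,0]
It can be seen that the characters recognized for "001.png", "002.png", "003.png", and "004.png" are "0", "0", "0", and "3" respectively, each with a weight value of 100%, so the first to fourth positions are the single-full characters 0, 0, 0, and 3. For the image to be recognized "005.png", however, the character recognition model outputs [0,0,0,0,0,0,0.79,0.21,0,0], so the recognized characters are "6" and "7", with the higher weight value on "6". The image "005.png" is therefore the double-half character "6-7", leaning toward 6.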
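Reading the recognition results off the weight vectors can be sketched as follows. The function name `decode` and the cutoff `min_weight` are assumptions introduced for this example; the patent does not state how small weights are handled.

```python
CANDIDATES = "0123456789"


def decode(weights, min_weight=0.05):
    """Return (character, weight) pairs above `min_weight`, heaviest first.

    `min_weight` is a hypothetical cutoff for ignoring near-zero outputs.
    """
    present = [(CANDIDATES[i], w) for i, w in enumerate(weights) if w > min_weight]
    return sorted(present, key=lambda pair: pair[1], reverse=True)


outputs = {
    "001.png": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "004.png": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "005.png": [0, 0, 0, 0, 0, 0, 0.79, 0.21, 0, 0],
}
for name, vector in outputs.items():
    chars = decode(vector)
    kind = "single-full" if len(chars) == 1 else "double-half"
    print(name, kind, chars)
```

For "005.png" the decoded pairs are "6" with 0.79 and "7" with 0.21, matching the conclusion in the text that the wheel shows "6-7" and leans toward 6.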
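In the character recognition method of this embodiment, the character recognition model used to perform character recognition not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that it has higher recognition accuracy.
Based on the above training sample generation method, an embodiment of the present application also provides a training sample generating device. FIG. 9 is a schematic structural diagram of a training sample generating device 90 according to an embodiment of the present application. As shown in FIG. 9, it includes:
an obtaining module 91, used to obtain a character image and determine each character contained in the character image;
a labeling module 92, used to determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
In an embodiment, the obtaining module is further used to: collect a dial image of a character wheel type meter; perform character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel; and determine each character contained in the character image from the rotation position of the preset character wheel.
In an embodiment, the labeling module is further used to: determine, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area; and determine the projection ratio of each local character area to the total character area, and determine the weight value of each character contained in the character image according to the projection ratio.
In an embodiment, the labeling module is further used to: update a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and label the character image according to the target weight sequence; wherein the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.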
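The two-module structure of device 90 can be pictured as a small composition of callables. This is only a structural sketch: the class name, the field names, and the stub functions are hypothetical, and the real modules would wrap the acquisition and projection logic described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]


@dataclass
class TrainingSampleGenerator:
    """Hypothetical stand-in for device 90, composed of modules 91 and 92."""
    obtain: Callable[[], Tuple[Image, List[str]]]      # obtaining module 91
    label: Callable[[Image, List[str]], List[float]]   # labeling module 92

    def generate(self):
        image, chars = self.obtain()                   # image plus its characters
        return image, self.label(image, chars)         # training sample


def stub_obtain():
    """Stub: pretend a 1x1 image showing a fully centered "1" was captured."""
    return [[255]], ["1"]


def stub_label(image, chars):
    """Stub: put all weight on the single known character."""
    sequence = [0.0] * 10
    sequence[int(chars[0])] = 1.0
    return sequence


sample = TrainingSampleGenerator(stub_obtain, stub_label).generate()
print(sample)
```

Keeping the two modules as separate callables mirrors the patent's split between determining the characters (step 11) and assigning the weights (step 12), so either part can be swapped independently.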
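In this embodiment, the character image of the character wheel type meter is first obtained, and the character image is projected by the projection method to obtain the weight value of each character contained in the character image. The label of the character image is then determined according to the weight value corresponding to each character, so that the label carried by the generated training sample is not a single classification label (such as "0", "1" or "2") but a biased label set according to the proportional weight value of each character contained in the character image, which makes the training sample closer to objective reality. A character recognition model trained with such training samples not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result with higher recognition accuracy.
Based on the above model training method, an embodiment of the present application also provides a model training device for training a character recognition model that is used to recognize the dial reading of a character wheel type meter. FIG. 10 is a schematic structural diagram of a model training device 100 according to an embodiment of the present application. As shown in FIG. 10, it includes:
a training set obtaining module 101, used to obtain a training set, the training set including training samples generated by the device provided in the fourth aspect;
a model training module 102, used to train a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial reading of a character wheel type meter.
In this embodiment, the label carried by the training samples used to train the character recognition model is not a single classification label, such as "0", "1" or "2", but a biased label set according to the proportional weight value of each character contained in the character image. The trained character recognition model therefore not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that a character recognition model with higher recognition accuracy can be trained.
Based on the above character recognition method, an embodiment of the present application also provides a character recognition device for recognizing the dial reading of a character wheel type meter. FIG. 11 is a schematic structural diagram of a character recognition device 110 according to an embodiment of the present application. As shown in FIG. 11, it includes:
a to-be-recognized image obtaining module 111, used to obtain a dial image of a character wheel type meter and perform character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter;
a character recognition module 112, used to input the image to be recognized into a character recognition model trained by the device provided in the fifth aspect, to obtain the dial reading of the character wheel type meter.
In the character recognition device of this embodiment, the character recognition model used to perform character recognition not only recognizes character images of the single-full character type well, but also accurately recognizes character images of the double-half character type, for which it can give a biased recognition result, so that it has higher recognition accuracy.
It should be noted that the training sample generating device, the model training device, and the character recognition device in the embodiments of the present application can respectively implement the processes of the embodiments of the foregoing training sample generation method, model training method, and character recognition method, and achieve the same effects and functions, which will not be repeated here.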
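FIG. 12 is a schematic diagram of a training sample generating device according to an embodiment of the present application, which is used to execute the training sample generation method shown in FIG. 1. The device includes:
at least one processor; and
a memory communicatively connected with the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to:
obtain a character image and determine each character contained in the character image;
determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
According to some embodiments of the present application, a non-volatile computer storage medium corresponding to the above training sample generation method is provided. Computer-executable instructions are stored on it, and the computer-executable instructions are configured, when run by a processor, to:
obtain a character image and determine each character contained in the character image;
determine the weight value of each character contained in the character image by using the projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
The embodiments in the present application are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device, equipment, and computer-readable storage medium embodiments are basically similar to the method embodiments, their description is simplified, and the relevant parts of the method embodiments can be consulted.
The device, equipment, and computer-readable storage medium provided in the embodiments of the present application correspond one-to-one to the method. The device, equipment, and computer-readable storage medium therefore also have beneficial technical effects similar to those of the corresponding method. Because the beneficial technical effects of the method have been described in detail above, those of the device, equipment, and computer-readable storage medium are not repeated here.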
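Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, a system, or a computer program product. The present invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, equipment (systems), and computer program products according to embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible to computing devices.
In addition, although the operations of the method of the present invention are described in a specific order in the drawings, this does not require or imply that the operations must be performed in that specific order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the present invention is not limited to the specific embodiments disclosed, and the division into aspects does not mean that features in these aspects cannot be combined to advantage; the division is only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.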

Claims (14)

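  1. A training sample generation method, characterized by comprising:
    obtaining a character image, and determining each character contained in the character image;
    determining the weight value of each character contained in the character image by using a projection method, and labeling the character image according to the weight value of each character contained in the character image to form a training sample.
  2. The method according to claim 1, characterized in that obtaining the character image and determining each character contained in the character image further comprises:
    collecting a dial image of a character wheel type meter;
    performing character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel;
    determining each character contained in the character image from the rotation position of the preset character wheel.
  3. The method according to claim 1, characterized in that determining the weight value of each character contained in the character image by using the projection method comprises:
    determining, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area;
    determining the projection ratio of the local character area to the total character area, and determining the weight value of each character contained in the character image according to the projection ratio.
  4. The method according to claim 1, characterized in that labeling the character image according to the weight value of each character contained in the character image further comprises:
    updating a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and labeling the character image according to the target weight sequence;
    wherein the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
  5. A model training method, characterized by comprising:
    obtaining a training set, the training set including training samples generated by the method according to any one of claims 1-4;
    training a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial reading of a character wheel type meter.
  6. A character recognition method, characterized by comprising:
    obtaining a dial image of a character wheel type meter, and performing character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter;
    inputting the image to be recognized into a character recognition model trained by the method according to claim 5, to obtain the dial reading of the character wheel type meter.
  7. A training sample generating device, characterized by comprising:
    an obtaining module, used to obtain a character image and determine each character contained in the character image;
    a labeling module, used to determine the weight value of each character contained in the character image by using a projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  8. The device according to claim 7, characterized in that the obtaining module is further used to:
    collect a dial image of a character wheel type meter;
    perform character segmentation processing on the dial image to obtain a character image corresponding to a preset character wheel;
    determine each character contained in the character image from the rotation position of the preset character wheel.
  9. The device according to claim 7, characterized in that the labeling module is further used to:
    determine, by using the projection method, the total character area of the character image and the local character area corresponding to each character within the total character area;
    determine the projection ratio of the local character area to the total character area, and determine the weight value of each character contained in the character image according to the projection ratio.
  10. The device according to claim 7, characterized in that the labeling module is further used to:
    update a preset weight sequence according to the weight value of each character contained in the character image to obtain a target weight sequence, and label the character image according to the target weight sequence;
    wherein the preset weight sequence is constructed in advance from the preset weight values of a plurality of candidate characters in a preset order, and the preset weight value of each candidate character in the plurality of candidate characters is 0.
  11. A model training device, characterized by comprising:
    a training set obtaining module, used to obtain a training set, the training set including training samples generated by the device according to any one of claims 7-10;
    a model training module, used to train a character recognition model according to the training set, wherein the character recognition model is used to recognize the dial reading of a character wheel type meter.
  12. A character recognition device, characterized by comprising:
    a to-be-recognized image obtaining module, used to obtain a dial image of a character wheel type meter and perform character segmentation processing on the dial image to obtain an image to be recognized corresponding to each character wheel of the character wheel type meter;
    a character recognition module, used to input the image to be recognized into a character recognition model trained by the device according to claim 11, to obtain the dial reading of the character wheel type meter.
  13. A training sample generating device, characterized by comprising:
    one or more multi-core processors;
    a memory for storing one or more programs;
    wherein the one or more programs, when executed by the one or more multi-core processors, cause the one or more multi-core processors to:
    obtain a character image, and determine each character contained in the character image;
    determine the weight value of each character contained in the character image by using a projection method, and label the character image according to the weight value of each character contained in the character image to form a training sample.
  14. A computer-readable storage medium storing a program, wherein the program, when executed by a multi-core processor, causes the multi-core processor to execute the method according to any one of claims 1-4.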
PCT/CN2020/126197 2019-12-05 2020-11-03 训练样本生成、模型训练、字符识别方法及其装置 WO2021109775A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/782,677 US20230007989A1 (en) 2019-12-05 2020-11-03 Methods and devices for generating training sample, training model and recognizing character

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911233955.0 2019-12-05
CN201911233955.0A CN111079763B (zh) 2019-12-05 2019-12-05 训练样本生成、模型训练、字符识别方法及其装置

Publications (1)

Publication Number Publication Date
WO2021109775A1 true WO2021109775A1 (zh) 2021-06-10

Family

ID=70313065

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/126197 WO2021109775A1 (zh) 2019-12-05 2020-11-03 训练样本生成、模型训练、字符识别方法及其装置

Country Status (3)

Country Link
US (1) US20230007989A1 (zh)
CN (1) CN111079763B (zh)
WO (1) WO2021109775A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111079763B (zh) * 2019-12-05 2023-08-08 嘉楠明芯(北京)科技有限公司 训练样本生成、模型训练、字符识别方法及其装置
CN112464932A (zh) * 2020-11-26 2021-03-09 广东工业大学 水表数值自动读取方法、装置、电子设备及存储介质
CN112446383B (zh) * 2020-11-30 2022-09-02 展讯通信(上海)有限公司 车牌识别方法及装置、存储介质、终端
CN112381177A (zh) * 2020-12-07 2021-02-19 江苏科技大学 一种基于深度学习的表盘数字字符识别方法及系统
CN113269194A (zh) * 2021-06-11 2021-08-17 四川长虹网络科技有限责任公司 读数表不完整字符识别方法以及读数表字符识别方法
CN113516110B (zh) * 2021-09-13 2021-12-21 成都千嘉科技有限公司 基于图像分割的燃气表字轮坐标提取方法
CN114973248B (zh) * 2022-05-18 2023-03-24 慧之安信息技术股份有限公司 基于ocr识别的pdf识别方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709530A (zh) * 2017-01-17 2017-05-24 中国科学院上海高等研究院 基于视频的车牌识别方法
US20170308768A1 (en) * 2015-01-15 2017-10-26 Suntront Tech Co., Ltd Character information recognition method based on image processing
CN108491844A (zh) * 2018-02-07 2018-09-04 西安工程大学 基于图像处理的水表自动检测系统及其图像处理方法
CN111079763A (zh) * 2019-12-05 2020-04-28 北京嘉楠捷思信息技术有限公司 训练样本生成、模型训练、字符识别方法及其装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9008429B2 (en) * 2013-02-01 2015-04-14 Xerox Corporation Label-embedding for text recognition
CN105825212A (zh) * 2016-02-18 2016-08-03 江西洪都航空工业集团有限责任公司 一种基于Hadoop的分布式车牌识别方法
CN110245613B (zh) * 2019-06-17 2023-01-20 珠海华园信息技术有限公司 基于深度学习特征对比的船牌识别方法
CN110503090B (zh) * 2019-07-09 2021-11-09 中国科学院信息工程研究所 基于受限注意力模型的字符检测网络训练方法、字符检测方法和字符检测器

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170308768A1 (en) * 2015-01-15 2017-10-26 Suntront Tech Co., Ltd Character information recognition method based on image processing
CN106709530A (zh) * 2017-01-17 2017-05-24 中国科学院上海高等研究院 基于视频的车牌识别方法
CN108491844A (zh) * 2018-02-07 2018-09-04 西安工程大学 基于图像处理的水表自动检测系统及其图像处理方法
CN111079763A (zh) * 2019-12-05 2020-04-28 北京嘉楠捷思信息技术有限公司 训练样本生成、模型训练、字符识别方法及其装置

Also Published As

Publication number Publication date
US20230007989A1 (en) 2023-01-12
CN111079763B (zh) 2023-08-08
CN111079763A (zh) 2020-04-28

Similar Documents

Publication Publication Date Title
WO2021109775A1 (zh) 训练样本生成、模型训练、字符识别方法及其装置
WO2022089360A1 (zh) 人脸检测神经网络及训练方法、人脸检测方法、存储介质
CN110659636A (zh) 基于深度学习的指针式仪表读数识别方法
AU2020103716A4 (en) Training method and device of automatic identification device of pointer instrument with numbers in natural scene
US20210090266A1 (en) Method and device for labeling point of interest
CN111854758A (zh) 一种基于建筑楼cad图的室内导航地图转换方法及系统
CN110704559B (zh) 一种多尺度矢量面数据匹配方法
CN104463826A (zh) 一种新的点云并行Softassign配准算法
CN109711441B (zh) 图像分类方法、装置、存储介质及电子设备
CN110991437B (zh) 字符识别方法及其装置、字符识别模型的训练方法及其装置
CN110874591A (zh) 一种图像定位方法、装置、设备及存储介质
CN111027456A (zh) 基于图像识别的机械水表读数识别方法
CN110909804B (zh) 基站异常数据的检测方法、装置、服务器和存储介质
CN113280764A (zh) 基于多星协同技术的输变电工程扰动范围定量监测方法及系统
CN117274388A (zh) 基于视觉文本关系对齐的无监督三维视觉定位方法及系统
CN111144466B (zh) 一种图像样本自适应的深度度量学习方法
CN115410174B (zh) 一种两阶段车险反欺诈图像采集质检方法、装置和系统
CN117173223A (zh) 电表断码屏的标准模板生成方法、装置、设备及介质
CN114219804B (zh) 一种基于原型分割网络的小样本牙齿检测方法及存储介质
CN114580975A (zh) 街区活力获得方法及系统及装置及介质
CN114708462A (zh) 多数据训练的检测模型生成方法、系统、设备及存储介质
CN114155524A (zh) 单阶段3d点云目标检测方法及装置、计算机设备、介质
CN114417965A (zh) 图像处理模型的训练方法、目标检测方法及相关装置
CN113808142A (zh) 一种地面标识的识别方法、装置、电子设备
CN113610909B (zh) 一种基于距离搜索的点云剖面生成系统及方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20897356

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20897356

Country of ref document: EP

Kind code of ref document: A1