CN111476232A - Water washing label detection method, equipment and storage medium - Google Patents

Water washing label detection method, equipment and storage medium

Info

Publication number
CN111476232A
CN111476232A (application CN202010166048.5A)
Authority
CN
China
Prior art keywords: information, character, text, network, character information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010166048.5A
Other languages
Chinese (zh)
Inventor
刘艳丽
王毅宏
张恒
乔国
李炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Manifen Garment Co ltd
East China Jiaotong University
Original Assignee
Jiangxi Manifen Garment Co ltd
East China Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Manifen Garment Co ltd, East China Jiaotong University filed Critical Jiangxi Manifen Garment Co ltd
Priority to CN202010166048.5A
Publication of CN111476232A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The application relates to the technical field of image recognition, and in particular to a washing label detection method, device and storage medium, used to solve the technical problems of slow text information extraction and high error rates caused by manual character recognition of images in the prior art. The method comprises the following steps: obtaining effective image information of an object to be recognized, wherein the effective image information contains a text region; recognizing character information in the effective image information using a character recognition model; comparing the character information with standard character information prestored in a database; and when the character information is inconsistent with the standard character information prestored in the database, determining the character information to be corrected in the object to be recognized. All character information in the image text is thus recognized automatically by the character recognition model, which reduces labor cost and improves both character recognition efficiency and the recognition accuracy of irregular text.

Description

Water washing label detection method, equipment and storage medium
Technical Field
The application relates to the technical field of image recognition, and in particular to a washing label detection method, device and storage medium.
Background
Optical Character Recognition (OCR) is the process of analyzing, recognizing and processing scanned document images to obtain text and layout information. The technique has further been applied to character recognition in scenes, i.e., recognizing character information in natural scene images. OCR technology is now widely applied across industries, with applications including identity card recognition, bill recognition, license plate recognition, hang tag recognition, washing label recognition, and so on.
When performing character recognition on hang tags and washing labels in a production workshop, it is necessary to quickly detect whether their printing and matching are erroneous in order to improve hang tag and washing label quality. At present, most hang tags and washing labels are inspected manually for text errors. Manual inspection of hang tags and washing labels is tedious and cumbersome, and consumes considerable time and cost; it also suffers from slow text information extraction, high error rates, and other complex problems.
In view of the above, a washing label detection method needs to be redesigned to overcome the above drawbacks.
Disclosure of Invention
The embodiments of the application provide a washing label detection method, device and storage medium, used to solve the technical problems of slow text information extraction and high error rates caused by manual character recognition of images in the prior art.
The embodiment of the application provides the following specific technical scheme:
A first aspect of the embodiments of the application provides a washing label detection method, comprising:
obtaining effective image information of an object to be identified, wherein the effective image information comprises a text area, and the text area comprises an irregular-shaped text; the object to be identified comprises a washing label;
recognizing by adopting a character recognition model to obtain character information in the effective image information;
comparing the character information with standard character information prestored in a database to obtain a comparison result;
and when the comparison result shows that the character information is inconsistent with standard character information prestored in a database, determining character information to be corrected in the object to be recognized.
Optionally, before obtaining the effective image information of the object to be recognized, the method further includes:
acquiring image information of an object to be identified;
and performing a preprocessing operation on the image information to obtain the effective image information of the object to be identified, wherein the preprocessing operation comprises graying, downsampling, Gaussian denoising, binarization, rotation and/or cropping.
Optionally, the character recognition model includes a feature pyramid network, a region generation network, a fast region convolutional neural network, a text segmentation network, an attention-based sequence-to-sequence network, and a reconstruction network; obtaining the character information in the effective image information using the character recognition model specifically comprises:
extracting feature information in the effective image by adopting the feature pyramid network;
inputting the characteristic information into a region generation network to generate the characteristic information of a target text candidate box corresponding to the effective image;
inputting the characteristic information of the target text candidate box into a fast regional convolutional neural network, and classifying the target text candidate box by using a classifier according to the characteristic information of the target text candidate box;
performing character segmentation and example segmentation on the target text in the target text candidate box by using the text segmentation network;
recognizing the segmented characters and examples by adopting a sequence-to-sequence network based on an attention mechanism to obtain a recognition result;
and integrating the recognition results by adopting a reconstruction network to obtain all character information in the effective image information.
Optionally, the attention-based sequence-to-sequence network may further include a multi-layer bidirectional LSTM network, and before the recognition results are integrated by the reconstruction network, the method may further include:
performing bidirectional analysis on the feature sequences corresponding to the segmented characters and instances using the multi-layer bidirectional LSTM network, to determine the contextual correlation between the characters and instances.
Optionally, comparing the text information with the standard text information prestored in a database to obtain a comparison result specifically includes:
and matching and comparing the character string corresponding to the text information with a character string corresponding to standard text information prestored in a database by using a KMP algorithm.
Optionally, the object to be identified is a clothing drop or a washing label, and the text information includes washing mode information, prompt information and style component information;
After determining the character information to be corrected in the object to be recognized, the method further comprises:
correcting the character information to be corrected into character information consistent with the corresponding standard character information according to the standard character information prestored in the database.
Optionally, the text segmentation network includes a character segmentation network and an instance segmentation network; performing character segmentation and example segmentation on the target text in the target text candidate box by using the text segmentation network, which specifically comprises the following steps:
and respectively sending the classified characteristic information of the target text candidate box into the character segmentation network and the example segmentation network to obtain the segmented character information and example information.
A second aspect of the embodiments of the application provides a washing label detection device, comprising:
the effective image information acquisition module is used for acquiring effective image information of an object to be identified, wherein the effective image information contains a text area, and the text area comprises an irregular-shaped text;
the character information identification module is used for identifying character information in the effective image information by adopting a character identification model;
the comparison module is used for comparing the character information with standard character information prestored in a database to obtain a comparison result;
and the character information to be corrected determining module is used for determining the character information to be corrected in the object to be recognized when the comparison result shows that the character information is inconsistent with the standard character information prestored in the database.
A third aspect of the embodiments of the application provides a washing label detection device, the device comprising: a memory and a processor; wherein:
a memory for storing executable instructions;
a processor for reading and executing the executable instructions stored in the memory to implement any of the methods described above.
In a fourth aspect of the embodiments of the present application, there is also provided a storage medium, wherein instructions of the storage medium, when executed by a processor, enable execution of the method according to any one of the above.
In the embodiments of the application, all text in the image to be recognized, including character information in irregular text, is recognized automatically by the character recognition model; the recognized character information is then compared with the standard character information prestored in the database to judge whether the character information in the image to be recognized is accurate. This approach overcomes the prior-art problems of slow text information extraction and high error rates caused by manual character recognition of images, reduces labor cost, and improves both character recognition efficiency and the recognition accuracy of irregular text.
Drawings
FIG. 1 is a system architecture diagram of the washing label detection method in an embodiment of the present application;
FIG. 2 is a schematic flow chart of the washing label detection method in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of the character recognition model in an embodiment of the present application;
FIG. 4 is a schematic diagram of the hang tag and washing label character recognition method in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of the image character recognition device corresponding to FIG. 2 in an embodiment of the present application;
fig. 6 is a schematic diagram of a server structure in an embodiment of the present application.
Detailed Description
In order to solve the technical problems of slow text information extraction and high error rates caused by manual character recognition of images in the prior art, in the embodiments of the application all text in the image to be recognized, including character information in irregular text, is recognized automatically by a character recognition model, and the recognized character information is then compared with the standard character information prestored in the database to judge whether the character information in the image to be recognized is accurate. This approach overcomes the prior-art problems of slow text information extraction and high error rates, and improves the character recognition efficiency for irregular text.
The present application will now be described in further detail with reference to the accompanying drawings, in which:
Fig. 1 is a system architecture diagram of the washing label detection method in an embodiment of the present application.
As shown in fig. 1, in a specific implementation, the system architecture of the washing label detection method may include a mode selection unit 101, an input unit 102, a processing unit 103 and an output unit 104. The mode selection unit 101 is used to select the object to be recognized and to control the input unit 102 to acquire the corresponding image information; the input unit 102 acquires the object to be recognized and transmits it to the processing unit 103, and the processing unit 103 may preprocess the image information of the object to be recognized, extract the text image containing text information, and extract the text information from the picture using character recognition technology. The extracted image text information is then compared with the standard information stored in the database; the output unit 104 is used to visually output the comparison result and other detailed information.
In practical applications, the mode selection unit 101 and the input unit 102, the input unit 102 and the processing unit 103, and the processing unit 103 and the output unit 104 may be partially or entirely connected in a wired or wireless manner, to ensure effective transmission of data.
The user selects the object to be recognized through the touch screen, and the selection is sent to the processing unit 103. The input unit 102 acquires images of the hang tag and the washing label under the control of the processing unit 103 and transmits them to the processing unit. Specifically, the processing unit 103 controls the input unit to capture images at a frequency of 30 frames per second and performs sharpness screening and occlusion/effective-area screening on the captured images. If a picture passes the screening, it is sent to the processing unit 103; if not, the picture is captured again. After acquisition is completed, the acquired images are sent to the processing unit 103.
Next, a washing label detection method provided by the embodiment of the present application will be specifically described with reference to the accompanying drawings:
FIG. 2 is a schematic flow chart of the washing label detection method in an embodiment of the present application. The execution subject of the flow may be a program installed on an application server or an application client.
As shown in fig. 2, the process may include the following steps:
step 201: obtaining effective image information of an object to be identified, wherein the effective image information contains a text region, and the text region comprises an irregular-shaped text.
In this scheme, the object to be recognized may be a clothing hang tag, a washing label, or the like. A washing label, also called a care label or wash label, marks the fabric composition of a garment and the correct washing method: for example, dry/machine/hand washing, whether bleaching is permitted, drying methods, ironing temperature requirements, etc., to guide the user in the proper washing and maintenance of the garment.
In current clothing enterprises, each piece of clothing is provided with a hang tag and a washing label. A large number of standards may relate to hang tags and washing labels (such as ready-made garment standards, washing standards, style standards, thickness standards and position standards), and the recorded clothing name, grade, description and even symbols need to meet the corresponding national or even international standards (for example, FZ/T43014-2008 denotes the national standard for silk scarves). Therefore, the text information on hang tags and washing labels needs to be identified and verified.
However, the characters on hang tags and washing labels may have irregular shapes (for example horizontal, multi-oriented, curved or otherwise complicated text). An irregular font here refers to shapes caused by perspective distortion, curved text placement, and the like. Therefore, when recognizing the character information, valid image information corresponding to the object to be recognized must first be acquired; for example, images of the hang tag and washing label captured by a camera device can be acquired, and the effective image information containing the text region determined.
Step 202: and identifying by adopting a character identification model to obtain character information in the effective image information.
The character recognition model mentioned here can be used to recognize all character information in the effective image information, including letters, symbols, Chinese characters, etc. The character recognition model may be an Optical Character Recognition (OCR) model applied to natural scenes, for example a model based on the Mask-OCR character recognition technique.
Step 203: and comparing the character information with standard character information prestored in a database to obtain a comparison result.
For the application scenario of clothing hang tags and washing labels, the text information on the hang tag and washing label corresponding to each type of clothing has corresponding standard text information. For example, the text information on the hang tags and washing labels of garments of the same batch and model may be identical except for the size of the individual garment.
In practical applications, it is necessary to judge whether the text information on the hang tag and washing label contains inconsistencies such as printing errors. Therefore, whether the character information on each hang tag and washing label is consistent with the standard information prestored in the database can be checked by comparison.
Step 204: and when the comparison result shows that the character information is inconsistent with standard character information prestored in a database, determining character information to be corrected in the object to be recognized.
If the comparison result shows that the character information is consistent with the standard character information prestored in the database, the character information of the object to be recognized can be considered correct; if the comparison result shows an inconsistency, the character information of the object to be recognized can be considered erroneous.
In the method shown in fig. 2, all text in the image to be recognized, including character information in irregular text, is recognized automatically by the character recognition model, and the recognized character information is then compared with the standard character information prestored in the database to determine whether the character information in the image to be recognized is accurate. This approach overcomes the prior-art problems of slow text information extraction and high error rates caused by manual character recognition of images, reduces labor cost, and improves both character recognition efficiency and the recognition accuracy of irregular text.
Optionally, before the obtaining the effective image information of the object to be recognized, the method may further include:
acquiring image information of an object to be identified;
and performing a preprocessing operation on the image information to obtain the effective image information of the object to be recognized, wherein the preprocessing operation comprises graying, downsampling, Gaussian denoising, binarization, rotation and/or cropping.
In a practical application scenario, an image acquisition device may be employed to acquire the image information of the object to be recognized. After the image information is acquired, it needs to be preprocessed. Specifically, graying, downsampling, Gaussian denoising, binarization, rotation and cropping may be performed on the image information of the object to be recognized.
More specifically, when performing graying, the color image may be grayed using the weighted average method. When downsampling the image, a downsampling factor k may be selected, i.e., every k-th point in each row and column of the original image is taken to form the image. The image is then Gaussian-denoised: each pixel in the image is scanned by a template (also called a convolution kernel or mask), and the value of the template's central pixel is replaced by the weighted average gray value of the pixels in the neighborhood determined by the template; in particular, a discretized sliding-window convolution can be used. When binarizing the image, the threshold may be set to 125. When rotating the image, it can be rotated in the plane about the center of the hang tag or washing label by a certain angle so that the label is horizontal and centered in the image. When cropping the image, the effective area can be extracted and the image size reduced, improving transmission and recognition efficiency without affecting the recognition result.
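A minimal sketch of this preprocessing pipeline, assuming OpenCV and NumPy (the factor k, the threshold 125 and the rotation about the center follow the text; the 5×5 Gaussian kernel, the rotation angle and the dark-text cropping heuristic are assumptions):

```python
import cv2
import numpy as np

def preprocess(image_bgr, k=2, angle_deg=0.0):
    # graying: weighted-average conversion of the color image
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # downsampling by factor k: take every k-th point of each row and column
    small = gray[::k, ::k]
    # Gaussian denoising: sliding-window convolution with a discrete template
    denoised = cv2.GaussianBlur(small, (5, 5), 0)
    # binarization with the fixed threshold 125
    _, binary = cv2.threshold(denoised, 125, 255, cv2.THRESH_BINARY)
    # rotation about the image center so the label lies horizontal
    h, w = binary.shape
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_deg, 1.0)
    rotated = cv2.warpAffine(binary, M, (w, h), borderValue=255)
    # cropping: keep the bounding box of the dark (text) pixels,
    # assuming dark text on a light background
    ys, xs = np.nonzero(rotated == 0)
    if len(xs) == 0:
        return rotated
    return rotated[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

valid = preprocess(cv2.imread("label.jpg"), k=2, angle_deg=3.0)
```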
This embodiment uses the washing label only as an example; the recognition object is not limited to washing labels, and the method is also applicable to other objects to be recognized.
By preprocessing the image with the above image processing techniques, on the one hand noise affecting character recognition is suppressed and some irrelevant features are removed; on the other hand, the contrast between the characters and the background is enhanced. The amount of data in the image is thus greatly reduced, the outline of the target is highlighted, and the accuracy of character recognition is improved.
The method steps in fig. 2 may be implemented by using a text recognition model, and specifically, may be described with reference to fig. 3:
fig. 3 is a schematic structural diagram of a character recognition model in an embodiment of the present application.
As shown in fig. 3, the character recognition model may include a Feature Pyramid Network (FPN) 301, a region generation network (Region Proposal Network, RPN) 302, a fast regional convolutional neural network (Fast R-CNN) 303, a text segmentation network (Mask Branch) 304, an attention-based sequence-to-sequence network (Sequence-to-Sequence Attention) 305, and a reconstruction network 306.
Optionally, the obtaining of the text information in the effective image information by using the text recognition model may specifically include:
extracting feature information in the effective image by adopting the feature pyramid network;
inputting the characteristic information into a region generation network to generate the characteristic information of a target text candidate box corresponding to the effective image;
inputting the characteristic information of the target text candidate box into a fast regional convolutional neural network, and classifying the target text candidate box by using a classifier according to the characteristic information of the target text candidate box;
performing character segmentation, instance segmentation and false positive exclusion on the target text in the target text candidate box by using the text segmentation network;
recognizing the segmented characters and examples by adopting a sequence-to-sequence network based on an attention mechanism to obtain a recognition result;
and integrating the recognition results by adopting a reconstruction network to obtain all character information in the effective image information.
Feature Pyramid Network (FPN) 301: the Feature Pyramid Network is a feature extractor designed around the feature-pyramid concept with the aim of improving accuracy and speed. It replaces the feature extractor of detectors such as Faster R-CNN and generates a higher-quality feature map pyramid. The FPN consists of a bottom-up pathway and a top-down pathway. FPN is the network most frequently used to detect text in natural scenes: features of different scales obtained by convolution are fused together, large text is detected on small feature maps and small text on large feature maps, which strengthens the use of information so that small text is missed as little as possible.
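For illustration, a minimal FPN top-down pathway might look as follows (a sketch assuming PyTorch; channel widths, strides and the nearest-neighbor upsampling are illustrative assumptions, not taken from the patent):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convolutions project each backbone stage to a common width
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        # 3x3 smoothing convolutions reduce upsampling aliasing
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):  # feats: bottom-up maps, highest resolution first
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        # top-down: upsample the coarser map and add it to the finer lateral
        for i in range(len(laterals) - 2, -1, -1):
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(p) for s, p in zip(self.smooth, laterals)]

# usage with dummy backbone maps (strides 4, 8, 16, 32 of a 256x256 input)
maps = [torch.randn(1, c, 256 // s, 256 // s)
        for c, s in zip((256, 512, 1024, 2048), (4, 8, 16, 32))]
pyramid = MiniFPN()(maps)  # four maps, each with 256 channels
```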
Region generation network (RPN) 302: the network used to extract candidate boxes. The RPN here does not adopt the long rectangular boxes commonly used in text detection; it mainly targets the detection and recognition of curved text, whose shape is not a long rectangle, so the sizes and proportions of the original rectangular boxes are used flexibly.
Fast regional convolutional neural network (Fast R-CNN) 303: Fast R-CNN first finds candidate boxes, then passes the whole image through the CNN once, samples the parts corresponding to the candidate boxes to obtain features of the same length, and obtains the final features after two fully connected layers. Two branches are then generated: one classifies the features and the other regresses the candidate-box offsets, so Fast R-CNN fuses the classification and regression tasks in one model. Regarding Fast R-CNN and the Mask Branch: the Mask Branch (instance segmentation model) is the main network structure, while Fast R-CNN is responsible for the classification and regression of the target region, and the results detected by Fast R-CNN are mainly input into the Mask Branch.
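A compact sketch of such a two-branch head (PyTorch assumed; the RoI feature size 256×7×7, hidden width 1024 and the two-class text/background setting are illustrative assumptions):

```python
import torch
import torch.nn as nn

class FastRCNNHead(nn.Module):
    def __init__(self, in_dim=256 * 7 * 7, hidden=1024, num_classes=2):
        super().__init__()
        # two fully connected layers shared by both branches
        self.fc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                nn.Linear(hidden, hidden), nn.ReLU())
        self.cls = nn.Linear(hidden, num_classes)       # text / background scores
        self.reg = nn.Linear(hidden, num_classes * 4)   # per-class box offsets

    def forward(self, roi_feats):                       # RoI-pooled features
        x = self.fc(roi_feats.flatten(1))
        return self.cls(x), self.reg(x)

rois = torch.randn(8, 256, 7, 7)       # eight candidate boxes after RoI pooling
scores, offsets = FastRCNNHead()(rois)
```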
Optionally, the attention-based sequence-to-sequence network 305 may further include a multi-layer bidirectional LSTM network, and before the recognition results are integrated by the reconstruction network, the method may further include:
performing bidirectional analysis on the feature sequences corresponding to the segmented characters and instances using the multi-layer bidirectional LSTM network, to determine the contextual correlation between the characters and instances.
It should be noted that the bidirectional Long Short-Term Memory (LSTM) network is a recurrent neural network suitable for processing and predicting important events with relatively long intervals and delays in a time series.
The attention-based sequence-to-sequence network 305 is used for the recognition of global text, and the Reconstruct network for the reconstruction of global and partial text. The attention-based sequence-to-sequence network may comprise two modules: an encoder module consisting of a CNN plus a bidirectional LSTM, and a decoder module consisting of an LSTM plus an attention mechanism. The encoder module extracts features of the input pattern through convolutional layers to form a feature map and encodes the feature sequence using the LSTM; the decoder module decodes based on the bidirectional LSTM.
First, the encoder forms a feature map from the global text instance features extracted by the Mask branch. Next, a feature sequence is extracted from the feature map in units of one pixel. The size of the feature map is h_conv × w_conv × d_conv, which respectively denote the height, width and depth of the feature map. The feature map is converted into w_conv feature vectors, where each vector has h_conv × d_conv dimensions.
To extend the contextual correlation of the features, a multi-layer bidirectional LSTM network is applied to the feature sequence: the sequence is analyzed bidirectionally, long-range dependencies in both directions are captured, and a new feature sequence of the same length is output, denoted H = [h_1, ..., h_n], where n = w_conv.
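As a concrete illustration, the following sketch (PyTorch assumed; h_conv = 8, w_conv = 32, d_conv = 256 are illustrative values, not from the patent) shows the map-to-sequence conversion and the multi-layer bidirectional LSTM:

```python
import torch
import torch.nn as nn

h_conv, w_conv, d_conv = 8, 32, 256            # feature-map height, width, depth
fmap = torch.randn(1, d_conv, h_conv, w_conv)  # output of the conv encoder

# one feature vector per pixel column: w_conv vectors of h_conv * d_conv dims
seq = fmap.permute(0, 3, 1, 2).reshape(1, w_conv, h_conv * d_conv)

# two stacked bidirectional LSTM layers analyse the sequence in both directions
bilstm = nn.LSTM(input_size=h_conv * d_conv, hidden_size=256,
                 num_layers=2, bidirectional=True, batch_first=True)
H, _ = bilstm(seq)   # H = [h_1, ..., h_n] with n = w_conv, 512 dims each
```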
The decoder is responsible for converting the feature sequence into a character sequence. Input and output sequences of arbitrary length are allowed. Such a model is simple yet powerful for sequence modeling and is able to capture output dependencies.
Note that the sequence-to-sequence model is a unidirectional recurrent network. It iterates T times to generate a symbol sequence of length T, denoted (y_1, ..., y_T).
At step t, the decoder predicts a character or the end-of-sequence symbol based on the encoder output H, its internal state s_(t-1) and the previous symbol y_(t-1). In this step, the decoder first computes an attention weight vector α_t ∈ R^n through its attention mechanism:

e_(i,t) = w^T tanh(W s_(t-1) + V h_i + b)

α_(i,t) = exp(e_(i,t)) / Σ_(i') exp(e_(i',t))

where w, W and V are trainable weights.
The attention weights effectively indicate the importance of each element of the encoder output. Using the weights as coefficients, the decoder linearly combines the columns of H into a single vector, called a glimpse:

g_t = Σ_(i=1..n) α_(i,t) h_i
a glance describes a portion of the entire context encoded with H. Is taken as input to a decoder loop unit that generates an output vector and a new state vector:
(xt,st)=rnn(st-1,(gt,f(yt-1)))
wherein (g)t,f(yt-1) Is g(t)And y(t-1)In series. RNN stands for any cyclic unit. Its output and new state are respectively x(t)And s(t)And (4) showing. Finally, adopt x(t)Predicting the current step size symbol:
p(yt)=softmax(Woxt+b0)
yt~p(yt)
Since y_(t-1) is included in the computation, the decoder learns to capture the dependencies between its output characters. This acts like an implicit language model, helping recognition through learned language priors. During inference, a beam search is adopted: at each step, the k candidates with the highest cumulative scores are kept.
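The following sketch implements one decoding step of the equations above (PyTorch assumed; all dimensions are illustrative, a GRU cell stands in for the unspecified recurrent unit rnn, and for simplicity the unit's output x_t is taken to coincide with its new state s_t):

```python
import torch
import torch.nn as nn

class AttnDecoderStep(nn.Module):
    def __init__(self, enc_dim=512, state_dim=256, attn_dim=256, vocab=40):
        super().__init__()
        self.W = nn.Linear(state_dim, attn_dim, bias=True)   # W s_{t-1} + b
        self.V = nn.Linear(enc_dim, attn_dim, bias=False)    # V h_i
        self.w = nn.Linear(attn_dim, 1, bias=False)          # w^T tanh(...)
        self.embed = nn.Embedding(vocab, state_dim)          # f(y_{t-1})
        self.rnn = nn.GRUCell(enc_dim + state_dim, state_dim)
        self.out = nn.Linear(state_dim, vocab)               # W_o x_t + b_o

    def forward(self, H, s_prev, y_prev):
        # e_{i,t} = w^T tanh(W s_{t-1} + V h_i + b); alpha_t = softmax(e_t)
        e = self.w(torch.tanh(self.W(s_prev).unsqueeze(1) + self.V(H))).squeeze(-1)
        alpha = torch.softmax(e, dim=1)
        g = (alpha.unsqueeze(-1) * H).sum(dim=1)             # glimpse g_t
        # recurrent step on the concatenation (g_t, f(y_{t-1}))
        s = self.rnn(torch.cat([g, self.embed(y_prev)], dim=1), s_prev)
        return torch.log_softmax(self.out(s), dim=1), s      # p(y_t), s_t

H = torch.randn(1, 32, 512)              # encoder outputs h_1..h_n
s = torch.zeros(1, 256)                  # initial decoder state
y = torch.zeros(1, dtype=torch.long)     # start symbol
logp, s = AttnDecoderStep()(H, s, y)     # one decoding step
```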
Although a sequence-to-sequence decoder can capture output dependencies, it captures them in only one direction while ignoring the other. Decoders working in opposite directions may be complementary. To balance the dependencies in both directions, a bidirectional decoder is proposed, consisting of two decoders in opposite directions: one is trained to predict characters from left to right, the other from right to left. After both decoders have run, two recognition results are produced, and the merged result simply selects the one with the higher recognition score as the correct text of the text instance.
The character recognition model adopted in this scheme differs from the traditional instance recognition model Mask R-CNN: the Mask branch here can not only segment text regions but also predict character probability maps; the sequence-to-sequence attention branch and the reconstruction network branch can classify the categories of complex Chinese characters; the attention-based sequence-to-sequence network can recognize global text instances according to the global text localization of the Mask branch; and the complex Chinese characters and the global text are then integrated through the Reconstruct network.
Optionally, the text segmentation network may include a character segmentation network and an instance segmentation network;
the performing character segmentation and instance segmentation on the target text in the target text candidate box by using the text segmentation network may specifically include:
and respectively sending the classified characteristic information of the target text candidate box into the character segmentation network and the example segmentation network to obtain the segmented character information and example information.
In the prior Mask branch, the output character map of the character segmentation module has dimensions 37 × 32 × 128, where the 37 character maps comprise 10 digit maps, 26 English letter maps and one background map. However, in the hang tag and washing label recognition tasks, the labels contain not only digits and letters but also a series of Chinese characters, and the characters on hang tags and washing labels cannot be fully classified by this classification method alone.
To address this deficiency of the prior art, a new character segmentation mechanism is proposed in which the character segmentation maps are generated directly from the shared feature maps. The output character map is N_s × 32 × 128, where N_s represents the number of classes and is set to N + 2: N complex Chinese character maps, one NULL character map and one background map. Alongside the classification, a new mechanism is proposed that assigns a score and position information to the classification of each character instance, denoted RS_i. RS_i is composed as follows:

RS_i = {C_i, CL_i}, where C_i = {CS_1, CS_2, ..., CS_N}, with CS_n denoting the probability that the character instance belongs to the n-th class; CL_i represents the position information of the i-th character instance and is composed of CL_i = {W_i, H_i}, where W_i and H_i respectively represent the horizontal and vertical distances of the i-th character instance relative to the text instance. The classification of character instance i is performed by the following rule: if the highest score of the character instance over all N classes is less than or equal to the threshold 0.7, the instance is classified into the NULL character class; if the highest classification score of character instance i is above the threshold 0.7, it is assigned to the class with the highest score. Finally, the text sequence of the text instance is obtained according to the positions W_i.
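A plain-Python sketch of this thresholded classification rule (the function names, the NULL_CLASS sentinel and the example data are illustrative assumptions, not from the patent):

```python
NULL_CLASS = -1   # assumed sentinel for the NULL character class
THRESHOLD = 0.7   # classification threshold from the text

def classify_instance(scores):
    """scores: list of CS_1..CS_N, the probability of each of the N classes."""
    best = max(range(len(scores)), key=lambda n: scores[n])
    # highest score <= 0.7 -> NULL character class, else the argmax class
    return NULL_CLASS if scores[best] <= THRESHOLD else best

def text_sequence(instances):
    """instances: list of (scores, W_i) pairs; order characters by position W_i."""
    labeled = [(W, classify_instance(scores)) for scores, W in instances]
    return [cls for W, cls in sorted(labeled)]
```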
In the above attention-based sequence-to-sequence network, although most characters in a text instance are recognized, the recognition effect on short text sequences containing complex Chinese characters is poor and does not reach a satisfactory level. In the Mask branch, the class limit for character instances is N; generally only some complex fonts can be recognized, and simple fonts are usually replaced by the NULL character.
In addition, for the reconstruction network proposed above: the input of the network consists of two parts: (1) the text sequence output by the Mask branch; (2) the text sequence output by the sequence-to-sequence attention branch. Through comparison, characters classified as NULL characters in the Mask branch output are replaced by the characters output from the attention-based sequence-to-sequence network, while characters already belonging to the N complex-character classes are kept unchanged; finally, the merged result is taken as the final output of the network.
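A minimal sketch of this merge rule (plain Python; the NULL_CLASS sentinel and example sequences are assumptions matching the previous sketch):

```python
NULL_CLASS = -1  # assumed sentinel for the NULL character class

def reconstruct(mask_branch_seq, seq2seq_seq):
    # replace NULL characters from the Mask branch with the attention
    # network's output; keep recognized complex-character classes unchanged
    return [s2s if mask == NULL_CLASS else mask
            for mask, s2s in zip(mask_branch_seq, seq2seq_seq)]

print(reconstruct([5, NULL_CLASS, 9, NULL_CLASS], [5, 7, 2, 8]))  # -> [5, 7, 9, 8]
```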
Optionally, the comparing the text information with standard text information pre-stored in a database to obtain a comparison result may specifically include:
and matching and comparing the character string corresponding to the text information with a character string corresponding to standard text information prestored in a database by using a KMP algorithm.
The KMP algorithm is an improved string matching algorithm; its key idea is to use the information gained from match failures to minimize the number of comparisons between the pattern string and the main string, thereby achieving fast matching. The concrete implementation is a next() function, which itself contains the partial-match information of the pattern string.
The two texts are preprocessed: lowercase letters in the documents are uniformly converted to uppercase, all punctuation is removed, and the two documents are stored in two strings, Str1 and Str2. The two strings are then compared with the KMP algorithm, and the unmatched parts of the two strings are output.
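For illustration, a standard KMP implementation of this comparison step (plain Python; the patent gives no code, so the function names and sample strings are assumptions):

```python
def build_next(pattern):
    """next[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix -- the partial-match table."""
    nxt, k = [0] * len(pattern), 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = nxt[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        nxt[i] = k
    return nxt

def kmp_find(text, pattern):
    """Return the index of the first match of pattern in text, or -1."""
    if not pattern:
        return 0
    nxt, k = build_next(pattern), 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = nxt[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1

str1 = "MACHINE WASH COLD"    # recognized text (uppercased, punctuation removed)
str2 = "WASH COLD"            # standard text from the database
print(kmp_find(str1, str2))   # -> 8 (a match; -1 would flag a mismatch)
```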
In a specific application scenario, the object to be recognized is a clothing hang tag or a washing label, and the text information includes washing mode information, prompt information and style/composition information.
After determining the character information to be corrected in the object to be recognized, the method further comprises:
correcting the character information to be corrected into character information consistent with the corresponding standard character information according to the standard character information prestored in the database.
Taking fig. 4 as an example: fig. 4 is a schematic diagram of the hang tag and washing label character recognition method in an embodiment of the application. As shown in fig. 4, when recognizing characters on hang tags and washing labels, pictures of the hang tag and washing label can first be obtained. Qualified picture information is screened from the obtained pictures and preprocessed to obtain preprocessed image information; information with text-region marks is determined from the preprocessed image information, and the character information is recognized from the text regions to form a document of the recognized text. On the other hand, according to the hang tag and washing label selected in the mode information, the standard text documents corresponding to the hang tag and washing label to be recognized can be extracted from the corresponding database. The characters corresponding to the recognized text are then compared with the characters corresponding to the standard text in the database to obtain a comparison result, which is shown on a display screen.
In the above method, the text information on hang tags and washing labels in a production workshop is quickly recognized by the character recognition model, and whether the printing and matching of the hang tag and washing label are erroneous is judged by comparing the recognized physical text information of the hang tag and washing label with the database text information, in place of manual comparison.
By adopting the Mask-OCR-based character recognition technique, the horizontal, multi-oriented, curved and otherwise complex character information on hang tags and washing labels can be extracted quickly and accurately, and whether the printing and matching of the hang tag and washing label are accurate is judged by comparing this information with the database information. The method improves the recognition accuracy of character recognition on complex text such as curved text, broadens the application range of OCR, greatly improves processing efficiency, and reduces labor cost to a certain extent.
In addition, in practical applications, when the image acquisition device collects the image information of the object to be recognized, the sharpness of the image captured by the camera can be judged; if the sharpness meets a preset condition, it can further be judged whether the image meets the preset size and contains all text regions. If both conditions are satisfied, the captured image can be considered qualified; if not, it must be re-captured, so as to ensure the quality of the acquired images.
Fig. 5 is a schematic structural diagram of an image character recognition device corresponding to fig. 2 in an embodiment of the present application. As shown in fig. 5, based on the same inventive concept, an embodiment of the present application further provides an image character recognition apparatus, including:
an effective image information obtaining module 501, configured to obtain effective image information of an object to be identified, where the effective image information includes a text region, where the text region includes an irregular-shaped text;
a text information recognition module 502, configured to obtain text information in the effective image information by using a text recognition model;
a comparison module 503, configured to compare the text information with standard text information pre-stored in a database to obtain a comparison result;
a to-be-corrected text information determining module 504, configured to determine to-be-corrected text information in the object to be recognized when the comparison result indicates that the text information is inconsistent with standard text information pre-stored in a database.
Optionally, the apparatus may further include:
the device comprises an object to be identified image information acquisition module, a recognition module and a recognition module, wherein the object to be identified image information acquisition module is used for acquiring the image information of an object to be identified;
and the preprocessing module is used for executing preprocessing operation on the image information to obtain the effective image information of the object to be identified, wherein the preprocessing operation comprises graying, down-sampling, Gaussian denoising, binarization, rotation and/or cutting processing.
Optionally, the character recognition model includes a feature pyramid network, a region generation network, a fast region convolution neural network, a text segmentation network, a sequence-to-sequence network based on an attention mechanism, and a reconstruction network; the text information identifying module 502 may specifically include:
the character characteristic information extraction unit is used for extracting the characteristic information in the effective image by adopting the characteristic pyramid network;
the target text candidate box feature information generating unit is used for inputting the feature information into the region generation network to generate the feature information of the target text candidate box corresponding to the effective image;
the classification unit is used for inputting the characteristic information of the target text candidate box into a fast regional convolutional neural network and classifying the target text candidate box by using a classifier according to the characteristic information of the target text candidate box;
a text segmentation unit, configured to perform character segmentation and instance segmentation on the target text in the target text candidate box by using the text segmentation network;
the recognition unit is used for recognizing the segmented characters and examples by adopting a sequence-to-sequence network based on an attention mechanism to obtain a recognition result;
and the character information identification unit is used for integrating the identification result by adopting a reconstruction network to obtain all character information in the effective image information.
Optionally, the text information identifying module 502 may further include:
and the bidirectional analysis unit is used for performing bidirectional analysis on the feature sequences corresponding to the segmented characters and instances using a multi-layer bidirectional LSTM network, determining the contextual correlation between the characters and instances.
Optionally, the comparing module 503 may specifically include:
and the comparison unit is used for matching and comparing the character string corresponding to the text information with a character string corresponding to standard text information prestored in a database by using a KMP algorithm.
Optionally, the object to be identified is a clothing drop or a washing label, and the text information includes washing mode information, prompt information and style component information;
the alignment module 503 may be specifically configured to:
and correcting the character information to be corrected into character information consistent with the corresponding standard character information according to the standard character information prestored in the database.
Optionally, the text segmentation network includes a character segmentation network and an instance segmentation network;
the text segmentation unit may be specifically configured to:
and respectively sending the classified characteristic information of the target text candidate box into the character segmentation network and the example segmentation network to obtain the segmented character information and example information.
Fig. 6 is a schematic diagram of a server structure in an embodiment of the present application. As shown in fig. 6, based on the same inventive concept, an embodiment of the present application provides a server, where the server at least includes: a memory 601 and a processor 602, wherein,
a memory 601 for storing executable instructions;
a processor 602 for reading and executing the executable instructions stored in the memory to implement any of the methods involved in the above embodiments.
Based on the same inventive concept, the present application provides a storage medium, wherein when instructions in the storage medium are executed by a processor, the storage medium enables any one of the methods related to the embodiments to be executed.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalents, the present application is also intended to encompass such modifications and variations.

Claims (10)

1. A washing label detection method, characterized by comprising the following steps:
obtaining effective image information of an object to be recognized, wherein the effective image information comprises a text region, and the text region comprises irregular-shaped text; the object to be recognized comprises a washing label;
recognizing by adopting a character recognition model to obtain character information in the effective image information;
comparing the character information with standard character information prestored in a database to obtain a comparison result;
and when the comparison result shows that the character information is inconsistent with standard character information prestored in a database, determining character information to be corrected in the object to be recognized.
2. The method of claim 1, wherein prior to obtaining valid image information of the object to be identified, further comprising:
acquiring image information of an object to be identified;
and performing a preprocessing operation on the image information to obtain the effective image information of the object to be identified, wherein the preprocessing operation comprises graying, downsampling, Gaussian denoising, binarization, rotation and/or cropping.
3. The method of claim 1, wherein the character recognition model comprises a feature pyramid network, a region generation network, a fast region convolutional neural network, a text segmentation network, an attention-based sequence-to-sequence network, and a reconstruction network; and obtaining the character information in the effective image information using the character recognition model specifically comprises:
extracting feature information in the effective image by adopting the feature pyramid network;
inputting the characteristic information into a region generation network to generate the characteristic information of a target text candidate box corresponding to the effective image;
inputting the characteristic information of the target text candidate box into a fast regional convolutional neural network, and classifying the target text candidate box by using a classifier according to the characteristic information of the target text candidate box;
performing character segmentation and example segmentation on the target text in the target text candidate box by using the text segmentation network;
recognizing the segmented characters and examples by adopting a sequence-to-sequence network based on an attention mechanism to obtain a recognition result;
and integrating the recognition results by adopting a reconstruction network to obtain all character information in the effective image information.
4. The method of claim 3, wherein the attention-based sequence-to-sequence network further comprises a multi-layer bidirectional long-short term memory network, and before the integrating the recognition result with the reconstruction network, the method further comprises:
and performing bidirectional analysis on the feature sequences corresponding to the segmented characters and instances using the multi-layer bidirectional LSTM network, to determine the contextual correlation between the characters and instances.
5. The method of claim 1, wherein after determining the text information to be corrected in the object to be recognized, the method further comprises:
and matching and comparing the character string corresponding to the text information with a character string corresponding to standard text information prestored in a database by using a KMP algorithm.
6. The method according to claim 1, wherein the object to be identified is a clothing drop or a washing mark, and the text information includes washing mode information, prompt information, style component information;
after determining the character information to be corrected in the object to be recognized, the method further comprises:
correcting the character information to be corrected into character information consistent with the corresponding standard character information according to the standard character information prestored in the database.
7. The method of claim 3, wherein the text segmentation network comprises a character segmentation network and an instance segmentation network;
the performing of character segmentation and instance segmentation on the target text in the target text candidate box by using the text segmentation network specifically comprises:
sending the classified feature information of the target text candidate box into the character segmentation network and the instance segmentation network respectively, to obtain the segmented character information and instance information.
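Purely as an illustrative assumption of how two parallel branches might share the same classified region features (in the spirit of Mask TextSpotter-style heads), consider the PyTorch sketch below; the channel counts and the character-class count (36 alphanumerics plus background) are invented for the example.

```python
import torch.nn as nn

class SegmentationHeads(nn.Module):
    """Two parallel branches fed by the same classified region features."""

    def __init__(self, in_channels=256, num_char_classes=37):
        super().__init__()
        self.char_branch = nn.Sequential(               # character segmentation network
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, num_char_classes, 1))        # one map per character class
        self.instance_branch = nn.Sequential(           # instance segmentation network
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 1, 1))                       # whole-word instance mask

    def forward(self, roi_features):
        return self.char_branch(roi_features), self.instance_branch(roi_features)
```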
8. A water washing label detection device, characterized in that it comprises:
an effective image information acquisition module, configured to acquire effective image information of an object to be identified, wherein the effective image information contains a text area, and the text area includes irregularly shaped text;
a character information recognition module, configured to recognize character information in the effective image information by adopting a character recognition model;
a comparison module, configured to compare the character information with standard character information prestored in a database to obtain a comparison result;
and a to-be-corrected character information determining module, configured to determine the character information to be corrected in the object to be recognized when the comparison result shows that the character information is inconsistent with the standard character information prestored in the database.
9. A water washing label detection device, characterized in that it comprises: a memory and a processor; wherein:
the memory is configured to store executable instructions;
and the processor is configured to read and execute the executable instructions stored in the memory to implement the method of any one of claims 1-7.
10. A storage medium, wherein the instructions in the storage medium, when executed by a processor, enable the processor to perform the method of any one of claims 1-7.
CN202010166048.5A 2020-03-11 2020-03-11 Water washing label detection method, equipment and storage medium Pending CN111476232A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010166048.5A CN111476232A (en) 2020-03-11 2020-03-11 Water washing label detection method, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111476232A (en) 2020-07-31

Family

ID=71747329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010166048.5A Pending CN111476232A (en) 2020-03-11 2020-03-11 Water washing label detection method, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111476232A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919147A (en) * 2019-03-04 2019-06-21 上海宝尊电子商务有限公司 The method of text identification in drop for clothing image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAOGUANG SHI et al.: "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence *
MINGHUI LIAO et al.: "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes", IEEE Transactions on Pattern Analysis and Machine Intelligence *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163559A (en) * 2020-10-21 2021-01-01 珠海格力电器股份有限公司 Drawing information sharing method and device and mobile phone
CN114332872A (en) * 2022-03-14 2022-04-12 四川国路安数据技术有限公司 Contract document fault-tolerant information extraction method based on graph attention network
CN114332872B (en) * 2022-03-14 2022-05-24 四川国路安数据技术有限公司 Contract document fault-tolerant information extraction method based on graph attention network
CN114550384A (en) * 2022-03-25 2022-05-27 中国工商银行股份有限公司 Job processing method, device and system
CN115438214A (en) * 2022-11-07 2022-12-06 北京百度网讯科技有限公司 Method for processing text image, neural network and training method thereof

Similar Documents

Publication Publication Date Title
CN111476232A (en) Water washing label detection method, equipment and storage medium
TWI744283B (en) Method and device for word segmentation
US6996295B2 (en) Automatic document reading system for technical drawings
CN110503054B (en) Text image processing method and device
CN111368682B (en) Method and system for detecting and identifying station caption based on faster RCNN
CN113963147B (en) Key information extraction method and system based on semantic segmentation
CN109712162A (en) Cable character defect detection method and device based on projection histogram difference
JP2010039788A (en) Image processing apparatus and method thereof, and image processing program
CN112001362A (en) Image analysis method, image analysis device and image analysis system
Den Hartog et al. Knowledge-based interpretation of utility maps
CN107403179A (en) Registration method and device for article packaging information
CN113989577B (en) Image classification method and device
CN112949455B (en) Value-added tax invoice recognition system and method
CN111126112B (en) Candidate region determination method and device
CN111414889B (en) Financial statement identification method and device based on character identification
CN113034492A (en) Printing quality defect detection method and storage medium
CN112613367A (en) Bill information text box acquisition method, system, equipment and storage medium
CN112200789A (en) Image identification method and device, electronic equipment and storage medium
CN111444876A (en) Image-text processing method and system and computer readable storage medium
CN114078106A (en) Defect detection method based on improved Faster R-CNN
WO2017058252A1 (en) Detecting document objects
CN115937095A (en) Printing defect detection method and system integrating image processing algorithm and deep learning
JP7364639B2 (en) Processing of digitized writing
CN115953744A (en) Vehicle identification tracking method based on deep learning
CN114663803A (en) Logistics center hanging clothing classification method and device based on video streaming

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200731