CN112434780B - Target object recognition network model, training method thereof and target object recognition method - Google Patents

Target object recognition network model, training method thereof and target object recognition method

Info

Publication number
CN112434780B
CN112434780B (application CN201910788789.4A)
Authority
CN
China
Prior art keywords
interference
network
target object
training
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910788789.4A
Other languages
Chinese (zh)
Other versions
CN112434780A (en)
Inventor
徐杨柳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Goldway Intelligent Transportation System Co Ltd
Original Assignee
Shanghai Goldway Intelligent Transportation System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Goldway Intelligent Transportation System Co Ltd filed Critical Shanghai Goldway Intelligent Transportation System Co Ltd
Priority to CN201910788789.4A priority Critical patent/CN112434780B/en
Publication of CN112434780A publication Critical patent/CN112434780A/en
Application granted granted Critical
Publication of CN112434780B publication Critical patent/CN112434780B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses a target object recognition network model, a training method thereof, and a target object recognition method. The target object recognition method comprises: receiving a target object to be identified that carries interference factors; inputting the target object to be identified into a trained target object recognition network model, wherein the target object recognition network model comprises an interference elimination network and a target object recognition network, and the interference elimination network comprises at least one first type of interference elimination network based on a generative adversarial network and a second type of interference elimination network connected to the first type of interference elimination network; eliminating different interference factors in the target object to be identified by using the first type and the second type interference elimination networks respectively; and identifying the interference-eliminated target object to be identified by using the target object recognition network. This technical scheme solves the problem that existing target object recognition technology has weak anti-interference capability when recognizing a target object with interference.

Description

Target object recognition network model, training method thereof and target object recognition method
[Technical Field]
The invention relates to the technical field of target object recognition based on a neural network, in particular to a target object recognition network model, a training method thereof and a target object recognition method.
[Background Art]
With the rapid development of target object recognition technology, text images can be recognized well when there is little external interference, for example scans of printed matter and of credentials. When interference is present, however, errors tend to occur; such interference includes rotation, abnormal illumination, blurring, complex colors, and complex backgrounds. Because a neural network itself has a certain noise-filtering capability, the industry has striven to artificially generate more training samples to further improve this capability, but the anti-interference capability achieved in this way is limited.
[Summary of the Invention]
In view of the above, embodiments of the present invention provide a target object recognition network model, a training method thereof, and a target object recognition method, so as to solve the problem that existing target object recognition technology has weak anti-interference capability when recognizing a target object with interference.
In one aspect, an embodiment of the present invention provides a target object identification method, including: receiving a target object to be identified, wherein the target object to be identified has interference factors; inputting the target object to be identified into a trained target object recognition network model, wherein the target object recognition network model comprises an interference elimination network and a target object recognition network, and the interference elimination network comprises at least one first type of interference elimination network based on a generative adversarial network and a second type of interference elimination network connected to the first type of interference elimination network; performing interference elimination on different interference factors in the target object to be identified by using the first type and the second type interference elimination networks respectively; and identifying the interference-eliminated target object to be identified by using the target object recognition network.
Optionally, the interference elimination network is trained by the following method: acquiring training sample pairs, wherein each training sample pair comprises a standard non-interference training sample and an interfered training sample; passing the interfered training samples through the interference elimination network to output generated interference-free samples; extracting standard features from the standard non-interference training samples and generated features from the generated non-interference samples by using the target object recognition network; comparing the standard features with the generated features to obtain a feature error; and, in the process of training the interference elimination network, training it in an assisted manner based on the feature error, so as to complete training of the interference elimination network.
Optionally, the first type of interference rejection network is trained by the following method: passing the interfered training sample through the first type interference rejection network to output a first generated interference-free sample; extracting first standard features from the standard non-interference training samples and first generated features from the first generated non-interference samples by using the target object recognition network; comparing the first standard features with the first generated features to obtain a first feature error; and, in the process of training the first type of interference rejection network, training it in an assisted manner based on the first feature error, so as to complete training of the first type of interference rejection network.
Optionally, in the training of the first type of interference rejection network, training the first type of interference rejection network based on the first characteristic error in an assisted manner, so as to complete training of the first type of interference rejection network includes: alternately training a generation network and a discrimination network of the generation type countermeasure network to optimize the generation network and the discrimination network, respectively; training the generation network in an assisted manner based on the first characteristic error; and taking the trained generating network as the first-type interference elimination network after training is completed.
Optionally, the second type of interference rejection network is trained by the following method: passing the first generated non-interference sample through the second type interference rejection network to output a second generated non-interference sample; extracting second standard features from the standard non-interference training samples and second generated features from the second generated non-interference samples by using the target object recognition network; comparing the second standard features with the second generated features to obtain a second feature error; and, in the process of training the second type interference rejection network, training it in an assisted manner based on the second feature error, so as to complete training of the second type interference rejection network.
In another aspect, an embodiment of the present invention further provides a target object identification device, comprising: a to-be-identified object receiving module, configured to receive a target object to be identified, wherein the target object to be identified has interference factors; a to-be-identified object processing module, configured to input the target object to be identified into a trained target object recognition network model, wherein the target object recognition network model comprises an interference elimination network and a target object recognition network, and the interference elimination network comprises at least one first type of interference elimination network based on a generative adversarial network and a second type of interference elimination network connected to the first type of interference elimination network; an interference elimination module, configured to perform interference elimination on different interference factors in the target object to be identified by using the first type and the second type interference elimination networks respectively; and a target object recognition module, configured to identify the interference-eliminated target object to be identified by using the target object recognition network.
In another aspect, an embodiment of the present invention further provides a training method for a target object recognition model, wherein the target object recognition network model comprises an interference elimination network and a target object recognition network;
The training method comprises the following steps: acquiring training sample pairs, wherein each training sample pair comprises a standard non-interference training sample and an interference training sample; outputting the interference training samples after passing through the interference elimination network to generate interference-free samples; extracting standard features from the standard non-interference training samples and extracting generated features from the generated non-interference samples respectively by utilizing the target object recognition network; comparing the standard feature with the generated feature to obtain a feature error; and in the process of training the interference elimination network, training the interference elimination network based on the characteristic error in an auxiliary mode so as to complete training of the interference elimination network.
In still another aspect, an embodiment of the present invention further provides a target object recognition model, including: an interference rejection network and a target object recognition network; wherein the interference rejection network comprises at least one first type of interference rejection network based on a generative adversarial network and a second type of interference rejection network connected to the first type of interference rejection network; the first type and the second type interference rejection networks are respectively used to eliminate different interference factors in the target object to be identified; and the target object recognition network is used to identify the interference-eliminated target object to be identified.
Compared with the prior art, the technical scheme has at least the following beneficial effects:
according to the target object recognition method provided by the embodiment of the invention, the target object recognition model used comprises an interference elimination network and a target object recognition network, wherein the interference elimination network comprises at least one first type of interference elimination network based on a generative adversarial network and a second type of interference elimination network connected to the first type of interference elimination network. The two types of interference elimination networks can eliminate different interference factors of the target object to be identified layer by layer, so that the target object recognition network only needs to recognize a target object with smaller interference factors (i.e., less noise), which reduces the complexity and computational load of the target object recognition network.
Further, when training the target object recognition model, during the training of the interference elimination network, the target object recognition network is used to extract standard features from the standard non-interference training samples and generated features from the generated non-interference samples; the standard features and the generated features are compared to obtain feature errors, and the interference elimination network is trained in an assisted manner based on these feature errors. When training the target object recognition network, it is first pre-trained with the standard non-interference training samples; then, after the interference elimination network has been trained, the pre-trained target object recognition network is trained with both the standard non-interference training samples and the interfered training samples, completing its training. Thus, throughout the training of the target object recognition network model, the interference elimination network and the target object recognition network complement each other rather than being trained independently, so that the trained model can recognize target objects better.
[Description of the Drawings]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an embodiment of a target object recognition method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a target object recognition model in the target object recognition method shown in FIG. 1;
FIG. 3 is a flow chart of a training method of the disturbance rejection network in the target object recognition model shown in FIG. 2;
FIG. 4A is a flow chart of a training method of a first type of interference rejection network in the target object recognition model shown in FIG. 2;
FIG. 4B is a schematic diagram of one embodiment of a training method of the first type of interference rejection network depicted in FIG. 4A;
FIG. 5A is a flow chart of one embodiment of a training method for a second type of interference rejection network in the target object recognition model shown in FIG. 2;
FIG. 5B is a schematic diagram illustrating one embodiment of the training method of the second type of interference cancellation network illustrated in FIG. 5A;
FIG. 6 is a flow chart of one embodiment of a training method for a target object recognition network in the target object recognition model shown in FIG. 2;
FIG. 7 is a schematic diagram of an embodiment of a target object recognition device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an embodiment of a training apparatus for a target object recognition network model according to an embodiment of the present invention.
[Detailed Description]
For a better understanding of the technical solution of the present invention, the following detailed description of the embodiments of the present invention refers to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flowchart of an embodiment of a target object recognition method according to an embodiment of the present invention. Referring to fig. 1, the target object recognition method includes the steps of:
step 101, receiving a target object to be identified, wherein the target object to be identified has an interference factor.
Step 102, inputting the target object to be identified into a trained target object recognition network model, wherein the target object recognition network model comprises an interference elimination network and a target object recognition network, and the interference elimination network comprises at least one first type of interference elimination network based on a generative adversarial network and a second type of interference elimination network connected to the first type of interference elimination network.
And 103, performing interference elimination on different interference factors in the target object to be identified by using the first-type interference elimination network and the second-type interference elimination network respectively.
And 104, identifying the target object to be identified after interference elimination by utilizing the target object identification network.
In this embodiment, recognition of a text image carrying interference factors (as the target object to be identified) is described as an example application scenario, the goal being to recognize the text on the image. The picture carries both the text to be recognized and various interference factors. One type of interference factor includes, but is not limited to, picture background interference, illumination interference, blur interference, and color interference, which can be regarded as noise interference of the text picture to be identified. Another type of interference factor includes, but is not limited to, tilt/perspective interference and bending interference of the text itself to be recognized.
Fig. 2 is a schematic diagram of a structure of a target object recognition model in the target object recognition method shown in fig. 1.
Referring to fig. 2, the target object recognition network model 21 includes an interference rejection network 211 and a target object recognition network 212. The interference rejection network 211 comprises at least one first type of interference rejection network 2111, based on a generative adversarial network, and a second type of interference rejection network 2112 connected to the first type of interference rejection network 2111. The first type of interference rejection network 2111 and the second type of interference rejection network 2112 are respectively used to reject different interference factors in the target object to be identified. The target object recognition network 212 is configured to perform target object recognition on the target object to be recognized after the interference is eliminated. As shown in fig. 2, the target object to be identified is a text image containing the text "generator" on a black background; after passing through the target object recognition model 21, the text "generator" on the target object to be identified can be recognized.
Specifically, the interference elimination network 211 in the target object recognition network model 21 provided in this embodiment can eliminate different interference factors on the target object to be recognized first, and then the target object recognition network 212 in the target object recognition network model 21 is used to recognize the target object to be recognized, from which the interference factors are eliminated, so that the target object recognition network only needs to recognize the target object to be recognized, from which the interference factors are smaller, thereby reducing the complexity and the operation amount of the target object recognition network.
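The layer-by-layer structure described above can be sketched as a simple function composition. The Python sketch below is purely illustrative: the function names and the trivial stand-in implementations (a mean filter for the GAN-based denoiser, a no-op for the geometric corrector, a pixel count for the recognizer) are assumptions, not the patent's learned networks; only the input/output pipeline mirrors Fig. 2.

```python
import numpy as np

def first_type_rejection(img):
    """Stand-in for the GAN-based denoiser: a 3x3 mean filter."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

def second_type_rejection(img):
    """Stand-in for the geometric correction network: a no-op."""
    return img

def recognition_network(img):
    """Stand-in for the recognizer: thresholds and counts 'ink' pixels."""
    return int((img > 0.5).sum())

def target_object_recognition_model(img):
    # Layer-by-layer interference elimination, then recognition:
    # first-type net -> second-type net -> recognition net, as in Fig. 2.
    return recognition_network(second_type_rejection(first_type_rejection(img)))
```

The point of the composition is that each stage only has to handle the residual interference left by the previous one, which is why the patent argues the recognizer itself can stay simpler.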
Further, the interference rejection network 211 includes a first type of interference rejection network 2111 and a second type of interference rejection network 2112 connected to the first type of interference rejection network 2111. The first type of interference rejection network 2111 is an interference rejection network determined based on the generation network G of a generative adversarial network (Generative Adversarial Network, GAN for short). In this embodiment, the first type of interference rejection network 2111 is configured to reject noise interference (i.e., any one or more of image background interference, illumination interference, blur interference, and color interference) in the target object to be identified. The second type of interference rejection network 2112 is an orientation correction network; for example, a correction network based on a thin-plate-spline (TPS) model or a correction network based on an affine transformation may be used to geometrically correct the target object to be identified. In this embodiment, the second type interference rejection network 2112 is configured to reject any one or more interference factors among character tilt perspective interference and character bending interference in the object to be identified.
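As a minimal illustration of the geometric-correction idea behind the second type interference rejection network, the sketch below applies an inverse affine warp (a pure rotation about the image center) with nearest-neighbor sampling. In the patent, the transformation parameters would be predicted by a learned TPS or affine correction network; the fixed-angle function here is only a hypothetical stand-in for that mapping.

```python
import numpy as np

def affine_correct(img, theta):
    """Rotate an image about its center via an inverse affine map with
    nearest-neighbor sampling. A learned correction network would
    predict the transform instead of taking theta as an argument."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    c, s = np.cos(theta), np.sin(theta)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            # Inverse mapping: for each output pixel, find its source pixel.
            y = c * (i - cy) - s * (j - cx) + cy
            x = s * (i - cy) + c * (j - cx) + cx
            yi, xi = int(round(y)), int(round(x))
            if 0 <= yi < h and 0 <= xi < w:
                out[i, j] = img[yi, xi]
    return out
```

Inverse mapping (output pixel to source pixel) is the usual way to implement such warps, since it leaves no holes in the output image.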
The training method of the target object recognition model provided in the present embodiment will be described in detail below.
Fig. 3 is a flow chart of a training method of the interference rejection network in the target object recognition model shown in fig. 2. Referring to fig. 3, the training method includes the steps of:
step 301, obtaining training sample pairs, wherein each training sample pair comprises a standard non-interference training sample and an interference training sample;
step 302, passing the interfered training samples through the interference rejection network to output generated interference-free samples;
step 303, extracting standard features from the standard non-interference training samples and extracting generated features from the generated non-interference samples by using the target object recognition network;
step 304, comparing the standard feature and the generated feature to obtain a feature error;
step 305, in the process of training the interference rejection network, training the interference rejection network based on the characteristic error in an auxiliary manner, so as to complete the training of the interference rejection network.
In this embodiment, during the training of the interference elimination network, the target object recognition network is used to extract standard features from the standard non-interference training samples and generated features from the generated non-interference samples output by the interference elimination network; the two sets of features are then compared to obtain feature errors, and the interference elimination network is trained in an assisted manner based on these feature errors, so that the trained interference elimination network has a stronger interference elimination capability.
In step 301, the training sample pairs may be generated using existing artificial sample generation tools, each generated pair of training sample pairs including standard non-interfering training samples and interfering training samples. The standard non-interference training sample is a target object sample without interference factors, and the interference training sample is a target object sample with different interference factors added on the basis of the standard non-interference training sample.
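A training pair of the kind described in step 301 can be built by degrading a clean sample. The sketch below is a loose imitation of such an artificial sample generation tool, applying only an illumination shift and additive noise; the function name, the chosen degradations, and all parameter values are illustrative assumptions, not the patent's tool.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(clean, noise_std=0.1, brightness_shift=0.2):
    """Build one (standard non-interference, interfered) training pair by
    degrading a clean sample. Parameters are illustrative only."""
    interfered = clean.astype(float)
    interfered = interfered + brightness_shift                        # illumination interference
    interfered = interfered + rng.normal(0.0, noise_std, clean.shape) # noise interference
    return clean, np.clip(interfered, 0.0, 1.0)

clean = np.zeros((8, 8)); clean[2:6, 2:6] = 1.0
standard, interfered = make_training_pair(clean)
```

A real tool would additionally add background texture, blur, color changes, rotation, and perspective, matching the two interference types listed above.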
In step 302, the interfered training samples generated in step 301 are input into the interference rejection network and then output to generate interference-free samples. The interference elimination network is a neural network, through which the input interfered training samples can be subjected to interference elimination so as to output and generate interference-free samples.
In the target object recognition model provided in this embodiment, two types of interference rejection networks are set for two different types of interference factors, which are a first type of interference rejection network and a second type of interference rejection network, respectively. Wherein the first type of interference rejection network is configured to reject noise interference (e.g., any one or more of background interference, illumination interference, blur interference, and color interference) in the noisy training samples. The second type of interference rejection network is used for rejecting any one or more interference factors of target object oblique perspective interference and target object bending interference in the interfered training samples. Specific training methods for these two types of interference rejection networks will be described in detail in the embodiments below.
As described in steps 303 and 304, standard features are extracted from the standard non-interference training samples and generated features are extracted from the generated non-interference samples using the target object recognition network, and the standard features and the generated features are compared to obtain a feature error.
Specifically, the target object recognition network may extract features (such as endpoints, bifurcation points, and concave-convex portions of characters) from the target object to be recognized, and then perform a logical combination judgment according to the positions and interrelationships of the extracted features, so as to obtain a recognition result. The target object recognition network may be an Attention-based recognition network, a CTC (Connectionist Temporal Classification)-based recognition network, or another target object recognition network derived from these two frameworks.
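For the CTC-based variant mentioned above, the standard greedy decoding rule (take the best class per timestep, merge consecutive repeats, then drop blanks) fits in a few lines of plain Python. The score values and the blank index 0 below are illustrative.

```python
def ctc_greedy_decode(logit_rows, blank=0):
    """Collapse a per-timestep argmax sequence using CTC rules:
    merge consecutive repeated labels, then remove blanks.
    logit_rows: one list of per-class scores per timestep."""
    best = [max(range(len(row)), key=row.__getitem__) for row in logit_rows]
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(idx)
        prev = idx
    return out
```

Note that a blank between two identical labels keeps them distinct, which is exactly what lets CTC represent repeated characters.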
In this embodiment, the feature extraction portion of the target object recognition network is used to extract standard features from the standard non-interference training samples and extract generated features from the generated non-interference samples, and then compare the standard features and the generated features to obtain feature errors therebetween. The characteristic error represents a difference between the generated non-interfering samples and standard non-interfering samples.
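A minimal sketch of this feature-error computation, assuming the extracted features are fixed-length vectors and using mean squared error as the comparison; the patent does not fix the metric, so L2 here is one reasonable choice, not the mandated one.

```python
import numpy as np

def feature_error(standard_feat, generated_feat):
    """Mean squared distance between features extracted from the
    standard non-interference sample and from the generated
    non-interference sample; this scalar is the auxiliary signal
    fed back to the interference elimination network."""
    standard_feat = np.asarray(standard_feat, dtype=float)
    generated_feat = np.asarray(generated_feat, dtype=float)
    return float(np.mean((standard_feat - generated_feat) ** 2))
```

A zero error means the recognizer cannot distinguish the denoised sample from the clean one at the feature level, which is the training target.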
As described in step 305, in the training of the interference rejection network, the training is assisted based on the feature error to complete the training of the interference rejection network. In other words, in the process of training the interference elimination network, the feature errors fed back by the target object recognition network are used to further optimize and adjust the relevant parameters of the interference elimination network.
Fig. 4A is a flowchart of a training method of the first type of interference rejection network in the target object recognition model shown in fig. 2. Referring to fig. 4A, the training method includes the steps of:
step 401, passing the interfered training sample through the first type interference rejection network to output a first generated interference-free sample;
step 402, respectively extracting a first standard feature from the standard non-interference training sample and a first generated feature from the first generated non-interference sample by using the target object recognition network;
step 403, comparing the first standard feature and the first generated feature to obtain a first feature error;
step 404, in the process of training the first type of interference rejection network, training the first type of interference rejection network based on the first characteristic error in an auxiliary manner, so as to complete training of the first type of interference rejection network.
In this embodiment, the first type of interference rejection network is an interference rejection network based on a generative adversarial network. As those skilled in the art understand, a generative adversarial network (hereinafter, GAN) is a deep learning model composed of a generation network G and a discrimination network D. During training, the goal of the generation network G is to generate pictures as close as possible to real pictures in order to deceive the discrimination network D, while the goal of the discrimination network D is to judge whether an input picture was generated by the generation network G or is a real picture; that is, the output of the discrimination network represents the probability that the input picture is real, where 1 means it is certainly a real picture and 0 means it cannot be a real picture. The generation network G and the discrimination network D thus constitute a dynamic game.
Fig. 4B is a schematic diagram illustrating a specific embodiment of the training method of the first type of interference rejection network illustrated in fig. 4A. In fig. 4B, (a) is a schematic diagram for training the generation network G by fixing the parameters of the discrimination network D; (b) Is a schematic diagram for training the discrimination network D by fixing the parameters of the generation network G.
As shown in fig. 4B, shown in (a) and (B) are processes of alternately training the generation network and the discrimination network of the generation-type countermeasure network to optimize the generation network and the discrimination network, respectively.
Specifically, (a) shows the process of fixing the parameters of the discrimination network D to train the generation network G: the interfered training sample passes through the generation network G, which outputs the first generated non-interference sample; the first generated non-interference sample and the interfered training sample are then input into the discrimination network D, which outputs the probability that the first generated non-interference sample is a standard non-interference training sample (i.e., a real non-interference sample); the discrimination result is fed back to the generation network G to adjust its parameters.
(b) shows the process of fixing the parameters of the generation network G to train the discrimination network D, after the parameters of G have been adjusted according to the training process shown in (a). One of the first generated non-interference sample output by the generation network G or the standard non-interference training sample is randomly selected as the input sample to the discrimination network D. The discrimination network D compares the received input sample with the interfered training sample, thereby identifying whether the input sample is the first generated non-interference sample or the standard non-interference training sample, and outputs the probability that the input sample is the standard non-interference training sample.
The generating network and the discriminating network are trained alternately in the above manner to optimize parameters in the generating network and the discriminating network, respectively.
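The alternation described above can be sketched structurally as follows. The parameter updates here are placeholders (simple decrements), intended only to show the schedule: one network's parameters stay fixed while the other's are adjusted, round after round.

```python
# Sketch of alternating GAN training: in each round, first update the
# generation network G with the discrimination network D frozen, then
# update D with G frozen.

g_params = {"w": 1.0}
d_params = {"w": 1.0}
history = []

def train_g_step():
    # D is fixed here: only G's parameters change (placeholder update).
    g_params["w"] -= 0.1
    history.append(("G", g_params["w"], d_params["w"]))

def train_d_step():
    # G is fixed here: only D's parameters change (placeholder update).
    d_params["w"] -= 0.1
    history.append(("D", g_params["w"], d_params["w"]))

for _ in range(3):          # three alternating rounds
    train_g_step()
    train_d_step()

print([tag for tag, _, _ in history])  # ['G', 'D', 'G', 'D', 'G', 'D']
```

In a real implementation each step would be a gradient update against the adversarial loss, but the freeze/update alternation is exactly this schedule.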
In the process of training the generating network G, the target object recognition network is used to extract a first standard feature from the standard non-interference training sample and extract a first generating feature from the first generating non-interference sample, and then the first standard feature and the first generating feature are compared to obtain a first feature error, so that training of the generating network is assisted based on the first feature error, and the trained generating network is used as the first type interference elimination network after training is completed.
Further, in the process of training the generation network G, pixel features are extracted from the standard non-interference training sample and the first generated non-interference sample respectively, and the two are compared at the pixel level to determine a pixel-level error between them; this pixel-level error is then also used to assist in training the generation network. In this way, while the generation network is trained based on the feedback of the discrimination network, the first generated non-interference sample obtained during training is compared with the standard non-interference training sample at both the pixel level and the feature level, so that the generation network can generate a finer first generated non-interference sample.
From a mathematical model perspective, the first type of interference rejection network may be represented by the following formula:
G* = arg min_G max_D L_cGAN(G, D) + λ1·L_L1(G) + λ2·L_L2(G)
where G is the first type of interference rejection network, L_cGAN(G, D) is the general loss function of the GAN network, L_L1(G) is the loss function for pixel-level comparison between the generated first generated non-interference sample and the standard non-interference training sample, and L_L2(G) is the loss function for feature-level comparison between the first generated non-interference sample and the standard non-interference training sample. During training, the discrimination network D is adjusted to make L_cGAN(G, D) as large as possible, while the generation network G is adjusted to make L_cGAN(G, D) as small as possible, so that G and D form an adversarial relationship under which the first generated non-interference sample closely approximates the standard non-interference training sample. L_L1(G) ensures, through pixel-level comparison, that the generated first generated non-interference sample resembles the standard non-interference training sample in detail; L_L2(G) ensures, through feature-level comparison, that the information in the first generated non-interference sample that is decisive for the text recognition result is consistent with the standard non-interference training sample.
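A toy numeric sketch of evaluating the combined objective for one fixed (G, D) state follows. The concrete loss definitions (a binary cross-entropy stand-in for L_cGAN, mean absolute pixel difference for L_L1, mean squared feature difference for L_L2) and the weights λ1, λ2 are illustrative assumptions, not values from the patent.

```python
import math

# Toy evaluation of L = L_cGAN + λ1·L_L1 + λ2·L_L2 for a fixed (G, D) state.

def l_cgan(d_real, d_fake):
    # Stand-in GAN loss: -log D(real) - log(1 - D(fake)).
    return -math.log(d_real) - math.log(1.0 - d_fake)

def l_pixel(std, gen):
    # L_L1: mean absolute pixel difference (detail-level similarity).
    return sum(abs(a - b) for a, b in zip(std, gen)) / len(std)

def l_feature(f_std, f_gen):
    # L_L2: mean squared feature difference (recognition-level consistency).
    return sum((a - b) ** 2 for a, b in zip(f_std, f_gen)) / len(f_std)

lam1, lam2 = 10.0, 1.0      # assumed weights
std_pixels, gen_pixels = [0.2, 0.8, 0.5], [0.2, 0.7, 0.5]
std_feats, gen_feats = [1.0, 2.0], [1.0, 2.5]

total = (l_cgan(d_real=0.9, d_fake=0.2)
         + lam1 * l_pixel(std_pixels, gen_pixels)
         + lam2 * l_feature(std_feats, gen_feats))
print(round(total, 4))
```

The generation network is trained to push all three terms down; the discrimination network is trained to push only the first term up.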
It should be noted that, in the target object recognition model, the number of the first type of interference rejection networks may be determined according to interference factors to be rejected. For example, a plurality of interference factors (such as illumination interference, blur interference and color interference) can be eliminated through one first-type interference elimination network, in this case, the calculation amount of the first-type interference elimination network is large, and the network structure is complex. For another example, a plurality of first-type interference rejection networks may be provided in cascade with each other, each of the first-type interference rejection networks being configured to reject an interference factor, in which case the calculation amount of each of the first-type interference rejection networks is greatly reduced.
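The cascade option can be sketched as a chain of single-purpose stages composed in sequence. The stage names and the tag-set representation of "interference factors" are illustrative assumptions; each stage stands in for one trained first-type network.

```python
# Sketch of cascading first-type networks: each stage removes one
# interference factor; the stages are composed in sequence. An image is
# modeled here simply as the set of interference tags still present.

def remove_illumination(tags):
    return tags - {"illumination"}

def remove_blur(tags):
    return tags - {"blur"}

def remove_color(tags):
    return tags - {"color"}

CASCADE = [remove_illumination, remove_blur, remove_color]

def run_cascade(tags):
    for stage in CASCADE:          # each stage is one first-type network
        tags = stage(tags)
    return tags

sample = {"illumination", "blur", "color", "perspective"}
print(run_cascade(sample))  # {'perspective'}: only second-type interference remains
```

The design trade-off in the paragraph above maps directly onto this structure: one heavy stage handling all three factors versus three light stages, each responsible for a single factor.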
FIG. 5A is a flow chart of one embodiment of a training method for a second type of interference rejection network in the target object recognition model shown in FIG. 2. Referring to fig. 5A, the training method includes the steps of:
step 501, outputting a second generated non-interference sample after the first generated non-interference sample passes through the second type of interference rejection network;
step 502, respectively extracting a second standard feature from the standard non-interference training sample and extracting a second generated feature from the second generated non-interference sample by using the target object recognition network;
step 503, comparing the second standard feature and the second generated feature to obtain a second feature error;
step 504, in the process of training the second-type interference rejection network, training the second-type interference rejection network based on the second characteristic error in an auxiliary manner, so as to complete training of the second-type interference rejection network.
In this embodiment, the second type of interference rejection network receives the first generated non-interference sample output by the first type of interference rejection network. The first generated non-interference sample is a sample from which illumination interference, blur interference, and color interference have been eliminated; the second type of interference rejection network further eliminates any one or more of the oblique perspective interference and bending interference of the target object from the first generated non-interference sample.
The second type of interference rejection network is an orientation correction network; for example, a correction network based on a thin plate spline (TPS) model or a correction network based on an affine transformation (Affine Transformation) may be employed. The training method of the orientation correction network can be determined according to the particular correction network. Taking the affine-transformation-based correction network as an example, a neural network is used to regress the four corners of the text, and geometric means such as an affine transformation are used to rectify the text to a horizontal position.
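The geometric step of affine rectification can be sketched as follows. An affine map has six parameters, so three regressed corner points suffice to determine it; the corner coordinates and target box size below are assumed values for illustration, and the corner regression itself (a neural network in the text above) is not shown.

```python
# Sketch of affine rectification: given three regressed corner points of a
# tilted text region, solve for the affine map that sends them to the
# corners of an axis-aligned (horizontal) rectangle.

def solve3(A, b):
    # Solve a 3x3 linear system by Cramer's rule.
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    xs = []
    for i in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = b[r]
        xs.append(det(Ai) / d)
    return xs

def affine_from_corners(src, dst):
    # src, dst: three (x, y) pairs. Returns (a, b, c, d, e, f) with
    # x' = a*x + b*y + c,  y' = d*x + e*y + f.
    A = [[x, y, 1.0] for x, y in src]
    a, b, c = solve3(A, [x for x, _ in dst])
    d, e, f = solve3(A, [y for _, y in dst])
    return a, b, c, d, e, f

def apply_affine(p, m):
    a, b, c, d, e, f = m
    x, y = p
    return (a * x + b * y + c, d * x + e * y + f)

# Tilted text corners (assumed values) mapped to a 100x30 horizontal box.
src = [(10.0, 20.0), (110.0, 40.0), (5.0, 50.0)]
dst = [(0.0, 0.0), (100.0, 0.0), (0.0, 30.0)]
m = affine_from_corners(src, dst)
```

Applying `m` to each source corner recovers the corresponding target corner; applying it to every pixel coordinate would warp the tilted text into the horizontal box.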
A schematic diagram of one embodiment of a training method for a second type of interference rejection network is shown in connection with reference to fig. 5B.
In this embodiment, unlike the prior art, the target object recognition network is used to extract a second standard feature from the standard non-interference training sample and extract a second generated feature from the second generated non-interference sample, and then compare the second standard feature and the second generated feature to obtain a second feature error, so that in the process of training the second type of interference rejection network, the second type of interference rejection network is trained in an assisted manner based on the second feature error.
In practical application, the target object recognition network is connected after the orientation correction network: the second generated non-interference sample obtained after correction by the orientation correction network is directly input into the target object recognition network, and the adjustment required by the current correction is estimated according to the recognition effect of the target object recognition network.
FIG. 6 is a flow chart of one embodiment of a method of training a target object recognition network in the target object recognition model shown in FIG. 2. Referring to fig. 6, the training method includes the steps of:
step 601, pre-training the target object recognition network by using the standard interference-free training sample;
step 602, training the pre-trained target object recognition network by using the standard non-interference training sample and the interference training sample after the interference rejection network training is completed, so as to obtain the trained target object recognition network.
In this embodiment, the target object recognition network may adopt an Attention-based recognition network, a CTC (Connectionist Temporal Classification)-based recognition network, or another target object recognition network derived from these two frameworks.
The target object recognition network includes a feature extraction portion, an encoding portion (if necessary), and a decoding portion.
The goal of training the target object recognition network is that, after training, the features it extracts from the second generated non-interference sample are substantially consistent with the features it extracts from the standard non-interference training sample, so that the trained target object recognition network can correctly recognize the target object to be recognized.
To achieve this, the target object recognition network should not start out too strong. It is therefore first trained (i.e., pre-trained) with standard non-interference training samples only, leaving it relatively susceptible to interference. The pre-trained target object recognition network is then used to assist in training the interference rejection networks (including the first type and the second type of interference rejection network). After the interference rejection networks are trained, the pre-trained target object recognition network is further trained with both the standard non-interference training samples and the interfered training samples, thereby completing its training and further improving its performance.
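The overall schedule described above can be sketched as three phases. The phase names and the trainable-component bookkeeping are illustrative; the point is which component is frozen in each phase and which data it sees.

```python
# Sketch of the training schedule: (1) pre-train the recognizer on standard
# samples; (2) freeze it and use its feature error to assist training the
# interference rejection networks; (3) further train the recognizer on
# standard + interfered samples.

SCHEDULE = [
    {"phase": "pretrain_recognizer",
     "trainable": {"recognizer"},
     "data": {"standard"}},
    {"phase": "train_rejection_networks",
     "trainable": {"rejection_type1", "rejection_type2"},
     "data": {"standard", "interfered"}},   # recognizer frozen, supplies feature error
    {"phase": "finetune_recognizer",
     "trainable": {"recognizer"},
     "data": {"standard", "interfered"}},
]

def recognizer_frozen(phase):
    return "recognizer" not in phase["trainable"]

for p in SCHEDULE:
    print(p["phase"], "recognizer frozen:", recognizer_frozen(p))
```

This makes the mutual dependence explicit: the rejection networks are never trained without the recognizer's feature error, and the recognizer's final training only happens after the rejection networks are done.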
Therefore, in the whole training process of the target object recognition network model, the interference elimination network and the target object recognition network complement each other and do not independently complete the training, so that the trained target object recognition network model can better recognize the target object.
Fig. 7 is a schematic structural diagram of an embodiment of a target object recognition device according to the present invention. Referring to fig. 7, the target object recognition apparatus 7 includes: the object to be identified receiving module 71 is configured to receive a target object to be identified, where the target object to be identified has an interference factor. The object to be identified processing module 72 is configured to input the object to be identified into a trained object identification network model; wherein the target object recognition network model comprises an interference elimination network and a target object recognition network; the interference rejection network comprises at least one first type of interference rejection network based on a generative countermeasure network and a second type of interference rejection network connected to the first type of interference rejection network. And the interference elimination module 73 is configured to eliminate interference of different interference factors in the target object to be identified by using the first type of interference elimination network and the second type of interference elimination network respectively. The target object recognition module 74 is configured to recognize the target object to be recognized after the interference rejection by using the target object recognition network.
The target object recognition means 7 further comprise an interference rejection network training module 75. The interference rejection network training module 75 is configured to obtain training sample pairs, and each of the training sample pairs includes a standard non-interference training sample and an interference training sample; outputting the interference training samples after passing through the interference elimination network to generate interference-free samples; extracting standard features from the standard non-interference training samples and extracting generated features from the generated non-interference samples respectively by utilizing the target object recognition network; comparing the standard feature with the generated feature to obtain a feature error; and in the process of training the interference elimination network, training the interference elimination network based on the characteristic error in an auxiliary mode so as to complete training of the interference elimination network.
The interference rejection network training module 75 includes a first type of interference rejection network training unit 751 for outputting a first generated interference-free sample after the interfered training sample passes through the first type of interference rejection network; extracting first standard features from the standard non-interference training samples and extracting first generated features from the first generated non-interference samples by using the target object recognition network; comparing the first standard feature with the first generated feature to obtain a first feature error; and in the process of training the first type of interference elimination network, training the first type of interference elimination network based on the first characteristic error in an auxiliary mode so as to complete training of the first type of interference elimination network.
The first-type interference rejection network training unit 751 is further configured to alternately train a generation network and a discrimination network of the generation-type countermeasure network to optimize the generation network and the discrimination network, respectively; training the generation network in an assisted manner based on the first characteristic error; and taking the trained generating network as the first-type interference elimination network after training is completed.
The interference rejection network training module 75 includes a second type of interference rejection network training unit 752 for outputting a second generated interference-free sample after the first generated interference-free sample passes through the second type of interference rejection network; respectively extracting second standard features from the standard non-interference training samples and second generated features from the second generated non-interference samples by utilizing the target object recognition network; comparing the second standard feature with the second generated feature to obtain a second feature error; and in the process of training the second-type interference elimination network, training the second-type interference elimination network based on the second characteristic error in an auxiliary mode so as to complete training of the second-type interference elimination network.
Fig. 8 is a schematic structural diagram of an embodiment of a training apparatus for a target object recognition network model according to an embodiment of the present invention. Referring to fig. 8, the training device 8 includes: the training sample acquiring module 81 is configured to acquire training sample pairs, and each training sample pair includes a standard non-interference training sample and an interference training sample. The interference-free sample generating module 82 is configured to output the interference-free training samples after passing through the interference rejection network to generate interference-free samples. The generated feature extraction module 83 is configured to extract standard features from the standard non-interference training samples and extract generated features from the generated non-interference samples, respectively, using the target object recognition network. A feature error determination module 84 for comparing the standard feature with the generated feature to obtain a feature error. And the auxiliary training module 85 is configured to assist in training the interference rejection network based on the characteristic error in the process of training the interference rejection network, so as to complete training of the interference rejection network.
Wherein the interference rejection network comprises at least one first type of interference rejection network based on a generative antagonism network. The interference-free sample generating module 82 is configured to output the interference-free training samples after passing through the first type of interference rejection network. The generated feature extraction module 83 is configured to extract a first standard feature from the standard non-interference training sample and extract a first generated feature from the first generated non-interference sample, respectively, using the target object recognition network. The feature error determination module 84 is configured to compare the first standard feature and the first generated feature to obtain a first feature error. The training assisting module 85 is configured to assist in training the first type of interference rejection network based on the first characteristic error in the process of training the first type of interference rejection network, so as to complete training of the first type of interference rejection network.
The interference rejection network further comprises a second type of interference rejection network connected to the first type of interference rejection network. The non-interference sample generating module 82 is further configured to output a second non-interference sample after the first non-interference sample passes through the second type of interference rejection network. The generated feature extraction module 83 is further configured to extract a second standard feature from the standard non-interference training sample and a second generated feature from the second generated non-interference sample, respectively, using the target object recognition network. The feature error determination module 84 is further configured to compare the second standard feature to the second generated feature to obtain a second feature error. The training assisting module 85 is further configured to assist in training the second type of interference rejection network based on the second characteristic error in the process of training the second type of interference rejection network, so as to complete training of the second type of interference rejection network.
The embodiment of the invention also provides a computer readable storage medium storing a computer program for executing the steps in the embodiment of the target object identification method.
The embodiment of the invention also provides electronic equipment, which comprises: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the steps in the embodiments of the target object identification method described above.
The embodiment of the invention also provides a computer readable storage medium, wherein the storage medium stores a computer program, and the computer program is used for executing each step in the embodiment of the training method of the target object identification network model.
The embodiment of the invention also provides electronic equipment, which comprises: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the steps in the foregoing embodiments of the training method for the target object recognition network model.
The foregoing description covers only preferred embodiments of the invention and is not intended to limit it; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A method for identifying a target object, comprising:
receiving a target object to be identified, wherein the target object to be identified is a text picture with interference factors;
inputting the target object to be identified into a trained target object identification network model; wherein the target object recognition network model comprises an interference elimination network and a target object recognition network; the interference elimination network comprises at least one first type of interference elimination network based on a generation type countermeasure network and a second type of interference elimination network connected with the first type of interference elimination network;
the first type interference elimination network is used for eliminating noise interference of a to-be-identified text picture, wherein the noise interference comprises picture background interference, illumination interference, fuzzy interference and color interference, and the second type interference elimination network is used for eliminating text oblique perspective interference and text bending interference in the to-be-identified text picture;
and identifying the target object to be identified after interference elimination by utilizing the target object identification network.
2. The method of claim 1, wherein the interference rejection network is trained by:
acquiring training sample pairs, wherein each training sample pair comprises a standard non-interference training sample and an interference training sample;
outputting the interference training samples after passing through the interference elimination network to generate interference-free samples;
extracting standard features from the standard non-interference training samples and extracting generated features from the generated non-interference samples respectively by utilizing the target object recognition network;
comparing the standard feature with the generated feature to obtain a feature error;
and in the process of training the interference elimination network, training the interference elimination network based on the characteristic error in an auxiliary mode so as to complete training of the interference elimination network.
3. The method of claim 2, wherein the first type of interference rejection network is trained by:
outputting a first generated interference-free sample after the interference training sample passes through the first type interference elimination network;
extracting first standard features from the standard non-interference training samples and extracting first generated features from the first generated non-interference samples by using the target object recognition network;
comparing the first standard feature with the first generated feature to obtain a first feature error;
and in the process of training the first type of interference elimination network, training the first type of interference elimination network based on the first characteristic error in an auxiliary mode so as to complete training of the first type of interference elimination network.
4. The method of claim 3, wherein the training the first type of interference rejection network based on the first characteristic error in training the first type of interference rejection network to complete training of the first type of interference rejection network comprises:
alternately training a generation network and a discrimination network of the generation type countermeasure network to optimize the generation network and the discrimination network, respectively;
training the generation network in an assisted manner based on the first characteristic error;
and taking the trained generating network as the first-type interference elimination network after training is completed.
5. The method of claim 3, wherein the second type of interference rejection network is trained by:
outputting a second generation non-interference sample after the first generation non-interference sample passes through the second type interference elimination network;
respectively extracting second standard features from the standard non-interference training samples and second generated features from the second generated non-interference samples by utilizing the target object recognition network;
comparing the second standard feature with the second generated feature to obtain a second feature error;
and in the process of training the second-type interference elimination network, training the second-type interference elimination network based on the second characteristic error in an auxiliary mode so as to complete training of the second-type interference elimination network.
6. A target object recognition apparatus, characterized by comprising:
the object to be identified receiving module is used for receiving a target object to be identified, wherein the target object to be identified is a text picture with interference factors;
the object to be identified processing module is used for inputting the object to be identified into a trained object identification network model; wherein the target object recognition network model comprises an interference elimination network and a target object recognition network; the interference elimination network comprises at least one first type of interference elimination network based on a generation type countermeasure network and a second type of interference elimination network connected with the first type of interference elimination network;
The interference elimination module is used for eliminating interference of different interference factors in the target object to be identified by utilizing the first-type interference elimination network and the second-type interference elimination network respectively, wherein the first-type interference elimination network is used for eliminating noise interference of a character picture to be identified, the noise interference comprises picture background interference, illumination interference, fuzzy interference and color interference, and the second-type interference elimination network is used for eliminating character oblique perspective interference and character bending interference in the character picture to be identified;
and the target object identification module is used for identifying the target object to be identified after interference elimination by utilizing the target object identification network.
7. The apparatus of claim 6, further comprising an interference rejection network training module: the interference elimination network training module is used for acquiring training sample pairs, and each training sample pair comprises a standard non-interference training sample and an interference training sample; outputting the interference training samples after passing through the interference elimination network to generate interference-free samples; extracting standard features from the standard non-interference training samples and extracting generated features from the generated non-interference samples respectively by utilizing the target object recognition network; comparing the standard feature with the generated feature to obtain a feature error; and in the process of training the interference elimination network, training the interference elimination network based on the characteristic error in an auxiliary mode so as to complete training of the interference elimination network.
8. A computer readable storage medium storing a computer program for executing the target object identification method of any one of the preceding claims 1-5.
9. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the target object identification method according to any one of the preceding claims 1-5.
CN201910788789.4A 2019-08-26 2019-08-26 Target object recognition network model, training method thereof and target object recognition method Active CN112434780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910788789.4A CN112434780B (en) 2019-08-26 2019-08-26 Target object recognition network model, training method thereof and target object recognition method


Publications (2)

Publication Number Publication Date
CN112434780A CN112434780A (en) 2021-03-02
CN112434780B true CN112434780B (en) 2023-05-30

Family

ID=74690475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910788789.4A Active CN112434780B (en) 2019-08-26 2019-08-26 Target object recognition network model, training method thereof and target object recognition method

Country Status (1)

Country Link
CN (1) CN112434780B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548159A (en) * 2016-11-08 2017-03-29 中国科学院自动化研究所 Reticulate pattern facial image recognition method and device based on full convolutional neural networks
CN108280811A (en) * 2018-01-23 2018-07-13 哈尔滨工业大学深圳研究生院 A kind of image de-noising method and system based on neural network
CN108492258A (en) * 2018-01-17 2018-09-04 天津大学 A kind of radar image denoising method based on generation confrontation network
CN108765319A (en) * 2018-05-09 2018-11-06 大连理工大学 A kind of image de-noising method based on generation confrontation network
CN108875486A (en) * 2017-09-28 2018-11-23 北京旷视科技有限公司 Recongnition of objects method, apparatus, system and computer-readable medium
CN109087269A (en) * 2018-08-21 2018-12-25 厦门美图之家科技有限公司 Low light image Enhancement Method and device
CN109360156A (en) * 2018-08-17 2019-02-19 上海交通大学 Single image rain removing method based on the image block for generating confrontation network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10482600B2 (en) * 2018-01-16 2019-11-19 Siemens Healthcare Gmbh Cross-domain image analysis and cross-domain image synthesis using deep image-to-image networks and adversarial networks
US10719742B2 (en) * 2018-02-15 2020-07-21 Adobe Inc. Image composites using a generative adversarial neural network



Similar Documents

Publication Title
WO2021027336A1 Authentication method and apparatus based on seal and signature, and computer device
CN111401372B Method for extracting and recognizing image-text information in scanned documents
CN105917353B Feature extraction and matching for biometric identification, and template update
Semary et al. Currency recognition system for visually impaired: Egyptian banknote as a study case
CN110008909B AI-based real-time auditing system for real-name business
CN111475797A Adversarial image generation method, device and equipment, and readable storage medium
CN107798279B Face liveness detection method and device
CN111340716B Image deblurring method based on an improved dual-discriminator adversarial network model
CN108960214A Fingerprint enhancement and binarization method, device, equipment, system and storage medium
CN110059607B Multiplexed liveness detection method and device, computer equipment and storage medium
Du et al. A new approach to iris pattern recognition
CN111783761A Certificate text detection method and device, and electronic equipment
CN113592776A Image processing method and device, electronic device and storage medium
Devadethan et al. Face detection and facial feature extraction based on a fusion of knowledge-based methods and morphological image processing
CN112949464B Face-swap forgery detection method, system and equipment based on three-dimensional face shape
CN112434780B Target object recognition network model, training method thereof and target object recognition method
CN104966271B Image denoising method based on the receptive-field mechanism of biological vision
CN116596891A Wood-floor color classification and defect detection method based on semi-supervised multi-task detection
CN113807237B Liveness detection model training, liveness detection method, computer device, and medium
CN110059557A Low-illumination-adaptive face recognition method
CN114332983A Face image sharpness detection method and device, electronic equipment and medium
CN110134924A Overlaid text component extraction method and device, text recognition system and storage medium
CN114694196A Liveness classifier construction method, face liveness detection method and device
CN108038516B Greige-cloth flatness grading method based on low-dimensional image coding and ensemble learning
CN108960222B Image binarization method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant