CN111104942B - Template matching network training method, recognition method and device - Google Patents
- Publication number
- CN111104942B CN111104942B CN201911248538.3A CN201911248538A CN111104942B CN 111104942 B CN111104942 B CN 111104942B CN 201911248538 A CN201911248538 A CN 201911248538A CN 111104942 B CN111104942 B CN 111104942B
- Authority
- CN
- China
- Prior art keywords
- target
- target object
- frame
- training
- sample
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
- G06T7/344—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models
Abstract
The application discloses a template matching network training method, a recognition method and a device, belonging to the fields of computer vision and deep learning. The method comprises the following steps: obtaining a sample template; generating training data comprising a plurality of sample pictures based on the sample template, and recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture; normalizing the training data to obtain target training data, and extracting features from the target training data with a convolutional neural network to obtain feature data; training a proposal box network on the feature data, retaining the target proposal boxes that meet preset requirements, mapping each target proposal box to the corresponding position on the feature map, and rotating it to obtain a target feature map; and training a neural network on the target feature maps to obtain a template matching network. The application can accurately locate a target object and the pixel area it occupies.
Description
Technical Field
The application belongs to the fields of computer vision and deep learning, and particularly relates to a template matching network training method, a recognition method and a device.
Background
Traditional template matching, a common method in the vision field, is widely applied in simple scenarios with good illumination and a clear separation between foreground and background, but it cannot adapt to complex scenes. Deep-learning-based object detection can detect various objects in complex environments, classifying and localizing them with an axis-aligned bounding box (AABB) that envelops each detected object. However, the area outlined by the AABB box is typically much larger than the pixel area actually occupied by the target object, so in certain applications (e.g., unordered robotic grasping) the AABB box has little value as a positioning reference.
Disclosure of Invention
Aiming at the above defects or improvement demands of the prior art, the application provides a template matching network training method, a recognition method and a device, solving the technical problem that the existing AABB positioning approach cannot accurately locate a target object and the pixel area it occupies in a complex environment.
To achieve the above object, according to one aspect of the present application, there is provided a template matching network training method, including:
(1) Obtaining a sample template, wherein the sample template at least comprises a target object image and a background image, and the sample template comprises a target outline of a target object;
(2) Generating training data comprising a plurality of sample pictures based on the sample template, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object;
(3) Performing normalization processing on the training data to obtain target training data, and performing feature extraction on the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scale of the target training data and contour feature data of each target object;
(4) Training a proposal box network according to the feature data, retaining target proposal boxes meeting preset requirements, mapping each target proposal box to the corresponding position on the feature map, and rotating it to obtain a target feature map;
(5) And training a neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
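The normalization of step (3) and a single convolutional feature-extraction layer can be emulated in NumPy as an illustrative sketch (not the patent's actual implementation); the `sobel_x` kernel is a hand-picked stand-in for the learned contour filters of a real backbone such as VGG or ResNet:

```python
import numpy as np

def normalize(img):
    """Zero-mean, unit-variance normalization of a picture, as in step (3)."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)

def conv2d(img, kernel):
    """Naive 'valid' 2-D cross-correlation; stands in for one CNN layer."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Sobel-like kernel: responds to vertical contours, mimicking the contour
# feature data the feature extractor is described as producing.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
```

A real pipeline would stack many learned kernels per layer; the point here is only the normalize-then-convolve order of step (3).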
Preferably, step (2) comprises:
(2.1) randomly selecting target objects, randomly setting the scaling and the rotation angle of each target object based on the sample template;
and (2.2) generating training data comprising a plurality of sample pictures by adding a von neumann topological structure in a particle swarm algorithm, and recording the center point coordinates of each target object, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object in each sample picture.
Preferably, step (4) comprises:
(4.1) training a proposal box network according to the feature data, and predicting the axis-aligned proposal box and the rotation angle value of each target object, wherein the proposal box network comprises an RPN network and an angle classification network;
(4.2) obtaining the overlap ratio between each axis-aligned proposal box and the actual proposal box, and the difference between the rotation angle value of the target object in the axis-aligned proposal box and that in the actual proposal box; discarding axis-aligned proposal boxes whose overlap ratio is lower than a preset overlap threshold or whose angle difference exceeds a preset difference threshold, to obtain the retained target axis-aligned proposal boxes;
and (4.3) mapping each target axis-aligned proposal box to the corresponding position on the feature map, and performing a rotation transformation on it at that position based on the rotation angle value of the target object in the box, to obtain the target feature map.
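The retention test of step (4.2) can be sketched as follows; the box format, threshold values and helper names (`iou`, `filter_proposals`) are illustrative assumptions, not the patent's code:

```python
def iou(box_a, box_b):
    """Overlap ratio (IoU) of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_proposals(proposals, gt_box, gt_angle, iou_thresh=0.5, angle_thresh=15):
    """Keep (box, angle) proposals whose overlap with the ground-truth box
    reaches the overlap threshold and whose predicted angle is close enough."""
    kept = []
    for box, angle in proposals:
        d = abs(angle - gt_angle) % 360
        d = min(d, 360 - d)  # wrap-around angle difference
        if iou(box, gt_box) >= iou_thresh and d <= angle_thresh:
            kept.append((box, angle))
    return kept
```

During training the retained boxes would then be mapped onto the feature map as in step (4.3).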
Preferably, in step (4.1), predicting the rotation angle value of the target object includes:
and classifying the angles, wherein the standard angle and several degrees to either side of it are assigned positive label values, while all remaining angle labels are set to 0, and wherein the standard angle is the direction from the centroid of the target object along the long side of the minimum rectangular envelope box.
According to another aspect of the present application, there is provided an identification method comprising:
inputting the picture to be identified into the template matching network trained by the template matching network training method of any one of the above, and carrying out identification processing to obtain category information and pose information of each target object contained in the picture to be identified.
According to another aspect of the present application, there is provided a template matching network training apparatus comprising:
the template acquisition module is used for acquiring a sample template, wherein the sample template at least comprises a target object image and a background image, and the sample template comprises a target outline of a target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a proposal box network according to the feature data, retaining target proposal boxes meeting preset requirements, mapping each target proposal box to the corresponding position on the feature map, and rotating it to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
Preferably, the training data acquisition module includes:
the preprocessing module is used for randomly selecting target objects based on the sample template, and randomly setting the scaling and the rotation angle of each target object;
the training data acquisition sub-module is used for generating training data comprising a plurality of sample pictures in a mode of adding a von Neumann topological structure in a particle swarm algorithm, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object.
Preferably, the feature map acquisition module includes:
the first training module is used for training a proposal box network according to the feature data and predicting the axis-aligned proposal box and the rotation angle value of each target object, wherein the proposal box network comprises an RPN network and an angle classification network;
the judging and processing module is used for obtaining the overlap ratio between each axis-aligned proposal box and the actual proposal box and the difference between the rotation angle value of the target object in the axis-aligned proposal box and that in the actual proposal box, and for discarding axis-aligned proposal boxes whose overlap ratio is lower than a preset overlap threshold or whose angle difference exceeds a preset difference threshold, to obtain the retained target axis-aligned proposal boxes;
and the feature map acquisition sub-module is used for mapping each target axis-aligned proposal box to the corresponding position on the feature map and then performing a rotation transformation on it at that position based on the rotation angle value of the target object in the box, to obtain the target feature map.
Preferably, predicting the rotation angle value of the target object includes:
and classifying the angles, wherein the standard angle and several degrees to either side of it are assigned positive label values, while all remaining angle labels are set to 0, and wherein the standard angle is the direction from the centroid of the target object along the long side of the minimum rectangular envelope box.
According to another aspect of the present application, there is provided an identification device comprising:
the recognition result acquisition module is used for inputting the picture to be recognized into the template matching network trained by the template matching network training device to perform recognition processing to obtain the category information and the pose information of each target object contained in the picture to be recognized.
In general, the above technical solutions conceived by the present application, compared with the prior art, enable the following beneficial effects to be obtained:
1. According to the application, through angle prediction, feature mapping, feature map rotation and the like, the center position of each of various target objects, its scaling ratio relative to the template, its rotation angle and other information can be detected under complex background conditions.
2. The RPN network is additionally given the function of predicting angles; angle prediction uses a classification method that divides angles into 360 categories. The standard angle and 5 degrees to either side of it receive positive label values, and all remaining angle labels are set to 0. Positive labels are assigned as follows: the label value of the "standard angle" is 1, decreasing by 0.2 for each degree of deviation. In addition, a symmetrical object has several "standard angles": for example, a rectangle has 2 standard angles, a square has 4, and for a circle every angle is a standard angle.
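The label-assignment rule above can be sketched with a hypothetical helper. Note a small wrinkle in the source: it says the standard angle and 5 degrees to either side are positive, yet 1 − 0.2·5 = 0, so strictly positive labels actually span ±4 degrees; the sketch follows the arithmetic:

```python
import numpy as np

def angle_labels(standard_angles, num_bins=360):
    """Soft classification labels over 360 one-degree bins: each standard
    angle gets label 1.0, decreasing by 0.2 per degree of deviation, and all
    bins farther than 4 degrees from every standard angle stay at 0."""
    labels = np.zeros(num_bins)
    for std in standard_angles:
        for off in range(-4, 5):  # offsets with 1.0 - 0.2*|off| > 0
            idx = (std + off) % num_bins
            labels[idx] = max(labels[idx], 1.0 - 0.2 * abs(off))
    return labels
```

A square would be encoded with four standard angles 90 degrees apart, e.g. `angle_labels([0, 90, 180, 270])`.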
3. The ROIAffine method is proposed. The conventional ROI pooling method maps each proposal box onto the feature map and cuts out a new feature map region (hereinafter new_featuremap), then predicts the length, width and category of the object from new_featuremap. Because the target object in new_featuremap has not been angle-corrected, the features used to compute length and width are degraded by the varied angles, so the predicted length and width of the target object are usually inaccurate. The ROIAffine proposed by the application performs cropping and angle correction simultaneously during feature mapping, so the length and width of the target object can be computed more accurately from the cropped feature map.
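The fused crop-plus-rotation idea behind ROIAffine can be illustrated with a nearest-neighbour sketch on a single-channel map (illustrative only; a real implementation would operate on batched multi-channel tensors with bilinear sampling):

```python
import numpy as np

def roi_affine(feature_map, cx, cy, w, h, angle_deg):
    """Crop a w x h window centred at (cx, cy) from feature_map while rotating
    it by angle_deg, in one step: crop and angle correction fused into a
    single inverse mapping, which is the core idea of ROIAffine."""
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    out = np.zeros((h, w), dtype=feature_map.dtype)
    for yo in range(h):
        for xo in range(w):
            # offset of the output pixel from the window centre
            dx, dy = xo - (w - 1) / 2.0, yo - (h - 1) / 2.0
            # rotate the offset back into feature-map coordinates
            xs = cx + dx * cos_t - dy * sin_t
            ys = cy + dx * sin_t + dy * cos_t
            xi, yi = int(round(xs)), int(round(ys))
            if 0 <= yi < feature_map.shape[0] and 0 <= xi < feature_map.shape[1]:
                out[yo, xo] = feature_map[yi, xi]
    return out
```

With `angle_deg=0` this reduces to a plain crop, which is what ordinary ROI pooling would produce.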
4. A method of generating training data based on an improved particle swarm algorithm is presented. The user only needs to provide the target object images and background images and outline each target's outer contour; training images are then generated automatically.
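A minimal sketch of a particle swarm optimizer with a von Neumann (4-neighbour grid) topology, the local-best variant the data generator builds on; the cost function, swarm size and coefficients here are illustrative assumptions, and the real generator would optimize object placements rather than a toy objective:

```python
import numpy as np

def von_neumann_neighbors(i, rows, cols):
    """Indices of the 4-connected grid neighbours of particle i (wrap-around)."""
    r, c = divmod(i, cols)
    return [((r - 1) % rows) * cols + c,
            ((r + 1) % rows) * cols + c,
            r * cols + (c - 1) % cols,
            r * cols + (c + 1) % cols]

def pso_minimize(f, dim, rows=4, cols=5, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Local-best PSO: each particle follows the best of its von Neumann
    neighbourhood instead of a single global best, trading some convergence
    speed for better global search."""
    rng = np.random.default_rng(seed)
    n = rows * cols
    x = rng.uniform(-1.0, 1.0, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    for _ in range(iters):
        for i in range(n):
            hood = von_neumann_neighbors(i, rows, cols) + [i]
            lbest = pbest[hood[np.argmin(pbest_val[hood])]]
            r1, r2 = rng.random(dim), rng.random(dim)
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (lbest - x[i])
            x[i] += v[i]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i].copy(), val
    k = int(np.argmin(pbest_val))
    return pbest[k], pbest_val[k]
```

For a compact-placement objective, `f` would score the overlap and spread of the candidate object positions encoded in each particle.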
Drawings
FIG. 1 is a schematic flow chart of a training method of a template matching network according to an embodiment of the present application;
FIG. 2 is a result diagram of the proposal box network provided by an embodiment of the present application;
fig. 3 is a result diagram after rotating the proposal box regions in a picture, according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. In addition, the technical features of the embodiments of the present application described below may be combined with each other as long as they do not collide with each other.
In order to accurately position a target object and a pixel area where the target object is located in a complex environment, the application provides a template matching network training method, a template matching method and a template matching device, which can detect the information such as the central position of various target objects, the scaling ratio and the rotation angle relative to a template and the like under the complex background condition.
Fig. 1 is a schematic flow chart of a training method for a template matching network according to an embodiment of the present application, including the following steps:
s1: obtaining a sample template, wherein the sample template at least comprises a target object graph and a background graph, and the sample template comprises a target outline of a target object;
in the step of making the template, a user only needs to provide all the object images and the background images of the target, outline the outer outline of the target, mark the template type and determine the angle orientation.
S2: processing the sample template obtained in the step S1 by using an improved particle swarm algorithm, generating sample picture training data, and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of the target object, the scaling ratio of the target object and the category information of the target object;
in an embodiment of the application, training data is generated based on the improved particle swarm. Randomly selecting a target object, randomly setting the scaling and the rotation angle of the target object. In order to make the arrangement of the objects in each piece of picture data as compact as possible, in the embodiment of the present application, the object positions are arranged using an improved particle swarm algorithm. And automatically recording the center point, the angle, the scaling ratio and the category information of each target object when generating the picture. To balance the speed and global search capability at optimization iterations, von neumann topology is added to the particle swarm algorithm. In order to improve the convergence rate of the particle swarm algorithm, principal component analysis is performed on the generated picture in the first half period, and the search space in iteration is reduced.
S3: normalizing the training data to obtain target training data, and extracting features of the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scales of the target training data and contour feature data of each target object;
in the embodiment of the present application, the convolutional neural network may be vgg, acceptable v3, resnet, etc., and the embodiment of the present application is not limited to uniqueness.
S4: training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a position corresponding to the feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
as an alternative embodiment, step S4 may be implemented in the following manner:
s4.1: training a proposal box network according to the feature data, and predicting the axis-aligned proposal box, rotation angle value and score value of each target object, wherein the proposal box network comprises an RPN network and an angle classification network; a result diagram of the proposal box network provided by the embodiment of the application is shown in fig. 2;
as an optional implementation, angle prediction adopts a classification method that divides angles into a number of categories, where the standard angle and several degrees around it receive positive label values and all remaining angle labels are set to 0; the standard angle is the direction from the centroid of the target object along the long side of the minimum rectangular envelope box.
In the embodiment of the application, how many degrees around the standard angle receive positive label values can be determined according to actual needs; the embodiment is not limited to a single choice. Preferably, the standard angle and about 5 degrees to either side of it receive positive values.
Positive labels are assigned as follows: the label value of the "standard angle" is 1, decreasing by 0.2 for each degree of deviation. In addition, a symmetrical object has several "standard angles": for example, a square has 4 standard angles, and for a circle every angle is a standard angle.
S4.2: obtaining the overlap ratio between each axis-aligned proposal box and the actual proposal box, and the difference between the rotation angle value of the target object in the axis-aligned proposal box and that in the actual proposal box; axis-aligned proposal boxes whose overlap ratio is lower than a preset overlap threshold or whose angle difference exceeds a preset difference threshold are discarded, yielding the retained target axis-aligned proposal boxes;
s4.3: mapping each target axis-aligned proposal box to the corresponding position on the feature map, and performing a rotation transformation on it at that position based on the rotation angle value of the target object in the box, to obtain the target feature map; fig. 3 shows the result after the proposal box regions in a picture have been rotated.
S5: and training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
In the embodiment of the application, the position and the angle value of the central point obtained in the step S4.2 can be finely adjusted through a template matching network.
In another embodiment of the present application, there is also provided an identification method including:
and inputting the picture to be identified into a trained template matching network for identification processing to obtain the category information and the pose information of each target object contained in the picture to be identified.
The pose information of the target object comprises a center position, a scaling ratio, a rotation angle and the like.
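The recognition output described above (a category plus pose information) might be organised as follows; `Match` and `summarize` are hypothetical names used only for illustration of the output structure, not the patent's API:

```python
from dataclasses import dataclass

@dataclass
class Match:
    """One recognised target as output by the template matching network:
    category plus pose (centre position, scaling ratio relative to the
    template, rotation angle)."""
    category: str
    cx: float
    cy: float
    scale: float
    angle_deg: float

def summarize(matches):
    """Group recognition results by category for downstream use, e.g. to
    feed grasp planning in unordered robotic grasping."""
    out = {}
    for m in matches:
        out.setdefault(m.category, []).append((m.cx, m.cy, m.scale, m.angle_deg))
    return out
```

A caller would build one `Match` per detected target from the network's outputs and pass the list to `summarize`.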
In another embodiment of the present application, there is also provided a template matching network training apparatus, including:
the template acquisition module is used for acquiring a sample template, wherein the sample template at least comprises a target object image and a background image, and the sample template comprises a target outline of a target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a proposal box network according to the feature data, retaining target proposal boxes meeting preset requirements, mapping each target proposal box to the corresponding position on the feature map, and rotating it to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
In the embodiment of the present application, specific implementation manners of each module may refer to descriptions of method embodiments, and the embodiment of the present application will not be repeated.
In another embodiment of the present application, there is also provided an identification device including:
the recognition result acquisition module is used for inputting the picture to be recognized into the trained template matching network for recognition processing to obtain the category information and the pose information of each target object contained in the picture to be recognized.
The pose information of the target object comprises a center position, a scaling ratio, a rotation angle and the like.
It should be noted that each step/component described in the present application may be split into more steps/components, or two or more steps/components or part of operations of the steps/components may be combined into new steps/components, according to the implementation needs, to achieve the object of the present application.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the application and is not intended to limit the application, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the application are intended to be included within the scope of the application.
Claims (10)
1. A template matching network training method, comprising:
(1) Obtaining a sample template, wherein the sample template at least comprises a target object image and a background image, and the sample template comprises a target outline of a target object;
(2) Generating training data comprising a plurality of sample pictures based on the sample template, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object;
(3) Performing normalization processing on the training data to obtain target training data, and performing feature extraction on the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scale of the target training data and contour feature data of each target object;
(4) Training a proposal box network according to the feature data, retaining target proposal boxes meeting preset requirements, mapping each target proposal box to the corresponding position on the feature map, and rotating it to obtain a target feature map;
(5) And training a neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
2. The method of claim 1, wherein step (2) comprises:
(2.1) randomly selecting target objects based on the sample template, and randomly setting the scaling ratio and rotation angle of each target object;
(2.2) generating training data comprising a plurality of sample pictures by adding a von Neumann topology to a particle swarm algorithm, and recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture.
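Step (2.2) guides sample generation with a particle swarm algorithm whose particles communicate over a von Neumann topology, i.e. each particle is influenced only by its four grid neighbours rather than a global best. A minimal sketch of such a neighbourhood-guided PSO update follows; the wrapping grid layout, inertia/acceleration coefficients and minimisation convention are illustrative assumptions:

```python
import numpy as np

def von_neumann_neighbors(index, rows, cols):
    """Return the 4-connected (von Neumann) neighbours of a particle
    laid out on a wrapping rows x cols grid."""
    r, c = divmod(index, cols)
    return [((r - 1) % rows) * cols + c,   # up
            ((r + 1) % rows) * cols + c,   # down
            r * cols + (c - 1) % cols,     # left
            r * cols + (c + 1) % cols]     # right

def pso_step(positions, velocities, scores, best_pos, rows, cols,
             w=0.7, c1=1.4, c2=1.4, rng=None):
    """One particle-swarm update where each particle follows the best
    (lowest-score) particle in its von Neumann neighbourhood instead of
    a single global best."""
    rng = rng or np.random.default_rng(0)
    n, d = positions.shape
    for i in range(n):
        nbrs = von_neumann_neighbors(i, rows, cols) + [i]
        local_best = best_pos[min(nbrs, key=lambda j: scores[j])]
        r1, r2 = rng.random(d), rng.random(d)
        velocities[i] = (w * velocities[i]
                         + c1 * r1 * (best_pos[i] - positions[i])
                         + c2 * r2 * (local_best - positions[i]))
        positions[i] += velocities[i]
    return positions, velocities
```

The local topology slows the spread of the best solution across the swarm, which is commonly used to preserve diversity, here plausibly diversity of the generated sample layouts.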
3. The method according to claim 1 or 2, wherein step (4) comprises:
(4.1) training a suggestion frame network according to the feature data, and predicting the axis-aligned suggestion frame of each target object and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN (region proposal network) and an angle classification network;
(4.2) obtaining the overlap ratio between the axis-aligned suggestion frame and the actual suggestion frame, and obtaining the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and the rotation angle value of the target object in the actual suggestion frame; discarding any axis-aligned suggestion frame whose overlap ratio is lower than a preset overlap threshold, and keeping as target axis-aligned suggestion frames those whose overlap ratio reaches the threshold and whose angle difference is smaller than a preset difference threshold;
(4.3) mapping the target axis-aligned suggestion frame to the corresponding position on the feature map, and performing a rotation transformation on it at that position, based on the rotation angle value of the target object in the frame, to obtain the target feature map.
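Step (4.2) keeps only proposals that agree well enough with the ground truth. A minimal sketch follows, assuming the claim's "overlap ratio" is ordinary intersection-over-union of axis-aligned boxes; the threshold values and the combined IoU-plus-angle keep rule are illustrative readings of the claim, not confirmed specifics:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def filter_proposals(proposals, gt_box, gt_angle,
                     iou_thresh=0.5, angle_thresh=15.0):
    """Discard proposals whose overlap with the ground-truth box falls
    below the IoU threshold; keep those that also predict a rotation
    angle close to the ground-truth angle."""
    kept = []
    for box, angle in proposals:
        if iou(box, gt_box) >= iou_thresh and abs(angle - gt_angle) <= angle_thresh:
            kept.append((box, angle))
    return kept
```

The retained boxes would then be mapped onto the feature map and rotated per step (4.3).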
4. The method according to claim 3, wherein predicting the rotation angle value of the target object in step (4.1) comprises:
classifying the angles, wherein the standard angle and several degrees to its left and right are given positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of the minimum rectangular envelope frame.
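The angle classification in claim 4 can be read as building a soft label vector over angle bins in which the standard angle and a few neighbouring degrees carry positive values and everything else is 0. A sketch under that reading; the bin count, the half-width of the positive window and the linear decay of the label values are assumptions:

```python
import numpy as np

def angle_labels(standard_angle, half_width=5, num_bins=360):
    """Label vector over 1-degree bins: the standard angle gets label 1.0,
    bins within half_width degrees on either side get smaller positive
    values, and all remaining bins stay 0 (wrapping around 360)."""
    labels = np.zeros(num_bins)
    for offset in range(-half_width, half_width + 1):
        # linearly decaying positive value, illustrative choice
        labels[(standard_angle + offset) % num_bins] = 1.0 - abs(offset) / (half_width + 1)
    return labels
```

Soft neighbouring labels of this kind tolerate small angle errors during training instead of penalising a one-degree miss as hard as a ninety-degree miss.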
5. A method of identification, comprising:
inputting a picture to be recognized into the template matching network trained by the template matching network training method according to any one of claims 1 to 4, and performing recognition processing to obtain the category information and pose information of each target object contained in the picture to be recognized.
6. A template matching network training device, comprising:
the template acquisition module is used for acquiring a sample template, wherein the sample template at least comprises a target object diagram and a background diagram, and the sample template comprises a target outline of a target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a corresponding position of a feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
7. The apparatus of claim 6, wherein the training data acquisition module comprises:
the preprocessing module is used for randomly selecting target objects based on the sample template, and randomly setting the scaling and the rotation angle of each target object;
the training data acquisition sub-module is used for generating training data comprising a plurality of sample pictures in a mode of adding a von Neumann topological structure in a particle swarm algorithm, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object.
8. The apparatus according to claim 6 or 7, wherein the feature map acquisition module includes:
the first training module is used for training a suggestion frame network according to the feature data and predicting the axis-aligned suggestion frame of each target object and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN (region proposal network) and an angle classification network;
the judging and processing module is used for obtaining the overlap ratio between the axis-aligned suggestion frame and the actual suggestion frame and the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and that in the actual suggestion frame, discarding any axis-aligned suggestion frame whose overlap ratio is lower than a preset overlap threshold, and keeping as target axis-aligned suggestion frames those whose overlap ratio reaches the threshold and whose angle difference is smaller than a preset difference threshold;
the feature map acquisition sub-module is used for mapping the target axis-aligned suggestion frame to the corresponding position on the feature map and performing a rotation transformation on it at that position, based on the rotation angle value of the target object in the frame, to obtain the target feature map.
9. The apparatus of claim 8, wherein predicting the rotation angle value of the target object comprises:
classifying the angles, wherein the standard angle and several degrees to its left and right are given positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of the minimum rectangular envelope frame.
10. An identification device, comprising:
the recognition result obtaining module is configured to input a picture to be recognized into the template matching network trained by the template matching network training device according to any one of claims 6 to 9 and perform recognition processing to obtain the category information and pose information of each target object contained in the picture to be recognized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911248538.3A CN111104942B (en) | 2019-12-09 | 2019-12-09 | Template matching network training method, recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911248538.3A CN111104942B (en) | 2019-12-09 | 2019-12-09 | Template matching network training method, recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111104942A CN111104942A (en) | 2020-05-05 |
CN111104942B true CN111104942B (en) | 2023-11-03 |
Family
ID=70422155
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911248538.3A Active CN111104942B (en) | 2019-12-09 | 2019-12-09 | Template matching network training method, recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111104942B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598094B (en) * | 2020-05-27 | 2023-08-18 | 深圳市铁越电气有限公司 | Angle regression instrument reading identification method, equipment and system based on deep learning |
CN111950567B (en) * | 2020-08-18 | 2024-04-09 | 创新奇智(成都)科技有限公司 | Extractor training method and device, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | Vehicle model recognition method based on a fast R-CNN deep neural network |
CN107194318A (en) * | 2017-04-24 | 2017-09-22 | 北京航空航天大学 | Scene recognition method assisted by target detection |
CN109583445A (en) * | 2018-11-26 | 2019-04-05 | 平安科技(深圳)有限公司 | Character image correction processing method, device, equipment and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9031317B2 (en) * | 2012-09-18 | 2015-05-12 | Seiko Epson Corporation | Method and apparatus for improved training of object detecting system |
2019-12-09: CN application CN201911248538.3A filed; patent CN111104942B, status Active
Non-Patent Citations (1)
Title |
---|
Sun Xian; Hu Yanfeng; Wang Hongqi. Multi-class target recognition method based on an active boundary primitive model. Journal of the Graduate University of the Chinese Academy of Sciences, 2009, (04), full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111104942A (en) | 2020-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111563442B (en) | SLAM method and system fusing point cloud and camera image data based on laser radar | |
CN110232311B (en) | Method and device for segmenting hand image and computer equipment | |
CN111080693A (en) | Robot autonomous classification grabbing method based on YOLOv3 | |
CN107833213B (en) | Weak supervision object detection method based on false-true value self-adaptive method | |
TWI395145B (en) | Hand gesture recognition system and method | |
CN111401410B (en) | Traffic sign detection method based on improved cascade neural network | |
WO2019033574A1 (en) | Electronic device, dynamic video face recognition method and system, and storage medium | |
CN105512683A (en) | Target positioning method and device based on convolution neural network | |
CN110910350B (en) | Nut loosening detection method for wind power tower cylinder | |
CN111862119A (en) | Semantic information extraction method based on Mask-RCNN | |
CN112825192B (en) | Object identification system and method based on machine learning | |
CN105069774B (en) | The Target Segmentation method of optimization is cut based on multi-instance learning and figure | |
CN111104942B (en) | Template matching network training method, recognition method and device | |
CN111401449B (en) | Image matching method based on machine vision | |
US20180225799A1 (en) | System and method for scoring color candidate poses against a color image in a vision system | |
CN109767431A (en) | Accessory appearance defect inspection method, device, equipment and readable storage medium | |
CN113989604A (en) | Tire DOT information identification method based on end-to-end deep learning | |
CN112861870A (en) | Pointer instrument image correction method, system and storage medium | |
US9053383B2 (en) | Recognizing apparatus and method, program, and recording medium | |
CN114936997A (en) | Detection method, detection device, electronic equipment and readable storage medium | |
CN113780040A (en) | Lip key point positioning method and device, storage medium and electronic equipment | |
CN111738264A (en) | Intelligent acquisition method for data of display panel of machine room equipment | |
CN110689556A (en) | Tracking method and device and intelligent equipment | |
CN110378337A (en) | Metal cutting tool drawing identification information vision input method and system | |
CN109753981B (en) | Image recognition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 2023-12-07
Address after: 1301, Building 5, Building C, Huaqiang Creative Park, Biyan Community, Guangming Street, Guangming District, Shenzhen, Guangdong Province, China, 518000
Patentee after: SHENZHEN ROBOT VISION TECHNOLOGY Co.,Ltd.
Address before: 703, 7th Floor, Zhongdian Difu Building, Zhenhua Road, Fuqiang Community, Huaqiang North Street, Futian District, Shenzhen, Guangdong Province, 518031
Patentee before: SHANGZHI TECHNOLOGY (SHENZHEN) Co.,Ltd.