CN111104942B - Template matching network training method, recognition method and device - Google Patents

Template matching network training method, recognition method and device

Info

Publication number
CN111104942B
Authority
CN
China
Prior art keywords
target
target object
frame
training
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911248538.3A
Other languages
Chinese (zh)
Other versions
CN111104942A (en)
Inventor
赵青 (Zhao Qing)
蔡旗 (Cai Qi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Robot Vision Technology Co Ltd
Original Assignee
Seizet Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seizet Technology Shenzhen Co Ltd filed Critical Seizet Technology Shenzhen Co Ltd
Priority to CN201911248538.3A
Publication of CN111104942A
Application granted
Publication of CN111104942B
Legal status: Active (current)
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/344 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a template matching network training method, a recognition method and a device, belonging to the fields of computer vision and deep learning. The method comprises the following steps: obtaining a sample template; generating training data comprising a plurality of sample pictures based on the sample template, and recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture; normalizing the training data to obtain target training data, and extracting features from the target training data with a convolutional neural network to obtain feature data; training a suggestion frame network on the feature data, retaining the target suggestion frames that meet preset requirements, mapping each target suggestion frame to its corresponding position on the feature map, and rotating it there to obtain a target feature map; and training a neural network on the target feature maps to obtain a template matching network. The application can accurately locate a target object and the pixel area it occupies.

Description

Template matching network training method, recognition method and device
Technical Field
The application belongs to the fields of computer vision and deep learning, and particularly relates to a template matching network training method, a recognition method and a device.
Background
Traditional template matching, a common method in the vision field, is widely used in simple settings with good illumination and a clearly separated foreground and background, but it cannot adapt to complex scenes. Target detection based on deep learning can detect various objects in a complex environment and classify and locate them, the specific localization form being an axis-aligned bounding box (AABB) enveloping the detected object. However, the area outlined by the AABB box is typically much larger than the pixel area occupied by the actual target object, so in certain applications (e.g., unordered robotic grasping) the positioning reference value of the AABB frame is limited.
Disclosure of Invention
Aiming at the above defects or improvement needs of the prior art, the application provides a template matching network training method, a recognition method and a device, so as to solve the technical problem that the existing AABB positioning approach cannot accurately locate a target object and the pixel area it occupies in a complex environment.
To achieve the above object, according to one aspect of the present application, there is provided a template matching network training method, including:
(1) Obtaining a sample template, wherein the sample template comprises at least a target object image and a background image, and the sample template contains the target contour of the target object;
(2) Generating training data comprising a plurality of sample pictures based on the sample template, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object;
(3) Performing normalization processing on the training data to obtain target training data, and performing feature extraction on the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scale of the target training data and contour feature data of each target object;
(4) Training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a position corresponding to a feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
(5) And training a neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
Preferably, step (2) comprises:
(2.1) randomly selecting target objects based on the sample template, and randomly setting the scaling ratio and rotation angle of each target object;
and (2.2) generating training data comprising a plurality of sample pictures by adding a von Neumann topological structure to a particle swarm algorithm, and recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture.
Preferably, step (4) comprises:
(4.1) training a suggestion frame network according to the feature data, and predicting the axis-aligned suggestion frame and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN network and an angle classification network;
(4.2) obtaining the overlap ratio between each axis-aligned suggestion frame and the actual suggestion frame, and obtaining the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and the rotation angle value of the target object in the actual suggestion frame; and if the overlap ratio is smaller than a preset overlap threshold and the difference is smaller than a preset difference threshold, discarding the axis-aligned suggestion frames whose overlap ratio is below the preset overlap threshold, to obtain the retained target axis-aligned suggestion frames;
and (4.3) mapping the target axis-aligned suggestion frame to its corresponding position on the feature map, and performing a rotation transformation on it at that position based on the rotation angle value of the target object in the frame, to obtain the target feature map.
Preferably, in step (4.1), predicting the rotation angle value of the target object includes:
classifying the angles, wherein the standard angle and several degrees to its left and right receive positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of its minimum rectangular envelope frame.
According to another aspect of the present application, there is provided an identification method comprising:
inputting the picture to be identified into the template matching network trained by the template matching network training method of any one of the above, and carrying out identification processing to obtain category information and pose information of each target object contained in the picture to be identified.
According to another aspect of the present application, there is provided a template matching network training apparatus comprising:
the template acquisition module is used for acquiring a sample template, wherein the sample template comprises at least a target object image and a background image, and the sample template contains the target contour of the target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a corresponding position of a feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
Preferably, the training data acquisition module includes:
the preprocessing module is used for randomly selecting target objects based on the sample template, and randomly setting the scaling and the rotation angle of each target object;
the training data acquisition sub-module is used for generating training data comprising a plurality of sample pictures by adding a von Neumann topological structure to a particle swarm algorithm, and for recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture.
Preferably, the feature map acquisition module includes:
the first training module is used for training a suggestion frame network according to the feature data and predicting the axis-aligned suggestion frame and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN network and an angle classification network;
the judging and processing module is used for obtaining the overlap ratio between each axis-aligned suggestion frame and the actual suggestion frame, obtaining the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and the rotation angle value of the target object in the actual suggestion frame, and discarding the axis-aligned suggestion frames whose overlap ratio is below the preset overlap threshold, to obtain the retained target axis-aligned suggestion frames;
and the feature map acquisition sub-module is used for mapping the target axis-aligned suggestion frame to its corresponding position on the feature map, and then performing a rotation transformation on it at that position based on the rotation angle value of the target object in the frame, to obtain the target feature map.
Preferably, predicting the rotation angle value of the target object includes:
classifying the angles, wherein the standard angle and several degrees to its left and right receive positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of its minimum rectangular envelope frame.
According to another aspect of the present application, there is provided an identification device comprising:
the recognition result acquisition module is used for inputting the picture to be recognized into the template matching network trained by the template matching network training device to perform recognition processing to obtain the category information and the pose information of each target object contained in the picture to be recognized.
In general, compared with the prior art, the above technical solutions conceived by the present application achieve the following beneficial effects:
1. Through angle prediction, feature mapping, feature map rotation and the like, the application can detect, against a complex background, information such as the center position of each kind of target object and its scaling ratio and rotation angle relative to the template.
2. The RPN network is extended with an angle prediction function, and angle prediction is treated as classification over 360 categories. The standard angle and the 5 degrees to either side of it receive positive label values, while all remaining angle labels are set to 0. Positive labels are assigned as follows: the label value of the "standard angle" is 1, decreasing by 0.2 for each degree of deviation. In addition, a symmetrical object has multiple "standard angles": for example, a rectangle has 2 standard angles, a square has 4, and for a circle every angle is standard.
3. The ROIAffine method is proposed. The conventional ROIPooling method maps the suggestion frame onto the featureMap and crops out a new featureMap region (hereinafter new_featureMap), and then predicts the length, width and category of the object from new_featureMap. Since the target object in new_featureMap has not been angle-corrected, the features used to compute length and width are degraded by the diverse angles, so predictions of the target object's length and width are usually inaccurate. The ROIAffine proposed by the application performs cropping and angle correction simultaneously during feature mapping, so after the feature map is cropped the length and width of the target object can be computed more accurately.
4. A method of generating training data based on an improved particle swarm algorithm is presented. The user only needs to provide all the target object images and background images and outline the outer contour of each object; sample pictures are then generated automatically.
Drawings
FIG. 1 is a schematic flow chart of a training method of a template matching network according to an embodiment of the present application;
FIG. 2 is a result diagram of the suggestion frame network provided by an embodiment of the present application;
FIG. 3 is a result diagram after rotation of the suggestion frame region in a picture according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. In addition, the technical features of the embodiments of the present application described below may be combined with each other as long as they do not collide with each other.
In order to accurately locate a target object and the pixel area it occupies in a complex environment, the application provides a template matching network training method, a recognition method and a device, which can detect, against a complex background, information such as the center position of each kind of target object and its scaling ratio and rotation angle relative to the template.
Fig. 1 is a schematic flow chart of a training method for a template matching network according to an embodiment of the present application, including the following steps:
s1: obtaining a sample template, wherein the sample template at least comprises a target object graph and a background graph, and the sample template comprises a target outline of a target object;
in the step of making the template, a user only needs to provide all the object images and the background images of the target, outline the outer outline of the target, mark the template type and determine the angle orientation.
S2: processing the sample template obtained in step S1 with an improved particle swarm algorithm to generate sample-picture training data, and recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture;
in an embodiment of the application, training data is generated based on the improved particle swarm. Randomly selecting a target object, randomly setting the scaling and the rotation angle of the target object. In order to make the arrangement of the objects in each piece of picture data as compact as possible, in the embodiment of the present application, the object positions are arranged using an improved particle swarm algorithm. And automatically recording the center point, the angle, the scaling ratio and the category information of each target object when generating the picture. To balance the speed and global search capability at optimization iterations, von neumann topology is added to the particle swarm algorithm. In order to improve the convergence rate of the particle swarm algorithm, principal component analysis is performed on the generated picture in the first half period, and the search space in iteration is reduced.
S3: normalizing the training data to obtain target training data, and extracting features of the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scales of the target training data and contour feature data of each target object;
in the embodiment of the present application, the convolutional neural network may be vgg, acceptable v3, resnet, etc., and the embodiment of the present application is not limited to uniqueness.
S4: training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a position corresponding to the feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
as an alternative embodiment, step S4 may be implemented in the following manner:
S4.1: training a suggestion frame network according to the feature data, and predicting the axis-aligned suggestion frame, the rotation angle value and the score value of each target object, wherein the suggestion frame network comprises an RPN network and an angle classification network; a result diagram of the suggestion frame network provided by the embodiment of the application is shown in FIG. 2;
as an optional implementation manner, the angle prediction adopts a classification method to divide the angles into a plurality of categories, wherein a plurality of degrees of the standard angles are positive values, the rest angle label values are all set to 0, and the standard angles are the directions of the centroid of the target object along the long sides of the minimum rectangular envelope frame.
In the embodiment of the application, how many degrees around the standard angle receive positive values can be determined according to actual needs; the embodiment of the application does not limit it to a unique choice. Preferably, the standard angle and roughly 5 degrees to either side of it receive positive values.
Positive labels are assigned as follows: the label value of the "standard angle" is 1, decreasing by 0.2 for each degree of deviation. In addition, a symmetrical object has multiple "standard angles": for example, a square has 4 standard angles, and for a circle every angle is standard. A sketch of this label assignment follows.
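A minimal Python sketch of the 360-way label assignment described above (label 1.0 at each standard angle, decreasing by 0.2 per degree of deviation within 5 degrees, 0 elsewhere); the wrap-around at the 0/360 degree boundary and the use of max where neighbourhoods of two standard angles overlap are assumptions of this sketch.

    import numpy as np

    def angle_labels(standard_angles, n_bins=360, width=5, step=0.2):
        labels = np.zeros(n_bins)
        for std in standard_angles:
            for d in range(-width, width + 1):
                idx = (std + d) % n_bins            # wrap around the 0/360 degree boundary
                labels[idx] = max(labels[idx], 1.0 - step * abs(d))
        return labels

    asym = angle_labels([90])                       # asymmetric object: one standard angle
    rect = angle_labels([90, 270])                  # rectangle: 2 standard angles
    square = angle_labels([0, 90, 180, 270])        # square: 4 standard angles
    print(asym[90], asym[93], asym[95])             # 1.0, ~0.4, 0.0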
S4.2: obtaining the overlap ratio between each axis-aligned suggestion frame and the actual suggestion frame, and the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and that in the actual suggestion frame; if the overlap ratio is smaller than a preset overlap threshold and the difference is smaller than a preset difference threshold, the axis-aligned suggestion frames whose overlap ratio is below the preset overlap threshold are discarded, yielding the retained target axis-aligned suggestion frames;
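A minimal sketch of one plausible reading of this retention rule, in which a proposal is kept when its overlap (IoU) with the ground truth reaches a threshold and its predicted angle stays close to the ground-truth angle; the (x1, y1, x2, y2) box format and the threshold values are illustrative assumptions.

    def iou(a, b):
        # Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def filter_proposals(proposals, gt_box, gt_angle, iou_thresh=0.5, angle_thresh=15.0):
        # Keep proposals that overlap the ground truth enough and whose predicted
        # rotation angle is close enough to the ground-truth angle.
        kept = []
        for box, angle in proposals:
            diff = abs(angle - gt_angle)
            angle_diff = min(diff, 360.0 - diff)    # shortest angular distance
            if iou(box, gt_box) >= iou_thresh and angle_diff <= angle_thresh:
                kept.append((box, angle))
        return kept

    # Usage: the first proposal overlaps well and is kept, the second does not.
    kept = filter_proposals([((10, 10, 50, 40), 28.0), ((80, 80, 120, 110), 5.0)],
                            gt_box=(12, 8, 52, 42), gt_angle=30.0)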
S4.3: mapping the target axis-aligned suggestion frame to its corresponding position on the feature map, and performing a rotation transformation on it at that position based on the rotation angle value of the target object in the frame, to obtain the target feature map; FIG. 3 shows the result after the suggestion frame region in the picture has been rotated. A sketch of such a rotated crop is given below.
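A minimal sketch of a ROIAffine-style rotated crop using PyTorch's affine_grid and grid_sample, which realise cropping and angle correction in a single bilinear sampling step; the (cx, cy, w, h, angle) box format in feature-map pixels and the 7x7 output size are assumptions of this sketch, not the patent's concrete interface.

    import math
    import torch
    import torch.nn.functional as F

    def roi_affine(feature_map, box, out_size=(7, 7)):
        # Crop and angle-correct one rotated proposal from a (C, H, W) feature map.
        cx, cy, w, h, angle_deg = box
        _, H, W = feature_map.shape
        a = math.radians(angle_deg)
        # Box center expressed in affine_grid's normalized [-1, 1] coordinates.
        tx, ty = 2.0 * cx / W - 1.0, 2.0 * cy / H - 1.0
        # Maps output grid coordinates to input coordinates: rotate, scale, translate.
        theta = torch.tensor([
            [(w / W) * math.cos(a), -(h / W) * math.sin(a), tx],
            [(w / H) * math.sin(a),  (h / H) * math.cos(a), ty],
        ], dtype=feature_map.dtype).unsqueeze(0)
        grid = F.affine_grid(theta, size=(1, feature_map.shape[0], *out_size),
                             align_corners=False)
        # One bilinear sampling pass performs cropping and angle correction together.
        return F.grid_sample(feature_map.unsqueeze(0), grid,
                             align_corners=False).squeeze(0)

    # Usage: a 7x7 angle-corrected patch for a proposal rotated 30 degrees.
    fmap = torch.randn(256, 50, 50)
    patch = roi_affine(fmap, box=(25.0, 20.0, 16.0, 8.0, 30.0))
    print(patch.shape)                              # torch.Size([256, 7, 7])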
S5: and training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
In the embodiment of the application, the center point position and the angle value obtained in step S4.2 can be fine-tuned by the template matching network.
In another embodiment of the present application, there is also provided an identification method including:
and inputting the picture to be identified into a trained template matching network for identification processing to obtain the category information and the pose information of each target object contained in the picture to be identified.
The pose information of a target object includes its center position, scaling ratio, rotation angle and the like. A minimal usage sketch follows.
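For orientation only, a usage sketch of this recognition step; template_matching_net is a hypothetical stand-in for the trained network, and the output layout (category, center, scaling ratio, rotation angle) simply mirrors the pose information listed above.

    import torch

    def template_matching_net(image):
        # Hypothetical stub standing in for the trained template matching network;
        # a real deployment would load trained weights and run a forward pass.
        return [(1, (125.0, 240.0), 0.8, 30.0)]

    image = torch.rand(3, 480, 640)                 # picture to be identified
    for category, center, scale, angle in template_matching_net(image):
        print(f"category={category} center={center} scale={scale:.2f} angle={angle:.1f}")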
In another embodiment of the present application, there is also provided a template matching network training apparatus, including:
the template acquisition module is used for acquiring a sample template, wherein the sample template comprises at least a target object image and a background image, and the sample template contains the target contour of the target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a suggestion frame network according to the feature data, reserving target suggestion frames meeting preset requirements, mapping the target suggestion frames to corresponding positions of the feature map, and performing rotation operation on the target suggestion frames to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
In the embodiment of the present application, the specific implementation of each module may refer to the description of the method embodiments and will not be repeated here.
In another embodiment of the present application, there is also provided an identification device including:
the recognition result acquisition module is used for inputting the picture to be recognized into the trained template matching network for recognition processing to obtain the category information and the pose information of each target object contained in the picture to be recognized.
The pose information of the target object comprises a center position, a scaling ratio, a rotation angle and the like.
It should be noted that each step/component described in the present application may be split into more steps/components, or two or more steps/components or part of operations of the steps/components may be combined into new steps/components, according to the implementation needs, to achieve the object of the present application.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the application and is not intended to limit the application, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (10)

1. A template matching network training method, comprising:
(1) Obtaining a sample template, wherein the sample template comprises at least a target object image and a background image, and the sample template contains the target contour of the target object;
(2) Generating training data comprising a plurality of sample pictures based on the sample template, and recording center point coordinates of each target object contained in each sample picture, rotation angles of each target object, scaling ratios of each target object and category information of each target object;
(3) Performing normalization processing on the training data to obtain target training data, and performing feature extraction on the target training data by using a convolutional neural network to obtain feature data, wherein the feature data comprises gray scale of the target training data and contour feature data of each target object;
(4) Training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a position corresponding to a feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
(5) And training a neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
2. The method of claim 1, wherein step (2) comprises:
(2.1) randomly selecting target objects, randomly setting the scaling and the rotation angle of each target object based on the sample template;
and (2.2) generating training data comprising a plurality of sample pictures by adding a von neumann topological structure in a particle swarm algorithm, and recording the center point coordinates of each target object, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object in each sample picture.
3. The method according to claim 1 or 2, wherein step (4) comprises:
(4.1) training a suggestion frame network according to the feature data, and predicting the axis-aligned suggestion frame of each target object and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN network and an angle classification network;
(4.2) obtaining the overlap ratio between each axis-aligned suggestion frame and the actual suggestion frame, and obtaining the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and the rotation angle value of the target object in the actual suggestion frame; and if the overlap ratio is smaller than a preset overlap threshold and the difference is smaller than a preset difference threshold, discarding the axis-aligned suggestion frames whose overlap ratio is below the preset overlap threshold, to obtain the retained target axis-aligned suggestion frames;
and (4.3) mapping the target axis-aligned suggestion frame to its corresponding position on the feature map, and performing a rotation transformation on it at that position based on the rotation angle value of the target object in the frame, to obtain the target feature map.
4. A method according to claim 3, wherein in step (4.1) the rotation angle value of the target object is predicted, comprising:
classifying the angles, wherein the standard angle and several degrees to its left and right receive positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of its minimum rectangular envelope frame.
5. A method of identification, comprising:
inputting a picture to be identified into the template matching network trained by the template matching network training method according to any one of claims 1 to 4 for identification processing, so as to obtain category information and pose information of each target object contained in the picture to be identified.
6. A template matching network training device, comprising:
the template acquisition module is used for acquiring a sample template, wherein the sample template comprises at least a target object image and a background image, and the sample template contains the target contour of the target object;
the training data acquisition module is used for generating training data comprising a plurality of sample pictures based on the sample template and recording the center point coordinates of each target object contained in each sample picture, the rotation angle of each target object, the scaling ratio of each target object and the category information of each target object;
the feature extraction module is used for carrying out normalization processing on the training data to obtain target training data, and carrying out feature extraction on the target training data by utilizing a convolutional neural network to obtain feature data, wherein the feature data comprises the gray level of the target training data and the contour feature data of each target object;
the feature map acquisition module is used for training a suggestion frame network according to the feature data, reserving a target suggestion frame meeting preset requirements, mapping the target suggestion frame to a corresponding position of a feature map, and performing rotation operation on the target suggestion frame to obtain a target feature map;
and the training module is used for training the neural network through the target feature map to obtain a template matching network, wherein the output of the template matching network is at least the category of the target object and the scaling ratio of the target object.
7. The apparatus of claim 6, wherein the training data acquisition module comprises:
the preprocessing module is used for randomly selecting target objects based on the sample template, and randomly setting the scaling and the rotation angle of each target object;
the training data acquisition sub-module is used for generating training data comprising a plurality of sample pictures by adding a von Neumann topological structure to a particle swarm algorithm, and for recording the center point coordinates, rotation angle, scaling ratio and category information of each target object contained in each sample picture.
8. The apparatus according to claim 6 or 7, wherein the feature map acquisition module includes:
the first training module is used for training a suggestion frame network according to the feature data and predicting the axis-aligned suggestion frame and the rotation angle value of each target object, wherein the suggestion frame network comprises an RPN network and an angle classification network;
the judging and processing module is used for obtaining the overlap ratio between each axis-aligned suggestion frame and the actual suggestion frame, obtaining the difference between the rotation angle value of the target object in the axis-aligned suggestion frame and the rotation angle value of the target object in the actual suggestion frame, and discarding the axis-aligned suggestion frames whose overlap ratio is below the preset overlap threshold, to obtain the retained target axis-aligned suggestion frames;
and the feature map acquisition sub-module is used for mapping the target axis-aligned suggestion frame to its corresponding position on the feature map, and then performing a rotation transformation on it at that position based on the rotation angle value of the target object in the frame, to obtain the target feature map.
9. The apparatus of claim 8, wherein predicting the rotation angle value of the target object comprises:
classifying the angles, wherein the standard angle and several degrees to its left and right receive positive label values and all remaining angle labels are set to 0, the standard angle being the direction from the centroid of the target object along the long side of its minimum rectangular envelope frame.
10. An identification device, comprising:
the recognition result obtaining module is configured to input a picture to be recognized into the template matching network trained by the template matching network training device according to any one of claims 6 to 9, and perform recognition processing on the template matching network to obtain category information and pose information of each target object included in the picture to be recognized.
CN201911248538.3A 2019-12-09 2019-12-09 Template matching network training method, recognition method and device Active CN111104942B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911248538.3A CN111104942B (en) 2019-12-09 2019-12-09 Template matching network training method, recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911248538.3A CN111104942B (en) 2019-12-09 2019-12-09 Template matching network training method, recognition method and device

Publications (2)

Publication Number Publication Date
CN111104942A CN111104942A (en) 2020-05-05
CN111104942B (en) 2023-11-03

Family

ID=70422155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911248538.3A Active CN111104942B (en) 2019-12-09 2019-12-09 Template matching network training method, recognition method and device

Country Status (1)

Country Link
CN (1) CN111104942B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598094B (en) * 2020-05-27 2023-08-18 深圳市铁越电气有限公司 Angle regression instrument reading identification method, equipment and system based on deep learning
CN111950567B (en) * 2020-08-18 2024-04-09 创新奇智(成都)科技有限公司 Extractor training method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250812A (en) * 2016-07-15 2016-12-21 汤平 A kind of model recognizing method based on quick R CNN deep neural network
CN107194318A (en) * 2017-04-24 2017-09-22 北京航空航天大学 The scene recognition method of target detection auxiliary
CN109583445A (en) * 2018-11-26 2019-04-05 平安科技(深圳)有限公司 Character image correction processing method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9031317B2 (en) * 2012-09-18 2015-05-12 Seiko Epson Corporation Method and apparatus for improved training of object detecting system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sun Xian; Hu Yanfeng; Wang Hongqi. Multi-class object recognition method based on an active boundary primitive model. Journal of the Graduate University of Chinese Academy of Sciences, 2009, (04), full text. *

Also Published As

Publication number Publication date
CN111104942A (en) 2020-05-05

Similar Documents

Publication Publication Date Title
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN110232311B (en) Method and device for segmenting hand image and computer equipment
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
CN107833213B (en) Weak supervision object detection method based on false-true value self-adaptive method
TWI395145B (en) Hand gesture recognition system and method
CN111401410B (en) Traffic sign detection method based on improved cascade neural network
WO2019033574A1 (en) Electronic device, dynamic video face recognition method and system, and storage medium
CN105512683A (en) Target positioning method and device based on convolution neural network
CN110910350B (en) Nut loosening detection method for wind power tower cylinder
CN111862119A (en) Semantic information extraction method based on Mask-RCNN
CN112825192B (en) Object identification system and method based on machine learning
CN105069774B (en) The Target Segmentation method of optimization is cut based on multi-instance learning and figure
CN111104942B (en) Template matching network training method, recognition method and device
CN111401449B (en) Image matching method based on machine vision
US20180225799A1 (en) System and method for scoring color candidate poses against a color image in a vision system
CN109767431A (en) Accessory appearance defect inspection method, device, equipment and readable storage medium storing program for executing
CN113989604A (en) Tire DOT information identification method based on end-to-end deep learning
CN112861870A (en) Pointer instrument image correction method, system and storage medium
US9053383B2 (en) Recognizing apparatus and method, program, and recording medium
CN114936997A (en) Detection method, detection device, electronic equipment and readable storage medium
CN113780040A (en) Lip key point positioning method and device, storage medium and electronic equipment
CN111738264A (en) Intelligent acquisition method for data of display panel of machine room equipment
CN110689556A (en) Tracking method and device and intelligent equipment
CN110378337A (en) Metal cutting tool drawing identification information vision input method and system
CN109753981B (en) Image recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231207

Address after: Room 1301, Block C, Building 5, Huaqiang Creative Park, Biyan Community, Guangming Street, Guangming District, Shenzhen, Guangdong Province, China, 518000

Patentee after: SHENZHEN ROBOT VISION TECHNOLOGY Co.,Ltd.

Address before: Room 703, 7th Floor, Zhongdian Difu Building, Zhenhua Road, Fuqiang Community, Huaqiang North Street, Futian District, Shenzhen, Guangdong Province, China, 518031

Patentee before: SHANGZHI TECHNOLOGY (SHENZHEN) Co.,Ltd.