CN112733067B - Data set selection method for robot target detection algorithm - Google Patents

Data set selection method for robot target detection algorithm

Info

Publication number
CN112733067B
Authority
CN
China
Prior art keywords
data set
similarity
target detection
row vector
metadata
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011542396.4A
Other languages
Chinese (zh)
Other versions
CN112733067A (en)
Inventor
沈文婷
陆林东
郑军奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Robot Industrial Technology Research Institute Co Ltd
Original Assignee
Shanghai Robot Industrial Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-22
Filing date
2020-12-22
Publication date
2023-05-09
Application filed by Shanghai Robot Industrial Technology Research Institute Co Ltd
Priority to CN202011542396.4A
Publication of CN112733067A
Application granted
Publication of CN112733067B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Manipulator (AREA)
  • Image Analysis (AREA)

Abstract

To address the problem of how to select data sets for a robot target detection algorithm model across varied application scenarios, the invention provides a data set selection method for robot target detection algorithms. Using a machine learning method, the invention automatically selects data sets for training and testing the model according to the differing requirements of the algorithm model; it can effectively replace selection based on manual experience while improving the robustness and generalization performance of the algorithm model. In the method, the metadata features of a data set that affect the model's detection performance are extracted according to the test conclusions, and the row vectors are re-encoded, which effectively reduces the similarity value in the next matching round. Through continuous iterative update learning, the row vector encoding of the data sets in step one is improved, so that a more suitable data set can be provided for the next new robot target recognition algorithm model, improving its robustness and generalization performance.

Description

Data set selection method for robot target detection algorithm
Technical Field
The invention relates to the field of machine learning, in particular to a data set selection method for a robot target detection algorithm.
Background
With the rapid development of deep learning in artificial intelligence, computer-vision-based object detection has been applied in a wide range of scenarios. In robotics in particular, target detection technologies for industrial robots, service robots, unmanned aerial vehicles, security monitoring, and similar scenarios have steadily developed and matured. Different application scenarios typically call for corresponding data sets for training and testing the model. How to select a suitable data set so that the algorithm model achieves its best performance has therefore become a research focus for algorithm model developers in recent years.
Existing research on data set recommendation for robot target detection algorithm models generally relies on manual experience. Such selection is subjective and often requires extensive model parameter tuning.
Disclosure of Invention
The purpose of the invention is to provide a machine-learning-based data set selection method for a robot target detection algorithm model.
To achieve the above purpose, the invention provides a data set selection method for a robot target detection algorithm, characterized by comprising the following steps:
Step 1: perform row vector encoding on the metadata features of each type of existing data set, comprising:
Step 101: each metadata feature contains a different number of feature elements; letting the total number of feature elements across all metadata features be n, construct a 1×n matrix in which each element corresponds to one feature element of one metadata feature;
Step 102: set all elements of the 1×n matrix obtained in step 101 to 0, yielding for each type of data set a row vector containing n zeros: $\vec{A} = [0, 0, \ldots, 0]$;
Step 103: obtain the row vector $\vec{A}$ corresponding to each type of data set, as follows:
if the current data set contains a given feature element of a given metadata feature, the value of the corresponding element in the row vector $\vec{A}$ obtained in step 102 is changed from 0 to 1, so that the current data set corresponds to a 1×n matrix in which a number of elements have the value 1; this 1×n matrix is defined as the row vector $\vec{A}$;
Step 2: determine the target detection object required by the robot target detection algorithm model, and perform row vector encoding on the metadata features of the target detection object using the same method as in step 1, thereby establishing the row vector $\vec{B}$ corresponding to the metadata features of the target detection object;
Step 3: based on the row vector $\vec{A}$ obtained in step 1 and the row vector $\vec{B}$ obtained in step 2, calculate the similarity (similarity one) between the metadata features of the target detection object in the robot field and the metadata features of each existing data set; the higher the similarity, the better the current data set matches the target detection object, and the lower the similarity, the worse the match;
Step 4: take the data set with the highest similarity in step 3 as the reference data set recommended for the target detection object in the robot field, then calculate the similarity between each remaining data set and the reference data set, using the row vector corresponding to each data set to obtain the value of similarity two;
Step 5: given a first similarity threshold and a second similarity threshold, take all data sets whose value of similarity one is higher than the first similarity threshold and whose value of similarity two is higher than the second similarity threshold, together with the reference data set determined in step 4, as the recommended data sets for the target detection object.
Preferably, the distance between the row vector $\vec{A}$ of the current data set and the row vector $\vec{B}$ corresponding to the target detection object is calculated using the cosine similarity formula and taken as the value of similarity one between the target detection object and the current data set; the larger the cosine similarity value, the better the target detection object matches the current data set.
Preferably, the cosine similarity is calculated as follows:
$$\cos(A, B) = \frac{\vec{A} \cdot \vec{B}}{|\vec{A}|\,|\vec{B}|}$$
where $\cos(A, B)$ denotes the cosine similarity of row vector $\vec{A}$ and row vector $\vec{B}$, $|\vec{A}|$ denotes the modulus of row vector $\vec{A}$, and $|\vec{B}|$ denotes the modulus of row vector $\vec{B}$.
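As a worked illustration of cosine similarity on 0/1 row vectors (the two vectors are hypothetical values chosen for this example, not taken from the patent), take n = 4 with $\vec{A} = [1, 0, 1, 1]$ for a data set and $\vec{B} = [1, 1, 0, 1]$ for the target detection object:

$$\vec{A} \cdot \vec{B} = 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 0 + 1 \cdot 1 = 2, \qquad |\vec{A}| = |\vec{B}| = \sqrt{3}, \qquad \cos(A, B) = \frac{2}{\sqrt{3} \cdot \sqrt{3}} = \frac{2}{3} \approx 0.667$$

Because the encoded vectors contain only 0s and 1s, the dot product counts the feature elements the two vectors share, and the resulting value always lies between 0 and 1.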
Preferably, in step 3, the calculated values of similarity one for all data sets are sorted in descending order, which ranks the corresponding data sets from best to worst match.
Preferably, in step 4, all values of similarity two are sorted in descending order.
Preferably, after step 5, the method further comprises:
Step 6: after training and testing with the recommended data sets obtained in step 5, extract the metadata features of the data sets that affect the model's detection performance according to the test conclusions, and re-encode the row vectors, which effectively reduces the similarity value in the next matching round; through continuous iterative update learning, the row vector encoding of the data sets in step 1 is improved, a more suitable data set is provided for the next new robot target recognition algorithm model, and the robustness and generalization performance of the algorithm model are improved.
To address the problem of how to select data sets for a robot target detection algorithm model across varied application scenarios, the invention provides a method for recommending suitable data sets to developers of robot target detection algorithm models. Using a machine learning method, the invention automatically selects data sets for training and testing the model according to the differing requirements of the algorithm model; it can effectively replace selection based on manual experience while improving the robustness and generalization performance of the algorithm model. The method analyzes the metadata features of a data set that affect the model's detection performance and re-encodes the row vectors, which effectively reduces the similarity value in the next matching round. Through continuous iterative updating, a more suitable data set can be provided for the next robot target recognition algorithm model, improving its robustness and generalization performance.
Drawings
FIG. 1 is a schematic overall flowchart of the data set selection method for a robot target detection algorithm model provided by the invention;
FIG. 2 is a schematic diagram of the row vector encoding of metadata features provided by the invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
As shown in fig. 1, the data set selection method for the robot target recognition algorithm provided by the invention comprises the following steps:
Step one: perform row vector encoding on the metadata features of each type of existing data set.
The metadata features include the application scene, target detection object category, target detection object size, illumination brightness, and so on. The target detection object specified by the algorithm model developer is matched for similarity by traversing each metadata feature, as follows:
Each metadata feature contains a different number of feature elements. For example, the robot application scene is one metadata feature, and it contains multiple feature elements corresponding to different scenes such as home, market, and park. Let the total number of feature elements across all metadata features be n, and construct a 1×n matrix. At initialization, all elements of the 1×n matrix are set to 0, yielding a row vector containing n zeros: $\vec{A} = [0, 0, \ldots, 0]$.
Among the n feature elements of the 1×n matrix, the first metadata feature is the robot application scene: the 1st to $n_1$-th feature elements belong to the first metadata feature and correspond to different scenes such as home, market, and park. The $(n_1+1)$-th to $n_2$-th feature elements belong to the second metadata feature, the target detection object category. The $(n_2+1)$-th to $n$-th feature elements belong to the third metadata feature, the target detection object size. Encoding is carried out in the initialized 1×n matrix according to this correspondence. During encoding, if the current data set contains a given feature element of a given metadata feature, the value of the corresponding element in the 1×n matrix is changed from 0 to 1, so that each data set corresponds to a 1×n matrix in which a number of elements have the value 1. This 1×n matrix is defined as the row vector $\vec{A}$, and each data set can thus be represented as a different row vector $\vec{A}$.
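The encoding in step one can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `SCHEMA` dictionary, the feature names, and the element names other than the home/market/park scenes are hypothetical, and the helpers `build_index` and `encode` are introduced only for this example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical metadata schema: metadata feature -> ordered feature elements.
# Only the scene examples (home, market, park) come from the text; the rest
# are placeholders chosen for illustration.
SCHEMA: Dict[str, List[str]] = {
    "scene": ["home", "market", "park"],
    "object_class": ["person", "cup", "chair"],
    "object_size": ["small", "medium", "large"],
}


def build_index(schema: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Assign each (feature, element) pair one position in the 1 x n row vector."""
    index: Dict[Tuple[str, str], int] = {}
    for feature, elements in schema.items():
        for element in elements:
            index[(feature, element)] = len(index)
    return index


def encode(metadata: Dict[str, Iterable[str]],
           schema: Dict[str, List[str]] = SCHEMA) -> List[int]:
    """Encode a data set (or a target detection object) as a 0/1 row vector."""
    index = build_index(schema)
    vector = [0] * len(index)  # initialization: all n elements set to 0
    for feature, elements in metadata.items():
        for element in elements:
            vector[index[(feature, element)]] = 1  # element present: 0 -> 1
    return vector


# A hypothetical data set covering home and park scenes, people, and
# small-to-medium objects.
dataset_vector = encode({
    "scene": ["home", "park"],
    "object_class": ["person"],
    "object_size": ["small", "medium"],
})
```

Here n is 9, and the same `encode` function can be applied to the target detection object in step two, since the text uses one encoding method for both.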
Step two: determine the target detection object required by the robot target detection algorithm model, and perform row vector encoding on its metadata features using the same method as in step one. As in step one, if the target detection object contains a given feature element of a given metadata feature, the value of the corresponding element in the 1×n matrix is changed from 0 to 1, thereby establishing the row vector $\vec{B}$ corresponding to the metadata features of the target detection object.
Step three: calculate the similarity between the metadata features of the target detection object in the robot field and the metadata features of each existing data set. The higher the similarity, the better the current data set matches the target detection object; the lower the similarity, the worse the match.
The distance between the row vector $\vec{A}$ of the current data set and the row vector $\vec{B}$ corresponding to the target detection object is calculated using the cosine similarity formula, and the calculated distance is taken as the value of similarity one between the target detection object and the current data set. The larger the cosine similarity value, the better the target detection object matches the current data set. The cosine similarity is calculated as follows:
$$\cos(A, B) = \frac{\vec{A} \cdot \vec{B}}{|\vec{A}|\,|\vec{B}|}$$
where $\cos(A, B)$ denotes the cosine similarity of row vector $\vec{A}$ and row vector $\vec{B}$, $|\vec{A}|$ denotes the modulus of row vector $\vec{A}$, and $|\vec{B}|$ denotes the modulus of row vector $\vec{B}$.
The calculated values of similarity one for all data sets are sorted in descending order, which ranks the corresponding data sets accordingly.
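Step three can be sketched as a cosine similarity function followed by a descending sort. This is an illustrative sketch: the data set names and vectors are hypothetical, and returning 0.0 for an all-zero vector is an assumption the text does not address.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """cos(A, B) = (A . B) / (|A| |B|) for two row vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a vector with no features matches nothing
    return dot / (norm_a * norm_b)


def rank_datasets(dataset_vectors: Dict[str, Sequence[int]],
                  target_vector: Sequence[int]) -> List[Tuple[str, float]]:
    """Score every data set against the target and sort from high to low."""
    scored = [(name, cosine_similarity(vec, target_vector))
              for name, vec in dataset_vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical encoded data sets and target detection object (n = 5).
datasets = {
    "ds_park": [0, 0, 1, 0, 1],
    "ds_home": [1, 0, 0, 1, 0],
    "ds_market": [0, 1, 0, 1, 1],
}
target = [1, 0, 0, 1, 1]

ranking = rank_datasets(datasets, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The first entry of `ranking` is the data set that step four then uses as the reference data set.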
Step four: take the data set ranked first in step three as the reference data set recommended for the target detection object in the robot field, and calculate the similarity between each remaining data set and the reference data set. As in steps one, two, and three, the value of similarity two is calculated from the row vector corresponding to each data set, and all values of similarity two are sorted in descending order.
Step five: given a first similarity threshold and a second similarity threshold, take all data sets whose value of similarity one is higher than the first similarity threshold and whose value of similarity two is higher than the second similarity threshold, together with the reference data set determined in step four, as the recommended data sets for the target detection object.
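Steps four and five can be sketched together as follows. This is a hedged illustration: the function name `recommend`, the threshold values, and the sample vectors are hypothetical, and the text does not say how ties for the highest similarity should be broken (here `max` simply keeps the first one encountered).

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """cos(A, B) = (A . B) / (|A| |B|); 0.0 for an all-zero vector (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(dataset_vectors: Dict[str, Sequence[int]],
              target_vector: Sequence[int],
              threshold_one: float,
              threshold_two: float) -> Tuple[str, List[str]]:
    """Return the reference data set and the full list of recommended data sets."""
    # Similarity one: every data set against the target detection object.
    sim_one = {name: cosine_similarity(vec, target_vector)
               for name, vec in dataset_vectors.items()}
    # Step four: the best match becomes the reference data set.
    reference = max(sim_one, key=sim_one.get)
    recommended = [reference]
    for name, vec in dataset_vectors.items():
        if name == reference:
            continue
        # Similarity two: remaining data set against the reference data set.
        sim_two = cosine_similarity(vec, dataset_vectors[reference])
        # Step five: both values must exceed their thresholds.
        if sim_one[name] > threshold_one and sim_two > threshold_two:
            recommended.append(name)
    return reference, recommended


# Hypothetical encoded data sets and target detection object (n = 4).
datasets = {
    "ds_a": [1, 1, 0, 1],
    "ds_b": [1, 1, 0, 0],
    "ds_c": [0, 0, 1, 1],
}
target = [1, 1, 0, 1]

reference, recommended = recommend(datasets, target,
                                   threshold_one=0.6, threshold_two=0.6)
print(reference, recommended)
```

Requiring both thresholds keeps data sets that resemble the target and also resemble the best match, which filters out data sets that score well against the target for unrelated feature elements.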
Step six: after training and testing with the recommended data sets obtained in step five, extract the metadata features of the data sets that affect the model's detection performance according to the test conclusions, and re-encode the row vectors, which effectively reduces the similarity value in the next matching round. Through continuous iterative update learning, the row vector encoding of the data sets in step one is improved, so that a more suitable data set can be provided for the next new robot target recognition algorithm model, improving its robustness and generalization performance.
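The text does not specify how the row vector is re-encoded. One possible reading, sketched below purely as an assumption, is that feature elements which testing showed to hurt detection are cleared (set back to 0) in the data set's row vector, so that the data set scores lower against similar targets in the next round. The function `recode` and the notion of "harmful positions" are hypothetical names introduced for this sketch.

```python
import math
from typing import List, Sequence, Set


def cosine_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """cos(A, B) = (A . B) / (|A| |B|); 0.0 for an all-zero vector (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recode(vector: Sequence[int], harmful_positions: Set[int]) -> List[int]:
    """Return a copy of the row vector with the flagged positions cleared.

    Assumption: `harmful_positions` holds the indices of feature elements
    that the test conclusions identified as degrading detection performance.
    """
    return [0 if i in harmful_positions else v for i, v in enumerate(vector)]


# Hypothetical data set and target vectors (n = 5).
dataset = [1, 1, 0, 1, 1]
target = [1, 1, 0, 1, 0]

before = cosine_similarity(dataset, target)
recoded = recode(dataset, harmful_positions={1})
after = cosine_similarity(recoded, target)
print(f"before={before:.3f} after={after:.3f}")
```

In this example the cleared element is one the target also has, so the similarity drops, which matches the stated effect of reducing the similarity value in the next matching round.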

Claims (5)

1. A data set selection method for a robot target detection algorithm, characterized by comprising the following steps:
Step 1: perform row vector encoding on the metadata features of each type of existing data set, comprising:
Step 101: each metadata feature contains a different number of feature elements; letting the total number of feature elements across all metadata features be n, construct a 1×n matrix in which each element corresponds to one feature element of one metadata feature;
Step 102: set all elements of the 1×n matrix obtained in step 101 to 0, yielding for each type of data set a row vector containing n zeros: $\vec{A} = [0, 0, \ldots, 0]$;
Step 103: obtain the row vector $\vec{A}$ corresponding to each type of data set, as follows:
if the current data set contains a given feature element of a given metadata feature, the value of the corresponding element in the row vector $\vec{A}$ obtained in step 102 is changed from 0 to 1, so that the current data set corresponds to a 1×n matrix in which a number of elements have the value 1; this 1×n matrix is defined as the row vector $\vec{A}$;
Step 2: determine the target detection object required by the robot target detection algorithm model, and perform row vector encoding on the metadata features of the target detection object using the same method as in step 1, thereby establishing the row vector $\vec{B}$ corresponding to the metadata features of the target detection object;
Step 3: based on the row vector $\vec{A}$ obtained in step 1 and the row vector $\vec{B}$ obtained in step 2, calculate the similarity (similarity one) between the metadata features of the target detection object in the robot field and the metadata features of each existing data set; the higher the similarity, the better the current data set matches the target detection object, and the lower the similarity, the worse the match;
Step 4: take the data set with the highest similarity in step 3 as the reference data set recommended for the target detection object in the robot field, then calculate the similarity between each remaining data set and the reference data set, using the row vector corresponding to each data set to obtain the value of similarity two;
Step 5: given a first similarity threshold and a second similarity threshold, take all data sets whose value of similarity one is higher than the first similarity threshold and whose value of similarity two is higher than the second similarity threshold, together with the reference data set determined in step 4, as the recommended data sets for the target detection object;
Step 6: after training and testing with the recommended data sets obtained in step 5, extract the metadata features of the data sets that affect the model's detection performance according to the test conclusions, and re-encode the row vectors, which effectively reduces the similarity value in the next matching round; through continuous iterative update learning, the row vector encoding of the data sets in step 1 is improved, a more suitable data set is provided for the next new robot target recognition algorithm model, and the robustness and generalization performance of the algorithm model are improved.
2. The data set selection method for a robot target detection algorithm according to claim 1, wherein the distance between the row vector $\vec{A}$ of the current data set and the row vector $\vec{B}$ corresponding to the target detection object is calculated using the cosine similarity formula and taken as the value of similarity one between the target detection object and the current data set; the larger the cosine similarity value, the better the target detection object matches the current data set.
3. The data set selection method for a robot target detection algorithm according to claim 2, wherein the cosine similarity is calculated as follows:
$$\cos(A, B) = \frac{\vec{A} \cdot \vec{B}}{|\vec{A}|\,|\vec{B}|}$$
where $\cos(A, B)$ denotes the cosine similarity of row vector $\vec{A}$ and row vector $\vec{B}$, $|\vec{A}|$ denotes the modulus of row vector $\vec{A}$, and $|\vec{B}|$ denotes the modulus of row vector $\vec{B}$.
4. The data set selection method for a robot target detection algorithm according to claim 1, wherein in step 3 the calculated values of similarity one for all data sets are sorted in descending order, thereby ranking all the data sets in descending order.
5. The data set selection method for a robot target detection algorithm according to claim 1, wherein in step 4 all values of similarity two are sorted in descending order.
CN202011542396.4A 2020-12-22 2020-12-22 Data set selection method for robot target detection algorithm Active CN112733067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011542396.4A CN112733067B (en) 2020-12-22 2020-12-22 Data set selection method for robot target detection algorithm

Publications (2)

Publication Number Publication Date
CN112733067A CN112733067A (en) 2021-04-30
CN112733067B true CN112733067B (en) 2023-05-09

Family

ID=75604750

Country Status (1)

Country Link
CN (1) CN112733067B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509467A (en) * 2017-05-04 2018-09-07 宁波数联软件有限公司 A kind of sea-freight quotation commending system and method based on User action log
CN109145111A (en) * 2018-07-27 2019-01-04 深圳市翼海云峰科技有限公司 A kind of multiple features text data similarity calculating method based on machine learning
CN111553193A (en) * 2020-04-01 2020-08-18 东南大学 Visual SLAM closed-loop detection method based on lightweight deep neural network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657216A (en) * 2017-09-11 2018-02-02 安徽慧视金瞳科技有限公司 1 to the 1 face feature vector comparison method based on interference characteristic vector data collection
CN107749034A (en) * 2017-11-17 2018-03-02 浙江工业大学 A kind of safe friend recommendation method in social networks
CN108491872B (en) * 2018-03-16 2020-10-30 深圳市商汤科技有限公司 Object re-recognition method and apparatus, electronic device, program, and storage medium
CN109255586B (en) * 2018-08-24 2022-03-29 安徽讯飞智能科技有限公司 Online personalized recommendation method for e-government affairs handling
CN109615466A (en) * 2018-11-27 2019-04-12 浙江工商大学 The mixed method of commending contents and collaborative filtering recommending towards mobile ordering system
CN110738245A (en) * 2019-09-29 2020-01-31 上海大学 automatic clustering algorithm selection system and method for scientific data analysis
CN110990383A (en) * 2019-10-14 2020-04-10 同济大学 Similarity calculation method based on industrial big data set
CN111460316B (en) * 2020-03-20 2022-08-26 南京邮电大学 Knowledge system-oriented personalized recommendation method and computer storage medium
CN111834011A (en) * 2020-07-10 2020-10-27 华东师范大学 Long-term care-for-the-elderly oriented collaborative interactive service recommendation method

Also Published As

Publication number Publication date
CN112733067A (en) 2021-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant