CN112184797B - Method for spatially positioning key part of kilogram group weight - Google Patents

Method for spatially positioning key part of kilogram group weight

Info

Publication number
CN112184797B
CN112184797B (application CN202011103011.4A)
Authority
CN
China
Prior art keywords
kilogram
group
weights
mask
point cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011103011.4A
Other languages
Chinese (zh)
Other versions
CN112184797A (en)
Inventor
马健
赵迪
石凌
刘桂雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGZHOU INSTITUTE OF MEASURING AND TESTING TECHNOLOGY
Original Assignee
GUANGZHOU INSTITUTE OF MEASURING AND TESTING TECHNOLOGY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGZHOU INSTITUTE OF MEASURING AND TESTING TECHNOLOGY filed Critical GUANGZHOU INSTITUTE OF MEASURING AND TESTING TECHNOLOGY
Priority to CN202011103011.4A
Publication of CN112184797A
Application granted
Publication of CN112184797B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Geometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for spatially positioning key parts of kilogram-group weights, comprising the following steps: calibrating an RGBD camera; acquiring RGB channel data and Depth channel data; determining the characteristics and key parts of the kilogram-group weights; inputting the RGB channel data into a deep learning network, where Mask R-CNN identifies and segments the key parts of interest in the picture to generate a BBox frame and a Mask; aligning the RGB channel data with the Depth channel data; cutting the point cloud data according to the BBox frame to generate a point cloud group for each instance; precisely segmenting each instance's point cloud data according to the Mask; taking the sum of the distances from each point to the cylinder axis as the objective function to be optimized and performing parameter optimization with the LM (Levenberg-Marquardt) algorithm to obtain the final cylinder fitting result for the kilogram-group weight handles; and generating a three-dimensional unit vector of the key points of the kilogram-group weight handles from the fitting result. The invention can quickly and accurately position kilogram-group weight handles.

Description

Method for spatially positioning key part of kilogram group weight
Technical Field
The invention relates to the technical field of three-dimensional positioning by computer vision, and in particular to a method for spatially positioning the key parts of kilogram-group weights.
Background
Existing spatial positioning methods are based on laser or multi-view geometry. They are single-modality sensing techniques, are usually limited to relatively simple environments or a single recognition target, and no existing algorithm can spatially position multiple instances in a complex environment. Because spatial positioning is affected by the environment and by the number of instances to be recognized and positioned, and different environments contain different numbers of instances, multi-instance spatial positioning is difficult. The present method for spatially positioning key parts of kilogram-group weights uses an advanced deep neural network to identify and segment the key parts of multiple instances in an image, giving it strong generality and robustness, and solves the problem of spatially positioning stacked kilogram-group weights by adding a suitable RGBD (red, green, blue and depth) multi-modal sensing technique. This key-part spatial positioning technique makes multi-instance spatial positioning in complex environments possible, and, combined with different deep learning weights and key-part fitting algorithms, can achieve multi-instance spatial positioning of relatively simple shapes.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a method for positioning the key part space of a kilogram group of weights.
The purpose of the invention is realized by the following technical scheme:
a face forehead key point identification method comprises the following steps:
calibrating an RGBD camera to obtain RGB channel data and Depth channel data;
b, determining characteristics and key parts of the kilogram-group weights, inputting RGB channel data into a deep learning network, and identifying and segmenting interested key parts in the picture through Mask R-CNN to generate a BBox frame and a Mask;
c, aligning the RGB channel data with the Depth channel data;
d, cutting the point cloud data according to a BBox frame to generate each example point cloud group; accurately dividing the data of each example point group according to Mask;
e, performing parameter optimization by using LM optimization algorithm by taking the sum of the distances from each point to the axis of the cylinder as an objective function to be optimized to obtain a final kilogram group weight handle cylinder fitting result;
and F, generating a three-dimensional unit vector of key points of the handles of the kilogram-group weights according to the fitting result.
One or more embodiments of the invention may have the following advantages over the prior art:
the three-dimensional space positioning of the stacked kilogram group weight handle part is realized, and good technical support is provided for the space positioning of cylindrical parts such as the kilogram group weight handle.
Drawings
FIG. 1 is a block diagram of the method for spatially positioning key parts of kilogram-group weights;
FIG. 2 is a three-dimensional point cloud model of a kilogram-group weight handle;
FIG. 3 is the model after the kilogram-group weight handles are fitted.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
The embodiment provides a method for spatially positioning key parts of kilogram-group weights. First, an RGBD camera is calibrated to obtain RGB channel data and Depth channel data. Second, the characteristics and key parts of the kilogram-group weights are determined, the RGB channel data are input into a deep learning network, and the key parts of interest in the picture are identified and segmented through Mask R-CNN to generate a BBox frame and a Mask. The RGB channel data are then aligned with the Depth channel data, and the point cloud data are cut and precisely segmented according to the BBox frame and the Mask. Finally, taking the sum of the distances from each point to the cylinder axis as the objective function to be optimized, parameter optimization with the LM optimization algorithm fits the original point cloud of FIG. 2 into the kilogram-group weight handle of FIG. 3, and a three-dimensional unit vector of the key points of the kilogram-group weight handles is generated from the fitting result. The invention can quickly and accurately position the kilogram-group weight handle, providing good technical support for the spatial positioning of cylindrical parts such as kilogram-group weight handles.
As shown in FIG. 1, the method for spatially positioning key parts of kilogram-group weights comprises a data acquisition stage, an instance segmentation stage, a picture processing stage, a point cloud simplification stage, a point cloud fitting stage and a key point coordinate generation stage. The method specifically comprises the following steps:
step 10, calibrating an RGBD camera to obtain RGB channel data and Depth channel data;
step 20, determining characteristics and key parts of kilogram-group weights, inputting RGB channel data into a deep learning network, and identifying and segmenting interested key parts in a picture through Mask R-CNN to generate a BBox frame and a Mask;
step 30, aligning the RGB channel data with the Depth channel data;
step 40, cutting the point cloud data according to the BBox frame to generate a point cloud group for each instance; precisely segmenting each instance's point cloud data according to the Mask;
step 50, taking the sum of the distances from each point to the cylinder axis as the objective function to be optimized and performing parameter optimization with the LM optimization algorithm to obtain the final cylinder fitting result for the kilogram-group weight handles;
step 60, generating a three-dimensional unit vector of the key points of the kilogram-group weight handles according to the fitting result.
The step 10 specifically includes:
A camera is arranged directly above the kilogram group of weights, looking straight down, and is connected to a computer through a USB Type-C interface; the API (application programming interface) of the Intel RealSense SDK is called in a linux/ubuntu operating system environment to calibrate the RGBD camera; an RGB image and a Depth image are then collected.
The step 20 specifically includes:
calling a deep learning algorithm under the linux/ubuntu operating system environment, identifying and segmenting a kilogram group of weight handles in the collected RGB image by using weights trained in advance, and outputting a Mask binary image and BBox frame coordinates of the handles;
the training of the network can be represented by the following optimization formula:
w* = arg min_w ∑_{i=1}^{N_ap} L(p_out(x_i; w), y_i)
where p_out represents the neural network model and is a function of the network weights, and N_ap represents the number of samples; the network weights corresponding to the minimum of this equation are then solved by gradient descent, giving the trained neural network model.
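As an illustration of the gradient-descent solution of the training formula, the following sketch minimizes a loss summed over N_ap samples for a toy differentiable model (the linear model, quadratic loss, and all names here are illustrative assumptions, not the actual Mask R-CNN training code):

```python
import numpy as np

# Toy stand-in for p_out: a linear model whose weights w are trained by
# minimizing the loss summed over N_ap samples, as in the optimization formula.
def p_out(X, w):
    return X @ w

def grad(w, X, y):
    # Gradient of the summed quadratic loss sum_i (p_out(x_i; w) - y_i)^2
    return 2 * X.T @ (p_out(X, w) - y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # N_ap = 50 samples
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                          # synthetic ground-truth labels

w = np.zeros(3)
for _ in range(500):                    # plain gradient descent
    w -= 0.005 * grad(w, X, y)

print(np.round(w, 3))                   # converges toward w_true
```

The same principle (iteratively stepping the weights against the loss gradient) is what the patent's training step delegates to the deep learning framework.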
The step 30 specifically includes:
The RGB image is aligned with the Depth image by bilinear interpolation, so that the depth data corresponding to every pixel in the RGB image are accurate.
Assuming that the depth information corresponding to a pixel in the RGB image is f (i, j), the depth value f (i + u, j + v) at (i + u, j + v) for u, v ∈ (0,1) is:
f(i+u, j+v) = (1-u)(1-v)·f(i, j) + (1-u)v·f(i, j+1) + u(1-v)·f(i+1, j) + uv·f(i+1, j+1)
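The bilinear interpolation formula above can be sketched directly (a minimal illustration; the function name `bilinear_depth` is an assumption):

```python
import numpy as np

def bilinear_depth(depth, i, j, u, v):
    """Bilinearly interpolate the depth value at (i+u, j+v), with 0 <= u, v < 1,
    from the four surrounding pixels of the Depth image."""
    return ((1 - u) * (1 - v) * depth[i, j]
            + (1 - u) * v * depth[i, j + 1]
            + u * (1 - v) * depth[i + 1, j]
            + u * v * depth[i + 1, j + 1])

depth = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
print(bilinear_depth(depth, 0, 0, 0.5, 0.5))  # centre of the 2x2 patch -> 2.5
```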
the step 40 specifically includes:
The point cloud data of the whole image are cut according to the BBox frame generated in step 20, a corresponding small-scale point cloud image is generated for each kilogram-group weight instance, and each small-scale point cloud image is further simplified according to the Mask 0-1 binary image.
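A minimal sketch of the coarse BBox cut followed by Mask refinement on an organized point cloud (the function name, the (x1, y1, x2, y2) box convention, and the H×W×3 array layout are assumptions for illustration):

```python
import numpy as np

def crop_instance_points(points, mask, bbox):
    """points: HxWx3 organized point cloud aligned with the RGB image;
    mask: HxW 0-1 binary Mask for one instance; bbox: (x1, y1, x2, y2).
    Returns an Nx3 array holding only this instance's points."""
    x1, y1, x2, y2 = bbox
    pts = points[y1:y2, x1:x2].reshape(-1, 3)        # coarse cut by BBox
    keep = mask[y1:y2, x1:x2].reshape(-1).astype(bool)  # refine by Mask
    return pts[keep]
```

Applying this per BBox/Mask pair yields one simplified point cloud group per weight instance, ready for cylinder fitting.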
the step 50 specifically includes:
The point cloud data simplified in step 40 are fitted to a cylindrical surface. Let each point of the point cloud be x = (x, y, z) and the unit direction vector of the cylinder axis be a = (a_x, a_y, a_z); the objective function is then:
d(x) = ∑_i [f(x_i, a) - r]
where f is the distance from a point to the axis and r is the radius of the fitted cylinder. Because the kilogram-group weights are stacked horizontally and the handle shape is fixed and known, a_z is 0, a_x and a_y are obtained from point cloud data statistics, and r is 15 mm. The LM optimization algorithm is then used to solve the optimization and obtain the cylinder fitting result for the kilogram-group weight handles.
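Under the stated constraints (a_z = 0, r = 15 mm known), the cylinder fit can be sketched with SciPy's Levenberg-Marquardt solver (a simplified illustration, not the patented implementation; the function names and the parameterization of the axis by an in-plane angle are assumptions):

```python
import numpy as np
from scipy.optimize import least_squares

R_HANDLE = 0.015  # known handle radius: 15 mm, per the text

def residuals(params, pts):
    # params: theta (in-plane axis direction, since a_z = 0) and a point
    # (x0, y0, z0) on the axis; residual = point-to-axis distance minus r
    theta, x0, y0, z0 = params
    a = np.array([np.cos(theta), np.sin(theta), 0.0])   # unit axis direction
    d = np.linalg.norm(np.cross(pts - np.array([x0, y0, z0]), a), axis=1)
    return d - R_HANDLE

def fit_handle(pts, init=(0.0, 0.0, 0.0, 0.0)):
    # method="lm" selects SciPy's Levenberg-Marquardt (MINPACK) solver
    sol = least_squares(residuals, np.asarray(init), args=(pts,), method="lm")
    theta = sol.x[0]
    axis = np.array([np.cos(theta), np.sin(theta), 0.0])
    return axis, sol.x[1:]
```

The in-plane angle parameterization bakes in the horizontal-stacking constraint a_z = 0, so the optimizer only searches over the axis heading and a point on the axis.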
The step 60 specifically includes: three-dimensional vectors of the kilogram-group weight handles are generated according to the fitting result produced in step 50, sorted by z value, and output as the key-point spatial positioning data of the stacked kilogram-group weights.
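A minimal sketch of this final step, normalizing each fitted axis vector to a unit vector and sorting the instances by z value (the function name and the (axis, point) tuple layout are assumptions):

```python
import numpy as np

def sort_handle_keypoints(fits):
    """fits: list of (axis_vector, point_on_axis) pairs, one per fitted handle.
    Normalizes each axis vector to a unit vector and sorts the instances by
    the z value of the axis point, topmost weight first."""
    normed = [(np.asarray(a, float) / np.linalg.norm(a), np.asarray(p, float))
              for a, p in fits]
    return sorted(normed, key=lambda ap: ap[1][2], reverse=True)

fits = [([2.0, 0.0, 0.0], [0.0, 0.0, 0.10]),
        ([0.0, 3.0, 0.0], [0.0, 0.0, 0.25])]
ordered = sort_handle_keypoints(fits)
print(ordered[0][1][2])  # 0.25 -- the higher handle comes first
```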
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (7)

1. A method for spatially positioning key parts of kilogram-group weights is characterized by comprising the following steps:
A. calibrating an RGBD camera to obtain an RGB image and a Depth image of a kilogram group weight;
B. determining the characteristics of the kilogram group weights and the kilogram group weight handles, inputting the RGB image into a Mask R-CNN model, and identifying and segmenting key parts of interest in the RGB image through the Mask R-CNN model to generate a BBox frame and a Mask; the key part is a handle;
C. aligning the RGB image with the Depth image;
D. cutting the point cloud data according to the BBox frame to generate a point cloud group for each instance; precisely segmenting each instance's point cloud data according to the Mask;
E. performing parameter optimization by using LM optimization algorithm by taking the sum of the distances from each point to the axis of the cylinder as an objective function to be optimized to obtain a final kilogram group weight handle cylinder fitting result;
F. generating a three-dimensional vector of the key points of the kilogram-group weight handles according to the fitting result.
2. The method for spatially positioning the key parts of the kilogram-group weights according to claim 1, wherein the step A specifically comprises:
A camera is arranged directly above the kilogram group of weights, looking straight down, and is connected to a computer through a USB Type-C interface; the API (application programming interface) of the Intel RealSense SDK is called in a linux/ubuntu operating system environment to calibrate the RGBD camera; an RGB image and a Depth image are then collected.
3. The method for spatially locating the critical parts of kilogram-group weights according to claim 1, wherein in step B:
under the environment of a linux/ubuntu operating system, calling a Mask R-CNN model, identifying and segmenting a kilogram group weight handle in an acquired RGB image by using weights trained in advance, and outputting a Mask and BBox frame coordinates of the handle;
the training of the Mask R-CNN model is represented by the following optimization formula:
w* = arg min_w ∑_{i=1}^{N_ap} L(p_out(x_i; w), y_i)
where p_out represents the Mask R-CNN model and is a function of the Mask R-CNN model weights, and N_ap represents the number of samples; the Mask R-CNN model weights corresponding to the minimum of the optimization formula are solved by gradient descent to obtain the trained Mask R-CNN model.
4. The method for spatially positioning the critical parts of the kilogram-group weights according to claim 1, wherein the step C specifically comprises:
aligning the RGB image with the Depth image through a bilinear interpolation method so as to ensure the accuracy of Depth data corresponding to all pixels in the RGB image;
assuming that the depth information corresponding to a pixel in the RGB image is f (i, j), the depth value f (i + u, j + v) at (i + u, j + v) for u, v ∈ (0,1) is:
f(i+u, j+v) = (1-u)(1-v)·f(i, j) + (1-u)v·f(i, j+1) + u(1-v)·f(i+1, j) + uv·f(i+1, j+1)
5. the method for spatially positioning the critical part of the kilogram-group weight according to claim 1, wherein the step D specifically comprises:
The point cloud data of the whole image are cut according to the generated BBox frame, a corresponding small-scale point cloud picture is generated for each kilogram-group weight instance, and each small-scale point cloud picture is further simplified according to the Mask.
6. The method for spatially positioning the critical parts of the kilogram-group weights according to claim 1, wherein the step E specifically comprises:
fitting the simplified point cloud data to a cylindrical surface: let each point of the point cloud be x = (x, y, z) and the unit direction vector of the cylinder axis be a = (a_x, a_y, a_z); the objective function is then as follows:
d(x) = ∑_i [f(x_i, a) - r]
where f is the point-to-axis distance function and r is the radius of the fitted cylinder; because the kilogram-group weights are stacked horizontally and the handle shape is fixed, a_z is 0, a_x and a_y are obtained from point cloud data statistics, and r is 15 mm; the LM optimization algorithm is used for the optimization solution to obtain the cylinder fitting result for the kilogram-group weight handles.
7. The method for spatially positioning the critical part of the kilogram-group weight according to claim 1, wherein the step F specifically comprises:
generating three-dimensional vectors of the kilogram-group weight handles according to the cylinder fitting result, sorting them by z value, and outputting the key-point spatial positioning data of the stacked kilogram-group weight handles.
CN202011103011.4A 2020-10-15 2020-10-15 Method for spatially positioning key part of kilogram group weight Active CN112184797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011103011.4A CN112184797B (en) 2020-10-15 2020-10-15 Method for spatially positioning key part of kilogram group weight


Publications (2)

Publication Number Publication Date
CN112184797A CN112184797A (en) 2021-01-05
CN112184797B true CN112184797B (en) 2023-04-07

Family

ID=73950374

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011103011.4A Active CN112184797B (en) 2020-10-15 2020-10-15 Method for spatially positioning key part of kilogram group weight

Country Status (1)

Country Link
CN (1) CN112184797B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104626142A (en) * 2014-12-24 2015-05-20 镇江市计量检定测试中心 Method for automatically locating and moving binocular vision mechanical arm for weight testing
CN110599489A (en) * 2019-08-26 2019-12-20 华中科技大学 Target space positioning method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design of an unmanned verification system for kilogram-group weights; Ma Jian et al.; Metrology & Measurement Technique; 2019-11-30; Vol. 46, No. 11; pp. 14-16 *

Also Published As

Publication number Publication date
CN112184797A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN109614935B (en) Vehicle damage assessment method and device, storage medium and electronic equipment
CN109993793B (en) Visual positioning method and device
US9165365B2 (en) Method and system for estimating attitude of camera
CN111046843B (en) Monocular ranging method in intelligent driving environment
CN110751097B (en) Semi-supervised three-dimensional point cloud gesture key point detection method
CN112200056B (en) Face living body detection method and device, electronic equipment and storage medium
US20200057778A1 (en) Depth image pose search with a bootstrapped-created database
CN116229189B (en) Image processing method, device, equipment and storage medium based on fluorescence endoscope
CN113343976A (en) Anti-highlight interference engineering measurement mark extraction method based on color-edge fusion feature growth
CN111340878B (en) Image processing method and device
CN112929626A (en) Three-dimensional information extraction method based on smartphone image
CN113592839A (en) Distribution network line typical defect diagnosis method and system based on improved fast RCNN
Kochi et al. A 3D shape-measuring system for assessing strawberry fruits
CN109993107B (en) Mobile robot obstacle visual detection method based on non-iterative K-means algorithm
CN117372604B (en) 3D face model generation method, device, equipment and readable storage medium
CN112184797B (en) Method for spatially positioning key part of kilogram group weight
CN116958434A (en) Multi-view three-dimensional reconstruction method, measurement method and system
CN113939852A (en) Object recognition device and object recognition method
CN113532424B (en) Integrated equipment for acquiring multidimensional information and cooperative measurement method
CN114882085A (en) Three-dimensional point cloud registration method and system based on single cube
CN113048899A (en) Thickness measuring method and system based on line structured light
JP2020087155A (en) Information processing apparatus, information processing method, and program
CN117593618B (en) Point cloud generation method based on nerve radiation field and depth map
CN113744181B (en) Hardware robot intelligent polishing method and device based on 2D3D vision fusion
CN109409278A (en) Image target positioning method based on estimation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant