CN109816724B - Three-dimensional feature extraction method and device based on machine vision


Info

Publication number
CN109816724B
CN109816724B (application number CN201811474153.4A)
Authority
CN
China
Prior art keywords
detected
image
position information
feature point
feature
Prior art date
Legal status
Active
Application number
CN201811474153.4A
Other languages
Chinese (zh)
Other versions
CN109816724A (en)
Inventor
沈震
熊刚
李志帅
彭泓力
郭超
董西松
商秀芹
王飞跃
Current Assignee
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201811474153.4A priority Critical patent/CN109816724B/en
Publication of CN109816724A publication Critical patent/CN109816724A/en
Priority to PCT/CN2019/105962 priority patent/WO2020114035A1/en
Application granted granted Critical
Publication of CN109816724B publication Critical patent/CN109816724B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the field of machine vision, and specifically provides a three-dimensional feature extraction method and device based on machine vision. The invention aims to solve problems of three-dimensional model reconstruction in the prior art, such as a complex and time-consuming process and difficulty of popularization. To this end, the three-dimensional feature extraction method based on machine vision of the present invention comprises the steps of: acquiring multi-angle images containing preset feature points to be measured of a target object; extracting the position information of the feature points to be measured in each image; obtaining spatial position information of the feature points to be measured from their position information in each image; and calculating first distance information and/or second distance information corresponding to a certain feature point to be measured based on the spatial position information and a preset three-dimensional feature category. Images containing the feature points to be measured are acquired from different angles by machine vision so as to obtain their spatial position information, from which the distance information of the target object can be calculated.

Description

Three-dimensional feature extraction method and device based on machine vision
Technical Field
The invention belongs to the field of machine vision, and particularly relates to a three-dimensional feature extraction method and device based on machine vision.
Background
With the development of cloud manufacturing and cloud computing and the approach of "Industry 4.0", a social manufacturing model, i.e. a model of production customized for individual customers, has emerged. Social manufacturing is characterized by the ability to convert consumer requirements directly into products: grounded in social computing theory and supported by mobile internet technology, social media and 3D printing, members of society can fully participate, through crowdsourcing and similar modes, in the whole life-cycle manufacturing of a product, realizing a personalized, real-time and economical mode of production and consumption. That is, in social manufacturing, every consumer may participate in every stage of a product's full life cycle, including its design, manufacture and consumption. Taking shoemaking as an example, applying social manufacturing means that a user can customize shoes according to need, which in turn requires that the three-dimensional features of the user's foot shape be obtained simply, quickly and accurately.
However, traditional manual measurement yields only a few foot-shape parameters and cannot describe the foot shape accurately, and accurate measurements can be obtained only with professional shoemaking tools. To enable non-professionals to obtain accurate foot-shape parameters and thus realize personalized shoe customization, the present invention obtains the foot-shape parameters by model-building computation. Because arch height and the angle between the toes and the sole plane differ from person to person, obtaining only the two characteristic dimensions of foot length and foot width cannot accurately reflect the differences between individual feet of the same nominal size; the foot shape therefore needs three-dimensional model reconstruction to obtain accurate parameters. At present, foot-shape three-dimensional reconstruction can be performed with equipment such as laser three-dimensional scanners, but this approach is complex and time-consuming to operate, has high hardware cost and is difficult to popularize. A simpler three-dimensional modeling method is thus needed to obtain the foot-shape parameters accurately.
Accordingly, there is a need in the art for a new three-dimensional model reconstruction method that solves the above-mentioned problems.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, the complexity, time consumption and difficulty of popularization of the existing three-dimensional model reconstruction process, a first aspect of the present invention discloses a three-dimensional feature extraction method based on machine vision, comprising the following steps: acquiring multi-angle images containing a reference object and preset feature points to be measured of a target object arranged relative to the reference object; extracting the position information of the feature points to be measured in each image; obtaining spatial position information of the feature points to be measured from their position information in each image; and calculating first distance information and/or second distance information corresponding to a certain feature point to be measured based on the spatial position information and a preset three-dimensional feature category. The first distance information is the distance between the certain feature point to be measured and other feature points to be measured, and the second distance information is the vertical distance between the certain feature point to be measured and a preset plane; the certain feature point to be measured, the other feature points to be measured and the plane all depend on the three-dimensional feature category.
In a preferred embodiment of the above three-dimensional feature extraction method based on machine vision, the step of "extracting the position information of the feature point to be measured in each image" includes: acquiring the pixel position of the characteristic point to be detected in a certain image by using a manual marking method; and extracting the corresponding pixel positions of the feature points to be detected in other images by using a preset feature point matching method and according to the acquired pixel positions.
In a preferred embodiment of the above three-dimensional feature extraction method based on machine vision, the step of "extracting the position information of the feature point to be measured in each image" includes: acquiring the area shape corresponding to the area where the characteristic point to be detected in the target object is located; acquiring a region to be detected corresponding to each image according to the region shape; and acquiring the position information of the feature point to be detected in each image according to the relative position between the feature point to be detected and the shape of the region and each region to be detected.
In a preferred embodiment of the above three-dimensional feature extraction method based on machine vision, the step of "extracting the position information of the feature point to be measured in each image" includes: acquiring the position information of the feature point to be detected in each image by utilizing a pre-constructed neural network; the neural network is a deep neural network which is based on a preset training set and trained by using a deep learning correlation algorithm.
In a preferred embodiment of the three-dimensional feature extraction method based on machine vision, the step of "obtaining spatial location information of the feature point to be measured according to location information of the feature point to be measured in each image" includes: and acquiring the Euclidean position of the feature point to be detected by using a triangulation method according to the position information of the feature point to be detected in each image and the internal and external parameters of the camera.
In a preferred embodiment of the three-dimensional feature extraction method based on machine vision, the step of "obtaining spatial location information of the feature point to be measured according to location information of the feature point to be measured in each image" includes: constructing a sparse model by using an incremental SFM method and the position information of the characteristic point to be detected in each image, and calculating the spatial position information of the characteristic point to be detected in a world coordinate system by using a triangulation method; and restoring the spatial position information of the characteristic point to be detected in the above steps in the world coordinate system by using the scale coefficient obtained in advance to obtain the real position of the characteristic point to be detected.
In a preferred technical solution of the three-dimensional feature extraction method based on machine vision, before "restoring, with the scale coefficient obtained in advance, the spatial position information of the feature point in the world coordinate system obtained in the above step to obtain the true position of the feature point to be measured", the method further comprises: obtaining the coordinates of the vertices of the reference object in the world coordinate system with the sparse model and from the pixel positions of the vertices in the camera coordinate system, noting that these vertex coordinates differ from the true spatial position in the world coordinate system by a scale coefficient λ; and calculating the scale coefficient λ from the coordinates of a vertex of the reference object in the world coordinate system and the true spatial position of that vertex.
In a preferred embodiment of the above three-dimensional feature extraction method based on machine vision, the triangularization method includes: and acquiring the projective space position of the feature point to be detected according to the internal and external parameters of the camera and the position information of the feature point to be detected in each image, and performing homogenization processing on the projective space position to obtain the Euclidean space position of the feature point to be detected.
The technical solution of the present invention acquires images of a target object from different angles, extracts the positions of the feature points to be measured in the images, calculates the spatial positions of the feature points in the world coordinate system by triangulation or by solving a sparse reconstruction problem, and then calculates the first distance information and/or second distance information between feature points from the calculated spatial positions. This three-dimensional feature extraction method can quickly determine the three-dimensional feature points of the target object from multi-angle images acquired with ordinary photographing equipment and then calculate the distance information of the target object; it requires no costly, operationally complex hardware such as a laser three-dimensional scanner, and it simplifies the three-dimensional reconstruction process.
In the preferred technical solutions of the invention, the pixel position of the feature point to be measured in each image is determined by manual marking or by an automatic method, where the automatic method either finds the region to be detected in each image according to the shape of the region containing the feature point, or uses a pre-constructed neural network, to obtain the feature point's position in each image. The camera parameters are then calibrated automatically with a reference object, and the true spatial position of the feature point is obtained by triangulation or by solving a sparse reconstruction problem, so that model reconstruction of the whole target object is unnecessary, reducing computation and simplifying the model-building process. Finally, the distance information corresponding to the feature points is calculated from their true spatial positions and the preset three-dimensional feature category.
A second aspect of the invention provides a storage device storing a plurality of programs adapted to be loaded by a processor to perform the machine vision based three-dimensional feature extraction method of any of the preceding claims.
It should be noted that the storage device has all the technical effects of the foregoing three-dimensional feature extraction method based on machine vision, and details are not repeated here.
The third aspect of the present invention also provides a control apparatus comprising a processor and a storage device, wherein the storage device is adapted to store a plurality of programs, and the programs are adapted to be loaded by the processor to perform the machine vision-based three-dimensional feature extraction method according to any one of the preceding claims.
It should be noted that the control device has all the technical effects of the aforementioned three-dimensional feature extraction method based on machine vision, and details are not repeated herein.
Drawings
The three-dimensional feature extraction method based on machine vision of the present invention is described below with reference to the accompanying drawings, taking a foot shape as an example. In the drawings:
FIG. 1 is a flow chart of the main steps of the machine-vision-based foot-shape three-dimensional feature extraction method in an embodiment of the present invention;
FIG. 2 is a schematic diagram of detecting feature points with a generalized Hough transform using a circle as the template in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention;
FIG. 3 is a schematic diagram of detecting feature points with a generalized Hough transform using a circle as the template, from another viewing angle;
FIG. 4 is a schematic diagram of detecting feature points with a generalized Hough transform using a circle as the template, from a further viewing angle;
FIG. 5 is a schematic diagram of detecting the reference object with a generalized Hough transform using a straight line as the template in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention;
FIG. 6 is a schematic diagram of the process of solving the spatial position information of feature points by triangulation in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention;
FIG. 7 is a schematic diagram of the process of solving the spatial position information of feature points by sparse reconstruction in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments only explain the technical principle of the present invention and are not intended to limit its scope. For example, although the invention is described with reference to a foot shape, it may also be applied to other objects that are modeled before being turned into a product, such as clothing. Likewise, although the invention is described with A4 paper as the reference object, other objects of known dimensions (such as floor tiles) may be used. Those skilled in the art can adjust the method as needed to suit particular applications.
It should be noted that the terms "first", "second" and "third" in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The following describes a method for extracting three-dimensional features of a foot shape based on machine vision according to the present invention with reference to the accompanying drawings.
In a specific embodiment of the invention, the extraction and calculation of the three-dimensional foot-shape parameters is converted into determining the spatial positions of the corresponding feature points, after which the parameters of the foot shape to be measured are calculated with the Euclidean distance formula. The basic foot-shape parameters obtainable in this way include the parameter information required for shoemaking, such as foot length, foot circumference, instep-girth height, the height of the upper bending point of the arch, foot width, big-toe height, heel-convex-point height, and the height of the center point of the outer ankle bone. A possible implementation of the machine-vision-based foot-shape three-dimensional feature extraction method is described below, taking the three parameters of foot length, foot width and ankle-point height as an example.
Referring to fig. 1, fig. 1 exemplarily shows main steps of a foot-shaped three-dimensional feature extraction method based on machine vision in an embodiment of the present invention, and the foot-shaped three-dimensional feature extraction method based on machine vision in the present invention may include the following steps:
and S100, acquiring a multi-angle image containing the preset feature points to be detected of the target object.
Specifically, the foot is placed on a sheet of A4 paper, and images of the foot shape at multiple angles are taken with a mobile photographing device such as a camera, so that the foot shape is fully captured and enough feature points to be measured are obtained: for example, the vertex of the longest toe and the heel convex point, for calculating the length of the foot shape to be measured; the outer point of the ball of the big toe and the outer point of the little-toe root, for calculating the width; and the ankle point, for calculating the ankle-point height. It should be noted that at least three images of the foot shape should be captured, and the more images contain a given feature point to be measured, the more accurate the foot-shape parameters calculated from that feature point.
Step S200: extracting the position information of the feature point to be measured in each image.
Specifically, in a preferred embodiment of this embodiment, the three-dimensional feature extraction method shown in fig. 1 may obtain the pixel position (x, y) of the feature point to be detected in each image according to the following steps, specifically:
firstly, the pixel position of the Feature Point to be detected in a certain image is marked manually, and then the corresponding pixel position of the Feature Point to be detected in other images is found by using a Feature Point matching method, such as Scale Invariant Feature Transform (SIFT) or Iterative Closest Point (ICP). Taking the measurement of the height of the ankle as an example, an image including the ankle point is selected, the pixel position of the ankle point in the image is manually marked, and then the corresponding pixel position of the ankle point in the images of other angles including the ankle point is found by using a feature point matching method such as SIFT or ICP. By the method, the corresponding pixel positions of the characteristic points to be detected in all the images can be quickly found without manually marking the characteristic points on each image, and the efficiency of acquiring the pixel positions of the characteristic points is improved.
Optionally, in another preferred embodiment of this embodiment, the three-dimensional feature extraction method shown in fig. 1 may further obtain a pixel position (x, y) of the feature point to be detected in each image according to the following steps, specifically:
and according to the uniqueness of the shape of the region where the feature point to be detected is located, detecting the specific shape by using a feature detection method such as generalized Hough transform so as to determine the position information of the feature point to be detected in each image. Specifically, the method comprises the steps of firstly determining the area shape corresponding to the area where the feature point to be detected is located, then automatically finding the corresponding area to be detected of the feature point to be detected in each image according to the area shape and by utilizing generalized Hough transform, and then obtaining the position information of the feature point to be detected in each image according to the relative position between the feature point to be detected and the area shape and the area to be detected in each image. In the following, a possible implementation is described by taking a circle as a template and finding feature points by using a generalized hough transform as an example.
Referring to figs. 2, 3 and 4, which are schematic diagrams of detecting feature points with the generalized Hough transform using a circle as the template in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention: figs. 2, 3 and 4 show, from different angles, specific implementations of finding the ankle point with the circle used as the template. As shown in the figures, the ankle region around the ankle center is circular, and this circular outline is unique within the foot shape. Therefore, applying the generalized Hough transform with the circle as the template automatically finds the circular position in the image (the dashed circle templates in figs. 2-4); that position is where the ankle lies, and its center, point G, is the pixel position in the image of the ankle feature point to be measured.
It can be understood that, when determining the position information of the vertex of the longest toe, the outline of the longest toe can be used as a template of the generalized hough transform, a search is performed in the image, and after finding the outline of the toe, the pixel position of the feature point is determined by the relative position of the outline and the vertex of the longest toe.
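Purely as an illustrative sketch of this search, the following Python snippet uses OpenCV's circle Hough transform (a simplification of the generalized Hough transform described above); the function name find_ankle_point and all radius and threshold parameters are assumptions, not values from the patent:

```python
import cv2

def find_ankle_point(image_bgr):
    """Locate the circular ankle region and return its center G
    as the estimated pixel position of the ankle point."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)  # suppress noise before edge voting
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2,
                               minDist=100, param1=100, param2=30,
                               minRadius=15, maxRadius=60)
    if circles is None:
        return None
    x, y, r = circles[0][0]          # strongest circle: the ankle contour
    return (float(x), float(y))      # center G = ankle-point pixel position
```

For non-circular templates such as the toe outline, OpenCV also offers a generalized variant (createGeneralizedHoughBallard) that votes with an arbitrary edge template.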
Optionally, in another preferred embodiment of this embodiment, the three-dimensional feature extraction method shown in fig. 1 may further obtain a pixel position (x, y) of the feature point to be detected in each image according to the following steps, specifically:
and constructing a deep neural network based on data samples of the characteristic points of the foot type marked by enough quantity and by using a deep learning algorithm, and then acquiring the position information of the characteristic points to be detected in each image by using the neural network. Specifically, when the neural network is trained, image data containing the feature point to be detected is input, the pixel position (x, y) of the feature point to be detected in the image is output, wherein the output comprises real output and expected output, the real output of the last full-connection layer of the network is the pixel position (x, y) of the feature point to be detected in the image, and the expected output of the network is the marked actual pixel position of the feature point to be detected in the image. And then reversely training the whole network by utilizing an error generated by the real output and the expected output of the network, iteratively training until the network is converged, inputting a certain image to be tested containing the characteristic point to be tested after the neural network is trained, and automatically outputting the pixel position of the neural network in the image by the neural network. Taking the pixel position of the ankle point as an example, selecting a sufficient amount of image samples marked with the ankle point as a training set, constructing a deep neural network, then training the deep neural network by using the training set, inputting an image to be tested containing the ankle point after the training is finished, and automatically outputting the pixel position of the ankle point in the image by using the neural network. It can be understood that when the pixel positions of other feature points are determined, the image data samples corresponding to the feature points are used for training a pre-constructed deep neural network, and then the image to be measured containing the feature points is input, so that the pixel positions of the feature points in the image are obtained.
Step S300: acquiring the spatial position information of the feature point to be measured according to its position information in each image.
Specifically, in a preferred embodiment of this embodiment, the three-dimensional feature extraction method shown in fig. 1 may obtain spatial position information of the feature point to be measured according to the following steps:
firstly, calibrating camera parameters by using a reference object, and then calculating the spatial position information of the characteristic point to be measured by using a triangulation method. Specifically, taking a sheet of a4 as a reference, a foot shape is placed on a sheet of a4, and a plurality of images at different angles, including the outline of the sheet of a4, are acquired by an imaging device such as a camera. And calibrating the camera by using the images at different angles, and determining an internal parameter matrix K of the camera, and a rotation matrix R and a translation matrix t of external parameters relative to a world coordinate system. And then, according to the pixel position (X, Y) of the feature point to be measured in the image obtained in the step S200, and by utilizing a triangulation method and a homogenization method, the spatial position information (X, Y, Z) of the feature point to be measured in a world coordinate system is solved. A possible implementation of obtaining the true spatial position of the feature point by the triangulation method is described below with reference to fig. 5 and 6.
Referring to fig. 5, fig. 5 is a schematic diagram of detecting the reference object with a generalized Hough transform using a straight line as the template, in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention. As shown in fig. 5, the edge lines of the A4 paper in the image are detected with a straight-line template using a randomized Hough transform. Four edge lines are detected; they intersect pairwise, and the intersections are the pixel positions \((x_i, y_i)\), \(i = 1, 2, 3, 4\), of the four vertices (A, B, C, D) of the A4 paper. With continued reference to figs. 2, 3 and 4, the following relationship between point A in Euclidean space and in projective space follows from the geometry of spatial transformations:

\[
s \begin{pmatrix} x_A \\ y_A \\ 1 \end{pmatrix}
= K \left[\, r_1 \;\; r_2 \;\; r_3 \;\middle|\; t \,\right]
\begin{pmatrix} X_A \\ Y_A \\ 0 \\ 1 \end{pmatrix}
= K \left[\, r_1 \;\; r_2 \;\middle|\; t \,\right]
\begin{pmatrix} X_A \\ Y_A \\ 1 \end{pmatrix}
\tag{1}
\]

In equation (1), K, R and t are the camera intrinsic parameter matrix and the rotation and translation matrices of the camera relative to the world coordinate system ([R | t] together is called the camera extrinsic parameter matrix); the symbol "|" denotes an augmented matrix, and \(r_1, r_2, r_3\) are the columns of the rotation matrix R. Because the A4 plane is taken as Z = 0, the column \(r_3\) multiplies the zero element in the matrix product and is eliminated.

Here \((x_A, y_A, 1)^T\) is the homogeneous pixel position of vertex A of the A4 paper, \((X_A, Y_A, Z_A)^T\) is its true position in the world coordinate system, and K[R|t] collects the intrinsic and extrinsic camera parameters. The homography matrix \(H = K[\,r_1 \; r_2 \;|\; t\,]\) has 8 degrees of freedom. The world coordinate system is established at vertex A of the A4 paper, so the world coordinates of the four vertices are (0, 0, 0), (X, 0, 0), (0, Y, 0) and (X, Y, 0), with X = 210 mm and Y = 297 mm. Writing each vertex in the form of equation (1) yields two linear equations, so the four vertices yield 8 linear equations, from which H is solved by the Direct Linear Transform (DLT) method.
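To make the DLT step concrete, here is a minimal numpy sketch (illustrative only; homography_dlt is a hypothetical helper name) that stacks the 8 linear equations contributed by the four A4 vertices and takes the null-space vector via SVD:

```python
import numpy as np

def homography_dlt(pix, world):
    """Solve H (up to scale) from 4 point pairs: pix[i] = (x_i, y_i),
    world[i] = (X_i, Y_i) on the Z = 0 plane of the A4 sheet."""
    A = []
    for (x, y), (X, Y) in zip(pix, world):
        # Each correspondence contributes two rows of the system A h = 0.
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y, -x])
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y, -y])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)   # null-space vector = h
    return H / H[2, 2]         # fix the arbitrary scale

# The four A4 vertices in world coordinates (mm), origin at vertex A:
world = [(0, 0), (210, 0), (0, 297), (210, 297)]
```

In practice cv2.findHomography(world_pts, pixel_pts) would return the same matrix up to numerical differences.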
The three photographs are taken from different angles, so the camera pose differs for each; by the same method, the three homography matrices \(H_1, H_2, H_3\) of the camera relative to the world coordinate system can be obtained.

K can be determined from a homography matrix H. Since \(H = [\,h_1 \; h_2 \; h_3\,] = K[\,r_1 \; r_2 \;|\; t\,]\), it follows that

\[
K^{-1}[\,h_1 \;\; h_2 \;\; h_3\,] = [\,r_1 \;\; r_2 \;\middle|\; t\,]
\tag{2}
\]

In equation (2), \(K^{-1}\) is the inverse of the camera intrinsic parameter matrix; \(r_1, r_2\) and t are the first two columns of the rotation matrix and the translation of the camera relative to the world coordinate system; and \(h_1, h_2, h_3\) are the columns of the homography matrix obtained from one image.

Since \(R = [\,r_1 \; r_2 \; r_3\,]\) is a rotation matrix, its columns are orthonormal: \(r_1^T r_2 = 0\) and \(\|r_1\| = \|r_2\| = 1\). Hence \(h_1^T K^{-T} K^{-1} h_2 = 0\), and further:

\[
h_1^T K^{-T} K^{-1} h_1 = h_2^T K^{-T} K^{-1} h_2
\tag{3}
\]

In equation (3), \(K^{-T}\) and \(K^{-1}\) are the transposed inverse and the inverse of the camera intrinsic parameter matrix, and \(h_1, h_2\) are the first two columns of an image's homography matrix; each image thus contributes two constraint equations on the camera intrinsic parameters.

The camera intrinsic parameter matrix K is an upper triangular matrix, so \(w = K^{-T}K^{-1}\) is a symmetric matrix. From the images at the three different angles in figs. 2, 3 and 4, w is solved linearly by DLT, and K is then recovered from w by orthogonal decomposition. By equation (2), \([\,r_1 \; r_2 \;|\; t\,] = K^{-1}[\,h_1 \; h_2 \; h_3\,]\); combining the solved \(h_1, h_2, h_3\) with K yields \(r_1, r_2\) and t. From the orthogonality of the rotation matrix, \(r_3 = r_1 \times r_2\), so \(R = [\,r_1 \; r_2 \; r_3\,]\). In this way the intrinsic and extrinsic camera parameters \(K[R_1|t_1]\), \(K[R_2|t_2]\) and \(K[R_3|t_3]\) at the moments figs. 2, 3 and 4 were taken are obtained.
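A sketch of this intrinsic-parameter recovery under the same assumptions (three homographies in; K and the per-image [R|t] out). The Cholesky factorization below plays the role of the orthogonal decomposition mentioned in the text, and the helper names are hypothetical:

```python
import numpy as np

def v_ij(H, i, j):
    """Row of the linear system on the symmetric matrix w = K^-T K^-1."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0]*hj[0],
                     hi[0]*hj[1] + hi[1]*hj[0],
                     hi[1]*hj[1],
                     hi[2]*hj[0] + hi[0]*hj[2],
                     hi[2]*hj[1] + hi[1]*hj[2],
                     hi[2]*hj[2]])

def intrinsics_from_homographies(Hs):
    """Each H gives the two constraints h1^T w h2 = 0 and
    h1^T w h1 = h2^T w h2 (equation (3)); solve w, then K."""
    V = []
    for H in Hs:
        V.append(v_ij(H, 0, 1))
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))
    _, _, Vt = np.linalg.svd(np.asarray(V))
    a, b, c, d, e, f = Vt[-1]
    w = np.array([[a, b, d], [b, c, e], [d, e, f]])
    if w[0, 0] < 0:               # fix overall sign so w is positive definite
        w = -w
    L = np.linalg.cholesky(w)     # w = L L^T, L lower triangular
    K = np.linalg.inv(L.T)        # since w = K^-T K^-1, K = (L^T)^-1
    return K / K[2, 2]

def pose_from_homography(K, H):
    """Extrinsics for one image: [r1 r2 | t] = K^-1 [h1 h2 h3]."""
    B = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(B[:, 0])   # normalize so ||r1|| = 1
    r1, r2, t = lam * B[:, 0], lam * B[:, 1], lam * B[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])  # r3 = r1 x r2
    return R, t
```

With noisy data the recovered R is only approximately orthogonal; a projection onto the nearest rotation matrix (e.g. via SVD) would normally follow.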
Referring to fig. 6, fig. 6 is a schematic diagram of the process of solving the spatial position of a feature point by triangulation in the machine-vision-based foot-shape three-dimensional feature extraction method of an embodiment of the present invention. As shown in fig. 6, taking the ankle point G in figs. 3 and 4 (i.e. Image1 and Image2) as an example, triangulation starts from the pixel positions \(x_1\) and \(x_2\) of the ankle point G in Image1 and Image2 obtained in step S200 and from the camera parameters \(P_1 = K_1[R_1|t_1]\) and \(P_2 = K_2[R_2|t_2]\) obtained in the preceding steps, and minimizes the sum of squared reprojection errors, \(\min \sum_i \|P_i X - x_i\|^2\), to obtain the position X = (M, N, O, w) of the feature point in projective space. Here \(P_1\) and \(P_2\) are the intrinsic and extrinsic parameters of the camera for Image1 and Image2 obtained by the calibration method above, \(K_1, K_2\) the corresponding intrinsic parameter matrices, and \(R_1, R_2\) and \(t_1, t_2\) the rotation and translation matrices relative to the world coordinate system when the two images were taken. Finally, homogenizing the projective coordinates yields the Euclidean-space position of feature point G, X = (M/w, N/w, O/w) = (X, Y, Z), where M, N, O, w are the coordinates of feature point G in projective space.
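As a minimal sketch of this triangulation step (OpenCV assumed; triangulate_point is a hypothetical helper, and cv2.triangulatePoints implements the linear DLT variant rather than the reprojection-error minimization described above, which is a simplification):

```python
import cv2
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Recover the Euclidean position of one feature point from its
    pixel positions x1, x2 in two views with 3x4 projection matrices
    P1 = K1 [R1|t1] and P2 = K2 [R2|t2]."""
    pts1 = np.asarray(x1, dtype=float).reshape(2, 1)
    pts2 = np.asarray(x2, dtype=float).reshape(2, 1)
    X_proj = cv2.triangulatePoints(P1, P2, pts1, pts2)  # (M, N, O, w)^T
    X = X_proj[:3, 0] / X_proj[3, 0]    # homogeneous normalization
    return X                            # (X, Y, Z) in world coordinates
```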
Optionally, in another preferred embodiment of this embodiment, the three-dimensional feature extraction method shown in fig. 1 may further obtain the real spatial position of the feature point to be detected according to the following steps, specifically:
and (3) converting the three-dimensional reconstruction problem into the sparse reconstruction problem of the characteristic points to be detected, such as constructing a sparse model by using an incremental SFM method and solving the sparse reconstruction problem by using a triangulation method. Specifically, according to the pixel position (X, Y) of the feature point to be measured in the multiple images obtained in step S200, unlike the previous embodiment, the incremental SFM method is used to directly solve the intra-camera parameter matrix K, the camera rotation matrix R, the translation t with respect to the world coordinate, and the coordinate λ (X, Y, Z) of the feature point to be measured in the world coordinate system, omitting the process of labeling the camera with the reference object, and then determining the scale coefficient λ using the reference object with known specifications, thereby obtaining the real spatial position coordinate (X, Y, Z) of the feature point. A possible implementation of solving the sparse reconstruction problem by using the incremental SFM method is described below with reference to fig. 7, which takes 3 images from different angles as an example.
Referring to fig. 7, fig. 7 is a schematic diagram illustrating a process of solving feature point spatial position information in a sparse reconstruction process of a foot-shaped three-dimensional feature extraction method based on machine vision in the embodiment of the present invention. As shown in fig. 7, the step of solving the sparse reconstruction problem by using the incremental SFM method specifically includes:
step 1: two images, Image1 and Image2, were randomly picked out of 3 images at different angles to determine an initial Image pair, and initial values of internal and external parameters [ R | t ] of cameras capturing the images, Image1 and Image2, were calculated by an incremental SFM method]Matrix: by using 5 sets of feature point pairs (longest toe apex and heel convex point, thumb ball outer side point and tail toe root outer side point, ankle point) in images Image1 and Image2, essence matrix E corresponding to images Image1 and Image2 is calculated by using 5-point method1And E2Wherein E ═ R | t]The camera rotation matrix R can be decomposed from the essential matrix E1、R2And translation t relative to world coordinates1、t2And (4) matrix. Then, combining the pixel positions of the feature point to be measured in the camera coordinate system obtained in the step S200 in the images Image1 and Image2 to construct an initial sparse model;
step 2: according to the initial sparse model constructed in the step 1, and a triangulation method is utilized to calculate the position coordinates lambda (X) of the feature point to be measured under the world coordinate system in the images Image1 and Image21,Y1,Z1) And λ (X)2,Y2,Z2);
Step 3: the pixel positions of the feature points of Image3 in the camera coordinate system obtained in step S200 are input into the initial sparse model obtained in step 2 to acquire the camera intrinsic and extrinsic parameter matrices [R|t], i.e. the camera rotation matrix R3 and translation t3 relative to world coordinates, and the initial sparse model is corrected with these camera parameters;
and 4, step 4: according to the sparse model corrected in the step 3, a triangulation method is used for calculating a space position coordinate lambda (X) of the characteristic point to be measured in the world coordinate system in the Image33,Y3,Z3);
Step 5: the feature-point position coordinates obtained in steps 2 and 4 are corrected with a Bundle Adjustment (BA) method to obtain an optimized sparse model.
Step 5 is repeated with the feature-point coordinates obtained from the remaining images, bundle-adjusting each time, until the error between the coordinates λ(X, Y, Z) of the feature point to be measured obtained in two successive computations is at or below a preset threshold.
Although only one specific implementation, solving the spatial position information of the feature points from three images with the incremental SFM method, is provided here, those skilled in the art will understand that the incremental SFM method can equally handle images from many different angles: while constructing the sparse model, the pixel positions of the feature points in each new image in the camera coordinate system are substituted in repeatedly, the camera intrinsic and extrinsic parameters are re-acquired, and the sparse model is corrected with them, until all acquired images have been added to the sparse model. It can be understood that the more viewing angles are acquired and the more iterations are computed, the more accurate the obtained camera parameters are, and the more accurate the spatial position information of the feature points in the world coordinate system calculated from the sparse model constructed in this way.
Step 6: with point A in fig. 4 as the origin of coordinates, the spatial coordinates of vertex D of the A4 paper, computed with the sparse model obtained in step 5 from the pixel position of vertex D in the camera coordinate system obtained in step S200, are (M, N, 0), whereas the true spatial position of vertex D is (210 mm, 297 mm, 0). The scale coefficient is therefore λ = M/210 (equivalently N/297, with the A4 dimensions in mm). Dividing the spatial coordinates λ(X, Y, Z) of the feature point to be measured in the world coordinate system obtained in step 5 by the scale coefficient λ then yields its true spatial position (X, Y, Z).
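As an illustration of steps 1 and 6 only (OpenCV assumed; the helper names are hypothetical, the intrinsics K are taken as known here whereas the text describes solving them as well, and a full incremental SFM pipeline would add feature matching, PnP registration of new images and bundle adjustment):

```python
import cv2
import numpy as np

def initialize_pair(pts1, pts2, K):
    """Step 1 sketch: estimate the essential matrix E of the initial
    image pair with the 5-point method and decompose it into R, t.
    pts1, pts2: float arrays of shape (N, 2), N >= 5 correspondences."""
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at origin
    P2 = K @ np.hstack([R, t])                         # second camera pose
    return P1, P2

def apply_scale(points_model, vertex_D_model):
    """Step 6 sketch: recover metric scale from vertex D of the A4 sheet.
    points_model holds the up-to-scale coordinates lambda*(X, Y, Z)."""
    lam = vertex_D_model[0] / 210.0   # true X extent of A4 paper is 210 mm
    return np.asarray(points_model) / lam   # true positions in mm
```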
Step S400: calculating first distance information and/or second distance information corresponding to a certain feature point to be measured based on the spatial position information and the preset three-dimensional feature category.
It should be noted that the first distance information is distance information between a certain feature point to be measured and other feature points to be measured, such as length, and the second distance information is vertical distance information between the certain feature point to be measured and a preset plane, such as height.
Specifically, taking the foot shape as an example, the spatial positions of the five feature points to be measured calculated in step S300 are obtained: the longest-toe vertex \((X_1, Y_1, Z_1)\), the heel convex point \((X_2, Y_2, Z_2)\), the outer point of the ball of the big toe \((X_3, Y_3, Z_3)\), the outer point of the little-toe root \((X_4, Y_4, Z_4)\) and the ankle point \((X_5, Y_5, Z_5)\). Using a distance formula such as the Euclidean distance formula

\[
d = \sqrt{(X_a - X_b)^2 + (Y_a - Y_b)^2 + (Z_a - Z_b)^2},
\]

the following can be calculated:

\[
\begin{aligned}
L &= \sqrt{(X_1 - X_2)^2 + (Y_1 - Y_2)^2 + (Z_1 - Z_2)^2}, \\
W &= \sqrt{(X_3 - X_4)^2 + (Y_3 - Y_4)^2 + (Z_3 - Z_4)^2}, \\
H &= Z_5.
\end{aligned}
\tag{4}
\]

Parameters L, W and H in equation (4) are the foot length, foot width and ankle height, respectively (the ankle height being the vertical distance from the ankle point to the A4 plane Z = 0).
Thus, the three parameters of foot length, foot width and ankle-point height are obtained. Although only one specific embodiment, calculating foot length, foot width and ankle-point height from the extracted three-dimensional feature points, is provided here, it will be understood by those skilled in the art that the three-dimensional feature extraction method of the present invention can also calculate other foot-shape parameters. For example, to calculate instep height, the images at the different angles must all contain the instep-point feature, and the instep height is then calculated by following in turn the steps of the three-dimensional feature extraction method described in the embodiments above.
In summary, in a preferred technical solution of the present invention, an image capturing device acquires images at different angles containing the five feature points to be measured (the longest-toe vertex, the heel convex point, the outer point of the ball of the big toe, the outer point of the little-toe root and the ankle point). The pixel position of each feature point in each image is determined by manual marking or by an automatic method, and the true spatial positions of the feature points are then obtained either by calibrating the camera parameters with the reference object followed by triangulation, or by solving the sparse reconstruction problem. Model reconstruction of the whole object is therefore unnecessary, which reduces the amount of computation and simplifies the model-building process. Finally, based on the spatial positions of the five feature points, the three parameters of foot length, foot width and ankle-point height are calculated with the Euclidean distance formula. By analogy, acquiring images of other feature points at different angles allows the corresponding foot-shape parameters to be calculated; for example, from images at different angles containing the instep point, the spatial position of the instep point can be computed by the steps above and the instep-height parameter obtained.
Further, based on the above method embodiments, the present invention also provides a storage device, where multiple programs are stored, and the programs may be suitable for being loaded by a processor to execute the machine vision-based three-dimensional feature extraction method described in the above method embodiments.
Furthermore, based on the above method embodiments, the present invention also provides a control apparatus, which includes a processor and a storage device, wherein the storage device may be adapted to store a plurality of programs, and the programs may be adapted to be loaded by the processor to execute the machine vision-based three-dimensional feature extraction method described in the above method embodiments.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (7)

1. A three-dimensional feature extraction method based on machine vision is characterized by comprising the following steps:
the method comprises the steps that a multi-angle image containing a reference object and preset feature points to be detected of a target object arranged relative to the reference object is obtained through a mobile camera, and the number of the multi-angle images is at least three;
extracting the position information of the feature point to be detected in each image, wherein the feature point to be detected is on a target object;
acquiring spatial position information of the feature points to be detected according to the position information of the feature points to be detected in each image, specifically, calibrating camera parameters by using a reference object, and calculating the spatial position information of the feature points to be detected by using a triangulation method;
calculating first distance information and/or second distance information corresponding to a certain feature point to be detected based on the spatial position information and a preset three-dimensional feature category;
the first distance information is distance information between the certain characteristic point to be measured and other characteristic points to be measured, and the second distance information is vertical distance information between the certain characteristic point to be measured and a preset plane; the certain feature point to be measured, the other feature points to be measured and the plane all depend on the three-dimensional feature category;
wherein the reference object is an object of known dimensions;
wherein the multi-angle image simultaneously comprises the reference object and the target object arranged relative to the reference object;
the step of "extracting the position information of the feature point to be detected in each image" is to obtain the pixel position (x, y) of the feature point to be detected in each image;
the model reconstruction is not performed on the whole target object, and the step of acquiring the spatial position information of the feature point to be detected according to the position information of the feature point to be detected in each image is to acquire the real spatial position (X, Y, Z) of the feature point to be detected.
2. The machine-vision-based three-dimensional feature extraction method according to claim 1, wherein the step of "extracting the position information of the feature point to be measured in each image" includes:
acquiring the pixel position of the characteristic point to be detected in a certain image by using a manual marking method;
and extracting the corresponding pixel positions of the feature points to be detected in other images by using a preset feature point matching method and according to the acquired pixel positions.
3. The machine-vision-based three-dimensional feature extraction method according to claim 1, wherein the step of "extracting the position information of the feature point to be measured in each image" includes:
acquiring the area shape corresponding to the area where the characteristic point to be detected in the target object is located;
acquiring a region to be detected corresponding to each image according to the region shape;
and acquiring the position information of the feature point to be detected in each image according to the relative position between the feature point to be detected and the shape of the region and each region to be detected.
4. The machine-vision-based three-dimensional feature extraction method according to claim 1, wherein the step of "extracting the position information of the feature point to be measured in each image" includes:
acquiring the position information of the feature point to be detected in each image by utilizing a pre-constructed neural network;
the neural network is a deep neural network which is based on a preset training set and trained by using a deep learning correlation algorithm.
5. The machine-vision-based three-dimensional feature extraction method according to any one of claims 1 to 4, wherein the step of acquiring spatial position information of the feature point to be measured from position information of the feature point to be measured in each of the images comprises:
and acquiring the Euclidean position of the feature point to be detected by using a triangulation method according to the position information of the feature point to be detected in each image and the internal and external parameters of the camera.
6. A storage device having stored therein a plurality of programs, characterized in that said programs are adapted to be loaded by a processor for performing the method of machine vision based three-dimensional feature extraction according to any of claims 1-5.
7. A control apparatus comprising a processor and a storage device adapted to store a plurality of programs, characterized in that the programs are adapted to be loaded by the processor to perform the machine vision based three-dimensional feature extraction method of any one of claims 1-5.
CN201811474153.4A 2018-12-04 2018-12-04 Three-dimensional feature extraction method and device based on machine vision Active CN109816724B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811474153.4A CN109816724B (en) 2018-12-04 2018-12-04 Three-dimensional feature extraction method and device based on machine vision
PCT/CN2019/105962 WO2020114035A1 (en) 2018-12-04 2019-09-16 Three-dimensional feature extraction method and apparatus based on machine vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811474153.4A CN109816724B (en) 2018-12-04 2018-12-04 Three-dimensional feature extraction method and device based on machine vision

Publications (2)

Publication Number Publication Date
CN109816724A CN109816724A (en) 2019-05-28
CN109816724B true CN109816724B (en) 2021-07-23

Family

ID=66601919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811474153.4A Active CN109816724B (en) 2018-12-04 2018-12-04 Three-dimensional feature extraction method and device based on machine vision

Country Status (2)

Country Link
CN (1) CN109816724B (en)
WO (1) WO2020114035A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816724B (en) * 2018-12-04 2021-07-23 中国科学院自动化研究所 Three-dimensional feature extraction method and device based on machine vision
CN110133443B (en) * 2019-05-31 2020-06-16 中国科学院自动化研究所 Power transmission line component detection method, system and device based on parallel vision
CN110223383A (en) * 2019-06-17 2019-09-10 重庆大学 A kind of plant three-dimensional reconstruction method and system based on depth map repairing
CN110796705B (en) * 2019-10-23 2022-10-11 北京百度网讯科技有限公司 Model error elimination method, device, equipment and computer readable storage medium
CN112070883A (en) * 2020-08-28 2020-12-11 哈尔滨理工大学 Three-dimensional reconstruction method for 3D printing process based on machine vision
CN112487979B (en) * 2020-11-30 2023-08-04 北京百度网讯科技有限公司 Target detection method, model training method, device, electronic equipment and medium
CN112541936B (en) * 2020-12-09 2022-11-08 中国科学院自动化研究所 Method and system for determining visual information of operating space of actuating mechanism
CN114113163B (en) * 2021-12-01 2023-12-08 北京航星机器制造有限公司 Automatic digital ray detection device and method based on intelligent robot
CN114841959B (en) * 2022-05-05 2023-04-04 广州东焊智能装备有限公司 Automatic welding method and system based on computer vision
CN115112098B (en) * 2022-08-30 2022-11-08 常州铭赛机器人科技股份有限公司 Monocular vision one-dimensional two-dimensional measurement method
CN116672082B (en) * 2023-07-24 2024-03-01 苏州铸正机器人有限公司 Navigation registration method and device for a surgical navigation ruler
CN118010751A (en) * 2024-04-08 2024-05-10 杭州汇萃智能科技有限公司 Machine vision detection method and system for workpiece defect detection

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7580546B2 (en) * 2004-12-09 2009-08-25 Electronics And Telecommunications Research Institute Marker-free motion capture apparatus and method for correcting tracking error
CN106204727A (en) * 2016-07-11 2016-12-07 北京大学深圳研究生院 Method and device for three-dimensional foot scanning and reconstruction
CN108305286B (en) * 2018-01-25 2021-09-07 哈尔滨工业大学深圳研究生院 Color coding-based multi-view stereoscopic vision foot type three-dimensional measurement method, system and medium
CN109816724B (en) * 2018-12-04 2021-07-23 中国科学院自动化研究所 Three-dimensional feature extraction method and device based on machine vision

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04349583A (en) * 1991-05-27 1992-12-04 Nippon Telegr & Teleph Corp <Ntt> Generalized Hough transform circuit
WO2002025592A2 (en) * 2000-09-22 2002-03-28 Hrl Laboratories, Llc SAR and FLIR image registration method
CN102376089A (en) * 2010-12-09 2012-03-14 深圳大学 Target correction method and system
CN102157013A (en) * 2011-04-09 2011-08-17 温州大学 System for fully automatically reconstructing foot-type three-dimensional surface from a plurality of images captured by a plurality of cameras simultaneously
CN102354457A (en) * 2011-10-24 2012-02-15 复旦大学 Generalized Hough transform-based method for detecting position of traffic signal lamp
CN105184857A (en) * 2015-09-13 2015-12-23 北京工业大学 Scale factor determination method in monocular vision reconstruction based on dot structured light ranging
JP2017191022A (en) * 2016-04-14 2017-10-19 有限会社ネットライズ Method for imparting actual dimensions to three-dimensional point cloud data, and position measurement of ducts and the like using the same
CN106127258A (en) * 2016-07-01 2016-11-16 华中科技大学 Target matching method
CN107767442A (en) * 2017-10-16 2018-03-06 浙江工业大学 Foot type three-dimensional reconstruction and measurement method based on Kinect and binocular vision

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Zhu T et al., "A generalized Hough transform template and its applications in computer vision", Journal of Computational Information Systems, 2005-09-30, Vol. 1, No. 3, full text *
Jun Shen et al., "Trinocular stereovision by generalized Hough transform", Intelligent Robots and Computer Vision XIV: Algorithms, Techniques, Active Vision, and Materials Handling, 1995-10-03, full text *
秦绪功 (Qin Xugong), "Research on a low-cost multi-camera stereo vision method for three-dimensional foot measurement" (低成本多目立体视觉脚型三维测量方法研究), China Master's Theses Full-text Database, Information Science and Technology, 2018-02-15, Vol. 2018, No. 2, Chapters 2-4, pp. I138-2199 *
史传飞 (Shi Chuanfei) et al., "Industrial photogrammetry technology and its implementation for large-scale equipment" (面向大型装备的工业摄影测量技术及实现), Aeronautical Manufacturing Technology, 2018-10-01, Vol. 61, No. 19, pp. 24-30 *

Also Published As

Publication number Publication date
WO2020114035A1 (en) 2020-06-11
CN109816724A (en) 2019-05-28

Similar Documents

Publication Publication Date Title
CN109816724B (en) Three-dimensional feature extraction method and device based on machine vision
CN112002014B (en) Fine structure-oriented three-dimensional face reconstruction method, system and device
CN107767442B (en) Foot type three-dimensional reconstruction and measurement method based on Kinect and binocular vision
US10460517B2 (en) Mobile device human body scanning and 3D model creation and analysis
JP5671281B2 (en) Position / orientation measuring apparatus, control method and program for position / orientation measuring apparatus
CN106705849B (en) Calibrating Technique For The Light-strip Sensors
Läbe et al. Automatic relative orientation of images
WO2018148841A1 (en) System, method, and apparatus for modelling feet and selecting footwear
US20130259403A1 (en) Flexible easy-to-use system and method of automatically inserting a photorealistic view of a two or three dimensional object into an image using a cd,dvd or blu-ray disc
CN103971378A (en) Three-dimensional reconstruction method of panoramic image in mixed vision system
JP2013524593A (en) Methods and configurations for multi-camera calibration
US11176738B2 (en) Method for calculating the comfort level of footwear
JP2011085971A (en) Apparatus, method, and program for processing image, recording medium, and image processing system
CN107025647B (en) Image tampering evidence obtaining method and device
Santos et al. Flexible three-dimensional modeling of plants using low-resolution cameras and visual odometry
US11475629B2 (en) Method for 3D reconstruction of an object
JP5976089B2 (en) Position / orientation measuring apparatus, position / orientation measuring method, and program
Ye et al. Accurate and dense point cloud generation for industrial measurement via target-free photogrammetry
JP7178803B2 (en) Information processing device, information processing device control method and program
US11816806B2 (en) System and method for foot scanning via a mobile computing device
Skabek et al. Comparison of photogrammetric techniques for surface reconstruction from images to reconstruction from laser scanning
Ni et al. Plant or tree reconstruction based on stereo vision
CN113128292A (en) Image identification method, storage medium and terminal equipment
Ni et al. 3D reconstruction of small plant from multiple views
Ni et al. 3D dense reconstruction of plant or tree canopy based on stereo vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant