WO2023024766A1 - Object size recognition method, readable storage medium, and object size recognition system - Google Patents

Object size recognition method, readable storage medium, and object size recognition system

Info

Publication number
WO2023024766A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
line
vertex
lines
boundary
Prior art date
Application number
PCT/CN2022/106607
Other languages
English (en)
French (fr)
Inventor
罗欢
徐青松
李青
Original Assignee
成都睿琪科技有限责任公司
Priority date
Filing date
Publication date
Application filed by 成都睿琪科技有限责任公司
Publication of WO2023024766A1



Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/13 - Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/181 - Segmentation; Edge detection involving edge growing; involving edge linking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/60 - Analysis of geometric attributes
    • G06T 7/66 - Analysis of geometric attributes of image moments or centre of gravity
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/75 - Determining position or orientation of objects or cameras using feature-based methods involving models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/85 - Stereo camera calibration
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Definitions

  • the invention relates to the technical field of object recognition, in particular to an object size recognition method, a readable storage medium and an object size recognition system.
  • the object of the present invention is to provide a method for identifying object size, a readable storage medium and an object size identifying system, so as to solve the existing problem that it is difficult to measure the size of an object.
  • a method for identifying object size which includes:
  • a three-dimensional space coordinate system is established according to a feature point matching method to determine the spatial position of the camera.
  • the step of obtaining two-dimensional position information of multiple object vertices in the image includes:
  • the two-dimensional position information of each of the object vertices in the two-dimensional image coordinate system is obtained.
  • the step of determining the actual position of each object vertex in the image includes:
  • For each vertex of the object perform corner detection in a preset area where the reference position of the vertex of the object is located;
  • the actual position of each object vertex in the image is determined according to the corner point detection result.
  • the preset area where the reference position of the object vertex is located is a circular area centered on the pixel at the reference position of the object vertex, with a radius equal to a first preset number of pixels;
  • the corner point detection is performed in the preset area where the reference position of the object vertices is located, including:
  • the determining the actual position of each object vertex in the image according to the corner detection result includes:
  • For each vertex of the object, if the corner detection result for that vertex contains a corner point, the position of that corner point is determined as the actual position of the vertex in the image; if the corner detection result for that vertex does not contain a corner point, the reference position of the vertex in the image is determined as its actual position in the image.
  • the step of obtaining vertices of multiple objects in the image includes:
  • For each boundary line area, determine a target boundary line corresponding to the boundary line area from a plurality of reference boundary lines;
  • the intersections of the edges of the object in the image are configured as the object vertices.
  • the steps of merging similar lines in the line graph to obtain multiple reference boundary lines include:
  • a plurality of reference boundary lines are determined from the plurality of target lines according to the boundary matrix.
  • a three-dimensional space coordinate system is established according to the feature point matching method, and the step of determining the spatial position of the camera includes:
  • the three-dimensional spatial positions of the two-dimensional feature points in each of the images are obtained, and then the spatial positions of the cameras corresponding to each of the images are obtained.
  • a readable storage medium on which a program is stored, and when the program is executed, the object size recognition method as described above is realized.
  • an object size recognition system, which includes a processor and a memory; a program is stored in the memory, and when the program is executed by the processor, the object size recognition method described above is realized.
  • the object size recognition method includes: obtaining at least two images of an object from different perspectives by shooting; obtaining two-dimensional position information of a plurality of object vertices in each image; establishing a three-dimensional space coordinate system according to a feature point matching method based on the at least two images and determining the spatial position of the camera; and selecting any one of the images and, based on the camera calibration parameter information and the spatial position of the camera, obtaining the three-dimensional space position information of the multiple object vertices and thereby the size of the object.
  • the size of the object can be obtained by taking at least two images of the object from different angles of view and combining them with the camera calibration parameter information, and the operation steps are simple, which overcomes the problem that the existing technology cannot measure the size of an object in space.
  • Fig. 1 is the flowchart of the object size recognition method of the embodiment of the present invention.
  • Fig. 2 is a schematic diagram of a photographed object according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of another photographed object according to an embodiment of the present invention.
  • Fig. 4 is a schematic diagram of line merging according to an embodiment of the present invention.
  • the object of the present invention is to provide a method for identifying object size, a readable storage medium and an object size identifying system, so as to solve the existing problem that it is difficult to measure the size of an object.
  • an embodiment of the present invention provides a method for identifying object size, which includes:
  • Step S1 Obtain at least two images of an object from different viewing angles by shooting. It can be understood that each image has a plurality of object vertices representing the object.
  • a binocular camera or a depth camera can be used for shooting, and in other embodiments, a mobile phone with more than two cameras can also be used for shooting.
  • Step S2 Obtain two-dimensional position information of a plurality of object vertices in each of the images.
  • the two-dimensional position information of object vertices here refers to the coordinates of each object vertex in the image coordinate system.
  • Step S3 Establishing a three-dimensional space coordinate system according to the feature point matching method based on at least two of the images, and determining the spatial position of the camera;
  • Step S4 Select any one of the images, and obtain the three-dimensional space position information of a plurality of object vertices based on the parameter information of the camera calibration and the spatial position of the camera, and then obtain the size of the object.
  • the size of the object can be obtained by taking at least two images of the object from different angles of view and combining them with the camera calibration parameter information, and the operation steps are simple, which overcomes the problem that the existing technology cannot measure the size of an object in space.
  • the object being photographed is a rectangle (for example, a business card) and has four edges (i.e., lines) A1-A4; the connection of two adjacent edges among these four edges constitutes an object vertex, that is, the business card in the image has four object vertices a1-a4.
  • the four edge lines B1-B4 of the photographed object in the image can be extended to obtain a virtual vertex in the lower left corner and a virtual vertex in the upper right corner of the photographed object; together with the actual vertices of the photographed object inside the image, these give the four object vertices b1-b4 of the object to be photographed.
  • the shape of the above rectangle is only an example of the object to be photographed, rather than limiting the shape of the object to be photographed, and the object to be photographed may also be in other plane or three-dimensional shapes. But preferably, the object to be photographed should have several vertices to facilitate subsequent identification and calculation.
  • step S2 is executed to acquire the two-dimensional position information of the object vertices.
  • the step of obtaining two-dimensional position information of a plurality of object vertices in the image includes:
  • Step SA21 Input the image into the trained vertex recognition model to obtain the relative position of each object vertex and its corresponding image vertex.
  • the vertex recognition model here can be implemented using machine learning technology and run on a general-purpose computing device or a special-purpose computing device, for example.
  • the vertex recognition model is a pre-trained neural network model.
  • the vertex recognition model can be implemented using a deep convolutional neural network (DEEP-CNN) and other neural networks.
  • the image is input into the vertex recognition model, and the vertex recognition model can recognize object vertices in the image to obtain the relative position of each object vertex and its corresponding image vertex.
  • the image vertices of the image refer to the vertices of the edge of the image.
  • the image is a rectangle, and the image vertices are respectively a5-a8.
  • the vertex recognition model may be established through machine learning training.
  • the training steps for the vertex recognition model include:
  • Step SA211 obtaining a training sample set, each sample image in the training sample set is marked with each object vertex of the object in the image, and the relative position of each object vertex and its corresponding image vertex;
  • Step SA212 obtain a test sample set, each sample image in the test sample set is also marked with each object vertex of the object in the image and the relative position of each object vertex and its corresponding image vertex, wherein the test sample set is different from the training sample set;
  • Step SA213 training the vertex recognition model based on the training sample set
  • Step SA214 testing the vertex recognition model based on the test sample set
  • Step SA215 when the test result indicates that the recognition accuracy of the vertex recognition model is less than the preset accuracy, increase the number of samples in the training sample set for retraining;
  • Step SA216 when the test result indicates that the recognition accuracy of the vertex recognition model is greater than or equal to the preset accuracy, the training is completed.
  • the type of the object to be measured is not particularly limited in the present invention, for example, it may be a two-dimensional object such as a business card, a test paper, a test sheet, a document, or an invoice, or a three-dimensional object.
  • a certain number of sample images marked with corresponding information are acquired, and the number of sample images prepared for each object type may be the same or different.
  • Each sample image may contain the entire area of the object (as shown in FIG. 2 ), or may only contain a partial area of the object (as shown in FIG. 3 ).
  • the sample images acquired for each object type may include images taken under different shooting angles and different lighting conditions as much as possible.
  • the corresponding information marked for each sample image may also include information such as shooting angle and illumination of the sample image.
  • the sample images that have been marked above can be divided into a training sample set for training the vertex recognition model and a test sample set for testing the training results.
  • the number of samples in the training sample set is significantly greater than the number of samples in the test sample set.
  • the number of samples in the test sample set can account for 5% to 20% of the total number of sample images, while the number of samples in the corresponding training sample set can account for 80% to 95% of the total number of sample images.
  • the number of samples in the training sample set and the test sample set can be adjusted as needed.
  • the vertex recognition model can be trained using the training sample set, and the recognition accuracy of the trained model can then be tested using the test sample set. If the recognition accuracy does not meet the requirement, the number of sample images in the training sample set is increased and the model is retrained with the updated training sample set, until the recognition accuracy of the trained vertex recognition model meets the requirement. If the recognition accuracy meets the requirement, the training ends. In one embodiment, whether training can end is judged by whether the recognition accuracy is lower than a preset accuracy. In this way, a trained vertex recognition model whose output accuracy meets the requirement can be used to identify object vertices in the image.
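  • A minimal Python sketch of the split-and-retrain procedure described above; train_model, evaluate_accuracy and collect_more_labelled_samples are hypothetical placeholders for the actual vertex-recognition training, testing and labeling routines, and the 90/10 split is just one choice inside the ranges given above.

```python
import random

def train_until_accurate(samples, target_accuracy=0.95, test_fraction=0.1):
    """Split the labelled sample images into a training set and a test set and
    retrain with more data until the test accuracy reaches the preset threshold
    (steps SA213-SA216). The helper functions are hypothetical placeholders."""
    random.shuffle(samples)
    split = int(len(samples) * (1 - test_fraction))
    train_set, test_set = samples[:split], samples[split:]
    model = train_model(train_set)                      # hypothetical trainer
    while evaluate_accuracy(model, test_set) < target_accuracy:
        train_set += collect_more_labelled_samples()    # hypothetical helper
        model = train_model(train_set)
    return model
```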
  • the adjacent edge lines can also be extended to obtain the object vertices b1 and b3 lying outside the sample image; these object vertices b1 and b3 are marked, and the relative positions of the object vertices b1-b4 and their corresponding image vertices are marked respectively.
  • when the vertex recognition model recognizes an image similar to FIG. 3, it can recognize not only the object vertices located inside the image but also the object vertices located outside the image, and it identifies the relative position of each object vertex to its corresponding image vertex. Further, when labeling the sample image, the object vertices located outside the image are obtained by extending the adjacent edge lines, but the trained vertex recognition model does not need to extend the edge lines when recognizing an image; instead, it can directly output the coordinates of the external object vertices relative to their corresponding image vertices.
  • in step SA211, when labeling the relative position of each object vertex of the object in the sample image and its corresponding image vertex, it is preferable to label the relative position of each object vertex with respect to the image vertex nearest to that object vertex.
  • the distance between the object vertex a1 and the image vertex a5 is the shortest, so the relative position of the object vertex a1 and the image vertex a5 is marked, that is, the coordinates of the object vertex a1 are converted into coordinates with the image vertex a5 as the origin. Similarly, for the object vertex a2, its coordinates are converted into coordinates with the image vertex a6 as the origin; for the object vertex a3, into coordinates with the image vertex a7 as the origin; and for the object vertex a4, into coordinates with the image vertex a8 as the origin.
  • the sample images labeled according to the above marking method are used to train the vertex recognition model, and the recognition result of the vertex recognition model is the relative position of each object vertex in the image with respect to the image vertex nearest to it:
  • the relative position of the object vertex a1 relative to the image vertex a5 (that is, the coordinates of the object vertex a1 with the image vertex a5 as the origin);
  • the relative position of the object vertex a2 relative to the image vertex a6 (that is, the coordinates of the object vertex a2 with the image vertex a6 as the origin);
  • the relative position of the object vertex a3 relative to the image vertex a7 (that is, the coordinates of the object vertex a3 with the image vertex a7 as the origin);
  • the relative position of the object vertex a4 relative to the image vertex a8 (that is, the coordinates of the object vertex a4 with the image vertex a8 as the origin).
  • Step SA22 Determine the actual position of each object vertex in the image according to the relative position of each object vertex and its corresponding image vertex.
  • the relative position of each object vertex and the image vertex closest to that object vertex is converted into the coordinates of the object vertex in the target coordinate system, so as to obtain the actual position of each object vertex in the image.
  • Step SA23 According to the actual position of each of the object vertices in the image, and using a reference point of the image as the origin of the two-dimensional image coordinate system, obtain the two-dimensional position information of each of the object vertices in that coordinate system.
  • the target coordinate system is a two-dimensional image coordinate system, the origin of which is a point in the image.
  • the coordinates of the object vertex a1 with the image vertex a5 as the origin;
  • the coordinates of the object vertex a2 with the image vertex a6 as the origin;
  • the coordinates of the object vertex a3 with the image vertex a7 as the origin;
  • the coordinates of the object vertex a4 with the image vertex a8 as the origin. Since the coordinates of each object vertex obtained at this point are not expressed in the same coordinate system, it is necessary to convert them into coordinates in a common coordinate system.
  • in step SA23, the coordinates of the above four object vertices are transformed into coordinates that use the same position point as the origin of a common coordinate system, so as to facilitate determining the actual positions of the respective object vertices in the image.
  • the positions of each object vertex of the image and of that position point are known, so the relative coordinates of each object vertex, with that position point as the coordinate origin, can be obtained.
  • the origin of the target coordinate system may be the center point of the image.
  • the origin of the target coordinate system is a certain image vertex of the image. Taking the image shown in Figure 2 as an example, the origin of the target coordinate system can be the image vertex a5, so the coordinate values of the object vertices a1-a4 with the image vertex a5 as the coordinate origin can be obtained, and the actual positions of the object vertices a1-a4 in the image are then known.
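  • The conversion into a common two-dimensional image coordinate system can be sketched as follows; the corner layout (a5 at the top-left and so on), the image size and the sample values are illustrative assumptions, and the relative coordinates are assumed to keep the image axis directions.

```python
def to_common_coords(rel_xy, corner_xy):
    # A vertex given relative to one image corner is shifted by that corner's
    # position in the common coordinate system (origin at the top-left corner here).
    return (corner_xy[0] + rel_xy[0], corner_xy[1] + rel_xy[1])

W, H = 640, 480                                    # assumed image size
corners = {"a5": (0, 0), "a6": (W, 0), "a7": (W, H), "a8": (0, H)}   # assumed layout
# Hypothetical model output: object vertex a2 expressed relative to its nearest corner a6.
a2_common = to_common_coords((-35, 12), corners["a6"])   # -> (605, 12)
```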
  • step S3 a three-dimensional space coordinate system is established according to the feature point matching method.
  • a three-dimensional space coordinate system is established according to the feature point matching method, and the step of determining the spatial position of the camera includes:
  • Step S31 extracting two-dimensional feature points that match each other in at least two images
  • Step S32 Obtain the constraint relationship of at least two images according to the matched two-dimensional feature points
  • Step S33 Based on the constraints, the three-dimensional spatial positions of the two-dimensional feature points in each of the images are obtained, and then the spatial positions of the cameras corresponding to each of the images are obtained.
  • the ORB algorithm is used to quickly find and extract all the two-dimensional feature points of each image, and the two-dimensional feature points will not change with the movement, rotation or illumination of the camera. Then, the two-dimensional feature points of each image are matched to extract mutually matching two-dimensional feature points in each image.
  • the two-dimensional feature point is made up of two parts: a key point (Keypoint) and a descriptor (Descriptor). The key point is the position of the two-dimensional feature point in the image, and in some cases also carries orientation and scale information;
  • the descriptor is usually a vector that, by design, describes the pixels surrounding the key point. Descriptors are designed so that features with a similar appearance have similar descriptors.
  • if the descriptors of two two-dimensional feature points are close to each other in the vector space, the two feature points can be considered to match each other.
  • in this embodiment, during matching, the key points in each image are extracted, the descriptor of each two-dimensional feature point is calculated from the position of its key point, and matching is performed on the descriptors to extract the mutually matching two-dimensional feature points in each image.
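  • A minimal OpenCV sketch of the ORB extraction and descriptor matching described here; the file names and the feature count are illustrative.

```python
import cv2

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)            # key points + binary descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance suits ORB's binary descriptors; cross-check keeps only
# matches that agree in both directions.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = [kp1[m.queryIdx].pt for m in matches]    # matched 2D feature points
pts2 = [kp2[m.trainIdx].pt for m in matches]
```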
  • the three-dimensional space position of the camera corresponding to a picture can be obtained from any one of the pictures (the lens orientation of the camera is always perpendicular to the two-dimensional plane of the captured picture). Furthermore, according to the position of the camera corresponding to each picture, all the two-dimensional feature points in each picture are converted into three-dimensional feature points to form a three-dimensional space, and a three-dimensional space coordinate system is established.
  • K is the intrinsic parameter matrix of the camera; that is to say, the fundamental matrix F between the images can be calculated from matched two-dimensional feature point pairs alone (at least 7 pairs), and F is then decomposed, together with K, to obtain the camera rotation matrix R and the translation vector t, which give the spatial position of the camera in the three-dimensional space coordinate system.
  • the homography matrix H can provide more constraints on each image.
  • the epipolar constraints of the two images are no longer applicable.
  • The homography matrix H can then be used to describe the relationship between these two images. It can be seen that both the fundamental matrix F and the homography matrix H can represent the constraint relationship between two images, but each has its own applicable scenario: the fundamental matrix represents the epipolar constraint, which requires the camera position to undergo both rotation and translation, while the homography matrix requires the camera to only rotate without translation. In this embodiment, an appropriate matrix is selected for the situation of the camera corresponding to each image. For the process of calculating the fundamental matrix and the homography matrix of each image, reference may be made to the prior art, which will not be described in detail in this embodiment.
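  • Continuing from the matched points of the previous sketch, the following shows one way to estimate the fundamental matrix, recover the rotation R and translation t, and fall back to a homography when the camera only rotates; the intrinsic matrix K shown is illustrative and would come from the camera calibration discussed below.

```python
import cv2
import numpy as np

pts1 = np.asarray(pts1, dtype=np.float64)       # matched points from the ORB sketch
pts2 = np.asarray(pts2, dtype=np.float64)

K = np.array([[800.0, 0.0, 320.0],              # illustrative intrinsics; use the
              [0.0, 800.0, 240.0],              # calibrated matrix in practice
              [0.0, 0.0, 1.0]])

F, f_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)   # needs at least 7 pairs
E = K.T @ F @ K                                                 # essential matrix from F and K
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)                  # rotation R, translation direction t

# If the camera only rotated (no translation), the epipolar constraint degenerates
# and a homography describes the relation between the two views instead.
H, h_mask = cv2.findHomography(pts1, pts2, cv2.RANSAC)
```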
  • After the spatial position of the camera is determined in step S3, in step S4 any one of the images is selected and, based on the camera calibration parameter information and the spatial position of the camera, the three-dimensional space position information of multiple object vertices in that image can be obtained, and the actual size of the object can then be derived.
  • the purpose of camera calibration is to determine the value of some parameter information of the camera.
  • this parameter information establishes the mapping relationship between the three-dimensional coordinate system determined by the calibration board and the camera image coordinate system; in other words, it can be used to map a point in three-dimensional space into the image space, or vice versa.
  • the parameters that need to be calibrated for the camera are usually divided into two parts: internal parameters (intrinsics) and external parameters (extrinsics).
  • the external parameters determine the position and orientation of the camera in a three-dimensional space, and the external parameter matrix represents how a point in a three-dimensional space (world coordinates) is rotated and translated, and then falls into the image space (camera coordinates).
  • the rotation and translation of the camera are both external parameters, which are used to describe the motion of the camera in a static scene, or the rigid motion of a moving object when the camera is fixed. Therefore, in image stitching or 3D reconstruction, it is necessary to use external parameters to find the relative motion between several images, so as to register them in the same coordinate system.
  • the internal parameters can be said to be the internal parameters of the camera, which are generally the inherent properties of the camera.
  • the internal parameter matrix represents how a point in three-dimensional space, after being projected into the image space, continues to pass through the camera lens and is converted into pixels through optical imaging and electronic conversion.
  • the real camera lens also has radial and tangential distortion, and these distortion parameters are also internal parameters of the camera, and these internal parameters can be obtained through pre-calibration.
  • the specific calibration method of the camera can be understood by those skilled in the art according to the prior art; for example, Zhang's calibration method can be used.
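  • A typical OpenCV implementation of a Zhang-style calibration looks like the sketch below; the checkerboard geometry, square size and image paths are assumptions.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)                                     # inner corners of the assumed checkerboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0  # 25 mm squares

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):                # calibration-board photos (assumed path)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the intrinsic matrix K, distortion coefficients and per-view extrinsics.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```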
  • step SA22 determines the actual position of each object vertex in the image according to the relative position of each object vertex and its corresponding image vertex, including:
  • Step SA221 Determine the reference position of each object vertex in the image according to the relative position of each object vertex and its corresponding image vertex;
  • Step SA222 For each object vertex, perform corner detection in the preset area where the reference position of the object vertex is located;
  • Step SA223 Determine the actual position of each object vertex in the image according to the corner point detection result.
  • the position of each object vertex in the image obtained from the relative position of each object vertex and its corresponding image vertex is not directly used as the actual position, but is instead taken as the reference position of that object vertex in the image. Corner detection is then performed at the reference position of each object vertex, and the actual position of each object vertex in the image is finally determined according to the corner detection result. Because the corner detection method is used to correct the positions of the object vertices, edge detection of the object with edges in the image is realized, and the accuracy of edge and vertex detection is improved.
  • step SA221 the relative position of each object vertex and the image vertex closest to the object vertex in the image is converted into the reference coordinates of the object vertex in the target coordinate system, and the reference position of each object vertex in the image is obtained .
  • a corner point is an extreme point, that is, a point whose attributes are particularly prominent in some respect, such as an isolated point or a line-segment end point with a maximal or minimal value of some attribute.
  • a corner point is usually defined as the intersection of two edges; more strictly, the local neighborhood of a corner point should contain the boundaries of two regions with different orientations.
  • corner detection methods detect image points with specific features, not just "corners". These feature points have specific coordinates in the image, and have certain mathematical characteristics, such as local maximum or minimum gray level, certain gradient characteristics, etc.
  • the basic idea of the corner detection algorithm is to slide a fixed window (a neighborhood window of a certain pixel) over the image in arbitrary directions and compare the degree of change of the pixel gray levels inside the window before and after sliding; if sliding in any direction causes a large grayscale change, it can be considered that there is a corner point inside the window.
  • any object vertex of an object with an edge corresponds to a corner point in the image.
  • the corner point corresponding to each object vertex is detected by performing corner point detection in the preset area where the reference position of each object vertex is located.
  • the preset area where the reference position of the object vertex is located is a circular area centered on the pixel at the reference position of the object vertex, with a radius equal to the first preset number of pixels; the first preset number of pixels is, for example, in the range of 10 to 20 pixels, preferably 15 pixels.
  • performing corner detection in the preset area where the reference position of the object vertex is located includes: performing corner detection on the pixels in the circular area corresponding to each of the object vertices; during corner detection, the pixels whose eigenvalue variation range is greater than a preset threshold are all taken as candidate corner points, and the target corner point corresponding to each object vertex is determined from the candidate corner points.
  • the variation range of the eigenvalue refers to the variation degree of the pixel gray level in the fixed window used for corner point detection. It can be understood that the smaller the variation range of the eigenvalue, the smaller the possibility that the pixel is a corner point.
  • corner detection algorithms include, for example, grayscale-based corner detection algorithms, binary-image-based corner detection algorithms, and contour-curve-based corner detection algorithms; for details, please refer to the prior art, which will not be repeated here.
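  • One way to realise the circular-area corner search is sketched below, with OpenCV's Shi-Tomasi detector standing in for the eigenvalue-based thresholding; the radius and quality values are illustrative.

```python
import cv2
import numpy as np

def candidate_corners(gray, ref_xy, radius=15, quality=0.01):
    """Detect candidate corner points inside a circular area of `radius` pixels
    around one vertex's reference position (a sketch of step SA222)."""
    mask = np.zeros(gray.shape, dtype=np.uint8)
    cv2.circle(mask, (int(ref_xy[0]), int(ref_xy[1])), radius, 255, -1)
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=10, qualityLevel=quality,
                                  minDistance=3, mask=mask)
    return [] if pts is None else [tuple(p.ravel()) for p in pts]
```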
  • the determining the target corner point corresponding to each object vertex from the candidate corner points includes:
  • Step SA2221 Sort the candidate corner points in descending order of the variation range of the eigenvalue, determine the first-ranked candidate corner point as a target corner point, and take the second-ranked candidate corner point as the current candidate corner point;
  • Step SA2222 Determine whether the distances between the current candidate corner point and all current target corner points are all greater than the second preset number of pixels; if yes, execute step SA2223; otherwise, execute step SA2224;
  • Step SA2223 Determine the current candidate corner point as a target corner point;
  • Step SA2224 Discard the current candidate corner point, take the next candidate corner point as the current candidate corner point, and return to step SA2222.
  • the eigenvalue change range of the first candidate corner point is the largest, so it is most likely to be a corner point, so it can be directly determined as the target corner.
  • the second-ranked candidate corner point may be located in the circular area of the same object vertex as the first-ranked candidate corner point (say, object vertex 1), or it may be located in the circular area of another object vertex (say, object vertex 2).
  • in the first case, since the first-ranked candidate corner point has already been determined as the target corner point of object vertex 1, the second-ranked candidate corner point cannot also be taken as the target corner point of object vertex 1.
  • in the second case, the second-ranked candidate corner point is the most likely corner pixel within the circular area of object vertex 2, so it should be determined as the target corner point of object vertex 2. Based on these considerations, in this embodiment it is judged whether the distance between the second-ranked candidate corner point and the existing target corner point is greater than the second preset number of pixels in order to decide which of the two cases applies: if the distance is greater than the second preset number of pixels, the second-ranked candidate corner point belongs to the second case; otherwise it belongs to the first case.
  • each candidate corner point is judged according to the above logic, so that multiple target corner points can be finally determined from each candidate corner point.
  • the second preset number of pixels can be set to at least 50 pixels, and its upper limit can be set according to the specific size of the image, which is not limited here.
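  • The selection logic of steps SA2221-SA2224 amounts to a greedy pass over the candidates sorted by response, as in the sketch below; candidates are (x, y, response) tuples and the 50-pixel minimum distance corresponds to the lower bound mentioned for the second preset number of pixels.

```python
import math

def select_target_corners(candidates, min_dist=50):
    # Sort by the variation range of the eigenvalue (the "response"), largest first.
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    targets = []
    for x, y, _ in ranked:
        # Keep a candidate only if it is far enough from every target chosen so far;
        # the first (strongest) candidate is always kept.
        if all(math.hypot(x - tx, y - ty) > min_dist for tx, ty in targets):
            targets.append((x, y))
    return targets
```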
  • there may be cases where a corner point cannot be detected.
  • for example, the preset area around the object vertex differs too little from the image background for a corner point to be detected,
  • or the object vertex lies outside the image (for example, object vertices b1 and b3 in Fig. 3) and there is no corner point at all.
  • in such cases, the object vertex itself can also be regarded as the corner point.
  • step SA223 the step of determining the actual position of each object vertex in the image according to the corner detection result includes:
  • For each vertex of the object, if the corner detection result for that vertex contains a corner point, the position of that corner point is determined as the actual position of the vertex in the image; if the corner detection result for that vertex does not contain a corner point, the reference position of the vertex in the image is determined as its actual position in the image.
  • object vertices around which corner points are detected can be replaced by the corresponding corner points as the actual vertices of the object.
  • that is, the position of the corner point is determined as the actual position of the object vertex in the image; if no corner point is included in the corner detection result for an object vertex, the reference position of that object vertex in the image is determined as its actual position in the image.
  • the actual position of the object vertex in the image can be corrected according to the detected coordinates of the corner point, so that the position detection of the object vertex is more accurate.
  • the recognition of the object vertices in the image may be different from the foregoing examples.
  • the object vertices are obtained by using edge intersection after recognizing the edges, instead of being directly recognized.
  • the step of obtaining multiple object vertices in the image in step S2 includes:
  • Step SB21 Process the image to obtain a line drawing of the grayscale contour in the image
  • Step SB22 merging similar lines in the line graph to obtain multiple reference boundary lines
  • Step SB23 Process the image through the trained boundary line area recognition model to obtain multiple boundary line areas of objects in the image;
  • Step SB24 For each boundary line area, determine a target boundary line corresponding to the boundary line area from a plurality of reference boundary lines;
  • Step SB25 Determine the edge of the object in the image according to the determined multiple target boundary lines
  • Step SB26 Configure the intersection point of the edge of the object in the image as the object vertex.
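  • Step SB26 reduces to intersecting pairs of adjacent target boundary lines; a line-intersection helper such as the following sketch (each line given by two points) is one way to implement it.

```python
def line_intersection(l1, l2):
    """Intersection of two infinite lines, each given as ((x1, y1), (x2, y2)).
    Returns None for (near-)parallel lines."""
    (x1, y1), (x2, y2) = l1
    (x3, y3), (x4, y4) = l2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:
        return None
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return (px, py)

# Example: an object vertex as the intersection of two adjacent target boundary lines.
vertex = line_intersection(((0, 0), (10, 0)), ((5, -3), (5, 7)))   # -> (5.0, 0.0)
```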
  • the image includes objects with edges, the line drawing includes multiple lines, and the line drawing is a grayscale image.
  • the edge here is not limited to a straight line edge, and it may also be an arc, a line segment with a small wave shape, a zigzag shape, or the like.
  • the image can be a grayscale image or a color image.
  • the image may be an original image directly captured by a camera, or an image obtained after preprocessing the original image.
  • an operation of preprocessing the image may also be included. Preprocessing can eliminate irrelevant information or noise information in the image, so as to process the image better.
  • step SB21 may include: processing the image through an edge detection algorithm to obtain a line drawing of the gray contour in the image.
  • the input image may be processed through an OpenCV-based edge detection algorithm to obtain a line drawing of the grayscale contour in the input image.
  • OpenCV is an open source computer vision library.
  • Edge detection algorithms available in OpenCV include Sobel, Canny, Laplacian, Prewitt, Marr-Hildreth, Scharr, and other algorithms. Those skilled in the art can select a suitable edge detection algorithm according to the prior art; no further explanation is given here.
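  • A minimal OpenCV sketch of step SB21 using the Canny detector; the file name, blur kernel and thresholds are illustrative.

```python
import cv2

img = cv2.imread("object.jpg")                    # input image (assumed path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)          # optional denoising before edge detection
edges = cv2.Canny(gray, 50, 150)                  # the grayscale-contour "line drawing"
```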
  • step SB21 may include: processing the image through a boundary region recognition model to obtain multiple boundary regions; processing the multiple boundary regions through an edge detection algorithm to obtain a line drawing of a grayscale contour in the image. For example, multiple boundary areas are processed to obtain multiple boundary area label boxes; edge detection algorithms are used to process multiple boundary area label boxes to obtain line drawings of grayscale contours in the image.
  • the boundary area recognition model can be implemented using machine learning techniques and run, for example, on a general-purpose computing device or a special-purpose computing device.
  • the boundary area recognition model is a neural network model obtained through pre-training.
  • the boundary area recognition model can be implemented using a neural network such as a deep convolutional neural network (DEEP-CNN).
  • the image is input into the boundary region recognition model, which recognizes the edges of the object in the image to obtain multiple boundary regions (i.e., the mask regions of the respective boundaries of the object); then the identified boundary regions are marked out, thereby determining a plurality of boundary region labeling boxes, for example by circumscribing each boundary region with a rectangular frame; finally, an edge detection algorithm (for example, the Canny edge detection algorithm) is used to process the labeled boxes of the multiple boundary regions to obtain the line drawing of the grayscale contour in the image.
  • the edge detection algorithm only needs to perform edge detection on the marked frame of the boundary area, and does not need to perform edge detection on the entire image, so that the calculation amount can be reduced and the processing speed can be improved.
  • the bounding area annotation frame marks a part of the image.
  • step SB21 may include: performing binarization processing on the image to obtain a binarized image; and filtering noise lines in the binarized image, thereby obtaining the line drawing of the grayscale contour in the image.
  • corresponding filtering rules can be set in advance to filter out various line segments and various relatively small lines in the binarized image, for example, so as to obtain a line drawing of the grayscale contour in the image.
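  • The binarize-and-filter variant of step SB21 might look like the following sketch; the Otsu threshold and the 100-pixel area cut-off are illustrative filtering rules rather than values from the description.

```python
import cv2
import numpy as np

gray = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Drop small connected components as "noise lines".
n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
clean = np.zeros_like(binary)
for i in range(1, n):                              # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] >= 100:
        clean[labels == i] = 255
```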
  • step SB22 merges similar lines in the line drawing, and the steps of obtaining multiple reference boundary lines include:
  • Step SB221 Merge similar lines in the line graph to obtain initial merged line groups, wherein the multiple initial merged line groups correspond one-to-one to the multiple boundary areas and each initial merged line group includes at least one initial merged line; according to the multiple initial merged line groups, a plurality of boundary connecting lines are determined, wherein the multiple boundary connecting lines correspond one-to-one to the multiple boundary areas and also one-to-one to the multiple initial merged line groups; the multiple boundary areas are respectively converted into multiple straight-line groups, wherein the multiple straight-line groups correspond one-to-one to the multiple boundary areas and each straight-line group includes at least one straight line; a plurality of average slopes corresponding to the plurality of straight-line groups are calculated, and the slopes of the plurality of boundary connecting lines are calculated respectively; for the i-th boundary connecting line among the plurality of boundary connecting lines, it is determined whether the difference between the slope of the i-th boundary connecting line and the average slope corresponding to the i-th boundary connecting line is lower than the second slope threshold, and if so, the i-th boundary connecting line can be used as a reference boundary line.
  • the difference between two slopes means the difference between the inclination angles corresponding to the two slopes.
  • the inclination angle corresponding to the slope of the i-th boundary connection line can represent the angle between the i-th boundary connection line with respect to a given direction (for example, horizontal direction or vertical direction), and the inclination angle corresponding to the average slope can be Indicates the angle between the line determined based on the average slope and the given direction.
  • the inclination angle of the i-th boundary connecting line (for example, the first inclination angle) and the inclination angle corresponding to the average slope that corresponds to the i-th boundary connecting line among the plurality of average slopes (for example, the second inclination angle) can be calculated; if the difference between the first inclination angle and the second inclination angle is higher than or equal to the second slope threshold, the i-th boundary connecting line is not used as a reference boundary line; and if the difference between the first inclination angle and the second inclination angle is lower than the second slope threshold, the i-th boundary connecting line can be used as a reference boundary line.
  • step SB221 similar lines among the multiple lines are combined to obtain multiple groups of initially combined lines, and a boundary matrix is determined according to the multiple initially combined lines.
  • the step of merging similar lines in the plurality of lines includes: acquiring a plurality of long lines in the plurality of lines, wherein each of the plurality of long lines is a line whose length exceeds a length threshold; according to the plurality of long lines , to obtain multiple merged line groups, wherein each merged line group in the multiple merged line groups includes at least two sequentially adjacent long lines, and any adjacent two long lines in each merged line group The included angles between are all smaller than the angle threshold; for each merged line group in multiple merged line groups, each long line in the merged line group is merged in turn to obtain the initial merged line corresponding to the merged line group, respectively for multiple Merge the merged line groups to obtain the initial merged lines in the multiple initial merged line groups.
  • the number of all the initial merged lines included in the multiple initial merged line groups is the same as the number of the multiple merged line groups, and all the initial merged lines included in the multiple initial merged line groups are in one-to-one correspondence with the multiple merged line groups . It should be noted that after the initial merged line corresponding to the merged line group is obtained based on the merged line group, the boundary area corresponding to the initial merged line can be determined based on the position of the initial merged line, so as to determine the initial merged line to which the initial merged line belongs. Merge groups of lines.
  • a long line in a line graph refers to a line whose length exceeds a length threshold among multiple lines in a line graph.
  • a line whose length exceeds 2 pixels is defined as a long line, that is, the length threshold is 2 pixels.
  • the disclosed embodiments include but are not limited thereto. In some other embodiments, the length threshold may also be 3 pixels, 4 pixels, and so on.
  • the merged line group can be obtained in the following way: first select a long line T1, and starting from this long line T1, judge in sequence whether the angle between each two adjacent long lines is smaller than the angle threshold; if it is judged that the angle between a certain long line T2 and the next adjacent long line is not smaller than the angle threshold, then the long line T1, the long line T2, and all the sequentially adjacent long lines between the long line T1 and the long line T2 form a merged line group.
  • "two adjacent long lines" means two long lines that are adjacent in physical position, that is, there are no other long lines between them.
  • the initial merged lines are lines that are longer than the individual long lines from which they were merged.
  • Fig. 4 is a schematic diagram of a line merging process provided by an embodiment of the present disclosure.
  • As shown in FIG. 4, first select the first long line A and judge whether the angle between the long line A and the adjacent long line B is smaller than the angle threshold. If the angle between A and B is smaller than the angle threshold, the long line A and the long line B belong to the same merged line group; then continue to judge whether the angle between the long line B and the adjacent long line C is smaller than the angle threshold. If it is, the long lines A, B and C all belong to the same merged line group; then continue to judge the angle between the long line C and the adjacent long line D. If the angle between C and D is also smaller than the angle threshold, the long lines A, B, C and D all belong to the same merged line group; then continue to judge the angle between the long line D and the adjacent long line E.
  • If the angle between the long line D and the long line E is not smaller than the angle threshold, the long line E and the long lines A/B/C/D do not belong to the same merged line group. At this point, the long line A, the long line B, the long line C, and the long line D can be regarded as one merged line group,
  • for example, the merged line group formed by the long line A, the long line B, the long line C, and the long line D may be the first merged line group.
  • starting from the long line E, it is sequentially judged whether the angle between each pair of adjacent long lines is smaller than the angle threshold, so that the long line G, the long line H, the long line I, and the long line J are found to belong to one merged line group
  • the merged line group composed of long line G, long line H, long line I, and long line J can be the second merged line group
  • long line M, long line N, and long line O also belong to a merged line group
  • the merged line group formed by the long line M, the long line N, and the long line O may be the third merged line group.
  • when merging, a long line can also be selected arbitrarily from the plurality of long lines, for example the long line D; the long lines adjacent to the long line D include the long line C and the long line E. It is then judged whether the angle between the long line D and the long line C is smaller than the angle threshold, and whether the angle between the long line D and the long line E is smaller than the angle threshold. Because the angle between the long line D and the long line C is smaller than the angle threshold, the long lines D and C belong to the same merged line group; because the angle between the long line D and the long line E is greater than the angle threshold, the long lines D and E belong to different merged line groups. Then, on the one hand, starting from the long line C, the angles between the other adjacent long lines can continue to be judged so as to determine the other long lines that belong to the same merged line group as the long line D; on the other hand, starting from the long line E, the angles between the other adjacent long lines can be judged so as to determine the other merged line groups.
  • the long line A, the long line B, the long line C, and the long line D belong to a merged line group
  • the long line G, the long line H, the long line I, and the long line J belong to a merged line group
  • the long line M, the long line N, and the long line O also belong to a merged line group.
  • the angle between two adjacent long lines is calculated by the following formula:
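  • A standard expression for the included angle between two adjacent long lines with slopes k1 and k2, given here as an assumed reconstruction of the formula referred to above, is

$$\theta = \left| \arctan k_1 - \arctan k_2 \right|,$$

i.e. the difference between the inclination angles of the two long lines.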
  • the value of the angle threshold can be set according to actual conditions.
  • the range of the angle threshold can be 0-20 degrees, preferably 0-10 degrees, more preferably 5 degrees, 15 degrees, etc.
  • merging two long lines refers to averaging the slopes of the two long lines to obtain the average value of the slopes, which is the slope of the combined lines.
  • the combination of two long lines is calculated based on the array form of the two long lines.
  • the two long lines are respectively the first long line and the second long line, and merging the two long lines means that the starting point of the first long line (i.e., the head of the line segment) and the end point of the second long line (i.e., the tail of the line segment) are directly connected to form a new, longer line; that is to say, in the coordinate system corresponding to the line drawing, the starting point of the first long line and the end point of the second long line are directly connected by a straight line to obtain the merged line.
  • the coordinate value of the pixel corresponding to the starting point of the first long line is used as the coordinate value of the pixel corresponding to the starting point of the merged line,
  • the coordinate value of the pixel corresponding to the end point of the second long line is used as the coordinate value of the pixel corresponding to the end point of the merged line,
  • and the coordinate values of the pixels corresponding to the starting point and the end point of the merged line form an array of the merged line, which is stored.
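  • In the array representation just described, merging reduces to connecting the start point of the first line with the end point of the second; a minimal sketch:

```python
def merge_two_lines(first, second):
    """Each line is stored as ((x_start, y_start), (x_end, y_end)); the merged
    line runs from the start of the first line to the end of the second."""
    return (first[0], second[1])

def merge_line_group(group):
    """Fold a merged line group (long lines in adjacency order) into its initial
    merged line, as in the A+B -> AB, AB+C -> ABC, ... example below."""
    merged = group[0]
    for line in group[1:]:
        merged = merge_two_lines(merged, line)
    return merged
```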
  • Each long line in each merged line group is merged sequentially to obtain a corresponding initial merged line.
  • the long line A, long line B, long line C, and long line D in the first merged line group are sequentially merged to obtain the initial merged line corresponding to that merged line group. For example, first the long line A is merged with the long line B to obtain a first combined line,
  • then the first combined line is merged with the long line C to obtain a second combined line,
  • and finally the second combined line is merged with the long line D to obtain the initial merged line 1 corresponding to the first merged line group.
  • Similarly, the long lines in the second merged line group are merged to obtain the initial merged line 2 corresponding to the second merged line group,
  • and the long lines in the third merged line group are merged to obtain the initial merged line 3 corresponding to the third merged line group.
  • the boundary matrix is determined by: redrawing the plurality of initial merged lines and the unmerged long lines, mapping the position information of the pixels of all the redrawn lines onto the whole image matrix, setting the values at the positions of the pixels of these lines in the image matrix to a first value, and setting the values at the positions of the other pixels to a second value, thereby forming the boundary matrix.
  • the boundary matrix can be a matrix with the same size as the image matrix; for example, if the size of the image is 1024 × 1024 pixels, the image matrix is a 1024 × 1024 matrix and the boundary matrix is also a 1024 × 1024 matrix. The plurality of initial merged lines and the unmerged long lines are redrawn according to a certain line width (such as a line width of 2), and the boundary matrix is filled with values according to the positions in the matrix corresponding to the pixels of the redrawn lines:
  • the positions in the matrix corresponding to pixels on the lines are set to the first value, such as 255, and the positions corresponding to pixels not on any line are set to the second value, such as 0, thus forming a matrix covering the entire picture, which is the boundary matrix.
  • since the multiple initial merged lines and the unmerged long lines are all stored in the form of arrays, they need to be rendered as actual line data when determining the boundary matrix; the lines are therefore redrawn, for example with a line width of 2, so as to obtain the coordinate values of the pixels on each line, and the boundary matrix is then filled with values according to the obtained coordinate values, for example by setting the values of the positions in the boundary matrix corresponding to these coordinate values to 255 and the values of the other positions to 0.
  • as an example, the boundary matrix can be a 10 × 10 matrix in which all positions with a value of 255, when connected, form the plurality of initial merged lines and the unmerged long lines.
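  • A sketch of how such a boundary matrix can be built with NumPy and OpenCV; the two demo lines in the 10 × 10 example are made up for illustration.

```python
import cv2
import numpy as np

def boundary_matrix(lines, shape, line_width=2):
    """Pixels covered by the redrawn lines get the first value (255), all other
    pixels the second value (0). `lines` holds the initial merged lines and the
    unmerged long lines, each stored as ((x1, y1), (x2, y2))."""
    mat = np.zeros(shape, dtype=np.uint8)
    for (x1, y1), (x2, y2) in lines:
        cv2.line(mat, (int(x1), int(y1)), (int(x2), int(y2)), 255, line_width)
    return mat

# Small 10x10 illustration with two made-up lines:
demo = boundary_matrix([((0, 2), (9, 2)), ((7, 0), (7, 9))], (10, 10), line_width=1)
```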
  • Step SB222 Merge similar lines among the plurality of initially merged lines to obtain a target line, and use the unmerged initially merged lines as target lines.
  • the initial merged lines obtained by merging in step SB221 are a plurality of longer lines.
  • in step SB222, it can continue to be judged, according to the merging rule of step SB221, whether similar lines exist among the multiple initial merged lines, so that similar lines are merged again to obtain multiple target lines; at the same time, the initial merged lines that cannot be merged are used as target lines.
  • Step a Obtain multiple groups of second-type lines from the multiple initial merging lines; wherein, the The second type of lines includes at least two sequentially adjacent initial merged lines, and the angle between any adjacent two initial merged lines is smaller than the third preset threshold; step b: for each group of second type of lines, Each initial combined line in the group of second type of lines is sequentially combined to obtain a target line.
  • the third preset threshold may be the same as or different from the second preset threshold, which is not limited in this embodiment, for example, the third preset threshold is set to an included angle of 10 degrees.
In the before/after comparison of line merging, after the above steps of merging the initial merged lines 1, 2 and 3, the angle between initial merged lines 1 and 2 is smaller than the third preset threshold, while the angle between initial merged line 3 and initial merged line 2 is larger than the third preset threshold. Therefore, initial merged lines 1 and 2 can be further merged into target line 12, and since initial merged line 3 cannot be merged, it is used directly as a target line. At this point multiple target lines have been obtained. Among these target lines there are not only the reference boundary lines but also some long interference lines, for example the longer lines obtained by merging the lines corresponding to text and graphics inside the object or to other objects outside it. These interference lines are removed by the subsequent processing (specifically the processing of step SB223 and step SB23) and the corresponding rules.
Step SB223: determine a plurality of reference boundary lines from the plurality of target lines according to the boundary matrix. Specifically, this includes: first, for each target line, extending the target line, determining a line matrix from the extended target line, comparing the line matrix with the boundary matrix, and counting the number of pixels on the extended target line that belong to the boundary matrix as the score of that target line, where the line matrix has the same size as the boundary matrix; then, determining a plurality of reference boundary lines from the target lines according to the scores of the individual target lines.
The line matrix can be determined in the following manner: the extended target line is redrawn, the position information of the pixels on the redrawn line is mapped into the whole image matrix, the values at the positions of the line pixels in the image matrix are set to the first value, and the values at all other positions are set to the second value, thereby forming the line matrix. The line matrix is formed in the same way as the boundary matrix, which is not repeated here. It should be noted that a target line is stored in the form of an array, that is, the coordinates of its start point and end point are stored, and after extension the extended target line is likewise stored as an array formed by the coordinates of its start point and end point. Therefore, when the extended target line is redrawn, it is redrawn with the same line width, for example a line width of 2, so as to obtain the coordinates of the pixels corresponding to the points on the extended target line; the line matrix is then filled according to these coordinates, that is, the positions in the line matrix corresponding to the coordinates are set to 255 and the remaining positions are set to 0.
The merged target lines are extended, and the target line whose pixels fall most often onto the initial merged lines of step SB222 and the unmerged long lines is taken as a reference boundary line. For each target line, it is judged how many of its pixels belong to the boundary matrix and a score is calculated. Specifically, the target line is extended, the extended line is formed into a line matrix in the same way the boundary matrix is formed, and the line matrix is compared with the boundary matrix to determine how many pixels fall into the boundary matrix, that is, how many pixels at the same positions in the two matrices share the same first value, for example 255; this count is the score. Several lines may still share the best score, so, according to the scores of the individual target lines, the best-scoring target lines are determined from the multiple target lines as the reference boundary lines. For example, comparing the line matrix formed by one extended target line with the boundary matrix above shows that 7 pixels of the extended target line fall into the boundary matrix, which gives the score of that target line.
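The scoring step might be sketched as follows; extending the target line well past the image borders and relying on OpenCV's internal clipping in cv2.line are implementation assumptions, not details fixed by the description.

```python
import numpy as np
import cv2

def score_target_line(target_line, boundary, line_width=2):
    """Extend the target line across the image, rasterize it the same way as
    the boundary matrix, and count how many of its pixels coincide with
    boundary pixels (value 255 in both matrices). The count is the score."""
    (x0, y0), (x1, y1) = target_line
    d = np.array([x1 - x0, y1 - y0], dtype=float)
    d /= (np.linalg.norm(d) + 1e-12)
    center = np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])
    far = max(boundary.shape) * 2                      # long enough to cross the image
    p0 = tuple(int(v) for v in (center - far * d))
    p1 = tuple(int(v) for v in (center + far * d))
    line_mat = np.zeros_like(boundary)
    cv2.line(line_mat, p0, p1, 255, line_width)        # cv2 clips to the image bounds
    return int(np.count_nonzero((line_mat == 255) & (boundary == 255)))
```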
The boundary area recognition model can be implemented using machine learning technology and run, for example, on a general-purpose or special-purpose computing device. It is a neural network model obtained through pre-training; for example, it can be implemented with a neural network such as a deep convolutional neural network (DEEP-CNN). First, the boundary area recognition model is established through machine learning training. The model can be trained through the following process: each image sample in an image sample set is annotated to mark the boundary line areas, the internal area and the external area of the object in that sample, and the neural network is then trained with the annotated image sample set to obtain the boundary area recognition model. The model established in this way can identify three parts in an image, namely the boundary line area, the internal area (that is, the area where the object is located) and the external area (that is, the area outside the object), so as to obtain the boundary areas of the image; at this stage, the edge outlines within the boundary areas are relatively thick. For example, the shape of the object may be a rectangle and the number of boundary areas may be 4; that is, the input image is recognized by the boundary area recognition model, so that four boundary areas corresponding to the four sides of the rectangle are obtained.
In some embodiments, the plurality of boundary areas includes a first boundary area, a second boundary area, a third boundary area and a fourth boundary area. In some embodiments, as shown in Fig. 2, the first boundary area may represent the area corresponding to boundary line A1, the second boundary area the area corresponding to boundary line A2, the third boundary area the area corresponding to boundary line A3, and the fourth boundary area the area corresponding to boundary line A4. In other embodiments, as shown in Fig. 3, the first boundary area may represent the area corresponding to boundary line B1, the second boundary area the area corresponding to boundary line B2, the third boundary area the area corresponding to boundary line B3, and the fourth boundary area the area corresponding to boundary line B4.
Step SB24: for each boundary line area, determine the target boundary line corresponding to that boundary line area from the plurality of reference boundary lines. This may include: first, calculating the slope of each reference boundary line; then, for each boundary line area, converting the boundary line area into a plurality of straight lines and calculating the average slope of these straight lines; and then judging whether there is, among the reference boundary lines, a reference boundary line whose slope matches the average slope. If such a line exists, it is determined to be the target boundary line corresponding to that boundary line area. The boundary line area can be converted into multiple straight lines by the Hough transform, and of course other conversion methods can also be used, which is not limited in this embodiment. In this embodiment the edge outline within a boundary line area is relatively thick; for each boundary line area, the Hough transform can convert the area into a plurality of straight lines with approximately equal slopes, whose average slope is computed and then compared with the slope of each reference boundary line, so as to judge whether the multiple reference boundary lines contain one whose slope matches the average slope, that is, to find the most similar reference boundary line as the target boundary line corresponding to that boundary line area. Since the slope of the determined target boundary line must not differ too much from the average slope, a comparison threshold is set when comparing the average slope with the slope of each reference boundary line: when the absolute value of the difference between the slope of a reference boundary line and the average slope is smaller than the comparison threshold, that reference boundary line is judged to match the average slope and is determined to be the target boundary line corresponding to the boundary line area.
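A rough sketch of this slope-matching step using OpenCV's probabilistic Hough transform; the Hough parameters and the 5-degree comparison threshold are illustrative assumptions, and the wrap-around of angles near 0/180 degrees is ignored for brevity.

```python
import numpy as np
import cv2

def pick_target_boundary_line(region_mask, reference_lines, slope_tol_deg=5.0):
    """Convert one boundary-line region (binary mask) into straight lines with
    the probabilistic Hough transform, average their inclination angles, and
    return the reference boundary line closest to that average angle, or None
    if no reference line matches within the tolerance."""
    segments = cv2.HoughLinesP(region_mask, rho=1, theta=np.pi / 180,
                               threshold=50, minLineLength=30, maxLineGap=10)
    if segments is None:
        return None
    angles = [np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180.0
              for x1, y1, x2, y2 in segments[:, 0]]
    mean_angle = float(np.mean(angles))

    def line_angle(line):
        (x0, y0), (x1, y1) = line
        return np.degrees(np.arctan2(y1 - y0, x1 - x0)) % 180.0

    best = min(reference_lines, key=lambda ln: abs(line_angle(ln) - mean_angle))
    return best if abs(line_angle(best) - mean_angle) < slope_tol_deg else None
```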
Further, for each boundary line area, if it is judged that none of the reference boundary lines has a slope matching the average slope, the following processing is performed: for each straight line obtained by converting that boundary line area, the line matrix formed by the straight line is compared with the boundary matrix, and the number of pixels on the straight line that belong to the boundary matrix is counted as the score of the straight line; the straight line with the best score is determined to be the target boundary line corresponding to that boundary line area. If several straight lines share the best score, the one that appears first according to the sorting algorithm is used as the best boundary line. The line matrix is determined in the same manner as before: the straight line is redrawn, the position information of the pixels on the redrawn line is mapped into the whole image matrix, the values at the positions of the line pixels are set to the first value and the values at all other positions are set to the second value, thereby forming the line matrix; this is similar to the formation of the boundary matrix and is not repeated here. In other words, if the target boundary line corresponding to a certain boundary line area cannot be found among the reference boundary lines, a corresponding line matrix is formed for each of the straight lines obtained by the Hough transform according to the matrix formation method described in step SB222 and step SB223, and the straight line whose pixels fall into the boundary matrix with the best score is taken as the target boundary line corresponding to that boundary line area. For the way a line matrix is compared with the boundary matrix to calculate the score of a straight line, reference can be made to the relevant description in step SB223, which is not repeated here.
In step SB25, after the multiple target boundary lines are determined, since each target boundary line corresponds to one boundary line area of the object in the image, the multiple target boundary lines constitute the edges of the object in the image. In the image shown in Fig. 2, the edges of the object are formed by the four longer lines in Fig. 2, namely the target boundary lines A1, A2, A3 and A4; in the image shown in Fig. 3, the edges of the object are formed by the four longer lines in Fig. 3, namely the target boundary lines B1, B2, B3 and B4. Further, in step SB26, after the edges of the object in the image are obtained, the intersections of these edges are configured as the object vertices. For the subsequent operation steps, please refer to step S2 to step S4, which are not repeated here.
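The vertex computation can be illustrated by a standard two-line intersection, as in the sketch below; treating the target boundary lines as infinite lines is an assumption that also covers vertices lying outside the image, as in Fig. 3.

```python
def line_intersection(l1, l2):
    """Intersection of two infinite lines, each given as two points
    ((x0, y0), (x1, y1)); returns None for (near-)parallel lines."""
    (x1, y1), (x2, y2) = l1
    (x3, y3), (x4, y4) = l2
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / denom
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / denom
    return (px, py)

# Example: for the rectangle of Fig. 2, the four object vertices are the
# intersections of adjacent target boundary lines A1&A2, A2&A3, A3&A4, A4&A1.
```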
This embodiment also provides a readable storage medium on which a program is stored; when the program is executed, the object size recognition method described above is implemented. Further, this embodiment also provides an object size recognition system, which includes a processor and a memory; a program is stored in the memory, and when the program is executed by the processor, the object size recognition method described above is implemented.
In summary, the object size recognition method includes: obtaining at least two images of an object from different viewing angles by shooting; for each image, obtaining the two-dimensional position information of a plurality of object vertices in it; establishing a three-dimensional space coordinate system according to a feature-point matching method based on the at least two images and determining the spatial position of the camera; and selecting any one of the images and, based on the camera calibration parameter information and the spatial position of the camera, obtaining the three-dimensional spatial position information of the multiple vertices and thus the size of the object. With this configuration, the size of an object can be obtained by taking at least two images of the object from different viewing angles and combining them with the camera calibration parameters; the operation steps are simple, which overcomes the problem that the existing technology cannot measure the size of an object in space.
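As an illustration of this final step, the sketch below triangulates the matched object vertices from two calibrated views with OpenCV. The function name, and the assumption that the intrinsics K and the relative pose (R, t) have already been obtained from calibration and feature matching, are not details taken from the patent.

```python
import numpy as np
import cv2

def measure_object(K, R, t, vertices1, vertices2):
    """Triangulate the object vertices seen in two views and return their 3D
    positions together with the edge lengths between consecutive vertices."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at the origin
    P2 = K @ np.hstack([R, t.reshape(3, 1)])            # second camera pose
    pts1 = np.asarray(vertices1, dtype=np.float64).T    # shape (2, N), same vertex order
    pts2 = np.asarray(vertices2, dtype=np.float64).T
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)     # homogeneous 3D points, 4 x N
    X = (X_h[:3] / X_h[3]).T                            # shape (N, 3)
    # Edge lengths between consecutive vertices; note that t recovered from an
    # essential-matrix decomposition is only known up to scale, so a metric
    # size needs the scale fixed by calibration or another known quantity.
    sizes = [np.linalg.norm(X[i] - X[(i + 1) % len(X)]) for i in range(len(X))]
    return X, sizes
```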

Abstract

本发明提供一种物体尺寸识别方法、可读存储介质及物体尺寸识别系统,所述物体尺寸识别方法包括:通过拍摄获取一物体的至少两张不同视角的图像;对每张所述图像,分别获取其中的多个对象顶点的二维位置信息;根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,得到多个顶点的三维空间位置信息,进而得到所述物体的尺寸。如此配置,通过拍摄获取一物体的至少两张不同视角的图像,结合相机标定的参数信息,即可得到物体的尺寸,操作步骤简便,克服了现有技术无法对空间中的物体的尺寸进行测量的问题。

Description

物体尺寸识别方法、可读存储介质及物体尺寸识别系统 技术领域
本发明涉及对象识别技术领域,特别涉及一种物体尺寸识别方法、可读存储介质及物体尺寸识别系统。
背景技术
当身边没有测量工具或者待测量的物体不在身边时,如何对物体尺寸进行测量是一个难题。目前,人们经常对物体进行拍照,可以获取物体的图像,然而拍摄获得的图像中的物体并没有标尺,无法得知物体的实际尺寸。如何简单地对空间中的物体的尺寸进行测量是一件亟待解决的问题。
发明内容
本发明的目的在于提供一种物体尺寸识别方法、可读存储介质及物体尺寸识别系统,以解决现有难以测量物体尺寸的问题。
为解决上述技术问题,根据本发明的第一个方面,提供了一种物体尺寸识别方法,其包括:
通过拍摄获取一物体的至少两张不同视角的图像;
对每张所述图像,分别获取其中的多个对象顶点的二维位置信息;
根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及
选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,得到多个对象顶点的三维空间位置信息,进而得到所述物体的尺寸。
可选的,获取所述图像中多个对象顶点的二维位置信息的步骤包括:
将所述图像输入经训练的顶点识别模型,得到每个所述对象顶点及与其对应的图像顶点的相对位置;
根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置;
根据每个所述对象顶点在所述图像中的实际位置,以所述图像的一参考点为二维图像坐标系的坐标原点,得到各个所述对象顶点在所述二维图像坐标系中的二维位置信息。
可选的,根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置的步骤包括:
根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的参考位置;
针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测;
根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置。
可选的,所述对象顶点的参考位置所处的预设区域为以所述对象顶点的参考位置处的像素点为圆心、以第一预设像素为半径的圆形区域;
所述针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测,包括:
对各个所述对象顶点对应的圆形区域内的像素点进行角点检测,在角点检测过程中,将特征值变化幅度大于预设阈值的像素点均作为候选角点,从所述候选角点中确定各个对象顶点对应的目标角点。
可选的,所述根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置,包括:
针对每个所述对象顶点,若该对象顶点的角点检测结果中包含一个角点,则将该角点的位置确定为该对象顶点在所述图像中实际位置,若该对象顶点的角点检测结果中不包含角点,则将该对象顶点在所述图像中的参考位置确定为该对象顶点在所述图像中实际位置。
可选的,获取所述图像中多个对象顶点的步骤包括:
对所述图像进行处理,获得所述图像中灰度轮廓的线条图;
将所述线条图中相似的线条进行合并,得到多条参考边界线;
通过经训练的边界线区域识别模型对所述图像进行处理,得到所述图像中物体的多个边界线区域;
针对每个所述边界线区域,从多条所述参考边界线中确定与该边界线区域相对应的目标边界线;
根据确定的多条所述目标边界线确定所述图像中物体的边缘;
将所述图像中物体的边缘的交点配置为所述对象顶点。
可选的,将所述线条图中相似的线条进行合并,得到多条参考边界线的步骤包括:
将所述线条图中相似的线条进行合并,得到多条初始合并线条,并根据多条所述初始合并线条确定一边界矩阵;
将多条所述初始合并线条中相似的线条进行合并得到目标线条,并且将未合并的所述初始合并线条也作为目标线条;
根据所述边界矩阵,从多条所述目标线条中确定多条参考边界线。
可选的,根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置的步骤包括:
提取至少两张所述图像中相互匹配的二维特征点;
根据相互匹配的所述二维特征点,得到至少两张所述图像的约束关系;
基于所述约束关系,得到每张所述图像中的二维特征点的三维空间位置,进而得到每张所述图像所对应的相机的空间位置。
为解决上述技术问题,根据本发明的第二个方面,还提供了一种可读存储介质,其上存储有程序,所述程序被执行时实现如上所述的物体尺寸识别方法。
为解决上述技术问题,根据本发明的第三个方面,还提供了一种物体尺寸识别系统,其包括处理器和存储器,所述存储器上存储有程序,所述程序被所述处理器执行时,实现如上所述的物体尺寸识别方法。
综上所述,在本发明提供的物体尺寸识别方法、可读存储介质及物体尺寸识别系统中,所述物体尺寸识别方法包括:通过拍摄获取一物体的至少两张不同视角的图像;对每张所述图像,分别获取其中的多个对象顶点的二维位置信息;根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及选取任意一张所述图像,基于所述相机标定的参 数信息以及所述相机的空间位置,得到多个顶点的三维空间位置信息,进而得到所述物体的尺寸。
如此配置,通过拍摄获取一物体的至少两张不同视角的图像,结合相机标定的参数信息,即可得到物体的尺寸,操作步骤简便,克服了现有技术无法对空间中的物体的尺寸进行测量的问题。
附图说明
本领域的普通技术人员将会理解,提供的附图用于更好地理解本发明,而不对本发明的范围构成任何限定。其中:
图1是本发明实施例的物体尺寸识别方法的流程图;
图2是本发明实施例的一个被拍摄的物体的示意图;
图3是本发明实施例的另一个被拍摄的物体的示意图;
图4是本发明实施例的线条合并的示意图。
具体实施方式
为使本发明的目的、优点和特征更加清楚,以下结合附图和具体实施例对本发明作进一步详细说明。需说明的是,附图均采用非常简化的形式且未按比例绘制,仅用以方便、明晰地辅助说明本发明实施例的目的。此外,附图所展示的结构往往是实际结构的一部分。特别的,各附图需要展示的侧重点不同,有时会采用不同的比例。
如在本说明书中所使用的,单数形式“一”、“一个”以及“该”包括复数对象,术语“或”通常是以包括“和/或”的含义而进行使用的,术语“若干”通常是以包括“至少一个”的含义而进行使用的,术语“至少两个”通常是以包括“两个或两个以上”的含义而进行使用的,此外,术语“第一”、“第二”、“第三”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”、“第三”的特征可以明示或者隐含地包括一个或者至少两个该特征。对于本领域的普通技术人员而言,可以根据具体情况理解上述术语在本说明书中的具体 含义。
本发明的目的在于提供一种物体尺寸识别方法、可读存储介质及物体尺寸识别系统,以解决现有难以测量物体尺寸的问题。
以下参考附图进行描述。
请参考图1,本发明实施例提供一种物体尺寸识别方法,其包括:
步骤S1:通过拍摄获取一物体的至少两张不同视角的图像。可以理解的,每张所述图像均具有表示所述物体的多个对象顶点。在一些实施例中,可采用双目相机或者深度相机来拍摄,在另一些实施例中,也可采用具有两个以上摄像头的手机来拍摄。
步骤S2:对每张所述图像,分别获取其中的多个对象顶点的二维位置信息。这里的对象顶点的二维位置信息,指各对象顶点在该图像坐标系下的坐标。
步骤S3:根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及
步骤S4:选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,得到多个对象顶点的三维空间位置信息,进而得到所述物体的尺寸。
如此配置,通过拍摄获取一物体的至少两张不同视角的图像,结合相机标定的参数信息,即可得到物体的尺寸,操作步骤简便,克服了现有技术无法对空间中的物体的尺寸进行测量的问题。
请参考图2,在一个示范例中,被拍摄的物体为一矩形(例如一张名片),其具有四个边缘(即线条)A1~A4,这四个边缘中相邻两个边缘的连接处构成一个对象顶点,即该图像中的名片具有四个对象顶点a1~a4。请参考图3,在另一个示范例中,由于图像并未将整个被拍摄物体的区域全部拍摄进来,因此,该被拍摄物体的左下角顶点和右上角顶点并未包含在所述图像中,针对这种情况,可以对所述图像中该被拍摄物体的4个边缘线条B1~B4进行延伸,得到该被拍摄物体的左下角的虚拟顶点和右上角的虚拟顶点,与被拍摄 得到的实际的顶点一同,得到该被拍摄物体的4个对象顶点b1~b4。当然上述矩形的形状仅为被拍摄的物体的一个示范例,而非对被拍摄的物体的形状的限定,被拍摄的物体也可以是其它的平面或立体的形状。但优选的,被拍摄的物体应具有若干顶点,以便于后续的识别和计算。
在识别得到图像中的对象顶点后,即执行步骤S2,来获取对象顶点的二维位置信息。在一个可替代的示范例中,获取所述图像中多个对象顶点的二维位置信息的步骤包括:
步骤SA21:将所述图像输入经训练的顶点识别模型,得到每个所述对象顶点及与其对应的图像顶点的相对位置。这里的顶点识别模型如可以采用机器学习技术实现并且例如运行在通用计算装置或专用计算装置上。所述顶点识别模型为预先训练得到的神经网络模型。例如,所述顶点识别模型可以采用深度卷积神经网络(DEEP-CNN)等神经网络实现。在一些实施例中,将所述图像输入所述顶点识别模型,所述顶点识别模型可以识别出所述图像中的对象顶点,以得到每个所述对象顶点与其对应的图像顶点的相对位置。需理解,所述图像的图像顶点指的是图像边缘的顶点,例如在图2中,所述图像为矩形,图像顶点分别为a5~a8。
可选的,顶点识别模型可以是通过机器学习训练建立的。在一个示范例中,顶点识别模型的训练步骤包括:
步骤SA211,获取训练样本集,所述训练样本集中的每一样本图像标注有图像中对象的各个对象顶点,以及各个对象顶点与其对应的图像顶点的相对位置;
步骤SA212,获取测试样本集,所述测试样本集中的每一样本图像也标注有图像中对象的各个对象顶点,以及各个对象顶点与其对应的图像顶点的相对位置,其中,所述测试样本集不同于所述训练样本集;
步骤SA213,基于所述训练样本集对所述顶点识别模型进行训练;
步骤SA214,基于所述测试样本集对所述顶点识别模型进行测试;
步骤SA215,在所述测试结果指示所述顶点识别模型的识别准确率小于预设准确率时,增加所述训练样本集中的样本数量进行再次训练;以及
步骤SA216,在所述测试结果指示所述顶点识别模型的识别准确率大于或等于所述预设准确率时,完成训练。
可选的,本发明对待测量的物体的类型不作特别的限定,例如可为名片、试卷、化验单、文档、发票等二维的物体,也可以是三维的物体。针对每种物体的对象类型,均获取一定数量的标注有对应信息的样本图像,为每种对象类型准备的样本图像的数量可以相同也可以不同。每个样本图像中可以包含对象的全部区域(如图2所示),也可以仅包含对象的部分区域(如图3所示)。为每种对象类型获取的样本图像可以尽可能包括不同拍摄角度、不同光照条件下拍摄的图像。在这些情况下,为每个样本图像标注的对应信息还可以包括该样本图像的拍摄角度、光照等信息。
可以将经过上述标注处理的样本图像划分为用于训练所述顶点识别模型的训练样本集和用于对训练结果进行测试的测试样本集。通常训练样本集内的样本的数量明显大于测试样本集内的样本的数量,例如,测试样本集内的样本的数量可以占总样本图像数量的5%到20%,而相应的训练样本集内的样本的数量可以占总样本图像数量的80%到95%。本领域技术人员应该理解的是,训练样本集和测试样本集内的样本数量可以根据需要来调整。
可以利用训练样本集对所述顶点识别模型进行训练,并利用测试样本集对经过训练的所述顶点识别模型的识别准确率进行测试。若识别准确率不满足要求,则增加训练样本集中的样本图像的数量,并利用更新的训练样本集重新对所述顶点识别模型进行训练,直到经过训练的所述顶点识别模型的识别准确率满足要求为止。若识别准确率满足要求,则训练结束。在一个实施例中,可以基于识别准确率是否小于预设准确率来判断训练是否可以结束。如此,输出准确率满足要求的经过训练的所述顶点识别模型可以用于进行所述图像中对象顶点的识别。
需要说明的是,若采用如图3所示的图像作为样本图像,在标注时,除了将样本图像内的对象顶点b2、b4标注出来,还可以对相邻的边缘线条进行延长以获得样本图像外的对象顶点b1、b3,并将对象顶点b1、b3也标注出来,同时还分别标注对象顶点b1~b4与其对应的图像顶点的相对位置。
如此,将按照上述标注方式进行标注后的样本图像用于训练所述顶点识别模型,则所述顶点识别模型在识别类似图3的图像时,不仅能够识别出位于图像内的对象顶点,还能识别出位于图像外的对象顶点,以及识别出各个对象顶点与其对应的图像顶点的相对位置。进一步的,在标注样本图像时是通过延长相邻的边缘线条来获取位于图像外部的对象顶点的,但是训练完成后的所述顶点识别模型在识别图像时,并不需要延长边缘线条来获取图像外部的对象顶点,而是能够直接获得外部的对象顶点与其对应的图像顶点的坐标。
优选的,在所述顶点识别模型的训练步骤中,步骤SA211中,在标注样本图像中对象的各个对象顶点与其对应的图像顶点的相对位置时,优选标注每一所述对象顶点距离该对象顶点最近的图像顶点的相对位置。以图2所示图像为样本图像为例,对象顶点a1与图像顶点a5的距离最近,因此标注对象顶点a1与图像顶点a5的相对位置,即针对对象顶点a1,将对象顶点a1的坐标转换为以图像顶点a5为原点的坐标,同理,针对对象顶点a2,将对象顶点a2的坐标转换为以图像顶点a6为原点的坐标,针对对象顶点a3,将对象顶点a3的坐标转换为以图像顶点a7为原点的坐标,针对对象顶点a4,将对象顶点a4的坐标转换为以图像顶点a8为原点的坐标。
如此,将按照上述标注方式进行标注后的样本图像用于训练所述顶点识别模型,则所述顶点识别模型的识别结果是识别出所述图像中每一对象顶点相对于与所述图像距离该对象顶点最近的图像顶点的相对位置。
以图2所示的图像为例,通过所述顶点识别模型识别后,可以得到对象顶点a1相对于图像顶点a5的相对位置(即以图像顶点a5为原点时对象顶点a1的坐标),对象顶点a2相对于图像顶点a6的相对位置(即以图像顶点a6为原点时对象顶点a2的坐标),对象顶点a3相对于图像顶点a7的相对位置(即以图像顶点a7为原点时对象顶点a3的坐标),对象顶点a4相对于图像顶点a8的相对位置(即以图像顶点a8为原点时对象顶点a4的坐标)。
步骤SA22:根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置。
在一些实施例中,将各个对象顶点与所述图像中距离该对象顶点最近的图像顶点的相对位置转换为该对象顶点在目标坐标系中的坐标,得到各个对象顶点在所述图像中的实际位置。
步骤SA23:根据每个所述对象顶点在所述图像中的实际位置,以所述图像的一参考点为二维图像坐标系的坐标原点,得到各个所述对象顶点在所述二维图像坐标系中的二维位置信息。
优选的,所述目标坐标系为二维图像坐标系,其原点为所述图像中的一位置点。以图2所示的图像为例,在步骤SA21中获得了以图像顶点a5为原点时对象顶点a1的坐标,以图像顶点a6为原点时对象顶点a2的坐标,以图像顶点a7为原点时对象顶点a3的坐标,以图像顶点a8为原点时对象顶点a4的坐标。由于此时获得的各个对象顶点的坐标不是同一坐标系内的坐标,因此需要对各个对象顶点的坐标进行转换,转换为在同一坐标系中的坐标,具体的,在步骤SA23中,可以将上述4个对象顶点的坐标转换为以同一个位置点作为共同的坐标系原点的坐标,从而便于确定各个对象顶点在所述图像中的实际位置。
由于所述的同一个位置点是所述图像中的一个特定位置,因此所述图像的各个图像顶点与该位置点的相对坐标是已知的,进而可以求得各个对象顶点以该位置点为坐标系原点时的相对坐标。
例如,在一些实施例中,目标坐标系的原点可以为所述图像的中心点。在另一些实施例中,目标坐标系的原点为所述图像的某一图像顶点。以图2所示的图像为图像为例,所述目标坐标系的原点如可以为图像顶点a5,因此可以获得在以所述图像顶点a5为坐标系原点时,对象顶点a1~a4的坐标值,进而也就得知对象顶点a1~a4在所述图像中的实际位置。
在完成步骤S2获取对象顶点的二维位置信息后,步骤S3中,根据特征点匹配法建立三维空间坐标系。优选的,根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置的步骤包括:
步骤S31:提取至少两张所述图像中相互匹配的二维特征点;
步骤S32:根据相互匹配的所述二维特征点,得到至少两张所述图像的约 束关系;
步骤S33:基于所述约束关系,得到每张所述图像中的二维特征点的三维空间位置,进而得到每张所述图像所对应的相机的空间位置。
在一个示范例中,采用ORB算法快速找到各图像的所有二维特征点并提取,所述二维特征点不会随着相机的移动、旋转或者光照的变化而变化。接着将各图像的二维特征点进行匹配,以提取各图像中相互匹配的二维特征点。所述二维特征点由两部分构成:关键点(Keypoint)和描述子(Descriptor),关键点指的是该二维特征点在图像中的位置,有些还具有方向、尺度信息;描述子通常是一个向量,按照设计的方式,描述关键点周围像素的信息。通常描述子是按照外观相似的特征应该有相似的描述子设计的,因此,在匹配的时候,只要两个二维特征点的描述子在向量空间的距离相近,就可以认为它们是相互匹配的特征点。本实施例中,在匹配时,提取各图像中的关键点,根据关键点的位置计算出各二维特征点的描述子,根据所述描述子进行匹配,以提取出各图像中相互匹配的二维特征点。当然提取所述二维特征点还有其他方式,例如粗暴匹配或邻近搜索等,此处不再一一举例,本领域技术人员可根据实际选用。
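As an illustration of the ORB-based matching described above, a minimal OpenCV sketch is given below; the feature count, the cross-check brute-force matching and the match limit are illustrative choices rather than values from the specification.

```python
import cv2

def match_features(img1, img2, max_matches=500):
    """Detect ORB keypoints/descriptors in two views and return the matched
    2D point coordinates in each image."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)    # keypoints + descriptors, view 1
    kp2, des2 = orb.detectAndCompute(img2, None)    # keypoints + descriptors, view 2
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:max_matches]
    pts1 = [kp1[m.queryIdx].pt for m in matches]    # matched 2D points in image 1
    pts2 = [kp2[m.trainIdx].pt for m in matches]    # corresponding points in image 2
    return pts1, pts2
```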
在对各图像的二维特征点进行匹配后,可以根据其中任一张图片,得到该图片所对应的相机的三维空间位置(相机的镜头朝向始终垂直于其所拍摄得到图片的二维平面)。进而根据各图片所对应的相机的位置,将各图片中的二维特征点全部转换为三维特征点,形成三维空间,建立三维空间坐标系。
可以理解的是，三维场景中的同一个三维特征点在不同视角下（指相机同时存在旋转和平移时）的二维特征点存在着一种约束关系：对极约束，基础矩阵是这种约束关系的代数表示，并且这种约束关系独立于场景的结构，只依赖于相机的内参和外参，对于相互匹配的二维特征点p1、p2以及基础矩阵F有如下关系：
p2ᵀ F p1 = 0，其中 F = K⁻ᵀ [t]× R K⁻¹（[t]×为平移向量t的反对称矩阵）     (1)
其中,K为相机的内参,也就是说,仅通过相互匹配的二维特征点对(最 少7对)可以计算出各图像的基础矩阵F,然后再从F中分解得到相机的旋转矩阵R和平移向量t,也就得到了相机在三维空间坐标系中的空间位置。
进一步,单应矩阵H能够对各图像提供更多的约束,当相机在只有旋转而没有平移的情况下取得同一场景的两张图像,该两张图像的对极约束就不再适用,可以使用单应矩阵H来描述这两张图像之间的关系。可见基础矩阵F和单应矩阵H均能够表征两张图像的约束关系,但是两者有各自适用的场景,对于不同的应用场景来说,各图像之间的约束关系可能适用的矩阵不同(基础矩阵表示对极约束,需要相机的位置有旋转和平移,单应矩阵需要相机只有旋转而无平移),本实施例中,对于各图像所对应的相机的情况,选用合适的矩阵。计算各图像的基础矩阵和单应矩阵的过程,请参考现有技术,本实施例不再详述。
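The recovery of R and t from the matched points can be illustrated with OpenCV's essential-matrix route, which is equivalent to decomposing F = K⁻ᵀ[t]×RK⁻¹ when the intrinsic matrix K is known; the RANSAC settings below are assumptions.

```python
import numpy as np
import cv2

def recover_pose(pts1, pts2, K):
    """Estimate the essential matrix from matched 2D points and decompose it
    into the rotation R and (unit-scale) translation t of the second camera."""
    pts1 = np.asarray(pts1, dtype=np.float64)
    pts2 = np.asarray(pts2, dtype=np.float64)
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t
```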
在步骤S3确定了相机的空间位置后,步骤S4中,选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,即可获知该图像中多个对象顶点的三维空间位置信息,进而即可得到所述物体的实际尺寸。
相机标定的目的是确定相机的一些参数信息的值。通常,这些参数信息可以建立标定板确定的三维坐标系和相机图像坐标系的映射关系,换句话说,可以用这些参数信息把一个三维空间中的点映射到图像空间,或者反过来。相机需要标定的参数通常分为内参和外参两部分。外参确定了相机在某个三维空间中的位置和朝向,外参数矩阵代表了一个三维空间中的点(世界坐标)是怎样经过旋转和平移,然后落到图像空间(相机坐标)上。相机的旋转和平移均属于外参,用于描述相机在静态场景下相机的运动,或者在相机固定时,运动物体的刚性运动。因此,在图像拼接或者三维重建中,就需要使用外参来求几幅图像之间的相对运动,从而将其注册到同一个坐标系下。
内参可以说是相机内部的参数,其一般为相机固有的属性,内参数矩阵代表了一个三维空间中的点在落到图像空间上后,是如何继续经过相机的镜头、并如何通过光学成像和电子转化而成为像素点的。需要注意的是,真实的相机镜头还会有径向和切向畸变等,而这些畸变参数也属于相机的内参,这些内参均可通过事先标定的方式来获得。
相机的具体标定方法,本领域技术人员可根据现有技术进行理解,例如可采用张氏标定法等。通过对相机外参和内参的标定,基于所述相机的空间位置,即可得到图片中的对象顶点的三维空间位置信息,从而计算得到物体的实际的尺寸。
在另一个可替代的示范例中,步骤SA22根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置的步骤包括:
步骤SA221:根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的参考位置;
步骤SA222:针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测;
步骤SA223:根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置。
在该示范例中,与前一示范例不同的,并不是直接将利用每个对象顶点及与其对应的图像顶点的相对位置,得到的各个对象顶点在图像中的位置作为实际位置,而是将其确定为各个对象顶点在图像中的参考位置。然后在各对象顶点的参考位置处进行角点检测,根据角点检测的结果最终确定各对象顶点在图像中的实际位置。由于采用角点检测的方法对对象顶点的位置进行修正,实现了对图像中的具有边缘的对象的边缘检测,还提高了边缘和顶点检测的准确性。
下面同样以图2和图3为示例进行说明。步骤SA221中,将各个对象顶点与所述图像中距离该对象顶点最近的图像顶点的相对位置转换为该对象顶点在目标坐标系中的参考坐标,得到各个对象顶点在所述图像中的参考位置。
步骤SA222中,通常意义上来说,角点就是极值点,即在某方面属性特别突出的点,是在某些属性上强度最大或者最小的孤立点、线段的终点。角点通常被定义为两条边的交点,或者说,角点的局部邻域应该具有两个不同区域的不同方向的边界。更严格的说,角点的局部邻域应该具有两个不同区域的不同方向的边界。而实际应用中,大多数所谓的角点检测方法检测的是 拥有特定特征的图像点,而不仅仅是“角点”。这些特征点在图像中有具体的坐标,并具有某些数学特征,如局部最大或最小灰度、某些梯度特征等。
角点检测算法基本思想是使用一个固定窗口(取某个像素的一个邻域窗口)在图像上进行任意方向上的滑动,比较滑动前与滑动后两种情况,窗口中的像素灰度变化程度,如果存在任意方向上的滑动,都有着较大灰度变化,那么可以认为该窗口中存在角点。
一般来说,具有边缘的对象的任一个对象顶点在所述图像中就对应一个角点。通过在每一对象顶点的参考位置所处的预设区域内进行角点检测,以检测出每一对象顶点对应的角点。
优选的,所述对象顶点的参考位置所处的预设区域为以所述对象顶点的参考位置处的像素点为圆心、以第一预设像素为半径的圆形区域;所述第一预设像素的范围例如为10~20个像素,优选为15个像素。
所述针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测,包括:对各个所述对象顶点对应的圆形区域内的像素点进行角点检测,在角点检测过程中,将特征值变化幅度大于预设阈值的像素点均作为候选角点,从所述候选角点中确定各个对象顶点对应的目标角点。其中,特征值变化幅度指的是用于角点检测的固定窗口中像素灰度变化程度。可以理解的是,特征值变化幅度越小,表示该像素点是角点的可能性也越小。通过将特征值变化幅度与预设阈值做比较,可以剔除掉角点可能性小的像素点,而保留角点可能性较大的像素点作为候选角点,进而便于从候选角点中进一步确定目标角点。具体的角点检测算法例如有基于灰度图的角点检测算法、基于二值图像的角点检测算法、基于轮廓曲线的角点检测算法等等,具体可参见现有技术,在此不做赘述。
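As a sketch of corner detection restricted to the circular region around one vertex's reference position, the example below uses the Harris response as the feature-value change; the radius of 15 pixels follows the text, while the relative response threshold and helper name are illustrative assumptions.

```python
import numpy as np
import cv2

def candidate_corners_near(gray, ref_xy, radius=15, rel_thresh=0.3):
    """Return candidate corner coordinates inside the circular neighbourhood of
    a vertex's reference position; gray is a single-channel grayscale image."""
    h, w = gray.shape
    response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.mgrid[0:h, 0:w]
    inside = (xs - ref_xy[0]) ** 2 + (ys - ref_xy[1]) ** 2 <= radius ** 2
    mask = inside & (response > rel_thresh * response.max())   # strong-response pixels only
    return list(zip(xs[mask], ys[mask]))
```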
具体的,所述从所述候选角点中确定各个对象顶点对应的目标角点,包括:
步骤SA2221:对所述候选角点按照所述特征值变化幅度进行降序排序,将排第一位的所述候选角点确定为所述目标角点,将排第二位的所述候选角点确定为当前待选角点;
步骤SA2222:判断所述当前待选角点与当前所有所述目标角点之间的距离是否均大于第二预设像素;如果是则执行步骤SA2223,否则执行步骤SA2224;
步骤SA2223:将所述当前待选角点确定为所述目标角点;
步骤SA2224,舍弃所述当前待选角点,并将排下一位的所述候选角点确定为当前待选角点,返回执行步骤SA2222。
可以理解的是,按照所述特征值变化幅度进行降序排序,那么排第一位候选角点的特征值变化幅度最大,因此其为角点的可能性也最大,因此可直接将其确定为目标角点。对于排第二位的候选角点,其可能与排第一位的候选角点位于同一对象顶点(假设为对象顶点1)的圆形区域内,也可能位于其它对象顶点(假设为对象顶点2)的圆形区域内。对于第一种情况,由于已确定排第一位的候选角点作为对象顶点1的目标顶点,因此不可能再确定排第二位的候选角点也作为该对象顶点1的目标顶点。对于第二种情况,排第二位的候选角点必然是该对象顶点2的圆形区域内角点可能性最大的像素点,因此需要将排第二位候选角点确定为该对象顶点2的目标顶点。基于以上考虑,本实施例通过判断排第二位的候选角点与目标角点之间的距离是否大于第二预设像素,来判断排第二位的候选角点属于上述哪种情况。如果排第二位的候选角点与目标角点之间的距离大于第二预设阈值,表示排第二位的候选角点属于第二种情况,否则表示排第二位的候选角点属于第一种情况。若属于第二种情况,则需要确定排第二位的候选角点为目标角点,若属于第一种情况,则需要舍弃排第二位的候选角点。依次类推,对各个候选角点均按照以上逻辑进行判断,从而可以从各个候选角点中最终确定出多个目标角点。
通过上述处理,可以保证每个对象顶点周围最多只有一个候选角点剩余,该剩余出来的候选角点所在的位置即为该对象顶点的实际位置。优选的,所述第二预设像素的范围可以设为≥50个像素,其上限值可根据图像的具体大小进行设定,在此不做限定。
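The greedy selection of target corners described in steps SA2221 to SA2224 might look like the following sketch; the 50-pixel minimum distance follows the "second preset pixel" example above.

```python
import numpy as np

def select_target_corners(candidates, responses, min_dist=50):
    """Sort candidate corners by descending response (feature-value change),
    keep the strongest, and accept each further candidate only if it lies more
    than min_dist pixels from every corner already kept."""
    order = np.argsort(responses)[::-1]            # descending response magnitude
    kept = []
    for idx in order:
        p = np.asarray(candidates[idx], dtype=float)
        if all(np.linalg.norm(p - q) > min_dist for q in kept):
            kept.append(p)                         # far enough from all kept corners
    return kept
```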
需要说明的是,在对一个对象顶点进行角点检测的过程中可能存在检测不到角点的情况,例如,这个对象顶点的预设区域与图像背景的变化很小而 无法检测到角点,或者这个对象顶点在图像外(例如图3中的对象顶点b1、b3)而根本不存在角点。对于检测不到角点的情况,也可以将该对象顶点视为角点。
优选的,步骤SA223中,根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置的步骤包括:
针对每个所述对象顶点,若该对象顶点的角点检测结果中包含一个角点,则将该角点的位置确定为该对象顶点在所述图像中实际位置,若该对象顶点的角点检测结果中不包含角点,则将该对象顶点在所述图像中的参考位置确定为该对象顶点在所述图像中实际位置。在一些实施例中,可将周围出现剩余角点的对象顶点替换为对应的角点作为对象的实际顶点。即,针对每一所述对象顶点,若该对象顶点的角点检测结果中包含一个角点,则将该角点的位置确定为该对象顶点在所述图像中实际位置,若该对象顶点的角点检测结果中不包含角点,则将该对象顶点在所述图像中的参考位置确定为该对象顶点在所述图像中实际位置。
通过上述处理,可根据检测出的角点的坐标更正对象顶点的在图像中的实际位置,使得对象顶点的位置检测更加准确。
在另一个可替代的示范例中,图像中的对象顶点的识别可与前述示范例不同,在本示范例中,对象顶点是在识别边缘后,利用边缘相交来得到,而不是直接识别得到。具体的,步骤S2中获取所述图像中多个对象顶点的步骤包括:
步骤SB21:对所述图像进行处理,获得所述图像中灰度轮廓的线条图;
步骤SB22:将所述线条图中相似的线条进行合并,得到多条参考边界线;
步骤SB23:通过经训练的边界线区域识别模型对所述图像进行处理,得到所述图像中物体的多个边界线区域;
步骤SB24:针对每个所述边界线区域,从多条所述参考边界线中确定与该边界线区域相对应的目标边界线;
步骤SB25:根据确定的多条所述目标边界线确定所述图像中物体的边缘;
步骤SB26:将所述图像中物体的边缘的交点配置为所述对象顶点。
步骤SB21中,所述图像包括具有边缘的对象,线条图包括多条线条,线条图为灰度图。需要说明的,这里的边缘并非限制为直线边缘,其也可以是弧线、具有细小的波浪形、锯齿形等形状的线段等。所述图像可以为灰度图像,也可以为彩色图像。例如,图像可以为相机直接拍摄获取的原始图像,也可以是对原始图像进行预处理之后获得的图像。例如,为了避免图像的数据质量、数据不均衡等对于对象边缘检测的影响,在处理图像前,还可以包括对图像进行预处理的操作。预处理可以消除图像中的无关信息或噪声信息,以便于更好地对图像进行处理。
进一步的,步骤SB21可包括:通过边缘检测算法对图像进行处理,获得图像中灰度轮廓的线条图。
在一些实施例中,如可以通过基于OpenCV的边缘检测算法对输入图像进行处理,以获得输入图像中灰度轮廓的线条图。OpenCV为一种开源计算机视觉库,基于OpenCV的边缘检测算法包括Sobel、Scarry、Canny、Laplacian、Prewitt、Marr-Hildresh、scharr等多种算法。本领域技术人员可根据现有技术选择合适的边缘检测算法。这里不再展开说明。
在另一些实施例中,步骤SB21可包括:通过边界区域识别模型对图像进行处理,得到多个边界区域;通过边缘检测算法对多个边界区域进行处理,获得图像中灰度轮廓的线条图。例如,对多个边界区域进行处理,以得到多个边界区域标注框;通过边缘检测算法对多个边界区域标注框进行处理,获得图像中灰度轮廓的线条图。
边界区域识别模型可以采用机器学习技术实现并且例如运行在通用计算装置或专用计算装置上。该边界区域识别模型为预先训练得到的神经网络模型。例如,边界区域识别模型可以采用深度卷积神经网络(DEEP-CNN)等神经网络实现。在一些实施例中,将图像输入边界区域识别模型,边界区域识别模型可以识别出图像中的对象的边缘,以得到多个边界区域(即对象的各个边界的mask区域);然后,将识别出的多个边界区域标注出来,从而确定多个边界区域标注框,例如,可以对多个边界区域外接矩形框,以标注多 个边界区域;最后,利用边缘检测算法(例如,Canny边缘检测算法等)对标注出的多个边界区域标注框进行处理,以得到图像中灰度轮廓的线条图。
在本实施例中,边缘检测算法仅需要对标注出的边界区域标注框进行边缘检测,而不需要对整个图像进行边缘检测,从而可以减少计算量,提升处理速度。需要说明的是,边界区域标注框标注的是图像中的部分区域。
在其它的一些实施例中,步骤SB21可包括:对图像进行二值化处理,以得到图像的二值化图像;滤除二值化图像中的噪声线条,从而获得图像中灰度轮廓的线条图。例如,可以预先设定相应的滤除规则,以滤除二值化图像中的例如对象内部的各种线段和各种比较小的线条等,从而得到图像中灰度轮廓的线条图。
在一个可替代的示范例中,步骤SB22将所述线条图中相似的线条进行合并,得到多条参考边界线的步骤包括:
步骤SB221:将所述线条图中相似的线条进行合并,得到初始合并线条组;其中,多个初始合并线条组与多个边界区域一一对应,多个初始合并线条组中的每个初始合并线条组包括至少一条初始合并线条;根据多个初始合并线条组,确定多条边界连接线条,其中,多条边界连接线条与多个边界区域一一对应,多条边界连接线条与多个初始合并线条组也一一对应;分别将多个边界区域转换为多个直线组,其中,多个直线组与所述多个边界区域一一对应,多条直线组中的每个直线组包括至少一条直线;计算与多个直线组一一对应的多个平均斜率;分别计算多条边界连接线条的斜率;针对多条边界连接线条中的第i条边界连接线条,判断第i条边界连接线条的斜率和多个平均斜率中与第i条边界连接线条对应的平均斜率的差值是否高于第二斜率阈值,其中,i为正整数,i小于等于多条边界连接线条的数量;响应于第i条边界连接线条的斜率和与第i条边界连接线条对应的平均斜率的差值低于等于第二斜率阈值,将第i条边界连接线条和第i条边界连接线条对应的初始合并线条组中的初始合并线条作为第i条边界连接线条对应的边界区域对应的参考边界线组中的参考边界线,响应于第i条边界连接线条的斜率和与第i条边界连接线条对应的平均斜率的差值高于第二斜率阈值,将第i条边界连接 线条对应的初始合并线条组中的初始合并线条作为第i条边界连接线条对应的边界区域对应的参考边界线组中的参考边界线,分别对多条边界连接线条进行上述操作,从而确定多条参考边界线。在一些实施例中,第二斜率阈值的范围可以为0-20度,优选为0-10度,更优选的,第二斜率阈值可以为5度、15度等。
值得注意的是,在本公开的实施例中,“两个斜率的差值”表示两个斜率对应的倾斜角度之间的差值。例如,第i条边界连接线条的斜率对应的倾斜角度可以表示第i条边界连接线条相对于给定方向(例如,水平方向或竖直方向)之间的夹角,平均斜率对应的倾斜角度可以表示基于该平均斜率确定的直线与该给定方向之间的夹角。例如,可以计算第i条边界连接线条的倾斜角度(例如,第一倾斜角度)和多个平均斜率中与第i条边界连接线条对应的平均斜率对应的倾斜角度(例如,第二倾斜角度),若该第一倾斜角度和第二倾斜角度之间的差值高于等于第二斜率阈值,则该第i条边界连接线条不作为参考边界线;而若该第一倾斜角度和第二倾斜角度之间的差值低于第二斜率阈值,则该第i条边界连接线条可以作为参考边界线。
需要说明的是,关于直线组、平均斜率和边界区域等将在后面描述,在此不作赘述。
例如,在步骤SB221中,将多条线条中相似的线条进行合并,得到多个初始合并线条组,并根据多条所述初始合并线条确定一边界矩阵。将多条线条中相似的线条进行合并的步骤包括:获取多条线条中的多条长线条,其中,多条长线条中的每条长线条为长度超过长度阈值的线条;根据多条长线条,获取多个合并线条组,其中,多个合并线条组中的每个合并线条组包括至少两个依次相邻的长线条,且每个合并线条组中的任意相邻的两个长线条之间的夹角均小于角度阈值;针对多个合并线条组中的每个合并线条组,将合并线条组中的各个长线条依次进行合并以得到与合并线条组对应的初始合并线条,分别对多个合并线条组进行合并处理以得到多个初始合并线条组中的初始合并线条。
例如,多个初始合并线条组所包括的所有初始合并线条的数量与多个合 并线条组的数量相同,且多个初始合并线条组所包括的所有初始合并线条与多个合并线条组一一对应。需要说明的是,在基于合并线条组得到该合并线条组对应的初始合并线条之后,基于该初始合并线条的位置可以确定该初始合并线条所对应的边界区域,从而确定该初始合并线条所属的初始合并线条组。
需要说明的是,在本公开的实施例中,“相似的线条”表示两个线条之间的夹角小于角度阈值。
例如,线条图中的长线条指的是线条图中的多条线条中的长度超过长度阈值的线条,例如将长度超过2个像素的线条定义为长线条,即长度阈值为2个像素,本公开的实施例包括但不限于此,在另一些实施例中,长度阈值也可以为3个像素、4个像素等。仅获取线条图中的长线条进行后续合并处理,而不考虑线条图中的一些较短的线条,这样可以在合并线条时避免对象内部和对象外部的线条干扰,例如,可以去除对象内部的文字和图形、对象外部的其它对象等的对应的线条。
例如,可以通过以下方式获取合并线条组:首先选择一个长线条T1,然后从该长线条T1开始,依次判断两条相邻长线条之间的夹角是否小于角度阈值,若判断出某一长线条T2和与该长线条T2相邻的长线条之间的夹角不小于角度阈值时,则可以将长线条T1、长线条T2和长线条T1到长线条T2之间的所有依次相邻的长线条组成一个合并线条组。接着,重复上述过程,即从与该长线条T2相邻的长线条开始,依次判断两条相邻长线条之间的夹角是否小于角度阈值,依次类推,直到遍历完所有长线条,从而得到多个合并线条组。需要说明的是,“两条相邻长线条”表示两条在物理位置上相邻的长线条,即该两条相邻长线条之间不存在其他的长线条。
例如,初始合并线条为多条比长线条更长的线条。
图4为本公开一实施例提供的一种线条合并过程的示意图。
下面以图4为例,对上述获取合并线条组的过程进行说明。在一个实施例中,例如,首先选择第一个长线条A,判断该长线条A和与长线条A相邻的长线条B之间的夹角是否小于角度阈值,若长线条A和长线条B之间的夹 角小于角度阈值,则表示长线条A和长线条B是属于同一个合并线条组,然后,继续判断长线条B和与长线条B相邻的长线条C之间的夹角是否小于角度阈值,若长线条B和长线条C之间的夹角也小于角度阈值,则表示长线条C、长线条B和长线条A均属于同一个合并线条组,接着,继续判断长线条C和与长线条C相邻的长线条D之间的夹角,若长线条C和长线条D之间的夹角也小于角度阈值,表示长线条D、长线条C、长线条B和长线条A均属于同一个合并线条组,接着,继续判断长线条D和与长线条D相邻的长线条E之间的夹角,若长线条D和长线条E之间的夹角大于或等于角度阈值,则表示长线条E与长线条A/B/C/D不属于同一个合并线条组,到此为止,可以将长线条A、长线条B、长线条C、长线条D作为一个合并线条组,例如,长线条A、长线条B、长线条C、长线条D组成的合并线条组可以为第一个合并线条组。然后,再从长线条E开始依次判断两条相邻长线条之间的夹角是否小于角度阈值,从而可以得到长线条G、长线条H、长线条I、长线条J属于一个合并线条组,例如,长线条G、长线条H、长线条I、长线条J组成的合并线条组可以为第二个合并线条组,长线条M、长线条N、长线条O也属于一个合并线条组,例如,长线条M、长线条N、长线条O组成的合并线条组可以为第三个合并线条组。
例如,在另一个实施例中,首先,可以从多条长线条中任意选择一个长线条,例如,长线条D,与该长线条D相邻的长线条包括长线条C和长线条E,则判断长线条D和长线条C之间的夹角是否小于角度阈值,判断长线条D和长线条E之间的夹角是否小于角度阈值,由于长线条D和长线条C之间的夹角小于角度阈值,则长线条D和长线条C属于同一个合并线条组,由于长线条D和长线条E之间的夹角大于角度阈值,则长线条D和长线条E属于不同的合并线条组,然后,一方面,可以继续从长线条C开始判断依次相邻的其它长线条之间的夹角,从而确定与长线条D属于同一个合并线条组的其它长线条,此外,还可以确定其他合并线条组;另一方面,可以从长线条E开始判断依次相邻的其它长线条之间的夹角,从而确定其他合并线条组。以此类推,最终,也可确定长线条A、长线条B、长线条C、长线条D属于一个 合并线条组,长线条G、长线条H、长线条I、长线条J属于一个合并线条组,长线条M、长线条N、长线条O也属于一个合并线条组。
例如，相邻两条长线条之间的夹角通过以下公式计算：
θ = arccos( (v1·v2) / (|v1|·|v2|) )
其中，v1、v2分别表示相邻两条长线条的向量。例如，角度阈值的数值可以根据实际情况进行设置，例如，在一些实施例中，角度阈值的范围可以为0-20度，优选为0-10度，更优选为5度、15度等。
例如,合并两个长线条指的是将两个长线条的斜率求平均,以得到斜率平均值,此斜率平均值为合并后线条的斜率。在实际应用中,两个长线条合并是根据两个长线条的数组形式进行计算,例如,两个长线条分别为第一长线条和第二长线条,合并两个长线条表示将第一长线条的起点(即线段头)和第二长线条的终点(即线段尾)直接连接形成一个新的更长的线条,也就是说,在线条图对应的坐标系中,将第一长线条的起点和第二长线条的终点直接直线连接,以得到合并后线条,例如,将第一长线条的起点对应的像素点的坐标值作为合并后线条的起点对应的像素点的坐标值,将第二长线条的终点对应的像素点的坐标值作为合并后线条的终点对应的像素点的坐标值,最后,将合并后线条的起点对应的像素点的坐标值和合并后线条的终点对应的像素点的坐标值形成合并后线条的数组并存储该数组。将每一个合并线条组中的各个长线条依次进行合并,从而得到对应的初始合并线条。
例如,如图4所示,将第一个合并线条组中的长线条A、长线条B、长线条C、长线条D依次合并得到该合并线条组对应的初始合并线条,例如,首先,可以将长线条A与长线条B合并以得到第一合并线条,然后,将第一合并线条与长线条C合并以得到第二合并线条,接着,将第二合并线条与长线条D合并,从而得到该第一个合并线条组对应的初始合并线条1。同理,对第二个合并线条组中的各个长线条进行合并以得到与第二个合并线条组对应的初始合并线条2,对第三个合并线条组中的各个长线条进行合并以得到与第三个合并线条组对应的初始合并线条3。合并各个合并线条组之后,还有长线条E、长线条F、长线条K、长线条L没有被合并。
另外,所述边界矩阵通过以下方式确定:对多条所述初始合并线条以及所述长线条中未合并的线条进行重新绘制,将重新绘制的所有线条中的像素点的位置信息对应到整个图像矩阵中,将图像矩阵中这些线条的像素点所在位置的值设置第一数值、这些线条以外的像素点所在位置的值设置为第二数值,从而形成边界矩阵。具体而言,所述边界矩阵可以是一个与图像矩阵大小相同的矩阵,例如图像的大小为1024×1024像素,则图像矩阵为1024×1024的矩阵,那么边界矩阵也就是一个1024×1024的矩阵,将多条所述初始合并线条以及所述长线条中未合并的线条按照一定的线宽(如线宽为2)重新绘制,根据重新绘制的线条的像素点对应到矩阵中的位置来对边界矩阵进行值的填充,线条上像素点对应到矩阵中的位置都设定为第一数值例如255,没有线条的像素点对应到矩阵中的位置设定为第二数值例如0,从而形成整个图片的超大矩阵即边界矩阵。需要说明的是,由于多条所述初始合并线条以及所述长线条中未合并的线条均是以数组的形式存储的,因此在确定所述边界矩阵时需要将其形成为实际线条数据,因此将线条重新绘制例如按照线宽为2进行重新绘制,从而获得每个线条上各个点对应的像素点的坐标值,进而根据所获得的坐标值对所述边界矩阵中进行值的填充,例如将所述边界矩阵中与坐标值相对应的位置的值设为255,其余位置的值设为0。
下面示例性的提供一个边界矩阵,该边界矩阵为10×10矩阵,其中该边界矩阵中所有值为255的位置连接起来即为多条初始合并线条以及长线条中未合并的线条。
（示例性边界矩阵为10×10矩阵：值为255的元素沿多条初始合并线条及长线条中未合并的线条分布，其余元素的值为0，具体数值此处从略。）
步骤SB222:将多条所述初始合并线条中相似的线条进行合并得到目标线条,并且将未合并的所述初始合并线条也作为目标线条。
在步骤SB221中,合并后的初始合并线条为多条较长的线条。步骤SB222可以根据上述步骤SB221中的合并规则,继续判断多条初始合并线条中是否存在相似的线条从而将相似线条再次进行合并得到多条目标线条,同时将不能进行合并的初始合并线条也作为目标线条。
其中,将多条所述初始合并线条中相似的线条进行合并得到目标线条的具体的合并步骤如下:步骤a:从多条所述初始合并线条中获取多组第二类线条;其中,所述第二类线条包括至少两个依次相邻的初始合并线条,且任意相邻的两初始合并线条之间的夹角均小于第三预设阈值;步骤b:针对每一组第二类线条,将该组第二类线条中的各个初始合并线条依次进行合并得到一条目标线条。
上述对初始合并线条进行合并的步骤的原理,与步骤SB221中对线条图中线条进行合并的原理相同,可以参见步骤SB221中的相关描述,在此不做赘述。其中,所述第三预设阈值可以和所述第二预设阈值相同,也可以不同,本实施例对此不做限定,例如将所述第三预设阈值设置为夹角10度。如图4所示的线条合并前后对照图,通过上述对初始合并线条1、2、3进行合并的步骤后,由于初始合并线条1和2的夹角小于第三预设阈值,而初始合并线条3与初始2的夹角大于第三预设阈值,因此,初始合并线条1、2可以进一步合并为目标线条12,初始合并线条3不能合并则将初始合并线条3直接作为一个目标线条。
至此获得了多条目标线条,在多条目标线条中不仅存在参考边界线,还存在一些较长的干扰线条,例如,内部的文字和图形、外部的其它物体等的对应的线条经过合并处理后得到的较长线条,这些干扰线条会根据后续的处理(具体通过步骤SB223、步骤SB23的处理)及规则进行去除。
步骤SB223:根据所述边界矩阵,从多条所述目标线条中确定多条参考边界线;具体的,根据所述边界矩阵,从多条所述目标线条中确定多条参考边界线,包括:首先,针对每一条所述目标线条,将该目标线条进行延长,根据延长后的该目标线条确定一线条矩阵,然后将该线条矩阵与所述边界矩阵进行对比,计算延长后的该目标线条上属于所述边界矩阵的像素点的个数, 作为该目标线条的成绩,其中线条矩阵与边界矩阵的大小相同;然后,根据各个目标线条的成绩,从多条所述目标线条中确定多条参考边界线。
其中,所述线条矩阵可以按照以下方式确定:对延长后的目标线条进行重新绘制,将重新绘制的线条中的像素点的位置信息对应到整个图像矩阵中,将图像矩阵中线条的像素点所在位置的值设置为第一数值、线条以外的像素点所在位置的值设置为第二数值,从而形成线条矩阵。所述线条矩阵的形成方式与所述边界矩阵类似,在此不做赘述。需要说明的是,所述目标线条是以数组的形式存储的,即存储其起点和终点的坐标值,对目标线条进行延长后,延长后的目标线条在存储时是以延长后的目标线条的起点和终点的坐标值形成数组的,因此在对延长后的目标线条进行重新绘制时,也是按照相同的线宽例如线宽为2进行重新绘制,从而获得延长后的目标线条上各个点对应的像素点的坐标值,进而根据坐标值对线条矩阵进行值的填充,即将线条矩阵中与坐标值相对应的位置的值设为255,其余位置的值设为0。
将合并后的目标线条进行延长,判断其上的像素点落入步骤SB222中初始合并线条和所述长线条中未合并的线条上最多的目标线条作为参考边界线。针对每一条目标线条,判断上有多少像素点是属于边界矩阵的,计算一个成绩,具体为:将该目标线条进行延长,该目标线条延长后所得的线条也按照边界矩阵的形成方式形成一线条矩阵,将该线条矩阵与边界矩阵进行对比来判断有多少像素点落入到边界矩阵里面,即判断两个矩阵中有多少相同位置的像素点具有相同的第一数值例如255,从而计算成绩。这时成绩最好的线条可能还是有较多条,因此,根据各个目标线条的成绩,从多条目标线条中确定成绩最好的多条目标线条作为参考边界线。
例如,一条延长后的目标线条形成的线条矩阵如下,通过将该线条矩阵与上述的边界矩阵进行对比可知延长后的该目标线条上有7个像素点落入到边界矩阵里面,从而得到该目标线条的成绩。
（示例性线条矩阵：延长后的目标线条所经过位置的值为255，其余位置的值为0，具体数值此处从略。）
优选的,在步骤SB23中,边界区域识别模型可以采用机器学习技术实现并且例如运行在通用计算装置或专用计算装置上。该边界区域识别模型为预先训练得到的神经网络模型。例如,边界区域识别模型可以采用深度卷积神经网络(DEEP-CNN)等神经网络实现。需要说明的,这里的边界区域识别模型和前述步骤SB21中的边界区域识别模型可以为同一个模型,也可以是不同的模型。
首先,通过机器学习训练来建立边界区域识别模型,边界区域识别模型可以通过如下过程训练得到:对图像样本集中的每个图像样本进行标注处理,以标注出每个图像样本中对象的边界线区域、内部区域和外部区域;以及通过经过标注处理的图像样本集,对神经网络进行训练,以得到边界区域识别模型。
例如,通过机器学习训练建立的边界区域识别模型,可以识别出图像中的边界线区域、内部区域(即对象所在区域)和外部区域(即对象的外部区域)3个部分,从而获取图像的各个边界区域,此时,边界区域中的边缘轮廓较粗。例如,在一些实施例中,对象的形状可以为矩形,边界区域的数量可以为4,即通过边界区域识别模型识别输入图像,从而可以得到与矩形的四条边分别对应的四个边界区域。
在一些实施例中,多个边界区域包括第一边界区域、第二边界区域、第三边界区域和第四边界区域。在一些实施例中,如图2所示,第一边界区域可以表示与边界线A1对应的区域,第二边界区域可以表示与边界线A2对应的区域,第三边界区域可以表示与边界线A3对应的区域,第四边界区域可以 表示与边界线A4对应的区域;在另一些实施例中,如图3所示,第一边界区域可以表示与边界线B1对应的区域,第二边界区域可以表示与边界线B2对应的区域,第三边界区域可以表示与边界线B3对应的区域,第四边界区域可以表示与边界线B4对应的区域。
可以理解的是,通过边界区域识别模型来识别图像中的对象的边界区域,然后,基于边界区域从多条参考边界线中确定目标边界线,可以去除误识别的干扰线条,例如落入名片或者文档中间的线条、表格中间的线条等。
优选的,步骤SB24:针对每个所述边界线区域,从多条所述参考边界线中确定与该边界线区域相对应的目标边界线;可以包括:首先,计算每一条所述参考边界线的斜率;然后,针对每一个所述边界线区域,将该边界线区域转换为多条直线,并计算所述多条直线的平均斜率,再判断多条所述参考边界线中是否存在斜率与所述平均斜率相匹配的参考边界线,如果存在,将该参考边界线确定为与该边界线区域相对应的目标边界线。其中,可以利用霍夫变换将该边界线区域转换为多条直线,当然也可以采用其它方式进行转换,本实施例对此不做限定。
本实施例中,所述边界线区域中的边缘轮廓较粗,针对每一边界线区域,可以利用霍夫变换将边界线区域转换为多条直线,这些线条具有近似的斜率,求得平均斜率,然后和每一条参考边界线的斜率进行比较,判断多条参考边界线中是否存在斜率与所述平均斜率相匹配的参考边界线,即从多条参考边界线中找到最为近似的参考边界线,作为与该边界线区域相对应的目标边界线。
由于所确定的目标边界线的斜率与平均斜率的差距不能太大,因此在将平均斜率与每一参考边界线的斜率进行比较时,会设定一个比较阈值,当某一参考边界线的斜率与平均斜率之差的绝对值小于此比较阈值时,判定该参考边界线的斜率是与平均斜率相匹配的参考边界线,进而判定该参考边界线是与边界线区域相对应的目标边界线。
进一步的,针对每一个所述边界线区域,如果判断出多条所述参考边界线中不存在斜率与所述平均斜率相匹配的参考边界线,则进行如下处理:针 对该边界线区域转换得到的每一条直线,将该直线形成的线条矩阵与所述边界矩阵进行对比,计算该直线上属于所述边界矩阵的像素点的个数,作为该直线的成绩;将成绩最好的直线确定为与该边界线区域相对应的目标边界线。如果成绩最好的直线有多条,则根据排序算法将其中最先出现的一条直线作为最佳边界线。其中,所述线条矩阵按照以下方式确定:对直线进行重新绘制,将重新绘制的线条中的像素点的位置信息对应到整个图像矩阵中,将图像矩阵中线条的像素点所在位置的值设置为第一数值、线条以外的像素点所在位置的值设置为第二数值,从而形成线条矩阵。所述线条矩阵的形成方式与所述边界矩阵类似,在此不做赘述。
如果不能从参考边界线中找到与某一边界线区域相对应的目标边界线,则对霍夫变换获取的多条直线按照步骤SB222和步骤SB223中所述的形成矩阵的方式形成对应的线条矩阵,判断哪条直线的像素点落入边界矩阵里面的成绩最好,则认为是该边界线区域相对应的目标边界线。将直线形成的线条矩阵与边界矩阵进行对比来计算直线的成绩的方式可以参照步骤SB223中的相关描述,在此不做赘述。
步骤SB25在确定多条目标边界线后,由于每条目标边界线均对应图像中物体的一个边界线区域,因此多条目标边界线构成了图像中物体的边缘。如图2所示的图像,图像中物体的边缘由图2中的四个较长线条即目标边界线A1、A2、A3、A4构成;如图3所示的图像,图像中物体的边缘由图3中的四个较长线条即目标边界线B1、B2、B3、B4构成。
进一步的,步骤SB26中,在获得图像中物体的边缘后,将这些边缘的交点配置为所述对象顶点。后续的操作步骤请参考步骤S2~步骤S4,这里不再重复。
本实施例还提供了一种可读存储介质,其上存储有程序,所述程序被执行时实现如上所述的物体尺寸识别方法。进一步的,本实施例还提供了一种物体尺寸识别系统,其包括处理器和存储器,所述存储器上存储有程序,所述程序被所述处理器执行时,实现如上所述的物体尺寸识别方法。
综上所述,在本发明提供的物体尺寸识别方法、可读存储介质及物体尺 寸识别系统中,所述物体尺寸识别方法包括:通过拍摄获取一物体的至少两张不同视角的图像;对每张所述图像,分别获取其中的多个对象顶点的二维位置信息;根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,得到多个顶点的三维空间位置信息,进而得到所述物体的尺寸。如此配置,通过拍摄获取一物体的至少两张不同视角的图像,结合相机标定的参数信息,即可得到物体的尺寸,操作步骤简便,克服了现有技术无法对空间中的物体的尺寸进行测量的问题。
上述描述仅是对本发明较佳实施例的描述,并非对本发明范围的任何限定,本发明领域的普通技术人员根据上述揭示内容做的任何变更、修饰,均属于权利要求书的保护范围。

Claims (10)

  1. 一种物体尺寸识别方法,其特征在于,包括:
    通过拍摄获取一物体的至少两张不同视角的图像;
    对每张所述图像,分别获取其中的多个对象顶点的二维位置信息;
    根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置;以及
    选取任意一张所述图像,基于所述相机标定的参数信息以及所述相机的空间位置,得到多个对象顶点的三维空间位置信息,进而得到所述物体的尺寸。
  2. 根据权利要求1所述的物体尺寸识别方法,其特征在于,获取所述图像中多个对象顶点的二维位置信息的步骤包括:
    将所述图像输入经训练的顶点识别模型,得到每个所述对象顶点及与其对应的图像顶点的相对位置;
    根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置;
    根据每个所述对象顶点在所述图像中的实际位置,以所述图像的一参考点为二维图像坐标系的坐标原点,得到各个所述对象顶点在所述二维图像坐标系中的二维位置信息。
  3. 根据权利要求2所述的物体尺寸识别方法,其特征在于,根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的实际位置的步骤包括:
    根据每个所述对象顶点及与其对应的图像顶点的相对位置,确定各个所述对象顶点在所述图像中的参考位置;
    针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测;
    根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置。
  4. 根据权利要求3所述的物体尺寸识别方法,其特征在于,所述对象顶 点的参考位置所处的预设区域为以所述对象顶点的参考位置处的像素点为圆心、以第一预设像素为半径的圆形区域;
    所述针对每个所述对象顶点,在该对象顶点的参考位置所处的预设区域内进行角点检测,包括:
    对各个所述对象顶点对应的圆形区域内的像素点进行角点检测,在角点检测过程中,将特征值变化幅度大于预设阈值的像素点均作为候选角点,从所述候选角点中确定各个对象顶点对应的目标角点。
  5. 根据权利要求4所述的物体尺寸识别方法,其特征在于,所述根据角点检测结果确定每个所述对象顶点在所述图像中的实际位置,包括:
    针对每个所述对象顶点,若该对象顶点的角点检测结果中包含一个角点,则将该角点的位置确定为该对象顶点在所述图像中实际位置,若该对象顶点的角点检测结果中不包含角点,则将该对象顶点在所述图像中的参考位置确定为该对象顶点在所述图像中实际位置。
  6. 根据权利要求1所述的物体尺寸识别方法,其特征在于,获取所述图像中多个对象顶点的步骤包括:
    对所述图像进行处理,获得所述图像中灰度轮廓的线条图;
    将所述线条图中相似的线条进行合并,得到多条参考边界线;
    通过经训练的边界线区域识别模型对所述图像进行处理,得到所述图像中物体的多个边界线区域;
    针对每个所述边界线区域,从多条所述参考边界线中确定与该边界线区域相对应的目标边界线;
    根据确定的多条所述目标边界线确定所述图像中物体的边缘;
    将所述图像中物体的边缘的交点配置为所述对象顶点。
  7. 根据权利要求6所述的物体尺寸识别方法,其特征在于,将所述线条图中相似的线条进行合并,得到多条参考边界线的步骤包括:
    将所述线条图中相似的线条进行合并,得到多条初始合并线条,并根据多条所述初始合并线条确定一边界矩阵;
    将多条所述初始合并线条中相似的线条进行合并得到目标线条,并且将 未合并的所述初始合并线条也作为目标线条;
    根据所述边界矩阵,从多条所述目标线条中确定多条参考边界线。
  8. 根据权利要求1所述的物体尺寸识别方法,其特征在于,根据至少两张所述图像,根据特征点匹配法建立三维空间坐标系,确定相机的空间位置的步骤包括:
    提取至少两张所述图像中相互匹配的二维特征点;
    根据相互匹配的所述二维特征点,得到至少两张所述图像的约束关系;
    基于所述约束关系,得到每张所述图像中的二维特征点的三维空间位置,进而得到每张所述图像所对应的相机的空间位置。
  9. 一种可读存储介质,其上存储有程序,其特征在于,所述程序被执行时实现根据权利要求1~8中任一项所述的物体尺寸识别方法。
  10. 一种物体尺寸识别系统,其特征在于,包括处理器和存储器,所述存储器上存储有程序,所述程序被所述处理器执行时,实现根据权利要求1~8中任一项所述的物体尺寸识别方法。
PCT/CN2022/106607 2021-08-24 2022-07-20 物体尺寸识别方法、可读存储介质及物体尺寸识别系统 WO2023024766A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110975318.1A CN113688846B (zh) 2021-08-24 2021-08-24 物体尺寸识别方法、可读存储介质及物体尺寸识别系统
CN202110975318.1 2021-08-24

Publications (1)

Publication Number Publication Date
WO2023024766A1 true WO2023024766A1 (zh) 2023-03-02

Family

ID=78581917

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106607 WO2023024766A1 (zh) 2021-08-24 2022-07-20 物体尺寸识别方法、可读存储介质及物体尺寸识别系统

Country Status (2)

Country Link
CN (1) CN113688846B (zh)
WO (1) WO2023024766A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315664A (zh) * 2023-09-18 2023-12-29 山东博昂信息科技有限公司 一种基于图像序列的废钢斗号码识别方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688846B (zh) * 2021-08-24 2023-11-03 成都睿琪科技有限责任公司 物体尺寸识别方法、可读存储介质及物体尺寸识别系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110455215A (zh) * 2019-08-13 2019-11-15 利生活(上海)智能科技有限公司 一种通过图像获得物体在物理三维空间尺寸的方法及装置
CN112991369A (zh) * 2021-03-25 2021-06-18 湖北工业大学 基于双目视觉的行驶车辆外廓尺寸检测方法
CN113177977A (zh) * 2021-04-09 2021-07-27 上海工程技术大学 一种非接触式三维人体尺寸的测量方法
CN113688846A (zh) * 2021-08-24 2021-11-23 成都睿琪科技有限责任公司 物体尺寸识别方法、可读存储介质及物体尺寸识别系统

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4742190B2 (ja) * 2005-01-13 2011-08-10 国立大学法人 奈良先端科学技術大学院大学 3次元オブジェクト計測装置
JP2006300656A (ja) * 2005-04-19 2006-11-02 Nippon Telegr & Teleph Corp <Ntt> 画像計測方法、装置、プログラム及び記録媒体
US8483446B2 (en) * 2010-06-17 2013-07-09 Mississippi State University Method and system for estimating antler, horn, and pronghorn size of an animal
CN104236478B (zh) * 2014-09-19 2017-01-18 山东交通学院 一种基于视觉的车辆外廓尺寸自动测量系统及方法
CN109214980B (zh) * 2017-07-04 2023-06-23 阿波罗智能技术(北京)有限公司 一种三维姿态估计方法、装置、设备和计算机存储介质
CN110533774B (zh) * 2019-09-09 2023-04-07 江苏海洋大学 一种基于智能手机的三维模型重建方法
CN112683169A (zh) * 2020-12-17 2021-04-20 深圳依时货拉拉科技有限公司 物体尺寸测量方法、装置、设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110455215A (zh) * 2019-08-13 2019-11-15 利生活(上海)智能科技有限公司 一种通过图像获得物体在物理三维空间尺寸的方法及装置
CN112991369A (zh) * 2021-03-25 2021-06-18 湖北工业大学 基于双目视觉的行驶车辆外廓尺寸检测方法
CN113177977A (zh) * 2021-04-09 2021-07-27 上海工程技术大学 一种非接触式三维人体尺寸的测量方法
CN113688846A (zh) * 2021-08-24 2021-11-23 成都睿琪科技有限责任公司 物体尺寸识别方法、可读存储介质及物体尺寸识别系统

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315664A (zh) * 2023-09-18 2023-12-29 山东博昂信息科技有限公司 一种基于图像序列的废钢斗号码识别方法
CN117315664B (zh) * 2023-09-18 2024-04-02 山东博昂信息科技有限公司 一种基于图像序列的废钢斗号码识别方法

Also Published As

Publication number Publication date
CN113688846B (zh) 2023-11-03
CN113688846A (zh) 2021-11-23

Similar Documents

Publication Publication Date Title
CN108898610B (zh) 一种基于mask-RCNN的物体轮廓提取方法
CN110866871A (zh) 文本图像矫正方法、装置、计算机设备及存储介质
US8467596B2 (en) Method and apparatus for object pose estimation
WO2023024766A1 (zh) 物体尺寸识别方法、可读存储介质及物体尺寸识别系统
JP4723834B2 (ja) 映像に基づいたフォトリアリスティックな3次元の顔モデリング方法及び装置
WO2020228187A1 (zh) 边缘检测方法、装置、电子设备和计算机可读存储介质
JP2015533434A (ja) 教師あり形状ランク付けに基づく生物学的単位の識別
CN111160291B (zh) 基于深度信息与cnn的人眼检测方法
CN111307039A (zh) 一种物体长度识别方法、装置、终端设备和存储介质
CN109272577B (zh) 一种基于Kinect的视觉SLAM方法
JP2014041477A (ja) 画像認識装置及び画像認識方法
CN106296587B (zh) 轮胎模具图像的拼接方法
CN113159043A (zh) 基于语义信息的特征点匹配方法及系统
US11216905B2 (en) Automatic detection, counting, and measurement of lumber boards using a handheld device
CN111444773B (zh) 一种基于图像的多目标分割识别方法及系统
CN113033558A (zh) 一种用于自然场景的文本检测方法及装置、存储介质
CN110363196B (zh) 一种倾斜文本的文字精准识别的方法
CN116740758A (zh) 一种防止误判的鸟类图像识别方法及系统
CN113436251B (zh) 一种基于改进的yolo6d算法的位姿估计系统及方法
Cui et al. Global propagation of affine invariant features for robust matching
JP2017500662A (ja) 投影ひずみを補正するための方法及びシステム
CN116523916B (zh) 产品表面缺陷检测方法、装置、电子设备及存储介质
CN110910497B (zh) 实现增强现实地图的方法和系统
CN110008902B (zh) 一种融合基本特征和形变特征的手指静脉识别方法及系统
CN108985294B (zh) 一种轮胎模具图片的定位方法、装置、设备及存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE