CN113688846B - Object size recognition method, readable storage medium, and object size recognition system - Google Patents

Object size recognition method, readable storage medium, and object size recognition system

Info

Publication number
CN113688846B
Authority
CN
China
Prior art keywords
image
vertex
line
lines
boundary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110975318.1A
Other languages
Chinese (zh)
Other versions
CN113688846A (en)
Inventor
罗欢
徐青松
李青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Ruiqi Technology Co ltd
Original Assignee
Chengdu Ruiqi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Ruiqi Technology Co ltd
Priority to CN202110975318.1A
Publication of CN113688846A
Priority to PCT/CN2022/106607 (WO2023024766A1)
Application granted
Publication of CN113688846B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/181Segmentation; Edge detection involving edge growing; involving edge linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75Determining position or orientation of objects or cameras using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Abstract

The invention provides an object size recognition method, a readable storage medium and an object size recognition system. The object size recognition method comprises the following steps: acquiring at least two images of an object from different viewing angles by photographing; acquiring two-dimensional position information of a plurality of object vertices in each image; establishing a three-dimensional space coordinate system from the at least two images by a feature point matching method and determining the spatial position of the camera; and selecting any one of the images and obtaining three-dimensional spatial position information of the plurality of object vertices based on the calibrated parameter information of the camera and the spatial position of the camera, thereby obtaining the size of the object. With this configuration, at least two images of an object taken from different viewing angles, combined with the calibrated parameter information of the camera, are sufficient to obtain the size of the object. The operation steps are simple, which solves the problem in the prior art that the size of an object in space cannot be measured.

Description

Object size recognition method, readable storage medium, and object size recognition system
Technical Field
The present invention relates to the field of object recognition technologies, and in particular, to an object size recognition method, a readable storage medium, and an object size recognition system.
Background
When there is no measuring tool at hand, or the object to be measured is not at hand, measuring the dimensions of the object is a difficult problem. At present, people often photograph an object to obtain an image of it; however, the object in the captured image has no scale, so its actual size cannot be known. A simple way to measure the dimensions of objects in space is therefore greatly needed.
Disclosure of Invention
The invention aims to provide an object size recognition method, a readable storage medium and an object size recognition system, which are used for solving the problem that the object size is difficult to measure in the prior art.
To solve the above technical problem, according to a first aspect of the present invention, there is provided an object size recognition method, comprising:
acquiring at least two images of an object from different viewing angles by photographing;
acquiring two-dimensional position information of a plurality of object vertices in each image;
establishing a three-dimensional space coordinate system from the at least two images by a feature point matching method, and determining the spatial position of the camera; and
selecting any one of the images, and obtaining three-dimensional spatial position information of the plurality of object vertices based on the calibrated parameter information of the camera and the spatial position of the camera, thereby obtaining the size of the object.
Optionally, the step of acquiring the two-dimensional position information of the plurality of object vertices in the image includes:
inputting the images into a trained vertex recognition model to obtain the relative positions of each object vertex and the corresponding image vertex;
determining the actual position of each object vertex in the image according to the relative positions of each object vertex and the corresponding image vertex;
and according to the actual position of each object vertex in the image, taking a reference point of the image as the coordinate origin of a two-dimensional image coordinate system, and obtaining the two-dimensional position information of each object vertex in the two-dimensional image coordinate system.
Optionally, the step of determining the actual position of each object vertex in the image according to the relative positions of each object vertex and the corresponding image vertex comprises:
determining the reference position of each object vertex in the image according to the relative position of each object vertex and the corresponding image vertex;
for each object vertex, performing corner detection in a preset area where the reference position of the object vertex is located;
and determining the actual position of each object vertex in the image according to the corner detection result.
Optionally, the preset area where the reference position of the object vertex is located is a circular area with a pixel point at the reference position of the object vertex as a circle center and a first preset pixel as a radius;
for each object vertex, performing corner detection in a preset area where a reference position of the object vertex is located, including:
and carrying out corner detection on pixel points in the circular area corresponding to each object vertex, wherein in the corner detection process, all the pixel points with the characteristic value change amplitude larger than a preset threshold value are used as candidate corner points, and the target corner points corresponding to each object vertex are determined from the candidate corner points.
Optionally, the determining, according to the corner detection result, an actual position of each object vertex in the image includes:
for each object vertex, if the corner detection result of the object vertex contains a corner, determining the position of the corner as the actual position of the object vertex in the image, and if the corner detection result of the object vertex does not contain a corner, determining the reference position of the object vertex in the image as the actual position of the object vertex in the image.
Optionally, the step of acquiring a plurality of object vertices in the image includes:
processing the image to obtain a line drawing of a gray scale contour in the image;
merging similar lines in the line drawings to obtain a plurality of reference boundary lines;
processing the image through a trained boundary line area recognition model to obtain a plurality of boundary line areas of an object in the image;
for each boundary line region, determining a target boundary line corresponding to the boundary line region from a plurality of reference boundary lines;
determining edges of objects in the image according to the determined target boundary lines;
and configuring an intersection point of edges of an object in the image as the object vertex.
Optionally, the step of merging similar lines in the line drawing to obtain a plurality of reference boundary lines includes:
combining similar lines in the line drawing to obtain a plurality of initial combined lines, and determining a boundary matrix according to the initial combined lines;
merging similar lines in the plurality of initial merging lines to obtain a target line, and taking the initial merging lines which are not merged as target lines;
And determining a plurality of reference boundary lines from a plurality of target lines according to the boundary matrix.
Optionally, the step of establishing a three-dimensional space coordinate system according to the feature point matching method according to at least two images, and determining the space position of the camera includes:
extracting two-dimensional feature points matched with each other in at least two images;
obtaining a constraint relation of at least two images according to the two-dimensional feature points matched with each other;
and based on the constraint relation, obtaining the three-dimensional space position of the two-dimensional feature points in each image, and further obtaining the space position of the camera corresponding to each image.
In order to solve the above technical problem, according to a second aspect of the present invention, there is also provided a readable storage medium having stored thereon a program which, when executed, implements the object size identifying method as described above.
In order to solve the above-mentioned technical problem, according to a third aspect of the present invention, there is also provided an object size recognition system including a processor and a memory, the memory having stored thereon a program which, when executed by the processor, implements the object size recognition method as described above.
In summary, in the object size recognition method, the readable storage medium and the object size recognition system provided by the present invention, the object size recognition method includes: acquiring at least two images of an object from different viewing angles by photographing; acquiring two-dimensional position information of a plurality of object vertices in each image; establishing a three-dimensional space coordinate system from the at least two images by a feature point matching method and determining the spatial position of the camera; and selecting any one of the images and obtaining three-dimensional spatial position information of the plurality of object vertices based on the calibrated parameter information of the camera and the spatial position of the camera, thereby obtaining the size of the object.
With this configuration, at least two images of an object taken from different viewing angles, combined with the calibrated parameter information of the camera, are sufficient to obtain the size of the object. The operation steps are simple, which solves the problem in the prior art that the size of an object in space cannot be measured.
Drawings
Those of ordinary skill in the art will appreciate that the figures are provided for a better understanding of the present invention and do not constitute any limitation on the scope of the present invention. Wherein:
FIG. 1 is a flow chart of a method of object size identification in accordance with an embodiment of the present invention;
FIG. 2 is a schematic view of a photographed object according to an embodiment of the present invention;
FIG. 3 is a schematic view of another photographed object according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of line merging according to an embodiment of the present invention.
Detailed Description
The invention will be described in further detail below with reference to the drawings and specific embodiments in order to make its objects, advantages and features more apparent. It should be noted that the drawings are in a greatly simplified form, are not drawn to scale, and are provided merely for convenience and clarity in describing the embodiments of the invention. Furthermore, the structures shown in the drawings are often only parts of the actual structures. In particular, since the drawings emphasize different aspects, they are sometimes drawn to different scales.
As used in this specification, the singular forms "a", "an" and "the" include plural referents, the term "or" is generally used in the sense of "and/or", the term "several" is generally used in the sense of "at least one", and the term "at least two" is generally used in the sense of "two or more". The terms "first", "second" and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or an implicit number of the technical features indicated. Thus, a feature defined by "first", "second" or "third" may explicitly or implicitly include one or at least two such features. The specific meaning of the above terms in this specification will be understood by those of ordinary skill in the art according to the specific circumstances.
The invention aims to provide an object size recognition method, a readable storage medium and an object size recognition system, which are used for solving the problem that the object size is difficult to measure in the prior art.
The following description refers to the accompanying drawings.
Referring to fig. 1, an embodiment of the present invention provides an object size recognition method, which includes:
step S1: at least two images of an object at different viewing angles are acquired by photographing. It will be appreciated that each of the images has a plurality of object vertices representing the object. In some embodiments, a binocular camera or depth camera may be used for photographing, and in other embodiments, a cell phone with more than two cameras may be used for photographing.
Step S2: and respectively acquiring two-dimensional position information of a plurality of object vertexes in each image. Here, the two-dimensional positional information of the object vertices refers to coordinates of each object vertex in the image coordinate system.
Step S3: according to at least two images, a three-dimensional space coordinate system is established according to a feature point matching method, and the space position of a camera is determined; and
step S4: and selecting any one image, and obtaining three-dimensional space position information of a plurality of object vertexes based on the parameter information calibrated by the camera and the space position of the camera, thereby obtaining the size of the object.
With this configuration, at least two images of the object taken from different viewing angles, combined with the calibrated parameter information of the camera, are sufficient to obtain the size of the object. The operation steps are simple, which solves the problem in the prior art that the size of an object in space cannot be measured.
Referring to fig. 2, in an exemplary embodiment, the object to be photographed is a rectangle (e.g., a business card) having four edges (i.e., lines), and the junction of two adjacent edges forms an object vertex; that is, the business card in the image has four object vertices a1 to a4. Referring to fig. 3, in another example, the image does not capture the whole area of the object, so the lower-left and upper-right vertices of the object are not contained in the image. In this case, the four edge lines of the business card in the image may be extended to obtain a virtual vertex at the lower-left corner and a virtual vertex at the upper-right corner of the object; together with the actual vertices that were captured, the four object vertices b1 to b4 of the business card are obtained. Of course, the rectangle above is only an example of the photographed object and does not limit its shape; the photographed object may have other planar or three-dimensional shapes. Preferably, however, the object to be photographed should have several vertices to facilitate subsequent recognition and calculation.
After the object vertices in the image are identified, step S2 is performed to obtain the two-dimensional position information of the object vertices. In an alternative example, the step of acquiring the two-dimensional position information of the plurality of object vertices in the image includes:
step SA21: inputting the images into a trained vertex recognition model to obtain the relative positions of each object vertex and the corresponding image vertex. The vertex recognition model herein may be implemented, for example, using machine learning techniques and run on a general purpose computing device or a special purpose computing device, for example. The object vertex recognition model is a neural network model which is obtained through training in advance. For example, the object vertex recognition model may be implemented using a neural network such as a DEEP convolutional neural network (DEEP-CNN). In some embodiments, the image is input into the object vertex recognition model, which can recognize object vertices in the image to obtain the relative position of each object vertex and its corresponding image vertex. It should be understood that the image vertices of the image refer to vertices of edges of the image, for example, in fig. 2, the image is rectangular, and the image vertices are a5 to a8, respectively.
Alternatively, the vertex recognition model may be built through machine learning training. In one exemplary embodiment, the training step of the object vertex recognition model includes:
step SA211, obtaining a training sample set, wherein each sample image in the training sample set is marked with each object vertex of an object in the image and the relative position of each object vertex and the corresponding image vertex;
step SA212, obtaining a test sample set, wherein each sample image in the test sample set is also marked with each object vertex of an object in the image and the relative position of each object vertex and the corresponding image vertex, and the test sample set is different from the training sample set;
step SA213, training the object vertex recognition model based on the training sample set;
step SA214, testing the object vertex recognition model based on the test sample set;
step SA215, when the test result indicates that the recognition accuracy of the object vertex recognition model is smaller than the preset accuracy, increasing the number of samples in the training sample set for retraining; and
and step SA216, finishing training when the test result indicates that the recognition accuracy of the object vertex recognition model is greater than or equal to the preset accuracy.
Optionally, the type of the object to be measured is not particularly limited, and can be two-dimensional objects such as business cards, test papers, laboratory sheets, documents, invoices and the like, or three-dimensional objects. For each object type, a certain number of sample images marked with corresponding information are acquired, and the number of sample images prepared for each object type can be the same or different. Each sample image may contain all of the area of the object (as shown in fig. 2) or only a portion of the area of the object (as shown in fig. 3). The sample images acquired for each object type may comprise images taken at different angles of view, under different lighting conditions, as much as possible. In these cases, the corresponding information noted for each sample image may further include information such as the imaging angle and illumination of the sample image.
The sample image subjected to the labeling process can be divided into a training sample set for training the object vertex recognition model and a test sample set for testing training results. Typically the number of samples in the training sample set is significantly greater than the number of samples in the test sample set, e.g., the number of samples in the test sample set may be 5% to 20% of the total sample image number, while the number of samples in the corresponding training sample set may be 80% to 95% of the total sample image number. It will be appreciated by those skilled in the art that the number of samples in the training sample set and the test sample set may be adjusted as desired.
The object vertex recognition model can be trained by using a training sample set, and the recognition accuracy of the trained object vertex recognition model can be tested by using a test sample set. If the recognition accuracy rate does not meet the requirement, increasing the number of sample images in a training sample set, and training the object vertex recognition model again by using the updated training sample set until the recognition accuracy rate of the trained object vertex recognition model meets the requirement. If the recognition accuracy meets the requirement, the training is finished. In one embodiment, it may be determined whether training may end based on whether the recognition accuracy is less than a preset accuracy. In this way, the trained object vertex recognition model with the required output accuracy can be used to perform recognition of object vertices in the image.
When an image such as that shown in fig. 3 is used as the sample image, in addition to labeling the object vertices b2 and b4 inside the sample image, adjacent edge lines may be extended to obtain the object vertices b1 and b3 that lie outside the sample image, and these vertices may also be labeled, together with the relative positions of the object vertices b1 to b4 and their corresponding image vertices.
In this way, when the sample image marked according to the marking manner is used to train the object vertex recognition model, the object vertex recognition model can recognize not only the object vertices located in the image but also the object vertices located outside the image and the relative positions of the object vertices and the corresponding image vertices when recognizing the image similar to fig. 3. Furthermore, when the sample image is marked, the object vertex positioned outside the image is obtained by extending the adjacent edge lines, but when the image is identified by the object vertex identification model after training, the object vertex positioned outside the image is obtained by directly obtaining the coordinates of the external object vertex and the corresponding image vertex without extending the edge lines.
Preferably, in the training step of the object vertex recognition model, in step SA211, when labeling the relative positions of each object vertex of the object in the sample image and its corresponding image vertex, the relative position of each image vertex closest to the object vertex is preferably labeled. Taking the image shown in fig. 2 as an example of a sample image, since the object vertex a1 and the image vertex a5 are closest in distance, the relative positions of the object vertex a1 and the image vertex a5 are labeled, that is, the coordinates of the object vertex a1 are converted into the coordinates with the image vertex a5 as the origin for the object vertex a1, the coordinates of the object vertex a2 are converted into the coordinates with the image vertex a6 as the origin for the object vertex a2, the coordinates of the object vertex a3 are converted into the coordinates with the image vertex a7 as the origin for the object vertex a3, and the coordinates of the object vertex a4 are converted into the coordinates with the image vertex a8 as the origin for the object vertex a 4.
And if the sample image marked according to the marking mode is used for training the object vertex recognition model, the recognition result of the object vertex recognition model is that the relative position of each object vertex in the image relative to the image vertex closest to the object vertex in the image is recognized.
Taking the image shown in fig. 2 as an example, after the object vertex recognition model is used for recognition, the relative position of the object vertex a1 with respect to the image vertex a5 (that is, the coordinates of the object vertex a1 when the image vertex a5 is the origin), the relative position of the object vertex a2 with respect to the image vertex a6 (that is, the coordinates of the object vertex a2 when the image vertex a6 is the origin), the relative position of the object vertex a3 with respect to the image vertex a7 (that is, the coordinates of the object vertex a3 when the image vertex a7 is the origin), and the relative position of the object vertex a4 with respect to the image vertex a8 (that is, the coordinates of the object vertex a4 when the image vertex a8 is the origin) can be obtained.
Step SA22: and determining the actual position of each object vertex in the image according to the relative positions of each object vertex and the corresponding image vertex.
In some embodiments, the relative positions of the object vertices and the image vertices closest to the object vertices in the image are converted into coordinates of the object vertices in a target coordinate system, so as to obtain the actual positions of the object vertices in the image.
Step SA23: and according to the actual position of each object vertex in the image, taking a reference point of the image as the coordinate origin of a two-dimensional image coordinate system, and obtaining the two-dimensional position information of each object vertex in the two-dimensional image coordinate system.
Preferably, the target coordinate system is a two-dimensional image coordinate system, and the origin point of the target coordinate system is a position point in the image. Taking the image shown in fig. 2 as an example, in step SA21, coordinates of the object vertex a1 at the origin of the image vertex a5, coordinates of the object vertex a2 at the origin of the image vertex a6, coordinates of the object vertex a3 at the origin of the image vertex a7, and coordinates of the object vertex a4 at the origin of the image vertex a8 are obtained. Since the coordinates of the vertices of the objects obtained at this time are not coordinates in the same coordinate system, the coordinates of the vertices of the objects need to be converted into coordinates in the same coordinate system, specifically, in step SA23, the coordinates of the vertices of the 4 objects may be converted into coordinates with the same position point as the origin of the common coordinate system, so as to facilitate determination of the actual positions of the vertices of the objects in the image.
Since the same position point is a specific position in the image, the relative coordinates of each image vertex of the image and the position point are known, and the relative coordinates of each object vertex when the position point is taken as the origin of the coordinate system can be obtained.
For example, in some embodiments, the origin of the target coordinate system may be the center point of the image. In other embodiments, the origin of the target coordinate system is an image vertex of the image. Taking the image shown in fig. 2 as an example, the origin of the target coordinate system may be, for example, the image vertex a5, so that coordinate values of the object vertices a1 to a4 when the image vertex a5 is taken as the origin of the coordinate system can be obtained, and further, the actual positions of the object vertices a1 to a4 in the image can be obtained.
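As an illustration of this conversion, the following minimal Python sketch maps a vertex given relative to its nearest image vertex into a single coordinate system with the top-left image vertex as origin. It is a sketch only: the assignment of a5 to a8 to the top-left, top-right, bottom-right and bottom-left corners, the axis orientation of the relative coordinates, and the function name are assumptions for illustration, not taken from the patent.

```python
def to_common_coords(rel_xy, nearest_vertex, width, height):
    """Convert a vertex position given relative to its nearest image vertex into the
    common two-dimensional image coordinate system whose origin is image vertex a5
    (assumed here to be the top-left corner of a width x height image)."""
    image_vertices = {          # assumed absolute positions of the image vertices
        "a5": (0, 0),           # top-left
        "a6": (width, 0),       # top-right
        "a7": (width, height),  # bottom-right
        "a8": (0, height),      # bottom-left
    }
    vx, vy = image_vertices[nearest_vertex]
    return vx + rel_xy[0], vy + rel_xy[1]

# Example: a vertex recognized at (-120, -35) relative to a7 in a 1920x1080 image
# maps to (1800, 1045) in the common coordinate system with a5 as the origin.
print(to_common_coords((-120, -35), "a7", 1920, 1080))
```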
After the step S2 is completed to obtain the two-dimensional position information of the object vertex, in step S3, a three-dimensional space coordinate system is established according to the feature point matching method. Preferably, the step of establishing a three-dimensional space coordinate system according to the feature point matching method based on at least two images, and determining the space position of the camera includes:
step S31: extracting two-dimensional feature points matched with each other in at least two images;
step S32: obtaining a constraint relation of at least two images according to the two-dimensional feature points matched with each other;
step S33: and based on the constraint relation, obtaining the three-dimensional space position of the two-dimensional feature points in each image, and further obtaining the space position of the camera corresponding to each image.
In one example, the ORB algorithm is used to quickly find and extract, in each image, all two-dimensional feature points that do not change with camera movement, rotation or changes in illumination. The two-dimensional feature points of the images are then matched against each other to extract the mutually matched two-dimensional feature points. A two-dimensional feature point consists of two parts: a key point (keypoint) and a descriptor. The key point is the position of the feature point in the image, and some key points also carry orientation and scale information; the descriptor is usually a vector that describes, in a hand-designed way, the pixels around the key point. Descriptors are typically designed so that features with a similar appearance have similar descriptors; therefore, when matching two-dimensional feature points, two points can be considered a match as long as their descriptors are close in the vector space. In this embodiment, during matching, the key points of each image are extracted, the descriptor of each two-dimensional feature point is computed at the key point position, and matching is performed on the descriptors, so as to extract the mutually matched two-dimensional feature points of the images. Of course, there are other ways to match the two-dimensional feature points, such as coarse matching or nearest-neighbor search, which are not enumerated here; those skilled in the art may choose among them according to the actual situation.
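For concreteness, the following is a minimal sketch of ORB feature extraction and descriptor matching with OpenCV. It only illustrates the general idea described above; the file names, the feature count and the use of brute-force Hamming matching with cross-check are assumptions, not requirements of the patent.

```python
import cv2

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)            # key points + binary descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Descriptors that are close in (Hamming) distance are treated as mutual matches
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = [kp1[m.queryIdx].pt for m in matches]    # matched 2-D feature points, image 1
pts2 = [kp2[m.trainIdx].pt for m in matches]    # matched 2-D feature points, image 2
```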
After the two-dimensional feature points of each image are matched, the three-dimensional space position of the camera corresponding to the image can be obtained according to any one of the images (the lens orientation of the camera is always perpendicular to the two-dimensional plane of the image obtained by shooting). And according to the positions of the cameras corresponding to the pictures, converting all the two-dimensional characteristic points in the pictures into three-dimensional characteristic points, forming a three-dimensional space, and establishing a three-dimensional space coordinate system.
It can be understood that a constraint relationship exists between the two-dimensional projections of the same three-dimensional feature point under different viewing angles (when the camera both rotates and translates): the epipolar constraint. The basis (fundamental) matrix is an algebraic representation of this constraint, which is independent of the structure of the scene and depends only on the camera's intrinsic and extrinsic parameters. For mutually matched two-dimensional feature points p_1 and p_2, the basis matrix F satisfies the following relationship:
p_2^T F p_1 = 0, where F = K^{-T} [t]_\times R K^{-1}    (1)
where K is the intrinsic matrix of the camera. That is, the basis matrix F of each image pair can be computed from the mutually matched two-dimensional feature point pairs alone (at least 7 pairs), and the rotation matrix R and translation vector t of the camera are then obtained by decomposing F, giving the spatial position of the camera in the three-dimensional space coordinate system.
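A minimal sketch of this step with OpenCV is shown below: the basis (fundamental) matrix is estimated from the matched points with RANSAC, converted to the essential matrix with the calibrated intrinsics, and decomposed into R and t. The RANSAC threshold and confidence values are illustrative assumptions.

```python
import numpy as np
import cv2

def estimate_relative_pose(pts1, pts2, K):
    """pts1/pts2: Nx2 mutually matched 2-D feature points (e.g. from the ORB sketch above);
    K: the calibrated 3x3 intrinsic matrix. Returns the camera rotation R and translation t."""
    pts1, pts2 = np.float32(pts1), np.float32(pts2)
    # Epipolar constraint p2^T F p1 = 0, estimated robustly (needs at least 7 pairs)
    F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    E = K.T @ F @ K                              # essential matrix: E = K^T F K
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t, F, inliers
```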
Further, the homography matrix H can provide additional constraints between images. When the camera captures two images of the same scene with only rotation and no translation, the epipolar constraint between the two images no longer applies, and the homography matrix H can be used instead to describe their relationship. Thus both the basis matrix F and the homography matrix H can represent the constraint relationship between two images, but each suits a different scenario: the basis matrix represents the epipolar constraint and requires the camera position to change by both rotation and translation, while the homography matrix requires rotation only, without translation. In this embodiment, the appropriate matrix is selected according to the camera configuration of each image. For the procedure of computing the basis matrix and the homography matrix of each image, please refer to the prior art; this embodiment does not describe it in detail.
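As a sketch of how such a selection might be made (the inlier-ratio heuristic and the 1.2 factor below are purely illustrative assumptions, not the patent's criterion), the homography can be estimated alongside F and preferred when it explains clearly more matches:

```python
import numpy as np
import cv2

def prefers_homography(pts1, pts2, f_inlier_mask, ratio=1.2):
    """Return (True, H) when the homography H explains clearly more matches than the
    basis matrix did, which suggests a rotation-only (or planar) configuration."""
    H, h_mask = cv2.findHomography(np.float32(pts1), np.float32(pts2), cv2.RANSAC, 3.0)
    return h_mask.sum() > ratio * f_inlier_mask.sum(), H
```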
After the spatial position of the camera is determined in step S3, in step S4 any one image is selected, and based on the calibrated parameter information of the camera and the spatial position of the camera, the three-dimensional spatial position information of the plurality of object vertices in that image is obtained, thereby obtaining the actual size of the object.
The purpose of camera calibration is to determine the values of certain parameters of the camera. In general, this parameter information establishes a mapping between the three-dimensional coordinate system determined by the calibration board and the camera image coordinate system; in other words, with this parameter information a point in three-dimensional space can be mapped into the image space, and vice versa. The parameters a camera needs to calibrate are generally divided into extrinsic parameters and intrinsic parameters. The extrinsic parameters determine the position and orientation of the camera in a given three-dimensional space, and the extrinsic matrix describes how a point in three-dimensional space (world coordinates) is rotated and translated before it falls into the image space (camera coordinates). Both the rotation and the translation of the camera are extrinsic parameters; they describe the camera's motion in a static scene, or the rigid motion of a moving object when the camera is stationary. Therefore, in image stitching or three-dimensional reconstruction, the extrinsic parameters are needed to find the relative motion between several images so that they can be registered in a common coordinate system.
The intrinsic parameters are parameters internal to the camera and are generally inherent properties of the camera. The intrinsic matrix describes how a point in three-dimensional space, after falling into the image space, passes through the camera lens and becomes a pixel through optical imaging and electronic conversion. It should be noted that a real camera lens also has radial and tangential distortion; these distortion parameters also belong to the intrinsic parameters of the camera and can be obtained by calibration in advance.
The specific calibration method of the camera can be understood by those skilled in the art from the prior art; for example, Zhang's calibration method may be used. Once the extrinsic and intrinsic parameters of the camera are calibrated, the three-dimensional spatial position information of the object vertices in the picture can be obtained based on the spatial position of the camera, so that the actual size of the object can be calculated.
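To make the last two steps concrete, the sketch below calibrates the intrinsics with cv2.calibrateCamera (Zhang-style calibration over checkerboard views) and then triangulates the object vertices seen in two views to measure edge lengths. It is a sketch under assumptions: the vertex correspondences are taken as given, and the result is expressed in the units of the reconstruction's global scale (a monocular reconstruction fixes scale only via the calibration or another known reference).

```python
import numpy as np
import cv2

def calibrate_intrinsics(objpoints, imgpoints, image_size):
    """Zhang-style calibration: objpoints/imgpoints are the usual lists of 3-D board
    corners and their detected 2-D image positions over several checkerboard views."""
    _, K, dist, _, _ = cv2.calibrateCamera(objpoints, imgpoints, image_size, None, None)
    return K, dist

def object_size_from_two_views(K, R1, t1, R2, t2, verts1, verts2):
    """Triangulate the object vertices observed in two views (poses from step S3) and
    return their 3-D positions plus the lengths of consecutive edges."""
    P1 = K @ np.hstack([R1, np.reshape(t1, (3, 1))])
    P2 = K @ np.hstack([R2, np.reshape(t2, (3, 1))])
    pts4d = cv2.triangulatePoints(P1, P2, np.float32(verts1).T, np.float32(verts2).T)
    pts3d = (pts4d[:3] / pts4d[3]).T                      # Nx3 vertex coordinates
    n = len(pts3d)
    edges = [np.linalg.norm(pts3d[i] - pts3d[(i + 1) % n]) for i in range(n)]
    return pts3d, edges
```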
In another alternative example, step SA22, according to the relative positions of each object vertex and its corresponding image vertex, determines the actual position of each object vertex in the image, including:
step SA221: determining the reference position of each object vertex in the image according to the relative position of each object vertex and the corresponding image vertex;
step SA222: for each object vertex, performing corner detection in a preset area where the reference position of the object vertex is located;
step SA223: and determining the actual position of each object vertex in the image according to the corner detection result.
In this exemplary embodiment, unlike the previous exemplary embodiment, the position of each object vertex in the image obtained by using the relative positions of each object vertex and the image vertex corresponding thereto is not directly taken as the actual position, but is determined as the reference position of each object vertex in the image. And then, carrying out corner detection at the reference position of each object vertex, and finally determining the actual position of each object vertex in the image according to the result of the corner detection. The position of the vertex of the object is corrected by adopting the method of corner detection, so that the edge detection of the object with the edge in the image is realized, and the accuracy of the edge and vertex detection is also improved.
The following is also described by way of example in fig. 2 and 3. In step SA221, the relative positions of each object vertex and the image vertex closest to the object vertex in the image are converted into the reference coordinates of the object vertex in the target coordinate system, so as to obtain the reference positions of each object vertex in the image.
In step SA222, a corner point is an extreme point, i.e., a point with a particularly salient attribute in some respect, such as an isolated point of maximum or minimum intensity or an endpoint of a line segment. A corner point is generally defined as the intersection of two edges; more strictly, the local neighborhood of a corner point should contain the boundaries, in two different directions, of two different regions. In practice, most so-called corner detection methods detect image points that have specific features, not just "corners". These feature points have specific coordinates in the image and have certain mathematical characteristics, such as a local maximum or minimum of the gray level, or certain gradient features.
The basic idea of a corner detection algorithm is to slide a fixed window (a neighborhood window around a pixel) over the image in every direction and to compare the window content before and after sliding; if sliding in any direction produces a large change in gray level, a corner can be considered to exist within the window.
In general, any object vertex of an object with an edge corresponds to a corner point in the image. And detecting the corner points corresponding to the vertexes of each object by detecting the corner points in a preset area where the reference positions of the vertexes of each object are located.
Preferably, the preset area where the reference position of the object vertex is located is a circular area taking a pixel point at the reference position of the object vertex as a circle center and taking a first preset pixel as a radius; the first preset pixels range from 10 to 20 pixels, preferably 15 pixels, for example.
For each object vertex, performing corner detection in a preset area where a reference position of the object vertex is located, including: and carrying out corner detection on pixel points in the circular area corresponding to each object vertex, wherein in the corner detection process, all the pixel points with the characteristic value change amplitude larger than a preset threshold value are used as candidate corner points, and the target corner points corresponding to each object vertex are determined from the candidate corner points. The characteristic value change amplitude refers to the change degree of pixel gray in a fixed window for corner detection. It will be appreciated that the smaller the amplitude of the eigenvalue variation, the less likely it is that the pixel point is a corner point. By comparing the characteristic value variation amplitude with a preset threshold value, pixel points with small corner possibility can be removed, and pixel points with large corner possibility are reserved as candidate corner points, so that further determination of target corner points from the candidate corner points is facilitated. Specific corner detection algorithms include a corner detection algorithm based on a gray level map, a corner detection algorithm based on a binary image, a corner detection algorithm based on a contour curve, etc., and the specific reference may be made to the prior art, and details thereof will not be described herein.
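The following sketch illustrates one way to implement this with a Harris response restricted to the circular region; the block size, aperture, k and the relative threshold are illustrative assumptions standing in for the "preset threshold" of the characteristic-value change amplitude.

```python
import numpy as np
import cv2

def candidate_corners(gray, ref_pt, radius=15, rel_thresh=0.01):
    """Return candidate corner pixels (and their response values) inside the circular
    region of the given radius (the 'first preset pixel') around a vertex's reference
    position ref_pt = (x, y), keeping only pixels whose Harris response exceeds the
    threshold."""
    resp = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.mgrid[0:gray.shape[0], 0:gray.shape[1]]
    in_circle = (xs - ref_pt[0]) ** 2 + (ys - ref_pt[1]) ** 2 <= radius ** 2
    keep = in_circle & (resp > rel_thresh * resp.max())
    points = np.column_stack([xs[keep], ys[keep]]).astype(float)
    return points, resp[keep]
```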
Specifically, the determining, from the candidate corner points, the target corner point corresponding to each object vertex includes:
step SA2221: sorting the candidate corner points in descending order of their characteristic-value change amplitude, determining the first-ranked candidate corner point as a target corner point, and taking the second-ranked candidate corner point as the current candidate corner point;
step SA2222: judging whether the distances between the current candidate corner point and all the current target corner points are larger than a second preset pixel or not; if yes, go to step SA2223, otherwise go to step SA2224;
step SA2223: determining the current candidate corner point as the target corner point;
step SA2224, discarding the current candidate corner, determining the candidate corner next to be the current candidate corner, and returning to step SA2222.
It will be appreciated that, when sorted in descending order of characteristic-value change amplitude, the first-ranked candidate corner point has the largest change amplitude and is therefore the most likely to be a corner point, so it can be determined directly as a target corner point. The second-ranked candidate corner point may lie in the circular area of the same object vertex as the first-ranked candidate (assumed to be object vertex 1), or in the circular area of another object vertex (assumed to be object vertex 2). In the first case, since the first-ranked candidate has already been determined as the target corner point of object vertex 1, the second-ranked candidate cannot also be determined as the target corner point of object vertex 1. In the second case, the second-ranked candidate is necessarily the pixel most likely to be a corner within the circular area of object vertex 2, so it must be determined as the target corner point of object vertex 2. Based on this consideration, the present embodiment decides which case the second-ranked candidate belongs to by judging whether its distance to every existing target corner point is greater than the second preset pixel. If the distance between the second-ranked candidate and every target corner point is greater than the second preset pixel, the candidate belongs to the second case; otherwise it belongs to the first case. In the second case the candidate is determined as a target corner point, and in the first case it is discarded. Each subsequent candidate corner point is judged by the same logic, so that a plurality of target corner points are finally determined from the candidate corner points.
Through the processing, the situation that at most one candidate corner point remains around each object vertex can be ensured, and the position of the remaining candidate corner point is the actual position of the object vertex. Preferably, the range of the second preset pixels may be greater than or equal to 50 pixels, and the upper limit value of the second preset pixels may be set according to the specific size of the image, which is not limited herein.
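A minimal sketch of this selection logic (greedy suppression by response amplitude and minimum distance) is shown below; the candidate points and scores are assumed to come from something like the candidate_corners sketch above, pooled over all object vertices.

```python
import numpy as np

def select_target_corners(candidates, scores, min_dist=50):
    """Sort candidates in descending order of characteristic-value change amplitude and
    keep a candidate only if it is farther than min_dist (the 'second preset pixel')
    from every target corner already selected."""
    order = np.argsort(scores)[::-1]
    targets = []
    for i in order:
        p = np.asarray(candidates[i], dtype=float)
        if all(np.linalg.norm(p - t) > min_dist for t in targets):
            targets.append(p)
    return targets
```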
It should be noted that, in the process of detecting the corner point of an object vertex, there may be a case where the corner point cannot be detected, for example, the preset area of the object vertex and the background of the image change little and the corner point cannot be detected, or the object vertex is outside the image (for example, the object vertices b1 and b3 in fig. 3) and the corner point does not exist at all. For the case that no corner point is detected, the object vertex can also be considered as a corner point.
Preferably, in step SA223, the step of determining the actual position of each object vertex in the image according to the corner detection result includes:
For each object vertex, if the corner detection result of the object vertex contains a corner point, the position of that corner point is determined as the actual position of the object vertex in the image; if the corner detection result of the object vertex does not contain a corner point, the reference position of the object vertex in the image is determined as its actual position. In other words, for an object vertex around which a candidate corner point remains, the corresponding corner point replaces the reference position as the actual object vertex.
Through the processing, the actual position of the object vertex in the image can be corrected according to the coordinates of the detected corner point, so that the position detection of the object vertex is more accurate.
In another alternative example, the recognition of object vertices in the image may be different from the previous example, in which the object vertices are obtained by edge intersection after recognition of the edges, rather than direct recognition. Specifically, the step of obtaining the vertices of the plurality of objects in the image in step S2 includes:
step SB21: processing the image to obtain a line drawing of a gray scale contour in the image;
step SB22: merging similar lines in the line drawings to obtain a plurality of reference boundary lines;
step SB23: processing the image through a trained boundary line area recognition model to obtain a plurality of boundary line areas of an object in the image;
step SB24: for each boundary line region, determining a target boundary line corresponding to the boundary line region from a plurality of reference boundary lines;
step SB25: determining edges of objects in the image according to the determined target boundary lines;
step SB26: and configuring an intersection point of edges of an object in the image as the object vertex.
In step SB21, the image includes an object having edges, the line drawing includes a plurality of lines, and the line drawing is a gray scale drawing. The edge is not limited to a straight edge, but may be an arc, a line segment having a shape of fine wave, zigzag, or the like. The image may be a gray scale image or a color image. For example, the image may be an original image obtained by directly capturing with a camera, or may be an image obtained by preprocessing the original image. For example, in order to avoid the influence of the data quality, data imbalance, and the like of the image on the object edge detection, an operation of preprocessing the image may be further included before processing the image. Preprocessing may eliminate extraneous or noise information in the image to facilitate better processing of the image.
Further, step SB21 may include: and processing the image by an edge detection algorithm to obtain a line drawing of the gray scale contour in the image.
In some embodiments, the input image may be processed, such as by an OpenCV-based edge detection algorithm, to obtain a line drawing of the gray scale profile in the input image. OpenCV is an open source computer vision library, and the edge detection algorithm based on OpenCV comprises a plurality of algorithms such as Sobel, scarry, canny, laplacian, prewitt, marr-Hildresh, scharr. One skilled in the art can select an appropriate edge detection algorithm according to the prior art. And will not be described again here.
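As a small illustration of this step (a sketch only; the blur kernel, Canny thresholds and Hough parameters are illustrative assumptions), a Canny contour map and its line segments can be obtained as follows:

```python
import numpy as np
import cv2

img = cv2.imread("object.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)        # optional pre-processing against noise
edges = cv2.Canny(blur, 50, 150)                # line drawing of the gray-scale contours

# Optionally extract straight line segments from the contour map for later merging
segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                           minLineLength=30, maxLineGap=5)
```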
In other embodiments, step SB21 may comprise: processing the image through the boundary region identification model to obtain a plurality of boundary regions; and processing the plurality of boundary areas through an edge detection algorithm to obtain a line drawing of the gray scale contour in the image. For example, processing the plurality of boundary regions to obtain a plurality of boundary region annotation boxes; and processing the plurality of boundary region labeling frames through an edge detection algorithm to obtain a line drawing of the gray scale contour in the image.
The boundary region identification model may be implemented using machine learning techniques and run, for example, on a general-purpose or special-purpose computing device. The boundary region identification model is a neural network model obtained by training in advance; for example, it may be implemented using a neural network such as a deep convolutional neural network (DEEP-CNN). In some embodiments, the image is input into the boundary region identification model, which identifies the edges of the object in the image to obtain a plurality of boundary regions (i.e., mask regions of the respective boundaries of the object); the identified boundary regions are then labeled to determine a plurality of boundary region labeling boxes, e.g., each boundary region may be circumscribed with a rectangular box; finally, the labeled boundary region labeling boxes are processed by an edge detection algorithm (e.g., the Canny edge detection algorithm) to obtain a line drawing of the gray-scale contours in the image.
In this embodiment, the edge detection algorithm only needs to perform edge detection on the marked boundary region marking frame, and does not need to perform edge detection on the whole image, so that the calculated amount can be reduced, and the processing speed can be improved. The boundary region labeling frame labels a partial region in the image.
In other embodiments, step SB21 may comprise: performing binarization processing on the image to obtain a binarized image of the image; noise lines in the binarized image are filtered out, so that a line drawing of a gray scale contour in the image is obtained. For example, a corresponding filtering rule may be preset to filter out various line segments, such as the interior of the object, and various relatively small lines, in the binarized image, so as to obtain a line drawing of the gray profile in the image.
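A minimal sketch of this binarize-and-filter variant is given below; the Otsu threshold and the minimum contour length used as the filtering rule are illustrative assumptions.

```python
import numpy as np
import cv2

gray = cv2.cvtColor(cv2.imread("object.jpg"), cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Filter out noise lines: drop contours that are too short to be object boundaries
contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
line_drawing = np.zeros_like(binary)
for c in contours:
    if cv2.arcLength(c, False) >= 40:           # illustrative minimum length
        cv2.drawContours(line_drawing, [c], -1, 255, 1)
```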
In an alternative exemplary embodiment, step SB22 of merging similar lines in the line drawing to obtain a plurality of reference boundary lines includes:
step SB221: combining similar lines in the line drawings to obtain an initial combined line group; the plurality of initial merging line groups are in one-to-one correspondence with the plurality of boundary areas, and each initial merging line group in the plurality of initial merging line groups comprises at least one initial merging line; determining a plurality of boundary connecting strips according to the initial merging line groups, wherein the boundary connecting strips are in one-to-one correspondence with the boundary areas, and the boundary connecting strips are also in one-to-one correspondence with the initial merging line groups; converting the plurality of boundary areas into a plurality of linear groups respectively, wherein the plurality of linear groups are in one-to-one correspondence with the plurality of boundary areas, and each linear group in the plurality of linear groups comprises at least one linear; calculating a plurality of average slopes corresponding to the plurality of straight line groups one by one; respectively calculating the slopes of a plurality of boundary connecting strips; judging whether the difference value between the slope of the ith boundary connecting wire strip and the average slope corresponding to the ith boundary connecting wire strip in the plurality of boundary connecting wire strips is higher than a second slope threshold value or not according to the ith boundary connecting wire strip in the plurality of boundary connecting wire strips, wherein i is a positive integer, and i is less than or equal to the number of the plurality of boundary connecting wire strips; and in response to the difference between the slope of the ith boundary connecting strip and the average slope corresponding to the ith boundary connecting strip being lower than or equal to a second slope threshold, taking the initial merging line in the initial merging line group corresponding to the ith boundary connecting strip and the ith boundary connecting strip as a reference boundary line in the reference boundary line group corresponding to the boundary region corresponding to the ith boundary connecting strip, and in response to the difference between the slope of the ith boundary connecting strip and the average slope corresponding to the ith boundary connecting strip being higher than the second slope threshold, taking the initial merging line in the initial merging line group corresponding to the ith boundary connecting strip as a reference boundary line in the reference boundary line group corresponding to the boundary region corresponding to the ith boundary connecting strip, respectively carrying out the operations on the plurality of boundary connecting strips, thereby determining a plurality of reference boundary lines. In some embodiments, the second slope threshold may range from 0-20 degrees, preferably from 0-10 degrees, more preferably, the second slope threshold may be 5 degrees, 15 degrees, etc.
It is noted that in the embodiments of the present disclosure, "the difference of two slopes" means the difference between the inclination angles corresponding to the two slopes. For example, the inclination angle corresponding to the slope of the i-th boundary connecting line may represent the angle between the i-th boundary connecting line and a given direction (e.g., the horizontal direction or the vertical direction), and the inclination angle corresponding to the average slope may represent the angle between a straight line determined by the average slope and the given direction. For example, the inclination angle of the i-th boundary connecting line (e.g., a first inclination angle) and the inclination angle corresponding to the average slope associated with the i-th boundary connecting line (e.g., a second inclination angle) may be calculated; if the difference between the first inclination angle and the second inclination angle is higher than the second slope threshold, the i-th boundary connecting line is not used as a reference boundary line, and if the difference between the first inclination angle and the second inclination angle is lower than or equal to the second slope threshold, the i-th boundary connecting line can be used as a reference boundary line.
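As a small illustration of this slope-difference test (a sketch only; the function names and the 5-degree default are assumptions drawn from the example values above):

```python
import math

def inclination_deg(slope):
    """Inclination angle, in degrees, of a line with the given slope (relative to the horizontal)."""
    return math.degrees(math.atan(slope))

def strip_may_be_reference(strip_slope, average_slope, second_slope_threshold_deg=5.0):
    """True if the i-th boundary connecting line can serve as a reference boundary line."""
    first_angle = inclination_deg(strip_slope)       # first inclination angle
    second_angle = inclination_deg(average_slope)    # second inclination angle
    return abs(first_angle - second_angle) <= second_slope_threshold_deg
```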
Note that the straight line groups, the average slopes, the boundary regions, and the like will be described later and are not detailed here.
For example, in step SB221, similar lines among the plurality of lines are combined to obtain a plurality of initial merging line groups, and a boundary matrix is determined according to the plurality of initial merging lines. The step of merging similar lines among the plurality of lines includes: acquiring a plurality of long lines among the plurality of lines, wherein each long line is a line whose length exceeds a length threshold; acquiring a plurality of merging line groups according to the plurality of long lines, wherein each merging line group comprises at least two sequentially adjacent long lines, and the included angle between any two adjacent long lines in each merging line group is smaller than an angle threshold; and, for each merging line group, sequentially merging all long lines in the merging line group to obtain the initial merging line corresponding to that merging line group, the plurality of merging line groups being merged respectively to obtain the initial merging lines in the plurality of initial merging line groups.
For example, the number of all initial merging lines included in the plurality of initial merging line groups is the same as the number of the plurality of merging line groups, and all initial merging lines included in the plurality of initial merging line groups are in one-to-one correspondence with the plurality of merging line groups. It should be noted that, after the initial merging line corresponding to the merging line group is obtained based on the merging line group, a boundary area corresponding to the initial merging line may be determined based on the position of the initial merging line, so as to determine the initial merging line group to which the initial merging line belongs.
It should be noted that, in the embodiments of the present disclosure, "similar lines" means that an included angle between two lines is smaller than an angle threshold.
For example, a long line in the line drawing refers to a line, among the plurality of lines in the line drawing, whose length exceeds a length threshold; for instance, a line longer than 2 pixels is defined as a long line, i.e., the length threshold is 2 pixels, but embodiments of the present disclosure are not limited thereto, and in other embodiments the length threshold may be 3 pixels, 4 pixels, and the like. Only the long lines in the line drawing are acquired for the subsequent merging processing, and the shorter lines in the line drawing are not considered, so that line interference inside and outside the object can be avoided when the lines are merged; for example, lines corresponding to characters and graphics inside the object, to other objects outside the object, and the like can be removed.
For example, a merging line group may be obtained as follows: first, a long line T1 is selected; then, starting from the long line T1, it is sequentially judged whether the included angle between each two adjacent long lines is smaller than the angle threshold; if the included angle between a certain long line T2 and the long line adjacent to T2 is not smaller than the angle threshold, the long line T1, the long line T2, and all sequentially adjacent long lines between T1 and T2 form one merging line group. Then the above process is repeated, i.e., starting from the long line adjacent to T2, it is sequentially judged whether the included angle between each two adjacent long lines is smaller than the angle threshold, and so on until all the long lines are traversed, thereby obtaining the plurality of merging line groups. It should be noted that "two adjacent long lines" means two physically adjacent long lines, that is, there is no other long line between the two adjacent long lines.
For example, the initial merging lines are lines that are longer than the individual long lines from which they are merged.
Fig. 4 is a schematic diagram of a line merging process according to an embodiment of the disclosure.
The procedure for acquiring the merging line groups will be described below with reference to fig. 4. In one embodiment, for example, the long line A is selected first, and it is determined whether the included angle between the long line A and the adjacent long line B is smaller than the angle threshold; if so, the long lines A and B belong to the same merging line group. Then it is determined whether the included angle between the long line B and the adjacent long line C is smaller than the angle threshold; if so, the long lines A, B and C all belong to the same merging line group. Likewise, if the included angle between the long line C and the adjacent long line D is also smaller than the angle threshold, the long line D joins the same merging line group. If the included angle between the long line D and the adjacent long line E is not smaller than the angle threshold, the long line E does not belong to this group, and the long lines A, B, C and D form the first merging line group. Then, starting from the long line E, it is sequentially determined whether the included angle between each two adjacent long lines is smaller than the angle threshold; in this way, for example, the long lines G, H, I and J can be obtained as the second merging line group, and the long lines M, N and O can be obtained as the third merging line group.
For example, in another embodiment, one long line may first be selected arbitrarily from the plurality of long lines, for example the long line D, whose adjacent long lines include the long line C and the long line E. It is then determined whether the included angle between the long lines D and C is smaller than the angle threshold, and whether the included angle between the long lines D and E is smaller than the angle threshold. Because the included angle between D and C is smaller than the angle threshold, the long lines D and C belong to the same merging line group; because the included angle between D and E is larger than the angle threshold, the long lines D and E belong to different merging line groups. Then, on the one hand, the judgement may continue with the other sequentially adjacent long lines starting from the long line C, thereby determining the other long lines that belong to the same merging line group as the long line D, and other merging line groups may also be determined; on the other hand, the included angles between the other sequentially adjacent long lines may be judged starting from the long line E, thereby determining other merging line groups. Finally, it is determined that the long lines A, B, C and D belong to one merging line group, the long lines G, H, I and J belong to one merging line group, and the long lines M, N and O also belong to one merging line group.
For example, the included angle θ between two adjacent long lines is calculated by the following formula:

θ = arccos( (u · v) / (|u| · |v|) )

where u and v respectively represent the direction vectors of the two adjacent long lines. For example, the value of the angle threshold may be set according to the actual situation; in some embodiments, the angle threshold may range from 0 to 20 degrees, preferably from 0 to 10 degrees, for example 5 degrees, 15 degrees, and so on.
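A possible sketch of this angle test and of the grouping of sequentially adjacent long lines is given below (Python; each long line is assumed to be stored as ((x1, y1), (x2, y2)), and the input list is assumed to be non-empty and already ordered so that consecutive entries are physically adjacent):

```python
import math

def included_angle_deg(line_a, line_b):
    """Included angle between two segments via the normalised dot product of their vectors."""
    (ax1, ay1), (ax2, ay2) = line_a
    (bx1, by1), (bx2, by2) = line_b
    ux, uy = ax2 - ax1, ay2 - ay1
    vx, vy = bx2 - bx1, by2 - by1
    cos_t = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    cos_t = max(-1.0, min(1.0, cos_t))           # guard against rounding error
    return math.degrees(math.acos(cos_t))

def group_long_lines(long_lines, angle_threshold_deg=10.0):
    """Split an ordered, non-empty list of long lines into merging line groups at large angle breaks."""
    groups, current = [], [long_lines[0]]
    for prev, cur in zip(long_lines, long_lines[1:]):
        if included_angle_deg(prev, cur) < angle_threshold_deg:
            current.append(cur)                  # still the same merging line group
        else:
            groups.append(current)               # break: start a new group here
            current = [cur]
    groups.append(current)
    # Groups containing a single line correspond to long lines that are not merged
    # (for example the lines E, F, K and L in fig. 4).
    return groups
```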
For example, merging two long lines may mean averaging the slopes of the two long lines to obtain a slope average, which is taken as the slope of the merged line. In practical applications, the two long lines are merged according to their array form: for a first long line and a second long line, merging them means directly connecting the start point (i.e., line head) of the first long line with the end point (i.e., line tail) of the second long line to form a new, longer line; that is, in the coordinate system corresponding to the line drawing, the start point of the first long line and the end point of the second long line are connected by a straight line to obtain the merged line. For example, the coordinate values of the pixel point corresponding to the start point of the first long line are used as the coordinate values of the pixel point corresponding to the start point of the merged line, the coordinate values of the pixel point corresponding to the end point of the second long line are used as the coordinate values of the pixel point corresponding to the end point of the merged line, and finally the coordinate values of the start point and the end point of the merged line are formed into an array of the merged line and stored. The long lines in each merging line group are merged sequentially in this way, thereby obtaining the corresponding initial merging line.
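A compact sketch of this array-form merge (assumed representation: each line is ((start_x, start_y), (end_x, end_y)); the helper names are assumptions):

```python
def merge_two_lines(first, second):
    """Connect the start point (line head) of the first line with the end point (line tail) of the second."""
    start, _ = first
    _, end = second
    return (start, end)

def merge_group(group):
    """Sequentially fold a merging line group into its initial merging line."""
    merged = group[0]
    for nxt in group[1:]:
        merged = merge_two_lines(merged, nxt)
    return merged

# Example: the first merging line group of fig. 4 would be folded as
# initial_line_1 = merge_group([line_A, line_B, line_C, line_D])
```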
For example, as shown in fig. 4, the long lines A, B, C and D in the first merging line group are merged sequentially to obtain the initial merging line corresponding to that merging line group: first, the long lines A and B may be merged to obtain a first merged line, then the first merged line and the long line C are merged to obtain a second merged line, and then the second merged line and the long line D are merged to obtain the initial merging line 1 corresponding to the first merging line group. Similarly, the long lines in the second merging line group are merged to obtain the initial merging line 2 corresponding to the second merging line group, and the long lines in the third merging line group are merged to obtain the initial merging line 3 corresponding to the third merging line group. After the respective merging line groups are merged, the long lines E, F, K and L remain unmerged.
In addition, the boundary matrix is determined as follows: the initial merging lines and the long lines that are not merged are redrawn, the position information of the pixel points of all the redrawn lines is mapped onto the whole image matrix, the values at the positions of the pixel points of the lines in the image matrix are set to a first value, and the values at the other positions are set to a second value, thereby forming the boundary matrix. Specifically, the boundary matrix may be a matrix having the same size as the image matrix; for example, if the size of the image is 1024×1024 pixels and the image matrix is 1024×1024, then the boundary matrix is a 1024×1024 matrix. The initial merging lines and the unmerged long lines are redrawn according to a certain line width (for example, a line width of 2), and the boundary matrix is filled with values according to the positions of the pixels of the redrawn lines in the matrix: the positions corresponding to the pixels of the lines are all set to the first value, for example 255, and the positions not corresponding to the pixels of the lines are set to the second value, for example 0, thereby forming a matrix covering the whole image, i.e., the boundary matrix. It should be noted that, since the plurality of initial merging lines and the unmerged long lines are all stored in the form of arrays, the lines need to be turned into actual line data when the boundary matrix is determined; therefore, the lines are redrawn, for example with a line width of 2, so as to obtain the coordinate values of the pixel points corresponding to each point on each line, and the boundary matrix is then filled with values according to the obtained coordinate values, for example, the values at the positions corresponding to these coordinate values in the boundary matrix are set to 255 and the values at the remaining positions are set to 0.
An exemplary boundary matrix is a 10×10 matrix in which the positions holding the first value 255, when connected, form the plurality of initial merging lines and the unmerged long lines.
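As an illustration of how such a boundary matrix might be constructed (a sketch assuming NumPy and OpenCV; the function name and default values are assumptions):

```python
import cv2
import numpy as np

def build_boundary_matrix(lines, image_shape, line_width=2, first_value=255):
    """Redraw the initial merging lines and unmerged long lines into a matrix of the image size.

    lines: iterable of ((x1, y1), (x2, y2)); image_shape: (height, width).
    Drawn pixels take the first value (255); all other positions keep the second value (0).
    """
    matrix = np.zeros(image_shape[:2], dtype=np.uint8)
    for (x1, y1), (x2, y2) in lines:
        cv2.line(matrix, (int(x1), int(y1)), (int(x2), int(y2)),
                 color=int(first_value), thickness=line_width)
    return matrix
```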
Step SB222: merging similar lines among the plurality of initial merging lines to obtain target lines, and taking the initial merging lines that are not merged as target lines as well.
The initial merging lines obtained in step SB221 are a plurality of longer lines. In step SB222 it may be further determined, according to the merging rule of step SB221, whether similar lines exist among the plurality of initial merging lines, so that the similar lines are merged again to obtain a plurality of target lines; meanwhile, the initial merging lines that cannot be merged are also used as target lines.
The specific steps of merging similar lines among the plurality of initial merging lines to obtain target lines are as follows. Step a: obtaining a plurality of groups of second-type lines from the plurality of initial merging lines, where each group of second-type lines comprises at least two sequentially adjacent initial merging lines and the included angle between any two adjacent initial merging lines in the group is smaller than a third preset threshold. Step b: for each group of second-type lines, sequentially merging all the initial merging lines in the group to obtain a target line.
The principle of the above step of merging the initial merging lines is the same as that of merging the lines in the line drawing in step SB221; reference may be made to the description of step SB221, which is not repeated here. The third preset threshold may be the same as or different from the second preset threshold, which is not limited in this embodiment; for example, the third preset threshold may be set to an included angle of 10 degrees. After the above merging step is applied to the initial merging lines 1, 2 and 3, since the included angle between the initial merging lines 1 and 2 is smaller than the third preset threshold, the initial merging lines 1 and 2 can be further merged into the target line 12; since the included angle between the initial merging line 3 and the initial merging line 2 is larger than the third preset threshold, the initial merging line 3 cannot be merged and is directly used as a target line, as shown in the before-and-after comparison of line merging in fig. 4.
A plurality of target lines are thus obtained. Among the target lines there are not only the reference boundary lines but also longer interference lines, for example longer lines obtained by merging the lines corresponding to characters and graphics inside the object or to other objects outside it; these interference lines are removed by the subsequent processing and rules (specifically, the processing of step SB223 and step SB23).
Step SB223: determining a plurality of reference boundary lines from the plurality of target lines according to the boundary matrix. Specifically, determining the plurality of reference boundary lines from the plurality of target lines according to the boundary matrix includes: first, for each target line, extending the target line, determining a line matrix according to the extended target line, comparing the line matrix with the boundary matrix, and counting the number of pixel points on the extended target line that belong to the boundary matrix as the score of the target line, wherein the line matrix and the boundary matrix have the same size; then, determining the plurality of reference boundary lines from the plurality of target lines according to the score of each target line.
The line matrix may be determined as follows: the extended target line is redrawn, the position information of the pixel points of the redrawn line is mapped onto the whole image matrix, the values at the positions of the pixel points of the line in the image matrix are set to the first value and the values at the other positions are set to the second value, thereby forming the line matrix. The line matrix is formed in a similar manner to the boundary matrix and is not described in detail here. The target line is stored in array form, that is, the coordinate values of its start point and end point are stored; after the target line is extended, the extended target line is likewise stored as an array of the coordinate values of its start point and end point. Therefore, when the extended target line is redrawn with the same line width, for example a line width of 2, the coordinate values of the pixel points corresponding to all points on the extended target line are obtained, and the line matrix is then filled with values according to these coordinate values, that is, the values at the positions corresponding to the coordinate values in the line matrix are set to 255 and the values at the remaining positions are set to 0.
In other words, the merged target lines are extended, and the target line whose pixel points fall most onto the lines recorded in the boundary matrix (i.e., the initial merging lines and the unmerged long lines) is judged to be a reference boundary line. For each target line, it is judged how many of its pixel points belong to the boundary matrix, and a score is calculated, specifically as follows: the target line is extended, the line obtained after extension is formed into a line matrix in the same way the boundary matrix is formed, and the line matrix is compared with the boundary matrix to judge how many pixel points fall into the boundary matrix, i.e., how many positions hold the same first value, for example 255, in both matrices, so that the score is obtained. There may be several lines with the best scores; therefore, a plurality of target lines with the best scores are determined from the plurality of target lines as the reference boundary lines according to the scores of the respective target lines.
For example, when the line matrix formed from an extended target line is compared with the boundary matrix, it may be determined that 7 pixel points on the extended target line fall into the boundary matrix, which gives the score of that target line.
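A sketch of this scoring step, reusing the matrix construction above (the helper name is an assumption):

```python
import numpy as np

def target_line_score(line_matrix, boundary_matrix, first_value=255):
    """Count pixel positions where both the line matrix and the boundary matrix hold the first value."""
    return int(np.count_nonzero((line_matrix == first_value) &
                                (boundary_matrix == first_value)))

# The target lines with the best scores are then taken as the reference boundary lines.
```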
Preferably, in step SB23, the boundary region identification model may be implemented using machine learning techniques and run, for example, on a general-purpose computing device or a special-purpose computing device. The boundary region recognition model is a neural network model obtained through pre-training. For example, the boundary region identification model may be implemented using a neural network such as a deep convolutional neural network (DEEP-CNN). The boundary region identification model may be the same model as, or a different model from, the model used in step SB21.
First, a boundary region recognition model is established through machine learning training. The boundary region recognition model may be obtained by training as follows: labeling each image sample in an image sample set so as to label the boundary line area, the inner area and the outer area of the object in each image sample; and training a neural network with the labeled image sample set to obtain the boundary region identification model.
For example, with the boundary region recognition model established through machine learning training, three parts of the image can be recognized: the boundary regions, the inner region (i.e., the region where the object is located) and the outer region (i.e., the region outside the object), so that the respective boundary regions of the image are acquired; at this stage, the edge contours in the boundary regions are relatively thick. For example, in some embodiments, the shape of the object may be a rectangle, and the number of boundary regions may be 4; that is, the input image is identified by the boundary region identification model, so that four boundary regions respectively corresponding to the four sides of the rectangle can be obtained.
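Purely as an illustration of consuming such a model's output (not the trained model itself; the class ids 0/1/2 and the assumption that the four sides come out as separate connected blobs are assumptions for this sketch):

```python
import cv2
import numpy as np

def split_boundary_regions(class_mask, boundary_class=2):
    """Given a per-pixel class mask (0 = outer, 1 = inner, 2 = boundary), return one mask per boundary region."""
    boundary = (class_mask == boundary_class).astype(np.uint8)
    num, labels = cv2.connectedComponents(boundary, connectivity=8)
    # Ideally four masks, one per side of a rectangular object.
    return [((labels == i).astype(np.uint8) * 255) for i in range(1, num)]
```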
In some embodiments, the plurality of boundary regions includes a first boundary region, a second boundary region, a third boundary region, and a fourth boundary region. In some embodiments, as shown in fig. 2, the first boundary region may represent a region corresponding to the boundary line A1, the second boundary region may represent a region corresponding to the boundary line A2, the third boundary region may represent a region corresponding to the boundary line A3, and the fourth boundary region may represent a region corresponding to the boundary line A4; in other embodiments, as shown in fig. 3, the first boundary region may represent a region corresponding to the boundary line B1, the second boundary region may represent a region corresponding to the boundary line B2, the third boundary region may represent a region corresponding to the boundary line B3, and the fourth boundary region may represent a region corresponding to the boundary line B4.
It will be appreciated that by identifying the boundary region of the object in the image by the boundary region identification model and then determining the target boundary line from the plurality of reference boundary lines based on the boundary region, misidentified disturbance lines, such as lines falling into the middle of a business card or document, lines in the middle of a form, etc., may be removed.
Preferably, step SB24, i.e., determining, for each boundary line region, the target boundary line corresponding to that boundary line region from the plurality of reference boundary lines, may include: first, calculating the slope of each reference boundary line; then, for each boundary line region, converting the boundary line region into a plurality of straight lines, calculating the average slope of these straight lines, and judging whether there is a reference boundary line whose slope matches the average slope among the plurality of reference boundary lines; if so, that reference boundary line is determined as the target boundary line corresponding to the boundary line region. The boundary line region may be converted into a plurality of straight lines by Hough transform; of course, the conversion may also be performed in other manners, which is not limited in this embodiment.
In this embodiment, the edge contour in each boundary line region is relatively thick. For each boundary line region, the region may be converted into a plurality of straight lines by Hough transform; these straight lines have similar slopes, so an average slope is obtained, and the average slope is then compared with the slope of each reference boundary line to determine whether there is a reference boundary line whose slope matches the average slope among the plurality of reference boundary lines, that is, the reference boundary line with the most similar slope is found from the plurality of reference boundary lines as the target boundary line corresponding to the boundary line region.
Since the difference between the slope of the determined target boundary line and the average slope should not be too large, a comparison threshold is set when comparing the average slope with the slope of each reference boundary line; when the absolute value of the difference between the slope of a certain reference boundary line and the average slope is smaller than the comparison threshold, that reference boundary line is judged to match the average slope and is determined as the target boundary line corresponding to the boundary line region.
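A sketch of this matching rule, assuming OpenCV's probabilistic Hough transform and a 5-degree comparison threshold (the Hough parameters, the threshold value and the function names are assumptions):

```python
import math
import cv2
import numpy as np

def seg_inclination_deg(x1, y1, x2, y2):
    # Map to [0, 180) so the direction along the segment does not matter.
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def angle_diff_deg(a, b):
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def match_target_boundary_line(region_mask, reference_lines, compare_threshold_deg=5.0):
    """Pick the reference boundary line whose inclination best matches the boundary line region.

    region_mask: binary mask of one boundary line region; reference_lines: list of ((x1, y1), (x2, y2)).
    Returns None when no reference line matches within the comparison threshold.
    """
    segs = cv2.HoughLinesP(region_mask, 1, np.pi / 180, threshold=50,
                           minLineLength=20, maxLineGap=5)
    if segs is None:
        return None
    # Average inclination of the straight lines obtained from the region
    # (simple mean; adequate when the segments do not straddle the 0/180 wrap).
    angles = [seg_inclination_deg(x1, y1, x2, y2) for x1, y1, x2, y2 in segs[:, 0]]
    avg = sum(angles) / len(angles)
    best, best_diff = None, compare_threshold_deg
    for (x1, y1), (x2, y2) in reference_lines:
        diff = angle_diff_deg(seg_inclination_deg(x1, y1, x2, y2), avg)
        if diff < best_diff:                     # closest slope within the threshold
            best, best_diff = ((x1, y1), (x2, y2)), diff
    return best
```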
Further, for each boundary line region, if it is determined that no reference boundary line whose slope matches the average slope exists among the plurality of reference boundary lines, the following processing is performed: for each straight line obtained by converting the boundary line region, the line matrix formed by that straight line is compared with the boundary matrix, and the number of pixel points on the straight line that belong to the boundary matrix is counted as the score of the straight line; the straight line with the best score is determined as the target boundary line corresponding to the boundary line region. If several straight lines share the best score, the first one according to the sorting algorithm is used as the target boundary line. The line matrix is determined as follows: the straight line is redrawn, the position information of the pixel points of the redrawn line is mapped onto the whole image matrix, the values at the positions of the pixel points of the line in the image matrix are set to the first value and the values at the other positions are set to the second value, thereby forming the line matrix. The line matrix is formed in a similar manner to the boundary matrix and is not described in detail here.
That is, if the target boundary line corresponding to a certain boundary line region cannot be found among the reference boundary lines, corresponding line matrices are formed for the plurality of straight lines obtained by the Hough transform according to the matrix forming manner of step SB222 and step SB223, and the straight line whose pixel points fall into the boundary matrix with the best score is judged to be the target boundary line corresponding to that boundary line region. For the manner of comparing the line matrix formed by a straight line with the boundary matrix to calculate the score of the straight line, reference may be made to the related description in step SB223, which is not repeated here.
After the plurality of target boundary lines are determined in step SB25, these target boundary lines constitute the edges of the object in the image, since each target boundary line corresponds to one boundary line region of the object in the image. As shown in the image of fig. 2, the edges of the object in the image are constituted by the four longer lines in fig. 2, namely the target boundary lines A1, A2, A3 and A4; as in the image shown in fig. 3, the edges of the object in the image are constituted by the four longer lines in fig. 3, i.e., the target boundary lines B1, B2, B3 and B4.
Further, in step SB26, after edges of an object in an image are obtained, intersections of the edges are configured as the object vertices. The subsequent steps refer to steps S2 to S4, which are not repeated here.
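For illustration, a small sketch of configuring the object vertices as intersections of adjacent target boundary lines (each boundary line treated as an infinite line through its two stored endpoints; the function name is an assumption):

```python
def line_intersection(line_a, line_b):
    """Intersection point of the two infinite lines through the given segments, or None if parallel."""
    (x1, y1), (x2, y2) = line_a
    (x3, y3), (x4, y4) = line_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None
    det_a = x1 * y2 - y1 * x2
    det_b = x3 * y4 - y3 * x4
    px = (det_a * (x3 - x4) - (x1 - x2) * det_b) / denom
    py = (det_a * (y3 - y4) - (y1 - y2) * det_b) / denom
    return (px, py)

# For a rectangular object with target boundary lines A1..A4 (as in fig. 2), the four object
# vertices would be the intersections of each pair of adjacent boundary lines, e.g.
# vertices = [line_intersection(a, b) for a, b in [(A1, A2), (A2, A3), (A3, A4), (A4, A1)]]
```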
The present embodiment also provides a readable storage medium having stored thereon a program which, when executed, implements the object size recognition method as described above. Further, the present embodiment also provides an object size recognition system, which includes a processor and a memory, where the memory stores a program, and when the program is executed by the processor, the object size recognition method as described above is implemented.
In summary, in the object size recognition method, the readable storage medium and the object size recognition system provided by the present invention, the object size recognition method includes: acquiring at least two images of an object from different viewing angles by shooting; respectively acquiring the two-dimensional position information of a plurality of object vertices in each image; establishing a three-dimensional space coordinate system from the at least two images according to a feature point matching method, and determining the spatial position of the camera; and selecting any one image, and obtaining the three-dimensional spatial position information of the plurality of object vertices based on the calibrated parameter information of the camera and the spatial position of the camera, thereby obtaining the size of the object. With this configuration, at least two images of an object from different viewing angles are obtained by shooting, and the size of the object can be obtained in combination with the calibrated parameter information of the camera; the operation steps are simple, and the problem that the size of an object in space cannot be measured in the prior art is solved.
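To make the summarised pipeline concrete, the following condensed sketch (Python/OpenCV, two grayscale views and a calibrated intrinsic matrix K assumed; illustrative only and not the patented implementation — in particular, the translation recovered from the essential matrix is only defined up to scale, and recovering metric size additionally relies on the calibrated parameter information described above) shows feature point matching, camera pose recovery and triangulation:

```python
import cv2
import numpy as np

def pose_and_3d_points(img1_gray, img2_gray, K):
    """Match feature points between two views, recover the relative camera pose and triangulate."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1_gray, None)
    k2, d2 = orb.detectAndCompute(img2_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    # First camera placed at the origin of the three-dimensional space coordinate system.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    pts3d = (pts4d[:3] / pts4d[3]).T             # homogeneous -> Euclidean, up to scale
    return R, t, pts3d
```

Once the object vertices are expressed in this coordinate system, the object size follows from the distances between adjacent vertices.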
The above description is only illustrative of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention, and any alterations and modifications made by those skilled in the art based on the above disclosure shall fall within the scope of the appended claims.

Claims (9)

1. A method of identifying object dimensions, comprising:
acquiring at least two images of an object from different viewing angles through shooting;

respectively acquiring two-dimensional position information of a plurality of object vertices in each image;

establishing a three-dimensional space coordinate system from at least two of the images according to a feature point matching method, and determining the spatial position of a camera; and

selecting any one image, and obtaining three-dimensional spatial position information of the plurality of object vertices based on the calibrated parameter information of the camera and the spatial position of the camera, thereby obtaining the size of the object;
the step of obtaining a plurality of object vertices in the image comprises:
processing the image to obtain a line drawing of a gray scale contour in the image;
merging similar lines in the line drawings to obtain a plurality of reference boundary lines;
processing the image through a trained boundary line area recognition model to obtain a plurality of boundary line areas of an object in the image;
for each boundary line region, determining a target boundary line corresponding to the boundary line region from a plurality of reference boundary lines;
determining edges of objects in the image according to the determined target boundary lines;
and configuring an intersection point of edges of an object in the image as the object vertex.
2. The object size recognition method according to claim 1, wherein the step of acquiring two-dimensional position information of a plurality of object vertices in the image comprises:
inputting the images into a trained vertex recognition model to obtain the relative positions of each object vertex and the corresponding image vertex;
determining the actual position of each object vertex in the image according to the relative positions of each object vertex and the corresponding image vertex;
and according to the actual position of each object vertex in the image, taking a reference point of the image as the coordinate origin of a two-dimensional image coordinate system, and obtaining the two-dimensional position information of each object vertex in the two-dimensional image coordinate system.
3. The object size recognition method according to claim 2, wherein the step of determining the actual position of each of the object vertices in the image based on the relative positions of each of the object vertices and the image vertices corresponding thereto comprises:
determining the reference position of each object vertex in the image according to the relative position of each object vertex and the corresponding image vertex;
for each object vertex, performing corner detection in a preset area where the reference position of the object vertex is located;
and determining the actual position of each object vertex in the image according to the corner detection result.
4. The object size recognition method according to claim 3, wherein the preset area where the reference position of the object vertex is located is a circular area with a pixel point at the reference position of the object vertex as a center and a first preset pixel as a radius;
for each object vertex, performing corner detection in a preset area where a reference position of the object vertex is located, including:
and carrying out corner detection on pixel points in the circular area corresponding to each object vertex, wherein in the corner detection process, all the pixel points with the characteristic value change amplitude larger than a preset threshold value are used as candidate corner points, and the target corner points corresponding to each object vertex are determined from the candidate corner points.
5. The object size recognition method according to claim 4, wherein the determining the actual position of each of the object vertices in the image based on the corner detection result comprises:
for each object vertex, if the corner detection result of the object vertex contains a corner, determining the position of the corner as the actual position of the object vertex in the image, and if the corner detection result of the object vertex does not contain a corner, determining the reference position of the object vertex in the image as the actual position of the object vertex in the image.
6. The method of claim 1, wherein the step of merging similar lines in the line drawing to obtain a plurality of reference boundary lines comprises:
combining similar lines in the line drawing to obtain a plurality of initial combined lines, and determining a boundary matrix according to the initial combined lines;
merging similar lines in the plurality of initial merging lines to obtain a target line, and taking the initial merging lines which are not merged as target lines;
and determining a plurality of reference boundary lines from a plurality of target lines according to the boundary matrix.
7. The object size recognition method according to claim 1, wherein the step of establishing a three-dimensional space coordinate system from at least two of the images according to a feature point matching method, and determining the spatial position of the camera comprises:
extracting two-dimensional feature points matched with each other in at least two of the images;
obtaining a constraint relation of at least two images according to the two-dimensional feature points matched with each other;
and based on the constraint relation, obtaining the three-dimensional space position of the two-dimensional feature points in each image, and further obtaining the space position of the camera corresponding to each image.
8. A readable storage medium having a program stored thereon, characterized in that the program, when executed, implements the object size recognition method according to any one of claims 1 to 7.
9. An object size recognition system comprising a processor and a memory, the memory having stored thereon a program which, when executed by the processor, implements the object size recognition method according to any one of claims 1 to 7.
CN202110975318.1A 2021-08-24 2021-08-24 Object size recognition method, readable storage medium, and object size recognition system Active CN113688846B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110975318.1A CN113688846B (en) 2021-08-24 2021-08-24 Object size recognition method, readable storage medium, and object size recognition system
PCT/CN2022/106607 WO2023024766A1 (en) 2021-08-24 2022-07-20 Object size identification method, readable storage medium and object size identification system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110975318.1A CN113688846B (en) 2021-08-24 2021-08-24 Object size recognition method, readable storage medium, and object size recognition system

Publications (2)

Publication Number Publication Date
CN113688846A CN113688846A (en) 2021-11-23
CN113688846B true CN113688846B (en) 2023-11-03

Family

ID=78581917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110975318.1A Active CN113688846B (en) 2021-08-24 2021-08-24 Object size recognition method, readable storage medium, and object size recognition system

Country Status (2)

Country Link
CN (1) CN113688846B (en)
WO (1) WO2023024766A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688846B (en) * 2021-08-24 2023-11-03 成都睿琪科技有限责任公司 Object size recognition method, readable storage medium, and object size recognition system
CN117315664B (en) * 2023-09-18 2024-04-02 山东博昂信息科技有限公司 Scrap steel bucket number identification method based on image sequence

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006075528A1 (en) * 2005-01-13 2006-07-20 National University Corporation NARA Institute of Science and Technology Three-dimensional object measuring device
JP2006300656A (en) * 2005-04-19 2006-11-02 Nippon Telegr & Teleph Corp <Ntt> Image measuring technique, device, program, and recording medium
CN104236478A (en) * 2014-09-19 2014-12-24 山东交通学院 Automatic vehicle overall size measuring system and method based on vision
CN110533774A (en) * 2019-09-09 2019-12-03 江苏海洋大学 A kind of method for reconstructing three-dimensional model based on smart phone
CN112683169A (en) * 2020-12-17 2021-04-20 深圳依时货拉拉科技有限公司 Object size measuring method, device, equipment and storage medium
CN112991369A (en) * 2021-03-25 2021-06-18 湖北工业大学 Method for detecting overall dimension of running vehicle based on binocular vision
CN113177977A (en) * 2021-04-09 2021-07-27 上海工程技术大学 Non-contact three-dimensional human body size measuring method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8483446B2 (en) * 2010-06-17 2013-07-09 Mississippi State University Method and system for estimating antler, horn, and pronghorn size of an animal
CN109214980B (en) * 2017-07-04 2023-06-23 阿波罗智能技术(北京)有限公司 Three-dimensional attitude estimation method, three-dimensional attitude estimation device, three-dimensional attitude estimation equipment and computer storage medium
CN110455215A (en) * 2019-08-13 2019-11-15 利生活(上海)智能科技有限公司 It is a kind of that object is obtained in the method and device of physical three-dimensional bulk by image
CN113688846B (en) * 2021-08-24 2023-11-03 成都睿琪科技有限责任公司 Object size recognition method, readable storage medium, and object size recognition system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006075528A1 (en) * 2005-01-13 2006-07-20 National University Corporation NARA Institute of Science and Technology Three-dimensional object measuring device
JP2006300656A (en) * 2005-04-19 2006-11-02 Nippon Telegr & Teleph Corp <Ntt> Image measuring technique, device, program, and recording medium
CN104236478A (en) * 2014-09-19 2014-12-24 山东交通学院 Automatic vehicle overall size measuring system and method based on vision
CN110533774A (en) * 2019-09-09 2019-12-03 江苏海洋大学 A kind of method for reconstructing three-dimensional model based on smart phone
CN112683169A (en) * 2020-12-17 2021-04-20 深圳依时货拉拉科技有限公司 Object size measuring method, device, equipment and storage medium
CN112991369A (en) * 2021-03-25 2021-06-18 湖北工业大学 Method for detecting overall dimension of running vehicle based on binocular vision
CN113177977A (en) * 2021-04-09 2021-07-27 上海工程技术大学 Non-contact three-dimensional human body size measuring method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Measurement of medicine box size based on similar triangles; Yu Jie; Xu Licheng; Li Wenshu; Electronic Measurement Technology (No. 23); full text *

Also Published As

Publication number Publication date
WO2023024766A1 (en) 2023-03-02
CN113688846A (en) 2021-11-23

Similar Documents

Publication Publication Date Title
CN109655019B (en) Cargo volume measurement method based on deep learning and three-dimensional reconstruction
Kong et al. A generalized Laplacian of Gaussian filter for blob detection and its applications
JP5830546B2 (en) Determination of model parameters based on model transformation of objects
Goesele et al. Multi-view stereo for community photo collections
CN110866871A (en) Text image correction method and device, computer equipment and storage medium
CN112348815A (en) Image processing method, image processing apparatus, and non-transitory storage medium
US8265393B2 (en) Photo-document segmentation method and system
US11443133B2 (en) Computer vision system for industrial equipment gauge digitization and alarms
CN112686812B (en) Bank card inclination correction detection method and device, readable storage medium and terminal
US8811751B1 (en) Method and system for correcting projective distortions with elimination steps on multiple levels
CN113688846B (en) Object size recognition method, readable storage medium, and object size recognition system
CN106485651B (en) The image matching method of fast robust Scale invariant
US20110194772A1 (en) Efficient scale-space extraction and description of interest points
Urban et al. Finding a good feature detector-descriptor combination for the 2D keypoint-based registration of TLS point clouds
CN107766864B (en) Method and device for extracting features and method and device for object recognition
CN111046843A (en) Monocular distance measurement method under intelligent driving environment
CN111553422A (en) Automatic identification and recovery method and system for surgical instruments
CN113033558A (en) Text detection method and device for natural scene and storage medium
Cicconet et al. Mirror symmetry histograms for capturing geometric properties in images
CN112215925A (en) Self-adaptive follow-up tracking multi-camera video splicing method for coal mining machine
CN113609984A (en) Pointer instrument reading identification method and device and electronic equipment
Recky et al. Façade segmentation in a multi-view scenario
Cui et al. Global propagation of affine invariant features for robust matching
CN112686872B (en) Wood counting method based on deep learning
CN110910497B (en) Method and system for realizing augmented reality map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant