CN107392958B - Method and device for determining object volume based on binocular stereo camera - Google Patents

Info

Publication number
CN107392958B
CN107392958B · Application CN201610324103.2A
Authority
CN
China
Prior art keywords
target
image
depth
target object
determining
Prior art date
Legal status
Active
Application number
CN201610324103.2A
Other languages
Chinese (zh)
Other versions
CN107392958A (en)
Inventor
张文聪 (Zhang Wencong)
贾永华 (Jia Yonghua)
Current Assignee
Hangzhou Hikrobot Co Ltd
Original Assignee
Hangzhou Hikrobot Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikrobot Technology Co Ltd filed Critical Hangzhou Hikrobot Technology Co Ltd
Priority to CN201610324103.2A priority Critical patent/CN107392958B/en
Publication of CN107392958A publication Critical patent/CN107392958A/en
Application granted granted Critical
Publication of CN107392958B publication Critical patent/CN107392958B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10004 - Still image; Photographic image
    • G06T 2207/10012 - Stereo images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20021 - Dividing image into blocks, subimages or windows
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20212 - Image combination
    • G06T 2207/20224 - Image subtraction

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a method and a device for determining the volume of an object based on a binocular stereo camera. The method comprises the following steps: acquiring binocular images which are acquired by a binocular stereo camera and contain a target object; generating a target depth image containing the target object according to the binocular images; segmenting out a target image area corresponding to the target object based on the depth data in the target depth image; determining a target circumscribed rectangle which corresponds to the target image area and meets a predetermined condition; and determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image. The scheme therefore achieves high precision, high efficiency and low economic cost when determining the volume of an object.

Description

Method and device for determining object volume based on binocular stereo camera
Technical Field
The invention relates to the technical field of machine vision, in particular to a method and a device for determining the volume of an object based on a binocular stereo camera.
Background
Volume data, the most basic attribute information of an object, is widely used in fields such as production and logistics, in particular in volume-based logistics billing and automatic loading of objects. The objects referred to herein are relatively standard cuboid objects.
In the prior art, common volume determination methods include laser-based determination and manual determination with a scale. Although the laser-based method has high precision, it requires expensive laser measurement equipment, so its cost performance is low and it is difficult for users to accept widely; the manual scale method requires manual cooperation and is affected by the operator's handling and state, so neither accuracy nor efficiency can be guaranteed.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for determining the volume of an object based on a binocular stereo camera, so as to achieve the purposes of high precision, high efficiency and low economic cost when determining the volume of the object. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a method for determining an object volume based on a binocular stereo camera, including:
acquiring binocular images which are acquired by a binocular stereo camera and contain a target object;
generating a target depth image containing the target object according to the binocular image;
segmenting to obtain a target image area corresponding to the target object based on the depth data in the target depth image;
determining a target circumscribed rectangle corresponding to the target image area and meeting a preset condition;
determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image.
Optionally, the generating a target depth image including the target object according to the binocular image includes:
performing stereo matching on the binocular images to obtain a disparity map;
and calculating to obtain a target depth image containing the target object by utilizing a triangulation principle based on the disparity map.
Optionally, the generating a target depth image including the target object according to the binocular image includes:
performing distortion correction on the binocular images;
performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
and calculating to obtain a target depth image containing the target object by utilizing a triangulation principle based on the disparity map.
Optionally, the segmenting to obtain the target image region corresponding to the target object based on the depth data in the target depth image includes:
and based on the depth data in the target depth image, segmenting out a target image area corresponding to the target object by using a depth-map frame difference method.
Optionally, the obtaining, by segmenting based on the depth data in the target depth image and using a depth map frame difference method, a target image region corresponding to the target object includes:
subtracting the depth data of each pixel point in the target depth image from the depth data of a corresponding pixel point in a preset background depth image, wherein the preset background depth image is an image which does not contain the target object and is specific to the background environment where the target object is located;
forming a frame difference image corresponding to the target depth image based on the subtraction result corresponding to each pixel point;
carrying out binarization processing on the frame difference image;
and dividing the frame difference image after the binarization processing to obtain a target image area corresponding to the target object.
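The four steps above amount to background subtraction on depth maps. A minimal sketch, assuming a hypothetical closeness threshold in the same units as the depth data (the function name and parameters are illustrative, not from the patent):

```python
import numpy as np

def segment_by_frame_difference(target_depth, background_depth, thresh):
    """Depth-map frame difference: subtract the target depth from the
    predetermined background depth, then binarize. Pixels that are closer
    to the camera than the background by more than `thresh` become 1."""
    diff = background_depth.astype(np.float64) - target_depth.astype(np.float64)
    return (diff > thresh).astype(np.uint8)
```

The resulting binary mask is then segmented into connected regions to obtain the target image area.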
Optionally, the determining a target circumscribed rectangle corresponding to the target image area and meeting a predetermined condition includes:
and determining, through a connected-region analysis algorithm or an edge-detection fitting algorithm, a target circumscribed rectangle which corresponds to the target image area and meets a predetermined condition.
Optionally, the determining a target circumscribed rectangle corresponding to the target image area and meeting a predetermined condition includes:
determining a target circumscribed rectangle with the minimum area value corresponding to the target image area;
alternatively,
and determining a target circumscribed rectangle with the smallest difference between the area value corresponding to the target image area and a preset area threshold value.
Optionally, the determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image includes:
extracting image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
projecting the extracted image coordinates of each vertex into the target depth image to form a reference point located in the target depth image;
calculating three-dimensional coordinates of each reference point in a world coordinate system corresponding to the camera by using a perspective projection principle of camera imaging;
and obtaining the volume of the target object by using the three-dimensional coordinates of the reference points and the depth data of the target depth image.
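The last three steps can be sketched under an ideal pinhole model. The intrinsics (fx, fy, cx, cy), the top-down viewing geometry, and the assumption that the object's height equals the ground-plane depth minus the top-face depth are illustrative simplifications, not the patent's exact procedure:

```python
import math

def back_project(u, v, Z, fx, fy, cx, cy):
    """Invert the perspective projection: image point (u, v) with depth Z
    maps to camera-frame coordinates (X, Y, Z)."""
    return ((u - cx) * Z / fx, (v - cy) * Z / fy, Z)

def box_volume(corners_uv, top_depth, ground_depth, fx, fy, cx, cy):
    """Cuboid volume from the four image-space vertices of the top face's
    circumscribed rectangle plus depth data from the depth image."""
    pts = [back_project(u, v, top_depth, fx, fy, cx, cy) for u, v in corners_uv]
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    length = dist(pts[0], pts[1])   # one edge of the top face
    width = dist(pts[1], pts[2])    # the adjacent edge
    height = ground_depth - top_depth
    return length * width * height
```

For example, with a 500-pixel focal length, a 200 x 100 pixel rectangle seen at 1.0 m depth over a ground plane at 1.5 m corresponds to a 0.4 m x 0.2 m x 0.5 m box.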
In a second aspect, an embodiment of the present invention provides an apparatus for determining an object volume based on a binocular stereo camera, including:
the binocular image acquisition module is used for acquiring binocular images which are acquired by the binocular stereo camera and contain the target object;
the depth image generation module is used for generating a target depth image containing the target object according to the binocular image;
the image area segmentation module is used for segmenting to obtain a target image area corresponding to the target object based on the depth data in the target depth image;
the external rectangle determining module is used for determining a target external rectangle which corresponds to the target image area and meets a preset condition;
and the volume determining module is used for determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image.
Optionally, the depth image generating module includes:
the first disparity map determining unit is used for performing stereo matching on the binocular images to obtain disparity maps;
and the first depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
Optionally, the depth image generating module includes:
the correction processing unit is used for performing distortion correction on the binocular images;
the second disparity map determining unit is used for performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
and the second depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
Optionally, the image region segmentation module includes:
and the image area segmentation unit is used for segmenting to obtain a target image area corresponding to the target object by utilizing a depth image frame difference method based on the depth data in the target depth image.
Optionally, the image region segmentation unit includes:
a subtraction subunit, configured to subtract depth data of each pixel point in the target depth image from depth data of a corresponding pixel point in a predetermined background depth image, where the predetermined background depth image is an image that does not include the target object and is specific to a background environment where the target object is located;
a frame difference image forming subunit, configured to form a frame difference image corresponding to the target depth image based on a subtraction result corresponding to each pixel point;
a binarization processing subunit, configured to perform binarization processing on the frame difference image;
and the image segmentation subunit is used for segmenting the frame difference image subjected to the binarization processing to obtain a target image area corresponding to the target object.
Optionally, the circumscribed rectangle determining module includes:
and the first circumscribed rectangle determining unit is used for determining a target circumscribed rectangle which corresponds to the target image area and meets a preset condition through a connected area analysis algorithm or an edge detection fitting algorithm.
Optionally, the circumscribed rectangle determining module includes:
a second circumscribed rectangle determining unit, configured to determine a target circumscribed rectangle with a smallest area value corresponding to the target image region;
alternatively,
and the third circumscribed rectangle determining unit is used for determining the target circumscribed rectangle with the smallest difference between the area value corresponding to the target image area and the preset area threshold value.
Optionally, the volume determination module includes:
the image coordinate extraction unit is used for extracting the image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
a reference point forming unit, configured to project the extracted image coordinates of each vertex into the target depth image, and form a reference point located in the target depth image;
the three-dimensional coordinate calculation unit is used for calculating three-dimensional coordinates of each reference point in a world coordinate system corresponding to the camera by utilizing a perspective projection principle of camera imaging;
and the volume determining unit is used for obtaining the volume of the target object by utilizing the three-dimensional coordinates of the reference points and the depth data of the target depth image.
In the embodiment of the invention, after the binocular images which are acquired by the binocular stereo camera and contain the target object are obtained, a target depth image containing the target object is generated from the binocular images; a target image area corresponding to the target object is segmented out based on the depth data in the target depth image; a target circumscribed rectangle which corresponds to the target image area and meets a predetermined condition is determined; and the volume of the target object is determined based on the target circumscribed rectangle and the depth data in the target depth image. Compared with the laser-based determination method in the prior art, this scheme uses a binocular stereo camera and needs no laser measurement equipment, so its economic cost is lower; compared with the manual scale method in the prior art, it determines the volume automatically by a software program without manual cooperation, so it has higher precision and efficiency. The scheme therefore achieves high precision, high efficiency and low economic cost when determining the volume of an object.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for determining an object volume based on a binocular stereo camera according to an embodiment of the present invention;
fig. 2 is another flowchart of a method for determining an object volume based on a binocular stereo camera according to an embodiment of the present invention;
fig. 3 is another flowchart of a method for determining an object volume based on a binocular stereo camera according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an apparatus for determining an object volume based on a binocular stereo camera according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for determining an object volume based on a binocular stereo camera, so as to achieve the purpose of considering high precision, high efficiency and low economic cost when determining the object volume.
First, a method for determining an object volume based on a binocular stereo camera according to an embodiment of the present invention will be described below.
It should be noted that the execution subject of the method for determining the object volume based on a binocular stereo camera provided by the embodiment of the present invention may be an apparatus for determining the object volume based on a binocular stereo camera; for brevity, this apparatus is simply referred to as the apparatus for determining the object volume. In practical applications, the apparatus may be functional software arranged in the binocular stereo camera, or functional software arranged in a background server in communication with the binocular stereo camera; either is reasonable. In addition, the objects whose volume is determined in the embodiment of the present invention may be objects that are relatively standard cuboids, such as workpieces and parcels in production and logistics.
A binocular stereo camera is a device that simulates human eyes with two camera units shooting simultaneously; the captured content comprises views from two angles which, when displayed, are delivered separately to the viewer's two eyes by a 3D display device, giving the viewer a stereoscopic impression of the scene. The distance between the two lenses is called the stereo baseline (stereo base).
As shown in fig. 1, a method for determining a volume of an object based on a binocular stereo camera may include the steps of:
s101, acquiring binocular images which are acquired by a binocular stereo camera and contain a target object;
s102, generating a target depth image containing the target object according to the binocular image;
in the process of determining the volume of the target object, the apparatus for determining the volume of the object may first obtain binocular images including the target object, which are acquired by a binocular stereo camera, and generate a target depth image including the target object according to the binocular images, and then perform subsequent processing using the obtained target depth image including the target object, wherein the binocular images are two frames of images about the target object, which are acquired from different viewpoints, and each frame of image may include at least one target object, and one frame of target depth image may include at least one target object. It will be appreciated that, in the case where the means for determining the volume of the object is functional software located in a binocular stereo camera, the means for determining the volume of the object may directly obtain binocular images captured by the binocular stereo camera; for the case that the device for determining the object volume is functional software located in the backend server, the device for determining the object volume may obtain the binocular image acquired by the backend server from the binocular stereo camera, and the manner in which the backend server acquires the binocular image from the binocular stereo camera may be active acquisition or passive reception, which is reasonable.
It should be noted that, to ensure the binocular stereo camera can capture binocular images of the target object, the camera may be placed at a position from which such images can be acquired. Acquisition may be triggered, that is, binocular images are captured only when a target object to be measured appears in the scene. The trigger may be an external physical trigger based on a photoelectric signal: when an object whose volume needs to be determined passes through, it interrupts the photoelectric signal, which sends a trigger signal to the binocular stereo camera. Alternatively, an automatic trigger based on intelligent analysis uses a motion detection algorithm to determine whether a target object has appeared, and the camera captures the binocular images of the target object when it has.
Specifically, in an implementation manner, the generating a target depth image including the target object according to the binocular image may include:
carrying out stereo matching on the binocular image to obtain a disparity map;
based on the disparity map, a target depth image containing a target object is calculated by utilizing a triangulation principle.
The stereo matching of the binocular images can be implemented with existing matching algorithms, which fall into local matching algorithms and global matching algorithms. The basic idea of a local matching algorithm is: given a point A in the left image, select a sub-window of the local area around that pixel; then, within a search region of the right image, find the sub-window most similar to the left one according to a specific similarity criterion, such as the sum of absolute differences (SAD) or normalized cross-correlation (NCC); the corresponding pixel B thus found in the right image is the matching point of pixel A. A global matching algorithm instead converts the correspondence problem into the global optimization of an energy function, iteratively computes disparity values using the data of the whole image, and focuses on resolving matches in ambiguous regions; typical methods include the graph cut algorithm and dynamic programming. Further, based on the disparity map, the target depth image containing the target object is calculated using the triangulation principle, with the following formula:
Z = f · B / d
wherein Z is the depth value of a pixel point in the target depth image, f is the focal length of the lens, B is the baseline length of the binocular camera, and d is the disparity value corresponding to the pixel point.
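As an informal illustration of these two steps, the sketch below pairs a toy SAD matcher for a single rectified image row with the Z = f · B / d depth conversion. The names, window size, and camera parameters are hypothetical; a real system would run a full local or global matcher over the whole image.

```python
import numpy as np

def sad_match(left_row, right_row, x, win, max_disp):
    """Disparity of pixel x in a rectified row pair by minimizing the
    sum of absolute differences (SAD) over a (2*win+1)-pixel window."""
    patch = left_row[x - win : x + win + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - win - d < 0:          # candidate window would leave the image
            break
        cand = right_row[x - win - d : x + win + 1 - d]
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def disparity_to_depth(disparity, f, B):
    """Triangulation Z = f * B / d; zero-disparity pixels are left at 0."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    depth[valid] = (f * B) / disparity[valid]
    return depth
```

For instance, with a focal length of 700 pixels and a 0.12 m baseline, a disparity of 7 pixels corresponds to a depth of 12 m.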
In another implementation, the generating a target depth image including the target object according to the binocular image may include:
performing distortion correction on the binocular images;
performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
based on the disparity map, a target depth image containing a target object is calculated by utilizing a triangulation principle.
In this implementation, distortion correction is performed on the binocular images before stereo matching; the corrected images are then stereo-matched to obtain a disparity map, and the target depth image containing the target object is calculated from the disparity map using the triangulation principle. The binocular images may be corrected with Zhang Zhengyou's camera calibration method known in the art, but the correction is not limited to this method.
To facilitate understanding of the embodiments of the present invention, the notion of image depth is described as follows: a picture is formed of individual pixel points, and all the pixel points of different colors together form a complete image. A computer stores a picture in binary: if 1 bit is used to store a pixel, its value range is 0 or 1, and the picture is either black or white; if 4 bits are used, the value range of a pixel is 0 to 2 to the power of 4, minus 1; if 8 bits are used, the value range is 0 to 2 to the power of 8, minus 1; and so on. The number of bits a computer uses to store a single pixel point is called the depth of the image, and a so-called depth image is a picture whose pixel values represent scene depth.
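For instance, the value range an n-bit pixel can represent is 0 to 2^n - 1:

```python
def pixel_value_range(bits):
    """Value range of a pixel stored with the given bit depth."""
    return 0, 2 ** bits - 1
```

So a 1-bit pixel is 0 or 1, a 4-bit pixel spans 0 to 15, and an 8-bit pixel spans 0 to 255.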
It should be emphasized that the specific implementation of generating the target depth image including the target object according to the binocular image is given as an example only, and should not be construed as limiting the embodiments of the present invention.
S103, based on the depth data in the target depth image, a target image area corresponding to the target object is obtained through segmentation;
Because the volume of the target object needs to be determined, after the target depth image containing the target object is obtained, the target image area corresponding to the target object can be segmented out based on the depth data in the target depth image.
Specifically, in an implementation manner, the segmenting to obtain a target image region corresponding to the target object based on the depth data in the target depth image may include:
and based on the depth data in the target depth image, segmenting out a target image area corresponding to the target object by using a depth-map frame difference method.
It should be noted that the specific implementation of segmenting the target image region corresponding to the target object based on the depth data in the target depth image is merely an example, and should not be construed as a limitation to the embodiment of the present invention. In addition, for the sake of clear layout, a specific implementation manner of obtaining a target image region corresponding to the target object by segmenting based on the depth data in the target depth image and using a depth map frame difference method will be described later.
S104, determining a target circumscribed rectangle corresponding to the target image area and meeting a preset condition;
after the target image area corresponding to the target object is obtained through segmentation, in order to determine the volume of the target object, a target circumscribed rectangle corresponding to the target image area and meeting a predetermined condition may be determined, and then subsequent processing is performed by using the target circumscribed rectangle.
It can be understood that, in practical applications, the target circumscribed rectangle which corresponds to the target image area and meets the predetermined condition may be determined through a connected-region analysis algorithm or an edge-detection fitting algorithm, though it is certainly not limited to these two. Specifically, the basic principle of the connected-region analysis algorithm is: first, label the connected regions of the binary image; then compute the convex hull of each connected region; finally, compute the minimum-area circumscribed rectangle of the target object using the properties of the convex hull's minimum-area enclosing rectangle, namely that one edge of the rectangle coincides with one edge of the convex hull and that each of the rectangle's four edges must contain a vertex of the hull. The convex hull, a basic concept in computational geometry, is the smallest convex polygon containing all the points of the connected region. The basic principle of the edge-detection fitting algorithm is: fit straight lines directly to the edges of the target image area and compute its circumscribed rectangle from the edge line equations; common line-fitting methods in the prior art are mainly the Hough transform and least-squares fitting.
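As a simplified illustration of connected-region analysis, the sketch below labels 4-connected regions of a binary mask with a breadth-first search and returns the axis-aligned circumscribed rectangle of the largest region. The patent's minimum-area (rotated) rectangle would additionally require the convex hull and, for example, a rotating-calipers step, which are omitted here; all names are hypothetical.

```python
from collections import deque
import numpy as np

def largest_component_bbox(mask):
    """Label 4-connected regions in a binary mask and return the axis-aligned
    circumscribed rectangle (x0, y0, x1, y1) of the largest region."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    best_box, best_size = None, 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                q = deque([(sy, sx)])
                seen[sy, sx] = True
                pts = []
                while q:                      # flood-fill one connected region
                    y, x = q.popleft()
                    pts.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(pts) > best_size:
                    ys = [p[0] for p in pts]
                    xs = [p[1] for p in pts]
                    best_box = (min(xs), min(ys), max(xs), max(ys))
                    best_size = len(pts)
    return best_box
```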
In addition, it should be noted that the target image area may have a plurality of circumscribed rectangles; one circumscribed rectangle meeting the predetermined condition may be selected from them, and the subsequent volume determination is then performed with the selected rectangle. Accordingly, the determining of a target circumscribed rectangle which corresponds to the target image area and meets a predetermined condition may include: determining the circumscribed rectangle with the minimum area value corresponding to the target image area;
alternatively,
determining the circumscribed rectangle whose area value differs least from a predetermined area threshold.
The circumscribed rectangle with the smallest area value is the one that fits the edges of the target image area most tightly, so it can be used for the subsequent volume determination; likewise, the circumscribed rectangle whose area differs least from the predetermined area threshold has the smallest error relative to that reference standard, so it can also be used for the subsequent volume determination.
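Both selection rules reduce to a minimization over the candidate rectangles. A minimal sketch, with rectangles represented as hypothetical (x0, y0, x1, y1) tuples:

```python
def select_target_rect(rects, area_threshold=None):
    """Pick the circumscribed rectangle with the minimum area, or, when an
    area threshold is given, the one whose area differs least from it."""
    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])
    if area_threshold is None:
        return min(rects, key=area)
    return min(rects, key=lambda r: abs(area(r) - area_threshold))
```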
It should be noted that the specific implementation manner of determining the target circumscribed rectangle corresponding to the target image area and meeting the predetermined condition is only an example, and should not be construed as a limitation to the embodiment of the present invention.
S105, determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image.
After the target circumscribed rectangle is determined, the volume of the target object can be determined, through specific processing, based on the rectangle and the depth data in the target depth image.
It should be noted that there are various specific implementations for determining the volume of the target object based on the target bounding rectangle and the depth data in the target depth image; for clarity of the scheme and layout, one such implementation is described by way of example later in this document.
In the embodiment of the invention, after the binocular images containing the target object collected by the binocular stereo camera are obtained, a target depth image containing the target object is generated from the binocular images; a target image region corresponding to the target object is obtained by segmentation based on the depth data in the target depth image; a target circumscribed rectangle corresponding to the target image region and meeting a predetermined condition is determined; and the volume of the target object is determined based on the target circumscribed rectangle and the depth data in the target depth image. Compared with prior-art laser-based determination methods, this scheme uses a binocular stereo camera and needs no laser measurement equipment, so its economic cost is lower; compared with prior-art manual-scale methods, it determines the volume automatically in software without manual assistance, and therefore achieves higher precision and efficiency. The volume of an object can thus be determined with high precision, high efficiency, and low economic cost.
A specific implementation manner of obtaining a target image region corresponding to the target object by segmentation based on the depth data in the target depth image and using a depth map frame difference method is described in detail below.
As shown in fig. 2, the segmenting to obtain the target image region corresponding to the target object by using a depth map frame difference method based on the depth data in the target depth image (S103) may include:
S1031, subtracting the depth data of each pixel point in the target depth image from the depth data of the corresponding pixel point in a predetermined background depth image;
the predetermined background depth image is an image which does not include the target object and is specific to the background environment where the target object is located.
It is understood that the predetermined background depth image may be acquired in advance by a depth image acquisition device, which in practical applications may be a TOF (time-of-flight) camera of the prior art. Alternatively, the predetermined background depth image may be obtained in the same manner as the target depth image of the embodiment of the present invention, that is: binocular images of the background environment where the target object is located, not containing the target object, are collected by the binocular stereo camera, and the predetermined background depth image is generated from those binocular images.
S1032, forming a frame difference image corresponding to the target depth image based on the subtraction result corresponding to each pixel point;
S1033, carrying out binarization processing on the frame difference image;
S1034, segmenting the binarized frame difference image to obtain the target image region corresponding to the target object.
Subtracting the depth data of each pixel point in the target depth image from the depth data of the corresponding pixel point in the predetermined background depth image means: for each pixel point in the target depth image, the depth value of the corresponding pixel point in the predetermined background depth image is subtracted from the depth value of that pixel point. For example, subtracting pixel point 1 in the target depth image by its corresponding pixel point 2 in the predetermined background depth image means subtracting the depth value of pixel point 2 from the depth value of pixel point 1.
Here, assuming that the binary values used in the binarization processing are 0 and 1, the frame difference image is binarized as follows: the absolute value of each pixel value in the frame difference image is compared with a predetermined threshold; if the absolute value is greater than the threshold, the pixel value is set to 1, otherwise it is set to 0 (in principle the assignment may also be reversed, setting pixels above the threshold to 0 and the rest to 1). After this processing, the pixel values inside the target image region corresponding to the target object differ from those outside it, so the target image region can be obtained by segmenting the binarized frame difference image. The binary values may equally be 0 and 255; in that case the binarization is performed in the same manner as described above for 0 and 1 and is not described again here.
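The frame difference and binarization steps above can be sketched as follows; this is a minimal illustration, and the array sizes, depth values, and threshold are hypothetical:

```python
import numpy as np

def segment_target(depth, background, threshold):
    """Depth-map frame difference followed by binarization.

    depth, background: 2-D depth arrays of the same size. Pixels where
    |depth - background| exceeds the threshold are marked 1 (target
    image region); all other pixels are marked 0.
    """
    # Signed subtraction; cast up so unsigned depth formats do not wrap.
    diff = depth.astype(np.int32) - background.astype(np.int32)
    return (np.abs(diff) > threshold).astype(np.uint8)

background = np.full((4, 4), 2000, dtype=np.uint16)  # flat backdrop at 2000 mm
depth = background.copy()
depth[1:3, 1:3] = 1500                               # a box 500 mm nearer the camera
mask = segment_target(depth, background, threshold=100)
print(mask.sum())  # 4 — the four pixels occupied by the box
```

The resulting binary mask is what the circumscribed-rectangle step then operates on.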
A specific implementation of determining the volume of the target object based on the target circumscribed rectangle and the depth data in the target depth image is described below by way of example; this implementation is only an example and should not be construed as a limitation to the embodiments of the present invention.
As shown in fig. 3, the determining the volume of the target object based on the target bounding rectangle and the depth data in the target depth image (S105) may include:
S1051, extracting the image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
S1052, projecting the extracted image coordinates of each vertex into the target depth image to form reference points in the target depth image;
S1053, calculating the three-dimensional coordinates of each reference point in the world coordinate system corresponding to the camera by using the perspective projection principle of camera imaging;
and S1054, obtaining the volume of the target object by using the three-dimensional coordinates of the reference points and the depth data of the target depth image.
It can be understood that the frame difference image corresponds to a two-dimensional coordinate system, so the image coordinates of each vertex of the target circumscribed rectangle can be extracted from the binarized frame difference image. In addition, since the frame difference image is derived from the target depth image, the two images have the same specification and thus share the same two-dimensional coordinate system; the image coordinates of a reference point in the target depth image are therefore identical to the image coordinates of the corresponding vertex in the binarized frame difference image.
Calculating the three-dimensional coordinates of each reference point in the camera's world coordinate system by the perspective projection principle of camera imaging can be implemented with existing techniques and is not described in detail here.
The specific process of obtaining the volume of the target object from the three-dimensional coordinates of the reference points and the depth data of the target depth image may include: calculating the Euclidean distance between every pair of the 4 reference points, and taking the calculated distances other than the longest ones (the diagonals) as the length and width of the target object; subtracting the depth (Z) value corresponding to the target object from the depth value corresponding to the predetermined background depth image to obtain the height of the target object; and determining the product of the length, width and height so obtained as the volume of the target object. Here, the Z value corresponding to the target object is the depth value of the region enclosed by the 4 reference points, and the Z value corresponding to the predetermined background depth image is likewise a depth value.
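The pipeline of S1051–S1054 can be sketched as below under a pinhole camera model. The intrinsics (fx, fy, cx, cy), vertex coordinates, and depth values in the example are hypothetical, and a uniform top-face depth is assumed for the box:

```python
import numpy as np
from itertools import combinations

def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of image point (u, v) at depth z
    into 3-D camera coordinates (X, Y, Z)."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def box_volume(vertices_uv, z_obj, z_bg, fx, fy, cx, cy):
    """Volume from the 4 rectangle vertices (image coordinates), the
    depth of the object's top face (z_obj) and of the background (z_bg).

    Of the 6 pairwise vertex distances, the two longest are the
    diagonals; the remaining distances give the two side lengths.
    Height is the depth difference between background and object.
    """
    pts = [backproject(u, v, z_obj, fx, fy, cx, cy) for u, v in vertices_uv]
    dists = sorted(np.linalg.norm(a - b) for a, b in combinations(pts, 2))
    width, length = dists[0], dists[3]   # dists[4:] are the two diagonals
    height = z_bg - z_obj
    return length * width * height

# Hypothetical example: 500 px focal length, principal point (160, 120),
# box top face at 1500 mm, background plane at 2000 mm.
vertices = [(110, 70), (210, 70), (210, 170), (110, 170)]
vol = box_volume(vertices, z_obj=1500.0, z_bg=2000.0,
                 fx=500.0, fy=500.0, cx=160.0, cy=120.0)
print(vol)  # 45000000.0 (mm^3), i.e. a 300 x 300 x 500 mm box
```

In practice z_obj would be read from the target depth image within the region enclosed by the reference points (e.g. as a median), and z_bg from the predetermined background depth image.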
Corresponding to the above method embodiment, an embodiment of the present invention further provides an apparatus for determining an object volume based on a binocular stereo camera, as shown in fig. 4, the apparatus may include:
a binocular image obtaining module 410, configured to obtain binocular images including a target object, which are collected by a binocular stereo camera;
a depth image generating module 420, configured to generate a target depth image including the target object according to the binocular image;
an image region segmentation module 430, configured to segment, based on depth data in the target depth image, a target image region corresponding to the target object;
a circumscribed rectangle determining module 440, configured to determine a target circumscribed rectangle corresponding to the target image area and meeting a predetermined condition;
a volume determination module 450, configured to determine a volume of the target object based on the target bounding rectangle and the depth data in the target depth image.
In the embodiment of the invention, after the binocular images containing the target object collected by the binocular stereo camera are obtained, a target depth image containing the target object is generated from the binocular images; a target image region corresponding to the target object is obtained by segmentation based on the depth data in the target depth image; a target circumscribed rectangle corresponding to the target image region and meeting a predetermined condition is determined; and the volume of the target object is determined based on the target circumscribed rectangle and the depth data in the target depth image. Compared with prior-art laser-based determination methods, this scheme uses a binocular stereo camera and needs no laser measurement equipment, so its economic cost is lower; compared with prior-art manual-scale methods, it determines the volume automatically in software without manual assistance, and therefore achieves higher precision and efficiency. The volume of an object can thus be determined with high precision, high efficiency, and low economic cost.
In one implementation, the depth image generating module 420 may include:
the first disparity map determining unit is used for performing stereo matching on the binocular images to obtain disparity maps;
and the first depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
In another implementation manner, the depth image generating module 420 may include:
the correction processing unit is used for carrying out distortion correction processing on the binocular images;
the second disparity map determining unit is used for performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
and the second depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
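The triangulation used by the depth image generating units above converts disparity to depth via Z = f·B/d for a rectified stereo pair. A minimal sketch, with hypothetical calibration values (focal length in pixels and baseline in metres are assumptions, not values from the patent):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Depth map from a disparity map by triangulation: Z = f * B / d.

    disparity: disparity values in pixels for a rectified stereo pair;
    focal_px: focal length in pixels; baseline_m: camera baseline in metres.
    Invalid (zero or negative) disparities yield depth 0.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > eps
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Hypothetical calibration: f = 700 px, baseline = 0.12 m, so f*B = 84.
d = np.array([[0.0, 42.0],
              [84.0, 21.0]])
print(disparity_to_depth(d, 700.0, 0.12))
# [[0. 2.]
#  [1. 4.]]
```

Note the inverse relationship: nearer objects have larger disparity, which is why depth resolution degrades with distance.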
In one implementation, the image region segmentation module 430 may include:
and the image area segmentation unit is used for segmenting to obtain a target image area corresponding to the target object by utilizing a depth image frame difference method based on the depth data in the target depth image.
Further, the image region segmentation unit may include:
a subtraction subunit, configured to subtract depth data of each pixel point in the target depth image from depth data of a corresponding pixel point in a predetermined background depth image, where the predetermined background depth image is an image that does not include the target object and is specific to a background environment where the target object is located;
a frame difference image forming subunit, configured to form a frame difference image corresponding to the target depth image based on a subtraction result corresponding to each pixel point;
a binarization processing subunit, configured to perform binarization processing on the frame difference image;
and the image segmentation subunit is used for segmenting the frame difference image subjected to the binarization processing to obtain a target image area corresponding to the target object.
The circumscribed rectangle determining module 440 may include:
and the first circumscribed rectangle determining unit is used for determining a target circumscribed rectangle which corresponds to the target image area and meets a preset condition through a connected area analysis algorithm or an edge detection fitting algorithm.
The circumscribed rectangle determining module 440 may include:
a second circumscribed rectangle determining unit, configured to determine a target circumscribed rectangle with a smallest area value corresponding to the target image region;
alternatively,
and the third circumscribed rectangle determining unit is used for determining the target circumscribed rectangle with the smallest difference between the area value corresponding to the target image area and the preset area threshold value.
Wherein the volume determination module 450 may include:
the image coordinate extraction unit is used for extracting the image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
a reference point forming unit, configured to project the extracted image coordinates of each vertex into the target depth image, and form a reference point located in the target depth image;
the three-dimensional coordinate calculation unit is used for calculating three-dimensional coordinates of each reference point in a world coordinate system corresponding to the camera by utilizing a perspective projection principle of camera imaging;
and the volume determining unit is used for obtaining the volume of the target object by utilizing the three-dimensional coordinates of the reference points and the depth data of the target depth image.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (14)

1. A method for determining a volume of an object based on a binocular stereo camera, comprising:
acquiring binocular images which are acquired by a binocular stereo camera and contain a target object;
generating a target depth image containing the target object according to the binocular image;
dividing to obtain a target image area corresponding to the target object based on the depth data in the target depth image;
determining a target circumscribed rectangle corresponding to the target image area and meeting a preset condition;
determining a volume of the target object based on the target bounding rectangle and depth data in the target depth image;
the determining the volume of the target object based on the target bounding rectangle and the depth data in the target depth image comprises:
extracting image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
projecting the extracted image coordinates of each vertex into the target depth image to form a reference point located in the target depth image;
calculating three-dimensional coordinates of each reference point in a world coordinate system corresponding to the camera by using a perspective projection principle of camera imaging;
obtaining the volume of the target object by using the three-dimensional coordinates of each reference point and the depth data of the target depth image;
the obtaining the volume of the target object by using the three-dimensional coordinates of the reference points and the depth data of the target depth image includes:
calculating Euclidean distances between every two reference points, and taking distance values except the longest distance in the calculated Euclidean distances as the length and width of the target object;
subtracting the depth value corresponding to the target object from the depth value corresponding to the predetermined background depth image to obtain the height of the target object, wherein the depth value corresponding to the target object is the depth value corresponding to the area corresponding to each reference point, and the predetermined background depth image is an image which does not contain the target object and is specific to the background environment where the target object is located;
determining the product of the determined length, width and height of the target object as the volume of the target object.
2. The method of claim 1, wherein generating a target depth image containing the target object from the binocular images comprises:
performing stereo matching on the binocular images to obtain a disparity map;
and calculating to obtain a target depth image containing the target object by utilizing a triangulation principle based on the disparity map.
3. The method of claim 1, wherein generating a target depth image containing the target object from the binocular images comprises:
carrying out distortion correction processing on the binocular images;
performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
and calculating to obtain a target depth image containing the target object by utilizing a triangulation principle based on the disparity map.
4. The method according to claim 1, wherein the segmenting a target image region corresponding to the target object based on the depth data in the target depth image comprises:
and based on the depth data in the target depth image, segmenting to obtain a target image area corresponding to the target object by using a depth map frame difference method.
5. The method according to claim 4, wherein the segmenting the target image region corresponding to the target object by using a depth map frame difference method based on the depth data in the target depth image comprises:
subtracting the depth data of each pixel point in the target depth image from the depth data of a corresponding pixel point in a preset background depth image, wherein the preset background depth image is an image which does not contain the target object and is specific to the background environment where the target object is located;
forming a frame difference image corresponding to the target depth image based on the subtraction result corresponding to each pixel point;
carrying out binarization processing on the frame difference image;
and dividing the frame difference image after the binarization processing to obtain a target image area corresponding to the target object.
6. The method according to claim 1, wherein the determining the target circumscribed rectangle corresponding to the target image area and meeting the predetermined condition comprises:
and determining a target external rectangle which corresponds to the target image area and meets a preset condition through a connected area analysis algorithm or an edge detection fitting algorithm.
7. The method according to claim 1, wherein the determining the target circumscribed rectangle corresponding to the target image area and meeting the predetermined condition comprises:
determining a target circumscribed rectangle with the minimum area value corresponding to the target image area;
alternatively,
and determining a target circumscribed rectangle with the smallest difference between the area value corresponding to the target image area and a preset area threshold value.
8. An apparatus for determining a volume of an object based on a binocular stereo camera, comprising:
the binocular image acquisition module is used for acquiring binocular images which are acquired by the binocular stereo camera and contain the target object;
the depth image generation module is used for generating a target depth image containing the target object according to the binocular image;
the image area segmentation module is used for segmenting to obtain a target image area corresponding to the target object based on the depth data in the target depth image;
the external rectangle determining module is used for determining a target external rectangle which corresponds to the target image area and meets a preset condition;
a volume determination module for determining a volume of the target object based on the target bounding rectangle and the depth data in the target depth image;
the volume determination module comprises:
the image coordinate extraction unit is used for extracting the image coordinates of each vertex of the target circumscribed rectangle in the frame difference image after binarization processing;
a reference point forming unit, configured to project the extracted image coordinates of each vertex into the target depth image, and form a reference point located in the target depth image;
the three-dimensional coordinate calculation unit is used for calculating three-dimensional coordinates of each reference point in a world coordinate system corresponding to the camera by utilizing a perspective projection principle of camera imaging;
the volume determining unit is used for obtaining the volume of the target object by utilizing the three-dimensional coordinates of the reference points and the depth data of the target depth image;
the obtaining the volume of the target object by using the three-dimensional coordinates of the reference points and the depth data of the target depth image includes:
calculating Euclidean distances between every two reference points, and taking distance values except the longest distance in the calculated Euclidean distances as the length and width of the target object;
subtracting the depth value corresponding to the target object from the depth value corresponding to the predetermined background depth image to obtain the height of the target object, wherein the depth value corresponding to the target object is the depth value corresponding to the area corresponding to each reference point, and the predetermined background depth image is an image which does not contain the target object and is specific to the background environment where the target object is located;
determining the product of the determined length, width and height of the target object as the volume of the target object.
9. The apparatus of claim 8, wherein the depth image generation module comprises:
the first disparity map determining unit is used for performing stereo matching on the binocular images to obtain disparity maps;
and the first depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
10. The apparatus of claim 8, wherein the depth image generation module comprises:
the correction processing unit is used for carrying out distortion correction processing on the binocular images;
the second disparity map determining unit is used for performing stereo matching on the distortion-corrected binocular images to obtain a disparity map;
and the second depth image generating unit is used for calculating a target depth image containing a target object by utilizing a triangulation principle based on the disparity map.
11. The apparatus of claim 8, wherein the image region segmentation module comprises:
and the image area segmentation unit is used for segmenting to obtain a target image area corresponding to the target object by utilizing a depth image frame difference method based on the depth data in the target depth image.
12. The apparatus of claim 11, wherein the image region segmentation unit comprises:
a subtraction subunit, configured to subtract depth data of each pixel point in the target depth image from depth data of a corresponding pixel point in a predetermined background depth image, where the predetermined background depth image is an image that does not include the target object and is specific to a background environment where the target object is located;
a frame difference image forming subunit, configured to form a frame difference image corresponding to the target depth image based on a subtraction result corresponding to each pixel point;
a binarization processing subunit, configured to perform binarization processing on the frame difference image;
and the image segmentation subunit is used for segmenting the frame difference image subjected to the binarization processing to obtain a target image area corresponding to the target object.
13. The apparatus of claim 8, wherein the circumscribed rectangle determining module comprises:
and the first circumscribed rectangle determining unit is used for determining a target circumscribed rectangle which corresponds to the target image area and meets a preset condition through a connected area analysis algorithm or an edge detection fitting algorithm.
14. The apparatus of claim 8, wherein the circumscribed rectangle determining module comprises:
a second circumscribed rectangle determining unit, configured to determine a target circumscribed rectangle with a smallest area value corresponding to the target image region;
alternatively,
and the third circumscribed rectangle determining unit is used for determining the target circumscribed rectangle with the smallest difference between the area value corresponding to the target image area and the preset area threshold value.
CN201610324103.2A 2016-05-16 2016-05-16 Method and device for determining object volume based on binocular stereo camera Active CN107392958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610324103.2A CN107392958B (en) 2016-05-16 2016-05-16 Method and device for determining object volume based on binocular stereo camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610324103.2A CN107392958B (en) 2016-05-16 2016-05-16 Method and device for determining object volume based on binocular stereo camera

Publications (2)

Publication Number Publication Date
CN107392958A CN107392958A (en) 2017-11-24
CN107392958B true CN107392958B (en) 2020-07-03

Family

ID=60338017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610324103.2A Active CN107392958B (en) 2016-05-16 2016-05-16 Method and device for determining object volume based on binocular stereo camera

Country Status (1)

Country Link
CN (1) CN107392958B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108305273B (en) * 2017-11-27 2019-08-27 腾讯科技(深圳)有限公司 A kind of method for checking object, device and storage medium
CN108537843A (en) * 2018-03-12 2018-09-14 北京华凯汇信息科技有限公司 The method and device of depth of field distance is obtained according to depth image
CN108661107A (en) * 2018-04-12 2018-10-16 王海燕 Hydraulic energy distributes controllable type excavator
CN108830860B (en) * 2018-04-13 2022-03-25 西安电子科技大学 Binocular image target segmentation method and device based on RGB-D constraint
CN108682039B (en) * 2018-04-28 2022-03-25 国网山西省电力公司电力科学研究院 Binocular stereo vision measuring method
WO2020014913A1 (en) * 2018-07-19 2020-01-23 深圳前海达闼云端智能科技有限公司 Method for measuring volume of object, related device, and computer readable storage medium
CN109146952B (en) * 2018-09-06 2020-11-20 北京京东尚科信息技术有限公司 Method, device and computer readable storage medium for estimating free volume of carriage
CN111380458A (en) * 2018-12-29 2020-07-07 顺丰科技有限公司 Method and device for measuring carriage cargo volume
CN109712350A (en) * 2019-01-31 2019-05-03 浙江云澎科技有限公司 A kind of self-help settlement machine of intelligent recognition and its recognition methods
CN109961468B (en) * 2019-03-15 2021-08-13 北京清瞳时代科技有限公司 Volume measurement method and device based on binocular vision and storage medium
CN110006340B (en) * 2019-03-26 2020-09-08 华为技术有限公司 Object size measuring method and electronic equipment
CN110084849B (en) * 2019-05-09 2021-03-19 福建(泉州)哈工大工程技术研究院 Logistics system with automatic volume and weight measuring function
CN110260801A (en) * 2019-05-13 2019-09-20 平安科技(深圳)有限公司 Method and apparatus for measuring volume of material
CN110816522B (en) * 2019-11-12 2021-02-23 深圳创维数字技术有限公司 Vehicle attitude control method, apparatus, and computer-readable storage medium
CN111724432B (en) * 2020-06-04 2023-08-22 杭州飞步科技有限公司 Object three-dimensional detection method and device
CN111832987B (en) * 2020-06-23 2021-04-02 江苏臻云技术有限公司 Big data processing platform and method based on three-dimensional content
CN112070736B (en) * 2020-09-01 2023-02-24 上海电机学院 Object volume vision measurement method combining target detection and depth calculation

Citations (2)

Publication number Priority date Publication date Assignee Title
CN104517095A (en) * 2013-10-08 2015-04-15 南京理工大学 Head division method based on depth image
CN104899883A (en) * 2015-05-29 2015-09-09 北京航空航天大学 Indoor object cube detection method for depth image scene

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US20150302594A1 (en) * 2013-07-12 2015-10-22 Richard H. Moore System and Method For Object Detection Using Structured Light
CN103983334B (en) * 2014-05-20 2017-01-11 联想(北京)有限公司 Information processing method and electronic equipment
CN104482983A (en) * 2014-12-30 2015-04-01 中国科学院深圳先进技术研究院 Method for measuring three-dimensional material pile

Also Published As

Publication number Publication date
CN107392958A (en) 2017-11-24

Similar Documents

Publication Publication Date Title
CN107392958B (en) Method and device for determining object volume based on binocular stereo camera
US20200279121A1 (en) Method and system for determining at least one property related to at least part of a real environment
US10198623B2 (en) Three-dimensional facial recognition method and system
CN106897648B (en) Method and system for identifying position of two-dimensional code
Feng et al. Local background enclosure for RGB-D salient object detection
JP6295645B2 (en) Object detection method and object detection apparatus
WO2017197988A1 (en) Method and apparatus for determining volume of object
JP6955783B2 (en) Information processing methods, equipment, cloud processing devices and computer program products
CN105069804B (en) Threedimensional model scan rebuilding method based on smart mobile phone
WO2014068472A1 (en) Depth map generation from a monoscopic image based on combined depth cues
CA2812117A1 (en) A method for enhancing depth maps
CN107463659B (en) Object searching method and device
WO2007052191A2 (en) Filling in depth results
CN109697444B (en) Object identification method and device based on depth image, equipment and storage medium
CN112116639A (en) Image registration method and device, electronic equipment and storage medium
CN112184793B (en) Depth data processing method and device and readable storage medium
CN114766042A (en) Target detection method, device, terminal equipment and medium
CN110443228B (en) Pedestrian matching method and device, electronic equipment and storage medium
CN112184811A (en) Monocular space structured light system structure calibration method and device
CN112802114A (en) Multi-vision sensor fusion device and method and electronic equipment
Wietrzykowski et al. Stereo plane R-CNN: Accurate scene geometry reconstruction using planar segments and camera-agnostic representation
JP6016242B2 (en) Viewpoint estimation apparatus and classifier learning method thereof
CN111160233B (en) Human face in-vivo detection method, medium and system based on three-dimensional imaging assistance
KR102410300B1 (en) Apparatus for measuring position of camera using stereo camera and method using the same
US20220366651A1 (en) Method for generating a three dimensional, 3d, model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 310051, Room 304, B/F, Building 2, 399 Danfeng Road, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Hangzhou Hikvision Robot Co.,Ltd.

Address before: 310052, 5/F, Building 1, Building 2, No. 700 Dongliu Road, Binjiang District, Hangzhou City, Zhejiang Province

Patentee before: HANGZHOU HIKROBOT TECHNOLOGY Co.,Ltd.