CN115063472A - Deep learning-based luggage identification and measurement method and device

Deep learning-based luggage identification and measurement method and device

Info

Publication number
CN115063472A
Authority
CN
China
Prior art keywords
luggage
image
acvnet
network model
binocular camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210760186.5A
Other languages
Chinese (zh)
Inventor
张红颖 (Zhang Hongying)
陈宝举 (Chen Baoju)
李彪 (Li Biao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Civil Aviation University of China
Original Assignee
Civil Aviation University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Civil Aviation University of China
Priority to CN202210760186.5A
Publication of CN115063472A
Current legal status: Pending

Classifications

    • G06T 7/60 Image analysis; analysis of geometric attributes
    • G06N 3/04 Neural networks; architecture, e.g. interconnection topology
    • G06N 3/08 Neural networks; learning methods
    • G06T 5/10 Image enhancement or restoration using non-spatial domain filtering
    • G06T 5/40 Image enhancement or restoration using histogram techniques
    • G06T 5/90 Dynamic range modification of images or parts thereof
    • G06T 7/10 Segmentation; edge detection
    • G06T 7/85 Stereo camera calibration
    • G06V 10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V 10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 Image or video recognition using neural networks
    • G06T 2207/10004 Still image; photographic image
    • G06T 2207/10012 Stereo images
    • G06T 2207/10024 Color image
    • G06T 2207/10032 Satellite or aerial image; remote sensing
    • G06T 2207/10044 Radar image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a deep learning-based luggage identification and measurement method, which comprises the following steps: constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model with the training set, and performing transfer learning on the trained model with the luggage stereo matching data set to obtain the final trained ACVNet network model; acquiring a luggage image with a binocular camera, preprocessing the luggage image, and inputting the preprocessed image into the trained ACVNet network model to obtain a disparity map of the luggage; converting the disparity map into a depth map according to the triangulation principle and, combined with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining point cloud data of the scene captured by the binocular camera; and classifying the visualized point cloud data to obtain the size information of the luggage. The invention ensures both the accuracy and the efficiency of luggage size measurement.

Description

Deep learning-based luggage identification and measurement method and device
Technical Field
The invention belongs to the technical field of intelligent luggage handling, identification and measurement, and particularly relates to a deep learning-based luggage identification and measurement method and device.
Background
At present, in airports at home and abroad, checked luggage is generally carried to a luggage transport container or the aircraft cargo hold manually by ground service personnel, who judge the luggage's specifications by visual inspection and manual assessment.
Manual handling of luggage incurs high labor costs, and luggage is easily damaged in the process, which degrades the passengers' flight experience and the airport's operating efficiency. Intelligent luggage handling, by contrast, requires the size and pose information of the luggage so that a robotic arm can stack it in a suitable position, improving handling efficiency and cargo hold space utilization.
At present, luggage is generally measured by traditional grating (light-curtain) measurement or line laser scanning. Grating measurement is prone to large errors in the measured luggage size, while a complete laser scanning setup is comparatively expensive, so neither copes well with identifying and measuring multiple pieces of luggage of different types and shapes.
Therefore, to solve the above problems, the present invention provides a deep learning-based non-contact luggage measurement method and device.
Disclosure of Invention
The invention aims to provide a deep learning-based luggage identification and measurement method and device that address the high cost of existing luggage measurement equipment and the large errors of conventional stereo matching algorithms, achieve better size measurement accuracy than conventional algorithms, and cope with the complex and variable illumination of airport transport environments and the poor matching performance on weakly textured luggage surfaces, problems that otherwise degrade luggage detection and handling efficiency and airport service satisfaction.
The technical problem to be solved by the invention is realized by adopting the following technical scheme:
the deep learning-based luggage identification and measurement method comprises the following steps:
constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
acquiring a luggage image with a binocular camera, preprocessing the luggage image, and inputting the preprocessed image into the trained ACVNet network model to obtain a disparity map of the luggage; converting the disparity map into a depth map according to the triangulation principle and, combined with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining point cloud data of the scene captured by the binocular camera;
and visually classifying the point cloud data and obtaining the size information of the luggage.
Further, the method for acquiring the baggage stereo matching data set comprises the following steps:
a. acquiring an RGB image of the luggage and a disparity map of the luggage with a binocular camera, processing the acquired RGB image into left and right luggage images, and fusing the disparity map with the lidar data, using the extrinsic parameters calibrated between the camera and the lidar, to form a complete disparity map;
b. adjusting the position and posture of the luggage within the fields of view of the camera and the lidar, and repeating step a to obtain multiple groups of image pairs and disparity maps, thereby obtaining the luggage stereo matching data set.
Further, the method for training the ACVNet network model by using the training set comprises:
pre-training the ACVNet network model on the SceneFlow data set and saving the training weights;
and then pre-training the ACVNet network model on the KITTI 2012 and KITTI 2015 real driving scene data sets respectively, and saving the training weights.
Further, the method for preprocessing the baggage image comprises the following steps: and sequentially carrying out image epipolar line correction, image enhancement processing and image segmentation processing on the baggage image to obtain a preprocessed baggage image.
Further, the method for converting the disparity map into a depth map by the triangulation principle and, in combination with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining the point cloud data of the scene captured by the binocular camera comprises the following steps:
converting the disparity map into a depth map by using a triangulation principle:
z = b·f / (X_L - X_R)

wherein b is the baseline distance between the left and right optical axes of the binocular camera, f is the focal length of the camera, z is the depth of the measured object from the camera, X_L is the horizontal coordinate of a point in the left camera's image in the left camera's image coordinate system, and X_R is the horizontal coordinate of the corresponding point in the right camera's image in the right camera's image coordinate system;
acquiring three-dimensional coordinates of each point of the luggage:
x = X_a·z / f, y = Y_a·z / f, z = b·f / d

wherein (x, y, z) are the three-dimensional coordinates of a luggage point, d is the disparity between the left and right luggage images, and (X_a, Y_a) are the image coordinates after image preprocessing;
and converting the obtained baggage depth map into a point cloud map.
Further, the method for performing image epipolar line correction on the baggage image comprises the following steps:
camera calibration is carried out through a Zhang Zhengyou calibration method, internal and external parameters of the binocular camera are obtained, and epipolar line correction is carried out on the left and right baggage images obtained by the binocular camera by using a Bouguet image correction algorithm, so that the images meet epipolar line constraint of stereo matching.
Further, a histogram equalization gray-level transformation enhancement algorithm is adopted to enhance the epipolar-corrected luggage image, and the resulting enhanced image is segmented with a Hough transform algorithm.
The deep learning-based luggage identification and measurement device comprises:
the ACVNet network model acquisition module is used for constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
the baggage image point cloud data acquisition module is used for acquiring a baggage image by using a binocular camera, preprocessing the baggage image, inputting the preprocessed baggage image into a trained ACVNet network model to acquire a disparity map of the baggage, converting the disparity map into a depth map by a triangulation principle, and calibrating internal and external parameters obtained by combining the binocular camera to further obtain point cloud data of a shooting picture of the binocular camera;
and the luggage size information acquisition module is used for visually classifying the point cloud data and acquiring the size information of the luggage.
An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the deep learning based baggage identification measurement method described above.
A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the above-mentioned deep learning-based baggage identification measurement method.
The invention has the advantages and positive effects that:
according to the invention, the binocular camera is used for collecting the luggage image, the luggage image is preprocessed, so that a relatively accurate luggage image is obtained, and the interference of external noise is reduced; then, a pre-trained ACVNet network model is used for fine adjustment on a real luggage data set, weight parameters of stereo matching are adjusted, a depth map obtained by a solid-state laser radar is used as a supplement, more accurate luggage depth map estimation is achieved, three-dimensional information of luggage is obtained, and the accuracy and the stability of a luggage visual measurement scheme are improved in the process; finally, converting the depth map into a luggage point cloud according to the output depth map, and performing visual classification through point cloud data to obtain the size information of the luggage; therefore, the invention can provide a more convenient and more intelligent luggage size measuring method while ensuring the accuracy and the efficiency, and provides a new idea for the construction and industrial structural reform of the intelligent airport.
Detailed Description
First, it should be noted that the specific structures, features, and advantages of the present invention are described below by way of example; all descriptions are for illustrative purposes only and should not be construed as limiting the invention in any way. Furthermore, any individual technical features described in or implied by the embodiments mentioned herein may be combined with or removed from one another (or their equivalents) to obtain further embodiments of the invention that may not be mentioned directly herein.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
The deep learning-based baggage identification and measurement method provided by the embodiment comprises the following steps:
constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
acquiring a luggage image with a binocular camera, preprocessing the luggage image, and inputting the preprocessed image into the trained ACVNet network model to obtain a disparity map of the luggage; converting the disparity map into a depth map according to the triangulation principle and, combined with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining point cloud data of the scene captured by the binocular camera;
and visually classifying the point cloud data and obtaining the size information of the luggage.
The method for acquiring the luggage stereo matching data set comprises the following steps: a. acquiring an RGB image of the luggage and a disparity map of the luggage with a binocular camera, processing the acquired RGB image into left and right luggage images, and fusing the disparity map with the lidar data, using the extrinsic parameters calibrated between the camera and the lidar, to form a complete disparity map; b. adjusting the position and posture of the luggage within the fields of view of the camera and the lidar, and repeating step a to obtain multiple groups of image pairs and disparity maps, thereby obtaining the luggage stereo matching data set.
Specifically, in this embodiment, the method for acquiring the baggage stereo matching data set includes:
step 2-1-1: using a MYNT EYE S1030-IR (standard version) binocular camera and a SICK TiM561 lidar to produce the luggage stereo matching data set, with the camera and the lidar fixed in place;
step 2-1-2: acquiring an RGB image of the luggage with the MYNT EYE S1030-IR binocular camera and saving it;
step 2-1-3: acquiring a luggage disparity map with the MYNT EYE S1030-IR binocular camera and saving it;
step 2-1-4: fusing the camera's luggage disparity map with the SICK TiM561 lidar data, using the extrinsic parameters calibrated between the camera and the lidar, to form a complete disparity map;
step 2-1-5: processing the binocular luggage images obtained in step 2-1-2 and saving them as left and right luggage image pairs;
step 2-1-6: processing the binocular luggage disparity map obtained in step 2-1-3 so that it matches the image pairs obtained in step 2-1-5, and saving the disparity map;
step 2-1-7: adjusting the position and posture of the luggage on the workbench within the camera and lidar fields of view, repeating steps 2-1-2 through 2-1-6, and capturing multiple groups of image pairs and disparity maps, thereby completing the luggage stereo matching data set;
specifically, the method for training the ACVNet network model by using the training set includes:
pre-training the ACVNet network model on the SceneFlow data set and saving the training weights;
and then pre-training the ACVNet network model on the KITTI 2012 and KITTI 2015 real driving scene data sets respectively, and saving the training weights.
It should be noted that the SceneFlow data set is a large-scale synthetic data set providing over 30,000 training image pairs, enough to train a deep neural network fully; ACVNet is therefore first pre-trained on this data set and the training weights are saved. The KITTI data set comes in 2012 and 2015 versions; these real-world data sets are widely used in the field, so ACVNet is pre-trained once on each of KITTI 2012 and KITTI 2015 and the training weights are saved.
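As a rough illustration of this pretrain-then-fine-tune sequence, the sketch below fine-tunes a pre-trained stereo network on the luggage data set in PyTorch. It is a minimal sketch under stated assumptions: the `ACVNet` and `LuggageStereoDataset` classes, the file names, the learning rate, the epoch count, and the smooth L1 disparity loss are illustrative stand-ins, not details fixed by this disclosure.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

# Hypothetical imports: an ACVNet implementation and a Dataset wrapping the
# luggage stereo matching data set (left image, right image, fused disparity).
from acvnet import ACVNet
from luggage_data import LuggageStereoDataset

model = ACVNet(max_disp=192).cuda()
model.load_state_dict(torch.load("acvnet_kitti_pretrained.pth"))  # saved pre-training weights

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # small LR for transfer learning
loader = DataLoader(LuggageStereoDataset("luggage_dataset/"), batch_size=4, shuffle=True)

model.train()
for epoch in range(20):
    for left, right, gt_disp in loader:
        left, right, gt_disp = left.cuda(), right.cuda(), gt_disp.cuda()
        valid = (gt_disp > 0) & (gt_disp < 192)   # only lidar-fused ground-truth pixels
        pred = model(left, right)                 # predicted disparity map
        loss = F.smooth_l1_loss(pred[valid], gt_disp[valid])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "acvnet_luggage_finetuned.pth")  # final trained weights
```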
Further, the method for preprocessing the baggage image comprises the following steps: and sequentially carrying out image epipolar line correction, image enhancement processing and image segmentation processing on the baggage image to obtain a preprocessed baggage image.
Further, the method for converting the disparity map into a depth map by the triangulation principle and, in combination with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining the point cloud data of the scene captured by the binocular camera comprises the following steps:
converting the disparity map into a depth map by utilizing a triangulation principle:
z = b·f / (X_L - X_R)

wherein b is the baseline distance between the left and right optical axes of the binocular camera, f is the focal length of the camera, z is the depth of the measured object from the camera, X_L is the horizontal coordinate of a point in the left camera's image in the left camera's image coordinate system, and X_R is the horizontal coordinate of the corresponding point in the right camera's image in the right camera's image coordinate system;
acquiring three-dimensional coordinates of each point of the luggage:
x = X_a·z / f, y = Y_a·z / f, z = b·f / d

wherein (x, y, z) are the three-dimensional coordinates of a luggage point, d is the disparity between the left and right luggage images, and (X_a, Y_a) are the image coordinates after image preprocessing;
and converting the obtained baggage depth map into a point cloud map.
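Read together, the two formulas above amount to a short back-projection routine. The NumPy sketch below converts a disparity map into depths and then into point cloud coordinates; the function and variable names, and the pinhole-model form of the back-projection, are illustrative assumptions consistent with the formulas rather than text from the disclosure.

```python
import numpy as np

def disparity_to_point_cloud(disp, f, b, cx, cy):
    """disp: HxW disparity map (pixels); f: focal length (pixels);
    b: baseline between the optical axes (m); (cx, cy): principal point (pixels)."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0                                 # skip unmatched pixels
    z = np.zeros_like(disp, dtype=np.float64)
    z[valid] = f * b / disp[valid]                   # z = b*f / (X_L - X_R)
    x = (u - cx) * z / f                             # x = X_a*z / f
    y = (v - cy) * z / f                             # y = Y_a*z / f
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)  # Nx3 point cloud
```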
Further, the method for performing image epipolar line correction on the baggage image comprises the following steps:
calibrating the camera by Zhang Zhengyou's calibration method, acquiring the intrinsic and extrinsic parameters of the binocular camera, and performing epipolar correction on the left and right luggage images acquired by the binocular camera with the Bouguet image rectification algorithm so that the images satisfy the epipolar constraint of stereo matching; adopting a histogram equalization gray-level transformation enhancement algorithm to enhance the epipolar-corrected luggage image; and segmenting the resulting enhanced image with a Hough transform algorithm.
Specifically, an experimental platform is built and the binocular camera is fixed; the camera is calibrated by Zhang Zhengyou's method to obtain the intrinsic and extrinsic parameters of the binocular camera. Then, using the acquired camera parameters, including the distortion parameters K, the rotation matrix R, and the translation vector T, epipolar correction is performed on the left and right luggage images with the Bouguet image rectification algorithm so that the images satisfy the epipolar constraint of stereo matching;
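For reference, this calibration-plus-rectification step maps naturally onto OpenCV, whose cv2.stereoRectify implements the Bouguet algorithm. The sketch below is illustrative; the file names and the layout of the saved calibration results are assumptions.

```python
import cv2
import numpy as np

# Intrinsic/extrinsic results of the Zhang-style calibration, assumed to have
# been saved earlier (e.g. from cv2.stereoCalibrate); the file name is illustrative.
calib = np.load("stereo_calib.npz")
K1, D1, K2, D2, R, T = (calib[k] for k in ("K1", "D1", "K2", "D2", "R", "T"))

left_img = cv2.imread("left_luggage.png")
right_img = cv2.imread("right_luggage.png")
h, w = left_img.shape[:2]

# Bouguet rectification: compute the rectifying rotations and projections,
# then remap both images so corresponding points share the same row.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, (w, h), R, T)
m1x, m1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, (w, h), cv2.CV_32FC1)
m2x, m2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, (w, h), cv2.CV_32FC1)
left_rect = cv2.remap(left_img, m1x, m1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(right_img, m2x, m2y, cv2.INTER_LINEAR)  # epipolar constraint now holds
```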
the method adopts a histogram equalization image gray level transformation enhancement algorithm to perform enhancement processing on the corrected luggage image, and comprises the following specific steps:
firstly, obtaining a gray level histogram of a corrected luggage image, and counting to obtain a probability density function corresponding to each gray level, wherein the formula is as follows:
p(r_k) = n_k / (M×N), k = 0, 1, …, L-1

in the formula, p(r_k) is the probability density of pixels with gray level k, n_k is the number of pixels with gray level k, M×N is the total number of image pixels, and L is the number of gray levels of the image;
and obtaining the cumulative distribution function of each gray level by the probability density function, wherein the formula is as follows:
S_k = ∑_{j=0}^{k} p(r_j)

in the formula, S_k is the cumulative probability of pixels with gray level k over the whole image;
rounding calculation and determining pixel mapping relation, wherein the formula is as follows:
f(k) = (L-1) × S_k

wherein f(k) is the new pixel value, obtained by rounding, that pixels with gray level k are mapped to;
calculating a new equalized histogram according to the mapping relation;
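The three formulas above translate almost line for line into code. A minimal NumPy sketch, equivalent in effect to OpenCV's cv2.equalizeHist for 8-bit images, is:

```python
import numpy as np

def equalize_histogram(gray, levels=256):
    """gray: 2-D uint8 image; returns the equalized image."""
    hist = np.bincount(gray.ravel(), minlength=levels)  # n_k for each gray level k
    p = hist / gray.size                                # p(r_k) = n_k / (M*N)
    s = np.cumsum(p)                                    # S_k = sum of p(r_j), j = 0..k
    f = np.round((levels - 1) * s).astype(np.uint8)     # f(k) = (L-1) * S_k, rounded
    return f[gray]                                      # apply the pixel mapping
```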
the method for segmenting the enhanced image by using the Hough transform algorithm comprises the following specific steps of:
detecting the edges of the luggage according to a Canny edge detection algorithm;
carrying out a straight-line Hough transform on the obtained luggage edge image, converting it from the image Cartesian coordinate system to the polar-coordinate Hough space, with the following formula:
r=x·cos(θ)+y·sin(θ)
in the formula, r is the perpendicular distance from the origin to the straight line, θ is the angle between r and the x axis, x is the abscissa of a pixel point, and y is the ordinate of the pixel point;
setting a threshold value according to an image polar coordinate Hough space system, and outputting a maximum Hough transform value;
according to the obtained maximum Hough transform values, segmenting the luggage image with the following formula:
G = g, when c_left ≤ c ≤ c_right; G = 0, otherwise

wherein G is the pixel value of the luggage gray-scale image after transformation, g is the pixel value before transformation, c is the column index of an image pixel, c_left is the column of the left Hough transform value obtained from the image, and c_right is the column of the right Hough transform value obtained from the image;
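Putting the Canny and Hough steps together, the segmentation could be sketched as follows; here `enhanced_gray` stands for the equalized luggage image from the previous step, and the Canny thresholds, the Hough accumulator threshold, and the 10-degree verticality tolerance are illustrative assumptions.

```python
import cv2
import numpy as np

edges = cv2.Canny(enhanced_gray, 50, 150)           # Canny edge detection on the luggage
lines = cv2.HoughLines(edges, 1, np.pi / 180, 120)  # each line as (r, theta): r = x*cos(theta) + y*sin(theta)

# Near-vertical lines (theta close to 0 or pi) trace the luggage's left and
# right boundaries; their |r| values are column positions in the image.
cols = [abs(r) for r, theta in lines[:, 0]
        if min(theta, np.pi - theta) < np.deg2rad(10)]
c_left, c_right = int(min(cols)), int(max(cols))

# The segmentation formula above: G = g inside [c_left, c_right], 0 outside.
segmented = np.zeros_like(enhanced_gray)
segmented[:, c_left:c_right + 1] = enhanced_gray[:, c_left:c_right + 1]
```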
in this embodiment, the method for visually classifying the point cloud data and obtaining the size information of the baggage includes:
running the program in Visual Studio 2017 with the MYNT EYE S1030-IR binocular camera SDK, outputting the binocular luggage images and the network's disparity map, and marking the measured size information of the luggage on the left luggage image;
obtaining the luggage weight information from the luggage conveyor device, whose belt speed is known;
obtaining the length and width of the luggage from the minimum bounding rectangle; for point cloud data with a smooth surface, taking the mean of the points on the luggage surface as the height value, thereby obtaining the volume of the luggage;
and projecting an irregular point cloud onto the XOY plane by point cloud slicing; since the points on the plane carry different depth values, the volume of irregular luggage is obtained by integration, as sketched below.
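A rough sketch of these two volume estimates, assuming an Nx3 point cloud of the visible luggage surface whose z values are heights above the table plane (the axis convention, grid cell size, and function names are assumptions for illustration):

```python
import cv2
import numpy as np

def regular_luggage_size(points):
    """Smooth-topped luggage: minimum bounding rectangle plus mean surface height."""
    xy = points[:, :2].astype(np.float32)
    _center, (length, width), _angle = cv2.minAreaRect(xy)  # minimum circumscribed rectangle
    height = float(np.mean(points[:, 2]))                   # mean of surface points as height
    return length, width, height, length * width * height   # volume = l * w * h

def irregular_luggage_volume(points, cell=0.005):
    """Irregular luggage: project onto the XOY plane and integrate column heights."""
    ij = ((points[:, :2] - points[:, :2].min(axis=0)) / cell).astype(int)
    depth = np.zeros(ij.max(axis=0) + 1)
    np.maximum.at(depth, (ij[:, 0], ij[:, 1]), points[:, 2])  # tallest point per grid cell
    return depth.sum() * cell * cell                          # sum of column volumes
```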
Example 2
The deep learning-based luggage identification and measurement device comprises:
the ACVNet network model acquisition module is used for constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
the luggage image point cloud data acquisition module is used for acquiring a luggage image by using a binocular camera, preprocessing the luggage image, inputting the preprocessed luggage image into a trained ACVNet network model to acquire a disparity map of the luggage, converting the disparity map into a depth map by using a triangulation principle, and calibrating internal and external parameters obtained by combining the binocular camera to further obtain point cloud data of a shooting picture of the binocular camera;
and the luggage size information acquisition module is used for visually classifying the point cloud data and acquiring the size information of the luggage.
In addition, the present embodiment also provides a computing device, including:
one or more processing units;
a storage unit for storing one or more programs,
wherein the one or more programs, when executed by the one or more processing units, cause the one or more processing units to perform the deep learning-based luggage identification and measurement method described above. It is noted that the computing device may include, but is not limited to, a processing unit and a storage unit; those skilled in the art will appreciate that the processing unit and storage unit do not limit the computing device, which may include more or fewer components, combine certain components, or use different components; for example, the computing device may also include input/output devices, network access devices, buses, and the like.
There is also provided a computer-readable storage medium carrying non-volatile program code executable by a processor; the computer program, when executed by the processor, implements the steps of the deep learning-based luggage identification and measurement method described above. It is noted that the readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. The program embodied on the readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as C. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on a remote computing device, or entirely on a remote computing device or server. In the remote case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The present invention has been described in detail with reference to the above embodiments, but the description covers only preferred embodiments and should not be construed as limiting the scope of the invention. All equivalent changes and modifications made within the scope of the present invention shall fall within its scope.

Claims (10)

1. A deep learning-based luggage identification and measurement method, characterized by comprising the following steps:
constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
acquiring a luggage image with a binocular camera, preprocessing the luggage image, and inputting the preprocessed image into the trained ACVNet network model to obtain a disparity map of the luggage; converting the disparity map into a depth map according to the triangulation principle and, combined with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining point cloud data of the scene captured by the binocular camera;
and visually classifying the point cloud data and obtaining the size information of the luggage.
2. The deep learning based baggage identification measurement method according to claim 1, wherein: the method for acquiring the luggage stereo matching data set comprises the following steps:
a. acquiring an RGB image of the luggage and a disparity map of the luggage with a binocular camera, processing the acquired RGB image into left and right luggage images, and fusing the disparity map with the lidar data, using the extrinsic parameters calibrated between the camera and the lidar, to form a complete disparity map;
b. adjusting the position and posture of the luggage within the fields of view of the camera and the lidar, and repeating step a to obtain multiple groups of image pairs and disparity maps, thereby obtaining the luggage stereo matching data set.
3. The deep learning-based baggage identification measurement method according to claim 2, wherein: the method for training the ACVNet network model by using the training set comprises the following steps:
pre-training the ACVNet network model on the SceneFlow data set and saving the training weights;
and then pre-training the ACVNet network model on the KITTI 2012 and KITTI 2015 real driving scene data sets respectively, and saving the training weights.
4. The deep learning based baggage identification measurement method according to claim 3, wherein: the method for preprocessing the luggage image comprises the following steps: and sequentially carrying out image epipolar line correction, image enhancement processing and image segmentation processing on the baggage image to obtain a preprocessed baggage image.
5. The deep learning based baggage identification measurement method according to claim 4, wherein: the method for converting the parallax map into the depth map by the triangulation principle and obtaining the point cloud data of the shooting picture of the binocular camera by combining the internal and external parameters obtained by calibrating the binocular camera comprises the following steps:
converting the disparity map into a depth map by utilizing a triangulation principle:
z = b·f / (X_L - X_R)

wherein b is the baseline distance between the left and right optical axes of the binocular camera, f is the focal length of the camera, z is the depth of the measured object from the camera, X_L is the horizontal coordinate of a point in the left camera's image in the left camera's image coordinate system, and X_R is the horizontal coordinate of the corresponding point in the right camera's image in the right camera's image coordinate system;
acquiring three-dimensional coordinates of each point of the luggage:
x = X_a·z / f, y = Y_a·z / f, z = b·f / d

wherein (x, y, z) are the three-dimensional coordinates of a luggage point, d is the disparity between the left and right luggage images, and (X_a, Y_a) are the image coordinates after image preprocessing;
and converting the obtained baggage depth map into a point cloud map.
6. The deep learning-based baggage identification measurement method according to claim 4, wherein: the method for carrying out image epipolar line correction on the luggage image comprises the following steps:
calibrating the camera by Zhang Zhengyou's calibration method, acquiring the intrinsic and extrinsic parameters of the binocular camera, and performing epipolar correction on the left and right luggage images acquired by the binocular camera with the Bouguet image rectification algorithm so that the images satisfy the epipolar constraint of stereo matching.
7. The deep learning based baggage identification measurement method according to claim 6, wherein: adopting a histogram equalization image gray level transformation enhancement algorithm to enhance the luggage image after polar line correction; and carrying out baggage image segmentation processing on the obtained enhanced image by using a Hough transform algorithm.
8. A deep learning-based luggage identification and measurement device, characterized by comprising:
the ACVNet network model acquisition module is used for constructing an ACVNet network model, acquiring a training set and a luggage stereo matching data set, training the ACVNet network model by using the training set, and performing transfer learning on the trained ACVNet network model by using the luggage stereo matching data set to obtain a final trained ACVNet network model;
the luggage image point cloud data acquisition module is used for acquiring a luggage image with a binocular camera, preprocessing the luggage image, inputting the preprocessed image into the trained ACVNet network model to obtain a disparity map of the luggage, converting the disparity map into a depth map according to the triangulation principle and, combined with the intrinsic and extrinsic parameters obtained from binocular camera calibration, obtaining point cloud data of the scene captured by the binocular camera;
and the luggage size information acquisition module is used for visually classifying the point cloud data and acquiring the size information of the luggage.
9. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 7.
10. A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the method of any of claims 1 to 7.
CN202210760186.5A 2022-06-29 2022-06-29 Deep learning-based luggage identification and measurement method and device Pending CN115063472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210760186.5A CN115063472A (en) 2022-06-29 2022-06-29 Deep learning-based luggage identification and measurement method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210760186.5A CN115063472A (en) 2022-06-29 2022-06-29 Deep learning-based luggage identification and measurement method and device

Publications (1)

Publication Number Publication Date
CN115063472A true CN115063472A (en) 2022-09-16

Family

ID=83203449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210760186.5A Pending CN115063472A (en) 2022-06-29 2022-06-29 Deep learning-based luggage identification and measurement method and device

Country Status (1)

Country Link
CN (1) CN115063472A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329683A (en) * 2022-10-17 2022-11-11 中国民航大学 Aviation luggage online loading planning method, device, equipment and medium
CN115329683B (en) * 2022-10-17 2022-12-13 中国民航大学 Aviation luggage online loading planning method, device, equipment and medium
CN117649566A (en) * 2024-01-30 2024-03-05 四川省机场集团有限公司成都天府国际机场分公司 Airport luggage size classification method based on image processing
CN117649566B (en) * 2024-01-30 2024-04-09 四川省机场集团有限公司成都天府国际机场分公司 Airport luggage size classification method based on image processing

Similar Documents

Publication Publication Date Title
CN110443836B (en) Point cloud data automatic registration method and device based on plane features
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN109410256B (en) Automatic high-precision point cloud and image registration method based on mutual information
CN105678689B (en) High-precision map data registration relation determining method and device
CN107063228B (en) Target attitude calculation method based on binocular vision
CN106651752B (en) Three-dimensional point cloud data registration method and splicing method
CN109285145B (en) Multi-standing tree height measuring method based on smart phone
CN111507390B (en) Storage box body identification and positioning method based on contour features
CN104484648B (en) Robot variable visual angle obstacle detection method based on outline identification
CN115063472A (en) Deep learning-based luggage identification and measurement method and device
CN114255286B (en) Target size measuring method based on multi-view binocular vision perception
CN107392929B (en) Intelligent target detection and size measurement method based on human eye vision model
CN109523595B (en) Visual measurement method for linear angular spacing of building engineering
Wu et al. Passive measurement method of tree diameter at breast height using a smartphone
CN112561887B (en) Belt conveyor coal flow binocular vision measurement method based on deep migration learning
CN111612850B (en) Point cloud-based pig body ruler parameter measurement method
CN106595500A (en) Transmission line ice coating thickness measurement method based on unmanned aerial vehicle binocular vision
CN112184765B (en) Autonomous tracking method for underwater vehicle
CN111046843A (en) Monocular distance measurement method under intelligent driving environment
CN107957246A (en) Article geometrical size measuring method on conveyer belt based on binocular vision
CN110910382A (en) Container detection system
CN110992337A (en) Container damage detection method and system
CN106709432B (en) Human head detection counting method based on binocular stereo vision
CN116843742B (en) Calculation method and system for stacking volume after point cloud registration for black coal loading vehicle
CN116703895A (en) Small sample 3D visual detection method and system based on generation countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination