CN115661230A - Estimation method for warehouse material volume - Google Patents

Estimation method for warehouse material volume

Info

Publication number
CN115661230A
Authority
CN
China
Prior art keywords: warehouse, pictures, frame, group, volume
Prior art date
Legal status
Pending
Application number
CN202211301346.6A
Other languages
Chinese (zh)
Inventor
谭龙兴
郑晓彬
Current Assignee
Zhejiang Tianzhui Technology Co ltd
Original Assignee
Zhejiang Tianzhui Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Tianzhui Technology Co ltd
Priority to CN202211301346.6A
Publication of CN115661230A

Landscapes

  • Image Analysis (AREA)

Abstract

The present disclosure provides an estimation method, an estimation apparatus, an electronic device, and a storage medium, wherein the method includes: acquiring a video stream of warehouse material, and decomposing the video stream frame by frame to obtain a first group of frame pictures; identifying the warehouse material in each frame picture of the first group of frame pictures, and performing depth detection on the warehouse material in each frame picture to determine depth information of the warehouse material; acquiring height information of the warehouse material, and calculating the volume of the warehouse material based on the depth information and the height information; and performing fusion learning on the warehouse material volumes determined from periodically acquired video streams, and taking the fused volume as the volume of the warehouse material.

Description

Estimation method for warehouse material volume
Technical Field
The disclosure relates to the field of material volume estimation, in particular to an estimation method based on material stacking depth.
Background
In the engineering field, estimating the volume or mass of large material stockpiles is urgently needed to raise the management level of intelligent factories and supply chains. Acquiring raw material information automatically, quickly, and accurately by technical means improves working efficiency and reduces the consumption of manpower and material resources.
At present, methods for estimating the volume of large material piles mainly comprise multi-view stereo vision, joint projector-camera detection, lidar, three-dimensional laser scanning, ultrasonic ranging, and similar technical schemes. These methods have been applied in practice to measurement scenarios such as granaries, coal mines, cement plants, and landslides.
These prior solutions rely on expensive equipment such as laser scanners and depth cameras, or require additional auxiliary devices such as projectors, so implementing them means purchasing new, costly hardware. Meanwhile, vision or point-cloud schemes built mainly on convolutional neural networks consume considerable computing resources, which places high demands on system hardware performance and requires an expensive server.
Meanwhile, these methods lack optimization for scenes in which the warehouse stockpile changes infrequently: when the video is converted into frame images and each frame is processed independently, the detection results fluctuate widely and do not match the real situation.
Disclosure of Invention
The present disclosure provides a method for estimating the volume of warehouse material, so as to at least solve the aforementioned technical problems.
According to a first aspect of the present disclosure, there is provided a method for estimating warehouse material volume, comprising:
acquiring a first video stream of warehouse material, and performing frame decomposition on the first video stream to obtain a first group of frame pictures;
identifying the warehouse material in each frame picture of the first group of frame pictures, and performing depth detection on the warehouse material in each frame picture to determine the depth information of the warehouse material;
acquiring height information of the warehouse material, and calculating the volume of the warehouse material based on the depth information and the height information of the warehouse material;
and performing fusion learning on the warehouse material volumes determined from periodically acquired video streams, and taking the fused volume as the volume of the warehouse material.
This method has the advantage that, by combining the perspective principle, signal processing, rule learning, and deep learning, warehouse material can be detected simply and quickly; combined with the actual conditions of the site, the warehouse stockpile volume can be estimated with only a single ordinary camera.
In some embodiments, before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing green screen inspection on each frame picture in the first group of frame pictures;
the green screen inspection comprises: checking the pixel mean of each frame picture in the first group of frame pictures, and retaining the frame pictures whose pixel mean meets the preset frame-picture mean standard;
performing stripe inspection on each frame picture in the first group of frame pictures;
the stripe inspection comprises: checking the standard deviation of the longitudinal pixels of each frame picture in the first group of frame pictures, and retaining the frame pictures whose longitudinal pixel standard deviation meets the preset standard;
performing blur inspection on each frame picture in the first group of frame pictures;
the blur inspection comprises: checking the overall pixel standard deviation of each frame picture in the first group of frame pictures, and retaining the frame pictures whose overall pixel standard deviation meets the preset standard;
performing light inspection on each frame picture in the first group of frame pictures;
the light inspection comprises: counting the highlighted pixels in each frame picture in the first group of frame pictures, and retaining the frame pictures in which the proportion of highlighted pixels to total pixels is below the preset proportion;
performing vehicle inspection on each frame picture in the first group of frame pictures;
the vehicle inspection comprises: checking the first group of frame pictures with a target detection model, and identifying which pictures contain vehicles and the number of vehicles in each.
This has the beneficial effect of screening the obtained frame pictures, reducing the possibility that defective pictures affect the estimation result.
In some embodiments, identifying the warehouse material in each frame picture of the first group of frame pictures and performing depth detection on the warehouse material in each frame picture comprises:
marking, in each frame picture of the first group of frame pictures, the positions of the four vertexes of the bottom edge of the warehouse;
cropping the warehouse bottom-edge portion from each frame picture according to the marked positions of the four vertexes, to obtain a second group of pictures;
performing affine transformation on the second group of pictures and converting them into rectangles, to obtain a third group of pictures;
convolving the pixels of each picture in the third group of pictures with a normal-distribution kernel, and converting all the convolved pictures into corresponding gray-scale images;
converting the third group of pictures into corresponding one-dimensional signals according to the pixel information of the gray-scale images;
and, according to the correspondence between the pixel points in the third group of pictures and the size of the warehouse bottom edge, taking the peak positions determined in the one-dimensional signals as the depth information of the warehouse material.
In some embodiments, performing fusion learning on the warehouse material volume determined from the periodically acquired video stream and taking the fused volume as the volume of the warehouse material comprises:
sorting the detection result of each frame of the first group of frame pictures into a first queue, wherein the detection result comprises: whether the current frame picture passes the green screen, stripe, blur, and light inspections, the vehicle information in the current frame picture, and the specific value of the warehouse material volume in the current frame picture;
replacing the detection results in the first queue that fail the green screen, stripe, blur, or light inspection with the detection result of the preceding frame, to generate a second queue;
when no vehicle is identified in the vehicle inspection of the frame picture corresponding to the last detection result in the first queue, taking the warehouse material volume in the last detection result of the second queue as the volume obtained by fusion learning;
when a vehicle is identified in the vehicle inspection of the frame picture corresponding to the last result in the first queue and the number of vehicles is 1, screening the detection results of the second queue from back to front until a detection result without vehicle information is found, comparing the warehouse material volume in the screened detection result of the second queue with the warehouse material volume in the last detection result of the first queue, and calculating the difference between the two;
when a vehicle is identified in the vehicle inspection of the frame picture corresponding to the last result in the first queue and the number of vehicles is greater than 1, stopping the fusion learning and continuing to screen the detection results of the frame pictures decomposed from subsequent video streams until a detection result without vehicle information is found;
computing the median of the warehouse material volumes in the detection results of the 10 frame pictures following the screened detection result;
and taking the median as the warehouse material volume obtained by fusion learning.
In some embodiments, comparing the warehouse material volume in the screened detection result of the second queue with the warehouse material volume in the last detection result of the first queue further comprises:
if the warehouse material volume in the screened detection result of the second queue is larger than that in the last detection result of the first queue, comparing the calculated difference with the material volume shoveled by one forklift;
if the difference does not exceed the volume shoveled by one forklift, taking the warehouse material volume in the last detection result of the first queue as the volume obtained by the current fusion learning; if the difference exceeds the volume shoveled by one forklift, subtracting one forklift load from the warehouse material volume in the screened detection result of the second queue and taking the result as the volume obtained by the current fusion learning;
if the warehouse material volume in the screened detection result of the second queue is smaller than that in the last detection result of the first queue, comparing the calculated difference with the material volume added by one truck;
and if the difference does not exceed the volume added by one truck, taking the warehouse material volume in the last detection result of the first queue as the volume obtained by the current fusion learning; if the difference exceeds the volume added by one truck, adding one truck load to the warehouse material volume in the screened detection result of the second queue and taking the result as the volume obtained by the current fusion learning.
According to a second aspect of the present disclosure, there is also provided an estimation device for warehouse material volume, wherein the device comprises:
a camera unit, configured to shoot the warehouse video stream;
an image acquisition unit, configured to decompose the warehouse video stream to obtain a first group of frame pictures;
an information processing unit, configured to perform depth detection on the first group of frame pictures and to obtain, from the detection results, the warehouse material volume corresponding to each frame picture in the first group;
the information processing unit is further configured to perform fusion learning on the warehouse material volume corresponding to each frame picture;
a signal receiving unit, configured to receive the picture information transmitted by the image acquisition unit;
the signal receiving unit is further configured to output the fused material volume.
In some embodiments, the device further comprises:
an image inspection unit, configured to perform a first inspection on the first group of frame pictures obtained by the image acquisition unit;
wherein the first inspection comprises:
performing light inspection on the first group of frame pictures;
performing quality inspection on the first group of frame pictures;
performing obstacle inspection on the first group of frame pictures;
and a storage unit, configured to store the warehouse video stream shot by the camera unit;
the storage unit is further configured to store and output the fused material volume.
According to a third aspect of the present disclosure, there is also provided an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to cause the at least one processor to perform any one of the methods above.
According to a fourth aspect of the present disclosure, there is also provided a non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform any one of the methods above.
This application combines the characteristics of warehouses in the engineering field, converting the prediction of stockpile volume into the prediction of stockpile depth inside the warehouse, and achieves detection through the light change between the stockpile and the ground. Meanwhile, a fusion learning method is constructed to improve the stability and accuracy of the detection results. Existing schemes achieve good prediction mainly by acquiring detailed spatial information, which must be captured by laser or similar means. By contrast, this method meets the warehouse material volume estimation requirement with only a single ordinary camera adapted to the actual conditions of the site, needs only simple supporting hardware with no GPU server, and achieves a good prediction effect by fusing information from different times.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, like or corresponding reference characters designate like or corresponding parts.
Fig. 1 shows a first flow diagram of an estimation method for warehouse material volume according to an embodiment of the present disclosure;
Fig. 2 shows a second flow diagram of an estimation method for warehouse material volume according to an embodiment of the present disclosure;
Fig. 3 shows a schematic flowchart of quality detection on frame pictures according to an embodiment of the present disclosure;
Fig. 4 shows a schematic flowchart of estimating warehouse material volume according to this embodiment;
Fig. 5 shows a one-dimensional signal diagram converted from a gray-scale image according to this embodiment;
Fig. 6 shows a schematic structural diagram of the device for the warehouse material volume estimation method according to this embodiment.
Detailed Description
In order to make the objects, features and advantages of the present disclosure more apparent and understandable, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The essence of the technical solution of the embodiments of the present application is explained in detail below with reference to the accompanying drawings.
Fig. 1 is a first flowchart of the estimation method for warehouse material volume according to an embodiment of the present disclosure. As shown in Fig. 1, the method includes the following steps:
step 101, obtaining a first video stream for warehouse materials, and performing frame decomposition on the first video stream to obtain a first group of frame pictures.
In this embodiment, preferably, the first video stream is a video stream transmitted by a camera in real time. The source of the video stream is not limited to a monitoring camera: it can come from any device capable of shooting video, or from any storage medium capable of outputting video resources.
In this embodiment, each frame picture in the first group of frame pictures yields an independent result; these results do not affect one another and are finally fused in step 104.
Step 102, identifying the warehouse material in each frame picture of the first group of frame pictures, performing depth detection on the warehouse material in each frame picture, and determining the depth information of the warehouse material.
In this embodiment, the depth of the warehouse material identified in each frame picture is detected by a depth detection method. The depth detection method identifies the pixel information of the material and the warehouse bottom-edge portion in the picture, determines the pixel span from the boundary line between the material and the warehouse bottom edge to the critical position of the material, and determines the actual depth of the warehouse material according to the correspondence between pixel information in the picture and the actual size of the warehouse.
And 103, acquiring the height information of the warehouse materials, and calculating the volume of the warehouse materials based on the depth information and the height information of the warehouse materials.
In this embodiment, the height information of the warehouse material depends on the material itself, and different materials are stacked to different heights. Since material is stacked in the warehouse according to a standard, the stacking height is a fixed empirical value. The stacking width equals the warehouse width, so the volume of the warehouse material can be calculated by combining the empirical stacking height for the corresponding material, the detected stacking depth, and the fixed stacking width.
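The relation just described reduces to a product of three dimensions. A minimal sketch follows (all numbers and names are illustrative, not taken from the disclosure):

```python
# Minimal sketch: volume of a stack that spans the full warehouse width.
def stack_volume(depth_m: float, width_m: float, height_m: float) -> float:
    return depth_m * width_m * height_m

# e.g. a stack detected 12 m deep in a 20 m wide bay, stacked to the
# empirical height of 3 m: stack_volume(12, 20, 3) == 720 cubic meters.
```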
And 104, performing fusion learning on the volume of the warehouse material determined by the periodically acquired video stream, and taking the volume after the fusion learning as the volume of the warehouse material.
In this embodiment, the estimated warehouse material volume obtained in step 103 corresponds to each frame picture from step 101. Because the material in the warehouse is updated infrequently, the estimated volumes corresponding to the frame pictures in the first group are fused to ensure the stability and accuracy of the estimation result. Preferably, after fusion learning over all estimated volumes in the first group of frame pictures, a single result is output as the fusion result and taken as the final estimate of the volume of material stacked in the warehouse for the first video stream.
Fig. 2 is a second flowchart of the estimation method for warehouse material volume according to an embodiment of the present application. As shown in Fig. 2, the method includes the following steps:
step 201, video streams for warehouse materials are acquired and decomposed frame by frame.
In this embodiment, acquiring the video stream of the warehouse camera includes decomposing the acquired video stream into a group of frame pictures; the subsequent steps are performed on each decomposed picture individually to obtain independent results, and in step 205 the multi-frame results are fused into a single output.
In this embodiment, since the acquired video stream is updated in real time, the corresponding decomposition of the updated video stream into frame pictures is also updated in real time, but the update frequency of the material accumulated in the warehouse is low, and preferably, the estimated volume can be updated every three hours.
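A minimal sketch of this decomposition step follows, assuming an OpenCV-readable camera URL; the sampling interval and frame cap are illustrative parameters that the disclosure does not fix:

```python
import cv2

def decompose_stream(url: str, every_n: int = 25, max_frames: int = 100):
    """Decompose a video stream into the 'first group of frame pictures'."""
    cap = cv2.VideoCapture(url)
    frames, i = [], 0
    while cap.isOpened() and len(frames) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:   # keep one frame per sampling interval
            frames.append(frame)
        i += 1
    cap.release()
    return frames
```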
Step 202, checking whether the frame pictures pass the quality inspection and light inspection.
The frame pictures in this embodiment may have various quality problems that affect the judgment of the algorithm, so quality inspection is required before processing. Preferably, quality detection targets several common image quality problems, such as green screen, stripes, and blur.
Fig. 3 is a schematic flowchart of the quality inspection of the frame image in step 202 of this embodiment, and as shown in fig. 3, the quality inspection of the frame image in this embodiment of the present application includes the following steps:
step 301, checking by a green screen. The green screen detection in this embodiment may check the frame picture by the pixel average value, and judge whether the quality of the detected frame picture meets the standard by judging whether the average value of the pixels of the frame picture meets the standard.
Step 302, checking the braces. The brace inspection in this embodiment may be performed by a standard deviation of vertical pixels of the frame picture, and determine whether the detected frame picture quality meets a standard by determining whether the standard deviation of the frame picture is within a specified range.
Step 303, fuzzy check. The blurring test in this embodiment may be performed by using a standard deviation of the entire frame picture, and whether the standard deviation of the frame picture is within a specified range is determined to determine whether the quality of the detected frame picture meets a standard.
The green screen, stripe, and blur inspections may be performed sequentially or simultaneously; this embodiment does not limit the order.
In this embodiment, at certain special moments, light may affect the quality of the acquired frame pictures, for example: morning light shining through the warehouse windows, or sunlight falling on the stockpile, which would interfere with the subsequent estimation. Preferably, the frame pictures that pass the above inspections also undergo light inspection. In this embodiment, the inspection may count the number of highlighted pixels in the frame picture; if the proportion of highlighted pixels to the total pixels is small, the light inspection passes.
Preferably, the average pixel value, the standard deviation of the longitudinal pixels, the standard deviation of the whole pixels, and the proportion of the highlighted pixels of the frame picture can be preset according to the pixel information of the expected picture before detection.
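The sketch below gathers the four checks in one place, assuming OpenCV and NumPy; every threshold is an illustrative placeholder for the preset values just described, and the stripe test encodes one plausible reading of the "longitudinal pixel standard deviation":

```python
import cv2
import numpy as np

def passes_quality_checks(frame_bgr,
                          mean_range=(30, 220),       # green-screen bounds
                          min_col_std=5.0,            # stripe bound
                          min_global_std=10.0,        # blur bound
                          max_highlight_ratio=0.05,   # light bound
                          highlight_level=250):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Green screen: the overall pixel mean must stay inside the preset range.
    if not (mean_range[0] <= gray.mean() <= mean_range[1]):
        return False
    # Stripes: the per-column (longitudinal) standard deviation collapses
    # when the frame degenerates into uniform vertical bands.
    if gray.std(axis=0).mean() < min_col_std:
        return False
    # Blur: a blurred frame has a low overall pixel standard deviation.
    if gray.std() < min_global_std:
        return False
    # Light: the proportion of highlighted pixels must stay below the preset ratio.
    if (gray >= highlight_level).mean() > max_highlight_ratio:
        return False
    return True
```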
In this embodiment, if the frame picture passes the above inspections, the subsequent steps continue; otherwise, the process proceeds directly to step 205 for result fusion and output.
Step 203, checking whether the frame picture passes the detection of the work vehicle.
In this embodiment, vehicles may appear in the acquired pictures and affect the estimation, for example: during unloading and loading operations the warehouse material may change, and these operations require vehicles. It is therefore necessary to detect whether a vehicle is present in the captured frame picture. Preferably, an object detection model can be used to detect the presence of a truck or forklift; for example, a deep-learning single-stage detection model such as YOLO, with a model pre-trained on the COCO dataset, can quickly and accurately detect whether an image contains vehicles and how many.
If a vehicle is detected, no further detection is performed on the current frame picture and the process goes directly to step 205 for result fusion and output, while subsequent frame pictures are monitored with particular attention to work-vehicle detection.
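A hedged sketch of this vehicle check: the ultralytics package and the yolov8n weights are one concrete choice rather than anything mandated here, and since COCO has no forklift class (2 = car, 5 = bus, 7 = truck), a real deployment would likely fine-tune on site data:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # COCO-pretrained single-stage detector

def count_vehicles(frame_bgr, vehicle_classes=(2, 5, 7)) -> int:
    """Return the number of detected vehicles in one frame picture."""
    result = model(frame_bgr, verbose=False)[0]
    return sum(int(c) in vehicle_classes for c in result.boxes.cls)
```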
Step 204, identifying the warehouse material in the picture and estimating the volume of the warehouse material.
Fig. 4 is a schematic flow chart illustrating the estimation of the warehouse material volume in step 204 of the present embodiment, and as shown in fig. 4, the estimation of the warehouse material volume in the present embodiment includes the following steps:
step 401, a complete bottom edge screenshot of a warehouse is obtained.
In this embodiment, the camera and the warehouse are in relatively fixed positions, so the four vertex positions of the warehouse bottom edge can be annotated on the first group of collected pictures and the four-point information of the complete bottom edge stored in the configuration. Each frame picture is then cropped according to the four annotated bottom-edge points, and the cropped picture is a quadrilateral containing only the warehouse bottom edge.
Step 402, modifying the perspective direction of the screenshot.
In this embodiment, since the cropped screenshot of the warehouse bottom edge is not a standard rectangle and contains perspective, its perspective direction must be corrected before the subsequent steps. Preferably, the cropped bottom-edge image portion is subjected to affine transformation, and the transformation result is a standard rectangle without a perspective view angle. From the pixel information of the transformed screenshot and the actual size of the warehouse bottom edge, the correspondence between the transformed pixels and the actual size can be obtained.
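A minimal sketch of steps 401-402 follows; note that with four labelled vertices OpenCV expresses this quadrilateral-to-rectangle mapping as a perspective transform, and the corner ordering and output size here are illustrative assumptions:

```python
import cv2
import numpy as np

def rectify_floor(frame_bgr, corners, out_w=800, out_h=400):
    """Warp the annotated bottom-edge quadrilateral into a rectangle.

    corners: four configured vertices, ordered top-left, top-right,
    bottom-right, bottom-left.
    """
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    m = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(frame_bgr, m, (out_w, out_h))
```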
Step 403, converting the converted screenshot into a one-dimensional signal.
In this embodiment, after the transformed screenshot is obtained, the image may preferably be Gaussian-blurred to eliminate noise and reduce detail. It is then converted to a gray-scale image so that the segmentation position between the material and the warehouse bottom edge can be calculated; the pixel mean in the width direction of the gray-scale image is computed, and preferably a sliding mean with a 5-pixel window is applied in the depth direction to reduce noise. The gray-scale image is thereby converted into a one-dimensional signal for processing.
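A sketch of this conversion, assuming the rectified image from step 402 with the depth direction along the vertical axis; the kernel sizes are illustrative:

```python
import cv2
import numpy as np

def to_1d_signal(rect_bgr, smooth_window=5):
    """Collapse the rectified floor image into a 1-D depth profile."""
    blurred = cv2.GaussianBlur(rect_bgr, (5, 5), 0)    # suppress noise/detail
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
    profile = gray.mean(axis=1)                        # mean across the width
    kernel = np.ones(smooth_window) / smooth_window    # 5-pixel sliding mean
    return np.convolve(profile, kernel, mode="same")
```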
Step 404, determining the peak position of the one-dimensional signal.
Fig. 5 shows the one-dimensional signal converted from the gray-scale image in step 403 of this embodiment; the abscissa of the signal is the pixel position in the depth direction of the cropped picture, and the ordinate is the sliding difference of the pixels. In this embodiment, the sliding difference of the pixels corresponding to each meter of the one-dimensional signal is calculated according to the correspondence between pixel information and the actual size of the warehouse bottom edge. The possible positions of segmentation points in the one-dimensional signal are determined from the resulting differences; a segmentation point represents the boundary between the stacked material and the warehouse bottom edge, appears as a wave in the one-dimensional signal, and its sliding difference corresponds to the peak position, i.e., a local maximum of the signal.
Step 405, verifying the peak position.
In this embodiment, because of lighting and irregularities in how warehouse material is stacked, the one-dimensional signal may show several different peaks. To ensure that the peak position determined above is appropriate, preferably 10 pictures are selected for each material and manually annotated with their boundary lines; the sliding difference at the annotated positions is calculated, and the mean sliding difference over the 10 pictures is used as the threshold.
In this embodiment, according to this threshold, the smallest position x1 at which the sliding difference of each new picture's one-dimensional signal exceeds the threshold is found; the peak position x2 can be determined within a certain range after x1, and the fluctuation start position x0 within a certain range before x1.
If a value larger than the value at x1 exists before the fluctuation start position x0, it is determined that fluctuation still exists at some position before x0, and the search recurses forward for a new peak until the foremost peak is determined; the abscissa pixel position corresponding to that peak is taken as the depth position of the material in the cropped warehouse bottom-edge image.
In this embodiment, after the foremost peak of the one-dimensional signal is determined, its wavelength is judged: if the wavelength is within a reasonable range, the picture is judged to be good and the corresponding peak position information is output; if the wavelength is too large, the stacking state of the material in the picture is poor, and a signal that the picture does not meet the depth-detection requirement is output.
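A hedged sketch of this peak search; step_px (the number of pixels corresponding to 1 meter) doubles as the window size here, and the threshold is the annotated mean described above — all assumptions of this sketch rather than values fixed by the text:

```python
import numpy as np

def find_foremost_peak(signal, step_px, threshold):
    """Return the pixel index of the foremost boundary peak, or None."""
    diff = np.abs(signal[step_px:] - signal[:-step_px])   # sliding difference
    above = np.flatnonzero(diff > threshold)
    if above.size == 0:
        return None                      # no boundary wave in this picture
    x1 = int(above[0])                   # smallest position over the threshold
    x0 = max(0, x1 - step_px)            # fluctuation start, before x1
    hi = min(diff.size, x1 + step_px)
    x2 = x1 + int(np.argmax(diff[x1:hi]))   # peak position, after x1
    # If the signal still exceeds the threshold before x0, an earlier wave
    # exists: recurse forward until the foremost peak is found.
    if x0 > 0 and diff[:x0].max() > threshold:
        earlier = find_foremost_peak(signal[:x0 + step_px], step_px, threshold)
        if earlier is not None:
            return earlier
    return x2
```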
At step 406, the warehouse material stacking volume is determined.
In this embodiment, since the boundary between the stacked warehouse material and the warehouse bottom edge appears as a wave in the one-dimensional signal, the peak position obtained in step 405 represents the stacking depth of the warehouse material in the picture; and since the picture is a mapping of the complete warehouse bottom edge, the peak position can be converted directly into the actual stacking depth in the warehouse before the perspective correction.
Therefore, after the output peak position information is obtained, the pixel position corresponding to the peak is converted, according to the correspondence between the pixel information of the cropped bottom-edge image and the actual size of the warehouse bottom edge, into the actual position of the stacking depth, yielding the estimated stacking depth of the warehouse material. Preferably, the warehouse width is fixed, and in practice the stacking height of warehouse material is generally fixed and can be set as an empirical value; therefore, once the stacking depth is determined, the estimated stacking volume is calculated from the fixed width and the empirical height.
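Continuing the sketches above, the conversion from peak pixel to volume reuses stack_volume() from the earlier sketch; the picture size and warehouse depth are illustrative site parameters:

```python
def peak_to_depth_m(peak_px: int, picture_depth_px: int,
                    warehouse_depth_m: float) -> float:
    """Map the peak's pixel position to meters via the bottom-edge scale."""
    return peak_px * (warehouse_depth_m / picture_depth_px)

# e.g. volume = stack_volume(peak_to_depth_m(310, 400, 16.0),
#                            width_m=20.0, height_m=3.0)
```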
And step 205, performing fusion learning on the estimation result and outputting a fusion learning result.
In this embodiment, because the warehouse is updated infrequently, a fusion learning method can be used to fuse multi-frame results, which improves the accuracy and stability of the result and avoids the errors easily caused by unstandardized material stacking right after a truck has unloaded.
In this embodiment, two empty queues and a vehicle counter are provided, and the queue length covers at least the estimation result of every frame in a video stream. The first queue serves as the actual queue and stores the actual estimation result of each frame together with its quality, light, and vehicle inspection information. Results corresponding to frame pictures that failed the earlier quality and light inspections are discarded and replaced by the result of the preceding frame; the queue after replacement serves as the fusion queue and occupies the second empty queue.
In this embodiment, when the vehicle-inspection information of the last result in the actual queue is 0, i.e., the corresponding frame picture contains no vehicle, the result of fusion learning is the last result in the fusion queue (which corresponds position-by-position to the actual queue), output as the fusion learning result. During the fusion step, when the last result in the actual queue detects a vehicle operating state, i.e., vehicle information is detected in the corresponding frame picture and the vehicle count is 1, the fusion queue is screened from back to front until a detection result without vehicle information is found; to ensure the accuracy and stability of the output estimated volume, all results except the screened one are cleared from the fusion queue, and the vehicle counter is set to 1.
In this embodiment, during the fusion step, when the last result of the actual queue is in a normal state, i.e., no vehicle information is detected in its corresponding frame picture, the current fusion queue is examined; if the fusion queue has just been cleared so that it contains only the last fusion-learning result, and the vehicle counter is 1, then a vehicle unloading or loading event occurred during the previous fusion learning, i.e., a vehicle operation took place. The result in the fusion queue is compared with the final result in the actual queue. If the final result in the actual queue is larger than the result in the fusion queue, an unloading operation occurred, and the difference is evaluated: if it exceeds the volume of material unloaded by one vehicle, the unloaded material has not been standardized by the forklift, and the fusion-queue result plus a reasonable one-vehicle unloading volume is output as the fused volume; if the difference does not exceed the volume unloaded by one vehicle, the final result in the actual queue is output as the fusion learning result.
If the final result of the actual queue is smaller than the result in the fusion queue, a loading operation occurred, and the difference is evaluated: if it exceeds the volume of material loaded by one forklift, the material was not standardized after loading, and the fusion-queue result minus a reasonable one-forklift loading volume is output as the fusion learning result; if the difference does not exceed the volume loaded by one forklift, the material was standardized after loading, and the final result of the actual queue is output as the fusion learning result.
Preferably, since the unloading amount of a vehicle is an empirically determined value, if vehicle operation was detected in a previous frame, the estimated material volume of the picture before the vehicle operation is compared with the estimate after the operation; if the difference is too large, for example exceeding the unloading amount of more than 5 trucks, the forklift has not yet standardized the stack after unloading, so the estimate from the picture after the vehicle operation is not adopted, and one truckload of material volume is added to the estimate from the picture before the operation instead.
In this embodiment, when several vehicles unload or load at once, i.e., the last result of the actual queue detects a vehicle operating state and its frame picture is recognized in vehicle inspection as containing several vehicles operating simultaneously, the fusion learning is stopped and the estimation system continues to process the subsequent video stream; after the forklift is recognized to have completed the leveling operation, the median of the estimation results of the 10 frames following that frame picture is calculated, and the result is output as the final estimation result of the video stream.
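The sketch below condenses this two-queue logic into one function; the field names and per-vehicle volumes are illustrative placeholders for the empirical values mentioned above, and the multi-vehicle branch merely signals that fusion is suspended:

```python
from dataclasses import dataclass

FORKLIFT_SCOOP_M3 = 2.0   # assumed volume moved per forklift scoop
TRUCK_LOAD_M3 = 30.0      # assumed volume delivered per truck

@dataclass
class Result:
    quality_ok: bool   # passed the green screen/stripe/blur/light inspections
    vehicles: int      # vehicle count from the detection model
    volume: float      # per-frame estimated volume

def fuse(actual):
    """Fuse the per-frame results of one video stream; None = suspended."""
    # Fusion queue: a frame that failed quality checks inherits the
    # previous frame's result.
    fusion = []
    for r in actual:
        fusion.append(r if r.quality_ok else (fusion[-1] if fusion else r))
    last = actual[-1]
    if last.vehicles == 0:
        return fusion[-1].volume         # normal state: take the last result
    if last.vehicles == 1:
        # Screen back to the newest vehicle-free result, then bound the change
        # by one forklift scoop (outbound) or one truck load (inbound).
        base = next((r for r in reversed(fusion) if r.vehicles == 0), fusion[-1])
        diff = abs(base.volume - last.volume)
        if base.volume > last.volume:    # material removed by a forklift
            if diff <= FORKLIFT_SCOOP_M3:
                return last.volume
            return base.volume - FORKLIFT_SCOOP_M3
        if diff <= TRUCK_LOAD_M3:        # material added by a truck
            return last.volume
        return base.volume + TRUCK_LOAD_M3
    # Several vehicles at once: fusion is suspended; the caller waits for
    # vehicle-free frames and takes the median volume of the next 10 of them.
    return None
```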
Fig. 6 is a schematic structural diagram of the device for the warehouse material volume estimation method according to this embodiment. As shown in Fig. 6, the device includes:
and the camera unit 601 is used for shooting the warehouse video stream. The image acquisition unit 602 is configured to decompose the warehouse video stream to obtain a first group of frame pictures. The information processing unit 603 is configured to perform depth detection on the first group of pictures, and obtain, according to a detection result, a warehouse material volume corresponding to each frame of picture in the first group of pictures; the signal processing unit is further used for conducting fusion learning on the warehouse material volume corresponding to each frame of picture. The signal receiving unit 604 is configured to receive the picture information transmitted to the information processing unit by the image acquisition unit; wherein the signal receiving unit is also used for outputting the volume of the materials after fusion learning. An image checking unit 605, configured to perform a first check on the first group of pictures obtained by the image acquisition unit; wherein, the first test includes: performing light ray inspection on the first group of pictures; performing quality inspection on the first group of pictures; and performing obstacle inspection on the first group of pictures. The storage unit 606 is used for storing warehouse video streams shot by the camera shooting unit; the storage unit is further used for storing and outputting the volume of the materials after fusion learning.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combining a blockchain.
Another aspect of embodiments of the present invention provides a computer-readable storage medium comprising a set of computer-executable instructions, which when executed, perform any one of the above-described estimation methods.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, various embodiments or examples and features of various embodiments or examples described in this specification can be combined and combined by one skilled in the art without being mutually inconsistent.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (13)

1. A method of estimating the volume of material in a warehouse, the method comprising:
acquiring a first video stream of warehouse material, and performing frame decomposition on the first video stream to obtain a first group of frame pictures;
identifying the warehouse material in each frame picture of the first group of frame pictures, and performing depth detection on the warehouse material in each frame picture to determine depth information of the warehouse material;
acquiring height information of the warehouse material, and calculating the volume of the warehouse material based on the depth information and the height information of the warehouse material;
and performing fusion learning on the warehouse material volumes determined from periodically acquired video streams, and taking the fused volume as the volume of the warehouse material.
2. The method of claim 1, wherein before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing green screen inspection on each frame picture in the first group of frame pictures;
the green screen inspection comprising: checking the pixel mean of each frame picture in the first group of frame pictures, and retaining the frame pictures whose pixel mean meets the preset frame-picture mean standard.
3. The method of claim 1, wherein before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing stripe inspection on each frame picture in the first group of frame pictures;
the stripe inspection comprising: checking the standard deviation of the longitudinal pixels of each frame picture in the first group of frame pictures, and retaining the frame pictures whose longitudinal pixel standard deviation meets the preset standard.
4. The method of claim 1, wherein before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing blur inspection on each frame picture in the first group of frame pictures;
the blur inspection comprising: checking the overall pixel standard deviation of each frame picture in the first group of frame pictures, and retaining the frame pictures whose overall pixel standard deviation meets the preset overall standard deviation.
5. The method of claim 1, wherein before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing light inspection on each frame picture in the first group of frame pictures;
the light inspection comprising: counting the highlighted pixels in each frame picture in the first group of frame pictures, and retaining the frame pictures in which the proportion of highlighted pixels to total pixels is below the preset proportion.
6. The method of claim 1, wherein before identifying the warehouse material in each frame picture of the first group of frame pictures, the method further comprises:
performing vehicle inspection on each frame picture in the first group of frame pictures;
the vehicle inspection comprising: checking the first group of frame pictures with a target detection model, and identifying which pictures in the first group contain vehicles and the number of vehicles in each.
7. The method of claim 1, wherein identifying the warehouse material in each frame picture of the first group of frame pictures, performing depth detection on the warehouse material in each frame picture, and determining the depth information of the warehouse material comprises:
marking, in each frame picture of the first group of frame pictures, the positions of the four vertexes of the bottom edge of the warehouse;
cropping the warehouse bottom-edge portion from each frame picture according to the marked positions of the four vertexes, to obtain a second group of pictures;
performing affine transformation on the second group of pictures and converting them into rectangles, to obtain a third group of pictures;
convolving the pixels of each picture in the third group of pictures with a normal-distribution kernel, and converting all the convolved pictures into corresponding gray-scale images;
converting the third group of pictures into corresponding one-dimensional signals according to the pixel information of the gray-scale images;
and, according to the correspondence between the pixel points in the third group of pictures and the size of the warehouse bottom edge, taking the peak positions determined in the one-dimensional signals as the depth information of the warehouse material.
8. The method according to claim 1, wherein performing fusion learning on the volume of the warehouse material determined from the periodically acquired video stream, and taking the fusion-learned volume as the volume of the warehouse material, comprises:
arranging the detection result of each frame of the first group of frame pictures into a first queue, the detection result including: whether the current frame picture passes the green-screen check, the brace check, the blur check and the light check; the vehicle information in the current frame picture; and the specific value of the warehouse material volume in the current frame picture;
replacing, in the first queue, each detection result that fails the green-screen, brace, blur or light check with the detection result of the previous frame, to generate a second queue;
when no vehicle is identified in the vehicle check for the frame image corresponding to the last detection result in the first queue, taking the warehouse material volume in the last detection result of the second queue as the warehouse material volume obtained by the fusion learning;
when exactly one vehicle is identified in the vehicle check for the frame image corresponding to the last detection result in the first queue, screening the second queue from back to front until a detection result without vehicle information is found, then comparing the specific value of the warehouse material volume in the screened detection result with the specific value in the last detection result of the first queue and calculating the difference between the two;
when more than one vehicle is identified in the vehicle check for the frame image corresponding to the last detection result in the first queue, suspending the fusion learning and continuing to screen the detection results of the frame images decomposed from subsequent video streams until a detection result without vehicle information is found;
carrying out a median calculation on the specific values of the warehouse material volume in the detection results of the 10 frames following the screened detection result;
and taking the median as the warehouse material volume obtained by the fusion learning.
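The queue logic of this claim can be sketched as below. The Detection fields, the fuse() signature and the handling of streams shorter than 10 frames are assumptions; reconcile() is the claim-9 comparison, sketched after the next claim.

```python
import statistics
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Detection:
    """One entry of the first queue; field names are assumptions."""
    passed_checks: bool   # green-screen / brace / blur / light checks all passed
    vehicle_count: int    # number of vehicles identified in the frame
    volume: float         # specific value of the warehouse material volume

def fuse(first_queue: List[Detection],
         later_detections: Iterable[Detection],
         shovel_volume: float,
         truck_volume: float) -> float:
    # Build the second queue: a failed frame inherits the previous frame's
    # result (the very first frame has no predecessor and is kept as-is).
    second_queue: List[Detection] = []
    for det in first_queue:
        if det.passed_checks or not second_queue:
            second_queue.append(det)
        else:
            second_queue.append(second_queue[-1])

    last = first_queue[-1]
    if last.vehicle_count == 0:
        # No vehicle in the last frame: take the last filtered volume.
        return second_queue[-1].volume
    if last.vehicle_count == 1:
        # One vehicle: screen the second queue from back to front for the
        # most recent vehicle-free result, then reconcile per claim 9.
        ref = next(d for d in reversed(second_queue) if d.vehicle_count == 0)
        return reconcile(ref.volume, last.volume, shovel_volume, truck_volume)
    # More than one vehicle: suspend fusion, wait for a vehicle-free frame
    # in the later stream, then take the median over the next 10 volumes.
    it = iter(later_detections)
    for det in it:
        if det.vehicle_count == 0:
            window = [d.volume for d, _ in zip(it, range(10))]
            return statistics.median(window)
    raise RuntimeError("no vehicle-free frame in the subsequent stream")
```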
9. The method according to claim 8, wherein, when comparing the specific value of the warehouse material volume in the detection result screened from the second queue with the specific value in the last detection result of the first queue, the method further comprises:
if the warehouse material volume in the detection result screened from the second queue is larger than the warehouse material volume in the last detection result of the first queue, comparing the calculated difference with the material volume shoveled by one forklift;
if the difference does not exceed the material volume shoveled by one forklift, taking the warehouse material volume in the last detection result of the first queue as the warehouse material volume obtained by the current fusion learning; if the difference exceeds the material volume shoveled by one forklift, subtracting the material volume shoveled by one forklift from the specific value of the warehouse material volume in the detection result screened from the second queue, and taking the result as the warehouse material volume obtained by the current fusion learning;
if the warehouse material volume in the detection result screened from the second queue is smaller than the warehouse material volume in the last detection result of the first queue, comparing the calculated difference with the material volume added by one truck;
and if the difference does not exceed the material volume added by one truck, taking the warehouse material volume in the last detection result of the first queue as the warehouse material volume obtained by the current fusion learning; if the difference exceeds the material volume added by one truck, adding the material volume added by one truck to the specific value of the warehouse material volume in the detection result screened from the second queue, and taking the result as the warehouse material volume obtained by the current fusion learning.
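A sketch of this reconciliation, matching the fuse() sketch above; shovel_volume and truck_volume are assumed calibration constants for one forklift load and one truck load, in the same unit as the volumes.

```python
def reconcile(ref_volume: float, last_volume: float,
              shovel_volume: float, truck_volume: float) -> float:
    """Claim-9 comparison, sketched under assumed names.

    ref_volume  -- last vehicle-free volume screened from the second queue
    last_volume -- volume in the last detection result of the first queue
    """
    diff = abs(ref_volume - last_volume)
    if ref_volume > last_volume:
        # Material was removed: cap the drop at one forklift load.
        if diff <= shovel_volume:
            return last_volume
        return ref_volume - shovel_volume
    # Material was added: cap the rise at one truck load.
    if diff <= truck_volume:
        return last_volume
    return ref_volume + truck_volume
```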
10. An estimation device for the volume of material in a warehouse, the device comprising:
an image pickup unit, used for shooting the warehouse video stream;
an image acquisition unit, used for decomposing the warehouse video stream to obtain a first group of frame pictures;
an information processing unit, used for performing depth detection on the first group of pictures and obtaining the warehouse material volume corresponding to each frame picture in the first group of pictures according to the detection results;
the information processing unit being also used for performing fusion learning on the warehouse material volume corresponding to each frame picture;
a signal receiving unit, used for receiving the image information transmitted by the image acquisition unit;
the signal receiving unit being also used for outputting the volume of the material after the fusion learning.
11. The apparatus of claim 10, further comprising:
an image inspection unit, used for performing a first verification on the first group of pictures;
wherein the first verification comprises:
performing a light check on the first group of pictures;
performing a quality check on the first group of pictures;
performing an obstacle check on the first group of pictures;
a storage unit, used for storing the warehouse video stream shot by the image pickup unit;
the storage unit being also used for storing and outputting the volume of the material after the fusion learning.
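As a rough structural sketch, the units of claims 10 and 11 could be wired together as below. All names are assumptions, and the signal receiving unit's relaying and output role is folded into run_cycle() for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class VolumeEstimationDevice:
    """Sketch of the claim-10/11 unit layout; all names are assumptions."""
    image_pickup: Callable[[], Any]                 # shoots the warehouse video stream
    image_acquisition: Callable[[Any], List[Any]]   # decomposes the stream into frames
    first_verification: Callable[[Any], bool]       # light / quality / obstacle checks
    information_processing: Callable[[List[Any]], float]  # depth -> fused volume
    storage: dict = field(default_factory=dict)     # storage unit

    def run_cycle(self) -> float:
        stream = self.image_pickup()
        self.storage["stream"] = stream              # keep the raw video stream
        frames = self.image_acquisition(stream)      # first group of frame pictures
        frames = [f for f in frames if self.first_verification(f)]
        volume = self.information_processing(frames) # fusion-learned volume
        self.storage["volume"] = volume              # store and output the result
        return volume
```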
12. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
13. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method according to any one of claims 1-9.
CN202211301346.6A 2022-10-24 2022-10-24 Estimation method for warehouse material volume Pending CN115661230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211301346.6A CN115661230A (en) 2022-10-24 2022-10-24 Estimation method for warehouse material volume

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211301346.6A CN115661230A (en) 2022-10-24 2022-10-24 Estimation method for warehouse material volume

Publications (1)

Publication Number Publication Date
CN115661230A (en) 2023-01-31

Family

ID=84991599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211301346.6A Pending CN115661230A (en) 2022-10-24 2022-10-24 Estimation method for warehouse material volume

Country Status (1)

Country Link
CN (1) CN115661230A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117268474A (en) * 2023-11-21 2023-12-22 江西中汇云链供应链管理有限公司 Device and method for estimating volume, number and weight of objects in scene

Similar Documents

Publication Publication Date Title
EP3842736A1 (en) Volume measurement method, system and device, and computer-readable storage medium
US9135510B2 (en) Method of processing sensor data for navigating a vehicle
US8520953B2 (en) Apparatus and method for extracting edges of image
CN115661230A (en) Estimation method for warehouse material volume
KR20130072073A (en) Apparatus and method for extracting edge in image
CN112863187B (en) Detection method of perception model, electronic equipment, road side equipment and cloud control platform
CN111492404A (en) Calculating wrap wall density in commercial trailer loading
CN111814739B (en) Method, device, equipment and storage medium for detecting express package volume
CN112446225A (en) Determination of module size for optical codes
CN114693946A (en) Image anomaly detection method and device, computer equipment and storage medium
CN112509126A (en) Method, device, equipment and storage medium for detecting three-dimensional object
CN115035181A (en) Method and device for monitoring volume utilization rate and computer readable storage medium
CN113344906B (en) Camera evaluation method and device in vehicle-road cooperation, road side equipment and cloud control platform
CN115471476A (en) Method, device, equipment and medium for detecting component defects
CN112162294B (en) Robot structure detection method based on laser sensor
CN112150540B (en) Alignment method, device, terminal, storage medium and processor for under-field-bridge set card
CN113888509A (en) Method, device and equipment for evaluating image definition and storage medium
US9495609B2 (en) System and method for evaluating data
KR20140121068A (en) Method and apparatus of inspecting mura of flat display
CN116718351A (en) Point inspection method and device for imaging equipment, electronic equipment and storage medium
CN109661683B (en) Structured light projection method, depth detection method and structured light projection device based on image content
CN107862679B (en) Method and device for determining image detection area
US8306311B2 (en) Method and system for automated ball-grid array void quantification
CN116228712A (en) Multi-scale slope disaster monitoring method, system and device
CN114494680A (en) Accumulated water detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination