CN116946610B - Method and device for picking up goods in intelligent warehousing system - Google Patents

Info

  • Publication number: CN116946610B
  • Application number: CN202311219343.2A
  • Authority: CN (China)
  • Prior art keywords: image, pickup, target object, target, information
  • Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
  • Other languages: Chinese (zh)
  • Other versions: CN116946610A
  • Inventor: 张立峰
  • Current Assignee: Zhongke Source Code Chengdu Service Robot Research Institute Co ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
  • Original Assignee: Zhongke Source Code Chengdu Service Robot Research Institute Co ltd
  • Application filed by Zhongke Source Code Chengdu Service Robot Research Institute Co ltd
  • Priority to CN202311219343.2A
  • Publication of CN116946610A
  • Application granted
  • Publication of CN116946610B
  • Anticipated expiration

Classifications

    • B65G1/1373 Storage devices, mechanical, with arrangements or automatic control means for selecting which articles are to be removed, for fulfilling orders in warehouses
    • B65G1/04 Storage devices, mechanical
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks, characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06Q10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders
    • G06V10/26 Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V10/30 Noise filtering
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V10/762 Image or video recognition or understanding using pattern recognition or machine learning, using clustering, e.g. of similar faces in social networks
    • G06V10/806 Fusion of extracted features, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
    • G06V20/10 Terrestrial scenes

Abstract

The application relates to the technical field of warehouse management, and in particular to a method and a device for picking up goods in an intelligent warehousing system. The application obtains text information describing the object to be picked up, determines the approximate placement position of the object from that text, takes this position as the target position and drives the pickup component toward it along an optimal path, collects an image at the target position upon arrival, obtains by image analysis the image region most similar to the target object as the object to be picked up, obtains size information of the object through point cloud processing, and drives the pickup component to pick up the object using that size information. The method and device provided by the embodiments of the application can identify target information in the warehousing system and the accurate position of the target in the warehouse; the target can be accurately identified through image processing, and accurately picked up through precise measurement of its size information.

Description

Method and device for picking up goods in intelligent warehousing system
Technical Field
The application relates to the technical field of warehouse management, in particular to a method and a device for picking up goods in an intelligent warehouse system.
Background
The traditional warehouse transportation system adopts a "person-to-goods" picking mode, which suffers from high labor cost, low picking speed and low transportation efficiency. Modern logistics is characterized by large order volumes, wide storage areas and complex picking paths, requirements that the traditional storage and transportation system cannot meet. Therefore, building a comprehensively unmanned intelligent warehousing system, reducing labor cost and error rate, and improving the overall operating efficiency of the system has become a primary goal of modern logistics development.
This problem is especially pronounced in large intelligent warehousing systems: because the floor area is large and many goods are stored, inaccurate identification of the target goods makes such systems difficult to apply in practice.
Disclosure of Invention
To solve the above problems, the present application provides a method and a device for picking up goods in an intelligent warehousing system, which address the problem of inaccurate goods pickup in the prior art.
To achieve this purpose, the embodiments of the present application adopt the following technical scheme:
In a first aspect, there is provided a method for picking up goods in an intelligent warehousing system, the intelligent warehousing system including a user terminal, a server, and a warehouse in which a track and a pickup component moving along the track are disposed, the method being applied to the server, which drives the pickup component to pick up the goods in the warehouse. The method comprises: receiving a pickup command initiated by the user terminal, parsing the command to obtain keywords in the command, searching a preset database for labels corresponding to the keywords, determining a first target position based on the labels, the first target position being established in a world coordinate system, and obtaining from the pickup command information on the target object to be picked up and a second target position at which the target object is to be placed; performing path planning based on the current position of the pickup component and the first target position to obtain a movement strategy for the pickup component, and driving the pickup component to move to the first target position based on the movement strategy; acquiring an image at the first target position, identifying the acquired image to obtain the target object, obtaining size information of the target object, and driving the pickup component to pick up the target object based on the size information; and performing path planning based on the current position of the pickup component and the second target position to obtain a movement strategy for the pickup component, and driving the pickup component to move to the second target position based on the movement strategy.
Further, receiving the pickup command initiated by the user terminal and parsing the command to obtain keywords in the command includes: obtaining, based on an input speech signal, truncation information for truncating a feature sequence of the speech signal; truncating the feature sequence into a plurality of subsequences based on the truncation information, and dividing the subsequences to obtain independent word units; and inputting the word units into a trained recognition model to obtain recognized texts, and combining the texts into corresponding keywords.
Further, obtaining the truncation information for truncating the feature sequence of the speech signal includes: obtaining spike information related to the speech signal by performing connectionist temporal classification (CTC) processing on the feature sequence; and determining the truncation information based on the obtained spike information.
Further, truncating the feature sequence into a plurality of subsequences includes: for each spike in the spike information, selecting the subsequence in the feature sequence corresponding to a predetermined number of spikes adjacent to that spike, the predetermined number of spikes including a first number of spikes before the spike and a second number of spikes after it.
Further, the recognition model comprises a BERT layer, a BiGRU layer, a self-attention layer and a CRF layer connected in sequence; the BiGRU layer comprises a forward BiGRU layer and a backward BiGRU layer, the output of the BERT layer is input to the forward BiGRU layer and the backward BiGRU layer respectively to obtain two corresponding outputs, and both outputs are input to the self-attention layer.
Further, identifying the collected image to obtain the target object includes: binarizing the image; extracting contours from the binarized image to obtain contour images; comparing the contour images with a preset template image to obtain initial identification images; processing the initial identification images through the recognition model to obtain the target image with the highest similarity to the keywords; and restoring the target image to obtain the target object.
Further, comparing the contour images with the preset template image to obtain initial identification images includes: comparing each contour image with the preset template image to obtain its similarity to the template image, and taking the contour images that meet a similarity threshold as initial identification images.
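This contour-versus-template comparison can be sketched in plain Python. The patent does not specify the similarity measure, so intersection-over-union of equally sized binary masks stands in here as an illustrative choice; the masks, threshold, and names below are invented for the example:

```python
def mask_similarity(mask_a, mask_b):
    """Similarity of two equally sized binary masks as intersection over union."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a and b
            union += a or b
    return inter / union if union else 1.0

def initial_identification(contours, template, threshold=0.6):
    """Keep contour images whose similarity to the template meets the threshold."""
    return [c for c in contours if mask_similarity(c, template) >= threshold]

template = [[0, 1, 0],
            [1, 1, 1],
            [0, 1, 0]]
candidates = [
    [[0, 1, 0], [1, 1, 1], [0, 0, 0]],   # close to the template (IoU = 0.8)
    [[1, 0, 1], [0, 0, 0], [1, 0, 1]],   # very different (IoU = 0.0)
]
kept = initial_identification(candidates, template)
```

The surviving images in `kept` would then go to the recognition model for the keyword-similarity step.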
Further, obtaining the size information of the target object includes: obtaining point cloud data of the target object; converting the format of the point cloud data; filtering the format-converted point cloud data to remove noise; clustering the denoised point cloud data to obtain the complete point cloud of the target object; constructing a minimal point cloud bounding box; and obtaining the three-dimensional size of the target object from the minimal bounding box.
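The point cloud pipeline in this step can be sketched in plain Python; a radius-based neighbor count stands in for the noise filter, a DBSCAN-style flood fill for the clustering, and an axis-aligned box for the minimal bounding box, all illustrative simplifications of the techniques named above:

```python
def neighbors(points, i, eps):
    """Indices of points within distance eps of point i."""
    xi, yi, zi = points[i]
    return [j for j, (x, y, z) in enumerate(points)
            if j != i and (x - xi) ** 2 + (y - yi) ** 2 + (z - zi) ** 2 <= eps ** 2]

def denoise(points, eps=1.5, min_neighbors=2):
    """Drop sparse points (a stand-in for statistical outlier filtering)."""
    return [p for i, p in enumerate(points)
            if len(neighbors(points, i, eps)) >= min_neighbors]

def largest_cluster(points, eps=1.5):
    """DBSCAN-style flood fill; return the largest connected group of points."""
    unvisited, clusters = set(range(len(points))), []
    while unvisited:
        stack, cluster = [unvisited.pop()], []
        while stack:
            i = stack.pop()
            cluster.append(i)
            for j in neighbors(points, i, eps):
                if j in unvisited:
                    unvisited.remove(j)
                    stack.append(j)
        clusters.append(cluster)
    return [points[i] for i in max(clusters, key=len)]

def bounding_box_size(points):
    """Axis-aligned size (dx, dy, dz) of the minimal enclosing box."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

cloud = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1), (9, 9, 9)]  # last point is noise
target = largest_cluster(denoise(cloud))
size = bounding_box_size(target)
```

Following the next claim, the pickup component would then open a pickup space slightly larger than `size` before closing on the object.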
Further, driving the pickup component to pick up the target object based on the size information includes: driving the pickup component to form, based on the three-dimensional size of the target object, a pickup space larger than that size, and picking up the target object through the pickup space.
In a second aspect, there is provided a goods pickup device for an intelligent warehousing system, the device comprising: a position determining module for receiving a pickup command initiated by the user terminal, parsing the command to obtain keywords in the command, searching a preset database for labels corresponding to the keywords, and determining a first target position based on the labels, the first target position being established in a world coordinate system, the pickup command containing information on the target object to be picked up and a second target position at which the target object is to be placed; a first movement control module for planning a path based on the current position of the pickup component and the first target position to obtain a movement strategy for the pickup component, and driving the pickup component to move to the first target position based on the movement strategy; a pickup control module for acquiring an image at the first target position, identifying the acquired image to obtain the target object, obtaining size information of the target object, and driving the pickup component to pick up the target object based on the size information; and a second movement control module for planning a path based on the current position of the pickup component and the second target position to obtain a movement strategy for the pickup component, and driving the pickup component to move to the second target position based on the movement strategy.
In a third aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the above.
According to the method and device for picking up goods in an intelligent warehousing system, text information of the object to be picked up is obtained, the approximate placement position of the object is determined based on the text information, the pickup component is driven along an optimal path with this position as the target position, an image at the target position is collected upon arrival, an image region similar to the target object is obtained through image analysis as the object to be picked up, size information of the object is obtained through point cloud processing, and the pickup component is driven by the size information to pick up the object. The method and device provided by the embodiments of the application can identify target information in the warehousing system and the accurate position of the target in the warehouse; the target can be accurately identified through image processing, and accurately picked up through precise measurement of its size information.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
The methods, systems, and/or programs in the accompanying drawings will be further described in terms of exemplary embodiments. These exemplary embodiments will be described in detail with reference to the drawings. They are non-limiting exemplary embodiments, in which like reference numbers represent like mechanisms throughout the various views of the drawings.
Fig. 1 is a schematic flow chart of a cargo picking method of an intelligent warehousing system according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a cargo pickup device of an intelligent warehousing system according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an intelligent warehouse system cargo picking apparatus according to an embodiment of the present application.
Detailed Description
In order to better understand the above technical solutions, the following detailed description of the technical solutions of the present application is made by using the accompanying drawings and specific embodiments, and it should be understood that the specific features of the embodiments and the embodiments of the present application are detailed descriptions of the technical solutions of the present application, and not limiting the technical solutions of the present application, and the technical features of the embodiments and the embodiments of the present application may be combined with each other without conflict.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. It will be apparent, however, to one skilled in the art that the application can be practiced without these details. In other instances, well known methods, procedures, systems, components, and/or circuits have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present application.
The present application uses flowcharts to illustrate the operations performed by a system according to embodiments of the present application. It should be clearly understood that the operations of the flowcharts need not be performed in the order shown; they may instead be performed in reverse order or concurrently. Additionally, at least one other operation may be added to a flowchart, and one or more operations may be removed from it.
Before describing the embodiments of the present application in further detail, the terms involved in the embodiments are explained; these explanations apply to the following description.
(1) "In response to": represents a condition or state on which a performed operation depends; when the condition or state is satisfied, the operation or operations may be performed in real time or with a set delay; unless otherwise specified, there is no restriction on the order in which multiple such operations are performed.
(2) "Based on": represents a condition or state on which a performed operation relies; when the condition or state is satisfied, the operation or operations may be performed in real time or with a set delay; unless otherwise specified, there is no restriction on the order in which multiple such operations are performed.
Noun interpretation:
CTC: connectionist temporal classification. It allows the network to make label predictions at any point in the input sequence, so long as the overall order of the labels is correct, in order to achieve temporal classification.
BERT: Bidirectional Encoder Representations from Transformers, a bidirectional encoded representation based on the Transformer architecture.
BiGRU: bidirectional gated recurrent unit, an improved recurrent neural network (RNN) architecture.
CRF: conditional random field (Conditional Random Fields), a conditional probability distribution model of an output sequence given an input sequence.
argmax: a function that returns the argument (or set of arguments) at which a given function attains its maximum; the full name is "arguments of the maxima".
CNN: convolutional neural networks (Convolutional Neural Networks), a class of feedforward neural networks that contain convolutional computations and have a deep structure.
LR: logistic regression (Logistic Regression), a classification algorithm that turns a linear model into a classification model by applying a nonlinear (logistic) function to its output.
DBSCAN: Density-Based Spatial Clustering of Applications with Noise, a density-based spatial clustering algorithm that is robust to noise.
Segmentation module: performs hard segmentation, dividing the pixels in a picture into several categories; for foreground/background segmentation, the pixels are divided into two categories, one representing the foreground and one the background.
KD Tree: short for k-dimensional tree, a tree data structure that stores instance points in a k-dimensional space for fast retrieval; it is mainly used for searching multi-dimensional key data.
The method provided by the embodiments of the present application is applied to an intelligent warehousing system used for placing and transferring articles in a warehouse. The system mainly comprises a user terminal and a server: the user terminal is the terminal component through which a user issues commands to the server, and the server processes the issued commands and issues next-level commands according to the processing results. In this embodiment, the executor of the final command is a pickup component arranged in the warehouse, which picks up the goods placed in the warehouse according to the command. A track is also arranged in the warehouse; the pickup component moves along the track to a target position, identifies the goods at the target position, and determines which of the several goods at the target position is to be picked up.
Therefore, in the embodiments of the present application, the pickup component mainly carries out three processes: the first is moving from the current position to the target position, the second is picking up the target object at the target position, and the third is moving the picked-up target object to the target placement position.
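The first and third of these processes both reduce to planning a movement strategy along the track from the pickup component's current position to a target position. The patent does not name a specific planning algorithm, so the following is only an illustrative sketch that models the track as a grid and finds a shortest route with breadth-first search; the grid and coordinates are invented for the example:

```python
from collections import deque

def plan_path(grid, start, goal):
    """Shortest path along free track cells (0 = track, 1 = blocked).

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}                     # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through predecessors to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None

# A movement strategy from the current position to the first target position.
track = [[0, 0, 0],
         [1, 1, 0],
         [0, 0, 0]]
route = plan_path(track, (0, 0), (2, 0))
```

In a real system the returned cell sequence would be translated into drive commands for the pickup component.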
The method for picking up goods in the intelligent warehousing system provided by the embodiments of the present application follows this overall implementation process; the specific flow is as follows:
S110, receiving a pickup command initiated by the user terminal, parsing the command to obtain keywords in the command, searching a preset database for labels corresponding to the keywords, and determining a first target position based on the labels.
In the prior art, intelligent warehousing systems process commands transmitted from a control center to the execution components, and a typical system contains only one control center, through which all scheduling and data storage are handled. In actual operation, however, an intelligent warehousing system, especially a large one, covers a large area and its staff carry out many tasks; if all scheduling passes through a single control center, the time cost becomes high.
Therefore, to solve this technical problem, the embodiments of the present application do not set up a conventional control center; that is, command issuing is not centralized but is carried out through user terminals arranged in a distributed manner. For example, when a person wants to pick up a certain item of goods, the task is established and the command issued directly through the corresponding user terminal.
In the existing control-center mode of task initiation, when a task for picking up a certain target object is initiated, the target object is determined at the center or terminal and the task is then issued. For convenience of use, in the embodiments of the present application the task is determined by the user through speech recognition alone: the speech in which the person describes the object to be picked up is parsed, and the position information corresponding to the parsed key information is obtained from the labels in a preset database.
This involves three processing procedures: parsing the voice information received from the user to obtain keywords, acquiring the labels of those keywords in a preset database, and determining the target position based on the labels.
Parsing the received voice information to obtain keywords comprises the following processing steps:
step S111, based on the input voice signal, obtaining truncation information for truncating the characteristic sequence of the voice signal.
In an embodiment of the present application, the truncation information may be spike information related to the speech signal, obtained by performing connectionist temporal classification (CTC) processing on the feature sequence. The CTC processing may output a sequence of spikes separated by blanks, where one spike may represent a syllable or a group of phones, such as a combination of high-frequency phones. It should be appreciated that while the following sections use CTC spike information as one example of providing truncation information, any other model and/or algorithm capable of providing truncation information for an input speech signal, now known or developed in the future, may also be used in conjunction with embodiments of the present disclosure.
S112, truncating the feature sequence into a plurality of subsequences based on the truncation information, and dividing the subsequences to obtain independent word units.
Specifically, for each spike in the spike information, the subsequence of the feature sequence corresponding to a predetermined number of spikes adjacent to that spike is selected, where the predetermined number of spikes includes a first number of spikes before the spike and a second number of spikes after it. The word corresponding to each spike is taken as an independent word unit, and these word units serve as the basic input for subsequent recognition.
S113, inputting a plurality of word units into the trained recognition model to obtain recognized texts, and combining the texts into corresponding keywords.
In the embodiment of the application, the structure of the recognition model is a BERT layer, a BiGRU layer, a self-attention layer and a CRF layer which are sequentially connected.
The BiGRU layer comprises a forward BiGRU layer and a backward BiGRU layer; the outputs of the BERT layer are input to the forward and backward BiGRU layers respectively to obtain two corresponding outputs, which are then input to the self-attention layer.
The self-attention layer has a multi-head attention structure, comprising a plurality of linear layers corresponding to the multi-head inputs, a scaled dot-product attention layer connected to those linear layers, and a splicing (concatenation) layer connected to the scaled dot-product attention layer, the splicing layer in turn being connected to an output linear layer.
Before recognition, the independent word units need to be vectorized: position embedding, segment embedding and word embedding are applied to each word unit to obtain a joint embedding representation. A positional encoding scheme is adopted for the embedding, and the encoding is added to the embedded data so as to inject relative position information.
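The patent does not specify the positional encoding formula; one common choice that is simply added element-wise to the embeddings, as described above, is the sinusoidal (Transformer-style) encoding. The sketch below is illustrative under that assumption.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position codes; added element-wise to the
    word/segment embeddings to inject relative position information."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # even dims: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd dims: cosine
    return pe
```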
The specific recognition process is as follows: input the joint embedding representation into the BERT layer to obtain word vectors; input the word vectors into the forward BiGRU layer and the backward BiGRU layer respectively, associating the preceding and following context to obtain a plurality of word output vectors; input the word output vectors into the self-attention layer for vector splicing to obtain an alarm-information record sequence; and input the alarm-information record sequence into the CRF layer to obtain the optimal tag sequence, on which text construction is performed to obtain the corresponding event.
The optimal tag sequence is obtained by first computing, for each candidate tag sequence, a position score with the following formula:

score(X, y) = \sum_{i=1}^{n} P_{i, y_i} + \sum_{i=2}^{n} A_{y_{i-1}, y_i}

where y_i and y_{i-1} are the i-th and (i-1)-th tags in the candidate tag sequence, P is the output (emission) score matrix of the BiGRU layer, and A is the transition score matrix, in which A_{i,j} represents the transition score from tag i to tag j.
The set of tag-sequence position scores is then solved with the argmax function, and the tag sequence that maximizes the score is taken as the optimal tag sequence, as expressed by the following formula:

y^{*} = \arg\max_{y} \; score(X, y)
Through the processing of steps S111-S113, the keyword in the voice information, i.e., the information about the target object, is obtained. Because the keyword alone cannot be interpreted by the pickup device, the obtained text information must be converted into the pickup device's feature code for the target object. In the embodiment of the application, the conversion is determined through the tag associated with the text information of the target object in a database, and the text information is converted based on that tag. The database is preset and contains the text information of each target object and its corresponding storage space; the storage space comprises cargo information and cargo placement information about the target object. The cargo placement information is the first target position, which represents the relative geographic coordinates of the target object in the warehouse, and the cargo information comprises the name, warehousing time and similar attributes of the target object. This information is collected and entered when objects are put into storage, i.e., every object has a corresponding tag and stored information in the database.
Step S120: perform path planning based on the current position of the pickup unit and the first target position to obtain a movement strategy for the pickup unit, and drive the pickup unit to move to the first target position based on the movement strategy.
In the embodiment of the present application, step S110 yields the first target position, that is, the placement position of the target object. For the pickup unit to move there, its current position must be determined, and the moving distance is determined from the current position and the target position. Since an intelligent warehousing system contains a plurality of pickup units, the unit that will actually perform the pickup work must be determined before moving to the first target position and picking up the target object. Therefore, before the current position is determined, the pickup units in an idle state must be identified; this determination can be made from the pickup-unit movement states recorded in the server.
After the operating states of the pickup units are determined, several units may turn out to be idle; in that case the unit that will finally execute the command must be screened out. The screening logic in the embodiment of the present application is distance-based: the pickup unit closest to the first target position is selected from the idle units.
Therefore, before the movement of the pickup unit is controlled, the current position information of the idle pickup units is determined. This position information is coordinate data in a warehouse-based world coordinate system, as is the first target position. By comparing the two sets of coordinate data, the idle pickup unit with the shortest path in the coordinate system is determined as the target pickup unit. This comparison of coordinate data can be handled directly with the prior art: the coordinate data of each pickup unit can be collected by a sensor arranged in the unit, the information points sent by the sensors can be mapped into a simulation model, and the coordinate transformation can be carried out in that model; the details are omitted in the embodiment of the present application.
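The idle-unit screening just described, selecting the idle pickup unit nearest the first target position, can be sketched minimally as below. The dictionary layout and straight-line distance metric are assumptions for illustration; the source only requires comparing coordinate data in the warehouse world coordinate system.

```python
import math

def select_picker(pickers, target):
    """Pick the idle unit whose warehouse coordinates are closest to the
    first target position. `pickers` maps id -> (state, (x, y));
    returns the chosen id, or None if no unit is idle."""
    idle = [(pid, pos) for pid, (state, pos) in pickers.items()
            if state == "idle"]
    if not idle:
        return None
    return min(idle, key=lambda p: math.dist(p[1], target))[0]
```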
In an actual warehouse system, the rails are arranged in a grid. To minimize the time for the pickup unit to travel from its real-time position to the first target position, the optimal trajectory among the grid rails must be determined as the movement path.
The optimal trajectory is determined with minimum time as the filtering logic: a plurality of candidate motion paths are determined from the real-time position and the first target position, and the path requiring the least time among them is selected.
The path time cannot be determined from the travel distance between the endpoints alone, because a plurality of pickup units operate in the warehouse simultaneously and other units may be moving or working on the same track. In determining the path, therefore, the operating states of the other pickup units along each candidate path must be acquired, together with their states at the moments the target pickup unit would reach each position, and the time the pickup unit must wait at a position for other units is added to the estimate. This is a prediction process: the server fetches the operating states of all pickup units in the warehouse and the travel paths already committed, and divides each unit's travel into state nodes of unit time, where the unit may be 5 s. The operating state of every other pickup unit at each single-time state node is projected onto the path the pickup unit is to take, and the current and future operating states of the pickup unit and the other units are compared, so that the travel time of the pickup unit can be estimated.
Through this estimation, the running time required by each candidate path is determined and the optimal movement path is selected. The pickup unit is driven along this optimal path to the first target position, and after the optimal movement path is generated, the pickup-unit information and the optimal-path information are uploaded to the database for use in estimating the travel paths of subsequent pickup units.
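The time-node estimation above can be sketched as follows: each candidate path is walked one grid cell per unit-time step, and when a cell is predicted to be occupied by another picker at the arrival step, the unit waits in place and the waiting time is added. The `occupied` lookup table standing in for the server's projected state nodes, and the 5 s unit, are assumptions from the text; this is a sketch, not the patent's exact procedure.

```python
def estimate_path_time(path, occupied, unit=5):
    """Estimated traversal time (seconds) for one candidate path.

    path: ordered grid cells; occupied maps (cell, step) -> True when
    another picker is projected to hold that cell at that time step.
    """
    step = 0
    for cell in path:
        while occupied.get((cell, step)):
            step += 1          # wait for the other picker to clear the cell
        step += 1              # advance one unit-time step into the cell
    return step * unit

def best_path(paths, occupied, unit=5):
    """Select the candidate path with the minimum estimated time."""
    return min(paths, key=lambda p: estimate_path_time(p, occupied, unit))
```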
Step S130: acquire an image of the first target position, identify the acquired image to obtain the target object, acquire size information of the target object, and drive the pickup unit to pick up the target object based on the size information.
In the embodiment of the present application, step S120 mainly moves the pickup unit to the target position where the target object is located. In a practical application scene, objects in the warehouse cannot be placed exactly at their coordinate data, so the coordinate data of the target position is only approximate position information and deviates somewhat from the actual placement. Therefore, when the pickup unit reaches the first target position, the cargo at that position must also be identified: the target object in the area is determined, the size corresponding to its outline is determined, and it is then picked up.
The process thus mainly comprises determining the target object, determining its size, and picking it up once the size is known.
In the embodiment of the application, the identification and determination of the target object are performed by an image-processing method: an image of the first target position is acquired via the pickup unit, and the target object present in the acquired image is identified and screened out.
The image acquisition is realized through an image acquisition device arranged on the pickup unit; the device comprises a camera, which captures images within its field of view.
Identifying the acquired image to obtain the target object comprises the following steps: binarize the image; extract contours from the binarized image to obtain contour images; compare the contour images with a preset template image to obtain initial identification images; process the initial identification images through the recognition model to obtain the target image with the highest similarity to the keywords; and restore the target image to obtain the target object.
In the embodiment of the present application, comparing a contour image with the preset template image to obtain an initial identification image comprises: computing the similarity between the contour image and the template image, and, based on a similarity threshold, taking every contour image that meets the threshold as an initial identification image.
In the embodiment of the application, target recognition comprises two stages. The first stage acquires the initial identification images: contour images are obtained by binarizing the image and extracting contours. Because several objects may appear in the image, multiple contour images are obtained, and in this stage the contour data is also segmented into independent contour images. The similarity between each contour image and the template image is then computed, and the contour images meeting the requirement are determined by a preset similarity threshold. In actual operation, the placed objects may closely resemble one another, so simple template matching may return several qualifying contour images; when that happens, the contour images must undergo secondary screening. The threshold in the embodiment of the present application is set empirically.
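The first-stage threshold screening can be sketched as below. The patent does not name the similarity measure, so intersection-over-union of binary contour masks is used here purely as an illustrative stand-in, and the 0.8 threshold is an assumed example of the empirically set value.

```python
import numpy as np

def mask_similarity(a, b):
    """Similarity of two binary masks as intersection-over-union; a
    stand-in for whatever score the template comparison actually uses."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def initial_candidates(contour_masks, template, threshold=0.8):
    """Keep every contour whose similarity to the template clears the
    threshold. A single survivor is final; several survivors go on to
    the second, neural-network screening stage."""
    return [m for m in contour_masks
            if mask_similarity(m, template) >= threshold]
```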
In the embodiment of the application, the secondary screening is handled by the recognition model, which is built on a neural network, specifically a graph neural network in this embodiment. Combining template matching with the neural network improves image-processing efficiency: when template matching yields only one contour image, that image is taken as the unique result; when it yields several, they are screened a second time by the neural network. If the neural network were used from the start of recognition, the large variation in cargo shapes in practical scenes could make its processing time, and hence its time cost, high. Combining template matching with the neural network therefore improves both data-processing efficiency and recognition accuracy.
The neural network comprises a CNN-based backbone network, an LR grouping mechanism and a dual-pooling fusion module. First, the CNN extracts features from the image to obtain view-level descriptors; these are sent to the LR grouping mechanism and the dual-pooling fusion module for grouping and feature fusion respectively, and finally into a fully connected layer for object recognition. In the LR grouping and dual-pooling fusion architecture of this embodiment, the multiple views are grouped using the L2 norm and a ReLU activation function, then sent to the dual-pooling fusion module, where two rounds of pooling and weighting produce the final shape descriptor. Processing the view-level features with the L2 norm is reasonable and effective here: on the one hand, the L2 norm keeps the functions defined over the relevant interval, so no point is undefined and incomputable; on the other hand, a ReLU activation function can be used when computing the discrimination score.
The procedure above acquires the target object; the picking process of the pickup unit additionally requires the object's size information. Although size information is stored when the object is recorded at warehousing, the way the object was placed leaves the size within the pickup range uncertain. For example, length, width and height are recorded at entry, but the pickup unit cannot tell whether the object is lying flat or standing upright, and these two placements imply different pickup spaces.
Therefore, in this embodiment, the currently pickable size information of the target object must be determined at pickup time.
In this embodiment the size information is obtained from point cloud data, through the following steps: obtain the point cloud data of the target object; convert its format; filter the converted point cloud to remove noise; cluster the denoised point cloud to obtain the complete point cloud of the target object; construct the minimum point cloud bounding box; and obtain the three-dimensional size of the target object from that minimum bounding box.
In this embodiment, a statistical filtering algorithm is adopted to remove noise points from the point cloud data, improving the running speed and precision of the subsequent clustering and segmentation algorithms. Statistical filtering relies on the distribution of distances from each point of the input point cloud to its neighboring points. For each point in the point cloud data, the average distance to all neighbors in its k-neighborhood is computed, and the distance threshold is determined as \mu + \alpha\sigma, where the scaling factor \alpha is set to 1. Statistical analysis of these distances shows that their distribution follows a Gaussian distribution whose curve shape depends on the mean and the standard deviation. Points whose average distance lies outside the threshold set from the global distance mean and standard deviation are therefore identified as outliers and removed from the point cloud data.
The point cloud data set input to the filter comprises n data points, whose coordinates are denoted (x_i, y_i, z_i), i = 1, ..., n, the coordinates of data point i on the x, y and z axes. The distance from data point i to any other point m in the input point cloud data set is

d_{im} = \sqrt{(x_i - x_m)^2 + (y_i - y_m)^2 + (z_i - z_m)^2}

where (x_m, y_m, z_m) are the coordinates of data point m on the x, y and z axes.
The mean \mu and standard deviation \sigma of the average distances d_i from the n data points to all neighbors in their k-neighborhoods are obtained by:

\mu = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (d_i - \mu)^2}

The scale factor of the standard deviation is set as \alpha. From the user-input k and \alpha, two thresholds d_{max} = \mu + \alpha\sigma and d_{min} = \mu - \alpha\sigma are obtained, where k is the number of nearest neighbors of each point in the point cloud data. When the average distance from a data point to all neighbors in its k-neighborhood falls within the set range [d_{min}, d_{max}], the point is retained; when it falls outside that range, the point is rejected as an outlier.
The above processing can realize noise reduction processing for the point cloud data.
In the embodiment of the application, the denoised point cloud data is clustered by a clustering-segmentation method. The basic idea of clustering segmentation is to select one feature of the input point cloud for segmentation and, given a feature threshold, merge the two classes whose features are closest and below the threshold into one class. While the features between two classes remain smaller than the set threshold, the clustering continues iteratively, ending when the features of all classes exceed the threshold or the number of classes falls below the set minimum cluster count. In this embodiment, the DBSCAN clustering algorithm is selected to process the filtered point cloud and obtain the complete point cloud data of the target object. The DBSCAN procedure is as follows: input the point cloud data set to be clustered; perform clustering through the segmentation module; set an initial neighborhood-radius threshold and the minimum number of points per cluster; select a point P from the point cloud data set and search, via a KD-tree, for the set of points in its neighborhood; classify the points in that neighborhood whose distance is smaller than the neighborhood-radius threshold into a new set Q; select points in Q other than P and repeat the above processing; clustering is complete when the points in set Q no longer increase.
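The DBSCAN procedure above can be sketched in a minimal O(n²) form; the brute-force neighbor search replaces the KD-tree purely for brevity, and the parameter names (`eps` for the neighborhood-radius threshold, `min_pts` for the minimum cluster point count) follow common DBSCAN usage rather than the patent.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns labels[i] = cluster id, or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbours(i):
        # brute force; a KD-tree would replace this at scale
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1             # provisionally noise
            continue
        cluster += 1                   # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # noise reached from a core: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:     # core point: expand the cluster
                queue.extend(nb)
    return labels
```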
In this embodiment, the DBSCAN clustering algorithm is selected for the clustering operation on the point cloud; the clustering effect is good, noise points in the point cloud are removed, and complete point cloud data is finally obtained.
The above processing yields the complete point cloud data; to obtain the size information of the target object, a corresponding size detection is required, realized in this embodiment with a bounding-box algorithm. The size-detection process comprises: computing the centroid of the point cloud from the data and decentralizing; constructing the covariance matrix of the point cloud; determining the eigenvalues and eigenvectors of the covariance matrix; orthonormalizing the eigenvectors to obtain an orthogonal matrix; transforming the coordinate origin and axes of the point cloud data so that the target object lies near the origin; generating the minimum bounding box; and outputting the outline size of the target object based on the minimum bounding box.
The specific processing procedure is as follows:
Let the coordinates of any point in the input point cloud be p_i = (x_i, y_i, z_i); the point cloud can then be regarded as a data set Q consisting of these n points. The point cloud centroid is computed as:

\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i
The n points in the point cloud set are decentralized so that the data is distributed around the coordinate origin; the processed data set has the form:

Q' = \{\, p_i' = p_i - \bar{p} \mid i = 1, \dots, n \,\}
The covariance matrix C of the model is then constructed as follows:

C = \frac{1}{n}\sum_{i=1}^{n} p_i' \, (p_i')^{T}
the covariance matrix C is a symmetric matrix, and 9 elements of the covariance matrix C are composed of variances and covariances of point cloud data.
The eigenvalues of the covariance matrix of the point cloud data and their corresponding eigenvectors are then solved.
The obtained eigenvectors are orthogonalized and normalized to obtain a standard orthonormal basis and an orthogonal matrix.
The eigenvector directions correspond to the x-, y- and z-axis directions of the new coordinate system. Using the point cloud centroid coordinates and the new axis directions, the input point cloud data is transformed to the vicinity of the coordinate origin; all point cloud data is then traversed to obtain the 8 vertex coordinates of the bounding box, generating the minimum bounding box of the point cloud data in the new coordinate system. The bounding box vertex coordinates are as follows:
With (x_{min}, x_{max}), (y_{min}, y_{max}) and (z_{min}, z_{max}) denoting the extreme coordinates of the transformed point cloud along the new axes, the 8 vertices are all combinations

(x_{min} \text{ or } x_{max},\ y_{min} \text{ or } y_{max},\ z_{min} \text{ or } z_{max})
Based on the bounding box vertex coordinates, the length L, width W and height H of the target object are calculated according to the following formulas:

L = x_{max} - x_{min}, \qquad W = y_{max} - y_{min}, \qquad H = z_{max} - z_{min}
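The whole PCA-based bounding-box measurement, centroid, decentralization, covariance, eigenvectors as new axes, extents along those axes, can be sketched compactly as below; the function name is illustrative, and sorting the extents largest-first is an assumption about how L, W, H map onto the axes.

```python
import numpy as np

def obb_dimensions(points):
    """Oriented-bounding-box size of an (n, 3) point cloud via PCA.

    Centre the cloud, take the covariance eigenvectors as new axes,
    project, and measure axis-aligned extents in that frame.
    Returns (L, W, H) sorted largest first.
    """
    centroid = points.mean(axis=0)
    centred = points - centroid
    cov = np.cov(centred.T)
    _, vecs = np.linalg.eigh(cov)   # orthonormal eigenvector columns
    proj = centred @ vecs           # rotate into the principal frame
    extents = proj.max(axis=0) - proj.min(axis=0)
    return tuple(sorted(extents, reverse=True))
```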
The above processing yields the three-dimensional size of the target object, and the pickup unit is driven by this size information to pick it up: based on the three-dimensional size, the pickup unit is driven to form a pickup space larger than the target object, and the target object is picked up through that space.
Step S140: perform path planning based on the current position of the pickup unit and the second target position to obtain a movement strategy for the pickup unit, and drive the pickup unit to move to the second target position with that strategy.
In an embodiment of the present application, this step may be implemented with the processing logic of step S120 for moving the pickup unit to the first target position, and is not described further here.
Referring to fig. 2, based on steps S110-S140, an embodiment of the present application provides an intelligent warehousing system cargo pickup apparatus 200 for executing the method of picking up goods in an intelligent warehousing system, the apparatus comprising:
a location determining module 210, configured to receive a pickup command initiated by the user terminal, parse the command to obtain its keyword, search a preset database for the tag corresponding to the keyword, and determine a first target position based on the tag, the first target position being established in a world coordinate system, where the pickup command includes information on the target object to be picked and a second target position at which the target object is to be placed;
A first movement control module 220, configured to perform path planning based on the current position of the pickup element and a first target position, obtain a movement strategy of the pickup element, and drive the pickup element to move to the first target position based on the movement strategy;
the pickup control module 230 is configured to perform image acquisition on the first target position, identify an acquired image to obtain a target object, acquire size information of the target object, and drive the pickup component to pick up the target object based on the size information;
and a second movement control module 240, configured to perform path planning based on the current position of the pickup element and the second target position, and obtain a movement strategy of the pickup element, and drive the pickup element to move to the second target position with the movement strategy.
Referring to fig. 3, a cargo pickup device 300 of an intelligent warehousing system is further provided. Specific devices may differ considerably in configuration or performance, and may include one or more processors 301 and a memory 302, where the memory 302 may store one or more application programs or data. The memory 302 may be transient or persistent storage. An application program stored in the memory 302 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the cargo pickup device. Still further, the processor 301 may be configured to communicate with the memory 302 and execute, on the cargo pickup device, a series of computer-executable instructions in the memory 302. The cargo pickup device may also include one or more power supplies 303, one or more wired/wireless network interfaces 304, one or more input/output interfaces 305, one or more keyboards 306, and the like.
In one particular embodiment, a smart warehousing system cargo pickup device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions for the smart warehousing system cargo pickup device, and configured to be executed by one or more processors, the one or more programs including computer executable instructions for:
receiving a pick-up command initiated by the user terminal, analyzing the command to obtain keywords in the command, searching labels corresponding to the keywords in a preset database based on the keywords, and determining a first target position based on the labels;
performing path planning based on the current position of the pickup element and a first target position to obtain a movement strategy of the pickup element, and driving the pickup element to move to the first target position based on the movement strategy;
acquiring an image of the first target position, identifying the acquired image to obtain a target object, acquiring size information of the target object, and driving the pickup component to pick up the target object based on the size information;
And planning a path based on the current position of the pickup element and the second target position to obtain a movement strategy of the pickup element, and driving the pickup element to move to the second target position by the movement strategy.
The following describes each component of the processor in detail:
In this embodiment the processor is an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application, for example: one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs).
Alternatively, the processor may perform various functions, such as performing the method shown in fig. 1 described above, by running or executing a software program stored in memory, and invoking data stored in memory.
In a particular implementation, the processor may include one or more microprocessors, as one embodiment.
The memory is configured to store a software program for executing the scheme of the present application, and the processor is used to control the execution of the software program, and the specific implementation manner may refer to the above method embodiment, which is not described herein again.
Alternatively, the memory may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, or, without limitation, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be integrated with the processor or may exist separately and be coupled to the processing unit through an interface circuit of the processor, which is not particularly limited by the embodiment of the present application.
It should be noted that the structure of the processor shown in this embodiment is not limited to the apparatus, and an actual apparatus may include more or less components than those shown in the drawings, or may combine some components, or may be different in arrangement of components.
In addition, the technical effects of the processor may refer to the technical effects of the method described in the foregoing method embodiments, which are not described herein.
It should be appreciated that the processor in embodiments of the application may also be another general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should also be appreciated that the memory in embodiments of the present application may be volatile memory, nonvolatile memory, or both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuitry), firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center containing one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, for the specific working procedures of the above-described systems, apparatuses and units, reference may be made to the corresponding procedures in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is merely a specific implementation of the present application, and the present application is not limited thereto; any person skilled in the art can readily conceive of variations or substitutions within the technical scope disclosed herein, which shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A method for picking up goods in an intelligent warehousing system, wherein the intelligent warehousing system is applied to a warehouse and comprises a user side and a server, a track and a pickup component moving along the track are arranged in the warehouse, and the method is applied to the server to drive the pickup component to pick up the goods in the warehouse, the method comprising the following steps:
receiving a picking command initiated by the user side, analyzing the command to obtain keywords in the command, searching for tags corresponding to the keywords in a preset database based on the keywords, determining a first target position based on the tags, establishing the first target position based on a world coordinate system, and acquiring, from the picking command, information of a target object to be picked and a second target position for placing the target object;
performing path planning based on the current position of the pickup component and the first target position to obtain a movement strategy of the pickup component, and driving the pickup component to move to the first target position based on the movement strategy;
acquiring an image of the first target position, identifying the acquired image to obtain the target object, acquiring size information of the target object, and driving the pickup component to pick up the target object based on the size information; wherein identifying the acquired image to obtain the target object comprises: performing binarization processing on the image, performing contour extraction on the binarized image to obtain a contour image, comparing the contour image with a preset template image to obtain an initial identification image, processing the initial identification image through a recognition model to obtain the target image with the highest similarity to the keywords, and restoring the target image to obtain the target object; the recognition model is based on a neural network comprising a backbone network mainly composed of CNNs, an LR grouping mechanism, and a dual-pooling fusion module: the CNN first performs feature extraction on the images to obtain view-level descriptors, which are then sent to the LR grouping mechanism and the dual-pooling fusion module for grouping and feature fusion respectively, and finally to a fully connected layer for object recognition; in this structure, the LR grouping mechanism groups the multiple views using L2 norms and a ReLU activation function and sends them to the dual-pooling fusion module, which applies pooling weighting to the multiple views twice to finally obtain shape descriptors;
and performing path planning based on the current position of the pickup component and the second target position to obtain a movement strategy of the pickup component, and driving the pickup component to move to the second target position based on the movement strategy.
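The binarization, contour extraction, and template comparison steps recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the threshold default, the 4-connectivity contour rule, and the Jaccard overlap standing in for the template comparison are all choices made here, and a production system would use a full computer vision library.

```python
import numpy as np

def binarize(image, threshold=128):
    # Binarization step: foreground = 1, background = 0 (threshold is an assumed default).
    return (image >= threshold).astype(np.uint8)

def extract_contour(mask):
    # Minimal contour extraction: keep foreground pixels that have at least
    # one background 4-neighbour; purely interior pixels are removed.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & (1 - interior)

def similarity(a, b):
    # Jaccard overlap, used here as a stand-in for the claimed comparison
    # between a contour image and a preset template image.
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0
```

Candidate contour images could then be ranked by `similarity(contour, template)` before the recognition model is applied to the best matches.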
2. The method for picking up goods in an intelligent warehousing system according to claim 1, wherein receiving the picking command initiated by the user side and analyzing the command to obtain the keywords in the command comprises: obtaining, based on an input voice signal, truncation information for truncating a feature sequence of the voice signal; truncating the feature sequence into a plurality of subsequences based on the truncation information, and dividing the subsequences to obtain independent word units; and inputting the word units into a trained recognition model to obtain recognized texts, and combining the texts into the corresponding keywords.
3. The intelligent warehousing system cargo pickup method of claim 2, wherein obtaining the truncation information for truncating the feature sequence of the voice signal comprises: obtaining spike information related to the voice signal by performing connectionist temporal classification (CTC) processing on the feature sequence; and determining the truncation information based on the obtained spike information.
4. The intelligent warehousing system cargo pickup method of claim 3, wherein truncating the feature sequence into a plurality of subsequences comprises: for each spike in the spike information, selecting a subsequence in the feature sequence corresponding to a predetermined number of spikes adjacent to that spike, the predetermined number of spikes comprising a first number of spikes before the spike and a second number of spikes after the spike.
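The spike-based truncation of claims 3 and 4 can be sketched as below. This is a hedged illustration, assuming the CTC spikes are given as frame indices into the feature sequence; the function name, the clamping at the ends of the spike list, and the defaults for the "first" and "second" spike counts are all assumptions, not from the patent.

```python
def truncate_by_spikes(features, spike_indices, first=1, second=1):
    # For each spike, take the span from `first` spikes before it to
    # `second` spikes after it (clamped at the ends of the spike list),
    # and cut the corresponding frames out of the feature sequence.
    subsequences = []
    for i in range(len(spike_indices)):
        lo = spike_indices[max(i - first, 0)]
        hi = spike_indices[min(i + second, len(spike_indices) - 1)]
        subsequences.append(features[lo:hi + 1])
    return subsequences
```

Each returned subsequence would then be divided into word units and fed to the recognition model of claim 2.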
5. The intelligent warehousing system cargo pickup method of claim 4, wherein the recognition model comprises a BERT layer, a BiGRU layer, a self-attention layer, and a CRF layer connected in sequence; the BiGRU layer comprises a forward GRU layer and a backward GRU layer, the outputs of the BERT layer are input to the forward GRU layer and the backward GRU layer respectively to obtain two corresponding outputs, and the two outputs are respectively input to the self-attention layer.
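The forward/backward structure of the BiGRU layer in claim 5 can be sketched as follows. Only the bidirectional-GRU portion is shown; the BERT, self-attention, and CRF layers are omitted, and the weight shapes and parameter ordering are illustrative assumptions rather than the patent's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    # One GRU cell update: update gate z, reset gate r, candidate state.
    z = sigmoid(x @ Wz + h @ Uz)
    r = sigmoid(x @ Wr + h @ Ur)
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_cand

def bigru(seq, params_fwd, params_bwd):
    # Run a forward pass and a backward pass over the sequence and
    # concatenate the two hidden states at each time step, mirroring the
    # forward/backward branches that feed the self-attention layer.
    hidden = params_fwd[1].shape[0]
    h, fwd = np.zeros(hidden), []
    for x in seq:
        h = gru_step(h, x, *params_fwd)
        fwd.append(h)
    h, bwd = np.zeros(hidden), []
    for x in seq[::-1]:
        h = gru_step(h, x, *params_bwd)
        bwd.append(h)
    return np.concatenate([np.stack(fwd), np.stack(bwd[::-1])], axis=1)
```

The concatenated per-step states (twice the hidden size) are what a downstream self-attention layer would consume.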
6. The intelligent warehousing system cargo pickup method of claim 5, wherein comparing the contour image with the preset template image to obtain the initial identification image comprises: comparing the contour image with the preset template image to obtain a similarity between the contour image and the template image, and, based on a similarity threshold, taking the contour image that meets the similarity threshold as the initial identification image.
7. The intelligent warehousing system cargo pickup method of claim 6, wherein acquiring the size information of the target object comprises: obtaining point cloud data of the target object, performing format conversion on the point cloud data, filtering the converted point cloud data to remove noise, clustering the denoised point cloud data to obtain a complete point cloud of the target object, constructing a minimum point cloud bounding box, and obtaining the three-dimensional size of the target object based on the minimum point cloud bounding box.
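Claim 7's size pipeline (denoise the point cloud, then bound it and read off the three-dimensional size) can be sketched as below. Every step here is an illustrative simplification: a distance-from-centroid outlier filter stands in for the claimed filtering and clustering, and an axis-aligned box stands in for the minimum bounding box, which would additionally search over orientations.

```python
import numpy as np

def object_dimensions(points, k=2.0):
    # 1) Remove noise: drop points whose distance from the centroid
    #    exceeds mean + k * std of all such distances.
    centroid = points.mean(axis=0)
    dist = np.linalg.norm(points - centroid, axis=1)
    cleaned = points[dist <= dist.mean() + k * dist.std()]
    # 2) Axis-aligned bounding box of the cleaned cloud; its extents
    #    approximate the object's (length, width, height).
    return tuple(cleaned.max(axis=0) - cleaned.min(axis=0))
```

A library such as Open3D would normally supply the statistical outlier removal, clustering, and oriented bounding box used in practice.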
8. The intelligent warehousing system cargo pickup method of claim 7, wherein driving the pickup component to pick up the target object based on the size information comprises: driving the pickup component to form a pickup space larger than the size of the target object based on the three-dimensional size of the target object, and picking up the target object based on the pickup space.
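A trivial sketch of claim 8's "pickup space larger than the object": widen each measured dimension by an assumed clearance. The function name and the 0.02 m default clearance are purely illustrative and not part of the claims.

```python
def pickup_space(object_dims, clearance=0.02):
    # Open the pickup component wider than the object on every axis so the
    # formed pickup space is strictly larger than the measured object.
    return tuple(d + 2 * clearance for d in object_dims)
```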
9. A cargo pickup device for an intelligent warehousing system, wherein the intelligent warehousing system is applied to a warehouse and comprises a user side and a server, a track and a pickup component moving along the track are arranged in the warehouse, and the device is applied to the server to drive the pickup component to pick up goods in the warehouse, the device comprising:
the position determining module, which is used for receiving a picking command initiated by the user side, analyzing the command to obtain keywords in the command, searching for tags corresponding to the keywords in a preset database based on the keywords, determining a first target position based on the tags, establishing the first target position based on a world coordinate system, and acquiring, from the picking command, information of the target object to be picked and a second target position for placing the target object;
the first movement control module, which is used for performing path planning based on the current position of the pickup component and the first target position to obtain a movement strategy of the pickup component, and driving the pickup component to move to the first target position based on the movement strategy;
the pickup control module, which is used for acquiring an image of the first target position, identifying the acquired image to obtain the target object, acquiring size information of the target object, and driving the pickup component to pick up the target object based on the size information; wherein identifying the acquired image to obtain the target object comprises: performing binarization processing on the image, performing contour extraction on the binarized image to obtain a contour image, comparing the contour image with a preset template image to obtain an initial identification image, processing the initial identification image through a recognition model to obtain the target image with the highest similarity to the keywords, and restoring the target image to obtain the target object; the recognition model is based on a neural network comprising a backbone network mainly composed of CNNs, an LR grouping mechanism, and a dual-pooling fusion module: the CNN first performs feature extraction on the images to obtain view-level descriptors, which are then sent to the LR grouping mechanism and the dual-pooling fusion module for grouping and feature fusion respectively, and finally to a fully connected layer for object recognition; in this structure, the LR grouping mechanism groups the multiple views using L2 norms and a ReLU activation function and sends them to the dual-pooling fusion module, which applies pooling weighting to the multiple views twice to finally obtain shape descriptors;
and the second movement control module, which is used for performing path planning based on the current position of the pickup component and the second target position to obtain a movement strategy of the pickup component, and driving the pickup component to move to the second target position based on the movement strategy.
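Since the pickup component moves along a fixed track, the path planning performed by the movement control modules reduces, in the simplest reading, to one-dimensional motion along that track. A hypothetical sketch (positions in metres along the track, constant speed; all names and defaults are assumptions, not from the patent):

```python
def movement_strategy(current_pos, target_pos, speed=0.5):
    # Plan 1-D motion along the track: signed direction, distance to
    # travel, and travel time at the assumed constant speed.
    delta = target_pos - current_pos
    direction = (delta > 0) - (delta < 0)   # +1, -1, or 0
    return {"direction": direction,
            "distance": abs(delta),
            "duration": abs(delta) / speed if direction else 0.0}
```

A real system would refine such a strategy with acceleration limits and collision constraints along the track.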
CN202311219343.2A 2023-09-21 2023-09-21 Method and device for picking up goods in intelligent warehousing system Active CN116946610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311219343.2A CN116946610B (en) 2023-09-21 2023-09-21 Method and device for picking up goods in intelligent warehousing system

Publications (2)

Publication Number Publication Date
CN116946610A CN116946610A (en) 2023-10-27
CN116946610B true CN116946610B (en) 2023-12-12

Family

ID=88442936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311219343.2A Active CN116946610B (en) 2023-09-21 2023-09-21 Method and device for picking up goods in intelligent warehousing system

Country Status (1)

Country Link
CN (1) CN116946610B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117574918B (en) * 2024-01-15 2024-05-03 青岛冠成软件有限公司 Intelligent interaction method based on LSTM

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000230806A (en) * 1999-02-09 2000-08-22 Sony Corp Position and method for position recognition and virtual image three-dimensional composition device
WO2003019425A1 (en) * 2001-08-23 2003-03-06 Tgw Transportgeräte Gmbh & Co.Kg Interim storage system for identified goods and method for transferring ordered goods
CH693710A5 (en) * 1999-07-02 2003-12-31 Sig Pack Systems Ag A method for picking and placing of packaged goods.
CA2650180A1 (en) * 2008-01-17 2009-07-17 Imds America Inc. Image binarization using dynamic sub-image division
CA2977185A1 (en) * 2016-08-25 2018-02-25 Weissbeerger Ltd Correlating consumption and activity patterns
CN108942946A (en) * 2018-08-29 2018-12-07 中南大学 A kind of wisdom logistics environment robot stowage and device
CN109903327A (en) * 2019-03-04 2019-06-18 西安电子科技大学 A kind of object dimension measurement method of sparse cloud
CN110188609A (en) * 2019-04-24 2019-08-30 中国农业科学院深圳农业基因组研究所 A kind of detection method of weed seed
CN110738268A (en) * 2019-10-18 2020-01-31 广东华南半导体光电研究院有限公司 intelligent stereoscopic warehouse goods automatic identification method based on SIFT and DDIS
DE102018130206A1 (en) * 2018-11-28 2020-05-28 Dematic Gmbh Method and system for controlling the material flow of objects in a conveyor system in a real warehouse
CN111409996A (en) * 2020-05-13 2020-07-14 北京极智嘉科技有限公司 Carrying robot, box taking method, cargo loading method of cargo box and storage logistics system
CN111429889A (en) * 2019-01-08 2020-07-17 百度在线网络技术(北京)有限公司 Method, apparatus, device and computer readable storage medium for real-time speech recognition based on truncated attention
WO2021073062A1 (en) * 2019-10-17 2021-04-22 广州达宝文机电设备有限公司 Virtual stereoscopic storage system and method
WO2021121306A1 (en) * 2019-12-18 2021-06-24 北京嘀嘀无限科技发展有限公司 Visual location method and system
CN113139525A (en) * 2021-05-21 2021-07-20 国家康复辅具研究中心 Multi-source information fusion-based emotion recognition method and man-machine interaction system
CN113335823A (en) * 2021-04-07 2021-09-03 谈斯聪 Automatic storage, operation, transmission and delivery integrated device, system and method
WO2021198053A1 (en) * 2020-04-03 2021-10-07 Beumer Group A/S Pick and place robot system, method, use and sorter system
WO2021249570A1 (en) * 2020-06-12 2021-12-16 深圳市海柔创新科技有限公司 Control method and apparatus for warehouse robot, and robot and warehouse system
CN113800161A (en) * 2020-06-12 2021-12-17 深圳市海柔创新科技有限公司 Goods sorting system, method and device, processing terminal and sorting robot
CN116374474A (en) * 2023-03-30 2023-07-04 无锡雪浪数制科技有限公司 Picking intelligent decision-making system based on machine vision
CN116605574A (en) * 2023-07-20 2023-08-18 山东大学 Parameter configuration and collaborative scheduling platform for large-scale robot picking system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5372108B2 (en) * 2011-10-24 2013-12-18 シャープ株式会社 Belt conveying apparatus and image forming apparatus
CN107527069A (en) * 2017-08-22 2017-12-29 京东方科技集团股份有限公司 Image processing method, device, electronic equipment and computer-readable medium
WO2019055625A1 (en) * 2017-09-14 2019-03-21 Walmart Apollo, Llc System and method for pallet optimization


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Overview and Development Trends of Artificial Intelligence Applications in the Logistics Industry"; Luo Lei; Logistics Technology and Application; vol. 26, no. 07, pp. 116-121 *
Zhong Yueqi. Principles and Applications of Artificial Intelligence Technology. Donghua University Press, 2020, (1st ed.), p. 192. *
Han Jinhua et al. Research on Innovation in Materials Procurement and Management. Jilin Science and Technology Press, 2021, (1st ed.), pp. 300-301. *


Similar Documents

Publication Publication Date Title
Tsintotas et al. Assigning visual words to places for loop closure detection
US11037305B2 (en) Method and apparatus for processing point cloud data
CN106846355B (en) Target tracking method and device based on lifting intuitive fuzzy tree
Tsintotas et al. Modest-vocabulary loop-closure detection with incremental bag of tracked words
Zhang et al. Graph-based place recognition in image sequences with CNN features
CN116946610B (en) Method and device for picking up goods in intelligent warehousing system
JP6525542B2 (en) Abnormality detection method, abnormality detection device, and program
Tsintotas et al. Tracking‐DOSeqSLAM: A dynamic sequence‐based visual place recognition paradigm
Li et al. Hierarchical semantic parsing for object pose estimation in densely cluttered scenes
Tsintotas et al. DOSeqSLAM: Dynamic on-line sequence based loop closure detection algorithm for SLAM
US11694342B2 (en) Apparatus and method for tracking multiple objects
Zhang et al. A coarse to fine indoor visual localization method using environmental semantic information
CN117425916A (en) Occlusion aware multi-object tracking
CN104021372A (en) Face recognition method and device thereof
Mathias et al. Occlusion aware underwater object tracking using hybrid adaptive deep SORT-YOLOv3 approach
Tsintotas et al. Dimensionality reduction through visual data resampling for low-storage loop-closure detection
Tsintotas et al. The revisiting problem in simultaneous localization and mapping
CN116912763A (en) Multi-pedestrian re-recognition method integrating gait face modes
Tsintotas et al. Online Appearance-Based Place Recognition and Mapping: Their Role in Autonomous Navigation
Meena et al. Hybrid neural network architecture for multi-label object recognition using feature fusion
Fang et al. Multi-dimensional Time Series Approximation Using Local Features at Thinned-out Keypoints.
Saedan et al. Omnidirectional image matching for vision-based robot localization
Rashmi et al. A Survey on Real-Time Object Detection Algorithms
Li et al. Optimization of RFID reading performance based on YOLOv3 and Elman neural network
Wang et al. Integrated pedestrian detection and localization using stereo cameras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant