CN113160414A - Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium - Google Patents

Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium Download PDF

Info

Publication number
CN113160414A
CN113160414A (application CN202110214835.7A)
Authority
CN
China
Prior art keywords
goods
semantic segmentation
images
cargo
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110214835.7A
Other languages
Chinese (zh)
Inventor
王勃
宋柏林
王云吉
孙建成
于忠京
王峰峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Douniu Network Technology Co ltd
Original Assignee
Beijing Douniu Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Douniu Network Technology Co ltd filed Critical Beijing Douniu Network Technology Co ltd
Publication of CN113160414A publication Critical patent/CN113160414A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

An automatic method for identifying the remaining amount of goods comprises the following steps: acquiring a video containing the goods to be identified and processing the video into a plurality of images; extracting feature points from each of the images and matching the feature points across images to determine each item of goods in the images; solving the spatial three-dimensional coordinates of each feature point to generate a point cloud; fusing the generated point clouds to eliminate redundant, duplicated points; restoring the fused point cloud into a three-dimensional scene to obtain a three-dimensional structure diagram of the scene including each item of goods; and calculating the remaining amount of the goods from the three-dimensional structure diagram and outputting the result. With this method, the remaining amount of goods can be identified automatically from video shot with a mobile phone, and because the amount is computed via three-dimensional scene restoration, the method is convenient to operate, accurate in its calculations, low in cost, and widely applicable.

Description

Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium
Technical Field
The invention relates to the field of computer technology, and in particular to a machine-vision-based method and device for automatically identifying the remaining amount of goods on sale, as well as electronic equipment and a computer-readable medium.
Background
During the sale of goods, the sell-through rate matters to both sellers and suppliers. In the prior art, goods are managed centrally through a goods-information management system, which requires that information be entered for every item. At present, supermarkets and large markets alike must employ many workers to inspect how goods are selling; after inspection, the workers manually enter key information such as the remaining amount of goods into the corresponding system. However, manual inspection of remaining goods is costly, and the resulting information is unreliable and quickly outdated. Moreover, this approach suits only supermarkets with long sales cycles and small sales volumes; it is difficult to implement and deploy with existing technology in wholesale markets or trading ports, where sales cycles are short and volumes are large.
By contrast, in a known method for automatically detecting how much food remains on dinner plates, a photo of the current state of a tabletop is captured at a fixed time interval and preprocessed; a food-remainder recognition model then performs image recognition on the preprocessed photo to determine the position coordinates of every plate on the tabletop and to grade the food remaining on each plate, using three grades: empty plate, small amount, and large amount.
Disclosure of Invention
Technical problem
However, that automatic detection method has the following shortcomings. It photographs the plates from directly above and calculates the two-dimensional area ratio of the food, ignoring the food's height; since plates that have actually been eaten from generally retain some food covering the bottom, this leads to misjudgment. In addition, because the method computes a two-dimensional area ratio without considering what the food is, and a dish typically contains several side or garnish items, calculating the food ratio without distinguishing what remains is inaccurate.
To address these problems, the invention provides a machine-vision-based method for identifying the remaining amount of goods in a wholesale market. The method can identify the remaining amounts of many kinds of goods from video shot with a mobile phone in a market scene, and it is convenient and fast to operate, accurate in its calculations, low in cost, and broadly applicable.
Problem solving scheme
According to an aspect of the present invention, there is provided an automatic recognition method of remaining amount of goods, including:
a cargo image acquisition step of acquiring a video including a cargo to be identified, and processing the video into a plurality of images;
a feature point extraction and matching step of extracting feature points from the plurality of images, respectively, and performing feature point matching between the plurality of images to find out the same feature points in different images;
a point cloud generating step of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
a point cloud fusion step of fusing the generated point clouds to eliminate redundant, repeated point cloud points;
A three-dimensional scene reduction step, namely performing three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of the scene including the goods to be identified; and
and a cargo allowance calculation step, namely calculating the allowance of the cargo to be identified according to the three-dimensional structure diagram and outputting a result.
Optionally, the method according to an aspect of the present invention further comprises:
a semantic segmentation step of collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images respectively by using the trained semantic segmentation network model to classify pixels belonging to the same object in the images into one category; and
and a feature point semantic fusion step, namely identifying semantic information of each feature point according to a semantic segmentation result after the feature point extraction and matching step, so that the three-dimensional structure diagram obtained in the three-dimensional scene reduction step comprises respective semantic information of different objects.
Optionally, the method according to an aspect of the present invention, wherein the training of the constructed semantic segmentation network model based on the training dataset specifically includes:
a training data set generation step of acquiring a plurality of pictures including a plurality of kinds of goods, and labeling the range and the boundary of the images of the various goods and the various kinds of objects included in the pictures to generate the training data set;
a semantic division network model building step, namely building the semantic division network model;
and training a semantic segmentation network model, namely training the semantic segmentation network model by using the generated training data set so that the semantic segmentation network model can classify each pixel in the input image.
Alternatively, the method according to an aspect of the invention,
the semantic segmentation network model consists of the lightweight convolutional network MobileNet as the backbone network and a feature pyramid network as the branch network.
Optionally, the method according to an aspect of the present invention further comprises:
and a characteristic point filtering step, wherein before the characteristic point semantic fusion step, unnecessary characteristic points are filtered according to the result of semantic segmentation.
Alternatively, the method according to an aspect of the invention,
in the cargo allowance calculation step, an image boundary of each cargo is extracted from the three-dimensional structure diagram, and the cargo allowance is calculated according to the size of the image boundary.
Alternatively, the method according to an aspect of the invention,
the image boundary of the goods is the minimum circumscribed cuboid of the goods in the three-dimensional structure diagram, and
and calculating the surplus of the goods according to the length, the width and the height of the minimum circumscribed cuboid.
According to another aspect of the present invention, there is provided an automatic recognition apparatus for remaining amount of goods, comprising:
the cargo image acquisition module is used for acquiring a video comprising cargos to be identified and processing the video into a plurality of images;
the characteristic point extracting and matching module is used for respectively extracting characteristic points of the plurality of images and matching the characteristic points among the plurality of images so as to find out the same characteristic points in different images;
the point cloud generating module is used for solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
the point cloud fusion module is used for fusing the generated point clouds to eliminate redundant and repeated point cloud points;
the three-dimensional scene reduction module is used for carrying out three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of a scene including the goods to be identified;
and the cargo allowance calculation module is used for calculating the allowance of the cargo to be identified according to the three-dimensional structure chart and outputting a result.
Optionally, the apparatus according to another aspect of the invention, further comprising
The semantic segmentation module is used for collecting image information including the goods to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images by using the trained semantic segmentation network model respectively so as to classify pixels belonging to the same object in the images into one category; and
and the feature point semantic fusion module is used for identifying semantic information of each feature point according to a semantic segmentation result after the feature points are extracted and matched, so that a three-dimensional structure chart finally obtained by the three-dimensional scene restoration module comprises respective semantic information of different objects.
According to still another aspect of the present invention, there is provided an electronic device having an automatic remaining cargo amount recognition function, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the automatic remaining cargo amount recognition method according to any one of the aspects of the present invention.
According to still another aspect of the present invention, there is provided a computer-readable medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the automatic remaining amount of goods identification method according to any one of the aspects of the present invention described above.
Advantageous effects of the invention
By using the automatic identification method for the remaining amount of the goods, the remaining amount and the position of various goods can be identified by shooting a market scene through a mobile phone, so that the automatic identification method for the remaining amount of the goods is convenient to operate, low in labor cost and wide in application range.
Moreover, the method according to the present invention calculates the remaining amount of the goods by three-dimensional scene reduction, taking into account the height information of the goods, and thus the calculation result is more accurate than that using a two-dimensional plane.
In addition, according to the method, the semantic information of the image is added while the three-dimensional scene is restored, so that the three-dimensional structure diagram obtained after the three-dimensional scene is restored contains the semantic information representing different objects, the surplus of various goods can be identified and calculated at the same time, and the application range is further expanded.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
fig. 1 is a flowchart illustrating main steps of an automatic recognition method of remaining amount of goods according to a preferred embodiment of the present invention;
FIG. 2 is a flow diagram illustrating training a semantic segmentation network model in accordance with a preferred embodiment of the present invention;
FIG. 3 is a flow chart showing the calculation of the remaining cargo amount fused with semantic information according to a preferred embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of three-dimensional scene restoration according to a preferred embodiment of the present invention;
fig. 5 is a block diagram illustrating an automatic recognition apparatus for remaining amount of goods according to a preferred embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are only a few of the presently preferred embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to automatically identify the remaining amount of goods by shooting videos through a mobile phone, the invention provides an automatic identifying method for the remaining amount of goods, which is characterized in that the remaining amount of goods is restored and calculated through a three-dimensional scene based on videos including goods shot through the mobile phone, and various goods can be identified by endowing semantic information to different objects in the restored three-dimensional scene. The automatic recognition method of the remaining amount of goods according to the present invention is described in detail as follows.
Fig. 1 is a flowchart illustrating an automatic recognition method of a remaining amount of goods according to a preferred embodiment of the present invention. As shown in fig. 1, the automatic identification method of the remaining amount of goods of the present invention includes: a cargo image acquisition step S1 of acquiring a video including the goods to be identified and processing the video into a plurality of images; a feature point extraction and matching step S2 of extracting feature points from each of the plurality of images and matching feature points between the images to determine each item of goods among them; a point cloud generation step S3 of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud; a point cloud fusion step S4 of fusing the generated point clouds to eliminate redundant, repeated points; a three-dimensional scene restoration step S5 of restoring the point cloud into a three-dimensional scene to obtain a three-dimensional structure diagram of the scene including each item of goods; and a cargo remaining amount calculation step S6 of calculating the remaining amount of the goods from the three-dimensional structure diagram and outputting the result. The respective steps are described in detail below.
Step S1: cargo image acquisition
For example, a video of the goods is shot in a wholesale market using a mobile phone, and the video is processed into multiple pictures using any known method to obtain images containing the goods.
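The patent does not prescribe a specific frame-extraction method. As a minimal sketch (the function names, the sampling interval, and the use of OpenCV are illustrative assumptions, not from the patent), one might sample one frame per fixed interval:

```python
def frame_indices(total_frames, fps, interval_s=0.5):
    """Indices of frames sampled roughly every `interval_s` seconds."""
    step = max(1, int(round(fps * interval_s)))
    return list(range(0, total_frames, step))

def video_to_images(path, interval_s=0.5):
    """Extract sampled frames from a video file (hypothetical helper)."""
    import cv2  # imported lazily; any frame-extraction tool (e.g. FFmpeg) would do
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    keep = set(frame_indices(int(cap.get(cv2.CAP_PROP_FRAME_COUNT)), fps, interval_s))
    images, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i in keep:
            images.append(frame)
        i += 1
    cap.release()
    return images
```

The sampling interval trades reconstruction quality against processing time: denser frames give more viewpoints for feature matching but more redundant point cloud points to fuse later.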
Step S2: feature point extraction and matching
For each picture, feature points are extracted using the well-known SIFT algorithm to obtain stable feature points that do not change with shooting position, angle, and so on; these stable feature points are then matched across pictures to determine where the same spatial feature point appears in each picture.
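SIFT matching is typically done by nearest-neighbour search over descriptor vectors with Lowe's ratio test. A self-contained sketch of that matching logic follows; the function name and the 0.75 ratio threshold are conventional illustrative assumptions, not values from the patent:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.

    desc_a, desc_b: (N, D) arrays of feature descriptors (e.g. 128-D SIFT).
    Returns (index_in_a, index_in_b) pairs whose best match is clearly
    better than the runner-up, i.e. likely the same spatial point.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        if len(dists) < 2:
            continue
        nearest, second = np.argsort(dists)[:2]
        # Reject ambiguous matches where two candidates are nearly as close.
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches
```

In practice a KD-tree or FLANN-based matcher replaces the brute-force loop, but the acceptance criterion is the same.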
Step S3: point cloud generation
A point cloud is a collection of massive numbers of points expressing the spatial distribution and surface characteristics of a target under a common spatial reference system; once the spatial coordinates of each sampled point on the object's surface have been obtained, this collection of points is the point cloud.
In this embodiment, the spatial three-dimensional coordinates of each feature point are obtained by triangulating the feature points with the least-squares method, generating point cloud data from the feature points contained in all images. Specifically, for a given feature point, an observation ray is formed from the observation position of each picture containing that feature point toward the feature point; each ray is transferred into the world coordinate system to obtain its ray equation there; the distance from a candidate point to each observation ray is then computed in the world coordinate system, and the spatial three-dimensional coordinates of the feature point are taken as the point whose total distance to all observation rays is smallest.
The transformation matrices between different images and the spatial three-dimensional coordinates of the feature points are solved as follows:
(Equation rendered as an image in the original publication.)
and solving the space three-dimensional coordinates of each characteristic point to obtain point cloud data.
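The equation itself is only available as an image in the published text, but the least-squares criterion described above — find the point minimising the summed squared distances to all observation rays — has a standard closed form: with ray origins o_j and unit directions d_j, solve Σ(I − d_j d_jᵀ)p = Σ(I − d_j d_jᵀ)o_j. A sketch under that assumption:

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares intersection of observation rays.

    origins: (M, 3) camera centres in world coordinates.
    directions: (M, 3) ray directions toward the feature point.
    Returns the 3-D point minimising the sum of squared distances to the rays.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(np.asarray(origins, float), np.asarray(directions, float)):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector onto the plane normal to d
        A += P
        b += P @ o
    return np.linalg.solve(A, b)
```

With two or more non-parallel rays the normal matrix A is invertible and the solution is unique; applying this per matched feature point yields the point cloud.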
Step S4: point cloud fusion
After the point cloud data is obtained, the well-known PCL (Point Cloud Library) is used to fuse the point clouds and eliminate redundant, repeated points, yielding a final set of point cloud data.
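The patent relies on PCL for this step. As an illustration of the underlying idea only (a simple voxel-grid merge, which is an assumption and not PCL's exact algorithm), near-duplicate points from overlapping views can be collapsed into per-cell centroids:

```python
import numpy as np

def fuse_point_clouds(clouds, voxel=0.01):
    """Merge several (N, 3) point clouds and drop duplicated points.

    Points falling into the same `voxel`-sized grid cell are collapsed
    into their centroid, removing redundant repeated points.
    """
    pts = np.vstack(clouds)
    keys = np.floor(pts / voxel).astype(np.int64)
    # Group points by voxel cell; average each group.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    counts = np.bincount(inverse)
    fused = np.zeros((counts.size, 3))
    for axis in range(3):
        fused[:, axis] = np.bincount(inverse, weights=pts[:, axis]) / counts
    return fused
```

The voxel size controls the trade-off between deduplication strength and geometric detail retained for the later scene restoration.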
Step S5: three-dimensional scene restoration
After the point cloud data is obtained, the point cloud is restored into a three-dimensional scene using the well-known ElasticFusion technique; the resulting three-dimensional structure diagram is shown, for example, in fig. 4.
Step S6: cargo allowance calculation step
The boundary of the goods is extracted from the obtained three-dimensional structure diagram, and the remaining amount of the goods is then calculated from that boundary. For example, as shown in fig. 4, the volume of the goods is calculated from the length, width, and height of their minimum circumscribed cuboid in the three-dimensional structure diagram; this volume is the remaining amount of the goods.
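As a minimal illustration, if the minimum circumscribed cuboid is approximated by an axis-aligned bounding box (a simplifying assumption; the true minimum cuboid may be rotated relative to the axes), the length × width × height volume computation reduces to:

```python
import numpy as np

def aabb_volume(points):
    """Volume of the axis-aligned bounding cuboid of a goods point cluster.

    points: (N, 3) coordinates of the points belonging to one item of goods.
    Returns length * width * height taken from the coordinate extents.
    """
    points = np.asarray(points, float)
    extents = points.max(axis=0) - points.min(axis=0)
    return float(np.prod(extents))
```

A tighter estimate would fit an oriented bounding box (e.g. via PCA of the cluster) before taking the product of the extents.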
The main flow of the automatic remaining cargo amount recognition according to the present embodiment is described above.
However, an actual scene usually contains more than one kind of goods, together with people, vehicles, background, and so on. In this case, in order to identify each item of goods and calculate its remaining amount, different kinds of goods must be distinguished from one another and from other objects such as people. The approach adopted in this embodiment is to perform semantic segmentation on the images and assign semantic information to each extracted feature point according to the segmentation result, so that the three-dimensional structure diagram obtained through point cloud generation, point cloud fusion, and three-dimensional scene restoration also carries the semantic information of the different objects, allowing multiple kinds of goods to be identified and their respective remaining amounts calculated. In this embodiment, the semantic segmentation is implemented with an artificial neural network.
An artificial neural network — neural network for short — is a model that abstracts the neuron network of the human brain from an information-processing perspective and connects simple units in different ways to form different networks. A neural network is a computational model composed of a large number of interconnected nodes (neurons). Each node represents a particular output function, called an activation function; each connection between two nodes carries a weighted value, called a weight, applied to the signal passing through it. The network's output depends on its connection pattern, weights, and activation functions, and the network itself usually approximates some algorithm or function found in nature. By abstracting data features layer by layer, the network ultimately extracts the features required to complete its task. In this embodiment, a semantic segmentation neural network model is built and trained to classify each pixel of the input goods image, so that the semantic information of each feature point can be identified from the segmentation result.
The building and training of the semantic segmentation network model is described in detail below with reference to fig. 2.
The semantic segmentation network model of this embodiment is divided into a backbone network and a branch network: the backbone adopts the well-known lightweight convolutional network MobileNetV3, which has a small computational cost, while the branch adopts the well-known feature pyramid network in order to handle objects (goods) of different sizes.
MobileNet is a model based on depthwise separable convolutions, designed mainly for mobile devices; it runs efficiently offline on devices such as mobile phones. MobileNetV3 introduces dilated (atrous) convolution to enlarge the receptive field of the feature maps, so that each convolution output covers a wider range of information. The feature pyramid network mainly addresses the multi-scale problem in object detection; through a simple change to the network connections, it greatly improves small-object detection with essentially no increase in the original model's computational cost.
In addition, after the feature pyramid network, the semantic segmentation network model resamples the feature maps of different sizes to a common resolution and concatenates them, applies a 3 × 3 convolution, and finally upsamples by a factor of 4 to obtain the final output layer. Here, a value f_i(c) ∈ [0, 1] indicates whether pixel i is assigned to class c, and the error vector m_i(c) is given by Equation 1.
(Equation 1; rendered as an image in the original publication.)
A Jaccard coefficient is constructed for class c (Equation 2; rendered as an image in the original publication), and from it a surrogate (substitute) loss is formed:
(Equation rendered as an image in the original publication.)
The segmentation loss is measured with the semantic segmentation evaluation metric mIoU, giving the final loss function as Equation 3:
(Equation 3; rendered as an image in the original publication.)
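The equations are only available as images in the published text. A common formulation consistent with the surrounding description — the Lovász-Softmax surrogate of the Jaccard loss — would read as follows; this is a plausible reconstruction under that assumption, not the patent's exact formulas:

```latex
% Eq. 1: pixel error vector; f_i(c) is the predicted score for class c at
% pixel i, and y_i the ground-truth label of pixel i.
m_i(c) =
\begin{cases}
1 - f_i(c), & c = y_i,\\
f_i(c), & c \neq y_i.
\end{cases}

% Eq. 2: Jaccard coefficient for class c.
J_c(y, \hat{y}) = \frac{\lvert \{y = c\} \cap \{\hat{y} = c\} \rvert}
                       {\lvert \{y = c\} \cup \{\hat{y} = c\} \rvert}

% Eq. 3: surrogate loss, the Lovász extension of the Jaccard loss
% Delta_{J_c} = 1 - J_c, averaged over the class set C (consistent with mIoU).
\mathrm{loss}(f) = \frac{1}{\lvert C \rvert} \sum_{c \in C}
  \overline{\Delta_{J_c}}\big(m(c)\big)
```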
the semantic segmentation network model is then implemented using the well-known open source artificial neural network library Keras.
After the model is built, the model can be trained to carry out semantic segmentation on the input image. The training procedure is as follows.
S201: image acquisition
A video of an agricultural-produce trading scene is shot in a wholesale market with a tool such as a mobile phone, and the video is processed into multiple images.
S202: image annotation
The captured video is converted into pictures using the well-known multimedia processing tool FFmpeg, and the well-known data-labeling software Labelme is then used to mark the range and boundaries of goods, people, vehicles, and backgrounds in each picture to generate the training data set.
S203: model training
The generated data set is divided into a training set and a test set, for example by randomly drawing 80% of the samples as the training set and the remaining 20% as the test set. The constructed semantic segmentation network model is then trained on the training set.
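The 80/20 split can be sketched as follows; this is a generic illustration with an assumed fixed seed, as the patent does not specify the splitting procedure:

```python
import random

def split_dataset(samples, train_frac=0.8, seed=42):
    """Randomly split labelled samples into training and test sets (e.g. 80/20)."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = samples[:]       # copy; leave the caller's list untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

A fixed seed matters here: model testing (S204) and optimization (S205) are only comparable across runs if the same images stay in the held-out test set.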
S204: model testing
The trained model is evaluated on the held-out test set.
S205: model optimization
The model is optimized according to its test results using any known method.
The building and training process of the semantic segmentation network model according to the embodiment is described above.
The following describes a flow of identifying and calculating the remaining amounts of a plurality of goods fused with semantic information by using the above semantic segmentation network model with reference to fig. 3.
Step S301: cargo image acquisition step
For example, a video including a plurality of different goods is taken in a wholesale market using a mobile phone, and the video is processed into a plurality of pictures using any known method to obtain a plurality of images including the different goods.
Step S302: semantic segmentation step
After collecting image information including goods to generate a training data set and building a semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on a plurality of images by using the trained semantic segmentation network model respectively so as to classify pixels belonging to the same object in the images into one category.
Step S303: characteristic point extraction and matching step
For each image, feature points are extracted using the well-known SIFT algorithm to obtain stable feature points that do not change with shooting position, angle, and so on; these stable feature points are then matched across pictures to determine where the same spatial feature point appears in each image.
Step S304: feature point semantic fusion step
After the feature points have been extracted and matched, the semantic information of each extracted feature point is identified from the semantic segmentation result, so that every object in the three-dimensional structure diagram obtained after the subsequent three-dimensional scene restoration carries its own semantic information.
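A sketch of this fusion step, assuming the segmentation result is a per-pixel label mask and using hypothetical names (the patent does not specify the data layout):

```python
def fuse_semantics(keypoints, seg_mask):
    """Attach a semantic label to each 2-D feature point.

    keypoints: list of (x, y) pixel coordinates from feature extraction.
    seg_mask:  2-D array/list of per-pixel class labels from the
               semantic segmentation model (row-major: seg_mask[y][x]).
    Returns (x, y, label) triples carried forward into the point cloud,
    so the restored 3-D scene retains each object's semantics.
    """
    labelled = []
    for x, y in keypoints:
        label = seg_mask[int(round(y))][int(round(x))]
        labelled.append((x, y, label))
    return labelled
```

Because each point cloud point originates from a feature point, the label survives point cloud generation and fusion, which is what lets the final structure diagram distinguish one kind of goods from another.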
Step S305: point cloud generation step
As in the process described in step S3 above, the spatial three-dimensional coordinates of each feature point are obtained by triangulating it with the least squares method, thereby yielding a point cloud for each image.
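The least-squares triangulation can be sketched as a two-view direct linear transform (DLT): given a feature point observed at pixel coordinates `x1` in camera `P1` and `x2` in camera `P2`, stack the projection constraints and take the SVD null vector. The 3x4 projection matrices are assumed known from the matching step; the toy cameras below are illustrative:

```python
import numpy as np

# Minimal least-squares (DLT) triangulation sketch for the point cloud
# generation step: recover a 3D point from two pixel observations.

def triangulate(P1, P2, x1, x2):
    """Solve A X = 0 in the least-squares sense via SVD and dehomogenize."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                  # null-space direction = homogeneous solution
    return X[:3] / X[3]

# Two toy cameras one unit apart; the true point is (1, 2, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = triangulate(P1, P2, (0.2, 0.4), (0.0, 0.4))
```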
Step S306: point cloud fusion step
After the point cloud data is obtained, the point clouds are fused using the well-known PCL (Point Cloud Library) to eliminate redundant, repeated point cloud points, producing a single final set of point cloud data.
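PCL itself is a C++ library; as a stand-in, the core idea of this fusion step can be sketched in NumPy: merge the overlapping per-image clouds and drop redundant points by keeping one point per cell of a coarse voxel grid. The voxel size is an illustrative parameter:

```python
import numpy as np

# NumPy stand-in for the point cloud fusion step: concatenate per-image
# clouds and deduplicate by voxel, keeping the first point in each cell.

def fuse_point_clouds(clouds, voxel=0.05):
    """Concatenate clouds (each an Nx3 array) and deduplicate per voxel."""
    points = np.vstack(clouds)
    keys = np.floor(points / voxel).astype(int)         # voxel index per point
    _, first = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first)]                       # one point per voxel

cloud_a = np.array([[0.00, 0.00, 0.00], [1.00, 1.00, 1.00]])
cloud_b = np.array([[0.01, 0.01, 0.01]])                # near-duplicate of cloud_a[0]
fused = fuse_point_clouds([cloud_a, cloud_b])
```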
Step S307: three-dimensional scene restoration
After the fused point cloud data is obtained, the scene is restored in three dimensions from the point cloud using the known ElasticFusion technique; the resulting three-dimensional structure diagram is shown, for example, in fig. 4.
Step S308: cargo allowance calculation step
The boundary of each cargo in the restored three-dimensional structure diagram is extracted, and the remaining amount of the cargo is then calculated from that boundary. For example, as shown in fig. 4, the volume of the cargo is calculated from the length, width, and height of its minimum circumscribed cuboid in the three-dimensional structure diagram; this volume is taken as the remaining amount of the cargo.
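The volume calculation above can be sketched by measuring the bounding cuboid of the points belonging to one cargo. For simplicity the sketch uses an axis-aligned box rather than the true minimum circumscribed cuboid (an assumed simplification); its length x width x height gives the volume used as the remaining amount:

```python
import numpy as np

# Sketch of the remaining-amount calculation: measure the axis-aligned
# bounding cuboid of a cargo's 3D points (a simplification of the
# minimum circumscribed cuboid named in the text).

def cargo_volume(points: np.ndarray) -> float:
    """Volume of the axis-aligned bounding cuboid of an Nx3 point set."""
    extents = points.max(axis=0) - points.min(axis=0)  # length, width, height
    return float(np.prod(extents))

# A toy stack of boxes spanning 2 x 1 x 0.5 metres.
corners = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 0.5], [1.0, 0.5, 0.2]])
volume = cargo_volume(corners)
```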
This completes the process of identifying and calculating the remaining amounts of multiple goods, fused with semantic information, using the semantic segmentation network model.
Note that steps S302 and S303 in the above process have no fixed execution order: step S302 may be performed first, step S303 may be performed first, or the two may be performed simultaneously.
In addition, after the images have been semantically segmented by the semantic segmentation network model, feature points in unneeded regions such as the background can be removed according to the segmentation result, reducing the processing load and shortening execution.
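This optional filtering can be sketched by looking up each keypoint's pixel in the per-pixel class map and dropping points in the background class. Treating class 0 as "background" is an assumption for illustration, not a label assignment from the patent:

```python
import numpy as np

# Sketch of the optional background filtering: drop feature points whose
# segmentation label is the background class (class 0 here is an assumed
# background label, chosen only for this illustration).

def filter_keypoints(keypoints, class_map, background=0):
    """Keep (row, col) keypoints whose segmentation label is not background."""
    return [(r, c) for r, c in keypoints if class_map[r, c] != background]

class_map = np.array([[0, 0, 1],
                      [0, 1, 1]])        # 1 = some cargo class
kps = [(0, 0), (0, 2), (1, 1)]
kept = filter_keypoints(kps, class_map)
```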
The automatic cargo remaining amount identification method according to the embodiment of the present invention has been described above in detail; it is further illustrated below with an example. Note that this example is merely illustrative.
In this example, a 10-minute video of an indoor stall in a wholesale market is collected; the goods sold at the stall are boxed Red Fuji apples. The video is converted into pictures with FFmpeg, yielding 1200 pictures, of which 1000 are extracted as the training set and 200 as the test set. A semantic segmentation network is built with Keras and trained on the training set for 200 epochs, at which point it reaches a fitted state. Testing on the test set gives an mIoU of 81.6. The model is converted to tflite format with TensorFlow Lite so that it can run on a mobile device such as a mobile phone. A cargo remaining amount calculation program is developed in the Java language and installed on an Android phone, completing the automatic identification of the remaining amount of goods at the stall selling Red Fuji apples.
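The example reports test quality as an mIoU score. How mean Intersection-over-Union is computed from a predicted class map and its ground-truth labels can be sketched in pure NumPy (the Keras model itself is not reproduced here):

```python
import numpy as np

# Sketch of the mIoU metric used to evaluate the segmentation model:
# per-class intersection over union, averaged over the classes present.

def mean_iou(pred: np.ndarray, truth: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        if union > 0:                       # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

truth = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, truth, num_classes=2)
```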
Fig. 5 is a block diagram illustrating an embodiment of the automatic cargo remaining amount recognition apparatus according to the present invention. The apparatus 500 shown in fig. 5 includes at least: a cargo image acquisition module 501, which acquires a video including the goods to be identified and processes the video into a plurality of images; a feature point extraction and matching module 502, which extracts feature points from each of the images and performs feature point matching between them to determine each item in the images; a point cloud generation module 503, which solves the spatial three-dimensional coordinates of each feature point to generate point clouds; a point cloud fusion module 504, which fuses the generated point clouds to eliminate redundant, repeated points; a three-dimensional scene restoration module 505, which restores the three-dimensional scene from the point cloud to obtain a three-dimensional structure diagram of the scene including the goods; and a cargo remaining amount calculation module 506, which calculates the remaining amount of the goods from the three-dimensional structure diagram and outputs the result.
In addition, in order to identify the remaining amounts of multiple kinds of goods simultaneously, the automatic cargo remaining amount identification apparatus according to the present invention may further include a semantic segmentation module and a feature point semantic fusion module.
In the semantic segmentation module, after image information including the goods is collected to generate a training data set and the semantic segmentation network model is built, the model is trained on the training data set; the trained model then performs semantic segmentation on each of the plurality of images, classifying pixels that belong to the same object in an image into one category.
In the feature point semantic fusion module, after the feature points are extracted and matched, the semantic information of each feature point is identified from the semantic segmentation result, so that the three-dimensional structure diagram finally produced by the three-dimensional scene restoration module contains the semantic information of each different object.
As another aspect, the present invention also provides an electronic device having an automatic cargo remaining amount recognition function, the electronic device including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the automatic cargo remaining amount recognition method described above.
As still another aspect, the present invention also provides a computer-readable medium carrying one or more programs which, when executed by an apparatus, cause the apparatus to perform the steps of the automatic cargo remaining amount recognition method.
According to the technical scheme of the embodiment of the invention, the following effects are obtained.
With the automatic cargo remaining amount identification method, the remaining amounts and positions of multiple kinds of goods can be identified simply by shooting the market scene with a mobile phone, so the method is convenient to operate, low in labor cost, and widely applicable.
Moreover, because the method according to the present invention calculates the remaining amount of goods through three-dimensional scene restoration, it takes the height of the goods into account, and the result is therefore more accurate than a calculation based on a two-dimensional plane.
In addition, according to the method, semantic information from the images is added while the three-dimensional scene is restored, so that the resulting three-dimensional structure diagram contains semantic information distinguishing the different objects; the remaining amounts of multiple kinds of goods can thus be identified and calculated simultaneously, further expanding the range of application.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method for automatically identifying the remaining amount of goods is characterized by comprising the following steps:
a cargo image acquisition step of acquiring a video including a cargo to be identified, and processing the video into a plurality of images;
a feature point extraction and matching step of extracting feature points from the plurality of images, respectively, and performing feature point matching between the plurality of images to find out the same feature points in different images;
a point cloud generating step of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
a point cloud fusion step, namely fusing the generated point clouds to eliminate redundant repeated point cloud points;
a three-dimensional scene reduction step, namely performing three-dimensional scene reduction on the point cloud after fusion to obtain a three-dimensional structure diagram of the scene including the goods to be identified; and
and a cargo allowance calculation step, namely calculating the allowance of the cargo to be identified according to the three-dimensional structure diagram and outputting a result.
2. The method of claim 1, further comprising:
a semantic segmentation step of collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images respectively by using the trained semantic segmentation network model to classify pixels belonging to the same object in the images into one category; and
and a feature point semantic fusion step, namely identifying semantic information of each feature point according to a semantic segmentation result after the feature point extraction and matching step, so that the three-dimensional structure diagram obtained in the three-dimensional scene restoration step later comprises respective semantic information of different objects.
3. The method of claim 2,
the semantic segmentation network model trained and built based on the training data set specifically comprises:
a training data set generation step of acquiring a plurality of pictures including a plurality of kinds of goods, and labeling the range and the boundary of the images of the various kinds of goods and the various kinds of objects included in the pictures to generate the training data set;
a semantic division network model building step, namely building the semantic division network model; and
and training a semantic segmentation network model, namely training the semantic segmentation network model by using the generated training data set so that the semantic segmentation network model can classify each pixel in the input image.
4. The method according to claim 2 or 3,
the semantic segmentation network model is composed of a lightweight convolution network MobileNet serving as a main network and a characteristic pyramid network serving as a branch network.
5. The method of claim 2 or 3, further comprising:
and a characteristic point filtering step, wherein before the characteristic point semantic fusion step, unnecessary characteristic points are filtered according to the result of semantic segmentation.
6. The method according to claim 1 or 2,
in the step of calculating the remaining amount of the goods, the image boundary of the goods to be identified is extracted from the three-dimensional structure chart, and the remaining amount of the goods is calculated according to the size of the image boundary.
7. The method of claim 6,
the image boundary of the goods to be identified is the minimum circumscribed cuboid of the goods in the three-dimensional structure diagram, and
and calculating the surplus of the goods according to the length, the width and the height of the minimum circumscribed cuboid.
8. An automatic recognition device for remaining amount of goods, comprising:
the cargo image acquisition module is used for acquiring a video comprising cargos to be identified and processing the video into a plurality of images;
a feature point extraction and matching module which extracts feature points of the plurality of images respectively and performs feature point matching between the plurality of images to determine each cargo in the plurality of images;
the point cloud generating module is used for solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
the point cloud fusion module is used for fusing the generated point clouds to eliminate redundant and repeated point cloud points;
the three-dimensional scene reduction module is used for carrying out three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of a scene including the goods to be identified; and
and the cargo allowance calculation module is used for calculating the allowance of the cargo to be identified according to the three-dimensional structure chart and outputting a result.
9. The apparatus of claim 8, further comprising:
the semantic segmentation module is used for collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, the built semantic segmentation network model is trained on the basis of the training data set, and the trained semantic segmentation network model is used for performing semantic segmentation on the plurality of images respectively so as to classify pixels belonging to the same object in the images into one category; and
and the feature point semantic fusion module is used for identifying semantic information of each feature point according to a semantic segmentation result after the feature points are extracted and matched, so that a three-dimensional structure chart finally obtained by the three-dimensional scene restoration module comprises respective semantic information of different objects.
10. An electronic device having a cargo remaining amount automatic recognition function, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110214835.7A 2021-01-25 2021-02-25 Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium Pending CN113160414A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2021100958190 2021-01-25
CN202110095819 2021-01-25

Publications (1)

Publication Number Publication Date
CN113160414A true CN113160414A (en) 2021-07-23

Family

ID=76883496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110214835.7A Pending CN113160414A (en) 2021-01-25 2021-02-25 Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN113160414A (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107578052A (en) * 2017-09-15 2018-01-12 北京京东尚科信息技术有限公司 Kinds of goods processing method and system
CN108895981A (en) * 2018-05-29 2018-11-27 南京怀萃智能科技有限公司 A kind of method for three-dimensional measurement, device, server and storage medium
CN109035579A (en) * 2018-06-29 2018-12-18 深圳和而泰数据资源与云技术有限公司 A kind of commodity recognition method, self-service machine and computer readable storage medium
CN110120010A (en) * 2019-04-12 2019-08-13 嘉兴恒创电力集团有限公司博创物资分公司 A kind of stereo storage rack vision checking method and system based on camera image splicing
CN110223297A (en) * 2019-04-16 2019-09-10 广东康云科技有限公司 Segmentation and recognition methods, system and storage medium based on scanning point cloud data
CN111340873A (en) * 2020-02-28 2020-06-26 广东工业大学 Method for measuring and calculating object minimum outer envelope size of multi-view image
CN111489358A (en) * 2020-03-18 2020-08-04 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN112132523A (en) * 2020-11-26 2020-12-25 支付宝(杭州)信息技术有限公司 Method, system and device for determining quantity of goods


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115205300B (en) * 2022-09-19 2022-12-09 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Similar Documents

Publication Publication Date Title
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN108154105B (en) Underwater biological detection and identification method and device, server and terminal equipment
CN109086811B (en) Multi-label image classification method and device and electronic equipment
JP6397379B2 (en) CHANGE AREA DETECTION DEVICE, METHOD, AND PROGRAM
CN109754009B (en) Article identification method, article identification device, vending system and storage medium
CN112132213A (en) Sample image processing method and device, electronic equipment and storage medium
CN111061890A (en) Method for verifying labeling information, method and device for determining category
CN108230395A (en) Stereoscopic image is calibrated and image processing method, device, storage medium and electronic equipment
CN114667540A (en) Article identification and tracking system
CN108229375B (en) Method and device for detecting face image
CN112307864A (en) Method and device for determining target object and man-machine interaction system
CN115797736B (en) Training method, device, equipment and medium for target detection model and target detection method, device, equipment and medium
CN114219855A (en) Point cloud normal vector estimation method and device, computer equipment and storage medium
CN114565916A (en) Target detection model training method, target detection method and electronic equipment
CN113705669A (en) Data matching method and device, electronic equipment and storage medium
CN110321867B (en) Shielded target detection method based on component constraint network
CN115752683A (en) Weight estimation method, system and terminal based on depth camera
CN114358133B (en) Method for detecting looped frames based on semantic-assisted binocular vision SLAM
CN111160450A (en) Fruit and vegetable weighing method based on neural network, storage medium and device
CN114255377A (en) Differential commodity detection and classification method for intelligent container
CN114169425A (en) Training target tracking model and target tracking method and device
CN113160414A (en) Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium
US11790642B2 (en) Method for determining a type and a state of an object of interest
CN106934339B (en) Target tracking and tracking target identification feature extraction method and device
CN112819953B (en) Three-dimensional reconstruction method, network model training method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination