CN113160414A - Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium - Google Patents

Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium Download PDF

Info

Publication number
CN113160414A
CN113160414A (application CN202110214835.7A)
Authority
CN
China
Prior art keywords
goods
semantic segmentation
images
cargo
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110214835.7A
Other languages
Chinese (zh)
Inventor
王勃
宋柏林
王云吉
孙建成
于忠京
王峰峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Douniu Network Technology Co ltd
Original Assignee
Beijing Douniu Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Douniu Network Technology Co ltd filed Critical Beijing Douniu Network Technology Co ltd
Publication of CN113160414A publication Critical patent/CN113160414A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

An automatic method for identifying the remaining amount of goods comprises the following steps: acquiring a video containing the goods to be identified and processing the video into a plurality of images; extracting feature points from each of the images and matching the feature points across images to determine each item of goods in the images; solving the spatial three-dimensional coordinates of each feature point to generate a point cloud; fusing the generated point clouds to eliminate redundant, duplicated points; restoring the fused point cloud into a three-dimensional scene to obtain a three-dimensional structure diagram of the scene including each item of goods; and calculating the remaining amount of the goods from the three-dimensional structure diagram and outputting the result. With this method, the remaining amount of goods can be identified automatically from video shot with a mobile phone, and because the amount is computed via three-dimensional scene restoration, the method is convenient to operate, accurate in its calculations, low in cost, and widely applicable.

Description

Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium
Technical Field
The invention relates to the field of computer technology, and in particular to a machine-vision-based method and device for automatically identifying the remaining amount of goods on sale, as well as electronic equipment and a computer-readable medium.
Background
During the sale of goods, the sell-through rate matters to both sellers and suppliers. In the prior art, goods are managed centrally through a goods-information management system, which requires that information be entered for every item. At present, supermarkets and large markets alike must employ many workers to inspect how goods are selling; after inspection, the workers manually enter key information such as the remaining amount of goods into the corresponding system. However, manual inspection of remaining goods is costly, and the resulting information is unreliable and quickly outdated. Moreover, this approach suits only supermarkets with long sales cycles and small sales volumes; it is difficult to implement and deploy with existing technology in wholesale markets or trading ports, where sales cycles are short and volumes are large.
By contrast, in a known method for automatically detecting how much food remains on dinner plates, a photo of the current state of a tabletop is captured at a fixed time interval and preprocessed; a food-remainder recognition model then performs image recognition on the preprocessed photo to determine the position coordinates of every plate on the tabletop and to grade the food remaining on each plate, using three grades: empty plate, small amount, and large amount.
Disclosure of Invention
Technical problem
However, that automatic detection method has the following shortcomings. It photographs the plates from directly above and calculates the two-dimensional area ratio of the food, ignoring the food's height; since plates that have actually been eaten from generally retain some food covering the bottom, this leads to misjudgment. In addition, because the method computes a two-dimensional area ratio without considering what the food is, and a dish typically contains several side or garnish items, calculating the food ratio without distinguishing what remains is inaccurate.
To address these problems, the invention provides a machine-vision-based method for identifying the remaining amount of goods in a wholesale market. The method can identify the remaining amounts of many kinds of goods from video shot with a mobile phone in a market scene, and it is convenient and fast to operate, accurate in its calculations, low in cost, and broadly applicable.
Problem solving scheme
According to an aspect of the present invention, there is provided an automatic recognition method of remaining amount of goods, including:
a cargo image acquisition step of acquiring a video including a cargo to be identified, and processing the video into a plurality of images;
a feature point extraction and matching step of extracting feature points from the plurality of images, respectively, and performing feature point matching between the plurality of images to find out the same feature points in different images;
a point cloud generating step of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
a point cloud fusion step of fusing the generated point clouds to eliminate redundant, repeated point cloud points;
A three-dimensional scene reduction step, namely performing three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of the scene including the goods to be identified; and
and a cargo allowance calculation step, namely calculating the allowance of the cargo to be identified according to the three-dimensional structure diagram and outputting a result.
Optionally, the method according to an aspect of the present invention further comprises:
a semantic segmentation step of collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images respectively by using the trained semantic segmentation network model to classify pixels belonging to the same object in the images into one category; and
and a feature point semantic fusion step, namely identifying semantic information of each feature point according to a semantic segmentation result after the feature point extraction and matching step, so that the three-dimensional structure diagram obtained in the three-dimensional scene reduction step comprises respective semantic information of different objects.
Optionally, the method according to an aspect of the present invention, wherein the training of the constructed semantic segmentation network model based on the training dataset specifically includes:
a training data set generation step of acquiring a plurality of pictures including a plurality of kinds of goods, and labeling the range and the boundary of the images of the various goods and the various kinds of objects included in the pictures to generate the training data set;
a semantic division network model building step, namely building the semantic division network model;
and training a semantic segmentation network model, namely training the semantic segmentation network model by using the generated training data set so that the semantic segmentation network model can classify each pixel in the input image.
Alternatively, the method according to an aspect of the invention,
the semantic segmentation network model consists of the lightweight convolutional network MobileNet as the backbone network and a feature pyramid network as the branch network.
Optionally, the method according to an aspect of the present invention further comprises:
and a characteristic point filtering step, wherein before the characteristic point semantic fusion step, unnecessary characteristic points are filtered according to the result of semantic segmentation.
Alternatively, the method according to an aspect of the invention,
in the cargo allowance calculation step, an image boundary of each cargo is extracted from the three-dimensional structure diagram, and the cargo allowance is calculated according to the size of the image boundary.
Alternatively, the method according to an aspect of the invention,
the image boundary of the goods is the minimum circumscribed cuboid of the goods in the three-dimensional structure diagram, and
and calculating the surplus of the goods according to the length, the width and the height of the minimum circumscribed cuboid.
According to another aspect of the present invention, there is provided an automatic recognition apparatus for remaining amount of goods, comprising:
the cargo image acquisition module is used for acquiring a video comprising cargos to be identified and processing the video into a plurality of images;
the characteristic point extracting and matching module is used for respectively extracting characteristic points of the plurality of images and matching the characteristic points among the plurality of images so as to find out the same characteristic points in different images;
the point cloud generating module is used for solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
the point cloud fusion module is used for fusing the generated point clouds to eliminate redundant and repeated point cloud points;
the three-dimensional scene reduction module is used for carrying out three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of a scene including the goods to be identified;
and the cargo allowance calculation module is used for calculating the allowance of the cargo to be identified according to the three-dimensional structure chart and outputting a result.
Optionally, the apparatus according to another aspect of the invention, further comprising
The semantic segmentation module is used for collecting image information including the goods to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images by using the trained semantic segmentation network model respectively so as to classify pixels belonging to the same object in the images into one category; and
and the feature point semantic fusion module is used for identifying semantic information of each feature point according to a semantic segmentation result after the feature points are extracted and matched, so that a three-dimensional structure chart finally obtained by the three-dimensional scene restoration module comprises respective semantic information of different objects.
According to still another aspect of the present invention, there is provided an electronic device having an automatic remaining cargo amount recognition function, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the automatic remaining cargo amount recognition method according to any one of the aspects of the present invention.
According to still another aspect of the present invention, there is provided a computer-readable medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the automatic remaining amount of goods identification method according to any one of the aspects of the present invention described above.
Advantageous effects of the invention
By using the automatic identification method for the remaining amount of the goods, the remaining amount and the position of various goods can be identified by shooting a market scene through a mobile phone, so that the automatic identification method for the remaining amount of the goods is convenient to operate, low in labor cost and wide in application range.
Moreover, the method according to the present invention calculates the remaining amount of the goods by three-dimensional scene reduction, taking into account the height information of the goods, and thus the calculation result is more accurate than that using a two-dimensional plane.
In addition, according to the method, the semantic information of the image is added while the three-dimensional scene is restored, so that the three-dimensional structure diagram obtained after the three-dimensional scene is restored contains the semantic information representing different objects, the surplus of various goods can be identified and calculated at the same time, and the application range is further expanded.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
fig. 1 is a flowchart illustrating main steps of an automatic recognition method of remaining amount of goods according to a preferred embodiment of the present invention;
FIG. 2 is a flow diagram illustrating training a semantic segmentation network model in accordance with a preferred embodiment of the present invention;
FIG. 3 is a flow chart showing the calculation of the remaining cargo amount fused with semantic information according to a preferred embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of three-dimensional scene restoration according to a preferred embodiment of the present invention;
fig. 5 is a block diagram illustrating an automatic recognition apparatus for remaining amount of goods according to a preferred embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are only a few of the presently preferred embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to automatically identify the remaining amount of goods by shooting videos through a mobile phone, the invention provides an automatic identifying method for the remaining amount of goods, which is characterized in that the remaining amount of goods is restored and calculated through a three-dimensional scene based on videos including goods shot through the mobile phone, and various goods can be identified by endowing semantic information to different objects in the restored three-dimensional scene. The automatic recognition method of the remaining amount of goods according to the present invention is described in detail as follows.
Fig. 1 is a flowchart illustrating an automatic recognition method of a remaining amount of goods according to a preferred embodiment of the present invention. As shown in fig. 1, the automatic identification method of the remaining amount of goods of the present invention includes: a cargo image acquisition step S1 of acquiring a video including the goods to be identified and processing the video into a plurality of images; a feature point extraction and matching step S2 of extracting feature points from each of the plurality of images and matching feature points between the images to determine each item of goods among them; a point cloud generation step S3 of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud; a point cloud fusion step S4 of fusing the generated point clouds to eliminate redundant, repeated points; a three-dimensional scene restoration step S5 of restoring the point cloud into a three-dimensional scene to obtain a three-dimensional structure diagram of the scene including each item of goods; and a cargo remaining amount calculation step S6 of calculating the remaining amount of the goods from the three-dimensional structure diagram and outputting the result. The respective steps are described in detail below.
Step S1: cargo image acquisition
For example, a video of the goods is shot in a wholesale market using a mobile phone, and the video is processed into multiple pictures using any known method to obtain images containing the goods.
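The patent does not prescribe a specific frame-extraction method. As a minimal sketch (the function names, the sampling interval, and the use of OpenCV are illustrative assumptions, not from the patent), one might sample one frame per fixed interval:

```python
def frame_indices(total_frames, fps, interval_s=0.5):
    """Indices of frames sampled roughly every `interval_s` seconds."""
    step = max(1, int(round(fps * interval_s)))
    return list(range(0, total_frames, step))

def video_to_images(path, interval_s=0.5):
    """Extract sampled frames from a video file (hypothetical helper)."""
    import cv2  # imported lazily; any frame-extraction tool (e.g. FFmpeg) would do
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    keep = set(frame_indices(int(cap.get(cv2.CAP_PROP_FRAME_COUNT)), fps, interval_s))
    images, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i in keep:
            images.append(frame)
        i += 1
    cap.release()
    return images
```

The sampling interval trades reconstruction quality against processing time: denser frames give more viewpoints for feature matching but more redundant point cloud points to fuse later.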
Step S2: feature point extraction and matching
For each picture, feature points are extracted using the well-known SIFT algorithm to obtain stable feature points that do not change with shooting position, angle, and so on; these stable feature points are then matched across pictures to determine where the same spatial feature point appears in each picture.
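SIFT matching is typically done by nearest-neighbour search over descriptor vectors with Lowe's ratio test. A self-contained sketch of that matching logic follows; the function name and the 0.75 ratio threshold are conventional illustrative assumptions, not values from the patent:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.

    desc_a, desc_b: (N, D) arrays of feature descriptors (e.g. 128-D SIFT).
    Returns (index_in_a, index_in_b) pairs whose best match is clearly
    better than the runner-up, i.e. likely the same spatial point.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        if len(dists) < 2:
            continue
        nearest, second = np.argsort(dists)[:2]
        # Reject ambiguous matches where two candidates are nearly as close.
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches
```

In practice a KD-tree or FLANN-based matcher replaces the brute-force loop, but the acceptance criterion is the same.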
Step S3: point cloud generation
A point cloud is a collection of massive numbers of points expressing the spatial distribution and surface characteristics of a target under a common spatial reference system; once the spatial coordinates of each sampled point on the object's surface have been obtained, this collection of points is the point cloud.
In this embodiment, the spatial three-dimensional coordinates of each feature point are obtained by triangulating the feature points with the least-squares method, generating point cloud data from the feature points contained in all images. Specifically, for a given feature point, an observation ray is formed from the observation position of each picture containing that feature point toward the feature point; each ray is transferred into the world coordinate system to obtain its ray equation there; the distance from a candidate point to each observation ray is then computed in the world coordinate system, and the spatial three-dimensional coordinates of the feature point are taken as the point whose total distance to all observation rays is smallest.
The transformation matrices between different images and the spatial three-dimensional coordinates of the feature points are solved as follows:
(Equation rendered as an image in the original publication.)
and solving the space three-dimensional coordinates of each characteristic point to obtain point cloud data.
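The equation itself is only available as an image in the published text, but the least-squares criterion described above — find the point minimising the summed squared distances to all observation rays — has a standard closed form: with ray origins o_j and unit directions d_j, solve Σ(I − d_j d_jᵀ)p = Σ(I − d_j d_jᵀ)o_j. A sketch under that assumption:

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares intersection of observation rays.

    origins: (M, 3) camera centres in world coordinates.
    directions: (M, 3) ray directions toward the feature point.
    Returns the 3-D point minimising the sum of squared distances to the rays.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(np.asarray(origins, float), np.asarray(directions, float)):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector onto the plane normal to d
        A += P
        b += P @ o
    return np.linalg.solve(A, b)
```

With two or more non-parallel rays the normal matrix A is invertible and the solution is unique; applying this per matched feature point yields the point cloud.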
Step S4: point cloud fusion
After the point cloud data is obtained, the well-known PCL (Point Cloud Library) is used to fuse the point clouds and eliminate redundant, repeated points, yielding a final set of point cloud data.
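The patent relies on PCL for this step. As an illustration of the underlying idea only (a simple voxel-grid merge, which is an assumption and not PCL's exact algorithm), near-duplicate points from overlapping views can be collapsed into per-cell centroids:

```python
import numpy as np

def fuse_point_clouds(clouds, voxel=0.01):
    """Merge several (N, 3) point clouds and drop duplicated points.

    Points falling into the same `voxel`-sized grid cell are collapsed
    into their centroid, removing redundant repeated points.
    """
    pts = np.vstack(clouds)
    keys = np.floor(pts / voxel).astype(np.int64)
    # Group points by voxel cell; average each group.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    counts = np.bincount(inverse)
    fused = np.zeros((counts.size, 3))
    for axis in range(3):
        fused[:, axis] = np.bincount(inverse, weights=pts[:, axis]) / counts
    return fused
```

The voxel size controls the trade-off between deduplication strength and geometric detail retained for the later scene restoration.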
Step S5: three-dimensional scene restoration
After the point cloud data is obtained, the point cloud is restored into a three-dimensional scene using the well-known ElasticFusion technique; the resulting three-dimensional structure diagram is shown, for example, in fig. 4.
Step S6: cargo allowance calculation step
The boundary of the goods is extracted from the obtained three-dimensional structure diagram, and the remaining amount of the goods is then calculated from that boundary. For example, as shown in fig. 4, the volume of the goods is calculated from the length, width, and height of their minimum circumscribed cuboid in the three-dimensional structure diagram; this volume is the remaining amount of the goods.
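As a minimal illustration, if the minimum circumscribed cuboid is approximated by an axis-aligned bounding box (a simplifying assumption; the true minimum cuboid may be rotated relative to the axes), the length × width × height volume computation reduces to:

```python
import numpy as np

def aabb_volume(points):
    """Volume of the axis-aligned bounding cuboid of a goods point cluster.

    points: (N, 3) coordinates of the points belonging to one item of goods.
    Returns length * width * height taken from the coordinate extents.
    """
    points = np.asarray(points, float)
    extents = points.max(axis=0) - points.min(axis=0)
    return float(np.prod(extents))
```

A tighter estimate would fit an oriented bounding box (e.g. via PCA of the cluster) before taking the product of the extents.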
The main flow of the automatic remaining cargo amount recognition according to the present embodiment is described above.
However, an actual scene usually contains more than one kind of goods, together with people, vehicles, background, and so on. In this case, in order to identify each item of goods and calculate its remaining amount, different kinds of goods must be distinguished from one another and from other objects such as people. The approach adopted in this embodiment is to perform semantic segmentation on the images and assign semantic information to each extracted feature point according to the segmentation result, so that the three-dimensional structure diagram obtained through point cloud generation, point cloud fusion, and three-dimensional scene restoration also carries the semantic information of the different objects, allowing multiple kinds of goods to be identified and their respective remaining amounts calculated. In this embodiment, the semantic segmentation is implemented with an artificial neural network.
An artificial neural network — neural network for short — is a model that abstracts the neuron network of the human brain from an information-processing perspective and connects simple units in different ways to form different networks. A neural network is a computational model composed of a large number of interconnected nodes (neurons). Each node represents a particular output function, called an activation function; each connection between two nodes carries a weighted value, called a weight, applied to the signal passing through it. The network's output depends on its connection pattern, weights, and activation functions, and the network itself usually approximates some algorithm or function found in nature. By abstracting data features layer by layer, the network ultimately extracts the features required to complete its task. In this embodiment, a semantic segmentation neural network model is built and trained to classify each pixel of the input goods image, so that the semantic information of each feature point can be identified from the segmentation result.
The building and training of the semantic segmentation network model is described in detail below with reference to fig. 2.
The semantic segmentation network model of this embodiment is divided into a backbone network and a branch network: the backbone adopts the well-known lightweight convolutional network MobileNetV3, which has a small computational cost, while the branch adopts the well-known feature pyramid network in order to handle objects (goods) of different sizes.
MobileNet is a model based on depthwise separable convolutions, designed mainly for mobile devices; it runs efficiently offline on devices such as mobile phones. MobileNetV3 introduces dilated (atrous) convolution to enlarge the receptive field of the feature maps, so that each convolution output covers a wider range of information. The feature pyramid network mainly addresses the multi-scale problem in object detection; through a simple change to the network connections, it greatly improves small-object detection with essentially no increase in the original model's computational cost.
In addition, after the feature pyramid network, the semantic segmentation network model resamples the feature maps of different sizes to a common resolution and concatenates them, applies a 3 × 3 convolution, and finally upsamples by a factor of 4 to obtain the final output layer. Here, a value f_i(c) ∈ [0, 1] indicates whether pixel i is assigned to class c, and the error vector m_i(c) is given by Equation 1.
(Equation 1; rendered as an image in the original publication.)
A Jaccard coefficient is constructed for class c (Equation 2; rendered as an image in the original publication), and from it a surrogate (substitute) loss is formed:
(Equation rendered as an image in the original publication.)
The segmentation loss is measured with the semantic segmentation evaluation metric mIoU, giving the final loss function as Equation 3:
(Equation 3; rendered as an image in the original publication.)
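The equations are only available as images in the published text. A common formulation consistent with the surrounding description — the Lovász-Softmax surrogate of the Jaccard loss — would read as follows; this is a plausible reconstruction under that assumption, not the patent's exact formulas:

```latex
% Eq. 1: pixel error vector; f_i(c) is the predicted score for class c at
% pixel i, and y_i the ground-truth label of pixel i.
m_i(c) =
\begin{cases}
1 - f_i(c), & c = y_i,\\
f_i(c), & c \neq y_i.
\end{cases}

% Eq. 2: Jaccard coefficient for class c.
J_c(y, \hat{y}) = \frac{\lvert \{y = c\} \cap \{\hat{y} = c\} \rvert}
                       {\lvert \{y = c\} \cup \{\hat{y} = c\} \rvert}

% Eq. 3: surrogate loss, the Lovász extension of the Jaccard loss
% Delta_{J_c} = 1 - J_c, averaged over the class set C (consistent with mIoU).
\mathrm{loss}(f) = \frac{1}{\lvert C \rvert} \sum_{c \in C}
  \overline{\Delta_{J_c}}\big(m(c)\big)
```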
the semantic segmentation network model is then implemented using the well-known open source artificial neural network library Keras.
After the model is built, the model can be trained to carry out semantic segmentation on the input image. The training procedure is as follows.
S201: image acquisition
A video of an agricultural-produce trading scene is shot in a wholesale market with a tool such as a mobile phone, and the video is processed into multiple images.
S202: image annotation
The captured video is converted into pictures using the well-known multimedia processing tool FFmpeg, and the well-known data-labeling software Labelme is then used to mark the range and boundaries of goods, people, vehicles, and backgrounds in each picture to generate the training data set.
S203: model training
The generated data set is divided into a training set and a test set, for example by randomly drawing 80% of the samples as the training set and the remaining 20% as the test set. The constructed semantic segmentation network model is then trained on the training set.
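The 80/20 split can be sketched as follows; this is a generic illustration with an assumed fixed seed, as the patent does not specify the splitting procedure:

```python
import random

def split_dataset(samples, train_frac=0.8, seed=42):
    """Randomly split labelled samples into training and test sets (e.g. 80/20)."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = samples[:]       # copy; leave the caller's list untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

A fixed seed matters here: model testing (S204) and optimization (S205) are only comparable across runs if the same images stay in the held-out test set.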
S204: model testing
The trained model is evaluated on the held-out test set.
S205: model optimization
The model is optimized according to its test results using any known method.
The building and training process of the semantic segmentation network model according to the embodiment is described above.
The following describes a flow of identifying and calculating the remaining amounts of a plurality of goods fused with semantic information by using the above semantic segmentation network model with reference to fig. 3.
Step S301: cargo image acquisition step
For example, a video including a plurality of different goods is taken in a wholesale market using a mobile phone, and the video is processed into a plurality of pictures using any known method to obtain a plurality of images including the different goods.
Step S302: semantic segmentation step
After collecting image information including goods to generate a training data set and building a semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on a plurality of images by using the trained semantic segmentation network model respectively so as to classify pixels belonging to the same object in the images into one category.
Step S303: characteristic point extraction and matching step
For each image, feature points are extracted using the well-known SIFT algorithm to obtain stable feature points that do not change with shooting position, angle, and so on; these stable feature points are then matched across pictures to determine where the same spatial feature point appears in each image.
Step S304: feature point semantic fusion step
After the feature points have been extracted and matched, the semantic information of each extracted feature point is identified from the semantic segmentation result, so that every object in the three-dimensional structure diagram obtained after the subsequent three-dimensional scene restoration carries its own semantic information.
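A sketch of this fusion step, assuming the segmentation result is a per-pixel label mask and using hypothetical names (the patent does not specify the data layout):

```python
def fuse_semantics(keypoints, seg_mask):
    """Attach a semantic label to each 2-D feature point.

    keypoints: list of (x, y) pixel coordinates from feature extraction.
    seg_mask:  2-D array/list of per-pixel class labels from the
               semantic segmentation model (row-major: seg_mask[y][x]).
    Returns (x, y, label) triples carried forward into the point cloud,
    so the restored 3-D scene retains each object's semantics.
    """
    labelled = []
    for x, y in keypoints:
        label = seg_mask[int(round(y))][int(round(x))]
        labelled.append((x, y, label))
    return labelled
```

Because each point cloud point originates from a feature point, the label survives point cloud generation and fusion, which is what lets the final structure diagram distinguish one kind of goods from another.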
Step S305: point cloud generation step
As in the process described in step S3 above, the spatial three-dimensional coordinates of each feature point are obtained by triangulating it with the least squares method, thereby yielding a point cloud for each image.
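The least-squares triangulation can be sketched as a two-view direct linear transform (DLT): given a feature point observed at pixel coordinates `x1` in camera `P1` and `x2` in camera `P2`, stack the projection constraints and take the SVD null vector. The 3x4 projection matrices are assumed known from the matching step; the toy cameras below are illustrative:

```python
import numpy as np

# Minimal least-squares (DLT) triangulation sketch for the point cloud
# generation step: recover a 3D point from two pixel observations.

def triangulate(P1, P2, x1, x2):
    """Solve A X = 0 in the least-squares sense via SVD and dehomogenize."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                  # null-space direction = homogeneous solution
    return X[:3] / X[3]

# Two toy cameras one unit apart; the true point is (1, 2, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = triangulate(P1, P2, (0.2, 0.4), (0.0, 0.4))
```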
Step S306: point cloud fusion step
After the point cloud data is obtained, the point clouds are fused using the well-known PCL (Point Cloud Library) to eliminate redundant, repeated point cloud points, producing a single final set of point cloud data.
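PCL itself is a C++ library; as a stand-in, the core idea of this fusion step can be sketched in NumPy: merge the overlapping per-image clouds and drop redundant points by keeping one point per cell of a coarse voxel grid. The voxel size is an illustrative parameter:

```python
import numpy as np

# NumPy stand-in for the point cloud fusion step: concatenate per-image
# clouds and deduplicate by voxel, keeping the first point in each cell.

def fuse_point_clouds(clouds, voxel=0.05):
    """Concatenate clouds (each an Nx3 array) and deduplicate per voxel."""
    points = np.vstack(clouds)
    keys = np.floor(points / voxel).astype(int)         # voxel index per point
    _, first = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first)]                       # one point per voxel

cloud_a = np.array([[0.00, 0.00, 0.00], [1.00, 1.00, 1.00]])
cloud_b = np.array([[0.01, 0.01, 0.01]])                # near-duplicate of cloud_a[0]
fused = fuse_point_clouds([cloud_a, cloud_b])
```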
Step S307: three-dimensional scene restoration
After the fused point cloud data is obtained, the scene is restored in three dimensions from the point cloud using the known ElasticFusion technique; the resulting three-dimensional structure diagram is shown, for example, in fig. 4.
Step S308: cargo allowance calculation step
The boundary of each cargo in the restored three-dimensional structure diagram is extracted, and the remaining amount of the cargo is then calculated from that boundary. For example, as shown in fig. 4, the volume of the cargo is calculated from the length, width, and height of its minimum circumscribed cuboid in the three-dimensional structure diagram; this volume is taken as the remaining amount of the cargo.
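The volume calculation above can be sketched by measuring the bounding cuboid of the points belonging to one cargo. For simplicity the sketch uses an axis-aligned box rather than the true minimum circumscribed cuboid (an assumed simplification); its length x width x height gives the volume used as the remaining amount:

```python
import numpy as np

# Sketch of the remaining-amount calculation: measure the axis-aligned
# bounding cuboid of a cargo's 3D points (a simplification of the
# minimum circumscribed cuboid named in the text).

def cargo_volume(points: np.ndarray) -> float:
    """Volume of the axis-aligned bounding cuboid of an Nx3 point set."""
    extents = points.max(axis=0) - points.min(axis=0)  # length, width, height
    return float(np.prod(extents))

# A toy stack of boxes spanning 2 x 1 x 0.5 metres.
corners = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 0.5], [1.0, 0.5, 0.2]])
volume = cargo_volume(corners)
```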
This completes the process of identifying and calculating the remaining amounts of multiple goods, fused with semantic information, using the semantic segmentation network model.
Note that steps S302 and S303 in the above process have no fixed execution order: step S302 may be performed first, step S303 may be performed first, or the two may be performed simultaneously.
In addition, after the images have been semantically segmented by the semantic segmentation network model, feature points in unneeded regions such as the background can be removed according to the segmentation result, reducing the processing load and shortening execution.
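This optional filtering can be sketched by looking up each keypoint's pixel in the per-pixel class map and dropping points in the background class. Treating class 0 as "background" is an assumption for illustration, not a label assignment from the patent:

```python
import numpy as np

# Sketch of the optional background filtering: drop feature points whose
# segmentation label is the background class (class 0 here is an assumed
# background label, chosen only for this illustration).

def filter_keypoints(keypoints, class_map, background=0):
    """Keep (row, col) keypoints whose segmentation label is not background."""
    return [(r, c) for r, c in keypoints if class_map[r, c] != background]

class_map = np.array([[0, 0, 1],
                      [0, 1, 1]])        # 1 = some cargo class
kps = [(0, 0), (0, 2), (1, 1)]
kept = filter_keypoints(kps, class_map)
```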
The automatic cargo remaining amount identification method according to the embodiment of the present invention has been described above in detail; it is further illustrated below with an example. Note that this example is merely illustrative.
In this example, a 10-minute video of an indoor stall in a wholesale market is collected; the goods sold at the stall are boxed Red Fuji apples. The video is converted into pictures with FFmpeg, yielding 1200 pictures, of which 1000 are extracted as the training set and 200 as the test set. A semantic segmentation network is built with Keras and trained on the training set for 200 epochs, at which point it reaches a fitted state. Testing on the test set gives an mIoU of 81.6. The model is converted to tflite format with TensorFlow Lite so that it can run on a mobile device such as a mobile phone. A cargo remaining amount calculation program is developed in the Java language and installed on an Android phone, completing the automatic identification of the remaining amount of goods at the stall selling Red Fuji apples.
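The example reports test quality as an mIoU score. How mean Intersection-over-Union is computed from a predicted class map and its ground-truth labels can be sketched in pure NumPy (the Keras model itself is not reproduced here):

```python
import numpy as np

# Sketch of the mIoU metric used to evaluate the segmentation model:
# per-class intersection over union, averaged over the classes present.

def mean_iou(pred: np.ndarray, truth: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        if union > 0:                       # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

truth = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, truth, num_classes=2)
```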
Fig. 5 is a block diagram illustrating an embodiment of the automatic cargo remaining amount recognition apparatus according to the present invention. The apparatus 500 shown in fig. 5 includes at least: a cargo image acquisition module 501, which acquires a video including the goods to be identified and processes the video into a plurality of images; a feature point extraction and matching module 502, which extracts feature points from each of the images and performs feature point matching between them to determine each item in the images; a point cloud generation module 503, which solves the spatial three-dimensional coordinates of each feature point to generate point clouds; a point cloud fusion module 504, which fuses the generated point clouds to eliminate redundant, repeated points; a three-dimensional scene restoration module 505, which restores the three-dimensional scene from the point cloud to obtain a three-dimensional structure diagram of the scene including the goods; and a cargo remaining amount calculation module 506, which calculates the remaining amount of the goods from the three-dimensional structure diagram and outputs the result.
In addition, in order to identify the remaining amounts of multiple kinds of goods simultaneously, the automatic cargo remaining amount identification apparatus according to the present invention may further include a semantic segmentation module and a feature point semantic fusion module.
In the semantic segmentation module, after image information including the goods is collected to generate a training data set and the semantic segmentation network model is built, the model is trained on the training data set; the trained model then performs semantic segmentation on each of the plurality of images, classifying pixels that belong to the same object in an image into one category.
In the feature point semantic fusion module, after the feature points are extracted and matched, the semantic information of each feature point is identified from the semantic segmentation result, so that the three-dimensional structure diagram finally produced by the three-dimensional scene restoration module contains the semantic information of each different object.
As another aspect, the present invention also provides an electronic device having an automatic cargo remaining amount recognition function, the electronic device including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the automatic cargo remaining amount recognition method described above.
As still another aspect, the present invention also provides a computer-readable medium carrying one or more programs which, when executed by an apparatus, cause the apparatus to perform the steps of the automatic cargo remaining amount recognition method.
According to the technical scheme of the embodiment of the invention, the following effects are obtained.
With the automatic cargo remaining amount identification method, the remaining amounts and positions of multiple kinds of goods can be identified simply by shooting the market scene with a mobile phone, so the method is convenient to operate, low in labor cost, and widely applicable.
Moreover, because the method according to the present invention calculates the remaining amount of goods through three-dimensional scene restoration, it takes the height of the goods into account, and the result is therefore more accurate than a calculation based on a two-dimensional plane.
In addition, according to the method, semantic information from the images is added while the three-dimensional scene is restored, so that the resulting three-dimensional structure diagram contains semantic information distinguishing the different objects; the remaining amounts of multiple kinds of goods can thus be identified and calculated simultaneously, further expanding the range of application.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method for automatically identifying the remaining amount of goods is characterized by comprising the following steps:
a cargo image acquisition step of acquiring a video including a cargo to be identified, and processing the video into a plurality of images;
a feature point extraction and matching step of extracting feature points from the plurality of images, respectively, and performing feature point matching between the plurality of images to find out the same feature points in different images;
a point cloud generating step of solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
a point cloud fusion step, namely fusing the generated point clouds to eliminate redundant repeated point cloud points;
a three-dimensional scene reduction step, namely performing three-dimensional scene reduction on the point cloud after fusion to obtain a three-dimensional structure diagram of the scene including the goods to be identified; and
and a cargo allowance calculation step, namely calculating the allowance of the cargo to be identified according to the three-dimensional structure diagram and outputting a result.
2. The method of claim 1, further comprising:
a semantic segmentation step of collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, training the built semantic segmentation network model based on the training data set, and performing semantic segmentation on the plurality of images respectively by using the trained semantic segmentation network model to classify pixels belonging to the same object in the images into one category; and
and a feature point semantic fusion step, namely identifying semantic information of each feature point according to a semantic segmentation result after the feature point extraction and matching step, so that the three-dimensional structure diagram obtained in the three-dimensional scene restoration step later comprises respective semantic information of different objects.
3. The method of claim 2,
the semantic segmentation network model trained and built based on the training data set specifically comprises:
a training data set generation step of acquiring a plurality of pictures including a plurality of kinds of goods, and labeling the range and the boundary of the images of the various kinds of goods and the various kinds of objects included in the pictures to generate the training data set;
a semantic division network model building step, namely building the semantic division network model; and
and training a semantic segmentation network model, namely training the semantic segmentation network model by using the generated training data set so that the semantic segmentation network model can classify each pixel in the input image.
4. The method according to claim 2 or 3,
the semantic segmentation network model is composed of a lightweight convolution network MobileNet serving as a main network and a characteristic pyramid network serving as a branch network.
5. The method of claim 2 or 3, further comprising:
and a characteristic point filtering step, wherein before the characteristic point semantic fusion step, unnecessary characteristic points are filtered according to the result of semantic segmentation.
6. The method according to claim 1 or 2,
in the step of calculating the remaining amount of the goods, the image boundary of the goods to be identified is extracted from the three-dimensional structure chart, and the remaining amount of the goods is calculated according to the size of the image boundary.
7. The method of claim 6,
the image boundary of the goods to be identified is the minimum circumscribed cuboid of the goods in the three-dimensional structure diagram, and
and calculating the surplus of the goods according to the length, the width and the height of the minimum circumscribed cuboid.
8. An automatic recognition device for remaining amount of goods, comprising:
the cargo image acquisition module is used for acquiring a video comprising cargos to be identified and processing the video into a plurality of images;
a feature point extraction and matching module which extracts feature points of the plurality of images respectively and performs feature point matching between the plurality of images to determine each cargo in the plurality of images;
the point cloud generating module is used for solving the spatial three-dimensional coordinates of each feature point to generate a point cloud;
the point cloud fusion module is used for fusing the generated point clouds to eliminate redundant and repeated point cloud points;
the three-dimensional scene reduction module is used for carrying out three-dimensional scene reduction on the point cloud to obtain a three-dimensional structure chart of a scene including the goods to be identified; and
and the cargo allowance calculation module is used for calculating the allowance of the cargo to be identified according to the three-dimensional structure chart and outputting a result.
9. The apparatus of claim 8, further comprising:
the semantic segmentation module is used for collecting image information including the goods to be recognized to generate a training data set and building the semantic segmentation network model, the built semantic segmentation network model is trained on the basis of the training data set, and the trained semantic segmentation network model is used for performing semantic segmentation on the plurality of images respectively so as to classify pixels belonging to the same object in the images into one category; and
and the feature point semantic fusion module is used for identifying semantic information of each feature point according to a semantic segmentation result after the feature points are extracted and matched, so that a three-dimensional structure chart finally obtained by the three-dimensional scene restoration module comprises respective semantic information of different objects.
10. An electronic device having a cargo remaining amount automatic recognition function, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110214835.7A 2021-01-25 2021-02-25 Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium Pending CN113160414A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2021100958190 2021-01-25
CN202110095819 2021-01-25

Publications (1)

Publication Number Publication Date
CN113160414A true CN113160414A (en) 2021-07-23

Family

ID=76883496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110214835.7A Pending CN113160414A (en) 2021-01-25 2021-02-25 Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN113160414A (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107578052A (en) * 2017-09-15 2018-01-12 北京京东尚科信息技术有限公司 Kinds of goods processing method and system
CN108895981A (en) * 2018-05-29 2018-11-27 南京怀萃智能科技有限公司 A kind of method for three-dimensional measurement, device, server and storage medium
CN109035579A (en) * 2018-06-29 2018-12-18 深圳和而泰数据资源与云技术有限公司 A kind of commodity recognition method, self-service machine and computer readable storage medium
CN110120010A (en) * 2019-04-12 2019-08-13 嘉兴恒创电力集团有限公司博创物资分公司 A kind of stereo storage rack vision checking method and system based on camera image splicing
CN110223297A (en) * 2019-04-16 2019-09-10 广东康云科技有限公司 Segmentation and recognition methods, system and storage medium based on scanning point cloud data
CN111340873A (en) * 2020-02-28 2020-06-26 广东工业大学 Method for measuring and calculating object minimum outer envelope size of multi-view image
CN111489358A (en) * 2020-03-18 2020-08-04 华中科技大学 Three-dimensional point cloud semantic segmentation method based on deep learning
CN112132523A (en) * 2020-11-26 2020-12-25 支付宝(杭州)信息技术有限公司 Method, system and device for determining quantity of goods


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205300A (en) * 2022-09-19 2022-10-18 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion
CN115205300B (en) * 2022-09-19 2022-12-09 华东交通大学 Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Similar Documents

Publication Publication Date Title
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN108154105B (en) Underwater biological detection and identification method and device, server and terminal equipment
CN109086811B (en) Multi-label image classification method and device and electronic equipment
JP6397379B2 (en) CHANGE AREA DETECTION DEVICE, METHOD, AND PROGRAM
CN109754009B (en) Article identification method, article identification device, vending system and storage medium
CN112132213A (en) Sample image processing method and device, electronic equipment and storage medium
CN111061890A (en) Method for verifying labeling information, method and device for determining category
CN108230395A (en) Stereoscopic image is calibrated and image processing method, device, storage medium and electronic equipment
CN114667540A (en) Article identification and tracking system
CN108229375B (en) Method and device for detecting face image
CN112307864A (en) Method and device for determining target object and man-machine interaction system
CN115797736B (en) Training method, device, equipment and medium for target detection model and target detection method, device, equipment and medium
CN114219855A (en) Point cloud normal vector estimation method and device, computer equipment and storage medium
CN114565916A (en) Target detection model training method, target detection method and electronic equipment
CN113705669A (en) Data matching method and device, electronic equipment and storage medium
CN110321867B (en) Shielded target detection method based on component constraint network
CN115752683A (en) Weight estimation method, system and terminal based on depth camera
CN114358133B (en) Method for detecting looped frames based on semantic-assisted binocular vision SLAM
CN111160450A (en) Fruit and vegetable weighing method based on neural network, storage medium and device
CN114255377A (en) Differential commodity detection and classification method for intelligent container
CN114169425A (en) Training target tracking model and target tracking method and device
CN113160414A (en) Automatic identification method and device for remaining amount of goods, electronic equipment and computer readable medium
US11790642B2 (en) Method for determining a type and a state of an object of interest
CN106934339B (en) Target tracking and tracking target identification feature extraction method and device
CN112819953B (en) Three-dimensional reconstruction method, network model training method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination