CN111160261A - Sample image labeling method and device for automatic sales counter and storage medium - Google Patents
Info
- Publication number
- CN111160261A (application number CN201911399456.9A)
- Authority
- CN
- China
- Prior art keywords
- simulated
- article
- sample image
- image
- sales counter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/64—Scenes; scene-specific elements; type of objects; three-dimensional objects
- G06T7/10—Image analysis; segmentation; edge detection
- G06T7/55—Image analysis; depth or shape recovery from multiple images
- G06V10/267—Image preprocessing; segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V20/10—Scenes; terrestrial scenes
- G06T2207/10016—Image acquisition modality; video; image sequence
- G06T2207/10024—Image acquisition modality; color image
Abstract
The disclosure provides a sample image labeling method and device for an automatic sales counter, and a storage medium. The method comprises: obtaining at least one simulated article, where the simulated article is a three-dimensional model constructed by sampling a physical article; arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image; and determining a semantic segmentation image corresponding to the sample image and labeling each simulated article corresponding to a feature region in the semantic segmentation image, to obtain a labeled sample image. With this scheme, a computer device can automatically acquire and label the relevant data by constructing the virtual scene and virtual articles, improving labeling efficiency while ensuring labeling accuracy.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for labeling a sample image for an automatic sales counter, and a storage medium.
Background
With the continuing development of artificial intelligence, automatic identification technology can be applied to the automatic sales counter, also known as the intelligent cabinet.
In the related art, implementing an intelligent cabinet with artificial intelligence requires training a machine learning model to automatically and accurately identify the commodities inside the cabinet. Obtaining an accurate machine learning model requires collecting and labeling a large amount of data, and this collection and labeling is performed manually.
However, in the related-art solutions, enabling the intelligent cabinet to identify commodities more accurately requires training on as much labeled data as possible, and manual collection and labeling consume a great deal of time, resulting in low efficiency of collecting labeled data.
Disclosure of Invention
The disclosure provides a sample image labeling method and device for an automatic sales counter and a storage medium.
The technical scheme is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a sample image labeling method for an automatic sales counter, the method comprising:
obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a solid article;
arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article shot by a virtual camera under different arrangement schemes, and the virtual scene is a simulated entity scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter;
and marking each simulated article corresponding to the characteristic region in the semantic segmentation image by determining the semantic segmentation image corresponding to the sample image to obtain the marked sample image, wherein the characteristic region is used for determining the information of the simulated article.
Optionally, when the arrangement scheme is arranged in columns, the obtaining at least one sample image by arranging the at least one simulated article in the virtual scene according to the arrangement scheme includes:
arranging the obtained at least one simulated article in the virtual scene according to the columns;
in response to the simulated articles in the column being moved in a fixed order, acquiring, by the virtual camera, an image after each movement as the sample image.
Optionally, when the arrangement scheme is random placement, the obtaining at least one sample image by arranging the at least one simulated article in the virtual scene according to the arrangement scheme includes:
randomly arranging the obtained at least one simulated article in the virtual scene;
and acquiring randomly arranged images as the sample images through the virtual camera.
Optionally, the acquiring at least one simulated item includes:
acquiring video samples of the entity article corresponding to the simulated article, wherein the video samples are used for displaying images of the entity article from three angles;
obtaining modeling information of each solid object according to the video samples of the solid objects, wherein the modeling information is used for constructing a three-dimensional model of the solid object;
and constructing a simulated article corresponding to the entity article according to the modeling information.
Optionally, the determining the semantic segmentation image corresponding to the sample image and labeling each simulated article corresponding to the feature region in the semantic segmentation image to obtain the labeled sample image, where the feature region is used to determine simulated article information, includes:
determining a semantic segmentation image corresponding to the sample image through a ray tracing algorithm;
acquiring an initial surrounding frame of each simulated article according to different pixel values of each simulated article;
determining a compact bounding box of each simulated article through a heuristic algorithm based on the initial bounding box, wherein the compact bounding box contains an identifiable characteristic region of the corresponding simulated article, and the identifiable characteristic region is a part of the characteristic region which is not blocked in the image;
marking the simulated articles in each compact enclosing frame according to the identifiable characteristic region;
and acquiring the sample image after each mark.
Optionally, before the acquiring at least one simulated article, the method further includes:
constructing the virtual scene according to the external factors and the internal factors of the corresponding actual scene in the automatic sales counter; the external factors are factors representing the external dimensions of the automatic sales counter, and the internal factors are factors representing the parameters of the simulated camera and the light environment inside the automatic sales counter.
Optionally, the method further includes:
and training a detection model through the marked sample image, wherein the detection model is used for identifying at least one of the type information, the name information and the position information of each entity article in the automatic sales counter.
According to a second aspect of the embodiments of the present disclosure, there is provided a sample image annotation apparatus for an automatic sales counter, the apparatus comprising:
the system comprises an article acquisition module, a simulation module and a display module, wherein the article acquisition module is used for acquiring at least one simulated article, and the simulated article is a three-dimensional model constructed by sampling a solid article;
the sample image acquisition module is used for arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article shot by a virtual camera under different arrangement schemes, and the virtual scene is a simulated entity scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter;
and the marked image acquisition module is used for marking each simulated article corresponding to the characteristic area in the semantic segmentation image by determining the semantic segmentation image corresponding to the sample image to obtain the marked sample image, wherein the characteristic area is used for determining the information of the simulated article.
Optionally, when the arrangement scheme is arrangement in columns, the sample image obtaining module includes:
the first arrangement sub-module, configured to arrange the acquired at least one simulated article in the virtual scene in columns;
the first sample image acquiring sub-module, configured to, in response to the simulated articles in the column being moved in a fixed order, acquire, by the virtual camera, an image after each movement as the sample image.
Optionally, when the arrangement scheme is randomly arranged, the sample image obtaining module includes:
the second arrangement submodule is used for randomly arranging the obtained at least one simulated article in the virtual scene;
and the second sample image acquisition sub-module is used for acquiring randomly arranged images as the sample images through the virtual camera.
Optionally, the article obtaining module includes:
the sampling acquisition sub-module is used for acquiring video samples of the entity article corresponding to the simulation article, and the video samples are used for displaying images of the entity article from three angles;
the information acquisition sub-module is used for acquiring modeling information of the entity article according to the video sampling of each entity article, wherein the modeling information is used for constructing a three-dimensional model of the entity article;
and the article construction sub-module is used for constructing a simulated article corresponding to the entity article according to the modeling information.
Optionally, the annotated image obtaining module includes:
the segmented image determining submodule is used for determining a semantic segmented image corresponding to the sample image through a ray tracing algorithm;
the initial frame acquisition sub-module is used for acquiring an initial surrounding frame of each simulated article according to different pixel values of each simulated article;
a compact frame determination sub-module, configured to determine, through a heuristic algorithm, a compact bounding frame of each simulated article based on the initial bounding frame, where the compact bounding frame includes an identifiable feature region of the corresponding simulated article, and the identifiable feature region is a feature region portion that is not occluded in the image;
the article labeling sub-module is used for labeling the simulated articles in the compact surrounding frames according to the identifiable characteristic areas;
and the labeled image acquisition submodule is used for acquiring each labeled sample image.
Optionally, the apparatus further comprises:
the scene building module is used for building the virtual scene according to the external factors and the internal factors of the corresponding actual scene in the automatic sales counter before the at least one simulated article is obtained; the external factors are factors representing the external dimensions of the automatic sales counter, and the internal factors are factors representing the parameters of the simulated camera and the light environment inside the automatic sales counter.
Optionally, the apparatus further comprises:
and the model training module is used for training a detection model through the marked sample image, and the detection model is used for identifying at least one of the type information, the name information and the position information of each entity article in the automatic sales counter.
According to a third aspect of the embodiments of the present disclosure, there is provided a sample image labeling apparatus for an automatic sales counter, the apparatus comprising:
a processor;
a memory for storing executable instructions of the processor;
wherein the processor is configured to:
obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a solid article;
arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article shot by a virtual camera under different arrangement schemes, and the virtual scene is a simulated entity scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter;
and marking each simulated article corresponding to the characteristic region in the semantic segmentation image by determining the semantic segmentation image corresponding to the sample image to obtain the marked sample image, wherein the characteristic region is used for determining the information of the simulated article.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium containing executable instructions that are invoked and executed by a processor to implement the sample image labeling method for an automatic sales counter according to the first aspect or any one of the optional solutions of the first aspect.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the method comprises the steps of obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling solid articles, arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image obtained by shooting each simulated article by a virtual camera under different arrangement schemes, the virtual scene is a simulated solid scene constructed by three-dimensional modeling of the interior of an automatic sales counter, the virtual camera corresponds to the solid cameras in the automatic sales counter, and finally marking each simulated article corresponding to a characteristic region in a semantic segmentation image by determining the semantic segmentation image corresponding to the sample image to obtain the marked sample image. By the scheme, the computer equipment can automatically acquire and label the relevant data by constructing the virtual scene and the virtual article, so that the efficiency of labeling the data is improved on the premise of ensuring the accuracy of data labeling.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a schematic diagram illustrating a sample image labeling process for an automatic sales counter, according to an exemplary embodiment;
FIG. 2 is a flow chart illustrating a sample image labeling method for an automatic sales counter, according to an exemplary embodiment;
FIG. 3 is a flow chart illustrating a sample image labeling method for an automatic sales counter, according to another exemplary embodiment;
FIG. 4 is a schematic diagram of a sample image of simulated articles in a column arrangement according to the embodiment shown in FIG. 3;
FIG. 5 is a schematic diagram of a sample image of randomly arranged simulated articles according to the embodiment shown in FIG. 3;
FIG. 6 is a schematic diagram of an automatic labeling process according to the embodiment shown in FIG. 3;
FIG. 7 is a block diagram illustrating a sample image labeling apparatus for an automatic sales counter, according to an exemplary embodiment;
FIG. 8 is a schematic diagram illustrating the structure of a computer device, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It is to be understood that reference herein to "a number" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the preceding and following associated objects.
For convenience of understanding, terms referred to in the embodiments of the present disclosure are explained below.
1) Artificial intelligence
Artificial intelligence is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, aiming to give machines the capabilities of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
2) Semantically segmented images
A semantic segmentation image is an image whose content a computer device has automatically segmented and identified, dividing the image into regions according to the semantics of the objects in it.
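As a minimal illustration (not taken from the patent), a semantic segmentation image can be thought of as a grid in which every pixel holds an instance identifier; the grid values and helper below are purely hypothetical:

```python
# Illustrative segmentation mask: each cell stores an instance id.
# 0 = background; 1 and 2 are two simulated articles.
seg_mask = [
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 2, 2, 2],
    [0, 0, 2, 2, 2],
]

def instance_ids(mask):
    """Collect the distinct non-background instance ids present in a mask."""
    return sorted({v for row in mask for v in row if v != 0})

print(instance_ids(seg_mask))  # -> [1, 2]
```

Because every pixel of an article carries that article's id, regions belonging to different simulated articles can be separated simply by their pixel values, which is what the labeling step later relies on.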
The sample image labeling method for an automatic sales counter provided by the embodiments of the present disclosure can be used in a computer device with strong data processing capability, such as a personal computer or a server. In a possible implementation, the computer device constructs a simulated virtual scene and virtual commodities from data of the actual scene and commodities; with the sample image labeling method for an automatic sales counter provided by the embodiments of the present disclosure, the computer device can automatically acquire and label image data while ensuring the quantity and quality of the labeled image data.
FIG. 1 is a schematic diagram illustrating a sample image labeling process for an automatic sales counter, according to an exemplary embodiment. As shown in FIG. 1, the sample image labeling method for the automatic sales counter comprises the following steps:
in step 110, a complete simulation intelligent cabinet scene is constructed.
In a possible implementation, an existing rendering engine is first used to build an indoor or outdoor environment, and the intelligent cabinet is modeled according to the dimensions of the actual intelligent cabinet. A simulated camera is then added at a suitable position in the simulation model of the cabinet; the simulated camera may be a fisheye camera mimicking the one in the actual intelligent cabinet, reproducing the fisheye effect, and its resolution and capture range are adjusted to match those of the actual camera. Finally, simulated light for different times of day and different weather conditions is added to the constructed scene according to actual lighting changes.
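The scene-construction parameters described in step 110 can be sketched as a configuration object. This is a hypothetical sketch: the field names and example values below are illustrative, not taken from the patent or any particular rendering engine.

```python
# Hypothetical configuration for the simulated intelligent-cabinet scene.
from dataclasses import dataclass, field

@dataclass
class SimCamera:
    model: str                 # e.g. "fisheye", matching the real cabinet camera
    resolution: tuple          # matched to the physical camera's resolution
    fov_deg: float             # capture range of the simulated lens

@dataclass
class SimScene:
    cabinet_size_mm: tuple     # measured from the actual intelligent cabinet
    camera: SimCamera
    light_presets: list = field(default_factory=list)  # light at different times/weather

scene = SimScene(
    cabinet_size_mm=(600, 500, 1900),
    camera=SimCamera(model="fisheye", resolution=(1280, 720), fov_deg=180.0),
    light_presets=["noon_sunny", "dusk_cloudy", "night_indoor"],
)
print(scene.camera.model)  # fisheye
```

Keeping the camera and lighting as explicit parameters mirrors the patent's point that the simulated camera and light must be adjusted to match the physical cabinet.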
In step 120, the commodity is three-dimensionally modeled.
In one possible implementation, a real modeling environment is first arranged. The environment includes: lighting that remains uniformly bright; a pure white, non-reflective wall or cloth as the background; greaseproof (butter) paper used to diffuse the light; two fill lights for supplementary lighting; and an electric turntable carrying the commodity so that the object rotates automatically.
Then, a video is shot with a camera of at least 720p resolution; the commodity is filmed from three angles (eye level, a 30-degree downward view, and a 60-degree downward view) for 20 seconds each, giving a one-minute video sample in total.
Finally, a number of pictures (for example, 180) sampled from the video are processed with an existing algorithm through iterative feature-point matching, camera registration, point-cloud generation, mesh generation, and texture mapping, with optimization. After modeling, a small amount of redundant geometry is removed from the resulting model, such as the small patch of ground (i.e., the plane of the electric turntable) that may be included at the bottom of the model. The overall process for this step takes approximately 15 minutes.
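The reconstruction stages named above can be shown schematically. The stage names below stand in for whatever existing photogrammetry algorithm is used; they are placeholders, not real library calls:

```python
# Schematic of the step-120 reconstruction pipeline:
# feature matching -> camera registration -> point cloud -> mesh -> texture.
PIPELINE = [
    "match_feature_points",
    "register_cameras",
    "generate_point_cloud",
    "generate_mesh",
    "map_texture",
]

def run_pipeline(frames, stages=PIPELINE):
    """Thread the sampled video frames through each stage in order.

    Each stage here only records that it ran; a real implementation would
    perform the corresponding photogrammetry computation.
    """
    artifact = {"frames": frames, "stages_done": []}
    for stage in stages:
        artifact["stages_done"].append(stage)  # placeholder for real work
    return artifact

model = run_pipeline(frames=list(range(180)))  # ~180 pictures sampled from video
print(model["stages_done"][-1])  # map_texture
```

The fixed stage order reflects the patent's description: texture mapping cannot run before a mesh exists, and the mesh depends on the registered cameras and point cloud.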
In step 130, the picture sample data is automatically labeled.
In one possible implementation, a real placement scheme is first simulated. The commodities to be placed may be the same commodity or a combination of different commodities. Placement falls into two main types, which together form the full data set. The first is column-wise placement, moving a single article at a time. For example, for a column with a capacity of 8 colas, 6 colas are prepared and initially all placed at the outer side of the intelligent cabinet; a first image is taken, then the current innermost cola is moved toward the inner side of the cabinet and a second image is taken, and so on, moving commodities from the outer side of the cabinet to the inner side. After one column is finished, the next column is processed. The second is random placement, in which commodities are placed with a random number and random positions. Each placement obtained by the two schemes can also be perturbed to enlarge the variation range of the collected pictures, for example by moving the position of the simulated fisheye camera within a small range, or randomly varying the illumination intensity, the light source position, and the color temperature.
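The two placement schemes can be sketched as placement generators. This is a minimal sketch under assumed conventions (slots numbered from the outer side inward); the function names and slot counts are illustrative:

```python
# Sketch of the two step-130 placement schemes.
import random

def column_placements(n_items, n_slots):
    """Yield successive column states as the group shifts inward one slot
    at a time; state i occupies slots i .. i+n_items-1."""
    return [list(range(offset, offset + n_items))
            for offset in range(n_slots - n_items + 1)]

def random_placement(n_slots, n_items, seed=None):
    """Pick distinct slots at random for a random-placement sample."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_slots), n_items))

# 6 colas in a column with capacity 8 -> 3 distinct states to photograph.
states = column_placements(n_items=6, n_slots=8)
print(len(states))  # 3
```

One image would be rendered per state, and perturbations (camera jitter, illumination, color temperature) applied on top of each state to widen the data distribution.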
Then, a ray tracing algorithm is used to obtain the semantic segmentation image corresponding to the image. Each instance is distinguished by the difference in its pixel values in the semantic segmentation image, and an initial maximal bounding box of each instance is obtained: a rectangular box enclosing all pixels of the article, which may, however, contain a large proportion of other commodities or background. Then, using the characteristic that an image shot by a fisheye camera diverges from the center toward the periphery as a heuristic, a heuristic algorithm is applied to obtain a compact box for each commodity. The compact box is required, to the greatest extent possible, to contain no other commodities or background, while containing the unoccluded important feature regions of the current commodity.
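The first half of this labeling step, deriving the initial maximal bounding box of one instance from its pixel values, can be sketched directly; the fisheye compact-box refinement is omitted here since it depends on the specific heuristic. This sketch is illustrative, not the patent's implementation:

```python
# Derive the initial maximal bounding box of an instance from a
# segmentation mask, where each pixel stores an instance id.
def initial_bbox(mask, instance_id):
    """Return (min_row, min_col, max_row, max_col) enclosing all pixels
    whose value equals instance_id, or None if the instance is absent."""
    rows = [r for r, row in enumerate(mask) for v in row if v == instance_id]
    cols = [c for row in mask for c, v in enumerate(row) if v == instance_id]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))

mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(initial_bbox(mask, 1))  # (0, 1, 1, 3)
```

As the patent notes, such a box is only a starting point: it encloses every pixel of the instance but may also span occluding commodities or background, which is why a compact box is computed afterward.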
Finally, a detection model is trained on the automatically acquired and labeled image data, enabling the intelligent cabinet to identify commodities automatically.
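Before training, each labeled sample must be packaged into a training record. The record layout below is a hedged sketch, with hypothetical field names; the patent does not specify a storage format:

```python
# Hypothetical packaging of auto-labeled samples into detection-training
# records; "label" carries the commodity identity, "bbox" the compact box.
def to_training_record(image_id, boxes):
    """boxes: list of (label, (min_row, min_col, max_row, max_col)) tuples."""
    return {
        "image": image_id,
        "annotations": [{"label": lbl, "bbox": bbox} for lbl, bbox in boxes],
    }

rec = to_training_record("sample_0001.png", [("cola", (10, 20, 90, 60))])
print(len(rec["annotations"]))  # 1
```

A detection model trained on such records can then report the type, name, and position information mentioned in the claims.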
FIG. 2 is a flow chart illustrating a sample image labeling method for an automatic sales counter, according to an exemplary embodiment. The method can be applied to a computer device, which obtains image data for labeling by constructing simulated articles and a virtual scene. As shown in FIG. 2, the sample image labeling method for the automatic sales counter may include the following steps:
in step 201, at least one simulated item is obtained, wherein the simulated item is a three-dimensional model constructed by sampling a physical item.
In step 202, at least one sample image is obtained by arranging the at least one simulated article in a virtual scene according to an arrangement scheme, where the sample image is an image of each simulated article shot by a virtual camera under a different arrangement scheme, and the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter.
In step 203, a semantic segmentation image corresponding to the sample image is determined, and each simulated article corresponding to a feature region in the semantic segmentation image is labeled to obtain the labeled sample image, where the feature region is used to determine simulated article information.
To sum up, in the sample image labeling method for an automatic sales counter provided in the embodiments of the present disclosure, at least one simulated article is obtained, where the simulated article is a three-dimensional model constructed by sampling a physical article. The at least one simulated article is then arranged in a virtual scene according to an arrangement scheme to obtain at least one sample image, where the sample image is an image of each simulated article shot by a virtual camera under a given arrangement scheme, the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter, and the virtual camera corresponds to the physical camera in the automatic sales counter. Finally, the semantic segmentation image corresponding to the sample image is determined, and each simulated article corresponding to a feature region in the semantic segmentation image is labeled to obtain a labeled sample image, where the feature region is used to determine simulated article information. With this scheme, the computer equipment can automatically acquire and label the relevant data by constructing the virtual scene and virtual articles, improving labeling efficiency while ensuring labeling accuracy.
FIG. 3 is a flowchart illustrating a sample image annotation process for automated sales containers according to another exemplary embodiment. The sample image annotation method for the automatic sales counter can be applied to computer equipment, and image data for annotation is obtained by constructing simulated articles and virtual scenes on the computer equipment. As shown in fig. 3, the sample image annotation method for the automatic sales counter may include the steps of:
in step 301, the computer device constructs the virtual scene according to the external factors and the internal factors of the corresponding actual scene in the vending cabinet.
In the embodiment of the present disclosure, the computer device may adopt a rendering engine to construct a completely simulated scene of the automatic sales counter according to external factors and internal factors of a corresponding actual scene in the automatic sales counter.
The external factors of the actual scene are factors characterizing the external dimensions of the automatic sales counter, and the internal factors of the actual scene are factors characterizing the parameters of the simulated camera and the light environment inside the automatic sales counter. The rendering engine has a three-dimensional image simulation function.
For example, the external factors of the actual scene may be the dimensions of the automatic sales counter, and the internal factors may be the resolution and capture range of the actual camera that the simulated camera reproduces, together with a simulation of the current actual lighting conditions.
The intensity and incident angle of the light differ with the current time of day, and the lighting also varies with weather conditions.
In addition, the simulated camera may be a fisheye camera. A fisheye camera produces a fisheye effect when shooting images: it enlarges the viewing angle, making it possible to capture a wide scene at a short distance.
For example, the computer device may collect the dimensions of the automatic sales counter in the actual scene and model the counter to its actual size by inputting those dimensions into the simulation modeling software; it may also collect the lighting conditions in the actual scene in real time for real-time simulation. The simulated camera photographs the inside of the simulated automatic sales counter by reproducing the shooting process of the actual camera.
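For illustration, the external and internal factors could be collected into one configuration object before being handed to the rendering engine. The field names, units, and default values below are assumptions for the sketch, not part of the disclosure:

```python
from dataclasses import dataclass

@dataclass
class VirtualSceneConfig:
    """Illustrative container for the external and internal scene factors."""
    # External factor: physical dimensions of the sales counter (millimeters).
    cabinet_size_mm: tuple = (600.0, 500.0, 800.0)  # width, depth, height
    # Internal factors: simulated camera parameters and lighting.
    camera_resolution: tuple = (1280, 720)
    camera_fov_deg: float = 180.0        # fisheye-style wide field of view
    light_intensity_lux: float = 300.0   # varies with simulated time of day
    light_color_temp_k: float = 5000.0   # varies with simulated weather

def validate(cfg: VirtualSceneConfig) -> bool:
    """Basic sanity checks before handing the config to a rendering engine."""
    w, d, h = cfg.cabinet_size_mm
    return all(v > 0 for v in (w, d, h)) and 0 < cfg.camera_fov_deg <= 180.0

cfg = VirtualSceneConfig()
assert validate(cfg)
```

Keeping all scene factors in one validated structure makes it straightforward to re-render the same virtual counter under perturbed lighting or camera settings later.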
In step 302, the computer device obtains a video sample of the physical object corresponding to the simulated object.
In the embodiment of the disclosure, an actual shooting scene meeting predetermined conditions is arranged, each physical article is shot in that scene, and the computer device acquires the shot video of each physical article as its video sample.
Wherein, the video sampling is used for displaying the images of the physical object from three angles.
Optionally, when each physical object is placed in an actual shooting scene, the same camera is used to shoot videos of each physical object at three fixed angles.
Wherein, three angles of each entity article are fixed, and the shooting duration of three angles is also fixed.
For example, the arranged actual shooting scene may have uniform, bright lighting, with a non-reflective pure-white wall or cloth as the background, parchment (butter) paper to diffuse the light, and two fill lights for supplementary lighting. The physical article is placed on an electric turntable so that it rotates automatically, and a 720p camera shoots video of it from an upward-looking angle, from 30 degrees overhead, and from 60 degrees overhead, for 1 minute each. The computer device then obtains the video samples of each physical article.
In step 303, the computer device obtains modeling information of each physical object according to the video sample of the physical object.
In the embodiment of the present disclosure, the computer device obtains the video samples of each entity article, and may obtain the modeling information of each entity article by analyzing the video samples.
Wherein the modeling information is used to construct a three-dimensional model of the physical object.
Optionally, the modeling information may include at least one of feature point matching information, camera registration information, point cloud generation information, mesh generation information, and iteration and optimization information of texture mapping.
In step 304, the computer device constructs a simulated item corresponding to the physical item according to the modeling information.
In the embodiment of the disclosure, the computer device obtains a model which is completed by modeling according to the modeling information and uses the model as a simulated article corresponding to the entity article.
Optionally, after modeling is completed, the computer device optimizes the model by removing small redundant parts, normalizes the constructed three-dimensional model to the size of the physical article, and imports it into the rendered automatic sales counter scene.
Wherein, the small redundant part to be eliminated can be the plane of the small electric turntable contained at the bottom of the model.
In addition, the types of the solid objects may be different, and then the types of the simulated objects obtained by modeling the solid objects may also be different.
In step 305, the computer device obtains the simulated item corresponding to the physical item.
In the embodiment of the present disclosure, in the process of building models, the computer device obtains the simulated article corresponding to a physical article by analyzing the sampled video of that article, and it may obtain multiple simulated articles of different quantities and types.
In step 306, the computer device obtains at least one sample image by arranging the at least one simulated item in the virtual scene according to the arrangement scheme.
In the embodiment of the disclosure, the computer device arranges at least one simulated article in the constructed simulated automatic sales counter according to the arrangement scheme and, through the virtual camera inside, shoots the scene in the counter under each arrangement scheme; each arrangement yields at least one image as a sample image.
The sample image is an image obtained by shooting each simulated article by a virtual camera under different arrangement schemes, and the virtual scene is a simulation entity scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera within the vending cabinet.
The virtual camera and the physical camera in the actual scene are consistent in their mounting position and shooting angle within the automatic sales counter, and in lens parameters such as resolution and aperture.
Optionally, the at least one simulated article may be arranged in either of two ways, in columns or at random, as follows:
1) when the arrangement is in columns.
The computer device arranges the acquired at least one simulated article in a column in the virtual scene; then, in response to the virtual articles in the column moving in a fixed order, it acquires an image after each movement through the virtual camera as a sample image.
For example, fig. 4 is a schematic diagram of sample images of simulated articles arranged in a column according to an embodiment of the present disclosure. As shown in FIG. 4, the simulated articles A, B, and C, which may be of the same or different types, are initially arranged in a column near the door side of the automatic sales counter. First, the simulated article C, the one closest to the inside of the counter, is moved to the innermost position; the sampled image of the interior taken by the simulated fisheye camera at that point is the image shown in step 401. Next, the simulated article B is moved to the innermost position adjacent to C, giving the image shown in step 402. Finally, the last unmoved simulated article A is moved adjacent to B, giving the image shown in step 403.
The sampled image captured after each movement exposes as much of the unoccluded feature region as possible, so that the feature region of each simulated article can be identified.
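The fixed-order movement can be sketched as a small generator that moves the innermost not-yet-moved article to the back one step at a time and yields a snapshot after every move, each snapshot standing for one captured sample image. The slot-index representation is an illustrative assumption:

```python
def column_capture_sequence(items):
    """Yield the arrangement after each fixed-order move.

    `items` is ordered from the cabinet door inward. Each step moves the
    innermost not-yet-moved article to the innermost free back slot, and
    one sample image would be captured after every move (cf. Fig. 4).
    """
    n = len(items)
    # Slot index per article; slots n..2n-1 form the "moved" area at the back.
    positions = {item: i for i, item in enumerate(items)}
    next_back_slot = 2 * n - 1
    for item in reversed(items):          # fixed order: innermost article first
        positions[item] = next_back_slot
        next_back_slot -= 1
        yield dict(positions)             # snapshot = one sample image

arrangements = list(column_capture_sequence(["A", "B", "C"]))
# three moves -> three snapshots, matching steps 401-403 in Fig. 4
```

Three articles yield three snapshots; in each one, the newly moved article sits behind the remaining column, which is exactly the configuration in which its feature region is least occluded.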
2) When the arrangement scheme is random placement.
The computer equipment randomly arranges the acquired at least one simulated article in the virtual scene, and then acquires a randomly arranged image as the sample image through the virtual camera.
For example, fig. 5 is a schematic diagram of a sample image of a randomly arranged simulated object according to an embodiment of the present disclosure. As shown in fig. 5, the sampled image of the inside of the automatic sales counter is shot by a simulated fish-eye camera, wherein the arrangement of each simulated article is random.
Optionally, when arranging the simulated articles according to the arrangement scheme, the computer device may add disturbance factors to each arrangement to obtain multiple sample images.
The added disturbance factors may include shifting the position of the simulated fisheye camera over a small range, and randomly varying the illumination intensity, the light source position, or the color temperature.
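Drawing one such disturbance before each capture might look like the following sketch. The disclosure names only the kinds of perturbation, so all numeric ranges here are illustrative assumptions:

```python
import random

def sample_perturbation(rng: random.Random) -> dict:
    """Draw one random disturbance to apply before capturing a sample image.

    The perturbation kinds (camera jitter, light intensity, light position,
    color temperature) follow the description above; the ranges are assumed.
    """
    return {
        # Small-range shift of the simulated fisheye camera, in millimeters.
        "camera_offset_mm": tuple(rng.uniform(-5.0, 5.0) for _ in range(3)),
        "light_intensity_scale": rng.uniform(0.7, 1.3),
        "light_position_jitter_mm": tuple(rng.uniform(-50.0, 50.0) for _ in range(3)),
        "color_temperature_k": rng.uniform(3000.0, 6500.0),
    }

rng = random.Random(0)  # seeded for reproducible sample generation
perturbations = [sample_perturbation(rng) for _ in range(4)]
```

Applying a freshly drawn perturbation per capture multiplies each arrangement into several visually distinct sample images without any extra manual work.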
In step 307, the computer device obtains the labeled sample image by determining a semantic segmentation image corresponding to the sample image, and labeling each of the simulated articles corresponding to the feature areas in the semantic segmentation image.
In the embodiment of the disclosure, the computer device performs semantic segmentation on each obtained sample image to produce the corresponding semantic segmentation image and processes it so that each simulated article can be labeled according to its feature region; the computer device thereby obtains each labeled sample image.
Wherein the characteristic region is used for determining simulated article information, and the computer equipment can distinguish each simulated article by identifying the characteristic region.
Optionally, the computer device may determine the semantic segmentation image corresponding to the sample image through a ray tracing algorithm.
The ray tracing algorithm is a three-dimensional computer graphics rendering algorithm whose basic idea is to trace rays, simulating real light paths and the imaging process; it can therefore recover the occlusion relationships among the simulated articles in a sample image. Semantic segmentation is then performed on the sample image according to the unoccluded parts of each simulated article, yielding the semantic segmentation image corresponding to the sample image.
In addition, the semantic segmentation image can distinguish the region of each simulated article in the sample image.
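The disclosure obtains these occlusion relations with a full ray tracer. As a much-simplified stand-in that still yields a per-pixel instance mask, box-shaped articles can be rasterized with a depth buffer, where the article nearest the camera wins each pixel. The axis-aligned rectangles and integer instance IDs below are illustrative assumptions:

```python
import numpy as np

def render_instance_mask(shape, articles):
    """Rasterize articles into an instance-ID mask using a depth buffer.

    `articles` is a list of (instance_id, depth, (top, left, bottom, right)).
    Pixel value 0 means background; the smallest depth wins at each pixel,
    mimicking the occlusion relations a ray tracer would recover.
    """
    mask = np.zeros(shape, dtype=np.int32)
    zbuf = np.full(shape, np.inf)
    for inst_id, depth, (t, l, b, r) in articles:
        region = zbuf[t:b, l:r]
        visible = depth < region            # closer than what is drawn so far
        mask[t:b, l:r][visible] = inst_id   # boolean assignment writes through
        region[visible] = depth
    return mask

articles = [
    (1, 2.0, (2, 2, 8, 8)),    # article 1, farther from the camera
    (2, 1.0, (5, 5, 12, 12)),  # article 2, nearer, partially occludes article 1
]
mask = render_instance_mask((14, 14), articles)
# the overlap (rows/cols 5..7) belongs to the nearer article 2
```

Each nonzero value in the resulting mask marks the visible pixels of one simulated article, which is exactly the per-instance information the labeling step consumes.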
Optionally, in order to further mark the area of each simulated article, the computer device may obtain an initial bounding box of each simulated article according to a difference in pixel value of each simulated article.
The initial bounding box of a simulated article contains all of its pixels, but it may also contain a large proportion of pixels belonging to other simulated articles or to the background.
Optionally, in order to more accurately obtain the bounding box corresponding to each simulated article, the computer device determines, based on the initial bounding box, a compact bounding box of each simulated article through a heuristic algorithm.
The compact bounding box should, to the greatest extent possible, contain no pixels of other simulated articles or of the background, while containing the identifiable feature region of the corresponding simulated article.
The identifiable feature region is derived from the feature region of the simulated article: it is the part of the feature region that is not occluded in the image.
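Given a semantic segmentation image encoded as a per-pixel instance-ID mask, the initial bounding box is simply the extent of an article's pixels, and a compact box can be obtained by trimming border rows and columns whose instance-pixel density is low. The mask encoding and the density-threshold trimming rule are illustrative assumptions; the disclosure specifies only that a heuristic algorithm tightens the initial box:

```python
import numpy as np

def initial_bbox(mask, inst_id):
    """Smallest rectangle (top, left, bottom, right) containing every pixel
    of the given instance; bottom/right are exclusive."""
    ys, xs = np.nonzero(mask == inst_id)
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

def compact_bbox(mask, inst_id, min_density=0.6):
    """Shrink the initial box while its border rows/columns are mostly
    other articles or background (instance-pixel density below threshold)."""
    t, l, b, r = initial_bbox(mask, inst_id)
    inside = (mask == inst_id)
    def density(sl):  # fraction of instance pixels in one border strip
        return inside[sl].mean()
    while b - t > 1 and density((slice(t, t + 1), slice(l, r))) < min_density:
        t += 1
    while b - t > 1 and density((slice(b - 1, b), slice(l, r))) < min_density:
        b -= 1
    while r - l > 1 and density((slice(t, b), slice(l, l + 1))) < min_density:
        l += 1
    while r - l > 1 and density((slice(t, b), slice(r - 1, r))) < min_density:
        r -= 1
    return t, l, b, r

mask = np.zeros((14, 14), dtype=np.int32)
mask[2:8, 2:8] = 1     # article 1
mask[5:12, 5:12] = 2   # nearer article 2 occludes part of article 1
print(initial_bbox(mask, 1))  # spans all visible pixels of article 1
print(compact_bbox(mask, 1))  # trimmed to its mostly unoccluded top band
```

With these inputs, the initial box of article 1 still covers the occluded corner, while the compact box shrinks to the fully visible strip, mirroring the goal of keeping only the identifiable feature region.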
Optionally, the computer device may label the simulated item in each of the compact bounding boxes according to the identifiable characteristic region.
The computer equipment can obtain the type and name of the simulated article corresponding to the recognizable characteristic area according to the matching information of the recognizable characteristic area. The kind and name of the simulated object can be marked in the corresponding compact surrounding frame.
For example, fig. 6 is a schematic diagram of an automatic labeling process according to an embodiment of the present disclosure. As shown in fig. 6, semantic segmentation is performed on the obtained sample image of step 601 to produce the semantic segmentation image shown in step 602. Each simulated article is then automatically labeled according to its identifiable feature region, giving the labeled sample image shown in step 603, in which each simulated article corresponds to a rectangular box that may be labeled with the article's type and name.
Optionally, after the labeling of the type and the name of the corresponding simulated article in all the compact enclosing frames in the sample image is completed, the labeled sample image is obtained.
In step 308, the computer device trains a detection model through the labeled sample image.
The detection model is used for identifying at least one of type information, name information and position information of each entity article in the automatic sales counter.
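Before training, each labeled sample image is typically serialized into a standard detection-training format. The helper below emits a minimal COCO-style record; the COCO [x, y, width, height] box convention is real, but the helper itself and its argument layout are illustrative assumptions rather than part of the disclosure:

```python
def to_coco_record(image_id, file_name, size, labeled_boxes):
    """Convert one labeled sample image into COCO-style dicts.

    `labeled_boxes` is a list of (category_id, (top, left, bottom, right))
    pairs, e.g. the compact bounding boxes from the automatic labeling step.
    """
    height, width = size
    image = {"id": image_id, "file_name": file_name,
             "height": height, "width": width}
    annotations = []
    for ann_id, (cat_id, (t, l, b, r)) in enumerate(labeled_boxes, start=1):
        annotations.append({
            "id": ann_id, "image_id": image_id, "category_id": cat_id,
            "bbox": [l, t, r - l, b - t],   # COCO uses [x, y, w, h]
            "area": (r - l) * (b - t),
            "iscrowd": 0,
        })
    return image, annotations

image, anns = to_coco_record(1, "sample_0001.png", (720, 1280),
                             [(3, (2, 2, 5, 8)), (7, (5, 5, 12, 12))])
```

Records in this shape can be fed directly to common detection training pipelines, so the automatically labeled virtual data slots in wherever manually annotated data would.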
In summary, the sample image labeling method for an automatic sales counter provided in the embodiments of the present disclosure obtains at least one simulated article, where the simulated article is a three-dimensional model constructed by sampling a physical article. The at least one simulated article is arranged in the virtual scene according to an arrangement scheme to obtain at least one sample image, where the sample image is an image of each simulated article shot by a virtual camera under a given arrangement scheme, the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter, and the virtual camera corresponds to the physical camera in the counter. Finally, the semantic segmentation image corresponding to the sample image is determined, and each simulated article corresponding to a feature region in the semantic segmentation image is labeled, obtaining a labeled sample image. With this scheme, the computer equipment can automatically acquire and label the relevant data by constructing the virtual scene and virtual articles, improving labeling efficiency while ensuring labeling accuracy.
FIG. 7 is a block diagram illustrating a sample image labeling apparatus for an automatic sales counter according to an exemplary embodiment. As shown in FIG. 7, the apparatus can be implemented as all or part of a computer device, in hardware or in a combination of hardware and software, to perform the steps shown in any of the embodiments of FIG. 2 or FIG. 3. The sample image labeling apparatus for an automatic sales counter may include:
an article obtaining module 710, configured to obtain at least one simulated article, where the simulated article is a three-dimensional model constructed by sampling a physical article;
a sample image obtaining module 720, configured to obtain at least one sample image by arranging the at least one simulated article in a virtual scene according to an arrangement scheme, where the sample image is an image of each simulated article captured by a virtual camera under different arrangement schemes, and the virtual scene is a simulated entity scene constructed by performing three-dimensional modeling on the inside of an automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter;
the annotated image acquisition module 730 is configured to, by determining a semantic segmentation image corresponding to the sample image, annotate each of the simulated articles corresponding to the feature region in the semantic segmentation image, to obtain an annotated sample image.
Optionally, when the arrangement scheme is arranged in rows, the sample image obtaining module 720 includes:
the first arrangement sub-module is used for arranging the acquired at least one simulated article in the virtual scene according to a row;
a first sample image acquiring sub-module, configured to respond to the virtual articles in the column moving in a fixed order, and acquire, by the virtual camera, an image after each movement as the sample image.
Optionally, when the arrangement scheme is randomly arranged, the sample image obtaining module 720 includes:
the second arrangement submodule is used for randomly arranging the obtained at least one simulated article in the virtual scene;
and the second sample image acquisition sub-module is used for acquiring randomly arranged images as the sample images through the virtual camera.
Optionally, the item obtaining module 710 includes:
the sampling acquisition sub-module is used for acquiring video samples of the entity article corresponding to the simulation article, and the video samples are used for displaying images of the entity article from three angles;
the information acquisition sub-module is used for acquiring modeling information of the entity article according to the video sampling of each entity article, wherein the modeling information is used for constructing a three-dimensional model of the entity article;
and the article construction sub-module is used for constructing a simulated article corresponding to the entity article according to the modeling information.
Optionally, the annotated image obtaining module 730 includes:
the segmented image determining submodule is used for determining a semantic segmented image corresponding to the sample image through a ray tracing algorithm;
the initial frame acquisition sub-module is used for acquiring an initial surrounding frame of each simulated article according to different pixel values of each simulated article;
a compact frame determination sub-module, configured to determine, through a heuristic algorithm, a compact bounding frame of each simulated article based on the initial bounding frame, where the compact bounding frame includes an identifiable feature region of the corresponding simulated article, and the identifiable feature region is a feature region portion that is not occluded in the image;
the article labeling sub-module is used for labeling the simulated articles in the compact surrounding frames according to the identifiable characteristic areas;
and the labeled image acquisition submodule is used for acquiring each labeled sample image.
Optionally, the apparatus further comprises:
the scene building module is used for building the virtual scene according to external factors and internal factors of the corresponding actual scene in the automatic sales counter before the at least one simulated article is obtained; the external factors refer to factors representing the external dimensions of the automatic sales counter, and the internal factors refer to factors representing the parameters of the simulated camera and the light environment inside the automatic sales counter.
Optionally, the apparatus further comprises:
and the model training module is used for training a detection model through the marked sample image, and the detection model is used for identifying at least one of the type information, the name information and the position information of each entity article in the automatic sales counter.
It should be noted that when the apparatus provided in the foregoing embodiment implements its functions, the division into the above functional modules is only an example; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
An exemplary embodiment of the present disclosure provides a sample image labeling apparatus for an automatic sales counter, which may be implemented as all or part of a computer device, in hardware or in a combination of hardware and software, and can implement all or part of the steps in any of the embodiments shown in fig. 2 or fig. 3 of the present disclosure. The sample image labeling apparatus for the automatic sales counter further comprises: a processor and a memory for storing processor-executable instructions;
wherein the processor is configured to:
obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a solid article;
arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article shot by a virtual camera under different arrangement schemes, and the virtual scene is a simulated entity scene constructed by three-dimensional modeling of the interior of the automatic sales counter; the virtual camera corresponds to the physical camera in the automatic sales counter;
and marking each simulated article corresponding to the characteristic region in the semantic segmentation image by determining the semantic segmentation image corresponding to the sample image to obtain the marked sample image, wherein the characteristic region is used for determining the information of the simulated article.
Optionally, when the arrangement scheme is arranged in columns, the obtaining at least one sample image by arranging the at least one simulated article in the virtual scene according to the arrangement scheme includes:
arranging the obtained at least one simulated article in the virtual scene according to the columns;
in response to the virtual items of the column moving in a fixed order, acquiring, by the virtual camera, images after each movement as the sample images.
Optionally, when the arrangement scheme is random placement, the obtaining at least one sample image by arranging the at least one simulated article in the virtual scene according to the arrangement scheme includes:
randomly arranging the obtained at least one simulated article in the virtual scene;
and acquiring randomly arranged images as the sample images through the virtual camera.
Optionally, the acquiring at least one simulated item includes:
acquiring video samples of the entity article corresponding to the simulated article, wherein the video samples are used for displaying images of the entity article from three angles;
obtaining modeling information of each solid object according to the video samples of the solid objects, wherein the modeling information is used for constructing a three-dimensional model of the solid object;
and constructing a simulated article corresponding to the entity article according to the modeling information.
Optionally, the obtaining the labeled sample image by determining the semantic segmentation image corresponding to the sample image and labeling each simulated article corresponding to the feature region in the semantic segmentation image, where the feature region is used to determine simulated article information includes:
determining a semantic segmentation image corresponding to the sample image through a ray tracing algorithm;
acquiring an initial surrounding frame of each simulated article according to different pixel values of each simulated article;
determining a compact bounding box of each simulated article through a heuristic algorithm based on the initial bounding box, wherein the compact bounding box contains an identifiable characteristic region of the corresponding simulated article, and the identifiable characteristic region is a part of the characteristic region which is not blocked in the image;
marking the simulated articles in each compact enclosing frame according to the identifiable characteristic region;
and acquiring the sample image after each mark.
Optionally, before the acquiring at least one simulated article, the method further includes:
constructing the virtual scene according to the external factors and the internal factors of the corresponding actual scene in the automatic sales counter; the external factors refer to factors representing the external dimensions of the automatic sales counter, and the internal factors refer to factors representing the parameters of the simulated camera and the light environment inside the automatic sales counter.
Optionally, the method further includes:
and training a detection model through the marked sample image, wherein the detection model is used for identifying at least one of the type information, the name information and the position information of each entity article in the automatic sales counter.
FIG. 8 is a schematic diagram illustrating a configuration of a computer device, according to an example embodiment. The computer apparatus 800 includes a Central Processing Unit (CPU) 801, a system Memory 804 including a Random Access Memory (RAM) 802 and a Read-Only Memory (ROM) 803, and a system bus 805 connecting the system Memory 804 and the CPU 801. The computer device 800 also includes a basic Input/Output system (I/O system) 806, which facilitates transfer of information between devices within the computer device, and a mass storage device 807 for storing an operating system 813, application programs 814, and other program modules 815.
The basic input/output system 806 includes a display 808 for displaying information and an input device 809, such as a mouse or keyboard, for the user to input information. The display 808 and the input device 809 are both connected to the central processing unit 801 through an input/output controller 810 connected to the system bus 805. The basic input/output system 806 may also include the input/output controller 810 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 810 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 807 is connected to the central processing unit 801 through a mass storage controller (not shown) connected to the system bus 805. The mass storage device 807 and its associated computer device-readable media provide non-volatile storage for the computer device 800. That is, the mass storage device 807 may include a computer device readable medium (not shown) such as a hard disk or Compact Disc-Only Memory (CD-ROM) drive.
Without loss of generality, the computer device readable media may comprise computer device storage media and communication media. Computer device storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer device readable instructions, data structures, program modules or other data. Computer device storage media includes RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), CD-ROM, Digital Video Disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer device storage media is not limited to the foregoing. The system memory 804 and mass storage 807 described above may be collectively referred to as memory.
The computer device 800 may also operate with remote computer devices connected through a network, such as the Internet, in accordance with various embodiments of the present disclosure. That is, the computer device 800 may be connected to the network 812 through the network interface unit 811 coupled to the system bus 805, or the network interface unit 811 may be used to connect to other types of networks or remote computer device systems (not shown).
The memory further includes one or more programs, the one or more programs are stored in the memory, and the central processing unit 801 executes the one or more programs to implement all or part of the steps of the method shown in fig. 2 or fig. 3.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in embodiments of the disclosure may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-device-readable medium. Computer device readable media includes both computer device storage media and communication media including any medium that facilitates transfer of a computer device program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer device.
The embodiments of the disclosure also provide a computer device storage medium for storing the computer software instructions used by the above apparatus, which contains a program designed for executing the sample image labeling method for the automatic sales counter.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
1. A sample image labeling method for an automatic sales counter is characterized by comprising the following steps:
obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a physical article;
arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article captured by a virtual camera under a different arrangement scheme, and the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter, the virtual camera corresponding to a physical camera in the automatic sales counter;
and determining a semantic segmentation image corresponding to the sample image, and labeling each simulated article corresponding to a feature region in the semantic segmentation image to obtain a labeled sample image, wherein the feature region is used to determine information of the simulated article.
2. The method of claim 1, wherein, when the arrangement scheme is column-wise placement, the arranging the at least one simulated article in the virtual scene according to the arrangement scheme to obtain at least one sample image comprises:
arranging the obtained at least one simulated article in the virtual scene in columns;
and in response to the simulated articles in the columns moving in a fixed order, acquiring, by the virtual camera, an image after each movement as a sample image.
3. The method of claim 2, wherein, when the arrangement scheme is random placement, the arranging the at least one simulated article in the virtual scene according to the arrangement scheme to obtain at least one sample image comprises:
randomly arranging the obtained at least one simulated article in the virtual scene;
and acquiring images of the random arrangement as the sample images through the virtual camera.
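Claims 2 and 3 describe two arrangement schemes: column-wise placement with fixed-order movement, and random placement. A minimal sketch of generating item positions for both schemes (the function and parameter names here are illustrative, not from the patent):

```python
import random

def column_arrangement(items, column_x_positions, depth_step, shift):
    """Place every item in each column; calling this repeatedly with an
    increasing `shift` simulates the fixed-order movement of claim 2."""
    poses = []
    for x in column_x_positions:
        for row, item in enumerate(items):
            poses.append((item, (x, row * depth_step + shift)))
    return poses

def random_arrangement(items, width, depth, seed=None):
    """Scatter items uniformly over the counter floor (claim 3)."""
    rng = random.Random(seed)
    return [(item, (rng.uniform(0, width), rng.uniform(0, depth)))
            for item in items]
```

Each returned pose would then be handed to the virtual camera for rendering; the sketch only covers position generation, not the 3D rendering step.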
4. The method of claim 1, wherein the obtaining at least one simulated article comprises:
acquiring video samples of the physical article corresponding to the simulated article, wherein the video samples are used for showing images of the physical article from three angles;
acquiring modeling information of the physical article according to the video samples of the physical article, wherein the modeling information is used for constructing a three-dimensional model of the physical article;
and constructing the simulated article corresponding to the physical article according to the modeling information.
5. The method of claim 1, wherein the determining a semantic segmentation image corresponding to the sample image and labeling each simulated article corresponding to a feature region in the semantic segmentation image to obtain the labeled sample image, the feature region being used to determine simulated article information, comprises:
determining a semantic segmentation image corresponding to the sample image through a ray tracing algorithm;
acquiring an initial bounding box of each simulated article according to the different pixel values of each simulated article;
determining a compact bounding box of each simulated article through a heuristic algorithm based on the initial bounding box, wherein the compact bounding box contains an identifiable feature region of the corresponding simulated article, and the identifiable feature region is the part of the feature region that is not occluded in the image;
labeling the simulated article in each compact bounding box according to the identifiable feature region;
and acquiring each labeled sample image.
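Claim 5 derives an initial bounding box for each item from its distinct pixel value in the semantic segmentation image. Assuming the segmentation is a 2-D integer array with one ID per item and 0 for background (an assumption; the patent does not fix a representation), the initial boxes can be computed like this:

```python
import numpy as np

def initial_bounding_boxes(seg):
    """Return {item_id: (x_min, y_min, x_max, y_max)} for every
    non-background ID found in a semantic segmentation image."""
    boxes = {}
    for item_id in np.unique(seg):
        if item_id == 0:  # 0 is assumed to mark the background
            continue
        ys, xs = np.nonzero(seg == item_id)  # pixels belonging to this item
        boxes[int(item_id)] = (int(xs.min()), int(ys.min()),
                               int(xs.max()), int(ys.max()))
    return boxes
```

This yields only the initial boxes of claim 5; shrinking each box to the compact bounding box around the unoccluded feature region is done afterwards by the heuristic step, which the patent leaves unspecified.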
6. The method of claim 1, wherein before the obtaining at least one simulated article, the method further comprises:
constructing the virtual scene according to external factors and internal factors of the corresponding actual scene of the automatic sales counter, wherein the external factors are factors representing the external dimensions of the automatic sales counter, and the internal factors are factors representing the parameters of the virtual camera and the light environment inside the automatic sales counter.
7. The method of claim 1, further comprising:
and training a detection model with the labeled sample images, wherein the detection model is used to identify at least one of the type information, name information, and position information of each physical article in the automatic sales counter.
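Claim 7 trains a detection model on the labeled sample images to recognize type, name, and position information. The patent does not specify a label format; the following is a hypothetical sketch of packaging one labeled image into a training record (all field and function names are illustrative):

```python
def make_training_record(image_path, labeled_boxes):
    """Assemble one detection-training record: each entry pairs a
    compact bounding box with the item's type and name information,
    and derives a position from the box (cf. claim 7).

    labeled_boxes: list of (bbox, type_info, name_info) tuples,
    where bbox = (x_min, y_min, x_max, y_max) in pixels."""
    annotations = []
    for bbox, type_info, name_info in labeled_boxes:
        x_min, y_min, x_max, y_max = bbox
        annotations.append({
            "bbox": bbox,
            "type": type_info,
            "name": name_info,
            # position information: the box centre in image coordinates
            "position": ((x_min + x_max) / 2, (y_min + y_max) / 2),
        })
    return {"image": image_path, "annotations": annotations}
```

A record in this shape maps directly onto common detection formats (e.g. one image path plus a list of box/class annotations), so the synthetic labels can feed an off-the-shelf detector without manual annotation.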
8. A sample image annotation apparatus for an automated sales counter, the apparatus comprising:
an article acquisition module, configured to obtain at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a physical article;
a sample image acquisition module, configured to arrange the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article captured by a virtual camera under a different arrangement scheme, and the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter, the virtual camera corresponding to a physical camera in the automatic sales counter;
and a labeled image acquisition module, configured to determine a semantic segmentation image corresponding to the sample image and label each simulated article corresponding to a feature region in the semantic segmentation image to obtain a labeled sample image, wherein the feature region is used to determine information of the simulated article.
9. A sample image annotation apparatus for an automated sales counter, the apparatus comprising:
a processor;
a memory for storing executable instructions of the processor;
wherein the processor is configured to:
obtaining at least one simulated article, wherein the simulated article is a three-dimensional model constructed by sampling a physical article;
arranging the at least one simulated article in a virtual scene according to an arrangement scheme to obtain at least one sample image, wherein the sample image is an image of each simulated article captured by a virtual camera under a different arrangement scheme, and the virtual scene is a simulated physical scene constructed by three-dimensional modeling of the interior of the automatic sales counter, the virtual camera corresponding to a physical camera in the automatic sales counter;
and determining a semantic segmentation image corresponding to the sample image, and labeling each simulated article corresponding to a feature region in the semantic segmentation image to obtain a labeled sample image, wherein the feature region is used to determine information of the simulated article.
10. A computer-device-readable storage medium containing executable instructions that are invoked and executed by a processor to implement the sample image labeling method for an automatic sales counter of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911399456.9A CN111160261A (en) | 2019-12-30 | 2019-12-30 | Sample image labeling method and device for automatic sales counter and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111160261A true CN111160261A (en) | 2020-05-15 |
Family
ID=70559411
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911399456.9A Pending CN111160261A (en) | 2019-12-30 | 2019-12-30 | Sample image labeling method and device for automatic sales counter and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111160261A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109155078A (en) * | 2018-08-01 | 2019-01-04 | 深圳前海达闼云端智能科技有限公司 | Generation method, device, electronic equipment and the storage medium of the set of sample image |
CN109284407A (en) * | 2018-08-21 | 2019-01-29 | 芜湖启迪睿视信息技术有限公司 | Device for training automatic labeling data set of intelligent sales counter |
CN109635853A (en) * | 2018-11-26 | 2019-04-16 | 深圳市玛尔仕文化科技有限公司 | The method for automatically generating artificial intelligence training sample based on computer graphics techniques |
CN109711472A (en) * | 2018-12-29 | 2019-05-03 | 北京沃东天骏信息技术有限公司 | Training data generation method and device |
CN110136266A (en) * | 2018-12-20 | 2019-08-16 | 初速度(苏州)科技有限公司 | The method and simulation result batch validation method of augmented reality fusion scene |
CN110427837A (en) * | 2019-07-12 | 2019-11-08 | 深兰科技(上海)有限公司 | A kind of neural network model training sample acquisition method, device, terminal and medium |
- 2019-12-30: CN201911399456.9A patent application filed (publication CN111160261A, legal status: Pending)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111915672A (en) * | 2020-07-20 | 2020-11-10 | 中国汽车技术研究中心有限公司 | Target labeling method and device based on 3D virtual driving scene |
CN111915672B (en) * | 2020-07-20 | 2022-08-16 | 中国汽车技术研究中心有限公司 | Target labeling method and device based on 3D virtual driving scene |
CN112132213A (en) * | 2020-09-23 | 2020-12-25 | 创新奇智(南京)科技有限公司 | Sample image processing method and device, electronic equipment and storage medium |
CN115412652A (en) * | 2021-05-26 | 2022-11-29 | 哈尔滨工业大学(深圳) | Image data acquisition device and method for unmanned vending machine |
CN115412652B (en) * | 2021-05-26 | 2024-03-22 | 哈尔滨工业大学(深圳) | Image data acquisition device and method for vending machine |
CN113298945A (en) * | 2021-06-08 | 2021-08-24 | 上海宝冶工程技术有限公司 | Calculation method of container packing scheme |
CN113763569A (en) * | 2021-08-30 | 2021-12-07 | 之江实验室 | Image annotation method and device used in three-dimensional simulation and electronic equipment |
CN113963127A (en) * | 2021-12-22 | 2022-01-21 | 深圳爱莫科技有限公司 | Simulation engine-based model automatic generation method and processing equipment |
CN114332224A (en) * | 2021-12-29 | 2022-04-12 | 北京字节跳动网络技术有限公司 | Method, device and equipment for generating 3D target detection sample and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160261A (en) | Sample image labeling method and device for automatic sales counter and storage medium | |
Liu et al. | Neural network generalization: The impact of camera parameters | |
CN103914802B (en) | For the image selection using the depth information imported and the System and method for of masking | |
CN110069972A (en) | Automatic detection real world objects | |
JP2016537901A (en) | Light field processing method | |
CN110276831B (en) | Method and device for constructing three-dimensional model, equipment and computer-readable storage medium | |
CN111383204A (en) | Video image fusion method, fusion device, panoramic monitoring system and storage medium | |
CN112149348A (en) | Simulation space model training data generation method based on unmanned container scene | |
WO2023217138A1 (en) | Parameter configuration method and apparatus, device, storage medium and product | |
CN107948586A (en) | Trans-regional moving target detecting method and device based on video-splicing | |
Park et al. | Neural object learning for 6d pose estimation using a few cluttered images | |
Wei et al. | Simulating shadow interactions for outdoor augmented reality with RGBD data | |
CN114067172A (en) | Simulation image generation method, simulation image generation device and electronic equipment | |
CN113570615A (en) | Image processing method based on deep learning, electronic equipment and storage medium | |
CN117788790A (en) | Material installation detection method, system, equipment and medium for general scene | |
CN112258267A (en) | Data acquisition method for AI commodity recognition training | |
CN116681689A (en) | Occlusion image generation method and device, electronic equipment and storage medium | |
CN111797832A (en) | Automatic generation method and system of image interesting region and image processing method | |
CN108780576A (en) | The system and method removed using the ghost image in the video clip of object bounds frame | |
CN115063478B (en) | Fruit positioning method, system, equipment and medium based on RGB-D camera and visual positioning | |
CN113034449A (en) | Target detection model training method and device and communication equipment | |
CN108776963B (en) | Reverse image authentication method and system | |
CN111402334B (en) | Data generation method, device and computer readable storage medium | |
CN112802049B (en) | Method and system for constructing household article detection data set | |
Hummel | On synthetic datasets for development of computer vision algorithms in airborne reconnaissance applications |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200515 |