WO2022124673A1 - Device and method for measuring the volume of an object in a container based on a camera image using a machine learning model - Google Patents
- Publication number
- WO2022124673A1 (PCT/KR2021/017807)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- area
- container
- standard container
- input image
- volume measurement
- Prior art date
Links
- 238000010801 machine learning Methods 0.000 title claims abstract description 34
- 238000000034 method Methods 0.000 title claims description 39
- 238000005259 measurement Methods 0.000 claims abstract description 60
- 238000000691 measurement method Methods 0.000 claims abstract description 12
- 230000011218 segmentation Effects 0.000 claims description 20
- 238000004590 computer program Methods 0.000 claims description 4
- 238000010586 diagram Methods 0.000 description 24
- 230000000007 visual effect Effects 0.000 description 17
- 238000013527 convolutional neural network Methods 0.000 description 9
- 238000004422 calculation algorithm Methods 0.000 description 8
- 238000004364 calculation method Methods 0.000 description 8
- 239000003086 colorant Substances 0.000 description 6
- 238000012545 processing Methods 0.000 description 6
- 239000003550 marker Substances 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 3
- 238000004891 communication Methods 0.000 description 3
- 238000012549 training Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 2
- 230000004913 activation Effects 0.000 description 1
- 239000004480 active ingredient Substances 0.000 description 1
- 239000008186 active pharmaceutical agent Substances 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000013434 data augmentation Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000003909 pattern recognition Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06K—GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K19/00—Record carriers for use with machines and with at least a part designed to carry digital markings
- G06K19/06—Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Definitions
- Embodiments of the present disclosure relate to an apparatus and method for measuring the volume of an object in a container based on a captured image using a machine learning model.
- Embodiments of the present disclosure provide a method, an apparatus, and a computer program for measuring the volume of an object using image data.
- receiving an input image; detecting a standard container area corresponding to a predefined standard container from the input image; recognizing, from the input image using a first machine learning model, a container wall area corresponding to the wall surface of the standard container and an object area corresponding to an object contained in the standard container; calculating the pixel ratio of the object area among the pixels of the entire area including the container wall area and the object area; and generating a volume measurement value of the object based on the pixel ratio.
- the method for measuring object volume further includes removing a background area, excluding the standard container area, from the input image, and the recognizing of the container wall area and the object area is performed on the input image from which the background area has been removed.
- removing the background region may include defining the standard container region as a region of interest; defining a region excluding the region of interest from the input image as the background region; and generating a first output image in which the background area is displayed as a single pixel value.
- the first machine learning model corresponds to a semantic segmentation model
- the step of recognizing the container wall area and the object area includes recognizing the container wall area and the object area using the semantic segmentation model, and the object volume measurement method further includes converting the container wall area into a first pixel value and converting the object area into a second pixel value.
- the standard container is a standard container in the form of a partially open polyhedron
- the calculating of the pixel ratio includes recognizing a plurality of wall areas corresponding to the respective wall surfaces of the standard container; dividing the object area into a plurality of sub-object areas respectively corresponding to the plurality of wall areas; calculating a weighted object area pixel count by applying a weight corresponding to each wall surface to the pixel count of each of the plurality of sub-object areas; and calculating the pixel ratio using the weighted object area pixel count and the number of pixels in the container wall area.
- the standard container is a rectangular parallelepiped standard container with an open top and an open front
- the input image is an image obtained by photographing the standard container obliquely from the open top and front
- the weight of each wall surface of the rectangular parallelepiped standard container may increase in the order of the two sides, the front, and the bottom surface.
- the volume measurement value of the object may be defined as a percentage of the total volume of the standard container.
- the input image is a moving picture including a plurality of frames
- the method for measuring the volume of an object further includes extracting a frame in which the standard container area is detected, and the detecting of the standard container area may include using the extracted frame as the input image.
- an input interface for receiving an input image; a memory storing at least one instruction; at least one processor executing the at least one instruction; and an output interface, wherein the at least one processor, by executing the at least one instruction, detects a standard container region corresponding to a predefined standard container from the input image,
- recognizes, from the input image using a first machine learning model, the container wall area corresponding to the wall surface of the standard container and the object area corresponding to the object contained in the standard container, calculates the pixel ratio of the object area among the pixels of the entire area including the container wall area and the object area, generates a volume measurement value of the object based on the pixel ratio, and outputs the object volume measurement value through the output interface. An object volume measurement apparatus performing these operations is provided.
- the method for measuring object volume includes detecting a standard container area corresponding to a predefined standard container from the input image; recognizing, from the input image using a first machine learning model, a container wall area corresponding to the wall surface of the standard container and an object area corresponding to an object contained in the standard container; calculating the pixel ratio of the object area among the pixels of the entire area including the container wall area and the object area; and generating a volume measurement value of the object based on the pixel ratio.
- FIG. 1 is a diagram illustrating a system for measuring object volume according to an embodiment of the present disclosure.
- FIG. 2 is a view showing the structure of an object volume measurement apparatus according to an embodiment of the present disclosure.
- FIG. 3 is a flowchart illustrating a method for measuring the volume of an object according to an embodiment of the present disclosure.
- FIG. 4 is a view showing a standard container according to an embodiment of the present disclosure.
- FIG. 5 is a diagram in which a visual tag is disposed according to an embodiment of the present disclosure.
- FIG. 6 is a diagram illustrating a visual tag according to an embodiment of the present disclosure.
- FIG. 7 is a diagram illustrating a flowchart of a method for measuring the volume of an object according to an embodiment of the present disclosure.
- FIG. 8 is a diagram illustrating a process of detecting a standard container area from a black-and-white scale image, according to an embodiment of the present disclosure.
- FIG. 9 is a diagram illustrating a background removal process according to an embodiment of the present disclosure.
- FIG. 10 is a diagram illustrating a process of generating a first output image from a background removal image according to an embodiment of the present disclosure.
- FIG. 11 is a diagram illustrating a process of calculating the volume of an object contained in a standard container, according to an embodiment of the present disclosure.
- FIG. 12 is a diagram illustrating an output of a first machine learning model according to an embodiment of the present disclosure.
- FIG. 13 is a diagram illustrating a structure of a processor according to another embodiment of the present disclosure.
- FIG. 14 is a diagram illustrating the structure of a CNN model according to an embodiment of the present disclosure.
- FIG. 15 is a diagram illustrating image data of a standard container area and a first output image according to an embodiment of the present disclosure.
- FIG. 1 is a diagram illustrating a system for measuring object volume according to an embodiment of the present disclosure.
- the object volume measurement system 10 uses the camera 120 to photograph the object 112 contained in the standard container 110, and the object volume measurement apparatus 100 measures the object volume from the captured input image.
- the object volume measurement system 10 may be used in a distribution warehouse, a factory, and the like. For example, the objects 112 contained in the standard containers 110 are placed on shelves of a warehouse, the object volume measurement system 10 photographs the placed standard containers 110 while moving the camera 120, and the volume of the object 112 contained in each standard container 110 can be measured from the input image.
- the camera 120 is mounted on a predetermined moving means; for example, the camera 120 and the object volume measuring apparatus 100 are implemented in the form of a movable robot, and the volume of the object in each standard container 110 can be measured while the robot moves along a predetermined path in the warehouse.
- the object 112 is contained in a predetermined standard container 110 .
- the standard container 110 is a container having a predefined shape, size, and color.
- the standard container 110 may be defined as one or more types.
- the object 112 is an object subject to volume measurement.
- the object volume means the volume occupied by the object 112 contained in one standard container 110 .
- the camera 120 photographs the object 112 contained in the standard container 110 .
- the camera 120 includes a lens, a shutter, and an image pickup device.
- the camera 120 captures an image, and outputs the captured input image to the object volume measurement apparatus 100 .
- the object volume measuring apparatus 100 may be implemented in the form of an electronic device including a processor and a memory, for example, in the form of a smart phone, a tablet PC, a notebook computer, or a wearable device. According to one embodiment, the object volume measuring apparatus 100 may be implemented in the form of a cloud server.
- the object volume measuring apparatus 100 may be implemented as one device including the camera 120 .
- the object volume measuring apparatus 100 may receive an input image from the external camera 120 .
- the object volume measuring apparatus 100 may receive an input image through a communication unit or a predetermined input interface.
- the camera 120 may correspond to a closed circuit television (CCTV) camera.
- FIG. 2 is a view showing the structure of an object volume measurement apparatus according to an embodiment of the present disclosure.
- the object volume measurement apparatus 100 includes an input interface 210 , a processor 220 , an output interface 230 , and a memory 240 .
- the input interface 210 receives an input image photographed from at least one camera for photographing a standard container.
- the object volume measuring apparatus 100 may include a built-in camera.
- the input interface 210 receives an input image from a camera built in the object volume measurement apparatus 100 .
- the object volume measurement apparatus 100 may be connected to a camera disposed outside the object volume measurement apparatus 100 to receive an input image through the input interface 210 .
- the camera photographs the standard container and transmits the photographed image data to the object volume measurement apparatus 100 .
- the camera is placed with a Field of View (FOV) set to photograph the standard container.
- the camera may correspond to an existing CCTV camera.
- the input interface 210 may correspond to an input device of a predetermined standard for receiving image data from a camera or a communication unit.
- the input interface 210 transmits the input image data to the processor 220 or the memory 240 .
- the input image data corresponds to the input image.
- the processor 220 may read the input image stored in the memory 240 .
- the processor 220 controls the overall operation of the object volume measuring apparatus 100 .
- the processor 220 may be implemented with one or more processors.
- the processor 220 may execute an instruction or a command stored in the memory to perform a predetermined operation.
- the processor 220 detects the standard container area corresponding to the standard container from the input image.
- the processor 220 may detect the standard container area corresponding to the standard container from the input image using a method of detecting a visual tag or a method using an object detection algorithm such as You Only Look Once (YOLO).
- the processor 220 obtains a container identifier corresponding to the detected standard container.
- the container identifier may be obtained using a visual tag.
- the container identifier may be obtained by recognizing the container identifier described in the standard container using a character recognition algorithm or a pattern recognition algorithm.
- the processor 220 may acquire object information, which is information about an object contained in a standard container, based on the container identifier.
- the processor 220 may further include a memory, and store object information corresponding to each container identifier in the memory.
- the object information may include at least one of a product name, a product category, a manufacturer, a seller, a serial number, an expiration date, an active ingredient, and a storage method, or a combination thereof.
- the processor 220 may acquire object information corresponding to the obtained container identifier based on the obtained container identifier and the object information stored in the memory.
- the processor 220 may acquire object information corresponding to the container identifier by using an external database such as a cloud server.
- the processor 220 recognizes the container wall area corresponding to the wall surface of the standard container and the object area corresponding to the object from the image of the standard container area.
- the processor 220 may recognize the container wall area and the object area using the first machine learning model.
- the processor 220 calculates the number of pixels in each of the container wall area and the object area, and calculates the pixel ratio of the object area.
- the processor 220 calculates the number of pixels in the entire area including the container wall area and the object area.
- the pixel ratio of the object area corresponds to the ratio of the number of pixels of the object area to the number of pixels of the entire area.
- the pixel ratio can be defined as a percentage.
- the processor 220 generates the object volume measurement value based on the pixel ratio of the object area.
- the processor 220 may define the pixel ratio as a measurement value of the object volume.
- the processor 220 may define a value obtained by multiplying a pixel ratio by a predetermined reference value as the object volume measurement value.
- the output interface 230 outputs the volume measurement value generated by the processor 220 .
- the output interface 230 may correspond to, for example, a display, an audio speaker, or a communication unit.
- the output interface 230 outputs the container identifier and the object volume value together. According to another embodiment, the output interface 230 outputs the container identifier, the object information, and the object volume value together.
- the memory 240 may store data and commands necessary for the operation of the object volume measuring apparatus 100 .
- the memory 240 may be implemented as at least one of a volatile storage medium and a non-volatile storage medium, or a combination thereof.
- the memory 240 may be implemented with various types of storage media.
- the memory 240 may include a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), and a RAM.
- the memory 240 may correspond to a cloud storage space.
- the memory 240 may be implemented through a cloud service.
- the memory 240 may store an input image, a container region image, and an intermediate output image.
- FIG. 3 is a flowchart illustrating a method for measuring the volume of an object according to an embodiment of the present disclosure.
- Each step of the method for measuring the volume of an object according to an embodiment of the present disclosure may be performed by various types of electronic devices including a processor.
- the present disclosure will focus on an embodiment in which the object volume measurement apparatus 100 according to embodiments of the present disclosure performs the object volume measurement method. Therefore, the embodiments described with respect to the object volume measurement apparatus 100 are applicable to embodiments of the object volume measurement method and, conversely, the embodiments described for the object volume measurement method are applicable to embodiments of the object volume measurement apparatus 100.
- the object volume measurement method according to the disclosed embodiments is not limited to being performed by the object volume measurement apparatus 100 disclosed in the present disclosure, and may be performed by various types of electronic devices.
- In step S302, the object volume measurement apparatus 100 receives an input image captured by a camera.
- In step S304, the object volume measurement apparatus 100 detects a standard container area from the input image.
- the object volume measuring apparatus 100 may detect a visual tag from an input image, and detect a standard container area based on the detected visual tag.
- the object volume measuring apparatus 100 may detect a standard container region from an input image using a machine learning model.
- In step S306, the object volume measuring apparatus 100 recognizes the container wall area corresponding to the wall surface of the standard container and the object area corresponding to the object from the image of the standard container area.
- the object volume measuring apparatus 100 may recognize the container wall area and the object area using the first machine learning model.
- In step S308, the object volume measuring apparatus 100 calculates the number of pixels in each of the container wall area and the object area, and calculates the pixel ratio of the object area.
- the object volume measuring apparatus 100 calculates the number of pixels in the entire area including the container wall area and the object area.
- the pixel ratio of the object area corresponds to the ratio of the number of pixels of the object area to the number of pixels of the entire area.
- the pixel ratio can be defined as a percentage.
- In step S310, the object volume measurement apparatus 100 generates an object volume measurement value based on the pixel ratio of the object area.
- the object volume measurement apparatus 100 may define a pixel ratio as an object volume measurement value.
- the object volume measurement apparatus 100 may define a value obtained by multiplying a pixel ratio by a predetermined reference value as the object volume measurement value.
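- As an illustration of the pixel-ratio calculation and volume-value generation of steps S308 and S310, the following is a minimal sketch, not the patent's implementation; the class ids and the reference volume are assumptions for the example.

```python
# Minimal sketch of steps S308-S310: compute the object-area pixel ratio from a
# per-pixel class map and turn it into a volume measurement value. The class ids
# (0 = container wall, 1 = object) and the reference volume are illustrative
# assumptions, not values taken from the patent.
import numpy as np

WALL, OBJECT = 0, 1  # hypothetical class ids in the segmentation map

def object_volume(seg_map, reference_volume=None):
    """seg_map: per-pixel class map covering the standard container area."""
    wall_px = int(np.count_nonzero(seg_map == WALL))
    obj_px = int(np.count_nonzero(seg_map == OBJECT))
    ratio = 100.0 * obj_px / (wall_px + obj_px)  # pixel ratio as a percentage
    if reference_volume is None:
        return ratio                             # the ratio itself is the measurement
    return ratio / 100.0 * reference_volume      # ratio multiplied by a reference value

# Example: a 4x4 map with 6 object pixels out of 16 gives 37.5 %.
demo = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0]])
print(object_volume(demo))        # 37.5
print(object_volume(demo, 64.0))  # 24.0, if a full container corresponds to 64 units
```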
- FIG. 4 is a view showing a standard container according to an embodiment of the present disclosure.
- Standard containers 410, 420, and 430 include one or a plurality of types. Each type of standard container 410, 420, 430 may have a different shape, size, color, and the like. The upper surface and the front may have an open shape so that the contained object is visible when the standard containers 410, 420, and 430 are photographed by a camera. In addition, the standard containers 410, 420, and 430 may come in a plurality of types of different sizes, and the size may be defined by width, depth, and height.
- the standard containers 410 , 420 , 430 may have a polyhedral shape.
- the standard containers 410, 420, and 430 may have a rectangular parallelepiped shape as shown in FIG. 4.
- Standard containers (410, 420, 430) may have a shape in which the upper surface and the front of the rectangular parallelepiped are open.
- the standard containers 410, 420, and 430 may have identifier regions 440a, 440b, and 440c, which indicate the container identifier, in a predetermined region of the front surface photographed by the camera.
- the identifier areas 440a, 440b, and 440c are areas representing container identifier information as visual information.
- a visual tag is disposed in the identifier areas 440a, 440b, and 440c.
- the visual tag may include, for example, an ARUCO Marker, a Quick Response (QR) code, a barcode, and the like.
- the visual tag may include container identifier information corresponding to a corresponding standard container, object information, and the like.
- the identifier regions 440a, 440b, and 440c may be expressed as characters, patterns, symbols, or the like. Characters, patterns, symbols, etc. may represent container identifier information, object information, etc. corresponding to the corresponding standard container.
- FIG. 5 is a diagram in which a visual tag is disposed according to an embodiment of the present disclosure.
- the container identification tag 510 is disposed in the identifier area on the front of the standard container 110 .
- the container identification tag 510 is a configuration corresponding to the visual tag.
- the container identification tag 510 includes identifier information and object information of the standard container 110 .
- the identifier information indicates the identification number of the standard container 110 .
- the object information may correspond to, for example, a category (eg, food, clothing, miscellaneous goods, etc.) of an object contained in the standard container 110 , a product model, a product name, and the like.
- When the input image is a moving picture and a new container identification tag 510 is detected in a frame, the object volume measurement apparatus 100 defines a new standard container area. According to an embodiment, the object volume measurement apparatus 100 may perform steps S304, S306, S308, and S310 based on the detection of the new container identification tag. According to an embodiment of the present disclosure, the object volume measurement system 10 may take stock of the warehouse as a whole by sequentially photographing the standard containers in the warehouse while moving the camera.
- the object volume measurement apparatus 100 detects and stores the container identification tag 510 from an input image corresponding to a video. For example, the object volume measurement apparatus 100 may capture 60 frames per second and store the identification numbers of the container identification tags 510 identified in the captured frames. Then, when the object volume measurement apparatus 100 detects in a new frame an identification number that does not overlap with the stored container identification tags 510, it stores the detected new identification number and performs steps S304, S306, S308, and S310 for the new standard container. The object volume measurement apparatus 100 repeats the above process for the input image and acquires the volume measurement value of each standard container.
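- The frame-handling logic just described can be sketched as follows; `read_tag_id` and `measure_container` are hypothetical helpers standing in for the tag detection and for steps S304 to S310.

```python
# Sketch of the video-handling logic described above: scan frames, remember the
# container identification tags already seen, and run the measurement steps only
# when a new identification number appears. Both helper functions are hypothetical.
def process_video(frames, read_tag_id, measure_container):
    seen_ids = set()
    volumes = {}
    for frame in frames:
        tag_id = read_tag_id(frame)                 # None if no tag in this frame
        if tag_id is None or tag_id in seen_ids:
            continue                                # no tag, or container already measured
        seen_ids.add(tag_id)
        volumes[tag_id] = measure_container(frame)  # steps S304 to S310
    return volumes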
- FIG. 6 is a diagram illustrating a visual tag according to an embodiment of the present disclosure.
- the visual tags 610a and 610b may be implemented as ARUCO markers as shown in FIG. 6 .
- An ARUCO marker is a type of two-dimensional visual marker consisting of an n×n two-dimensional bit pattern and a black border area surrounding it. The black border area improves the recognition rate of the marker.
- the two-dimensional bit pattern inside is composed of a combination of white cells and black cells, and represents predetermined information.
- the container identification tag 510 may be implemented in the form of an ARUCO marker as shown in FIG. 6 .
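- For illustration, a hedged sketch of reading such a tag with OpenCV's ArUco module (opencv-contrib-python, OpenCV 4.7+ API) follows; the dictionary choice and the use of the marker id as the container identifier are assumptions, not details fixed by the patent.

```python
# Hedged sketch of detecting a container identification tag implemented as an
# ARUCO marker (FIGS. 5 and 6). The 4x4/50-id dictionary is an assumed choice.
import cv2

def read_container_tag(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _rejected = detector.detectMarkers(gray)
    if ids is None:
        return None, None
    # First detected marker: its id serves as the container identifier, and its
    # corners locate the identifier area on the front of the standard container.
    return int(ids[0][0]), corners[0]
```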
- FIG. 7 is a diagram illustrating a flowchart of a method for measuring the volume of an object according to an embodiment of the present disclosure.
- In calculating a volume measurement value from an input image, the object volume measurement method may perform additional image processing beyond the processing described with reference to FIG. 3.
- In step S702, the object volume measurement apparatus 100 receives an input image captured by a camera.
- In step S704, the object volume measuring apparatus 100 converts the input image into a black-and-white scale.
- the object volume measuring apparatus 100 converts the input image, using a predetermined reference value, into a black-and-white scale image having two pixel values corresponding to black and white, respectively.
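- A minimal sketch of this conversion with OpenCV is shown below; the threshold of 127 is an illustrative choice, as the patent only states that a predetermined reference value is used.

```python
# Sketch of step S704: convert the input image into a two-valued black-and-white
# scale image using a predetermined reference value (here an assumed 127).
import cv2

def to_black_and_white(image_bgr, reference_value=127):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, bw = cv2.threshold(gray, reference_value, 255, cv2.THRESH_BINARY)
    return bw  # single-channel image whose pixels are 0 (black) or 255 (white)
```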
- In step S706, the object volume measurement apparatus 100 detects the standard container area from the black-and-white scale image.
- the object volume measuring apparatus 100 may detect a visual tag from an input image, and detect a standard container area based on the detected visual tag.
- the object volume measuring apparatus 100 may detect a standard container region from an input image using a machine learning model.
- the object volume measurement apparatus 100 detects a standard container area from a black-and-white scale image using the YOLO model. A process of detecting the standard container area will be described with reference to FIG. 8 .
- FIG. 8 is a diagram illustrating a process of detecting a standard container area from a black-and-white scale image, according to an embodiment of the present disclosure.
- the object volume measurement apparatus 100 inputs the black-and-white scale image 810 into the object recognition model 820.
- the object recognition model 820 is, for example, a You Only Look Once (YOLO) model.
- the object recognition model 820 may correspond to a machine learning model including a plurality of nodes and layers.
- the object recognition model 820 may include a YOLO model trained to detect the standard container 110 from the input image and to output the position and area of the standard container 110 together with the probability that the detected area corresponds to the standard container 110.
- YOLO is a CNN (Convolutional Neural Network)-based deep learning framework.
- YOLO's object recognition outputs a total of four pieces of positional information when it recognizes, in a photo, an object it was trained on in advance.
- the four positional information includes an x-coordinate, a y-coordinate, a width, and a height of the recognized object.
- Which object has been recognized, based on the four pieces of positional information, can be expressed as a rectangle 832 and text 834 through YOLO.
- one model capable of recognizing the standard container 110 through YOLO is trained. This model cannot recognize other objects and is used only to find the standard container 110 in the input black-and-white scale image 810 and obtain the four pieces of positional information.
- the object recognition model 820 using the YOLO model outputs the above-described four pieces of location information.
- the object recognition model 820 outputs an object recognition image 830 in which four pieces of location information are overlaid on a black-and-white scale image 810 .
- the object recognition image 830 includes a box 832 indicating a standard container area and an indicator 834 indicating a standard container.
- the indicator 834 indicating the standard container may include a probability that the area corresponding to the box 832 corresponds to the standard container 110 .
- the object recognition model 820 may crop the black-and-white scale image 810 to include only the standard container area corresponding to the standard container 110, generating and outputting a standard container area image.
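- A hedged sketch of this detection step follows, assuming an Ultralytics-style YOLO model fine-tuned on a single "standard container" class; the weights file name is hypothetical, and the patent does not prescribe a particular YOLO implementation.

```python
# Sketch of the standard container detection of FIG. 8. "container_yolo.pt" is a
# hypothetical weights file for a model trained only on the standard container.
import cv2
from ultralytics import YOLO

model = YOLO("container_yolo.pt")  # hypothetical fine-tuned weights

def detect_container_area(bw_image):
    bgr = cv2.cvtColor(bw_image, cv2.COLOR_GRAY2BGR)  # YOLO expects 3 channels
    boxes = model(bgr)[0].boxes
    if len(boxes) == 0:
        return None
    # The "four pieces of positional information" (x, y, width, height) plus the
    # probability that the detected area corresponds to the standard container.
    x1, y1, x2, y2 = boxes.xyxy[0].tolist()
    probability = float(boxes.conf[0])
    crop = bw_image[int(y1):int(y2), int(x1):int(x2)]  # standard container area image
    return crop, (x1, y1, x2 - x1, y2 - y1), probability
```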
- In step S708, the object volume measurement apparatus 100 performs background removal.
- the background removal process is described below.
- FIG. 9 is a diagram illustrating a background removal process according to an embodiment of the present disclosure.
- the object volume measurement apparatus 100 inputs the object recognition image 830 to the background removal module 910 .
- the background removal module 910 generates and outputs the background removal image 920 in which the background is removed from the object recognition image 830 except for the area corresponding to the standard container 110 .
- the background removal module 910 includes a Graph-cut algorithm.
- Graph-cut is one of the representative background removal algorithms.
- the graph-cut algorithm operates on a region of interest (ROI).
- the four pieces of positional information received from YOLO are used to construct this ROI, and the graph-cut algorithm labels the pixels outside and inside the ROI as foreground or background.
- the clustering operation of the basic graph-cut algorithm is used: the pixels clustered as foreground keep their color, and the colors of the pixels clustered as background are all changed to black.
- the background removal module 910 outputs a background removal image 920 .
- the background removal image 920 may include a box 832 and an indicator 834 generated by the object recognition model 820 .
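- This step can be sketched with OpenCV's GrabCut, a graph-cut-based segmentation; the ROI rectangle is the four pieces of positional information from the detector, the iteration count is an illustrative choice, and GrabCut expects a 3-channel image.

```python
# Sketch of the background removal of FIG. 9 with a graph-cut-based method
# (OpenCV GrabCut). Foreground pixels keep their color; background goes black.
import cv2
import numpy as np

def remove_background(image_bgr, roi_xywh, iterations=5):
    mask = np.zeros(image_bgr.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)  # internal GrabCut state
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, roi_xywh, bgd_model, fgd_model,
                iterations, cv2.GC_INIT_WITH_RECT)
    foreground = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return image_bgr * foreground[:, :, np.newaxis].astype(np.uint8)
```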
- In step S710, the object volume measuring apparatus 100 recognizes the container wall area corresponding to the wall surface of the standard container and the object area corresponding to the object from the background removal image 920.
- the object volume measuring apparatus 100 may recognize the container wall area and the object area using the first machine learning model.
- the object volume measuring apparatus 100 generates a first output image in which the container wall area and the object area are recognized. The operation of step S710 will be described in detail with reference to FIG. 10 .
- FIG. 10 is a diagram illustrating a process of generating a first output image from a background removal image according to an embodiment of the present disclosure.
- the object volume measuring apparatus 100 generates a first output image 1030, in which the object area and the container area are separated, by inputting the background removal image 1010 to the first machine learning model 1020.
- the background removal image 1010 is an image in which the box 832 and the indicator 834 are removed from the background removal image 920 described with reference to FIG. 9 .
- the first machine learning model 1020 receives the background removal image 1010 and recognizes and classifies an object from the background removal image 1010 .
- the first machine learning model 1020 includes a semantic segmentation model.
- the output of the Semantic Segmentation model is an image in which the object inside the standard container 110 and the wall of the container are displayed in different colors.
- A semantic segmentation model is a machine learning model that divides the objects in an image into meaningful units; it predicts which class each pixel of the image belongs to. According to an embodiment of the present disclosure, the semantic segmentation model defines the object inside the standard container 110 and the container wall as classes, and classifies each pixel of the image input to the model as object or container wall.
- the semantic segmentation model uses a convolutional neural network (CNN) model.
- CNN convolutional neural network
- the semantic segmentation model can generate and output a segmentation map indicating the predicted class of each pixel.
- the semantic segmentation model generates a first output image 1030 in which an object region and a container wall region are displayed in different colors.
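- Turning the predicted segmentation map into the first output image can be sketched as follows; the class ids and the two colors are illustrative assumptions.

```python
# Sketch of producing the first output image from the segmentation map: the
# container wall area and the object area are painted in two assumed colors.
import numpy as np

WALL, OBJECT = 0, 1                 # assumed class ids in the segmentation map
PALETTE = {WALL: (90, 90, 90),      # grey for the container wall area
           OBJECT: (0, 200, 0)}     # green for the object area

def to_first_output_image(seg_map):
    out = np.zeros((*seg_map.shape, 3), dtype=np.uint8)
    for cls, color in PALETTE.items():
        out[seg_map == cls] = color
    return out
```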
- the object volume measuring apparatus 100 calculates the number of pixels in each of the container wall area and the object area, and calculates the pixel ratio of the object area.
- the object volume measuring apparatus 100 calculates the number of pixels in each of the container wall area and the object area from the first output image 1030 .
- the object volume measuring apparatus 100 calculates the number of pixels in the entire area including the container wall area and the object area.
- the pixel ratio of the object area corresponds to the ratio of the number of pixels of the object area to the number of pixels of the entire area.
- the pixel ratio can be defined as a percentage.
- the object volume measuring apparatus 100 may calculate the number of pixels in each of the container wall area and the object area by applying different weights to the respective container wall surfaces. A process of calculating the volume of an object according to an embodiment of the present disclosure will be described in detail with reference to FIG. 11 .
- FIG. 11 is a diagram illustrating a process of calculating the volume of an object contained in a standard container, according to an embodiment of the present disclosure.
- the final result of the semantic segmentation model is the first output image 1030 in which the object inside the standard container 110 and the container wall are displayed in two colors.
- the number of pixels corresponding to each color is determined from the final image. If the number of pixels in the container wall area is a and the number of pixels in the object area is b, the most basic pixel ratio of the object area can be calculated through Equation 1:
- R = b / (a + b) × 100 ... (Equation 1), where R is the percentage of pixels in the object area.
- the pixels of the container wall region are divided into a total of four regions 1121, 1122, 1123, and 1124, and different region weights are given to each region 1121, 1122, 1123, and 1124.
- the region weight has a value between 0 and 1.
- the weight of each wall surface of the standard container 110 of the rectangular parallelepiped shape may increase in the order of both sides, the front side, and the bottom side.
- the area weight is used to correct shadows appearing in the input image.
- an embodiment of the present disclosure increases the weight of each wall surface in the order of both sides, the front side, and the bottom side by reflecting the characteristics of the shadow.
- Suppose the number of pixels b of the object area inside the standard container 110 is 5000.
- the number of pixels belonging to the first area 1121 is 800,
- the number of pixels belonging to the second area 1122 is 700,
- the number of pixels belonging to the third area 1123 is 950,
- and the number of pixels belonging to the fourth area 1124 is 2450.
- the region weight of the first region 1121 and the third region 1123 is defined as 0.7
- the region weight of the second region 1122 is 0.8
- the region weight of the fourth region 1124 is defined as 1.
- the number of pixels in the first region 1121 is multiplied by 0.7,
- the number of pixels in the second region 1122 is multiplied by 0.8,
- the number of pixels in the third region 1123 is multiplied by 0.7,
- and the number of pixels in the fourth region 1124 is multiplied by 1.
- the sum of the number of pixels multiplied by the weight is 4340.
- compared with the pixel ratio of the object area calculated using Equation 1 above, using the pixel count multiplied by the area weights applies an adjustment of approximately 13% to the pixel ratio.
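- A minimal sketch of Equations 1 and 2 as reconstructed here is given below; combining the weighted object pixel count with the raw wall pixel count follows the claim wording, and since the worked example above does not spell out all intermediate values, small numeric differences from these functions are expected.

```python
# Sketch of Equations 1 and 2. Equation 2 combines the weighted object pixel
# count b_w with the container wall pixel count a, per the claim wording.
def pixel_ratio(wall_px, object_px):
    """Equation 1: R = b / (a + b) x 100."""
    return 100.0 * object_px / (wall_px + object_px)

def weighted_pixel_ratio(wall_px, sub_object_px, weights):
    """Equation 2: R = b_w / (a + b_w) x 100, with b_w = sum(w_i * b_i)."""
    b_w = sum(n * w for n, w in zip(sub_object_px, weights))
    return 100.0 * b_w / (wall_px + b_w)

# Numbers from the worked example: a = 5200, b split over regions 1121-1124.
print(pixel_ratio(5200, 5000))                     # ~49.0 %
print(weighted_pixel_ratio(5200,
                           [800, 700, 950, 2450],  # sub-object pixel counts
                           [0.7, 0.8, 0.7, 1.0]))  # region weights
```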
- FIG. 12 is a diagram illustrating an output of a first machine learning model according to an embodiment of the present disclosure.
- an error caused by a shadow occurs in the first output image 1210 in which each pixel is classified.
- regions 1220 , 1222 , and 1224 are regions corresponding to errors caused by shadows.
- Areas 1220 , 1222 , and 1224 partially include a portion corresponding to the container wall area of the standard container 110 , and include pixels incorrectly recognized as object areas due to shadows.
- to compensate for shadows, the volume calculation module 1110 divides the weights into a total of four regions.
- the region weight is set low for the sides where shadows occur the most, that is, the first area 1121 and the third area 1123.
- the weight of the second area 1122, which has less shadow than the first area 1121 and the third area 1123, is set higher than those of the first area 1121 and the third area 1123.
- a weight of 1 is applied to the fourth area 1124 , which is not affected by shadows, in order to reflect the number of pixels as it is.
- the volume calculation module 1110 receives the number and positions of the pixels in the object area as inputs and divides them into the four areas.
- the volume calculation module 1110 may use a beta function that outputs the number of pixels of the object area in each of the four areas from the first output image.
- the number of object-area pixels in each area is multiplied by the weight of that area, and the sum of the weighted object-area pixel counts is output.
- the volume calculation module 1110 calculates the weighted pixel ratio of the object area using Equation 2:
- R = b_w / (a + b_w) × 100, where b_w = w1·b1 + w2·b2 + w3·b3 + w4·b4 ... (Equation 2); here R is the percentage of pixels in the object area, a is the number of pixels in the container wall area, b_i is the number of object-area pixels belonging to the i-th area, and w_i is the region weight of the i-th area.
- As in the example above, the number of pixels b of the object area inside the standard container 110 is 5000, and the numbers of pixels belonging to the first area 1121, the second area 1122, the third area 1123, and the fourth area 1124 are 800, 700, 950, and 2450, respectively.
- the value of a, the number of pixels in the container wall area, is 5200.
- as before, the region weights of the first region 1121 and the third region 1123 are defined as 0.7, the region weight of the second region 1122 is 0.8, and the region weight of the fourth region 1124 is defined as 1.
- the R value is 45.59%.
- the object volume measurement apparatus 100 generates a volume measurement value based on the pixel ratio of the object area.
- the object volume measurement apparatus 100 may define a pixel ratio of the object area as a volume measurement value.
- the pixel ratio of the object area may correspond to the R value of Equation 1 or the R value of Equation 2 described above.
- the object volume measurement apparatus 100 may define a value obtained by multiplying the pixel ratio of the object area by a predetermined reference value as the object volume measurement value.
- the predetermined reference value may correspond to the volume value when the standard container 110 is completely filled with the object.
- FIG. 13 is a diagram illustrating a structure of a processor according to another embodiment of the present disclosure.
- the processor 220 includes a black-and-white conversion module 1310 , an object recognition model 820 , a background removal module 910 , a first machine learning model 1020 , and a volume calculation module 1110 .
- Each block in the processor 220 corresponds to a software module, a hardware module, or a combination of a software module and a hardware module. Therefore, embodiments of the present disclosure are not limited by the structure of the blocks in the processor 220; the blocks in the processor 220 may be combined with each other, or one block may be divided into a plurality of blocks.
- Since the operation of each module of FIG. 13 is similar to the operation of each step described with reference to FIG. 7, the operation of each module is described only briefly here to avoid duplication. The description of the apparatus given with reference to FIG. 7 may also be applied to each module of FIG. 13.
- the black-and-white conversion module 1310 generates a black-and-white scale image by converting the input image 810 into a black-and-white scale.
- the black-and-white conversion module 1310 converts the input image 810 into a black-and-white scale in order to reduce the amount of processing.
- the processor 220 can then process the input image as a single channel instead of the three channels R, G, and B, so the throughput can be reduced.
- the object recognition model 820 detects the standard container area from the black-and-white input image.
- the object recognition model 820 may include a YOLO model.
- the object recognition model 820 generates an object recognition image 830 corresponding to the standard container area and outputs it to the background removal module 910.
- the background removal module 910 generates a background removal image 920 in which a background is removed from the object recognition image 830 except for an area corresponding to the standard container 110 .
- the background removal module 910 outputs the background removal image 920 to the first machine learning model 1020 .
- the first machine learning model 1020 receives the background removed image 920 , and recognizes and classifies an object from the background removed image 920 .
- the first machine learning model 1020 includes a semantic segmentation model.
- the output of the Semantic Segmentation model is an image in which the object inside the standard container 110 and the wall of the container are displayed in different colors.
- the first machine learning model 1020 outputs a first output image 1030 in which the object area and the container wall area are displayed in different colors.
- the first machine learning model 1020 may be trained using the Tensorflow API.
- the first machine learning model 1020 is trained on training data that uses the background-removed image 920 as input data and image data in which the object has been recognized and classified as output data.
- the input data and the output data may have a predetermined size, which may be defined as, for example, 250×250 or 128×128.
- any training engine or data augmentation algorithm may be used.
- the volume calculation module 1110 receives the first output image 1030 from the first machine learning model 1020 .
- the volume calculation module 1110 generates and outputs a volume calculation value from the first output image 1030 in the manner described above with reference to FIG. 11 .
- FIG. 14 is a diagram illustrating the structure of a CNN model according to an embodiment of the present disclosure.
- the first machine learning model 1020 includes an artificial deep neural network with a CNN structure.
- the CNN structure includes convolution layers and a fully connected layer.
- the convolution layers perform feature extraction.
- the convolution part includes a convolution layer, an activation layer, and a pooling layer.
- the features of the input vector are extracted by the convolution layers.
- after the convolution layers, a fully connected layer is placed.
- the fully connected layer generates an output vector from the features extracted by the convolution layers.
- the fully connected layer is computed by connecting all nodes between layers.
- the first machine learning model 1020 may be trained with training data based on a model including the CNN structure.
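- An illustrative Keras sketch of this structure is given below: convolution, activation, and pooling layers for feature extraction, followed by a fully connected layer producing the output vector. The layer sizes and the 128×128 single-channel input are assumptions; the patent fixes only the overall structure and mentions the TensorFlow API.

```python
# Sketch of the CNN structure of FIG. 14 (convolution + activation + pooling,
# then fully connected). All layer sizes are illustrative assumptions.
import tensorflow as tf

def build_cnn(input_size=128, num_classes=2):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_size, input_size, 1)),
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),  # fully connected layer
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

model = build_cnn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```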
- FIG. 15 is a diagram illustrating image data of a standard container area and a first output image according to an embodiment of the present disclosure.
- the image data 1510 of the standard container area may be generated such that the standard container is disposed in the center. According to an embodiment, the side 1512 of the standard container may be recognized and displayed on the image data 1510 of the standard container area.
- the first output image 1520 classifies object types and indicates regions corresponding to each object type with the same pixel value or pattern.
- the disclosed embodiments may be implemented in the form of a computer-readable recording medium storing instructions and data executable by a computer.
- the instructions may be stored in the form of program code and, when executed by a processor, may generate a predetermined program module to perform a predetermined operation. Further, the instructions, when executed by a processor, may perform certain operations of the disclosed embodiments.
Abstract
The invention relates to an object volume measurement method comprising the steps of: receiving an input image; detecting a standard container area corresponding to a predetermined standard container from the input image; recognizing, from the input image by means of a first machine learning model, a container wall area corresponding to the wall surfaces of the standard container and an object area corresponding to an object contained in the standard container; calculating the ratio of the pixels of the object area to the pixels of the entire area comprising the container wall area and the object area; and generating a volume measurement value of the object on the basis of the pixel ratio.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020200173501A KR102597692B1 (ko) | 2020-12-11 | 2020-12-11 | Apparatus, method, and computer program for measuring the volume of an object using an image |
KR10-2020-0173501 | 2020-12-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022124673A1 (fr) | 2022-06-16 |
Family
ID=81973779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2021/017807 WO2022124673A1 (fr) | 2021-11-30 | 2020-12-11 | 2021-11-30 | Device and method for measuring the volume of an object in a container based on a camera image using a machine learning model |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102597692B1 (fr) |
WO (1) | WO2022124673A1 (fr) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101375018B1 (ko) * | 2012-11-22 | 2014-03-17 | 경일대학교산학협력단 | Method and apparatus for providing food information through image recognition |
KR101893098B1 (ko) * | 2014-08-18 | 2018-08-29 | 안상요 | Food waste collection container and food waste collection system using the same |
KR20200125131A (ko) * | 2019-04-26 | 2020-11-04 | (주)제이엘케이 | Artificial intelligence-based image thickness measurement method and system |
- 2020-12-11: KR application filed — KR1020200173501A, patent KR102597692B1 (ko), active IP Right Grant
- 2021-11-30: WO application filed — PCT/KR2021/017807, publication WO2022124673A1 (fr), active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100998885B1 (ko) * | 2009-11-19 | 2010-12-08 | 한국건설기술연구원 | Method for recognizing a liquid boundary using temporal changes in the pixel intensity distribution of an image, and liquid level recognition apparatus using the same |
KR20150103995A (ko) * | 2014-03-04 | 2015-09-14 | 주식회사 영국전자 | Method for inspecting the inner wall of a tank |
JP2019519757A (ja) * | 2016-04-27 | 2019-07-11 | Ventana Medical Systems, Inc. | Systems and methods for real-time volume control |
KR101873124B1 (ko) * | 2016-12-30 | 2018-06-29 | 부산대학교 산학협력단 | Method and system for measuring the liquid level of a liquid storage tank |
JP2020024108A (ja) * | 2018-08-06 | 2020-02-13 | 地方独立行政法人 岩手県工業技術センター | Storage amount estimation device for a storage tank |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024168419A1 (fr) * | 2023-02-17 | 2024-08-22 | Binsentry Inc. | Method for mapping bulk material in a bin using machine learning |
CN117078103A (zh) * | 2023-08-29 | 2023-11-17 | 南京图灵信息技术有限公司 | Commodity quality monitoring data processing method and device |
CN117078103B (zh) * | 2023-08-29 | 2024-02-13 | 南京图灵信息技术有限公司 | Commodity quality monitoring data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
KR20220083347A (ko) | 2022-06-20 |
KR102597692B1 (ko) | 2023-11-03 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21903719; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 21903719; Country of ref document: EP; Kind code of ref document: A1 |