KR20210092552A

KR20210092552A - Method and device for detecting nutrition data of product

Info

Publication number: KR20210092552A
Application number: KR1020200006081A
Authority: KR
Inventors: 이병정; 민경식; 최지수; 이철훈; 정동주
Original assignee: 서울시립대학교 산학협력단
Priority date: 2020-01-16
Filing date: 2020-01-16
Publication date: 2021-07-26
Also published as: KR102376313B1

Abstract

A method for extracting nutrition information of a product comprises: a step of obtaining an image of a product; a step of recognizing a label of the product by inputting the image of the product into a first deep learning model; a step of recognizing a content information object comprising content information by inputting the label image of the product into a second deep learning model; and a step of recognizing content information from the content information object through preset recognition means. According to the present invention, a user can use products correctly on the basis of their ingredient and content information.

Description

METHOD AND DEVICE FOR DETECTING NUTRITION DATA OF PRODUCT

The present invention relates to a method and apparatus for extracting nutritional information of a product.

In daily life, everyone purchases and uses a variety of products in a variety of situations. Buyers generally consider factors such as design, price, and practicality when choosing a product. However, for certain products the detailed components need to be checked. These include health supplements, pharmaceuticals, and cosmetics. Such products are directly related to health and call for special care, yet it is difficult for ordinary users to check every constituent substance. Because continued indiscriminate use under these circumstances carries a risk of misuse and abuse, users should be made aware of a product's ingredients when purchasing it, so that side effects caused by mistakes are prevented.

Various object detection techniques can be used to recognize the ingredients of such products. Object detection refers to identifying where (x, y) and at what size (w, h) a target object (label) exists in an image. A typical object detection technique detects objects in an image and marks each one with a bounding box indicating which object it is.
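As a minimal sketch of this representation, the Python snippet below models a detection as a labeled box with a position (x, y) and a size (w, h). The `BoundingBox` class and its `corners` helper are hypothetical names introduced only for illustration, and treating (x, y) as the top-left corner is an assumption; the specification does not state which point the coordinates refer to.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """A detected object: what it is (label), where it is (x, y), how big (w, h)."""
    label: str
    x: float  # assumed: left edge, in pixels
    y: float  # assumed: top edge, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels

    def corners(self) -> Tuple[float, float, float, float]:
        """Return (left, top, right, bottom), a form convenient for drawing or cropping."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)


# Illustrative values only; they are not taken from the figures.
box = BoundingBox(label="Label", x=40, y=25, w=200, h=120)
left, top, right, bottom = box.corners()
```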

Commonly used object detection techniques include R-CNN (Region-based Convolutional Neural Network) and Fast/Faster R-CNN. R-CNN extracts about 2,000 sub-images from the input image, processes each of them with a CNN, and then classifies each object with an SVM (Support Vector Machine).

Fast/Faster R-CNN is intended to relieve the bottleneck of R-CNN: instead of passing every proposal through the CNN separately, the entire image passes through the CNN once, with an RPN (Region Proposal Network) used to generate the proposals.

Prior art document: Korean Patent Publication No. 2018-0065856

An object of the present invention is to provide a method and apparatus for extracting nutritional information of a product by acquiring an image of the product and inputting it into a first deep learning model to recognize the label of the product.

Another object is to provide a method and apparatus for extracting nutritional information of a product by inputting the image of the recognized label into a second deep learning model to recognize a content information object containing content information, and recognizing the content information from the content information object through a preset recognition means.

However, the technical problems to be solved by the present embodiment are not limited to those described above, and other technical problems may exist.

As a means for achieving the above technical problems, one embodiment of the present invention may provide a method for extracting product nutrition information, comprising: acquiring an image of a product; recognizing a label of the product by inputting the image of the product into a first deep learning model; recognizing a content information object containing content information by inputting the image of the label into a second deep learning model; and recognizing the content information from the content information object through a preset recognition means.

Another embodiment of the present invention may provide an apparatus for extracting product nutrition information, comprising: an acquisition unit that acquires an image of a product; a label recognition unit that recognizes a label of the product by inputting the image of the product into a first deep learning model; an object recognition unit that recognizes a content information object containing content information by inputting the image of the label into a second deep learning model; and a content information recognition unit that recognizes the content information from the content information object through a preset recognition means.

The above means for solving the problems are merely exemplary and should not be construed as limiting the present invention. In addition to the exemplary embodiments described above, there may be additional embodiments described in the drawings and the detailed description.

According to any one of the above means for solving the problems, it is possible to provide a method and apparatus that recognize the label of a product from an image of the product, extract nutritional information including ingredient information and content information from the image of the label, and provide the extracted nutritional information to the user.

By displaying the ingredient information and content information of the product through an application, it is possible to provide a method and apparatus for extracting nutritional information that help the user take or use the product according to the correct dosage.

FIG. 1 is a block diagram of an apparatus for extracting product nutrition information according to an embodiment of the present invention.
FIG. 2 is an exemplary view illustrating a label of a product recognized from an image of the product according to an embodiment of the present invention.
FIG. 3 is an exemplary view illustrating ingredient name objects and content information objects of a product recognized from an image of the label of the product according to an embodiment of the present invention.
FIG. 4 is an exemplary view illustrating a screen that provides ingredient names and content information of a product according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method of extracting nutritional information of a product in the apparatus for extracting product nutrition information according to an embodiment of the present invention.
FIG. 6 is an exemplary view comparing the product nutrition information extraction method of the present invention with that of the prior art according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can easily practice them. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to explain the present invention clearly, and similar reference numerals denote similar parts throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case of being "directly connected" but also the case of being "electrically connected" with another element interposed between them. In addition, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them, and it should be understood that the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof is not precluded in advance.

In this specification, a "unit" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. In addition, one unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware.

Some of the operations or functions described in this specification as being performed by a terminal or device may instead be performed by a server connected to that terminal or device. Likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of an apparatus for extracting product nutrition information according to an embodiment of the present invention. Referring to FIG. 1, the product nutrition information extraction apparatus 100 may include an acquisition unit 110, a label recognition unit 120, an object recognition unit 130, a content information recognition unit 140, and a mapping unit 150.

The acquisition unit 110 may acquire an image of the product. For example, the acquisition unit 110 may acquire video of the product captured at close range in real time.

The label recognition unit 120 may recognize the label of the product by inputting the image of the product into the first deep learning model. Here, the first deep learning model may be a YOLO (You Only Look Once) network. For example, the label recognition unit 120 may recognize the label of the product by inputting the video of the product into the first deep learning model.

Unlike the existing R-CNN and Fast/Faster R-CNN, the YOLO network applies a single neural network to the entire image, divides the image into regions, and predicts a bounding box and probabilities for each region, so that it makes its predictions by looking at the image as a whole. However, because the YOLO network is specialized for real-time object recognition, it has the disadvantage of not performing well at extracting text from an image.

By using the YOLO network to track objects in multiple stages, the present invention can provide a method for extracting product nutrition information that performs well while still making use of the YOLO network.

Through the YOLO network, the label recognition unit 120 may find bounding boxes, which indicate the positions of objects in the image, and class probabilities, which indicate how close each bounding box is to each of a set of predefined classes.

Because the label recognition unit 120 uses the YOLO network, object recognition is treated as a single regression problem, that is, one of determining the relationship between one independent variable and one dependent variable, which makes it possible to infer the type and position of an object by looking at the image only once.

The process by which the label recognition unit 120 recognizes the label of the product is described in detail with reference to FIG. 2.

FIG. 2 is an exemplary view illustrating a label of a product recognized from an image of the product according to an embodiment of the present invention. Referring to FIG. 2, when the label recognition unit 120 inputs the image 200 of the product into the first deep learning model, the first deep learning model divides the image 200 into a grid and calculates a confidence score for each bounding box corresponding to the grid.

The label recognition unit 120 may then, through the single convolutional network of the first deep learning model, determine the final bounding box as the one with the highest confidence among the many bounding boxes, and thereby recognize the label 250 of the product. In FIG. 2, the bounding box marked 'Label' shows that the label 250 of the product has been recognized.

Through this process, the label recognition unit 120 recognizes the label of the product and finally extracts only the image of the label 250 (the image region corresponding to the bounding box) to generate a result image.
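The selection and cropping step can be sketched as follows. This is only an illustration under stated assumptions: the detections are taken as already produced by some detector, the image is a plain nested list of pixel values rather than a real image type, and the names `Detection`, `pick_best`, and `crop` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Detection:
    """Candidate bounding box with its confidence score (hypothetical type)."""
    x: int  # assumed: left column of the box
    y: int  # assumed: top row of the box
    w: int  # width in pixels
    h: int  # height in pixels
    confidence: float


def pick_best(detections: Sequence[Detection]) -> Detection:
    """Return the candidate with the highest confidence."""
    if not detections:
        raise ValueError("no label candidates were detected")
    return max(detections, key=lambda d: d.confidence)


def crop(image: List[List[int]], det: Detection) -> List[List[int]]:
    """Keep only the pixels inside the bounding box (rows first, then columns)."""
    return [row[det.x:det.x + det.w] for row in image[det.y:det.y + det.h]]


# Toy 4x6 "image" whose pixel value encodes its position as row * 10 + column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
candidates = [Detection(0, 0, 2, 2, 0.40), Detection(1, 1, 3, 2, 0.90)]

best = pick_best(candidates)
label_image = crop(image, best)
```

Here `label_image` plays the role of the result image that is handed to the second stage.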

Returning to FIG. 1, the object recognition unit 130 may run two independent object recognition processes on the image of the label using the second deep learning model, thereby recognizing ingredient name objects and content information objects separately from the image of the label. Here, the second deep learning model may also be a YOLO network. In the present invention, running the two object recognition processes independently with the second deep learning model can further improve the accuracy of object recognition. The process of recognizing the ingredient name objects and the content information objects is described in detail with reference to FIG. 3.

FIG. 3 is an exemplary view illustrating ingredient name objects and content information objects of a product recognized from an image of the label of the product according to an embodiment of the present invention. Referring to FIG. 3, the object recognition unit 130 may recognize ingredient name objects, each containing an ingredient name, by inputting the image 300 of the label into the second deep learning model. Here, the ingredient name objects may be recognized by a second deep learning model that has been trained on a variety of ingredient name images.

For example, the object recognition unit 130 may divide the image 300 of the label into a grid, calculate a confidence score for each detailed bounding box corresponding to the grid, and determine the final bounding boxes as those with the highest confidence among the many detailed bounding boxes, thereby recognizing ingredient name objects such as "Calories" (310), "Total Fat" (320), "Cholesterol" (330), "Sodium" (340), and "Total Carbohydrate" (350).

The object recognition unit 130 may also recognize content information objects, each containing content information, by inputting the image 300 of the label into the second deep learning model. Here, the second deep learning model has been trained on content information expressed in the form 'number%', so it can recognize objects of the same form.

For example, the object recognition unit 130 may divide the image 300 of the label into a grid, calculate a confidence score for each detailed bounding box corresponding to the grid, and determine the final bounding boxes as those with the highest confidence among the many detailed bounding boxes, thereby recognizing each "content" item containing the "%" character as a content information object. The content information objects corresponding to "content" may be extracted from separate images and used for training.

The YOLO network is used for both the first deep learning model and the second deep learning model in order to raise the accuracy of the YOLO network. In the first stage, the first deep learning model lets the YOLO network concentrate solely on recognizing the product label, which improves the accuracy of recognizing, within the entire image, the label that contains the nutritional information. In the second stage, the second deep learning model lets the YOLO network concentrate solely on the ingredient name objects and content information objects, which improves speed and accuracy compared with searching for ingredient names in the entire image.

The content information recognition unit 140 may then recognize the content information from the content information objects through a preset recognition means. Here, the preset recognition means may be OCR (Optical Character Reader). For example, the content information recognition unit 140 may use OCR to recognize the numbers corresponding to content information, such as "0%" (325), "0%" (335), "1%" (345), and "2%" (355), from the content information objects.
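The specification does not say how the recognized text is turned into numbers. As one possible sketch, the snippet below parses strings of the 'number%' form with a regular expression. It does not call any OCR engine: the input strings simply stand in for OCR output, and `parse_percent` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches an integer or decimal number followed by a percent sign, e.g. "2%" or "1.5 %".
_PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def parse_percent(ocr_text: str) -> Optional[float]:
    """Return the percentage found in the text, or None if there is none."""
    match = _PERCENT.search(ocr_text)
    return float(match.group(1)) if match else None


# Strings standing in for the text read from each content information object.
values = [parse_percent(text) for text in ["0%", "0%", "1%", "2 %"]]
missing = parse_percent("Calories")
```

Returning `None` for text without a percentage leaves the decision about unreadable objects to the caller.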

The present invention uses OCR to recognize only the content information because, when OCR is used to recognize all the text contained in an entire image as in the prior art, it becomes slower and less accurate the larger the image is and the more content it contains. Accordingly, the present invention applies OCR only to the content information objects detected by the multi-stage YOLO network, so that the nutritional information of the product can be processed in real time at higher speed.

The mapping unit 150 may map the ingredient names contained in the recognized ingredient name objects to the content information contained in the recognized content information objects.

The mapping unit 150 may map each ingredient name to its corresponding content information based on the stored position information of each ingredient name object and each content information object. Here, the mapping unit 150 may compare the y value of the coordinates of an ingredient name object with the y value of the coordinates of a content information object, and then pair and map the ingredient name and the content information located at similar positions.

For example, referring to FIG. 3, the mapping unit 150 may map the ingredient name object "Total Fat" (320) to "0%" (325), the content information object with a similar y-coordinate, and map the ingredient name object "Total Carbohydrate" (350) to "2%" (355), the content information object with a similar y-coordinate.

However, for the ingredient name object "Calories" (310), the mapping unit 150 may skip the mapping because no content information object with a similar y-coordinate exists.
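The y-coordinate comparison described above can be sketched as a nearest-neighbor match with a cutoff. The function name `map_by_y`, the `tolerance` parameter, and all coordinate values below are assumptions made for illustration; the specification only states that objects at "similar" y positions are paired.

```python
from typing import Dict, List, Tuple

Located = Tuple[str, float]  # (recognized text, y-coordinate of its bounding box)


def map_by_y(names: List[Located], contents: List[Located],
             tolerance: float) -> Dict[str, str]:
    """Pair each ingredient name with the content value closest to it in y.

    A name is left unmapped when no content value lies within `tolerance`.
    """
    pairs: Dict[str, str] = {}
    for name, name_y in names:
        nearest = min(contents, key=lambda c: abs(c[1] - name_y), default=None)
        if nearest is not None and abs(nearest[1] - name_y) <= tolerance:
            pairs[name] = nearest[0]
    return pairs


# Made-up coordinates that mimic a label laid out in rows.
names = [("Calories", 10.0), ("Total Fat", 40.0), ("Cholesterol", 70.0),
         ("Sodium", 100.0), ("Total Carbohydrate", 130.0)]
contents = [("0%", 41.0), ("0%", 69.0), ("1%", 101.0), ("2%", 131.0)]

mapped = map_by_y(names, contents, tolerance=5.0)
```

With these values "Calories" stays unmapped because its nearest content value is 31 pixels away, which mirrors the case described for FIG. 3.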

Through this process, the product nutrition information extraction apparatus 100 can finally recognize ingredient names such as "Total Fat", "Saturated Fat", and "Cholesterol" together with the content information mapped to them.

FIG. 4 is an exemplary view illustrating a screen that provides ingredient names and content information of a product according to an embodiment of the present invention. Referring to FIG. 4, the product nutrition information extraction apparatus 100 may capture an image of the product through an application that provides nutritional information, recognize the image of the product label in real time, recognize the ingredient name objects and content information objects, and map the ingredient names contained in the ingredient name objects to the content information contained in the content information objects, thereby providing the ingredient names of the product and the content information of each ingredient.

For example, the product nutrition information extraction apparatus 100 may provide, through a recognition result screen 400 for the product, each ingredient name together with its corresponding content information, such as "Calories: 30%", "Total Fat: 0%", "Cholesterol: 0%", "Sodium: 1%", and "Total Carbohydrate: 2%".

The product nutrition information extraction apparatus 100 may be operated by a computer program stored in a medium and containing a sequence of instructions for extracting nutritional information of a product. The computer program may include a sequence of instructions that, when executed by a computing device, cause it to acquire an image of a product, recognize the label of the product by inputting the image into the first deep learning model, recognize a content information object containing content information by inputting the image of the label into the second deep learning model, and recognize the content information from the content information object through a preset recognition means.

FIG. 5 is a flowchart of a method of extracting nutritional information of a product in the apparatus for extracting product nutrition information according to an embodiment of the present invention. The method shown in FIG. 5 includes steps processed in time series by the product nutrition information extraction apparatus 100 according to the embodiments shown in FIGS. 1 to 4. Therefore, even where it is omitted below, the description given for the embodiments shown in FIGS. 1 to 4 also applies to the method of extracting nutritional information of a product in the product nutrition information extraction apparatus 100.

In step S510, the product nutrition information extraction apparatus 100 may acquire an image of the product.

In step S520, the product nutrition information extraction apparatus 100 may recognize the label of the product by inputting the image of the product into the first deep learning model.

In step S530, the product nutrition information extraction apparatus 100 may recognize ingredient name objects containing ingredient names by inputting the image of the label into the second deep learning model.

In step S540, the product nutrition information extraction apparatus 100 may recognize content information objects containing content information by inputting the image of the label into the second deep learning model.

In step S550, the product nutrition information extraction apparatus 100 may recognize the content information from the content information objects through a preset recognition means.

In step S560, the product nutrition information extraction apparatus 100 may map the ingredient names contained in the recognized ingredient name objects to the content information contained in the recognized content information objects.

In the above description, steps S510 to S560 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.
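The order of steps S510 to S560 can be summarized in a short Python sketch. Every stage is passed in as a callable, because the specification does not define concrete model or OCR interfaces. The function `extract_nutrition`, its parameter names, and the stub stages in the demo are all hypothetical stand-ins, not the apparatus's actual implementation.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]      # stand-in image type (assumption)
Located = Tuple[str, float]  # (text, y-coordinate)


def extract_nutrition(
    image: Image,  # S510: the acquired product image
    recognize_label: Callable[[Image], Image],
    recognize_names: Callable[[Image], List[Located]],
    recognize_content_objects: Callable[[Image], List[Tuple[Image, float]]],
    read_text: Callable[[Image], str],
    map_pairs: Callable[[List[Located], List[Located]], Dict[str, str]],
) -> Dict[str, str]:
    label_image = recognize_label(image)                       # S520: first model
    names = recognize_names(label_image)                       # S530: second model
    content_objects = recognize_content_objects(label_image)   # S540: second model
    contents = [(read_text(obj), y) for obj, y in content_objects]  # S550: OCR
    return map_pairs(names, contents)                          # S560: mapping


# Demo with trivial stubs in place of the models and the OCR engine.
result = extract_nutrition(
    image=[[0]],
    recognize_label=lambda img: img,
    recognize_names=lambda img: [("Sodium", 10.0)],
    recognize_content_objects=lambda img: [([[1]], 11.0)],
    read_text=lambda obj: "1%",
    map_pairs=lambda names, contents: {
        name: text for (name, _), (text, _) in zip(names, contents)
    },
)
```

Passing the stages as parameters keeps the sketch independent of any particular detector or OCR library, and it also reflects the statement that steps may be split, merged, or reordered.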

FIG. 6 is an exemplary view comparing the product nutrition information extraction method of the present invention with that of the prior art according to an embodiment of the present invention. Here, the prior art uses only OCR to recognize the product label (Grubert, O., Gao, L., "recognition of nutrition facts, labels from mobile images", Technical Report, Stanford University, Apr. 2014.).

Referring to FIG. 6, the present invention 600 and the prior art 610 are compared on a real-time information extraction item 620, an OCR item 621, an object recognition item 622, a multi-stage object recognition item 623, and a target image item 624.

The present invention 600 extracts ingredient names and the content information for each ingredient in real time; uses OCR as the preset recognition means; recognizes the label of the product and then recognizes the ingredient object and the content information object separately; performs multi-stage object recognition using the first deep learning model and the second deep learning model, each composed of a YOLO network; and operates on real-time images.

In contrast, the prior art 610 differs substantially in that it identifies the nutrition information of a product by applying OCR to a previously captured image of the product. When only OCR is used to identify the nutrition information of a product, a large amount of data cannot be processed in real time through OCR, so processing takes a long time, and as a result the nutrition information of the product cannot be provided in real time.

Therefore, the present invention recognizes the label of the product in real time and recognizes, in the image of the label, the ingredient object corresponding to the ingredient name. OCR is then applied only to the characters corresponding to the content information rather than to all of the characters on the label, so that the nutrition information of the product can be provided in real time.
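The multi-stage flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `detect_label`, `detect_content_objects`, and `run_ocr` are hypothetical callables standing in for the first deep learning model, the second deep learning model, and the preset recognition means, and the image is a toy list of pixel rows rather than a real video frame.

```python
from typing import Callable, List, Optional, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels
Image = List[List[int]]             # toy grayscale image: rows of pixel values


def crop(image: Image, region: Region) -> Image:
    """Return the sub-image covered by the region."""
    x, y, w, h = region
    return [row[x:x + w] for row in image[y:y + h]]


def extract_contents(
    image: Image,
    detect_label: Callable[[Image], Optional[Region]],
    detect_content_objects: Callable[[Image], List[Region]],
    run_ocr: Callable[[Image], str],
) -> List[str]:
    """Stage 1: find the label. Stage 2: find content objects. Then OCR each crop."""
    label_region = detect_label(image)
    if label_region is None:
        return []
    label_image = crop(image, label_region)
    texts = []
    for region in detect_content_objects(label_image):
        # OCR sees only the small content crop, never the whole label.
        texts.append(run_ocr(crop(label_image, region)))
    return texts
```

Passing the detectors and the OCR routine in as parameters keeps the sketch independent of any particular YOLO or OCR library. The point it illustrates is that the character recognizer is invoked only on the regions returned by the second stage.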

The method for extracting nutrition information of a product in the product nutrition information extraction apparatus described with reference to FIGS. 1 to 6 may also be implemented in the form of a computer program stored in a medium executed by a computer, or in the form of a recording medium including instructions executable by a computer.

A computer-readable medium may be any available medium that can be accessed by a computer, and includes both volatile and nonvolatile media and both removable and non-removable media. The computer-readable medium may also include a computer storage medium. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical spirit or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present invention is defined by the following claims rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: product nutrition information extraction apparatus
110: acquisition unit
120: label recognition unit
130: object recognition unit
140: content information recognition unit
150: mapping unit

Claims

A method for extracting nutrition information of a product, the method comprising:
acquiring an image of the product;
recognizing a label of the product by inputting the image of the product into a first deep learning model;
recognizing a content information object including content information by inputting an image of the label of the product into a second deep learning model; and
recognizing the content information from the content information object through a preset recognition means.

The method of claim 1,
wherein the recognizing of the content information object including the content information by inputting the image of the label of the product into the second deep learning model comprises
recognizing an ingredient name object including an ingredient name by inputting the image of the label of the product into the second deep learning model.

The method of claim 2,
further comprising mapping the ingredient name included in the recognized ingredient name object to the content information included in the recognized content information object.

The method of claim 1,
wherein the preset recognition means is OCR (Optical Character Reader).

The method of claim 1,
wherein each of the first deep learning model and the second deep learning model is a YOLO (You Only Look Once) network.

A device for extracting nutrition information of a product, the device comprising:
an acquisition unit for acquiring an image of the product;
a label recognition unit for recognizing a label of the product by inputting the image of the product into a first deep learning model;
an object recognition unit for recognizing a content information object including content information by inputting an image of the label of the product into a second deep learning model; and
a content information recognition unit for recognizing the content information from the content information object through a preset recognition means.

The device of claim 6,
wherein the object recognition unit recognizes an ingredient name object including an ingredient name by inputting the image of the label of the product into the second deep learning model.

The device of claim 7,
wherein the object recognition unit maps the ingredient name included in the recognized ingredient name object to the content information included in the recognized content information object.

The device of claim 6,
wherein the preset recognition means is OCR (Optical Character Reader).

The device of claim 6,
wherein each of the first deep learning model and the second deep learning model is a YOLO (You Only Look Once) network.