KR20200010672A

KR20200010672A - Smart merchandise searching method and system using deep learning

Info

Publication number: KR20200010672A
Application number: KR1020180078779A
Authority: KR
Inventors: 이명재
Original assignee: 주식회사 지브이글러벌
Priority date: 2018-07-06
Filing date: 2018-07-06
Publication date: 2020-01-31

Abstract

According to the present invention, a smart product search method using deep learning collects images to be learned by using an image crawler, learns the collected images by using a graphics processing unit (GPU), and, based on the learned results, recognizes keywords and/or images input by a user to provide highly relevant information on a user screen. The present invention relates to a deep learning-based product search method, which is trained by deep learning on a variety of product images and thereby displays more accurate search results than any conventional method. In particular, the present invention can search for a product more quickly and accurately by comparing a search result image with an image used in learning through deep learning-based image feature extraction, thereby preventing the search errors of the conventional product search method that arise from a lack of data and from incorrect matching of images and keywords.

Description

Smart merchandise searching method and system using deep learning

The present invention relates to a smart product search method for online shopping malls and, more particularly, to a method for finding products quickly and accurately by matching product keywords with images in the online shopping mall's database.

Currently, online shopping malls generally let users compare and search for products by classifying them by attribute, based on product information entered as text, and showing the results. This is a keyword-based image retrieval technique: a user who lacks knowledge of the product must check the products in every category directly, and because it is hard to match an image's keywords exactly at search time, product search is difficult. As an alternative, content-based image retrieval, which searches on the feature vectors of images, can be used. However, this approach is limited for large-scale image retrieval because the data volume grows and preprocessing is costly, and technical limitations have kept it from producing satisfactory search results. In particular, it is difficult to perform an accurate search when no image is available.

Techniques that combine the two approaches described above, performing content-based image retrieval with the help of image attribute information, have also been studied. However, systems that support this kind of search also have difficulty interpreting users' varied queries properly and presenting results, and problems may arise when integrating or exchanging data with other systems.

Accordingly, the present invention aims to solve these problems by providing a method that can quickly and accurately find a desired product within an online shopping mall from only a limited or inaccurate keyword or image input.

To solve the above problem, the present invention first collects images to be learned using an image crawler, learns from the collected images using a graphics processing unit (GPU), and, based on the learned results, recognizes keywords and/or images entered by the user and provides highly relevant information on the user's screen.

The present invention relates to a deep learning-based product search method that is trained by deep learning on a variety of product images and then used to search for products, yielding more accurate search results than any existing method. In particular, the invention addresses the search errors of existing product search methods, which arise from insufficient data and from inaccurate matching between images and keywords: by comparing the images used in training with search result images through deep learning-based image feature extraction, products can be found more quickly and accurately.

FIG. 1 is a schematic diagram of an online shopping mall product image learner and image recognizer.
FIG. 2 is a schematic diagram of a sparsely connected convolutional neural network and a traditional neural network.

The development of the online shopping environment has allowed consumers to compare a wide range of products in one place. However, a considerable amount of the key product information posted on online shopping malls is in image form and therefore cannot be reflected in text-based search systems that a computer can read. This limitation can generally be mitigated by using existing machine learning and optical character recognition (OCR) techniques to recognize keywords in image form. However, existing OCR technology has a low recognition rate when an image contains many pictures rather than characters and the characters are small. To address these limitations, a model for recognizing text in image-form product catalogs has been designed by modifying the Single Shot MultiBox Detector (SSD), one of the deep learning-based object recognition models. However, building the data to train it required considerable time and cost, because an SSD model follows the supervised learning methodology and every training example must be labeled by hand with the correct answer.

Unlike this prior art, the present invention is based on AI-driven machine learning and deep learning.

Artificial intelligence (AI) refers to the ability of a machine to imitate intelligent human behavior. AI can be broadly divided into general AI and applied AI. Applied AI, also called vertical AI or narrow AI, refers to "smart" systems specialized for specific needs, such as stock trading or personalized advertising. General AI, also called strong AI or full AI, refers to systems and devices that can handle any task a human can handle. These resemble the droids of science fiction films. When the general public imagines the future, it mostly has general AI in mind.

Machine learning, a subfield of AI, accounts for much of the progress being made in AI, including image recognition and natural language processing. Machine learning fundamentally uses algorithms to analyze data, learns from that analysis, and makes judgments or predictions based on what it has learned. The ultimate goal is therefore not to code specific decision rules directly into software, but to "train" the computer itself with large amounts of data and algorithms so that it learns how to perform the task.

Deep learning, meanwhile, is an advanced technology inspired by the structure of the human brain; it uses artificial neural networks to process data in a manner similar to human brain cells. A vast amount of data is fed into the neural network to "train" the system to classify the data accurately. The emergence of today's supercomputers and big data laid the groundwork for making deep learning practical. The artificial neural network, another algorithm created by early machine learning researchers, was inspired by the biology of the human brain, in particular the way neurons are connected. However, unlike the brain, where any neuron can connect to any other neuron within physical proximity, an artificial neural network has fixed layer connections and a fixed direction of data propagation. For example, if an image is cut into many tiles and fed into the first layer of the network, the neurons pass the data on to the next layer, and this is repeated until the last layer produces the final output. Each neuron assigns a weight to its input, reflecting how accurate that input is with respect to the task being performed, and the final output is determined by the total of these weights.

As described above, deep learning is a form of AI that developed from artificial neural networks, and it learns from data using layers of information input and output that resemble neurons in the brain. Because even a basic neural network requires an enormous amount of computation, a network is likely to give many wrong answers during "training". As the time and amount of training increase, however, the probability of approaching the correct answer improves greatly.

In deep learning, a computer model learns to perform classification directly from images, text, or sound. Because most deep learning methods use a neural network architecture, deep learning models are often called deep neural networks. The term "deep" refers to the number of hidden layers that make up the neural network. Traditional neural networks have only two or three hidden layers, whereas deep networks can have as many as 150. The most widely used type of deep neural network is the convolutional neural network (CNN), which extracts features by convolving the input data. Taking images as an example, a well-trained deep learning model can automatically identify objects in an image even if it has never seen that exact image before.

A convolutional neural network (CNN) is a type of multilayer perceptron designed to require minimal preprocessing. A CNN consists of one or more convolutional layers with ordinary artificial neural network layers on top, and additionally makes use of weights and pooling layers. Thanks to this structure, a CNN can take full advantage of input data with a two-dimensional structure. Compared with other deep learning architectures, CNNs perform well in both image and speech applications. CNNs can also be trained with standard backpropagation. They tend to be easier to train than other feedforward neural network techniques and have the advantage of using fewer parameters.

Machine learning can learn directly by extracting knowledge from data, but it usually passes through an intermediate feature extraction step, learning in the sequence "data - features - knowledge". For example, to recognize an object in a photograph, characteristic lines or characteristic color distributions are first extracted from the pixel values, and the identity of the object is then determined on that basis. This intermediate representation is called a feature map, and the performance of machine learning depends heavily on how good the extracted features are. Convolutional neural networks, which extract and learn features in multiple stages, are particularly useful for image recognition.

Machine learning or data extraction using such a convolutional neural network can be further improved through the following techniques.

1. Data augmentation

Machine learning requires a large amount of data, because learning from a wide variety of cases is what allows a model to make general estimates. If the amount of data is small, the model estimates well only on the data it was trained on, which leads to overfitting and degrades the performance of the AI model. Problems caused by an insufficient amount of data can be mitigated by, for example, flipping or rotating the images.
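As an illustration of the flipping and rotation mentioned above, the following is a minimal sketch using NumPy. The function name `augment` and the small integer array standing in for a product image are assumptions made for this example; the patent does not specify an implementation.

```python
import numpy as np

def augment(image):
    """Return the original image plus flipped and rotated copies."""
    return [
        image,
        np.fliplr(image),      # horizontal flip (mirror left-right)
        np.flipud(image),      # vertical flip (mirror top-bottom)
        np.rot90(image, k=1),  # rotate 90 degrees counter-clockwise
        np.rot90(image, k=2),  # rotate 180 degrees
        np.rot90(image, k=3),  # rotate 270 degrees counter-clockwise
    ]

# Hypothetical 3x4 grayscale "image" used only for demonstration.
image = np.arange(12).reshape(3, 4)
augmented = augment(image)
```

Each training image yields six samples here, so a small data set grows sixfold without any new photographs being collected.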

2. Convolution filter

For an input image, a convolutional neural network applies a filter of a fixed size, convolving the filter with the input values that fall inside the filter window and outputting the result. Expressed as a formula, if the filter is W, the convolution result can be written in the form Wx + b, where b is the bias. This is a linear model; in deeper networks, combining linear and nonlinear functions makes it possible to design more sophisticated classifiers.

The size of the image after convolution depends on the size of the filter and on the stride with which the filter moves across the image. For an input image of size N×N and a filter of size F×F, the output image has side length (N - F)/stride + 1, which shows that the feature map of the next layer is affected by the filter size. As a rule, the image becomes smaller each time it passes through a filter. To prevent this, the input image sometimes needs to be enlarged (for example, by padding) before convolution.
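The Wx + b form and the output-size formula can be checked with a small sketch. The code below is a naive single-channel convolution written for clarity, not speed; the function names and the zero-padding parameter `pad` are assumptions of this example (with padding P on each side, the standard formula becomes (N + 2P - F)/stride + 1).

```python
import numpy as np

def conv_output_size(n, f, stride=1, pad=0):
    """Side length of the output for an n x n input and an f x f filter."""
    return (n + 2 * pad - f) // stride + 1

def conv2d(x, w, b=0.0, stride=1, pad=0):
    """Naive single-channel convolution: each output value is sum(w * patch) + b."""
    x = np.pad(x, pad)                    # zero padding on every side
    f = w.shape[0]
    out = (x.shape[0] - f) // stride + 1  # same formula, applied to the padded input
    y = np.empty((out, out))
    for i in range(out):
        for j in range(out):
            patch = x[i * stride:i * stride + f, j * stride:j * stride + f]
            y[i, j] = np.sum(w * patch) + b
    return y

x = np.arange(49, dtype=float).reshape(7, 7)  # hypothetical 7x7 input image
w = np.ones((3, 3)) / 9.0                     # 3x3 averaging filter

print(conv2d(x, w).shape)            # (5, 5): (7 - 3)/1 + 1 = 5
print(conv2d(x, w, stride=2).shape)  # (3, 3): (7 - 3)/2 + 1 = 3
print(conv2d(x, w, pad=1).shape)     # (7, 7): padding keeps the input size
```

The last call shows why padding is used: with one pixel of zero padding, a 3×3 filter leaves the image size unchanged.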

3. Max pooling

Max pooling refers to downsampling a feature map that has passed through convolution. For example, given a 4×4 feature map and a 2×2 filter, the largest of the values inside each 2×2 window is extracted to produce a 2×2 result layer. With this technique, a feature map can be built from only the strongest features of the input image.
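The 4×4 to 2×2 example can be reproduced with the following sketch. A stride of 2 (non-overlapping windows) is assumed, since that is what turns a 4×4 map into a 2×2 result; the function name and the sample values are illustrative only.

```python
import numpy as np

def max_pool2d(fmap, size=2, stride=2):
    """Keep the largest value inside each size x size window."""
    out_h = (fmap.shape[0] - size) // stride + 1
    out_w = (fmap.shape[1] - size) // stride + 1
    pooled = np.empty((out_h, out_w), dtype=fmap.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * stride:i * stride + size,
                          j * stride:j * stride + size]
            pooled[i, j] = window.max()
    return pooled

# Hypothetical 4x4 feature map.
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 8, 3, 4]])
pooled = max_pool2d(fmap)
print(pooled)  # [[6 4]
               #  [8 9]]
```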

4. Softmax layer and one-hot encoding

Softmax converts the resulting scores into values between 0 and 1 whose total is 1. The exponential (exp) of each score is taken and then divided by a normalization constant so that the values sum to 1.

At the last stage of the neural network, a vector of values is produced for the N classes, with one element per class, and each element can take an arbitrary value. When this result passes through the softmax layer, each value is converted to a value between 0 and 1. One-hot encoding then converts the highest value to 1 and the remaining values to 0, so that the most dominant result can be extracted.
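A minimal sketch of these two steps follows. The three scores are made-up values for a hypothetical three-class problem, and subtracting the maximum score before exponentiating is a common numerical-stability measure added here; it does not change the result.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into values in (0, 1) that sum to 1."""
    shifted = scores - np.max(scores)  # stability: avoids overflow in exp
    exps = np.exp(shifted)
    return exps / exps.sum()           # divide by the normalization constant

def one_hot_of_max(probs):
    """Set the largest entry to 1 and all other entries to 0."""
    encoded = np.zeros_like(probs)
    encoded[np.argmax(probs)] = 1.0
    return encoded

scores = np.array([2.0, 1.0, 0.1])  # hypothetical class scores
probs = softmax(scores)
label = one_hot_of_max(probs)       # the first class wins
```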

5. Cost function and gradient descent algorithm

To learn the weight values of the network, an objective function is defined. The objective function depends on the purpose, but cross entropy is commonly used. The objective function is defined between the feature vector that has passed through the softmax layer and the known ground truth. Using gradient descent, learning then proceeds in the direction that minimizes the value of the objective function.
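The interaction of cross entropy and gradient descent can be sketched on a toy problem. The example below assumes a single linear layer (W, b) in place of a full CNN, a random four-element feature vector, and a learning rate of 0.1; all of these are illustrative choices, not values from the patent. It relies on the standard result that the gradient of softmax cross entropy with respect to the scores is (probabilities - ground truth).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(probs, target):
    """Cross entropy between predicted probabilities and a one-hot target."""
    return -np.sum(target * np.log(probs + 1e-12))

rng = np.random.default_rng(0)
x = rng.normal(size=4)               # hypothetical feature vector
target = np.array([0.0, 1.0, 0.0])   # one-hot ground truth (class 1)
W = np.zeros((3, 4))                 # weights to be learned
b = np.zeros(3)                      # bias to be learned
lr = 0.1                             # learning rate (assumed)

losses = []
for step in range(50):
    probs = softmax(W @ x + b)       # forward pass: scores -> probabilities
    losses.append(cross_entropy(probs, target))
    grad_z = probs - target          # gradient of the loss w.r.t. the scores
    W -= lr * np.outer(grad_z, x)    # step against the gradient
    b -= lr * grad_z
```

The recorded `losses` start near log 3 (three equally likely classes) and shrink as the weights move in the direction that lowers the objective.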

When the gradient descent algorithm is used, training for too many epochs tends to make the neural network produce results that fit only the specific inputs and ground truth it was trained on. This is like a student who prepares for an exam by specializing in the practice problems: the student solves those problems correctly when they appear, but falls short of expectations when different problems are set. Overfitting of this kind can be countered by adding dropout layers, input noise, weight noise, and the like.
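Of the countermeasures listed, dropout is the simplest to sketch. The version below is "inverted" dropout, a common variant chosen for this example: surviving activations are rescaled during training so that no adjustment is needed at inference time. The function name, the drop rate of 0.5, and the all-ones activation vector are assumptions.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of activations during training."""
    if not training or rate == 0.0:
        return activations                        # inference: pass through unchanged
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep   # True where a unit is kept
    return activations * mask / keep              # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
acts = np.ones(10000)                             # hypothetical layer activations
dropped = dropout(acts, rate=0.5, rng=rng)
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit memorizing the training data.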

Accordingly, in the method of the present invention, machine learning of the convolutional neural network may be performed through the following steps:

i) Receive the training sample data and the neural network data

ii) Prepare the neural network data as graphics data

iii) For each input sample:

- Output the calculation result

- Calculate the error relative to the expected output

- Determine the objective function

- Reset the convolution filter

- Repeat the above

iv) Input the next sample and repeat
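The steps above can be sketched as a per-sample training loop. This is a toy under stated assumptions, not the patented procedure: it learns a single 3×3 filter on the CPU with NumPy (no GPU preparation is shown), uses mean squared error as the objective rather than cross entropy, and generates its expected outputs from a hypothetical target filter so that the result can be checked.

```python
import numpy as np

def conv2d(x, w):
    """Naive single-channel convolution with stride 1 and no padding."""
    f = w.shape[0]
    out = x.shape[0] - f + 1
    y = np.empty((out, out))
    for i in range(out):
        for j in range(out):
            y[i, j] = np.sum(w * x[i:i + f, j:j + f])
    return y

def filter_gradient(x, err, f):
    """Gradient of the mean squared error with respect to the filter weights."""
    out = err.shape[0]
    grad = np.empty((f, f))
    for a in range(f):
        for b in range(f):
            grad[a, b] = 2.0 * np.mean(err * x[a:a + out, b:b + out])
    return grad

rng = np.random.default_rng(0)
true_filter = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])              # hypothetical target
samples = [rng.normal(size=(8, 8)) for _ in range(20)]  # i) training samples
expected = [conv2d(x, true_filter) for x in samples]    #    and expected outputs
w = np.zeros((3, 3))                                    # ii) initial filter
lr = 0.05                                               # learning rate (assumed)

epoch_losses = []
for epoch in range(30):                        # repeat over the whole data set
    total = 0.0
    for x, y_true in zip(samples, expected):   # iii) for each input sample
        y = conv2d(x, w)                       #   output the calculation result
        err = y - y_true                       #   error against the expected output
        total += np.mean(err ** 2)             #   objective function value
        w -= lr * filter_gradient(x, err, 3)   #   reset (update) the filter
    epoch_losses.append(total / len(samples))  # iv) next sample, then next epoch
```

After training, `w` should be close to `true_filter`, and `epoch_losses` records how the average objective falls from epoch to epoch.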

In the present invention, through such deep learning-based image learning, the keywords that match product images are learned as extensively and accurately as possible, and, based on the learned results, the result closest to the keyword and/or image entered by the user can be provided on the user's screen.

The present invention also provides a system for carrying out this method, the system comprising:

- an input unit for keywords and/or images entered by the user;

- an image determination unit that selects the image closest to the input keyword and/or image; and

- a result output unit that includes the selected image,

wherein the image determination unit comprises:

- a distributed file storage system,

- a deep neural network (DNN)-based single-object image recognition learning unit,

- a DNN model profile generation unit,

- a DNN profile extraction and model generation unit, and

- a product image recognition unit.
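How an image determination unit might select the "closest" image can be sketched as a nearest-neighbor search over feature vectors. Everything specific here is an assumption for illustration: the patent does not name a similarity measure (cosine similarity is used below), and the three-element catalog vectors stand in for features that a trained network would extract from product images.

```python
import numpy as np

def top_k_similar(query, catalog, k=3):
    """Return indices and scores of the k catalog vectors closest to the query."""
    q = query / np.linalg.norm(query)
    c = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    scores = c @ q                   # cosine similarity with every catalog entry
    order = np.argsort(-scores)[:k]  # highest similarity first
    return order, scores[order]

# Hypothetical feature vectors for four catalog images.
catalog = np.array([[1.0, 0.0, 0.0],
                    [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
query = np.array([1.0, 0.0, 0.1])    # feature vector of the user's input image
indices, scores = top_k_similar(query, catalog, k=2)
```

Here `indices` ranks catalog image 0 first and image 1 second; a result output unit would then display the corresponding products.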

Furthermore, the present invention also provides a method and an apparatus characterized in that learning is performed automatically when a search is made with a keyword that has not been learned in advance.

In another preferred embodiment, the present invention can also provide a method and an apparatus that not only provide results matching the keyword and/or image entered by the user but also provide, in a customized manner, other information suited to that user.

Claims

A product search method in an online shopping mall, comprising:
collecting images to be learned using an image crawler;
learning the collected images using a graphics processing unit (GPU); and
recognizing keywords and/or images input by the user based on the learned results, and providing highly relevant information on the user screen.

The method of claim 1,
wherein providing highly relevant information on the user screen includes recommending content of interest to the user.

The method of claim 1,
wherein the learning includes automatically learning a keyword and/or an image that has not been learned in advance.

A product search system in an online shopping mall, comprising:
an input unit for keywords and/or images input by the user;
an image determination unit for selecting the image closest to the input keyword and/or image; and
a result output unit including the selected image,
wherein the image determination unit comprises:
a distributed file storage system,
a deep neural network (DNN)-based single-object image recognition learning unit,
a DNN model profile generation unit,
a DNN profile extraction and model generation unit, and
a product image recognition unit.