KR102150720B1

KR102150720B1 - Image embedding apparatus and method for content-based user clustering

Info

Publication number: KR102150720B1
Application number: KR1020200000885A
Authority: KR
Inventors: 윤자영; 하지수
Original assignee: 주식회사 스타일쉐어
Priority date: 2020-01-03
Filing date: 2020-01-03
Publication date: 2020-09-02

Abstract

The present invention relates to a fashion content embedding apparatus and method for user clustering. The fashion content embedding apparatus includes: a fashion item classification module configured to receive fashion item image information from a fashion item detection module, input each piece of fashion item image information to a pre-trained artificial neural network, and output fashion item classification information for each piece of fashion item image information; and an embedding module that is connected to a fully connected layer (FC layer) of the artificial neural network of the fashion item classification module and outputs the output vector of the FC layer for each piece of fashion item image information as the embedding vector for that fashion item image information. As a result, the fashion content or fashion items to which users have reacted can be clustered automatically by embedding vector, without an administrator manually tagging styles of interest or classifying users.

Description

Fashion content embedding apparatus and method for user clustering [Image embedding apparatus and method for content-based user clustering]

The present invention relates to a fashion content embedding apparatus and method for user clustering.

With the recent spread of smartphones, social networks have proliferated and the fashion market is changing rapidly. On social networks such as StyleShare, Instagram, YouTube, fashion itembook, Snap, Snow, Pinterest, Tumblr, and TikTok, users view fashion content uploaded by other users, including fashion items, OOTD (outfit of the day) posts, and V-logs, and perform social actions such as like, comment, and share to express their affinity for the fashion items and to interact with one another. In addition, fashion item advertising content is now consumed like any other fashion content, and this trend, in which fashion item or fashion brand advertising content is included in fashion content, is especially pronounced on image-based social networks such as Instagram and StyleShare. These trends are changing fashion commerce and fashion commerce advertising.

The first change in fashion commerce and its advertising is the rapid growth of creator-based fashion item commerce; the second is the rapid growth of identity-based online fashion commerce. Representative examples of creator-based fashion item commerce include Kylie Cosmetics (Kylie Jenner), Huda Beauty (Huda Kattan), and Imvely. In creator-based fashion item commerce, consumers buy in order to communicate with the creator or to resemble the creator, and the creator's social network channel itself serves as the advertising channel. Identity-based online fashion commerce includes FarFetch, YOOX, Net-A-Porter, Stylenanda, and Musinsa; these businesses actively use social networks as advertising channels and invest considerable resources in advertising optimization such as recommendation algorithms and redirection.

Korean Patent Application Publication No. 10-2019-0029567, Product recommendation method using style features, Omnious Co., Ltd.

As social networks and fashion commerce have changed, there have been many attempts to optimize fashion commerce using data generated on social networks. For example, Korean Patent Application Publication No. 10-2019-0029567 proposes a method that, based on an image classification model that classifies the style features of fashion images, recommends to users fashion items that share the same style features but belong to different categories.

However, clustering users' styles or identities through unsupervised learning on log data generated on social networks, and using the result to optimize fashion commerce, has not been easy because of its technical difficulty. For example, the recommendation of fashion items with the same style features proposed in Korean Patent Application Publication No. 10-2019-0029567 concerns an algorithm that merely determines whether items have the same shape, without considering the overall mood or how well items match, and it cannot be used to cluster users' styles or identities. If users are clustered and the clusters are used to optimize fashion commerce, even user characteristics that are difficult for a fashion commerce MD (merchandiser) to recognize can be classified. This enables various forms of fashion commerce optimization, such as fashion item recommendation services, advertising services that use fine-grained user clusters, and recommendation of creators who fit a brand identity.

Accordingly, an object of the present invention is to provide a fashion content embedding apparatus and method for user clustering that make it possible to cluster users' styles or identities using log data generated on social networks.

Specific means for achieving the object of the present invention are described below.

An object of the present invention can be achieved by providing a fashion content embedding apparatus for user clustering, comprising: a fashion content receiving module that receives, from a user client or a social network web server, fashion content information of interest, which is fashion content information for which a specific user has input reaction information or which the specific user has uploaded during a specific period, and generates the fashion content image information and reaction information included in the fashion content information of interest; a fashion item detection module that receives the fashion content image information from the fashion content receiving module, detects the areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generates, for each piece of fashion item area information, fashion item image information that is at least a part of the fashion content image information, as many pieces as the number of fashion items detected in the fashion content image information; a fashion item classification module configured to receive the fashion item image information from the fashion item detection module, input each piece of fashion item image information to a pre-trained artificial neural network, and output fashion item classification information for each piece of fashion item image information; and an embedding module that is connected to a fully connected layer (FC layer) of the artificial neural network of the fashion item classification module and outputs the output vector of the FC layer for each piece of fashion item image information as the embedding vector for that fashion item image information, wherein the loss function of the artificial neural network includes a distance between the embedding vectors for the respective pieces of fashion item image information, the weights of the artificial neural network are learned in its training stage so as to minimize the loss value of the loss function, and the embedding vector for each piece of fashion item image information is used as input information of a user clustering module or a fashion item clustering module.
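The paragraph above says only that the loss function includes a distance between embedding vectors; it does not give a formula. As a minimal sketch, assuming a contrastive-style pairwise term added to an ordinary classification loss (the function names, the `margin` and `weight` parameters, and the contrastive form itself are illustrative assumptions, not the patent's stated method), such a loss could look like this:

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distance_loss(emb_a, emb_b, same_class, margin=1.0):
    """Hypothetical contrastive-style term (an assumption for illustration).

    Same-class pairs are pulled together; different-class pairs are pushed
    at least `margin` apart.
    """
    d = euclidean(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def total_loss(classification_loss, emb_a, emb_b, same_class, weight=0.5):
    """Classification loss plus a weighted embedding-distance term."""
    return classification_loss + weight * pairwise_distance_loss(
        emb_a, emb_b, same_class
    )
```

Minimizing a loss of this shape would tend to place visually similar items near each other in embedding space, which is the property a downstream clustering module relies on.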

Another object of the present invention can be achieved by providing a fashion content embedding method for user clustering, comprising: a fashion content receiving step in which a fashion content receiving module receives, from a user client or a social network web server, fashion content information of interest, which is fashion content information for which a specific user has input reaction information or which the specific user has uploaded during a specific period, and generates the fashion content image information and reaction information included in the fashion content information of interest; a fashion item detection step in which a fashion item detection module receives the fashion content image information from the fashion content receiving module, detects the areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generates, for each piece of fashion item area information, fashion item image information that is at least a part of the fashion content image information, as many pieces as the number of fashion items detected in the fashion content image information; a fashion item classification step in which a fashion item classification module receives the fashion item image information from the fashion item detection module, inputs each piece of fashion item image information to a pre-trained artificial neural network, and outputs fashion item classification information for each piece of fashion item image information; and an embedding step in which an embedding module, connected to a fully connected layer (FC layer) of the artificial neural network of the fashion item classification module, outputs the output vector of the FC layer for each piece of fashion item image information as the embedding vector for that fashion item image information, wherein the loss function of the artificial neural network includes a distance between the embedding vectors for the respective pieces of fashion item image information, the weights of the artificial neural network are learned in its training stage so as to minimize the loss value of the loss function, and the embedding vector for each piece of fashion item image information is used as input information of a user clustering module or a fashion item clustering module.

Another object of the present invention can be achieved by providing a fashion content embedding apparatus for user clustering, comprising: a memory module that stores fashion content embedding program code; and a processing module that processes the fashion content embedding program code, wherein the fashion content embedding program code includes program code that causes a computer to perform: a fashion content receiving step of receiving, from a user client or a social network web server, fashion content information of interest, which is fashion content information for which a specific user has input reaction information or which the specific user has uploaded during a specific period, and generating the fashion content image information and reaction information included in the fashion content information of interest; a fashion item detection step of detecting the areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generating, for each piece of fashion item area information, fashion item image information that is at least a part of the fashion content image information, as many pieces as the number of fashion items detected in the fashion content image information; a fashion item classification step of inputting each piece of fashion item image information to a pre-trained artificial neural network and outputting fashion item classification information for each piece of fashion item image information; and an embedding step of outputting the output vector of the FC layer of the artificial neural network for each piece of fashion item image information as the embedding vector for that fashion item image information, wherein the loss function of the artificial neural network includes a distance between the embedding vectors for the respective pieces of fashion item image information, the weights of the artificial neural network are learned in its training stage so as to minimize the loss value of the loss function, and the embedding vector for each piece of fashion item image information is used as input information of a user clustering module or a fashion item clustering module.

As described above, the present invention has the following effects.

First, according to an embodiment of the present invention, the fashion content or fashion items to which users have reacted can be clustered automatically by embedding vector, without an administrator manually tagging styles of interest or classifying users.

Second, according to an embodiment of the present invention, recommending products based on the fashion content or fashion items that a specific user has uploaded or reacted to can improve the conversion rate of a recommendation algorithm or an advertising algorithm.

Third, according to an embodiment of the present invention, recommending user clusters or creators based on the fashion content or fashion items of a specific brand enables interest-targeted advertising for that brand.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description, serve to aid understanding of its technical idea; the present invention should therefore not be construed as limited to the matters shown in the drawings.
FIG. 1 is a schematic diagram showing a fashion content embedding apparatus for user clustering according to an embodiment of the present invention;
FIG. 2 is a schematic diagram showing the operation of the fashion content receiving module 10 according to an embodiment of the present invention;
FIG. 3 is a schematic diagram showing the application of a smoothing filter by a smoothing filter application module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram showing the candidate information generated by a candidate output module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram showing a non-candidate removal module according to an embodiment of the present invention;
FIG. 6 is a schematic diagram showing the operation of the artificial neural network of the fashion item classification module 12 according to an embodiment of the present invention;
FIG. 7 is a schematic diagram showing an example of the artificial neural network of the fashion item classification module 12 according to an embodiment of the present invention;
FIG. 8 is a schematic diagram showing embedding vector mapping in a clustering module according to an embodiment of the present invention.

Hereinafter, embodiments that allow a person of ordinary skill in the art to which the present invention pertains to practice the invention easily are described in detail with reference to the accompanying drawings. However, in describing the operating principle of the preferred embodiments, a detailed description of a related known function or configuration is omitted where it is judged that it would unnecessarily obscure the subject matter of the present invention.

In addition, the same reference numerals are used throughout the drawings for parts with similar functions and operations. Throughout the specification, when a part is said to be connected to another part, this includes not only a direct connection but also an indirect connection with another element interposed between them. Furthermore, stating that a part includes a component means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

Fashion content embedding apparatus for user clustering

FIG. 1 is a schematic diagram showing a fashion content embedding apparatus for user clustering according to an embodiment of the present invention. As shown in FIG. 1, the fashion content embedding apparatus 1 for user clustering may include a fashion content receiving module 10, a fashion item detection module 11, a fashion item classification module 12, an embedding module 13, and a clustering module 14. The fashion content embedding apparatus 1 may be configured to be processed by the processing module of a computing device such as a specific web server, a virtual server such as a cloud server, a smartphone, a tablet PC, or a desktop PC, and to be stored in the memory module of that device. The fashion content embedding apparatus 1 may be configured to receive fashion content of interest, that is, fashion content for which a user has input reaction information (like, share, comment, and so on) or which the user has uploaded, either directly from the user client 100 or from the social network web server 50 through a social network API.
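The flow from detection through embedding described above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: the names `EmbeddingPipeline`, `crop`, `detect`, and `classify_and_embed` are hypothetical, images are represented as plain nested lists, and the detector and network are passed in as callables rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the fashion item area out of the full content image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


@dataclass
class EmbeddingPipeline:
    # Stand-in for the fashion item detection module 11 (hypothetical signature).
    detect: Callable[[Image], List[Box]]
    # Stand-in for modules 12 and 13: returns (class label, FC-layer vector).
    classify_and_embed: Callable[[Image], Tuple[str, List[float]]]

    def run(self, content_image: Image) -> List[Tuple[str, List[float]]]:
        """Produce one (label, embedding) pair per detected fashion item."""
        results = []
        for box in self.detect(content_image):
            item_image = crop(content_image, box)
            results.append(self.classify_and_embed(item_image))
        return results
```

The embeddings returned by `run` are what a clustering module would consume; one content image yields as many embeddings as there are detected items.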

The fashion content receiving module 10 receives, from the user client 100 or the social network web server 50, fashion content of interest, that is, fashion content for which the user has input reaction information (like, share, comment, and so on) or which the user has uploaded. FIG. 2 is a schematic diagram showing the operation of the fashion content receiving module 10 according to an embodiment of the present invention. As shown in FIG. 2, the fashion content receiving module 10 receives, from the social network application module of the user client 100 or from the social network web server 50, the fashion content information of interest for a specific user over a specific period (fashion content image information, text information, tag information, comment information, like information, view count information, and so on), and transmits the received fashion content image information to the fashion item detection module 11.

The fashion item detection module 11 receives the fashion content image information from the fashion content receiving module 10, detects the areas of fashion items in the fashion content image information to generate fashion item area information, and, based on the fashion item area information, generates fashion item image information that is at least a part of the fashion content image information.

The fashion item detection module 11 according to an embodiment of the present invention may use a fashion item detection algorithm obtained by fine-tuning YOLO, RCNN, Faster RCNN, or the like. Alternatively, it may use a fashion item detection algorithm obtained by fine-tuning a network such as AlexNet pre-trained on ImageNet. It may also use a conventional computer vision algorithm, such as boosting over Viola-Jones Haar-like features.

The fashion item detection module 11 according to an embodiment of the present invention may include a smoothing filter application module, a candidate output module, and a non-candidate removal module. In detail, the fashion item detection module 11 performs fashion item detection within the fashion content image information and transmits the detected fashion item image information to the embedding module 13, which performs fashion item feature embedding, that is, represents each piece of fashion item image information as a feature vector, so that each fashion item is vectorized. The clustering module 14 then compares the embedded feature vectors and performs clustering to form clusters among the vectors, and may be configured to output cluster information that includes the clustered fashion item image information (at least one piece) and user information.
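The text above says that the clustering module compares embedded feature vectors and groups them, but it does not name a clustering algorithm. As a minimal sketch, assuming k-means with a deterministic initialization (the choice of k-means, the function names, and the `k` and `iterations` parameters are all assumptions for illustration), grouping embedding vectors could look like this:

```python
import math


def dist(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Plain k-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each embedding to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(v, centroids[c])) for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids
```

Because no style labels are needed, the groups emerge from the embedding distances alone, which matches the stated goal of clustering without manual tagging.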

Fashion item detection by the fashion item detection module 11 according to an embodiment of the present invention may be performed by a candidate output module and a non-candidate removal module. The candidate output module outputs candidate information, which is a fashion item box in the form of a rectangle (or a polygon, ellipse, circle, curved shape, and so on) configured to enclose at least one fashion item included in the fashion content image information. The non-candidate removal module outputs non-candidate classification information, which indicates that the image inside a fashion item box is classified as not being a fashion item. The detection module may be configured to provide only the candidate information whose non-candidate classification information is at or below a specific value, that is, with the candidates classified as non-candidates excluded.
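The filtering rule above, which keeps candidates whose non-candidate classification value is at or below a specific value, can be sketched as follows. The `Candidate` type, the field names, and the default threshold of 0.5 are hypothetical; the source does not state a threshold.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    box: Tuple[int, int, int, int]  # (left, top, right, bottom)
    non_candidate_score: float      # higher means less likely to be a fashion item


def keep_candidates(
    candidates: List[Candidate], threshold: float = 0.5
) -> List[Candidate]:
    """Keep boxes whose non-candidate score is at or below the threshold."""
    return [c for c in candidates if c.non_candidate_score <= threshold]
```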

The smoothing filter application module according to an embodiment of the present invention may be configured to apply a smoothing filter to the fashion content image information to generate at least one piece of smoothed fashion content image information, and to transmit the smoothed fashion content image information to the candidate output module. The smoothing filter may be a Gaussian filter, a bilateral filter, a median filter, or the like, and the module may be configured to apply progressively stronger smoothing weights to the fashion content image information to generate a plurality of pieces of smoothed fashion content image information, which are fed to the artificial neural network of the candidate output module as input information. For example, when a Gaussian filter is used, the module may generate the smoothed images while progressively increasing sigma, the standard deviation of the Gaussian kernel in the x and y directions (for example, sigma of 1 for the first smoothed image, sigma of 2 for the second, and so on), and may input both the progressively smoothed fashion content image information and the unfiltered fashion content image information to the artificial neural network of the candidate output module. FIG. 3 is a schematic diagram showing the application of a smoothing filter by the smoothing filter application module according to an embodiment of the present invention. As shown in FIG. 3, the smoothing filter application module may be configured to increase sigma progressively for the fashion content image information, generating at least one piece of smoothed fashion content image information with a progressively increasing degree of smoothing.
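The progressive Gaussian smoothing described above can be sketched in plain Python. This is a minimal sketch, assuming a separable Gaussian kernel truncated at three sigma and border handling by clamping to the nearest edge pixel; the function names and those two choices are assumptions, since the source specifies only that sigma increases from one smoothed image to the next.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Normalized 1-D Gaussian kernel, truncated at about 3 sigma by default."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(
                kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
                for k in range(-radius, radius + 1)
            )
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable blur: filter along rows, then along columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def smoothing_stack(image, sigmas=(1.0, 2.0)):
    """Original image followed by copies smoothed with increasing sigma."""
    return [image] + [gaussian_blur(image, s) for s in sigmas]
```

Calling `smoothing_stack(image)` returns the unfiltered image plus one smoothed copy per sigma, which corresponds to the set of inputs the text describes for the candidate output module's network.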

The candidate output module according to an embodiment of the present invention includes a pre-trained artificial neural network. The input information of this artificial neural network may consist of the fashion content image information and the smoothed fashion content image information obtained by applying the smoothing filter to it, and the output information may be, for each of a plurality of pieces of candidate information, fashion item classification information (whether the candidate is a fashion item) and fashion item area information (for example, the coordinates of each vertex, or the coordinates of the center cell together with width/height information). Specifically, at the inference stage, the pre-trained artificial neural network of the candidate output module takes as input the fashion content image information and the smoothed fashion content image information of the specific user to be clustered, and outputs, as an output vector, fashion item presence information, which is a confidence score for whether a fashion item exists in a specific area of the input image, together with the coordinate information of that area. At the training stage of the artificial neural network of the candidate output module, a plurality of pieces of fashion content image information of a plurality of users, pre-stored in the database of the social network web server 50, is used as source data; area information that marks the area of a specific fashion item in the fashion content image information is used as the ground truth; the losses below are combined into a loss function; and the weights of each layer of the artificial neural network are trained to minimize the value of the loss function.

[Loss 1] The difference between the predicted center coordinates of an area predicted to contain a fashion item and the center coordinates of the ground truth

[Loss 2] The difference between the width/height of an area predicted to contain a fashion item and the width/height of the ground truth

[Loss 3] The difference between the confidence (fashion item presence information) of an area predicted to contain a fashion item and whether a fashion item actually exists in the corresponding area of the ground truth

[Loss function of the artificial neural network of the candidate output module] = Loss 1 + Loss 2 + Loss 3

In addition, the artificial neural network of the candidate output module according to an embodiment of the present invention may consist entirely of convolution layers and pooling layers, without a flatten layer or a fully connected layer. As a result, the candidate information, which is the output information, is kept three-dimensional so that no dimensional information is lost, and a plurality of pieces of candidate information is output relatively quickly compared with a general convolutional neural network (CNN). For example, the artificial neural network of the candidate output module may include a convolution layer that applies a 3x3 convolution filter and a 2x2 max pooling filter to 12x12x3 input information (fashion content image information or smoothed fashion content image information) and outputs a 5x5x10 vector; a convolution layer that applies a 3x3 convolution filter to the 5x5x10 vector and outputs a 3x3x16 vector; a convolution layer that applies a 3x3 convolution filter to the 3x3x16 vector and outputs a 1x1x32 vector; and convolution filters applied to the 1x1x32 vector that output, as output information, a 1x1x2 vector of fashion item presence information and a 1x1x4 vector of fashion item area information for the candidate information.
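
The layer sizes in the example above can be checked with simple shape arithmetic. The sketch below assumes unpadded convolutions with stride 1 and a 2x2 max pooling with stride 2; the specification does not state padding or stride, but these assumptions reproduce the sizes it gives. The helper names are illustrative only.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after an unpadded convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window, stride=None):
    """Spatial size after pooling (stride defaults to the window size)."""
    stride = stride or window
    return (size - window) // stride + 1

# (operation, kernel size, output channels); channel counts come from the text.
layers = [("conv", 3, 10), ("pool", 2, 10), ("conv", 3, 16), ("conv", 3, 32)]

size = 12
shapes = [(12, 12, 3)]  # input: 12x12 image with 3 channels
for op, k, channels in layers:
    size = conv_out(size, k) if op == "conv" else pool_out(size, k)
    shapes.append((size, size, channels))

# Two 1x1 convolution heads on the final 1x1x32 vector (assumed kernel size 1).
presence_head = (conv_out(size, 1), conv_out(size, 1), 2)  # fashion item presence
area_head = (conv_out(size, 1), conv_out(size, 1), 4)      # fashion item area
```

Because no layer flattens its input, applying the same filters to a larger image would yield a grid of such output vectors, one per 12x12 window, which is consistent with the statement that a plurality of pieces of candidate information is output.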

The artificial neural network of the candidate output module according to an embodiment of the present invention may use cross-entropy loss as the loss function (cost function) for learning the fashion item presence information. That is, at a given learning rate, the weights of the hidden layers of the artificial neural network of the candidate output module are updated so that the cross-entropy loss for each piece of candidate information decreases (is optimized). Optimization methods that may be used include gradient descent and momentum, and the backpropagation algorithm may be used to make gradient descent easy to apply. In addition, the artificial neural network of the candidate output module may use Euclidean loss, computed on coordinate points such as the vertices, as the loss function for learning the fashion item area information of each piece of candidate information.
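
By way of illustration only, the two loss terms and a momentum update named above might be written as below. The binary form of the cross-entropy, the use of squared distances for the Euclidean loss, and the default `lr` and `beta` values are assumptions of this sketch; in practice the gradient passed to `momentum_step` would be obtained by backpropagation.

```python
import math

def cross_entropy(p_item, is_item, eps=1e-12):
    """Binary cross-entropy for one candidate.

    p_item: predicted confidence that the area contains a fashion item.
    is_item: ground truth, 1 if a fashion item is present, otherwise 0.
    """
    p = min(max(p_item, eps), 1.0 - eps)  # avoid log(0)
    return -(is_item * math.log(p) + (1 - is_item) * math.log(1.0 - p))

def euclidean_loss(pred_points, true_points):
    """Sum of squared distances between predicted and ground-truth points."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred_points, true_points))

def momentum_step(weight, grad, velocity, lr=0.1, beta=0.9):
    """One gradient-descent update with momentum for a single weight."""
    velocity = beta * velocity - lr * grad
    return weight + velocity, velocity

presence_loss = cross_entropy(0.5, 1)
area_loss = euclidean_loss([(0, 0), (3, 4)], [(0, 0), (0, 0)])
new_weight, new_velocity = momentum_step(1.0, 2.0, 0.0, lr=0.5, beta=0.5)
```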

The artificial neural network included in the candidate output module according to an embodiment of the present invention may be configured with fewer layers than that of the non-candidate removal module. As a result, candidate information is output very quickly.

FIG. 4 is a schematic diagram showing the result of candidate information generation by the candidate output module according to an embodiment of the present invention. As shown in FIG. 4, the candidate information generation module according to an embodiment of the present invention may be configured so that relatively small fashion item area information is generated from the fashion content image information and relatively large fashion item area information is generated from the smoothed fashion content image information. Because candidate information is output separately for the fashion content image information to which no smoothing filter is applied and for each of the plurality of pieces of smoothed fashion content image information, fashion item portions of different sizes contained in a single piece of fashion content image information can all be detected. For example, the smallest fashion item portions are detected in the unsmoothed fashion content image information, and larger fashion item portions are detected in the smoothed fashion content image information. This is because smoothing removes fine detail, leaving only relatively large-scale features in the smoothed fashion content image information.

In addition, the candidate output module according to an embodiment of the present invention may be configured to remove duplicate candidate information from the plurality of pieces of candidate information output by the artificial neural network. The removal of duplicate candidate information by the candidate output module may proceed in the following order.

(1) The candidate output module sorts the plurality of pieces of candidate information output by the artificial neural network in descending order of fashion item presence information (that is, highest confidence first).

(2) Among the candidate information that overlaps the candidate with the highest fashion item presence information, any candidate whose overlapping area, as a ratio of the total area of the candidates, is at or above a specific value is judged to have detected the same fashion item and is removed (for example, a candidate is removed when the ratio of the overlapping area to the total area of the candidates is 50% or more).

(3) Step (2) is repeated for the candidate information not removed in step (2), in descending order of fashion item presence information.

Accordingly, candidate information is prevented from being output redundantly for the same fashion item contained in the fashion content image information or the smoothed fashion content image information. This in turn reduces the computational load of the non-candidate removal module.
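
Steps (1) to (3) above correspond to the procedure commonly known as non-maximum suppression, and might be sketched as follows. The box layout `(x1, y1, x2, y2)`, the function names, and the reading of "ratio of the overlapping area to the total area" as intersection over union are assumptions of this sketch.

```python
def area(box):
    """Area of a box given as (x1, y1, x2, y2); zero if the box is empty."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def overlap_ratio(a, b):
    """Overlapping area divided by the combined area of both boxes."""
    inter_box = (max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3]))
    inter = area(inter_box)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def remove_duplicates(candidates, threshold=0.5):
    """candidates: list of (confidence, box) tuples. Returns the kept ones."""
    # Step (1): sort by fashion item presence confidence, highest first.
    remaining = sorted(candidates, key=lambda c: c[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Step (2): drop candidates that overlap the current best too much.
        remaining = [c for c in remaining
                     if overlap_ratio(best[1], c[1]) < threshold]
        # Step (3): the loop repeats with the next-highest remaining candidate.
    return kept

candidates = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # overlaps the first box by 81/119, about 68%
    (0.7, (20, 20, 30, 30)),  # no overlap with the others
]
kept = remove_duplicates(candidates)
```

In this example the second candidate is treated as a duplicate detection of the first and removed, while the third is kept.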

The non-candidate removal module according to an embodiment of the present invention may include a pre-trained artificial neural network whose input information is the image region of the fashion content image information corresponding to the candidate information output by the candidate output module (the fashion item presence information and fashion item area information of that candidate), and whose output information is non-candidate classification information for the candidate information. Based on the non-candidate classification information, the non-candidate removal module may be configured to remove the candidate information classified as not containing a fashion item (candidate information whose non-candidate classification information is at or above a specific value), and to output the remaining candidate information (including the fashion item area information, fashion item presence information, and non-candidate classification information) as fashion item image information. FIG. 5 is a schematic diagram showing the non-candidate removal module according to an embodiment of the present invention. As shown in FIG. 5, the non-candidate removal module receives the candidate information generated by the candidate output module, removes the candidate information whose non-candidate classification confidence is at or above a specific value, and then outputs the image information or area information of the remaining candidate information as fashion item image information and transmits it to the fashion item classification module 12. Accordingly, candidate information that does not contain a fashion item can be removed with high probability.
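
As a rough illustration of this filtering step, the sketch below crops each candidate area and drops the candidates whose non-candidate score reaches a threshold. The `score_fn` argument is a hypothetical stand-in for the pre-trained artificial neural network, `toy_score` is a toy substitute used only to make the example runnable, and the dictionary layout and the 0.5 threshold are likewise assumptions.

```python
def crop(image, box):
    """Cut the (x1, y1, x2, y2) region out of a list-of-rows image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def remove_non_candidates(image, candidates, score_fn, threshold=0.5):
    """Keep candidates whose non-candidate score is below the threshold."""
    kept = []
    for cand in candidates:
        score = score_fn(crop(image, cand["box"]))
        if score < threshold:
            # The score is carried along with the remaining candidate.
            kept.append({**cand, "non_candidate_score": score})
    return kept

def toy_score(patch):
    """Toy stand-in for the network: fraction of empty (zero) pixels."""
    flat = [p for row in patch for p in row]
    return flat.count(0) / len(flat)

# Left half of the image has content (1), right half is empty (0).
image = [[1, 1, 0, 0] for _ in range(4)]
candidates = [
    {"box": (0, 0, 2, 2), "presence": 0.9},
    {"box": (2, 0, 4, 2), "presence": 0.6},
]
kept = remove_non_candidates(image, candidates, toy_score)
```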

The fashion item classification module 12 is a module configured to input each piece of fashion item image information contained in the received fashion content image information to a pre-trained artificial neural network (in particular, a ConvNet) and to output fashion item classification information for each piece of fashion item image information (the inference stage of the artificial neural network). FIG. 6 is a schematic diagram showing the operation of the artificial neural network of the fashion item classification module 12 according to an embodiment of the present invention. As shown in FIG. 6, each piece of fashion item image information contained in the fashion content image information received by the fashion item classification module 12 may be input to the pre-trained artificial neural network (in particular, a ConvNet) of the module, which outputs fashion item classification information for each piece of fashion item image information. At the inference stage, the pre-trained artificial neural network of the fashion item classification module 12 uses, as input information, the fashion content image information of the user for whom user clustering is to be performed.

At the training stage, the artificial neural network of the fashion item classification module 12 according to an embodiment of the present invention uses, as source data, a plurality of pieces of fashion content image information of a plurality of users pre-stored in the database of the social network web server 50; uses, as the ground truth, fashion item label information indicating which fashion item category or which fashion item each piece of fashion item image information belongs to; combines the losses below into a loss function; and trains the weights of each layer of the artificial neural network to minimize the value of the loss function. The source information (input information) for training may be a plurality of pieces of fashion content image information in the social network, and may include reaction information for the fashion content (like information, comment information, share information, and the like).

[Loss 1] The difference between the fashion item classification information, which indicates the label under which the fashion item image information is classified, and the actual fashion item label information of the ground truth for that fashion item image information

[Loss 2] The distance between the vectors output from the FC layer (fully connected layer) of each artificial neural network for the respective pieces of fashion item image information (when only one fashion item is detected in a piece of fashion content image information, the distance is 0)

[Loss function of the artificial neural network of the fashion item classification module] = Loss 1 + Loss 2

FIG. 7 is a schematic diagram showing an example of the artificial neural network of the fashion item classification module 12 according to an embodiment of the present invention. As shown in FIG. 7, the artificial neural network of the fashion item classification module 12 may be built as [INPUT-CONV-RELU-POOL-FC]. The fashion item image information, which is the input information, may be a matrix of size [32x32xn], with a width of 32, a height of 32, and n channels. The CONV layer (Conv. Filter, 101) is connected to local regions of the fashion item image information and computes the dot product between each connected region and the weights. The resulting volume has a size such as [32x32x12]. The RELU layer is an activation function applied elementwise, such as max(0,x). The RELU layer does not change the size of the volume ([32x32x12]) and produces Activation map 1 (102). The POOL layer (pooling, 103) performs downsampling along the width and height dimensions and outputs a reduced volume such as [16x16x12] (Activation map 2, 104). After the FC (fully connected) layer (106), which is connected to the n-th Activation map n (105), class scores are computed and a volume of size [m x m x 1] (output layer, 107) is output. Loss 2 of the loss function of the artificial neural network of the fashion item classification module 12 uses the output vector of the FC layer (106), which is a dimension-reduced representation of the fashion item image information, and the distance between the FC-layer output vectors for the respective pieces of fashion item image information constitutes the value of Loss 2.
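
For illustration, the RELU and POOL operations described above can be shown on a single small channel. The 4x4 input and the helper names are assumptions chosen to keep the sketch short; a [32x32x12] volume would be handled the same way, channel by channel, giving [16x16x12] after pooling.

```python
def relu(channel):
    """Elementwise max(0, x); the size of the channel is unchanged."""
    return [[max(0.0, x) for x in row] for row in channel]

def max_pool_2x2(channel):
    """2x2 max pooling with stride 2; halves the width and the height."""
    return [[max(channel[r][c], channel[r][c + 1],
                 channel[r + 1][c], channel[r + 1][c + 1])
             for c in range(0, len(channel[0]) - 1, 2)]
            for r in range(0, len(channel) - 1, 2)]

channel = [[-1.0, 2.0, 0.5, -3.0],
           [4.0, -2.0, 1.0, 0.0],
           [0.0, 0.0, -1.0, -1.0],
           [3.0, 1.0, -5.0, 6.0]]

activated = relu(channel)         # still 4x4, negatives replaced by 0
pooled = max_pool_2x2(activated)  # 2x2 after downsampling
```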

In particular, the weight of Loss 2 may be configured to be proportional to the amount of reaction information (like information, comment information, share information, and the like) for the corresponding fashion content image information. For example, for fashion content with a large amount of reaction information, Loss 2 may carry a higher weight than Loss 1. That is, for fashion content with more reaction information, the FC-layer vectors for the respective pieces of fashion item image information within that fashion content image information may be output closer together.

Regarding Loss 2 of the artificial neural network of the fashion item classification module, consider, for example, training the artificial neural network shown in FIG. 6. For the artificial neural networks applied to the boot 1 portion (fashion item image information), the boot 2 portion, the jeans portion, and the leather jacket portion, the output vectors (embedding vectors) are taken not from the output layer but from the fully connected layer preceding it. The distances between these embedding vectors (the distance between the boot 1 and boot 2 embedding vectors, the distance between the boot 1 and jeans embedding vectors, and so on) are combined into the value of Loss 2, and the artificial neural network is trained to minimize the value of the loss function that combines Loss 1 and Loss 2. According to an embodiment of the present invention, the fashion items contained in a single piece of fashion content image information are output with FC-layer output vectors close to one another, and fashion content with more reaction information is output with FC-layer output vectors even closer together, so that fashion items that go well together are output with similar vector values. Accordingly, without individually tagging whether fashion items go together or which style each belongs to, an existing fashion item classification artificial neural network can, merely by changing the loss function, identify fashion items that match when combined or cluster fashion items by style.
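
One possible reading of Loss 2 and its reaction-based weighting is sketched below. Summing the pairwise Euclidean distances, and computing the weight as `scale * reaction_count`, are assumptions of this sketch; the specification states only that the distances are combined and that the weight is proportional to the amount of reaction information.

```python
import math
from itertools import combinations

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def loss2(embeddings):
    """Sum of pairwise distances between the FC-layer embeddings of the
    fashion items detected in one piece of fashion content.
    With a single item there are no pairs, so the result is 0."""
    return sum(euclidean(u, v) for u, v in combinations(embeddings, 2))

def total_loss(loss1, embeddings, reaction_count, scale=0.01):
    """Loss 1 plus Loss 2 weighted in proportion to the reaction count."""
    weight = scale * reaction_count  # hypothetical proportionality constant
    return loss1 + weight * loss2(embeddings)

single_item = loss2([[0.0, 0.0]])
two_items = loss2([[0.0, 0.0], [3.0, 4.0]])
combined = total_loss(1.0, [[0.0, 0.0], [3.0, 4.0]], reaction_count=4, scale=0.5)
```

With a larger `reaction_count`, the same embedding distance contributes more to the total, so training pulls the embeddings of items in popular content closer together.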

The embedding module 13 is connected to the FC layer of the artificial neural network of the fashion item classification module 12 and receives the output vector of the FC layer as the embedding vector for the corresponding fashion item. For example, when a single piece of fashion content image information contains four pieces of fashion item image information, as in FIG. 6, the embedding module 13 generates four embedding vectors, one for each piece of fashion item image information.

The clustering module 14 generates cluster information by clustering the embedding vectors of the source data previously used for training (the fashion item image information of at least some of the fashion content image information of the social network), and maps the embedding vector of each piece of fashion item image information generated by the embedding module 13 to the cluster information. With the clustering module 14, a plurality of fashion items is detected from the at least one piece of fashion content to which a specific user has input reaction information in a specific social network, and these fashion items are mapped to a plurality of clusters by the pre-trained clustering module 14. As a result, at least one cluster is mapped to each user; each cluster represents a style identity of that user, and the number of fashion items in each cluster can be interpreted as indicating the importance of that style identity.

That is, the clustering module 14 according to an embodiment of the present invention may be configured to take a specific user's fashion content information of interest as input and to output that user's style identity, which is a cluster composed of a plurality of fashion items (the plurality of pieces of fashion item image information constituting the cluster). The clustering module 14 may also be configured to take the fashion content information of interest of each of a plurality of users as input and to output user groups (user clustering information) in which the users are grouped by style identity. Furthermore, the clustering module 14 may be configured to take an individual fashion item, or a fashion item of a specific brand, as input and to output the fashion items belonging to the same cluster as that fashion item, or the group of users (user clustering information) who have input reaction information to, or uploaded, fashion items belonging to the same cluster.

FIG. 8 is a schematic diagram showing embedding vector mapping in the clustering module according to an embodiment of the present invention. As shown in FIG. 8, the clustering module 14 may include a plurality of pieces of pre-trained cluster information, and each cluster may contain embedding vectors of fashion items from various categories, such as shoes, tops, bottoms, hats, and jewelry. The embedding vectors of the fashion items contained in the at least one piece of fashion content to which a specific user has input reaction information in a specific social network may be mapped to the pre-trained clusters, and the clusters to which the fashion items are mapped represent the style identity of that user. Using this style identity, a recommendation or advertising algorithm can be configured to recommend or advertise other fashion items in the same cluster. Alternatively, an algorithm can be configured to recommend creators (based on the fashion content they have uploaded) in the same cluster as the cluster to which a specific brand's fashion items belong. Alternatively, to set up advertisements or events targeted at users whose style identity is similar to that of a specific brand, an algorithm can be configured to output the users in the same cluster as the cluster to which the brand's products belong.
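
The mapping of a user's item embeddings to pre-trained clusters might be sketched as follows. Assigning each embedding to the nearest cluster centroid is an assumption of this sketch (the specification does not state the mapping rule), and the cluster names and two-dimensional embeddings are hypothetical.

```python
from collections import Counter

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_cluster(embedding, centroids):
    """Name of the cluster whose centroid is closest to the embedding."""
    return min(centroids, key=lambda name: sq_dist(embedding, centroids[name]))

def style_identity(item_embeddings, centroids):
    """Count how many of a user's fashion items fall into each cluster.
    A higher count is read as a more important style identity."""
    return Counter(nearest_cluster(e, centroids) for e in item_embeddings)

# Hypothetical pre-trained cluster centroids.
centroids = {"street": [0.0, 0.0], "formal": [10.0, 10.0]}

# Embeddings of items in content the user reacted to.
user_items = [[1.0, 0.0], [0.0, 2.0], [9.0, 9.0]]
identity = style_identity(user_items, centroids)
top_style, top_count = identity.most_common(1)[0]
```

Users whose counts concentrate on the same clusters could then be grouped together, and items from a user's dominant cluster could be used for recommendation.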

The clustering algorithm of the clustering module 14 according to an embodiment of the present invention may be a general clustering algorithm such as K-means. Alternatively, the clustering algorithm may form an arbitrary number of clusters by repeatedly performing the following process: taking an arbitrary point (core point) among the embedding vectors of the plurality of pieces of fashion item image information, and recognizing a cluster when at least a specific number of points lie within a radius e (epsilon) of that point. This approach finds the number of clusters of fashion item image information automatically, without requiring the number of clusters to be set in advance, and prevents outliers from degrading clustering performance. It also makes clustering possible even when the number of fashion items that will appear in a given piece of fashion content image information is unknown, and because the output of the FC layer of the fashion item classification module 12 is used as the embedding vector, parameters suitable for clustering are determined in advance.
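
The density-based procedure described above resembles the algorithm generally known as DBSCAN, although the specification does not name it. A minimal sketch under that assumption is given below; the function names, the outlier label of -1, and the example values of `eps` and `min_points` are illustrative.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def density_cluster(points, eps, min_points):
    """Label each point with a cluster id (0, 1, ...) or -1 for an outlier."""
    labels = [None] * len(points)  # None means not yet visited
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_points:
            labels[i] = -1  # provisionally an outlier
            continue
        cluster_id += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            more = region_query(points, j, eps)
            if len(more) >= min_points:  # j is also a core point: keep expanding
                queue.extend(more)
    return labels

# Two dense groups of embeddings and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = density_cluster(points, eps=1.5, min_points=3)
```

Here two clusters are found without the number of clusters being specified, and the isolated point is labeled as an outlier rather than being forced into a cluster.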

Fashion content embedding method for user clustering

The fashion content embedding method for user clustering according to an embodiment of the present invention may include a fashion content receiving step, a fashion item detection step, a fashion item classification step, an embedding step, and a clustering step.

In the fashion content receiving step, the fashion content receiving module 10 receives, from the user client 100 or the social network web server 50, fashion content information of interest (fashion content image information, text information, tag information, comment information, like information, view count information, etc.), which is the fashion content information that a specific user has uploaded or has entered reaction information on during a specific period, and transmits the received fashion content image information to the fashion item detection module 11.

In the fashion item detection step, the fashion item detection module 11 receives the fashion content image information from the fashion content receiving module 10, detects the areas of fashion items in the fashion content image information to generate fashion item area information, and, based on the fashion item area information, generates fashion item image information, which is at least a part of the fashion content image information.
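Generating fashion item image information from area information can be sketched as cropping. The example below assumes that the area information is a list of rectangular boxes given as `(top, left, bottom, right)` with exclusive bottom and right edges, and that the image is a plain list of pixel rows; both representations, and the function name `crop_items`, are assumptions made for illustration. The detector that produces the boxes is not shown.

```python
def crop_items(image, boxes):
    """Return one cropped sub-image per box; each box is (top, left, bottom, right)."""
    return [
        [row[left:right] for row in image[top:bottom]]
        for top, left, bottom, right in boxes
    ]


# Hypothetical 4x4 image whose pixel value encodes its position (row * 4 + column).
IMAGE = [[r * 4 + c for c in range(4)] for r in range(4)]
BOXES = [(0, 0, 2, 2), (2, 1, 4, 4)]  # two detected fashion item areas

for crop in crop_items(IMAGE, BOXES):
    print(crop)
```

One crop is produced per detected area, so the number of fashion item images equals the number of detected fashion items.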

In the fashion item classification step, the fashion item classification module 12 inputs each piece of fashion item image information included in the received fashion content image information into the pre-trained artificial neural network and outputs fashion item classification information for each piece of fashion item image information (the inference step of the artificial neural network).

In the embedding step, the embedding module 13 is connected to the FC layer of the artificial neural network of the fashion item classification module 12 and receives the output vector of the FC layer as the embedding vector for the corresponding fashion item.
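The idea of reusing the FC layer output as an embedding can be shown with a toy network. This is a sketch only: the class `TinyClassifier`, the hand-picked weights, the ReLU activation, and the use of a short feature vector in place of the output of an image backbone are all assumptions, and the actual network architecture is not given in this section.

```python
def matvec(weights, vec, bias):
    """Dense layer: weights is a list of rows, returns weights @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


class TinyClassifier:
    """Toy classifier: features -> FC layer -> output layer (class scores)."""

    def __init__(self, fc_w, fc_b, out_w, out_b, labels):
        self.fc_w, self.fc_b = fc_w, fc_b
        self.out_w, self.out_b = out_w, out_b
        self.labels = labels

    def forward(self, features):
        """Return (predicted label, FC-layer output used as the embedding vector)."""
        fc_out = relu(matvec(self.fc_w, features, self.fc_b))
        scores = matvec(self.out_w, fc_out, self.out_b)
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.labels[best], fc_out


# Hypothetical weights chosen by hand so the arithmetic is easy to follow.
MODEL = TinyClassifier(
    fc_w=[[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]], fc_b=[0.0, 0.0],
    out_w=[[1.0, 0.0], [0.0, 1.0]], out_b=[0.0, 0.0],
    labels=["shoes", "top"],
)

label, embedding = MODEL.forward([2.0, 0.5, 1.0])
print(label, embedding)
```

A single forward pass yields both the classification result and the embedding, so no separate embedding network needs to be run for each fashion item image.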

In the clustering step, the clustering module 14 generates cluster information by clustering the embedding vectors of the source data previously used for training (the fashion item image information from at least some of the fashion content image information of the social network), and maps the embedding vector for each piece of fashion item image information generated by the embedding module 13 to the cluster information.
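The specification does not state how a new embedding vector is mapped to the existing cluster information. One simple possibility, shown here purely as an assumption, is to assign the vector to the cluster with the nearest centroid; the function names and sample vectors are hypothetical.

```python
import math


def centroids_from_labels(embeddings, labels):
    """Average the embedding vectors of each cluster; label -1 (outlier) is skipped."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label < 0:
            continue
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, v in enumerate(vec):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def map_to_cluster(embedding, centroids):
    """Return the id of the cluster whose centroid is closest to the embedding."""
    return min(centroids, key=lambda c: math.dist(embedding, centroids[c]))


# Hypothetical source-data embeddings with cluster labels from a prior clustering run.
SOURCE = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0), (99.0, 99.0)]
LABELS = [0, 0, 1, 1, -1]
CENTROIDS = centroids_from_labels(SOURCE, LABELS)

print(CENTROIDS)
print(map_to_cluster((1.0, 1.0), CENTROIDS), map_to_cluster((9.0, 9.0), CENTROIDS))
```

Nearest-centroid assignment suits compact clusters; for density-based clusters with irregular shapes, assigning by the nearest clustered source point could be a more faithful alternative.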

As described above, those skilled in the art to which the present invention pertains will understand that the present invention can be implemented in other specific forms without changing its technical spirit or essential features. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims that follow rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

The features and advantages described in this specification are not all-inclusive; in particular, many additional features and advantages will become apparent to those skilled in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in this specification has been selected primarily for readability and instructional purposes, and may not have been selected to delineate or limit the subject matter of the invention.

The above description of embodiments of the present invention has been presented for purposes of illustration. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Those skilled in the art will understand that many modifications and variations are possible in light of the above disclosure.

Therefore, the scope of the invention is limited not by the detailed description but by any claims of an application based on it. Accordingly, the disclosure of the embodiments of the present invention is illustrative and does not limit the scope of the invention set forth in the following claims.

1: Fashion content embedding device for user clustering
10: Fashion content receiving module
11: Fashion item detection module
12: Fashion item classification module
13: Embedding module
14: Clustering module
50: Social network web server
100: User client

Claims

A fashion content receiving module that receives fashion content information of interest, which is fashion content information uploaded or input by a specific user to a social network web server, which is a web server of a specific social network service, through a user client of the specific user, and generates fashion content image information included in the fashion content information of interest;
A fashion item detection module that receives the fashion content image information from the fashion content receiving module, detects areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generates, for each piece of fashion item area information, fashion item image information, which is image information of at least a part of the fashion content image information, as many as the number of fashion items detected in the fashion content image information; and
An embedding module that receives the fashion item image information from the fashion item detection module, is connected to a fully connected layer (FC layer) of a pre-trained artificial neural network that takes each piece of fashion item image information as input data, and outputs the output vector of the FC layer of the artificial neural network for each piece of fashion item image information as an embedding vector for that fashion item image information;
Including,
Wherein the loss function of the artificial neural network includes the distance between the embedding vectors for the respective pieces of fashion item image information included in a single piece of fashion content information, the loss value of the loss function decreases as the distance becomes smaller, and, in the training step of the artificial neural network, the weights of the artificial neural network are learned in the direction in which the loss value decreases,
Characterized in that user clustering or fashion item clustering is performed based on the embedding vector for each piece of fashion item image information,
Fashion content embedding device for user clustering.

A fashion content receiving step in which a fashion content receiving module receives fashion content information of interest, which is fashion content information uploaded or input by a specific user to a social network web server, which is a web server of a specific social network service, through a user client of the specific user, and generates fashion content image information included in the fashion content information of interest;
A fashion item detection step in which a fashion item detection module receives the fashion content image information from the fashion content receiving module, detects areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generates, for each piece of fashion item area information, fashion item image information, which is image information of at least a part of the fashion content image information, as many as the number of fashion items detected in the fashion content image information;
An embedding step in which an embedding module receives the fashion item image information from the fashion item detection module, is connected to a fully connected layer (FC layer) of a pre-trained artificial neural network that takes each piece of fashion item image information as input data, and outputs the output vector of the FC layer of the artificial neural network for each piece of fashion item image information as an embedding vector for that fashion item image information;
Including,
Wherein the loss function of the artificial neural network includes the distance between the embedding vectors for the respective pieces of fashion item image information included in a single piece of fashion content information, the loss value of the loss function decreases as the distance becomes smaller, and, in the training step of the artificial neural network, the weights of the artificial neural network are learned in the direction in which the loss value decreases,
Characterized in that user clustering or fashion item clustering is performed based on the embedding vector for each piece of fashion item image information,
Fashion content embedding method for user clustering.

A memory module that stores fashion content embedding program code; and
A processing module that processes the fashion content embedding program code;
Including,
Wherein the fashion content embedding program code
Includes program code that executes on a computer:
A fashion content receiving step of receiving fashion content information of interest, which is fashion content information uploaded or input by a specific user to a social network web server, which is a web server of a specific social network service, through a user client of the specific user, and generating fashion content image information included in the fashion content information of interest;
A fashion item detection step of detecting areas of fashion items in the fashion content image information to generate a plurality of pieces of fashion item area information, and generating, for each piece of fashion item area information, fashion item image information, which is image information of at least a part of the fashion content image information, as many as the number of fashion items detected in the fashion content image information; and
An embedding step of outputting the output vector of a fully connected layer (FC layer) of an artificial neural network that takes the fashion item image information as input data, as an embedding vector for each piece of fashion item image information,
Wherein the loss function of the artificial neural network includes the distance between the embedding vectors for the respective pieces of fashion item image information included in a single piece of fashion content information, the loss value of the loss function decreases as the distance becomes smaller, and, in the training step of the artificial neural network, the weights of the artificial neural network are learned in the direction in which the loss value decreases,
Characterized in that user clustering or fashion item clustering is performed based on the embedding vector for each piece of fashion item image information,
Fashion content embedding device for user clustering.