KR20240037448A - User reviews evaluation device

Info

Publication number: KR20240037448A
Application number: KR1020220115969A
Authority: KR
Inventors: 안성민; 박동길; 옥순재
Original assignee: 주식회사 오투오
Priority date: 2022-09-15
Filing date: 2022-09-15
Publication date: 2024-03-22

Abstract

The present invention relates to a review authenticity evaluation device that evaluates whether the evaluation score a reviewer posted on a specific website service platform is a genuine evaluation score. To this end, the disclosed device comprises: a review data collection unit that accesses a website on which consumer reviews or advertising reviews are posted, crawls the reviews, and collects and matches evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, with the evaluation scores of those evaluation reviews to form review data; a review data preprocessing unit that preprocesses the evaluation reviews included in the review data in at least three stages so that only authentic reviews are filtered and output; a clustering unit that receives the authentic review data filtered by the review data preprocessing unit and classifies it into five sets of cluster review data; a review authenticity determination learning model generation unit that, based on the five sets of cluster review learning data classified by the clustering unit, inputs the cluster review learning data in pairs, according to predetermined conditions, into a pre-built Korean language learning model to generate a review authenticity determination learning model for each of the five clusters; and a review authenticity evaluation unit that inputs the five sets of cluster review evaluation data classified by the clustering unit into the review authenticity determination learning models corresponding to their cluster classification to evaluate the authenticity of the reviews.

Description

Review authenticity evaluation device {User reviews evaluation device}

The present invention relates to a review authenticity evaluation device and, more specifically, to a review authenticity evaluation device that evaluates whether the evaluation score a reviewer posted on a specific website service platform is a genuine evaluation score.

As online activity has grown, consumers post a wide range of evaluation reviews of restaurants, accommodations, and electronic products on website service platforms, and well-known YouTubers and influencers also publish posts reviewing specific products. In reviews posted by consumers, the review text and the evaluation score sometimes disagree, and advertising reviews posted by well-known YouTubers or influencers likewise fail to provide an objective evaluation indicator.

Therefore, it is necessary to evaluate whether a consumer review posted by a consumer, or an advertising review posted by a well-known YouTuber or influencer, is a genuine review.

KR 10-2022-0027723 (Title of invention: Product review evaluation method, device, and system)

Accordingly, the present invention was devised to solve the problems described above, and its object is to provide an invention capable of determining and evaluating whether a consumer review or an advertising review is an authentic review.

However, the objects of the present invention are not limited to the object mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the description below.

The above object of the present invention can be achieved by providing a review authenticity evaluation device comprising: a review data collection unit that accesses a website on which consumer reviews or advertising reviews are posted, crawls the reviews, and collects and matches evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, with the evaluation scores of those evaluation reviews to form review data; a review data preprocessing unit that preprocesses the evaluation reviews included in the review data in at least three stages so that only authentic reviews are filtered and output; a clustering unit that receives the authentic review data filtered by the review data preprocessing unit and classifies it into five sets of cluster review data; a review authenticity determination learning model generation unit that, based on the five sets of cluster review learning data classified by the clustering unit, inputs the cluster review learning data in pairs, according to predetermined conditions, into a pre-built Korean language learning model to generate a review authenticity determination learning model for each of the five clusters; and a review authenticity evaluation unit that inputs the five sets of cluster review evaluation data classified by the clustering unit into the review authenticity determination learning models corresponding to their cluster classification to evaluate the authenticity of the reviews.

In addition, the review data collection unit includes: a service platform access unit that accesses a website service platform on which consumer reviews or advertising reviews are posted; a review location search unit that searches for the locations of reviews posted on the website service platform; a review crawling unit that crawls the reviews posted on the website service platform based on the review locations found by the review location search unit; and a review data database unit that collects and matches evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, with the evaluation score of each evaluation review to form review data, and classifies and stores the review data by type of reviewed subject.

In addition, the review data preprocessing unit includes: a review preprocessing removal unit that analyzes the evaluation reviews included in the review data and, as a first stage, removes evaluation reviews that do not satisfy predefined conditions and outputs the rest; an advertising review removal unit that, as a second stage, checks only the evaluation reviews that passed the first-stage removal for advertising content and removes the advertising reviews; and a review reliability evaluation removal unit that, as a third stage, compares the evaluation score from a pre-built sentiment dictionary with the evaluation score of each evaluation review that passed the second-stage removal, removes evaluation reviews that do not meet preset conditions, and randomly outputs to the clustering unit only the authentic reviews that have passed all three stages.

In addition, the review preprocessing removal unit includes: a language correction unit that removes symbols and emoticons from each evaluation review so that only plain language remains, and removes evaluation reviews consisting solely of symbols and emoticons; a character conversion unit that converts uppercase English letters in the evaluation reviews to lowercase; a stopword removal unit that removes profanity or slanderous phrases from the evaluation reviews; and a grammar correction and morphological analysis unit that, only for evaluation reviews that have passed the language correction unit, the character conversion unit, and the stopword removal unit, analyzes the morphemes of each sentence through a pre-trained sentence morphological analyzer to correct the grammar of the evaluation review, removes incomplete sentences from the evaluation review according to the morphological analysis, and outputs first preprocessed review data to the advertising review removal unit.
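The first three sub-units of this stage can be illustrated with a minimal sketch. It is not the implementation of the invention: the function name `clean_review`, the character classes that are kept, and the contents of `BANNED_PHRASES` are assumptions made for illustration, and the grammar correction and morphological analysis step is left out because the specification does not name a particular analyzer.

```python
import re
from typing import Optional

# Hypothetical block list; the specification does not enumerate the phrases.
BANNED_PHRASES = {"damn", "idiot"}

# Assumption: "plain language" means Hangul syllables, Latin letters, digits,
# whitespace, and basic sentence punctuation. Everything else is stripped.
_NON_LANGUAGE = re.compile(r"[^0-9A-Za-z\uac00-\ud7a3\s.,!?]")
_HAS_LETTERS = re.compile(r"[a-z\uac00-\ud7a3]")


def clean_review(text: str) -> Optional[str]:
    """Return the cleaned review, or None when the review should be removed."""
    # Language correction: strip symbols and emoticons.
    text = _NON_LANGUAGE.sub(" ", text)
    # Character conversion: uppercase English letters to lowercase.
    text = text.lower()
    # Stopword removal: drop banned words, ignoring trailing punctuation.
    words = [w for w in text.split() if w.strip(".,!?") not in BANNED_PHRASES]
    cleaned = " ".join(words)
    # A review with no letters left consisted only of symbols or emoticons.
    if not _HAS_LETTERS.search(cleaned):
        return None
    return cleaned
```

A review for which `clean_review` returns `None` corresponds to one removed by the language correction unit; every other review would be passed on to the morphological analysis step.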

In addition, the advertising review removal unit includes: a category-specific advertising review matching input unit that matches the first preprocessed review data output from the review preprocessing removal unit to review categories and outputs it; and an advertising review filtering module unit that inputs the first preprocessed review data output by the category-specific advertising review matching input unit into the advertising review filtering model built for the corresponding review category, filters out advertising reviews to generate second preprocessed review data, and outputs the generated second preprocessed review data to the review reliability evaluation removal unit.

In addition, the advertising review filtering module unit includes: an advertising review extraction unit that classifies the first preprocessed review data output from the review preprocessing removal unit by review category and analyzes and extracts the words, phrases, and sentences of advertising reviews for each review category; a usage frequency extraction unit that calculates the usage frequency of each word, phrase, and sentence extracted per review category by the advertising review extraction unit and extracts those with a high usage frequency; a category-specific advertising review classification unit that classifies and stores, by review category, the words, phrases, and sentences extracted by the usage frequency extraction unit as filtering review words; and a category-specific filtering model generation unit that generates a filtering model for each review category using the filtering review words classified and stored by the category-specific advertising review classification unit.
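The frequency-based construction of per-category filtering words can be sketched as follows. This is an illustrative simplification under stated assumptions: it counts whitespace-separated words only (not phrases or sentences), and the names `build_filter_words` and `looks_like_ad` and the parameters `top_n` and `min_hits` are hypothetical. The specification does not state how "high usage frequency" is thresholded or how the filtering model decides.

```python
from collections import Counter, defaultdict


def build_filter_words(ad_reviews, top_n=2):
    """Map each category to its most frequent words in known advertising reviews.

    ad_reviews: iterable of (category, text) pairs.
    """
    counts = defaultdict(Counter)
    for category, text in ad_reviews:
        counts[category].update(text.lower().split())
    return {
        category: {word for word, _ in counter.most_common(top_n)}
        for category, counter in counts.items()
    }


def looks_like_ad(text, category, filter_words, min_hits=2):
    """Flag a review containing at least min_hits filtering words of its category."""
    words = set(text.lower().split())
    return len(words & filter_words.get(category, set())) >= min_hits
```

Keeping one word set per category mirrors the category-specific classification unit: the same word may indicate advertising in one category and be harmless in another.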

In addition, the review reliability evaluation removal unit includes: a sentiment dictionary modeling unit that inputs the phrases of the evaluation reviews included in the second preprocessed review data output from the advertising review removal unit into the constructed sentiment dictionary model to calculate and output a sentiment dictionary evaluation score; a review phrase and score extraction unit that extracts the evaluation reviews and their evaluation scores from the second preprocessed review data output from the advertising review removal unit and outputs the evaluation reviews to the sentiment dictionary modeling unit; a review sentiment score comparison unit that receives the sentiment dictionary evaluation score from the sentiment dictionary modeling unit and the evaluation score of the evaluation review from the review phrase and score extraction unit, and compares the sentiment dictionary evaluation score corresponding to the phrases of the evaluation review with the evaluation score of the evaluation review to calculate a review evaluation score difference value; a review reliability evaluation unit that determines that the reliability of an evaluation review is low when the review evaluation score difference value calculated by the review sentiment score comparison unit exceeds a preset condition value; and a review removal unit that randomly outputs to the clustering unit third preprocessed review data, which consists of the authentic evaluation reviews remaining after the evaluation reviews judged by the review reliability evaluation unit to have low reliability are removed.
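The comparison and removal described above reduce to a threshold test on a score difference. The sketch below assumes that both scores lie on the same scale and that the difference is taken as an absolute value. The record keys, the function names, and the default `threshold` of 1.5 are hypothetical, since the specification only refers to a "preset condition value". The random ordering of the output is omitted.

```python
def is_low_reliability(dictionary_score, posted_score, threshold=1.5):
    """True when the two scores differ by more than the preset condition value."""
    return abs(dictionary_score - posted_score) > threshold


def keep_reliable(reviews, threshold=1.5):
    """Drop reviews whose posted score disagrees with the sentiment dictionary score.

    reviews: list of dicts with 'dictionary_score' and 'posted_score' keys.
    """
    return [
        r for r in reviews
        if not is_low_reliability(r["dictionary_score"], r["posted_score"], threshold)
    ]
```

For example, a review whose text scores 1.5 in the dictionary but carries a posted score of 5.0 differs by 3.5 and would be removed under the assumed threshold.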

In addition, the sentiment dictionary modeling unit includes: a positive phrase learning unit that learns predefined positive review phrases together with their synonyms and near-synonyms; a negative phrase learning unit that learns predefined negative review phrases together with their synonyms and near-synonyms; an intermediate phrase learning unit that learns predefined intermediate review phrases, which carry a sentiment between positive and negative, together with their synonyms and near-synonyms; a phrase score matching unit that assigns a sentiment dictionary score to each learned positive, negative, and intermediate phrase according to its grade; a review phrase matching unit that matches and compares the phrases of the evaluation review output from the review phrase and score extraction unit against the phrases in the sentiment dictionary; and a sentiment dictionary score output unit that takes the average of the scores of the sentiment dictionary phrases matched to the phrases of the evaluation review as the sentiment dictionary evaluation score and outputs it to the review sentiment score comparison unit.
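The averaging performed by the sentiment dictionary score output unit can be illustrated with a small lookup table. The dictionary entries, the 1 to 5 scale, and the name `dictionary_score` are assumptions for illustration. In the described device the dictionary is learned from predefined phrases and their synonyms rather than written by hand.

```python
from statistics import mean

# Hypothetical sentiment dictionary: phrase -> score on an assumed 1-5 scale.
SENTIMENT_DICTIONARY = {
    "delicious": 5.0,   # positive phrase
    "friendly": 4.0,    # positive phrase
    "average": 3.0,     # intermediate phrase
    "slow": 2.0,        # negative phrase
    "terrible": 1.0,    # negative phrase
}


def dictionary_score(phrases, dictionary=SENTIMENT_DICTIONARY):
    """Average the scores of the review phrases found in the dictionary.

    Returns None when no phrase of the review matches the dictionary.
    """
    matched = [dictionary[p] for p in phrases if p in dictionary]
    return mean(matched) if matched else None
```

Phrases that do not appear in the dictionary are ignored rather than treated as intermediate, which is a design choice of this sketch and not something the specification states.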

In addition, the review authenticity determination learning model generation unit includes: a cluster review pair matching unit that matches the cluster review learning data in pairs of two according to predetermined conditions, based on the five sets of cluster review learning data classified by the clustering unit; a Korean language learning model unit that performs Korean language learning for each pair-matched cluster review, based on the input of the cluster review learning data matched in pairs by the cluster review pair matching unit; and a review authenticity determination learning model unit in which five review authenticity determination learning models, one per review cluster, are generated according to the per-cluster learning of the Korean language learning model unit.

In addition, the review authenticity evaluation unit includes: a cluster-specific learning model matching unit that matches and inputs the five sets of cluster review evaluation data classified by the clustering unit to the five review authenticity determination learning models generated by the review authenticity determination learning model unit; a review authenticity determination learning module unit that outputs review authenticity evaluation data in which the authenticity of the cluster review evaluation data, input to one of the review authenticity determination learning models through the matching by the cluster-specific learning model matching unit, is evaluated as 'true' or 'false'; a per-review authenticity evaluation collection unit that categorizes by cluster review, collects, and stores the authenticity determinations made by the review authenticity determination learning module unit; a per-cluster review average calculation unit that calculates a determination average for each cluster review based on the 'true' or 'false' determined by the review authenticity determination learning model; and a review authenticity evaluation provision unit that compares the determination average of the determination learning model with the review average of the website service platform and provides the comparison.
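One possible reading of the per-cluster determination average is the fraction of reviews in each cluster judged 'true'. The sketch below follows that reading as an assumption. The specification does not say how the 'true' or 'false' outputs are averaged or how they are put on a common scale with the platform's review average, so the report simply places the two values side by side. All names are hypothetical.

```python
from collections import defaultdict


def cluster_true_ratio(judgments):
    """Return {cluster_id: fraction of reviews judged authentic}.

    judgments: iterable of (cluster_id, is_authentic) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # cluster_id -> [true count, total count]
    for cluster_id, is_authentic in judgments:
        totals[cluster_id][0] += int(is_authentic)
        totals[cluster_id][1] += 1
    return {c: true / total for c, (true, total) in totals.items()}


def authenticity_report(judgments, platform_averages):
    """Pair each cluster's determination average with the platform's review average."""
    ratios = cluster_true_ratio(judgments)
    return {
        c: {"true_ratio": ratio, "platform_average": platform_averages.get(c)}
        for c, ratio in ratios.items()
    }
```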

In addition, the device further includes a review authenticity evaluation feedback unit that selects, according to preset conditions, review authenticity evaluation data of low accuracy from among the review authenticity evaluation data produced by the review authenticity evaluation unit, receives feedback on the review evaluations related to the selected review authenticity evaluation data through survey responses, extracts the review evaluation scores contained in the survey responses, and applies them as corrections to the corresponding selected review authenticity evaluation data.

In addition, the review authenticity evaluation feedback unit includes: a survey response evaluation unit that determines, according to predefined conditions, whether a survey response related to the selected review authenticity evaluation data is valid; a survey response review and evaluation score extraction unit that extracts the review and the respondent's evaluation score for the review from each survey response determined to be valid by the survey response evaluation unit; a review evaluation score correction unit that applies the respondent's evaluation score extracted by the survey response review and evaluation score extraction unit as a correction to the corresponding review authenticity evaluation data; and a response evaluation feedback unit that re-inputs the review authenticity evaluation data corrected by the review evaluation score correction unit into the cluster review pair matching unit.
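The validity check and score correction can be sketched as a filtered update keyed by review. The record layout (`review_id`, `score`), the function name `apply_survey_feedback`, and the caller-supplied `is_valid` predicate are assumptions. The specification leaves the "predefined conditions" for a valid response unspecified.

```python
def apply_survey_feedback(evaluations, responses, is_valid):
    """Return a corrected copy of the evaluation data.

    evaluations: {review_id: {"score": float, ...}}
    responses:   iterable of dicts with 'review_id' and 'score' keys
    is_valid:    predicate deciding whether a survey response may be used
    """
    # Copy each record so the original evaluation data is left unchanged.
    corrected = {rid: dict(record) for rid, record in evaluations.items()}
    for response in responses:
        rid = response["review_id"]
        # Only valid responses that match an existing review are applied.
        if is_valid(response) and rid in corrected:
            corrected[rid]["score"] = response["score"]
    return corrected
```

The corrected records are what the response evaluation feedback unit would pass back to the cluster review pair matching unit for rematching and retraining.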

In addition, when the response evaluation feedback unit re-inputs the cluster review learning data, the cluster review pair matching unit rematches the cluster review learning data in pairs of two according to predetermined conditions, based on the five sets of cluster review learning data.

Meanwhile, the object of the present invention can also be achieved by providing a review authenticity evaluation learning method comprising: a step in which the review data collection unit accesses a website on which consumer reviews or advertising reviews are posted, crawls the posted reviews, and generates review data by collecting and matching evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, with the evaluation scores of those evaluation reviews; a step in which the review data preprocessing unit preprocesses the evaluation reviews included in the review data in at least three stages so that only authentic reviews are filtered and output; a step in which the clustering unit receives the authentic review data filtered by the review data preprocessing unit and classifies it into five sets of cluster review learning data; a step in which the review authenticity determination learning model generation unit, based on the five sets of cluster review learning data classified by the clustering unit, inputs the cluster review learning data in pairs, according to predetermined conditions, into the pre-built Korean language learning model so that a review authenticity determination learning model is generated for each of the five clusters; a step of feeding back survey responses, in which review authenticity evaluation data of low accuracy is selected according to preset conditions from among the review authenticity evaluation data produced by the review authenticity evaluation unit, feedback on the review evaluations related to the selected data is received through survey responses, and the review evaluation scores contained in the survey responses are extracted and applied as corrections to the corresponding selected review authenticity evaluation data; and a step in which cluster review learning data is regenerated based on the corrected review authenticity evaluation data and input into the review authenticity determination learning model generation unit for retraining, so that a new review authenticity determination learning model related to the regenerated cluster review learning data is generated.

In addition, the step of generating the review data includes: a step in which the service platform access unit accesses a website service platform on which consumer reviews or advertising reviews are posted; a step in which the review location search unit searches for the locations of reviews posted on the website service platform; a step in which the review crawling unit crawls the reviews posted on the website service platform based on the review locations found by the review location search unit; and a step in which the review data database unit collects and matches evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, with the evaluation score of each evaluation review to form review data, and classifies and stores the review data by type of reviewed subject.

In addition, the step of preprocessing and outputting includes: a step in which the review preprocessing removal unit analyzes the evaluation reviews included in the review data and, as a first stage, removes evaluation reviews that do not satisfy predefined conditions and outputs the rest; a step in which the advertising review removal unit, as a second stage, checks only the evaluation reviews that passed the first-stage removal for advertising content and removes the advertising reviews; and a step in which the review reliability evaluation removal unit, as a third stage, compares the evaluation score from the pre-built sentiment dictionary with the evaluation score of each evaluation review that passed the second-stage removal, removes evaluation reviews that do not meet preset conditions, and randomly outputs to the clustering unit only the authentic reviews that have passed all three stages.

In addition, the step in which the review authenticity determination learning models are generated includes: a step in which the cluster review pair matching unit matches the cluster review learning data in pairs of two according to predetermined conditions, based on the five sets of cluster review learning data classified by the clustering unit; a step in which the Korean language learning model unit performs Korean language learning for each pair-matched cluster review, based on the input of the cluster review learning data matched in pairs by the cluster review pair matching unit; and a step in which the review authenticity determination learning model unit generates five review authenticity determination learning models, one per review cluster, according to the per-cluster learning of the Korean language learning model unit.

In addition, the step of feeding back the survey responses includes: a step in which the survey response evaluation unit determines, according to predefined conditions, whether a survey response related to the selected review authenticity evaluation data is valid; a step in which the survey response review and evaluation score extraction unit extracts the review and the respondent's evaluation score for the review from each survey response determined to be valid by the survey response evaluation unit; a step in which the review evaluation score correction unit applies the respondent's evaluation score extracted by the survey response review and evaluation score extraction unit as a correction to the corresponding review authenticity evaluation data; and a step in which the response evaluation feedback unit re-inputs the review authenticity evaluation data corrected by the review evaluation score correction unit into the cluster review pair matching unit, thereby feeding back the survey responses.

In addition, the step in which the new review authenticity determination learning model is regenerated includes: a step in which the cluster review pair matching unit generates new cluster review learning data based on the corrected review authenticity evaluation data from the response evaluation feedback unit; a step in which the cluster review pair matching unit rematches the cluster review learning data in pairs of two according to predetermined conditions, based on the new cluster review learning data, and outputs the newly rematched cluster review learning data to the Korean language learning model unit; and a step in which the Korean language learning model unit retrains using the cluster review learning data input in pairs, so that a new review authenticity determination learning model related to the regenerated cluster review learning data is generated.

According to the present invention as described above, it is possible to determine and evaluate whether a consumer review or an advertising review is an authentic review.

The following drawings attached to this specification illustrate a preferred embodiment of the present invention and, together with the detailed description, serve to further the understanding of its technical idea; the present invention should therefore not be construed as limited to the matters shown in the drawings.
Figure 1 is a diagram showing the general configuration of a review authenticity evaluation device according to an embodiment of the present invention;
Figures 2 and 3 are diagrams showing a review data collection unit according to an embodiment of the present invention;
Figure 4 is a diagram showing a review data preprocessing unit according to an embodiment of the present invention;
Figures 5 and 6 are diagrams showing a review pre-processing removal unit according to an embodiment of the present invention;
Figures 7 to 9 are diagrams showing an advertising review removal unit according to an embodiment of the present invention;
Figures 10 to 12 are diagrams showing a review reliability evaluation removal unit according to an embodiment of the present invention;
Figure 13 is a diagram showing a clustering unit according to an embodiment of the present invention;
Figure 14 is a diagram showing a review authenticity determination learning model generation unit and a review authenticity evaluation unit according to an embodiment of the present invention;
Figure 15 is a diagram showing a Korean learning model unit and a review authenticity determination learning model unit according to an embodiment of the present invention;
Figure 16 is a diagram showing a review authenticity evaluation feedback unit according to an embodiment of the present invention;
Figure 17 is a diagram sequentially showing a review authenticity evaluation learning method according to an embodiment of the present invention.

Hereinafter, a preferred embodiment of the present invention will be described with reference to the drawings. The embodiment described below does not unduly limit the content of the present invention as set out in the claims, and not all of the configurations described in this embodiment are necessarily essential as means of solving the problem addressed by the invention. Descriptions of the prior art and of matters obvious to those skilled in the art may be omitted, and such omitted components (methods) and functions may be referred to as needed without departing from the technical spirit of the present invention.

(Configuration and function of the review authenticity evaluation device)

The review authenticity evaluation device according to an embodiment of the present invention is a device that evaluates whether a consumer review or an advertising review posted on a website is an authentic review. The device is described in detail below with reference to the attached drawings.

As shown in Figure 1, the review authenticity evaluation device according to an embodiment of the present invention includes a review data collection unit 100, a review data preprocessing unit 200, a clustering unit 300, a review authenticity determination learning model generation unit 400, a review authenticity evaluation unit 500, and a review authenticity evaluation feedback unit 600.

The review authenticity determination learning model generation unit 400 generates a review authenticity determination model from the cluster review learning data output by the clustering unit 300, and the cluster review evaluation data output by the clustering unit 300 is input into the trained review authenticity determination model to evaluate the authenticity of the evaluation reviews. The clustering unit 300 uses the cluster review learning data to classify the learning data into clusters by evaluation score.

As shown in Figure 2, the review data collection unit 100 accesses a website on which consumer reviews or advertising reviews are posted, crawls the reviews, collects evaluation reviews consisting of at least one of the words, phrases, and sentences of the crawled reviews, matches each evaluation review with its evaluation score, and stores the matched review data. Here, a consumer review is a review of a product, lodging establishment, or restaurant posted by a consumer on a website service platform, together with its evaluation score; an advertising review is a review of a specific product or restaurant, with an evaluation score, posted by a well-known YouTuber or influencer. Reviews are not limited to products or restaurants, however: any review that can be posted on a website service platform may be covered by the present invention.

As shown in Figure 2, the review data collection unit 100 according to an embodiment of the present invention includes a service platform connection unit 110, a review location search unit 120, a review crawling unit 130, and a review data database unit 140. The service platform connection unit 110 connects to a specific website service platform on which consumer reviews or advertising reviews are posted. Once all reviews on the connected website service platform have been crawled, the service platform connection unit 110 connects to the next website service platform. The service platform connection unit 110 may connect either directly or indirectly through a link.

The review location search unit 120 searches for the location of the reviews posted on the specific website service platform to which the service platform connection unit 110 has connected.

The review crawling unit 130 crawls the reviews posted on the specific website service platform based on the review location found by the review location search unit 120. Referring to Figure 3, a crawled review may contain, for example, 'restaurant name, restaurant location, restaurant address, reviewer evaluation score, rating review content' in the case of a consumer review, and 'restaurant name, restaurant location, restaurant address, reviewer evaluation score, advertising review content' in the case of an advertising review.

The review data database unit 140 collects evaluation reviews consisting of at least one of the words, phrases, and sentences of the reviews crawled by the review crawling unit 130, matches each evaluation review with its evaluation score, and stores the result as review data. The review data may also be classified and stored by the type of object being reviewed, such as electronic products or restaurants. Once all the crawled data has been stored in the review data database unit 140, the service platform connection unit 110 connects to a new website service platform on which reviews are posted. The website service platforms to be accessed may therefore be designated in order in the service platform connection unit 110.
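As a rough illustration of how matched review data might be stored by type of reviewed object, the sketch below keeps each evaluation review together with its score and groups records by category. The field names and the in-memory store are assumptions made for illustration; the specification does not define a storage schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewRecord:
    category: str      # type of reviewed object, e.g. "restaurant"
    target_name: str   # e.g. the restaurant name
    score: float       # reviewer evaluation score matched to the text
    text: str          # evaluation review (word, phrase, or sentence)


class ReviewDatabase:
    """Hypothetical in-memory stand-in for the review data database unit."""

    def __init__(self):
        self._by_category = defaultdict(list)

    def store(self, record):
        self._by_category[record.category].append(record)

    def by_category(self, category):
        # .get avoids creating empty entries for unknown categories
        return list(self._by_category.get(category, []))
```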

The review data preprocessing unit 200 according to an embodiment of the present invention preprocesses the evaluation reviews included in the review data collected by the review data collection unit 100 in at least three stages, so that only authentic reviews are filtered through and output. As shown in Figure 4, the review data preprocessing unit 200 removes reviews in a first stage through the review preprocessing removal unit 210, in a second stage through the advertising review removal unit 220, and in a third stage through the review reliability evaluation removal unit 230.

As shown in Figure 5, the review preprocessing removal unit 210 analyzes the evaluation reviews included in the review data collected by the review data collection unit 100, removes in a first stage the evaluation reviews that fail predefined conditions, and outputs the remainder to the advertising review removal unit 220. To this end, the review preprocessing removal unit 210 includes a language correction unit 211, a character conversion unit 212, a stopword removal unit 213, and a grammar correction unit 214 (referred to below as the grammar correction and morpheme analysis unit 214).

The language correction unit 211 removes symbols and emoticons from each evaluation review so that only plain language remains, and removes evaluation reviews that consist solely of symbols and emoticons. For example, if the word or sentence of an evaluation review consists only of 'ㅋㅋ', '^^', or emoticons, the review is removed; if a word or sentence merely contains 'ㅋㅋ', '^^', or emoticons, these are stripped so that only plain language remains.
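A minimal sketch of this kind of cleaning is shown below. It assumes that "plain language" means Hangul syllables, Latin letters, digits, and spaces; that character set, and the choice to drop punctuation as well, are illustrative assumptions rather than anything stated in the specification.

```python
import re
from typing import Optional

# Anything that is not a digit, Latin letter, Hangul syllable, or whitespace
# is treated as a symbol/emoticon (assumed character set).
_NON_LANGUAGE = re.compile(r"[^0-9A-Za-z\uAC00-\uD7A3\s]")


def correct_language(text: str) -> Optional[str]:
    """Strip symbols and emoticons; return None if nothing is left."""
    cleaned = _NON_LANGUAGE.sub(" ", text)
    cleaned = " ".join(cleaned.split())  # collapse leftover whitespace
    return cleaned or None
```

Returning `None` signals that the whole review should be dropped, which mirrors the case of a review made up only of 'ㅋㅋ' or '^^'.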

The character conversion unit 212 converts uppercase English letters in the evaluation review to lowercase for training and outputs the result.

The stopword removal unit 213 detects and removes abusive or slanderous phrases in the evaluation review according to preset conditions. If an evaluation review consists solely of abusive or slanderous phrases, the review itself is removed; if it merely contains such phrases, they are stripped from the review.
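The lowercase conversion and the phrase removal described above could look roughly like this. The blocked-phrase list is a placeholder and the naive substring replacement is an assumption; the specification only says that removal follows preset conditions.

```python
from typing import Iterable, Optional


def to_lowercase(text: str) -> str:
    """Character conversion step: uppercase English letters become lowercase."""
    return text.lower()


def remove_blocked_phrases(text: str, blocked: Iterable[str]) -> Optional[str]:
    """Strip blocked phrases; return None if the review was nothing else.

    Naive substring matching is used here for brevity, so a blocked word
    embedded in a longer word would also be cut.
    """
    for phrase in blocked:
        text = text.replace(phrase, " ")
    cleaned = " ".join(text.split())
    return cleaned or None
```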

The grammar correction and morpheme analysis unit 214 processes only the evaluation reviews that have passed through the language correction unit 211, the character conversion unit 212, and the stopword removal unit 213. Using a pre-trained sentence morpheme analyzer, it analyzes the morphemes of each sentence, corrects grammatical errors in the evaluation review, and removes or repairs incomplete sentences according to the morphological analysis. The grammar correction and morpheme analysis unit 214 then outputs the resulting first preprocessed review data to the advertising review removal unit 220.

As shown in Figure 6, the first preprocessed review data output from the grammar correction and morpheme analysis unit 214 excludes reviews that fail any of the conditions and includes reviews that satisfy all of them. The first preprocessed review data includes at least, for example, 'restaurant name, reviewer evaluation score, evaluation review content (which may consist of words, phrases, or sentences)'.

The advertising review removal unit 220 takes only the evaluation reviews that remain after the first-stage removal by the review preprocessing removal unit 210, filters them for advertising reviews, and thereby performs the second-stage removal; the resulting second preprocessed review data is output to the review reliability evaluation removal unit 230. Whether a review is an advertising review can be determined by building the advertising review filtering module unit 222 described below. To this end, as shown in Figure 7, the advertising review removal unit 220 includes a category-specific advertising review matching input unit 221 and an advertising review filtering module unit 222.

The category-specific advertising review matching input unit 221 matches the first preprocessed review data output from the review preprocessing removal unit 210 by review category and outputs it. That is, it inputs the first preprocessed review data into the advertising review filtering model 222d corresponding to the category of that data. For example, if the category of the first preprocessed review data is restaurant reviews, the data is input to the model 222d that filters restaurant reviews; if the category is electronic product reviews, the data is input to the model 222d that filters electronic product reviews.

The advertising review filtering module unit 222 inputs the first preprocessed review data, matched and output by the category-specific advertising review matching input unit 221, into the advertising review filtering model built for the corresponding review category, filters out advertising reviews to generate the second preprocessed review data, and outputs that data to the review reliability evaluation removal unit 230. To this end, as shown in Figure 8, the advertising review filtering module unit 222 includes an advertising review extraction unit 222a, a usage frequency extraction unit 222b, a category-specific advertising review classification unit 222c, and a category-specific filtering model generation unit 222d. The advertising review filtering module unit 222 needs to be built in advance for each type of review (for example, restaurant reviews or electronic product reviews), and can be built in advance using the first preprocessed review data described above as training data.

The advertising review extraction unit 222a classifies the first preprocessed review data output from the review preprocessing removal unit 210 by review category and, for model training, analyzes and extracts the words, phrases, and sentences of advertising reviews in each review category.

The usage frequency extraction unit 222b calculates, for each review category, how often each word, phrase, and sentence extracted by the advertising review extraction unit 222a is used, and extracts those with a high usage frequency.

The category-specific advertising review classification unit 222c classifies and stores, by review category, the words, phrases, and sentences extracted by the usage frequency extraction unit 222b as filtering review words. For example, a review can be presumed to be an advertising review when the frequency with which a word appears in it satisfies a preset condition, and words with a high frequency of appearance can accordingly be defined as filtering review words.

The category-specific filtering model generation unit 222d generates a filtering model for each review category using the filtering review words classified and stored by the category-specific advertising review classification unit 222c. For example, it generates an electronic product advertising review filtering model for electronic product reviews and a restaurant advertising review filtering model for restaurant reviews. In the electronic product advertising review filtering model, frequently appearing words, phrases, and sentences related to electronic products are learned and stored as filtering review words; in the restaurant advertising review filtering model, frequently appearing words, phrases, and sentences related to restaurants are learned and stored as filtering review words. The electronic product filtering review words are model data crawled from website service platforms related to electronic products, and the restaurant filtering review words are model data crawled from website service platforms related to restaurants.

When the first preprocessed review data is input into the filtering model for its review category, a review containing a filtering review word defined in that model is judged to be an advertising review and can be deleted, as shown in Figure 9; a review containing no filtering review word is not an advertising review and can be retained.
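The frequency-based construction and use of the per-category filtering models can be sketched as follows. This is a simplified word-level illustration: the `min_count` threshold, whitespace tokenization, and counting each word once per review are assumptions, and the specification's models may also cover phrases and sentences.

```python
from collections import Counter


def build_filter_words(ad_reviews, min_count=2):
    """Collect words that appear in at least min_count advertising reviews."""
    counts = Counter()
    for review in ad_reviews:
        counts.update(set(review.lower().split()))  # once per review
    return {word for word, count in counts.items() if count >= min_count}


def build_models(ad_reviews_by_category, min_count=2):
    """One set of filtering review words per review category."""
    return {
        category: build_filter_words(reviews, min_count)
        for category, reviews in ad_reviews_by_category.items()
    }


def filter_ads(reviews, category, models):
    """Keep only reviews containing no filtering review word for the category."""
    words = models.get(category, set())
    return [r for r in reviews if not (set(r.lower().split()) & words)]
```

In practice the threshold would need tuning so that ordinary high-frequency words are not treated as advertising markers.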

The review reliability evaluation removal unit 230 takes only the evaluation reviews that remain after the second-stage removal by the advertising review removal unit 220, compares the evaluation score derived from a pre-built sentiment dictionary with the evaluation score of each evaluation review, and removes in a third stage the evaluation reviews that do not meet preset conditions. Only the authentic reviews that have passed all three stages (the review preprocessing removal unit 210, the advertising review removal unit 220, and the review reliability evaluation removal unit 230) are output to the clustering unit 300 in random order. The review reliability evaluation removal unit 230 includes a sentiment dictionary modeling unit 231, a review phrase and score extraction unit 232, a review sentiment score comparison unit 233, a review reliability evaluation unit 234, and a review removal unit 235.

As shown in Figures 10 to 12, the sentiment dictionary modeling unit 231 inputs the phrases of the evaluation reviews included in the second preprocessed review data output from the advertising review removal unit 220 into the built sentiment dictionary model, calculates a sentiment dictionary evaluation score, and outputs it to the review sentiment score comparison unit 233. To this end, the sentiment dictionary modeling unit 231 includes a positive phrase learning unit 231a, a negative phrase learning unit 231b, an intermediate phrase learning unit 231c, a phrase-specific score matching unit 231d, a review phrase matching unit 231e, and a sentiment dictionary score output unit 231f.

The positive phrase learning unit 231a learns predefined positive review phrases together with their synonyms and near-synonyms. Examples of positive phrases include 'It's really delicious' (참 맛있어요) and 'It's reasonably tasty' (그런대로 맛있어요). Although both are positive phrases, they belong to different grades of positivity and therefore have different sentiment dictionary scores.

The negative phrase learning unit 231b learns predefined negative review phrases together with their synonyms and near-synonyms. Examples of negative phrases include 'It's really tasteless' (참 맛없어요), 'It's truly tasteless' (진짜 맛없어요), and 'It's a bit tasteless' (조금 맛없어요). The first two are similar negative phrases and have the same sentiment dictionary score, whereas 'It's a bit tasteless' belongs to a different grade of negativity and has a different score.

The intermediate phrase learning unit 231c learns predefined intermediate review phrases, whose sentiment lies between positive and negative, together with their synonyms and near-synonyms. Examples include 'It's not that bad' (썩 나쁘지 않네요) and 'I can't say it's particularly tasty' (특별히 맛이 있다고 볼 수 없네요); the sentiment dictionary score of an intermediate phrase may likewise vary with the grade of the phrase.

The phrase-specific score matching unit 231d assigns sentiment dictionary scores to the positive, negative, and intermediate phrases of the learned sentiment dictionary according to the grade of each phrase. It also receives the matched phrases output by the review phrase matching unit 231e and outputs the corresponding evaluation scores to the sentiment dictionary score output unit 231f.

The review phrase matching unit 231e compares the phrases of the evaluation review (including at least one of words, phrases, and sentences) output from the review phrase and score extraction unit 232 with the phrases of the learned sentiment dictionary, matching each against an identical phrase, a synonym, or a similar word in the dictionary. The matched sentiment dictionary phrases are output to the phrase-specific score matching unit 231d.

The sentiment dictionary score output unit 231f takes the average score of the sentiment dictionary phrases matched to the phrases of the evaluation review as the sentiment dictionary evaluation score and outputs it to the review sentiment score comparison unit 233. When an evaluation review contains multiple phrases, the sentiment dictionary values of those phrases are averaged.
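The averaging described above can be illustrated with a small dictionary lookup. The English phrases and the scores attached to them are invented for illustration, and synonym matching is omitted; the specification does not give concrete score values.

```python
# Hypothetical graded phrases; the scores are illustrative only.
SENTIMENT_DICTIONARY = {
    "really delicious": 5.0,
    "reasonably tasty": 4.0,
    "not that bad": 3.0,
    "a bit tasteless": 2.0,
    "really tasteless": 1.0,
}


def dictionary_score(review_text, dictionary):
    """Average the scores of all dictionary phrases found in the review."""
    text = review_text.lower()
    matched = [score for phrase, score in dictionary.items() if phrase in text]
    if not matched:
        return None  # no dictionary phrase matched
    return sum(matched) / len(matched)
```

For a review containing both 'really delicious' (5.0) and 'a bit tasteless' (2.0), the sentiment dictionary evaluation score under these assumed values is their mean, 3.5.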

The review phrase and score extraction unit 232 extracts the evaluation reviews and their evaluation scores from the second preprocessed review data output by the advertising review removal unit 220, and outputs the evaluation reviews to the sentiment dictionary modeling unit 231.

The review sentiment score comparison unit 233 receives the sentiment dictionary evaluation score from the sentiment dictionary modeling unit 231 and the evaluation score of the evaluation review from the review phrase and score extraction unit 232, and compares the two to calculate a review evaluation score difference value.

The review reliability evaluation unit 234 evaluates the reliability of an evaluation review as low when the review evaluation score difference value calculated by the review sentiment score comparison unit 233 exceeds a preset condition value.

The review removal unit 235 removes the evaluation reviews that the review reliability evaluation unit 234 evaluated as having low reliability, and outputs the remaining authentic evaluation reviews, as third preprocessed review data, to the clustering unit in random order.
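The comparison, reliability check, and random output can be sketched together as below. The absolute difference, the `max_gap` value of 1.5, and the tuple layout are assumptions; the specification only refers to a difference value and a preset condition value.

```python
import random


def is_reliable(dictionary_score, reviewer_score, max_gap=1.5):
    """A review is unreliable when the score difference exceeds max_gap."""
    return abs(dictionary_score - reviewer_score) <= max_gap


def third_stage(reviews, max_gap=1.5, seed=None):
    """Drop unreliable reviews and return the rest in random order.

    Each review is a (text, dictionary_score, reviewer_score) tuple.
    """
    kept = [r for r in reviews if is_reliable(r[1], r[2], max_gap)]
    random.Random(seed).shuffle(kept)
    return kept
```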

The clustering unit 300 according to an embodiment of the present invention receives the authentic review data (the third preprocessed review data) filtered in three stages by the review data preprocessing unit 200 and classifies it into five clusters of review data. For example, the clustering unit 300 receives the authentic review data in random order and analyzes the words (which may include phrases or sentences) of the evaluation reviews it contains, taking into account their characteristics, context, meaning, and similarity, to create five specific clusters. As shown in Figure 13, the five clusters may be the first to fifth review cluster data units (310, ..., 350), in each of which identical or similar words (which may include phrases or sentences) are grouped together. For example, the first review cluster data unit 310 groups evaluation reviews with an evaluation score of 1 point or similar, and the second review cluster data unit 320 groups evaluation reviews with an evaluation score of 2 points or similar. Likewise, the third to fifth review cluster data units (330, ..., 350) group evaluation reviews with scores equal or similar to 3, 4, and 5 points, respectively. The clustering unit 300 thus receives the third preprocessed review data from the review data preprocessing unit 200 in random order and outputs cluster data grouped into five clusters, with a cluster module created for each cluster corresponding to an evaluation score of 1 to 5 points.

The cluster data may be classified using the DEC (Deep Embedding Clustering) algorithm. A description of the DEC algorithm is omitted here; it may be referred to as needed without departing from the technical spirit of the present invention.
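Since the description leaves DEC unexplained, the following is a minimal sketch of its two core quantities: the Student's t soft assignment of an embedding to each cluster centroid, and the sharpened target distribution derived from it. It assumes the review embeddings have already been produced by some encoder, which this description does not specify. The function names, the toy 2-D embeddings, and the centroid values are hypothetical. A full DEC implementation would also train the encoder and centroids by minimizing the KL divergence between the two distributions, which is not shown here.

```python
from typing import List

Vector = List[float]


def soft_assign(z: Vector, centroids: List[Vector], alpha: float = 1.0) -> List[float]:
    """Student's t soft assignment of one embedding to each centroid."""
    weights = []
    for mu in centroids:
        dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
        weights.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    return [w / total for w in weights]


def target_distribution(q: List[List[float]]) -> List[List[float]]:
    """Sharpened targets: square each q_ij, divide by the soft cluster
    frequency f_j = sum_i q_ij, then renormalize each row."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        raw = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(raw)
        p.append([r / total for r in raw])
    return p


# Toy 2-D embeddings and five centroids (illustrative values only).
centroids = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
embeddings = [[0.1, 0.0], [3.9, 0.1], [2.0, 0.2]]

q = [soft_assign(z, centroids) for z in embeddings]
p = target_distribution(q)
# Hard cluster numbers 1..5, taken from the largest soft assignment.
hard_labels = [row.index(max(row)) + 1 for row in q]
```

Here `hard_labels` maps each toy embedding to the cluster whose centroid is nearest, corresponding to the five review cluster data units 310 to 350.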

The third preprocessed review data input from the review data preprocessing unit 200 to the clustering unit 300 may be either review learning data or review evaluation data. Review learning data is the data input to the clustering unit 300 when the review authenticity determination learning models described below are generated. Review evaluation data is the data input to the clustering unit 300 after those models have been generated, namely the evaluation reviews on a specific website service platform whose authenticity is actually to be evaluated.

The review authenticity determination learning model generation unit 400 according to an embodiment of the present invention takes the five clusters of review learning data classified by the clustering unit 300, pairs them two at a time according to predetermined conditions as shown in FIGS. 14 and 15, and inputs each pair into the pre-built Korean learning model 420, thereby generating a review authenticity determination learning model 431 to 435 for each of the five clusters.

That is, the first review cluster data unit 310 and the fifth review cluster data unit 350 are matched as a pair and input into the Korean learning model unit 420 to generate the first review authenticity determination model unit 431. The first review cluster data unit 310 is cluster data grouping reviews whose evaluation score is equal or close to 1 point, and the fifth review cluster data unit 350 is cluster data grouping reviews whose evaluation score is equal or close to 5 points. Therefore, to train the review authenticity determination model 431 for the 1-point score, the 1-point cluster is matched with the cluster for 5 points, the score most opposite to it, and the pair is input into the Korean learning model unit 420.

Likewise, the second review cluster data unit 320, which is the cluster data for the 2-point score, is matched with the fifth review cluster data unit 350 for 5 points, the score most opposite to it. The pair is input into the Korean learning model unit 420 and trained, generating the review authenticity determination model 432 for the 2-point score.

The third review cluster data unit 330, which is the cluster data for the 3-point score, is likewise matched with the fifth review cluster data unit 350 for 5 points, which is treated as its opposite. The pair is input into the Korean learning model unit 420 and trained, generating the review authenticity determination model 433 for the 3-point score.

The fourth review cluster data unit 340, which is the cluster data for the 4-point score, is matched with the first review cluster data unit 310 for 1 point, the score most opposite to it. The pair is input into the Korean learning model unit 420 and trained, generating the review authenticity determination model 434 for the 4-point score.

Finally, the fifth review cluster data unit 350, which is the cluster data for the 5-point score, is matched with the first review cluster data unit 310 for 1 point, the score most opposite to it. The pair is input into the Korean learning model unit 420 and trained, generating the review authenticity determination model 435 for the 5-point score.

The review authenticity determination learning model generation unit 400 includes a cluster review pair matching unit 410, a Korean learning model unit 420, and a review authenticity determination learning model unit 430.

The cluster review pair matching unit 410 takes the five clusters of review learning data classified by the clustering unit 300, matches them two at a time according to predetermined conditions, and inputs each pair into the Korean learning model unit 420. That is, as described above, it matches the first and fifth review cluster data units 310 and 350, the second and fifth review cluster data units 320 and 350, the third and fifth review cluster data units 330 and 350, the fourth and first review cluster data units 340 and 310, and the fifth and first review cluster data units 350 and 310.
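The pairing rule stated above can be sketched as a simple lookup table. The mapping itself (1 with 5, 2 with 5, 3 with 5, 4 with 1, 5 with 1) comes from the description. How the two clusters in a pair are labeled for training is not stated, so the labels used here (1 for the target cluster, 0 for the opposite cluster) are an assumption, as are the function name and the toy review texts.

```python
from typing import Dict, List, Tuple

# Pairing stated in the description: target cluster -> opposite cluster.
OPPOSITE_CLUSTER: Dict[int, int] = {1: 5, 2: 5, 3: 5, 4: 1, 5: 1}


def build_training_pairs(
    clusters: Dict[int, List[str]],
) -> Dict[int, List[Tuple[str, int]]]:
    """Build one training set per target cluster.

    Assumption: reviews from the target cluster are labeled 1 and
    reviews from its opposite cluster are labeled 0.
    """
    pairs = {}
    for target, opposite in OPPOSITE_CLUSTER.items():
        positives = [(text, 1) for text in clusters[target]]
        negatives = [(text, 0) for text in clusters[opposite]]
        pairs[target] = positives + negatives
    return pairs


# Toy clusters keyed by cluster number (illustrative texts only).
clusters = {
    1: ["terrible"],
    2: ["not good"],
    3: ["average"],
    4: ["quite good"],
    5: ["excellent"],
}
training_sets = build_training_pairs(clusters)
```

Each entry of `training_sets` would then be passed to the Korean learning model unit 420 to produce the model for that cluster.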

The Korean learning model unit 420 performs Korean-language training for each matched pair of cluster reviews, based on the paired cluster review learning data received from the cluster review pair matching unit 410. The KoBERT model is used as the Korean learning model; existing descriptions of KoBERT may be referred to without departing from the technical spirit of the present invention.

The review authenticity determination learning model unit 430 generates five review authenticity determination learning models 431 to 435, one per review cluster, from the per-pair training performed by the Korean learning model unit 420.

The review authenticity evaluation unit 500 according to an embodiment of the present invention evaluates the authenticity of reviews by inputting each of the five clusters of review evaluation data classified by the clustering unit 300 into the review authenticity determination learning model corresponding to that cluster. That is, when data from the first review cluster data unit 310 is input, it is passed to the first review authenticity determination model unit 431, which evaluates the authenticity of the reviews. Likewise, when data from the second review cluster data unit 320 is input, it is passed to the second review authenticity determination model unit 432. The third to fifth review cluster data units 330 to 350 are matched and input on the same principle. The five clusters of review evaluation data may all be input at the same time, or the number of clusters to input may be set as needed.

Once the review authenticity determination learning model generation unit 400 has completed training on the cluster review data and generated the review authenticity determination models, the review authenticity evaluation unit 500 uses the generated models to evaluate whether the reviews under evaluation are authentic.

For this purpose, the review authenticity evaluation unit 500 includes a per-cluster learning model matching unit 510, a review authenticity determination learning module unit 520, a per-review authenticity evaluation collection unit 530, a per-cluster review average calculation unit 540, and a review authenticity evaluation providing unit 550.

The per-cluster learning model matching unit 510 matches the five clusters of review evaluation data classified by the clustering unit 300 with the five review authenticity determination learning models generated by the review authenticity determination learning model unit 430. That is, when the classified cluster review evaluation data relates to the 1-point score, it is matched so that it is input to the first review authenticity determination model unit 431. The same principle applies to the 2-point to 5-point scores.

The review authenticity determination learning module unit 520 holds the five review authenticity determination learning models generated by the review authenticity determination learning model unit 430. For the cluster review evaluation data that the per-cluster learning model matching unit 510 has matched and input to one of these models, the review authenticity determination learning module unit 520 outputs review authenticity evaluation data in which authenticity is judged as 'true' or 'false'.
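The routing performed by units 510 and 520 can be sketched as follows: each cluster's evaluation reviews go to the model with the same cluster number, and each model returns a true/false verdict. The models below are keyword-based stand-ins used only to make the sketch runnable; they are hypothetical and are not the trained classifiers the description refers to. The function name is also an assumption.

```python
from typing import Callable, Dict, List

# A model here is any callable mapping review text to True/False.
Model = Callable[[str], bool]


def evaluate_clusters(
    models: Dict[int, Model], evaluation_data: Dict[int, List[str]]
) -> Dict[int, List[bool]]:
    """Send each cluster's reviews to the model trained for that cluster."""
    results = {}
    for cluster_id, reviews in evaluation_data.items():
        model = models[cluster_id]  # cluster k is routed to model k
        results[cluster_id] = [model(text) for text in reviews]
    return results


# Stand-in keyword models; a real system would use the trained models 431 to 435.
models = {
    1: lambda text: "bad" in text,
    5: lambda text: "great" in text,
}
verdicts = evaluate_clusters(
    models, {1: ["bad service", "great food"], 5: ["great taste"]}
)
```

Only two clusters are supplied in this usage, which mirrors the statement that the number of input clusters may be chosen as needed.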

The per-review authenticity evaluation collection unit 530 collects and stores the authenticity determinations produced by the review authenticity determination learning module unit 520, categorized by review cluster.

The per-cluster review average calculation unit 540 calculates a discriminant average for each review cluster based on the 'true' or 'false' results determined by the review authenticity determination learning models. That is, when the input cluster review data are the first and second cluster review data, it calculates a first discriminant average for the first cluster review data and a second discriminant average for the second cluster review data. As an example, when the first cluster review data contains 100 reviews in total, the discriminant average is calculated using the number of reviews determined to be 'true' according to the determination criteria. The discriminant average of the first cluster review data will generally be equal or close to 1 point, and that of the second cluster review data will generally be equal or close to 2 points.
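The description does not give a formula for the discriminant average, so the sketch below shows only one possible reading: the mean rating over the reviews that a cluster's model judged 'true', alongside the share of reviews judged 'true'. Both the interpretation and the names `cluster_discriminant_mean`, `cluster_1`, `mean_1`, and `true_ratio_1` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def cluster_discriminant_mean(judged: List[Tuple[float, bool]]) -> Optional[float]:
    """Mean rating over the reviews judged 'true'; None if there are none.

    Assumption: this is one possible reading of the discriminant average.
    """
    kept = [score for score, is_true in judged if is_true]
    if not kept:
        return None
    return sum(kept) / len(kept)


# (rating, verdict) pairs for a toy first cluster.
cluster_1 = [(1.0, True), (1.0, True), (2.0, True), (5.0, False)]
mean_1 = cluster_discriminant_mean(cluster_1)
# Share of reviews in the cluster that were judged 'true'.
true_ratio_1 = sum(1 for _, ok in cluster_1 if ok) / len(cluster_1)
```

Under this reading, a cluster made up mostly of authentic 1-point reviews yields an average close to 1 point, which is consistent with the behavior described above.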

The review authenticity evaluation providing unit 550 compares the discriminant average for each cluster of review data produced by the determination learning models with the review average of the website service platform, and provides the result. As an example, the review average of the website service platform may be '4.4' while the review average determined by the present invention may be '4.3'.

As shown in FIG. 16, the review authenticity evaluation feedback unit 600 according to an embodiment of the present invention selects, according to preset conditions, the less accurate items among the review authenticity evaluation data produced by the review authenticity evaluation unit 500. It receives feedback on the reviews related to the selected data through survey responses, extracts the review evaluation scores contained in those responses, and applies them as corrections to the corresponding selected review authenticity evaluation data. For this purpose, the review authenticity evaluation feedback unit 600 includes, as shown in FIG. 16, a survey response evaluation unit 610, a survey response review and evaluation score extraction unit 620, a review evaluation score correction unit 630, and a response evaluation feedback unit 640.

The survey response evaluation unit 610 judges, according to predefined conditions, whether a survey response related to the selected review authenticity evaluation data is valid. Less accurate review authenticity evaluation data needs to be supplemented, so the review items contained in that data are sent to survey respondents, who return evaluation scores as feedback. The response returned by a respondent is an evaluation score for the review, and the unit judges according to preset conditions whether that score is valid. A response may be judged invalid, for example, when its evaluation score is too far from the evaluation score of the selected review authenticity evaluation data.

The survey response review and evaluation score extraction unit 620 extracts the review, and the respondent's evaluation score for that review, from each survey response that the survey response evaluation unit 610 has judged valid.

The review evaluation score correction unit 630 applies the respondent's evaluation score, extracted by the survey response review and evaluation score extraction unit 620, as a correction to the corresponding review authenticity evaluation data. For example, if the review evaluation score of the selected review authenticity evaluation data is '3.2 points' and the survey response evaluation score is '3.4 points', the review evaluation score of the selected data is corrected to '3.4 points'.
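The validity check of unit 610 and the correction of unit 630 can be sketched together. The description says only that validity is judged by preset conditions and that scores too far from the evaluated score may be rejected. The concrete threshold `max_gap` and the function names below are therefore hypothetical.

```python
def validate_response(model_score: float, survey_score: float, max_gap: float = 1.0) -> bool:
    """Treat a response as valid when it lies within max_gap of the evaluated score.

    Assumption: max_gap stands in for the unspecified preset condition.
    """
    return abs(survey_score - model_score) <= max_gap


def corrected_score(model_score: float, survey_score: float, max_gap: float = 1.0) -> float:
    """Return the survey score if the response is valid, else keep the original."""
    if validate_response(model_score, survey_score, max_gap):
        return survey_score
    return model_score


updated = corrected_score(3.2, 3.4)    # valid response replaces the score
unchanged = corrected_score(3.2, 5.0)  # response too far away is ignored
```

With the example values from the text, the score of 3.2 is replaced by 3.4, while a response of 5.0 falls outside the assumed threshold and leaves the score unchanged.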

The response evaluation feedback unit 640 re-inputs the review authenticity evaluation data corrected by the review evaluation score correction unit 630 into the cluster review pair matching unit 410. The cluster review pair matching unit 410 then receives the cluster review learning data regenerated from the corrected review authenticity evaluation data, and re-matches the five re-input clusters of review learning data two at a time according to the predetermined conditions. Re-inputting the re-matched cluster review learning data into the Korean learning model unit 420 generates a new review authenticity determination learning model. The newly generated model is the one related to the selected review authenticity evaluation data. For example, when the selected review authenticity evaluation data is cluster learning data for the 3-point score, the third review authenticity determination model unit 433 is newly trained and generated.

(Review authenticity evaluation learning method)

The review authenticity evaluation learning method according to an embodiment of the present invention is described step by step below with reference to the attached FIG. 17.

First, the review data collection unit 100 accesses a website on which consumer reviews or advertising reviews are posted and crawls the posted reviews. It then collects evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, matches each with its evaluation score, and generates review data.

In the step of generating review data, the service platform connection unit 110 accesses the website service platform on which consumer reviews or advertising reviews are posted, and the review location search unit 120 searches for the location of the reviews posted on the platform. The review crawling unit 130 crawls the posted reviews based on the locations found by the review location search unit 120. The review data database unit 140 collects evaluation reviews, each consisting of at least one of the words, phrases, and sentences of a crawled review, matches each with its evaluation score to form review data, and stores the review data classified by the type of target being reviewed.
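The record that the review data database unit 140 stores can be sketched as a small typed structure pairing review text with its score and the type of reviewed target. The sketch covers only the matching and per-category storage; crawling itself is not shown. The field names, the category strings, and the sample reviews are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class ReviewRecord(NamedTuple):
    text: str      # review word, phrase, or sentence
    score: int     # evaluation score attached to the review (1 to 5)
    category: str  # type of reviewed target, e.g. "restaurant"


def store_by_category(records: List[ReviewRecord]) -> Dict[str, List[ReviewRecord]]:
    """Group matched (text, score) records by the type of reviewed target."""
    database = defaultdict(list)
    for record in records:
        database[record.category].append(record)
    return dict(database)


# Toy records standing in for crawled reviews.
crawled = [
    ReviewRecord("Fast delivery", 5, "shopping"),
    ReviewRecord("Food was cold", 2, "restaurant"),
    ReviewRecord("Friendly staff", 4, "restaurant"),
]
review_db = store_by_category(crawled)
```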

Next, the review data preprocessing unit 200 preprocesses the evaluation reviews contained in the review data in at least three stages, filtering them so that only authentic reviews are output.

In the preprocessing output step, the review preprocessing removal unit 210 analyzes the evaluation reviews contained in the review data and, as a first stage, removes those that do not satisfy predefined conditions. The advertising review removal unit 220 then filters the evaluation reviews that remain after the first stage for advertising content and, as a second stage, removes the advertising reviews. Finally, the review reliability evaluation removal unit 230 takes the evaluation reviews that remain after the second stage, compares the evaluation score from the pre-built sentiment dictionary with the evaluation score of each review, and, as a third stage, removes the reviews that do not meet preset conditions. Only the authentic reviews that have passed all three stages are output in random order to the clustering unit 300.
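The three stages can be sketched as a chain of filters in which each stage sees only the reviews that survived the previous one. The actual conditions are described only as predefined or preset, so every predicate below is a hypothetical stand-in: a non-empty-text check for the first stage, a keyword list for the advertising filter, and a toy lexicon with an assumed tolerance `max_gap` for the sentiment dictionary comparison.

```python
from typing import Callable, Dict, List, Tuple

Review = Tuple[str, int]  # (review text, evaluation score)

# Hypothetical stand-ins; the description does not disclose the real conditions.
AD_KEYWORDS = {"sponsored", "coupon"}
LEXICON_SCORES: Dict[str, int] = {"great": 5, "fine": 3, "awful": 1}


def passes_basic_cleanup(review: Review) -> bool:
    """Stage 1 stand-in: drop reviews with no usable text."""
    return len(review[0].strip()) > 0


def is_not_advertising(review: Review) -> bool:
    """Stage 2 stand-in: drop reviews containing advertising keywords."""
    return not any(word in review[0].lower() for word in AD_KEYWORDS)


def matches_lexicon(review: Review, max_gap: int = 1) -> bool:
    """Stage 3 stand-in: keep reviews whose lexicon score roughly matches the rating."""
    hits = [LEXICON_SCORES[w] for w in review[0].lower().split() if w in LEXICON_SCORES]
    if not hits:
        return True  # no evidence either way in this toy lexicon
    return abs(sum(hits) / len(hits) - review[1]) <= max_gap


STAGES: List[Callable[[Review], bool]] = [
    passes_basic_cleanup,
    is_not_advertising,
    matches_lexicon,
]


def preprocess(reviews: List[Review]) -> List[Review]:
    """Apply the stages in order; each stage sees only earlier survivors."""
    for stage in STAGES:
        reviews = [r for r in reviews if stage(r)]
    return reviews


sample = [("great taste", 5), ("sponsored great deal", 5), ("awful service", 5), ("   ", 3)]
authentic = preprocess(sample)
```

In this toy run the blank review fails the first stage, the sponsored review fails the second, and the review that pairs negative wording with a 5-point score fails the third.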

Next, the clustering unit 300 receives the authentic review data filtered by the review data preprocessing unit 200 and classifies it into five clusters of review learning data.

Next, the review authenticity determination learning model generation unit 400 takes the five clusters of review learning data classified by the clustering unit 300, pairs clusters that are opposite to each other two at a time, and inputs each pair into the pre-built Korean learning model, thereby generating a review authenticity determination learning model for each of the five clusters.

In the step of generating the review authenticity determination learning models, the cluster review pair matching unit 410 matches the five clusters of review learning data classified by the clustering unit 300 two at a time according to predetermined conditions. The Korean learning model unit 420 performs Korean-language training for each matched pair, based on the paired cluster review learning data received from the cluster review pair matching unit 410. The review authenticity determination learning model unit 430 then generates five review authenticity determination learning models, one per review cluster, from that training.

Next, survey responses are fed back. The less accurate items among the review authenticity evaluation data produced by the review authenticity evaluation unit 500 are selected according to preset conditions, feedback on the reviews related to the selected data is received through survey responses, and the review evaluation scores contained in those responses are extracted and applied as corrections to the corresponding selected review authenticity evaluation data.

Cluster review learning data is regenerated from the corrected review authenticity evaluation data, and the cluster review pair matching unit 410 re-matches the regenerated cluster review learning data into pairs. The re-matched pairs are input into the Korean learning model unit 420 and retrained, which generates a new review authenticity determination learning model related to the regenerated cluster review learning data. The newly generated model is then installed in the review authenticity determination learning module unit 520.

In the step of feeding back survey responses, the survey response evaluation unit 610 judges, according to predefined conditions, whether a survey response related to the selected review authenticity evaluation data is valid. The survey response review and evaluation score extraction unit 620 extracts the review, and the respondent's evaluation score for that review, from each response judged valid. The review evaluation score correction unit 630 applies the extracted score as a correction to the corresponding review authenticity evaluation data. The response evaluation feedback unit 640 then re-inputs the corrected review authenticity evaluation data into the cluster review pair matching unit 410. At this point, the response evaluation feedback unit 640 may regenerate the cluster review learning data from the corrected review authenticity evaluation data and input it into the cluster review pair matching unit 410.

In the step of regenerating a new review authenticity determination learning model, the cluster review pair matching unit 410 generates new cluster review learning data from the corrected review authenticity evaluation data provided by the response evaluation feedback unit 640. It re-matches the new cluster review learning data two at a time according to the predetermined conditions and outputs the re-matched pairs to the Korean learning model unit 420. The Korean learning model unit 420 retrains on the paired cluster review learning data, which generates a new review authenticity determination learning model related to the regenerated cluster review learning data.

In describing the present invention, matters that belong to the prior art or are obvious to those skilled in the art may have been omitted, and descriptions of such omitted components (methods) and functions may be referred to without departing from the technical spirit of the present invention. In addition, the components of the present invention described above are presented only for convenience of explanation, and components not described here may be added without departing from the technical spirit of the present invention.

The configurations and functions of the units above are described separately only for convenience of explanation. As needed, any one configuration or function may be integrated into another component or may be further subdivided.

Although the present invention has been described above with reference to one embodiment, it is not limited to that embodiment, and various modifications and applications are possible. Those skilled in the art will readily understand that many modifications are possible without departing from the gist of the present invention. It should also be noted that detailed descriptions of known functions and configurations related to the present invention, or of the coupling relationships between its components, have been omitted where they were judged likely to obscure the gist of the present invention unnecessarily.

100: Review data collection unit
110: Service platform access unit
120: Review location search unit
130: Review crawling unit
140: Review data database unit
200: Review data pre-processing unit
210: Review pre-processing removal unit
211: Language correction unit
212: Character conversion unit
213: Stop word removal unit
214: Grammar correction and morphological analysis unit
220: Advertising review removal unit
221: Category-specific advertising review matching input unit
222: Advertising review filtering module unit
222a: Advertising review extraction unit
222b: Frequency of use extraction unit
222c: Category-specific advertising review classification unit
222d: Category-specific filtering model generation unit
230: Review reliability evaluation removal unit
231: Sentiment dictionary modeling unit
231a: Positive phrase learning unit
231b: Negative phrase learning unit
231c: Intermediate phrase learning unit
231d: Phrase-specific score matching unit
231e: Review phrase matching unit
231f: Sentiment dictionary score output unit
232: Review phrase and score extraction unit
233: Review sentiment score comparison unit
234: Review reliability evaluation unit
235: Review removal unit
300: Clustering unit
310: First review cluster data unit
320: Second review cluster data unit
330: Third review cluster data unit
340: Fourth review cluster data unit
350: Fifth review cluster data unit
400: Review authenticity determination learning model generation unit
410: Cluster review pair matching unit
420: Korean learning model unit
430: Review authenticity determination learning model unit
431: First review authenticity determination model unit
432: Second review authenticity determination model unit
433: Third review authenticity determination model unit
434: Fourth review authenticity determination model unit
435: Fifth review authenticity determination model unit
500: Review authenticity evaluation unit
510: Learning model matching unit for each cluster classification
520: Review authenticity determination learning module unit
530: Review-specific authenticity evaluation collection unit
540: Review average value calculation unit for each cluster
550: Review authenticity evaluation providing unit
600: Review authenticity evaluation feedback unit
610: Survey respondent response evaluation unit
620: Survey respondent response review and evaluation score extraction unit
630: Review evaluation score correction unit
640: Response evaluation feedback unit

Claims

A review authenticity evaluation device comprising:
A review data collection unit that accesses a website on which consumer reviews or advertising reviews are posted, crawls the reviews, collects evaluation reviews consisting of at least one of the words, phrases, and sentences of the crawled reviews, matches each evaluation review with its evaluation score, and converts them into review data;
A review data pre-processing unit that pre-processes the evaluation reviews included in the review data in at least three stages to filter and output only authentic reviews;
A clustering unit that receives the authentic review data filtered by the review data pre-processing unit and classifies it into five sets of cluster review data;
A review authenticity determination learning model generation unit that, based on the five sets of cluster review learning data classified by the clustering unit, inputs the cluster review learning data in pairs according to predetermined conditions into a pre-built Korean learning model and generates a review authenticity determination learning model for each of the five clusters; and
A review authenticity evaluation unit that evaluates the authenticity of reviews by inputting each of the five sets of cluster review evaluation data classified by the clustering unit into the review authenticity determination learning model corresponding to its cluster.

The review authenticity evaluation device of claim 1,
wherein the review data collection unit comprises:
A service platform access unit that accesses the website service platform on which consumer reviews or advertising reviews are posted;
A review location search unit that searches for the location of reviews posted on the website service platform;
A review crawling unit that crawls reviews posted on the website service platform based on the review location found by the review location search unit; and
A review data database unit that collects evaluation reviews consisting of at least one of the words, phrases, and sentences of the crawled reviews, matches each evaluation review with its evaluation score to create review data, and stores the review data classified by type of review subject.

The review authenticity evaluation device of claim 1,
wherein the review data pre-processing unit comprises:
A review pre-processing removal unit that analyzes the evaluation reviews included in the review data, performs a first removal of evaluation reviews that do not meet predefined conditions, and outputs the remainder;
An advertising review removal unit that, targeting only the evaluation reviews remaining after the first removal by the review pre-processing removal unit, filters whether each is an advertising review and performs a second removal; and
A review reliability evaluation removal unit that, targeting only the evaluation reviews remaining after the second removal by the advertising review removal unit, compares the evaluation score from a pre-built sentiment dictionary with the evaluation score of each evaluation review, performs a third removal of evaluation reviews that do not meet preset conditions, and randomly outputs to the clustering unit only the authentic reviews that have passed the above removal stages.

The review authenticity evaluation device of claim 3,
wherein the review pre-processing removal unit comprises:
A language correction unit that removes symbols and emoticons included in the evaluation reviews so that only plain language remains, and removes evaluation reviews consisting only of symbols and emoticons;
A character conversion unit that converts uppercase English letters included in the evaluation reviews into lowercase letters;
A stop word removal unit that removes profanity or slanderous phrases included in the evaluation reviews; and
A grammar correction and morphological analysis unit that, only for evaluation reviews that have passed the language correction unit, the character conversion unit, and the stop word removal unit, corrects the grammar of each evaluation review by analyzing the morphemes of its sentences with a pre-trained sentence morphological analyzer, removes incomplete sentences from the evaluation review according to the morphological analysis, and outputs first pre-processed review data to the advertising review removal unit.
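
The first three pre-processing stages of the preceding claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `preprocess_review`, the `BLOCKED_TERMS` list, and the rule that only Latin letters, Hangul syllables, digits, and whitespace count as language are all assumptions made for illustration. The grammar correction and morphological analysis stage is omitted because it would require a trained Korean morphological analyzer.

```python
import re

# Hypothetical list of profane or slanderous terms; the claim does not enumerate them.
BLOCKED_TERMS = {"damn", "idiot"}

# Assumption: Latin letters, Hangul syllables, digits, and whitespace are "language";
# every other character is treated as a symbol or emoticon.
NON_LANGUAGE = re.compile(r"[^0-9A-Za-z\uac00-\ud7a3\s]")


def preprocess_review(text):
    """Return the cleaned review, or None if no language content remains."""
    # Language correction: strip symbols and emoticons.
    cleaned = NON_LANGUAGE.sub(" ", text)
    # Character conversion: lowercase English letters.
    cleaned = cleaned.lower()
    # Stop word removal: drop blocked terms.
    tokens = [t for t in cleaned.split() if t not in BLOCKED_TERMS]
    if not tokens:
        # The review consisted only of symbols, emoticons, or blocked terms.
        return None
    return " ".join(tokens)
```

A review that returns `None` here corresponds to one that would be removed at this stage; any other return value would move on to morphological analysis.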

The review authenticity evaluation device of claim 3,
wherein the advertising review removal unit comprises:
A category-specific advertising review matching input unit that matches the first pre-processed review data output from the review pre-processing removal unit by review category and outputs it; and
An advertising review filtering module unit that inputs the first pre-processed review data, matched and output by the category-specific advertising review matching input unit, into the advertising review filtering model built for the matching review category to filter whether each review is an advertising review, thereby generating second pre-processed review data, and outputs the generated second pre-processed review data to the review reliability evaluation removal unit.

The review authenticity evaluation device of claim 5,
wherein the advertising review filtering module unit comprises:
An advertising review extraction unit that classifies the first pre-processed review data output from the review pre-processing removal unit by review category, and analyzes and extracts the words, phrases, and sentences of advertising reviews for each review category;
A frequency of use extraction unit that calculates the frequency of use of each word, phrase, and sentence extracted for each review category by the advertising review extraction unit, and extracts the words, phrases, and sentences with a high frequency of use;
A category-specific advertising review classification unit that classifies and stores, by review category, the words, phrases, and sentences extracted by the frequency of use extraction unit as filtering review words; and
A category-specific filtering model generation unit that generates a filtering model for each review category using the filtering review words classified and stored by review category in the category-specific advertising review classification unit.
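
The frequency-based filtering described in the preceding claim can be sketched as follows. This is an illustrative reading, not the claimed model: it works on whitespace-separated words only, and the names `build_filter_terms` and `looks_like_ad` as well as the parameters `top_n` and `min_hits` are hypothetical.

```python
from collections import Counter, defaultdict


def build_filter_terms(ad_reviews, top_n=2):
    """Map each review category to its most frequent advertising words.

    ad_reviews: iterable of (category, text) pairs already known to be advertising.
    """
    counts = defaultdict(Counter)
    for category, text in ad_reviews:
        # Count how often each word is used within the category.
        counts[category].update(text.lower().split())
    # Keep only the top_n highest-frequency words per category.
    return {
        category: {word for word, _ in counter.most_common(top_n)}
        for category, counter in counts.items()
    }


def looks_like_ad(category, text, filter_terms, min_hits=2):
    """Flag a review that contains at least min_hits filter words of its category."""
    words = set(text.lower().split())
    return len(words & filter_terms.get(category, set())) >= min_hits
```

Keeping one term set per category mirrors the per-category filtering models of the claim; a real system would presumably also handle phrases and sentences and would use a Korean tokenizer rather than whitespace splitting.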

The review authenticity evaluation device of claim 3,
wherein the review reliability evaluation removal unit comprises:
A sentiment dictionary modeling unit that inputs the phrases of the evaluation reviews included in the second pre-processed review data output from the advertising review removal unit into a pre-built sentiment dictionary model to calculate and output a sentiment dictionary evaluation score;
A review phrase and score extraction unit that extracts the evaluation reviews, and the evaluation scores of those evaluation reviews, included in the second pre-processed review data output from the advertising review removal unit, and outputs the evaluation reviews to the sentiment dictionary modeling unit;
A review sentiment score comparison unit that receives the sentiment dictionary evaluation score from the sentiment dictionary modeling unit and the evaluation score of the evaluation review from the review phrase and score extraction unit, and compares the sentiment dictionary evaluation score corresponding to the phrases of the evaluation review with the evaluation score of the evaluation review to calculate a review evaluation score difference value;
A review reliability evaluation unit that determines that the reliability of an evaluation review is low when the review evaluation score difference value calculated by the review sentiment score comparison unit exceeds a preset condition value; and
A review removal unit that removes the evaluation reviews determined to be of low reliability by the review reliability evaluation unit and randomly outputs the resulting third pre-processed review data, consisting of authentic evaluation reviews, to the clustering unit.

The review authenticity evaluation device of claim 7,
wherein the sentiment dictionary modeling unit comprises:
A positive phrase learning unit that learns positive phrases of predefined reviews, together with synonyms and near-synonyms of the positive phrases;
A negative phrase learning unit that learns negative phrases of predefined reviews, together with synonyms and near-synonyms of the negative phrases;
An intermediate phrase learning unit that learns intermediate phrases of predefined reviews, whose sentiment lies between that of the positive phrases and the negative phrases, together with synonyms and near-synonyms of the intermediate phrases;
A phrase-specific score matching unit that matches sentiment dictionary scores to the learned positive, negative, and intermediate phrases of the sentiment dictionary according to phrase grade;
A review phrase matching unit that matches and compares the phrases of the evaluation review output from the review phrase and score extraction unit with the phrases in the sentiment dictionary; and
A sentiment dictionary score output unit that outputs, to the review sentiment score comparison unit, the average score of the phrases in the sentiment dictionary that match the phrases of the evaluation review as the sentiment dictionary evaluation score.
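
The dictionary-based reliability check of the two preceding claims can be sketched as follows. The dictionary contents, the 1 to 5 score scale, the threshold `max_gap`, and the function names are assumptions chosen for illustration; the claims leave the actual scores and the preset condition value unspecified.

```python
# Hypothetical dictionary: phrase -> score, on the same 1-5 scale as platform ratings.
SENTIMENT_SCORES = {
    "excellent": 5.0,
    "good": 4.0,
    "okay": 3.0,
    "bad": 2.0,
    "terrible": 1.0,
}


def dictionary_score(text, scores=SENTIMENT_SCORES):
    """Average the scores of dictionary phrases found in the review, or None if none match."""
    matched = [scores[w] for w in text.lower().split() if w in scores]
    if not matched:
        return None
    return sum(matched) / len(matched)


def is_reliable(text, rating, max_gap=1.5):
    """Treat a review as reliable unless its rating differs from the dictionary score by more than max_gap."""
    score = dictionary_score(text)
    if score is None:
        # Assumption: a review with no dictionary match is kept rather than removed.
        return True
    return abs(score - rating) <= max_gap
```

For example, a review reading "terrible bad food" with a rating of 5 yields a dictionary score of 1.5 and a gap of 3.5, so under these assumed values it would be judged of low reliability and removed.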

The review authenticity evaluation device of claim 1,
wherein the review authenticity determination learning model generation unit comprises:
A cluster review pair matching unit that, based on the five sets of cluster review learning data classified by the clustering unit, matches the cluster review learning data in pairs according to predetermined conditions;
A Korean learning model unit that receives each set of cluster review learning data pair-matched by the cluster review pair matching unit and performs Korean language learning for each pair-matched cluster review; and
A review authenticity determination learning model unit in which five review authenticity determination learning models, one for each review cluster, are generated according to the per-cluster learning of the Korean learning model unit.

The review authenticity evaluation device of claim 9,
wherein the review authenticity evaluation unit comprises:
A learning model matching unit for each cluster classification that matches the five sets of cluster review evaluation data classified by the clustering unit to the five review authenticity determination learning models generated by the review authenticity determination learning model unit and inputs them accordingly;
A review authenticity determination learning module unit that, according to the matching by the learning model matching unit for each cluster classification, inputs the cluster review evaluation data into the matched review authenticity determination learning model and outputs review authenticity evaluation data in which the authenticity of the input data is evaluated as 'true' or 'false';
A review-specific authenticity evaluation collection unit that classifies the authenticity results evaluated by the review authenticity determination learning module unit by cluster review and stores them;
A review average value calculation unit for each cluster that calculates the average determination value for each cluster review based on the 'true' or 'false' results determined by the review authenticity determination learning model; and
A review authenticity evaluation providing unit that compares the average determination value of the determination learning model with the average review value of the website service platform.
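
The per-cluster averaging step of the preceding claim can be sketched as follows, under the assumption that 'true' counts as 1 and 'false' as 0 so that the average is the share of reviews judged authentic. The function name `cluster_true_rates` is hypothetical, and the comparison with the platform's average review value is left out because the claim does not state how the two scales relate.

```python
from collections import defaultdict


def cluster_true_rates(judgments):
    """Return {cluster_id: share of reviews judged 'true'}.

    judgments: iterable of (cluster_id, is_true) pairs, one per evaluated review.
    """
    totals = defaultdict(int)
    trues = defaultdict(int)
    for cluster_id, is_true in judgments:
        totals[cluster_id] += 1
        if is_true:
            trues[cluster_id] += 1
    # Average of 1/0 values per cluster (assumed encoding of 'true'/'false').
    return {cid: trues[cid] / totals[cid] for cid in totals}
```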

The review authenticity evaluation device of claim 9,
further comprising a review authenticity evaluation feedback unit that selects, according to preset conditions, review authenticity evaluation data of low accuracy from among the review authenticity evaluation data evaluated by the review authenticity evaluation unit, receives feedback on the review evaluations related to the selected review authenticity evaluation data through survey respondents' responses, extracts the evaluation scores of the reviews included in the responses, and corrects the correspondingly matched selected review authenticity evaluation data to reflect those scores.

The review authenticity evaluation device of claim 11,
wherein the review authenticity evaluation feedback unit comprises:
A survey respondent response evaluation unit that judges and evaluates, according to predefined conditions, the validity of the survey respondents' responses related to the selected review authenticity evaluation data;
A survey respondent response review and evaluation score extraction unit that extracts the reviews, and the respondents' evaluation scores for those reviews, included in the responses judged valid by the survey respondent response evaluation unit;
A review evaluation score correction unit that corrects the correspondingly matched review authenticity evaluation data to reflect the respondents' evaluation scores extracted by the survey respondent response review and evaluation score extraction unit; and
A response evaluation feedback unit that re-inputs the review authenticity evaluation data corrected by the review evaluation score correction unit into the cluster review pair matching unit.

The review authenticity evaluation device of claim 12,
wherein the cluster review pair matching unit,
upon re-input of the cluster review learning data from the response evaluation feedback unit, re-matches the cluster review learning data in pairs according to predetermined conditions based on the five sets of cluster review learning data.