JP2020047229A - Article analyzer and article analysis method

Info

Publication number: JP2020047229A
Application number: JP2018177743A
Authority: JP
Inventors: Kenkichi Kato; Shingo Kitamura; Shunsuke Hanaoka
Original assignee: Hitachi Industry and Control Solutions Co Ltd
Current assignee: Hitachi Industry and Control Solutions Co Ltd
Priority date: 2018-09-21
Filing date: 2018-09-21
Publication date: 2020-03-26
Anticipated expiration: 2038-09-21
Also published as: JP6592574B1

Abstract

To provide useful suggestions for a newly input advertisement plan on the basis of knowledge obtained from past advertisement articles.

SOLUTION: An article analysis device 1 has: a machine learning unit 21 that counts the appearance frequency of each word acquired as character data 101 of past article data 11 and stores the result in a feature DB 13 as character feature data; and an improvement calculation unit 23 that calculates a score for each word acquired as character data 201 of input predicted article data 12 by computing an evaluation function that yields a higher score the more similar the word is to a word in the character feature data and the higher the appearance frequency of that word in the character feature data, and that, for each word in the predicted article data 12, presents high-scoring words in the character feature data as rewrite candidates to prompt rewriting of the word.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an article analysis device and an article analysis method.

In recent years, the development of artificial intelligence (AI) through the use of deep learning has attracted attention. Advances in network technology have made it possible to collect large amounts of data efficiently, so there are high expectations for building computational models that perform work equivalent to that of humans by machine learning knowledge from such data.

A major application field of artificial intelligence is image recognition, in which image data is taken as input and classified into one of several categories. For example, Non-Patent Document 1 describes an attempt to model, as a deep neural network, a visual model of how humans visually perceive artistic images. The deep neural network there is a convolutional neural network composed of layers of small computational units that process visual information hierarchically.

Leon A. Gatys et al., "A Neural Algorithm of Artistic Style", [online], August 26, 2015, [retrieved August 3, 2018], Internet <URL: https://arxiv.org/abs/1508.06576>

There is likewise a need to mechanically create an advertisement recognition model that indicates how humans perceive advertisement articles such as commercials, signboards, and posters. By using such a model, a user on the advertising side, such as an advertising agency, is expected to be able to predict the effect of an advertisement before it is actually released to the public, for example as a commercial, and earns a reputation.

Conventional artificial intelligence research, such as Non-Patent Document 1, goes no further than coarse machine recognition, such as grouping similar advertisements. It has therefore not yielded a system that gives the advertising side useful, content-level suggestions on how to evaluate a draft advertisement and how to improve it.

Accordingly, the main object of the present invention is to make useful suggestions for a newly input advertisement plan on the basis of knowledge obtained from past advertisement articles.

In order to solve the above problem, an article analysis device according to the present invention has the following features.
The present invention comprises: a machine learning unit that stores, in a storage unit as character feature data, a result of counting the appearance frequency of each word acquired as character data of past article data; and
an improvement calculation unit that calculates a score for each word acquired as character data of input predicted article data by computing an evaluation function that gives a higher score the higher the similarity to a word of the character feature data and the higher the appearance frequency in the character feature data, and
that, for each word of the predicted article data, presents words of the character feature data having high scores as rewrite candidates, thereby prompting rewriting of the word of the predicted article data.
Other means will be described later.

According to the present invention, useful suggestions can be made for a newly input advertisement plan on the basis of knowledge obtained from past advertisement articles.

FIG. 1 is a configuration diagram of an article analysis device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the processing of the article analysis device according to an embodiment of the present invention.
FIG. 3 is an explanatory diagram showing a neural network learning process for character data and image data of past article data according to an embodiment of the present invention.
FIG. 4 is an explanatory diagram showing a neural network inference process for character data and image data of predicted article data according to an embodiment of the present invention.
FIG. 5 is an explanatory diagram showing a feature DB learning process for character data of past article data according to an embodiment of the present invention.
FIG. 6 is an explanatory diagram showing a feature DB application process for character data of predicted article data according to an embodiment of the present invention.
FIG. 7 is a specific example relating to the character data of FIGS. 5 and 6 according to an embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a configuration diagram of the article analysis device 1.
The article analysis device 1 stores past article data 11, predicted article data 12, and a feature DB 13 in a storage unit. As processing units, the article analysis device 1 has a machine learning unit 21, an effect prediction unit 22, an improvement calculation unit 23, a morphological analysis unit 31, a score calculation unit 32, and a score totaling unit 33.
The article analysis device 1 is configured as a computer having a CPU (Central Processing Unit), a memory, storage means (a storage unit) such as a hard disk, and a network interface.
In this computer, the CPU executes a program (also called an application, or "app" for short) loaded into the memory, thereby operating a control unit (control means) made up of the processing units.

The past article data 11 is data in which articles are associated with their evaluation scores, and it is input in advance as labeled training data for machine learning. An article in the past article data 11 is, for example, an article presenting an advertisement such as a commercial, a signboard, or a poster, and is one that has been published on the internet, for example on a social networking service (SNS).
An evaluation score in the past article data 11 is a label that quantitatively indicates the evaluation (effect) that a published article received from readers. Examples of evaluation scores, taken from internet access data, are the number of accesses to the article and the numbers of reader reactions to the article, such as the number of likes, retweets, and replies. A "like" is an operation by which a reader rates an article that made a good impression with a single click (single touch); depending on the application, it is also called a "favorite" or "fav".
Alternatively, the evaluation score may be sales data, which represents the market's response to the product introduced in the article.
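The labeled record described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class name PastArticle, its field names, and the choice to sum the reaction counts into a single label are assumptions made here, not details given in the description.

```python
from dataclasses import dataclass, field


@dataclass
class PastArticle:
    """One labeled training record: an article plus its reader reactions."""
    text: str                                          # character data of the article
    image_paths: list = field(default_factory=list)    # image data attached to the article
    likes: int = 0
    retweets: int = 0
    replies: int = 0

    def evaluation_score(self) -> int:
        # Assumption: the label is the plain sum of the reaction counts.
        # The description only lists these counts as possible evaluation scores.
        return self.likes + self.retweets + self.replies


article = PastArticle(text="An unprecedented summer sale",
                      likes=120, retweets=30, replies=5)
label = article.evaluation_score()
print(label)
```

Sales data could be substituted for the reaction counts as the label without changing the shape of the record.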
Hereinafter, the components of the article analysis device 1 other than the past article data 11 will be described with reference to FIG. 2.

FIG. 2 is a flowchart showing the processing of the article analysis device 1.
The machine learning unit 21 performs machine learning using the past article data 11 as training data and creates a regression prediction model as the result. The prediction model is stored in the feature DB 13, for example as words (keywords) that characterize the past article data 11 (S11).
The prediction model is configured, for example, as a neural network in which an input layer, an intermediate layer, and an output layer are connected in order and information is passed from layer to layer. When the intermediate layer has multiple levels, the machine learning unit 21 creates the prediction model by deep learning.

The article analysis device 1 receives input of the predicted article data 12 (S12). The input predicted article data 12 is an article whose evaluation score is not yet known, for example because it is at an unpublished draft stage, and it is the article to be predicted using the prediction model. It is desirable to prepare several pieces of predicted article data 12 in advance, such as advertisement plan A, advertisement plan B, and advertisement plan C, that are candidates for use in an upcoming promotion.

The effect prediction unit 22 predicts the evaluation score of each piece of predicted article data 12 input in S12, using the prediction model read from the feature DB 13 (S13). It is desirable to use not only articles with high evaluation scores but also articles with low evaluation scores as the past article data 11. This allows the effect prediction unit 22 to appropriately predict a low evaluation score for, for example, predicted article data 12 that contains words readers find offensive.

The improvement calculation unit 23 refers to the words of the past article data 11 held in the feature DB 13 and creates an improvement plan for improving the content of the predicted article data 12 (S14). Because the improvement plan is meant to bring the newly input predicted article data 12 closer to highly rated past article data 11, it is desirable to collect in advance, as the past article data 11, articles with high evaluation scores, such as articles that earned a large number of likes in the past.

The article analysis device 1 displays the advertising effect (evaluation score) of the predicted article data 12 predicted in S13 and the improvement plan for the predicted article data 12 created in S14 (S15). From this display, the user can make decisions such as adopting advertisement plan B, which has the highest evaluation score, or can adopt an improvement plan such as rewriting words in advertisement plan B so that they resemble those of popular past articles. In other words, the user can identify the predicted article data 12 that will contribute to sales on the basis of the evaluation scores and the improvement plans.
Although the article analysis device 1 displays the evaluation score from the effect prediction unit 22 and the improvement plan from the improvement calculation unit 23 together, displaying either one alone is still useful to the user. The article analysis device 1 may therefore be configured with only one of the effect prediction unit 22 and the improvement calculation unit 23.

FIG. 3 is an explanatory diagram showing a neural network learning process for the character data and image data of the past article data 11. The learning process of FIG. 3 corresponds to the processing of S11 in FIG. 2.
The machine learning unit 21 extracts character data 101 and image data 102 by parsing the past article data 11. As the character data 101, text data written in the past article data 11 may be extracted, or text data may be machine-extracted by speech recognition from audio data or video data attached to the past article data 11. The image data 102 may consist of individual image files, or may be a set of image files extracted from video data of the past article data 11.

Taking the image data 102 as input, the machine learning unit 21 creates CNN (Convolutional Neural Network) data 112, which is a convolutional neural network (convolution layers). Then, as described in Non-Patent Document 1, the machine learning unit 21 extracts style information from the CNN data 112 and registers the style information in the feature DB 13. The style information is the learning result for the image data 102 and is image feature information used to generate an advice image (the improved image data 232 in FIG. 4).

The machine learning unit 21 registers word frequency data 111, extracted from the character data 101, in the feature DB 13 (see FIG. 5 for details). The machine learning unit 21 creates a neural network by connecting the word frequency data 111 and the CNN data 112 to fully connected layer data 121. The machine learning unit 21 associates the output of the fully connected layer data 121 with article effect data 131.
That is, the first layer (input layer) of the neural network is the character data 101 and the image data 102, the second layer (intermediate layer) is the character feature data (word frequency data 111) and the image feature data (CNN data 112), and the third layer (output layer) is the fully connected layer data 121.
The article effect data 131 is the training label (the evaluation score of the article, such as the number of likes) associated with the past article data 11. The machine learning unit 21 trains the neural network generated in this way by inputting (propagating) the past article data 11 into it one piece after another.
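The way the two feature branches feed one fully connected output can be sketched in plain Python. This is a deliberately reduced illustration under stated assumptions: the fully connected layer is modeled as a single linear unit, training is one step of gradient descent on squared error, and the function names and all numeric values are hypothetical. The description does not specify the layer sizes, loss, or optimizer.

```python
def predict_score(word_features, image_features, weights, bias):
    """Concatenate the character and image features and apply one linear unit."""
    combined = list(word_features) + list(image_features)
    if len(combined) != len(weights):
        raise ValueError("weights must match the combined feature length")
    return sum(x * w for x, w in zip(combined, weights)) + bias


def training_step(word_features, image_features, label, weights, bias, lr=0.1):
    """One gradient-descent step on squared error for a single labeled article."""
    combined = list(word_features) + list(image_features)
    error = predict_score(word_features, image_features, weights, bias) - label
    new_weights = [w - lr * error * x for w, x in zip(weights, combined)]
    new_bias = bias - lr * error
    return new_weights, new_bias


# Hypothetical features for one past article and its label (e.g. scaled likes).
word_features = [1.0, 2.0]    # stands in for the word frequency data 111
image_features = [0.5]        # stands in for the CNN data 112
label = 10.0                  # stands in for the article effect data 131

weights, bias = [0.0, 0.0, 0.0], 0.0
before = predict_score(word_features, image_features, weights, bias)
weights, bias = training_step(word_features, image_features, label, weights, bias)
after = predict_score(word_features, image_features, weights, bias)
print(before, after)
```

Repeating the training step over many past articles corresponds to inputting the past article data 11 one piece after another; at inference time only predict_score would be used.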

FIG. 4 is an explanatory diagram showing a neural network inference process for the character data and image data of the predicted article data 12. The inference process of FIG. 4 corresponds to the processing of S13 and S14 in FIG. 2.
The effect prediction unit 22 extracts character data 201 and image data 202 by parsing the input predicted article data 12.
The effect prediction unit 22 predicts the label (the evaluation score of the article, such as the number of likes) for the predicted article data 12 by inputting the extracted character data 201 and image data 202 into the neural network created in FIG. 3. Specifically, the effect prediction unit 22 inputs the character data 201 and the image data 202 into the intermediate layer that matches each data format.

As a result, information propagates to the fully connected layer data 121 from both the layer of the word frequency data 111 read from the feature DB 13 and the layer of the CNN data 112 read from the feature DB 13 as style information, and the result propagates to effect prediction data 231, which has the same format as the article effect data 131. The user can thus see a prediction of, for example, how many likes the predicted article data 12 will earn.

Further, as described later with reference to FIG. 6, the improvement calculation unit 23 generates improved character data 233, in which other words saved in the feature DB 13 as learning results are applied to the character data 201, and presents it to the user.
In addition, as described in Non-Patent Document 1, the improvement calculation unit 23 generates improved image data 232, in which the style information saved in the feature DB 13 as a learning result is applied to the image data 202, and presents it to the user as an advice image.
For example, suppose that style information with a bright color scheme was extracted from articles in the past article data 11 that have many likes. When style information with a dark color scheme is extracted from the image data 202 of the predicted article data 12, the improvement calculation unit 23 refers to the bright style information and generates improved image data 232 in which the image data 202 is made brighter.

The analysis of the predicted article data 12 using a neural network for character data and image data has been described above with reference to FIGS. 3 and 4. In Non-Patent Document 1, only a neural network for image data was constructed. In the present embodiment, a layer for character data is newly incorporated into this image data neural network, so that prediction can be performed accurately for predicted article data 12 containing both character data and image data.

FIG. 5 is an explanatory diagram showing the learning process of the feature DB 13 for the character data of the past article data 11. The learning process of FIG. 5 corresponds to the processing of S11 in FIG. 2.
The machine learning unit 21 instructs the morphological analysis unit 31 to extract the word frequency data 111 from the character data 101 extracted from the past article data 11. The morphological analysis unit 31 splits the character data 101 into words by morphological analysis, counts the number of appearances (usage frequency) of each word to create the word frequency data 111 (see FIG. 7 for details), and registers the word frequency data 111 in the feature DB 13 as feature data learned from the set of past article data 11.
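The counting step can be sketched as follows. This is a minimal Python illustration, not the embodiment's implementation: the tokenize function is a whitespace split standing in for real morphological analysis (Japanese text would need a morphological analyzer such as MeCab), and the sample texts and function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Stand-in for the morphological analysis unit 31.
    # Assumption: words are separated by whitespace, which is not true of Japanese.
    return text.lower().split()


def build_word_frequency(past_texts):
    """Count how often each word appears across all past articles."""
    counts = Counter()
    for text in past_texts:
        counts.update(tokenize(text))
    return counts


# Hypothetical character data from two past articles.
past_texts = [
    "unprecedented romance on every plate",
    "an unprecedented flavor",
]
word_frequency = build_word_frequency(past_texts)
print(word_frequency.most_common(3))
```

The resulting Counter plays the role of the word frequency data 111 that is registered in the feature DB 13.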

FIG. 6 is an explanatory diagram showing the application process of the feature DB 13 for the character data of the predicted article data 12. The application process of FIG. 6 corresponds to the processing of S13 and S14 in FIG. 2.
As in the learning process of FIG. 5, the morphological analysis unit 31 splits the character data 201 of the predicted article data 12 into a set of words by morphological analysis.
The score calculation unit 32 scores each word on the basis of the word frequency data 111 obtained in the learning process of FIG. 5 and the set of words that the morphological analysis unit 31 obtained from the predicted article data 12. For each word appearing in the predicted article data 12, a higher score indicates a stronger case for rewriting it into another word. The improved character data 233 is the list of other words that serve as rewrite targets.

Further, the score totaling unit 33 totals the scoring results for each piece of predicted article data 12 (for each article page) and uses the total as input to the fully connected layer data 121. As described with reference to FIG. 4, the effect prediction unit 22 predicts the evaluation score of each piece of predicted article data 12, such as the number of likes, by mapping the output of the fully connected layer data 121 to the effect prediction data 231.

FIG. 7 is a specific example relating to the character data of FIGS. 5 and 6.
In the learning process, the morphological analysis unit 31 extracts the word frequency data 111 from the past article data 11. The word frequency data 111 gives the appearance frequency of each word, such as "空前絶後 (unprecedented) appears 200 times" and "浪漫 (romance, written in kanji) appears 150 times".
In the application process, the morphological analysis unit 31 first extracts a set of words from the predicted article data 12. Here, it is assumed that words such as 前代未聞 (unheard-of), ロマン (romance, written in katakana), and さっぱり (refreshing) are extracted.

Next, the score totaling unit 33 obtains a total score for each piece of predicted article data 12 by summing the per-word scores that the score calculation unit 32 calculates with the following formula.
(word score) = (similarity between the word of the predicted article data 12 and the word of the word frequency data 111) × (usage frequency of that word in the word frequency data 111)
For example, if the similarity between 前代未聞 (unheard-of) and 空前絶後 (unprecedented) is 2.8, the score of 前代未聞 is 2.8 × 200 = 560. The total of these per-word scores is input to the fully connected layer data 121.

Then, for each word of the predicted article data 12, the improvement calculation unit 23 extracts, as the improved character data 233, a list of similar words from the word frequency data 111 in descending order of score (first, second, third, and so on).
For example, suppose that the newly drafted advertisement plan B contains the word 前代未聞 (unheard-of), whereas popular articles (the past article data 11) frequently contain not 前代未聞 but 空前絶後 (unprecedented), a word with a similar meaning. The improvement calculation unit 23 then presents to the user improved character data 233 suggesting that 前代未聞 be rewritten into a high-scoring word such as 空前絶後 or 画期的 (groundbreaking).
Having the user rewrite 前代未聞 into 空前絶後 before advertisement plan B is published improves the impression that advertisement plan B makes without greatly changing its intent.

In the present embodiment described above, the effect of the predicted article data 12 can be predicted because the effect prediction unit 22 uses a neural network machine-learned from the past article data 11. Using SNS word-of-mouth information, such as the number of likes, as the training data for the past article data 11 makes it easy to obtain data that reflects market reputation.
Further, because the improvement calculation unit 23 suggests rewriting words of the predicted article data 12 on the basis of the word frequency data 111 extracted from the past article data 11, the user can be prompted to make improvements that go into the content of the predicted article data 12.

The present invention is not limited to the embodiment described above and includes various modifications. For example, the embodiment has been described in detail in order to explain the present invention clearly, and the invention is not necessarily limited to one having all of the described configurations.
Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
For part of the configuration of each embodiment, other configurations can be added, deleted, or substituted. Each of the configurations, functions, processing units, processing means, and the like described above may be realized partly or wholly in hardware, for example by designing them as an integrated circuit.
Each of the configurations, functions, and the like described above may also be realized in software, by a processor interpreting and executing a program that implements each function.

Information such as programs, tables, and files that implement each function can be placed in a memory, in a recording device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC (Integrated Circuit) card, an SD card, or a DVD (Digital Versatile Disc).
The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of a product are necessarily shown. In practice, almost all components may be considered to be interconnected.

Reference Signs List
1 Article analysis device
11 Past article data
12 Predicted article data
13 Feature DB
21 Machine learning unit
22 Effect prediction unit
23 Improvement calculation unit
31 Morphological analysis unit
32 Score calculation unit
33 Score totaling unit
101 Character data
102 Image data
111 Word frequency data
112 CNN data
121 Fully connected layer data
131 Article effect data
201 Character data
202 Image data
231 Effect prediction data
232 Improved image data
233 Improved character data

Claims

An article analysis device comprising:
a machine learning unit that stores, in a storage unit as character feature data, a result of counting the appearance frequency of each word acquired as character data of past article data; and
an improvement calculation unit that calculates a score for each word acquired as character data of input predicted article data by computing an evaluation function that gives a higher score the higher the similarity to a word of the character feature data and the higher the appearance frequency in the character feature data, and that, for each word of the predicted article data, presents words of the character feature data having high scores as rewrite candidates to prompt rewriting of the word of the predicted article data.

The article analysis device according to claim 1, wherein the machine learning unit stores in the storage unit, in addition to the character feature data of the past article data, image feature data acquired from image data of the past article data, and forms a neural network in which a first layer that receives input of the character data and the image data of the past article data, a second layer corresponding to the character feature data and the image feature data, and a third layer that combines the character feature data and the image feature data of the second layer and outputs an evaluation score of the past article data are connected in order from the input side, and
the article analysis device further comprises an effect prediction unit that inputs the character data and image data of the predicted article data to the first layer of the neural network and presents the evaluation score of the predicted article data output from the third layer as effect prediction data of the predicted article data.

The article analysis device according to claim 2, wherein the machine learning unit uses, as the evaluation score of each piece of past article data, the number of reader reactions to the past article data posted on a social networking service.

An article analysis method performed by an article analysis device having a machine learning unit and an improvement calculation unit, wherein
the machine learning unit stores, in a storage unit as character feature data, a result of counting the appearance frequency of each word acquired as character data of past article data, and
the improvement calculation unit
calculates a score for each word acquired as character data of input predicted article data by computing an evaluation function that gives a higher score the higher the similarity to a word of the character feature data and the higher the appearance frequency in the character feature data, and
for each word of the predicted article data, presents words of the character feature data having high scores as rewrite candidates to prompt rewriting of the word of the predicted article data.