JP7139932B2 - Demand forecasting method, demand forecasting program and demand forecasting device

Publication number: JP7139932B2
Application number: JP2018235377A
Authority: JP
Inventors: 浩子鈴木; 勇渡部
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-12-17
Filing date: 2018-12-17
Publication date: 2022-09-21
Anticipated expiration: 2038-12-17
Also published as: JP2020098388A

Description

The present invention relates to a demand forecasting method, a demand forecasting program, and a demand forecasting device.

Product demand is generally forecast from trends in past sales results. At the research-and-development or planning stage before a new product is launched, however, no order data or sales data exist for that product, so its sales cannot be forecast in this way. For this reason, sales of a new product are forecast by searching for similar products that have already been released and using the past sales data of those similar products.

For example, a technique is known in which, before or shortly after the launch of a new product, the person running the forecast searches for multiple similar products and specifies a weight for each of them, and demand is forecast by computing a weighted sum of the past sales of the similar products using those weights. Another known technique extracts previously released similar products by using product attribute data and social media data containing statements about products, and forecasts the sales of the new product from the results of those similar products.

JP 2008-186413 A
JP 2000-3388 A
JP 2013-182415 A

With the above techniques, however, it is difficult to improve the accuracy of demand forecasts for new products that have not yet been launched.

For example, when the person running the forecast specifies the similar products and their weights, the result depends heavily on subjective, person-specific judgment, and it is not possible to grasp quantitatively which characteristics of the similar products affect the demand forecast for the new product. The accuracy of the forecast is therefore not necessarily high. In addition, a weighted sum of the past sales of similar products, or social media data, cannot correctly forecast demand for a new product that cannot be expressed as a simple combination of past similar products.

In one aspect, an object is to provide a demand forecasting method, a demand forecasting program, and a demand forecasting device capable of improving the accuracy of demand forecasts for new products.

In a first aspect, the demand forecasting method causes a computer to extract, based on preset conditions, feature words indicating the attributes of each product from documents describing the attributes of existing products that are already on sale or of new products that are not yet on sale. The computer then generates, from the frequency with which the feature words appear in each document, clustering information indicating, for each product, the combination of degrees to which it has the feature words. Finally, the computer trains a prediction model that forecasts demand for the new product, using learning data in which the generated clustering information is the explanatory variable and the sales results of the existing products are the objective variable.

According to one embodiment, the accuracy of demand forecasts for new products can be improved.

FIG. 1 is a diagram illustrating a demand forecasting device according to a first embodiment.
FIG. 2 is a functional block diagram showing the functional configuration of the demand forecasting device according to the first embodiment.
FIG. 3 is a diagram showing an example of a proposal stored in a proposal DB.
FIG. 4 is a diagram showing an example of information stored in a sales information DB.
FIG. 5 is a diagram showing an example of information stored in a monthly sales information DB.
FIG. 6 is a diagram illustrating a learning phase according to the first embodiment.
FIG. 7 is a diagram illustrating an application phase according to the first embodiment.
FIG. 8 is a flowchart showing the flow of processing.
FIG. 9 is a diagram illustrating a demand forecasting device according to a second embodiment.
FIG. 10 is a diagram illustrating a learning phase according to the second embodiment.
FIG. 11 is a diagram illustrating an application phase according to the second embodiment.
FIG. 12 is a diagram illustrating an example of smoothing.
FIG. 13 is a diagram illustrating another example of smoothing.
FIG. 14 is a diagram illustrating a smoothing result.
FIG. 15 is a diagram illustrating the effect.
FIG. 16 is a diagram illustrating a comparative example of the effect.
FIG. 17 is a diagram illustrating a hardware configuration example.

Embodiments of the demand forecasting method, the demand forecasting program, and the demand forecasting device disclosed in the present application are described below in detail with reference to the drawings. The invention is not limited by these embodiments. The embodiments can also be combined as appropriate to the extent that they do not contradict one another.

[Description of demand forecasting device]
FIG. 1 is a diagram illustrating the demand forecasting device 10 according to the first embodiment. The demand forecasting device 10 shown in FIG. 1 is an example of a computer device that forecasts demand for a new product before its launch. The demand forecasting device 10 trains a prediction model in a learning phase and, in an application phase, forecasts demand using the trained prediction model.

As shown in FIG. 1, in the learning phase the demand forecasting device 10 extracts words that represent the content (keywords) from text information such as proposals for existing products that have already been released and proposals for new products that are written during pre-launch research and development or planning. The demand forecasting device 10 then performs clustering using the extracted keywords and generates clusters. After that, the demand forecasting device 10 trains a prediction model for demand forecasting using learning data in which the cluster results of the existing products are the explanatory variables and the sales information of the existing products is the objective variable.

In the application phase, after training is complete, the demand forecasting device 10 inputs the cluster result of the new product, which was generated in the learning phase, into the trained prediction model. The demand forecasting device 10 then obtains the output of the prediction model as the demand forecast.

In this way, the demand forecasting device 10 performs clustering on the text information associated with each product, obtains clusters that are semantically coherent groups of keywords, and feeds the cluster results into the prediction model as explanatory variables. This makes it possible to calculate quantitatively how strongly each cluster affects sales. The demand forecasting device 10 can therefore improve the accuracy of demand forecasts for new products.

[Functional configuration]
FIG. 2 is a functional block diagram showing the functional configuration of the demand forecasting device 10 according to the first embodiment. As shown in FIG. 2, the demand forecasting device 10 has a communication unit 11, a storage unit 12, and a control unit 30.

The communication unit 11 is a processing unit that controls communication with other devices, and is, for example, a communication interface. For example, the communication unit 11 receives instructions to start various processes and various data from an administrator, and transmits learning results, prediction results, and the like to an administrator terminal.

The storage unit 12 is an example of a storage device that stores various data and the programs executed by the control unit 30, and is, for example, a memory or a hard disk. The storage unit 12 has a proposal DB 13, a sales information DB 14, a monthly sales information DB 15, a text information DB 16, a weight information DB 17, a cluster DB 18, a learning data DB 19, a learning result DB 20, and a prediction result DB 21.

The proposal DB 13 is a database that stores proposal data for existing products that have already been released and proposal data for new products that have not yet been released, written at the research-and-development or planning stage. Specifically, the proposal DB 13 stores the data of each proposal, which contains keywords representing information about the product such as material names, target age group, and product content.

FIG. 3 is a diagram showing an example of a proposal stored in the proposal DB 13. As shown in FIG. 3, a proposal contains an item a and an item b, which represent, for example, the features of the product and a description of the product. Item a contains a document 1a that describes the information for item a in detail, and item b contains a document 1b that describes the information for item b in detail. Documents 1a and 1b contain keywords. For example, item a describes the product features and item b describes the target customers of the product.

The sales information DB 14 is a database that stores sales information for existing products. Specifically, the sales information DB 14 stores the sales of each existing product on its release date. FIG. 4 is a diagram showing an example of information stored in the sales information DB 14. As shown in FIG. 4, the sales information DB 14 stores "product, release date, sales" in association with one another.

Here, "product" is the name of an existing product that has been released, "release date" is the date on which sales started, and "sales" is, for example, the number of units sold. In the example of FIG. 4, product 1 went on sale on June 10, 2018 and sold 100 units on that day. The sales information DB 14 is not limited to sales on the first day of sale and can also store the sales on any specific sale date. In this embodiment, product 1, product 2, and product 3 are described as existing products.

The monthly sales information DB 15 is a database that stores monthly sales information for existing products. FIG. 5 is a diagram showing an example of information stored in the monthly sales information DB 15. As shown in FIG. 5, the monthly sales information DB 15 stores "product, first month, second month, third month" in association with one another.

Here, "product" is the name of an existing product that has been released, and "first month" and so on are, for example, the number of units sold in the first, second, and third month after launch. In the example of FIG. 5, product 1 sold 1000 units in the first month after launch, 300 units in the second month, and 100 units in the third month. The information is not limited to monthly figures; daily or yearly figures can also be used.

The text information DB 16 is a database that stores data on the proposals for existing products and new products. Specifically, the text information DB 16 stores, for each product, which documents are contained in item a and in item b.

For example, the text information DB 16 stores "product, item a, item b" in association with one another. Here, "product" is the product name, and "item a" and "item b" indicate where the documents from which the keywords used for cluster classification are extracted are located. For instance, when document 1a in item a and document 1b in item b of product 1 are the extraction sources, the text information DB 16 stores "product 1, document 1a, document 1b" as "product, item a, item b".

The weight information DB 17 is a database that stores information on the weights of the keywords contained in the text information. Specifically, the weight information DB 17 stores the weight of each keyword extracted from the proposals and similar documents.

The cluster DB 18 is a database that stores information on the clusters into which the products, both existing and new, are classified. Specifically, the cluster DB 18 stores the result of clustering the products by keyword. That is, the cluster DB 18 stores, for each product, the cluster IDs assigned to the clusters.

The learning data DB 19 is a database that stores the learning data used to train the monthly prediction models. Specifically, the learning data DB 19 stores multiple sets of learning data in which the cluster IDs of each product are the explanatory variables and the sales of each product in each month after launch are the objective variable. For example, the learning data DB 19 stores learning data for training the prediction model for first-month sales, learning data for training the prediction model for second-month sales, and learning data for training the prediction model for third-month sales.

The learning result DB 20 is a database that stores the learning results of the monthly prediction models. For example, the learning result DB 20 stores the determination results (classification results) produced by the control unit 30 for the learning data, and the various parameters learned by multiple regression analysis, machine learning, or the like. For example, the learning result DB 20 stores the parameters needed to construct the prediction model for first-month sales, the prediction model for second-month sales, and the prediction model for third-month sales.

The prediction result DB 21 is a database that stores the sales forecasts for new products obtained with the trained prediction models. Specifically, the prediction result DB 21 stores, for each new product, the sales forecast for the first month, the second month, and the third month.

The control unit 30 is a processing unit that controls the demand forecasting device 10 as a whole, and is, for example, a processor. The control unit 30 has a learning processing unit 40 and a prediction processing unit 50. The learning processing unit 40 and the prediction processing unit 50 are examples of electronic circuits included in the processor or of processes executed by the processor.

The learning processing unit 40 is a processing unit that trains a prediction model for each month, and has a word extraction unit 41, a weight calculation unit 42, a selection unit 43, a clustering unit 44, a learning data generation unit 45, and a learning unit 46.

The word extraction unit 41 is a processing unit that extracts the keywords appearing in each item of the proposals for existing products and new products. For example, the word extraction unit 41 performs morphological analysis or similar processing on document 1a in item a of the proposal for product 1 and extracts K1a, K2a, K3a, K4a, and so on as keywords. Likewise, the word extraction unit 41 performs morphological analysis or similar processing on document 1b in item b of the proposal for product 1 and extracts K1b, K2b, K3b, K4b, K5b, and so on as keywords.

In this way, the word extraction unit 41 extracts keywords from the proposals for existing products and for new products, stores the extraction results in the text information DB 16, and outputs them to the weight calculation unit 42.

The weight calculation unit 42 is a processing unit that calculates the weight of each keyword. Specifically, the weight calculation unit 42 calculates the TF-IDF (Term Frequency-Inverse Document Frequency) of each keyword extracted by the word extraction unit 41. In the above example, for product 1, the weight calculation unit 42 calculates the TF-IDF of the keyword "K1a" in document 1a of item a as the weight of K1a.

In this way, the weight calculation unit 42 calculates the weight of each keyword extracted from the proposals for existing products and from the proposals for new products, stores the results in the weight information DB 17, and outputs them to the selection unit 43.
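The weight calculation can be sketched as follows. This is a minimal illustration that assumes the common formulation TF x log(N / DF); the text does not state which TF-IDF variant is used, and the function name `tf_idf` and the token lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {keyword: weight} dict per tokenized document.

    Assumed formula: (count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each keyword.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical keyword lists for the "item a" documents of three products.
docs = [
    ["K2a", "K3a", "K2a"],
    ["K1a", "K3a"],
    ["K2a", "K3a"],
]
weights = tf_idf(docs)
```

Under this formulation a keyword that appears in every document (here `K3a`) receives weight 0, which is one reason a threshold on the weight can serve as a selection criterion.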

The selection unit 43 is a processing unit that selects keywords. Specifically, from the keywords extracted by the word extraction unit 41, the selection unit 43 excludes keywords whose weight is less than a predetermined value, keywords that appear in a stop word list, and keywords whose part of speech is designated for exclusion.

In this way, the selection unit 43 selects from the keywords of the existing products and from the keywords of the new products, and outputs the selection results to the clustering unit 44. The stop word list is a list of words that are not to be treated as keywords, and the parts of speech to be excluded are, for example, particles. Both can be set in advance by an administrator or taken from a general-purpose dictionary.
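The three exclusion rules can be sketched as a single filter. The function name `select_keywords`, the tuple layout, the threshold, and the sample words below are all hypothetical and chosen only for illustration.

```python
def select_keywords(weighted, min_weight, stop_words, excluded_pos):
    """Keep only the keywords that pass all three filters.

    weighted: list of (keyword, part_of_speech, weight) tuples.
    """
    return [
        (word, weight)
        for word, pos, weight in weighted
        if weight >= min_weight          # drop weights below the threshold
        and word not in stop_words       # drop stop words
        and pos not in excluded_pos      # drop excluded parts of speech
    ]


# Hypothetical candidates: one per exclusion rule, plus one that survives.
candidates = [
    ("citrus", "noun", 0.7),
    ("product", "noun", 0.6),     # listed as a stop word
    ("of", "particle", 0.9),      # excluded part of speech
    ("new", "adjective", 0.05),   # weight below the threshold
]
selected = select_keywords(
    candidates,
    min_weight=0.1,
    stop_words={"product"},
    excluded_pos={"particle"},
)
```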

The clustering unit 44 is a processing unit that clusters the products using the keywords selected by the selection unit 43. That is, the clustering unit 44 generates clustering information indicating, for each product, the combination of degrees to which it has the feature words.

For example, the clustering unit 44 clusters the products and assigns a cluster ID to each cluster. Among the keywords belonging to item a, it assigns the ID "C1a" to the cluster having K1a and K3a, and the ID "C2a" to the cluster having K2a and K3a. Similarly, among the keywords belonging to item b, it assigns the ID "C1b" to the cluster having K1b and K2b, and the ID "C2b" to the cluster having K2b and K3b.

The clustering unit 44 then associates the products with the cluster IDs and generates "item a (C1a, C2a), item b (C1b, C2b)" for each product. For example, when the keywords of product 1 are "K2a, K3a" in item a and "K1b, K2b" in item b, product 1 belongs to C2a and C1b, so the clustering unit 44 generates "0, 1, 1, 0" as "item a (C1a, C2a), item b (C1b, C2b)" for product 1.

In this way, the clustering unit 44 clusters all products, both existing and new, stores the clustering results in the cluster DB 18, and outputs them to the learning data generation unit 45. Any of various general clustering methods can be used.

The learning data generation unit 45 is a processing unit that generates learning data from the clustering results. Specifically, the learning data generation unit 45 generates learning data in which the clustering results of the existing products are the explanatory variables and the monthly sales information is the objective variable, and stores the learning data in the learning data DB 19.

In the above example, the learning data generation unit 45 generates learning data in which "0, 1, 1, 0", the "item a (C1a, C2a), item b (C1b, C2b)" of product 1, is the explanatory variable, and the first-month sales "1000", the second-month sales "300", and the third-month sales "100" stored in the monthly sales information DB 15 are each an objective variable.

That is, the learning data generation unit 45 generates "(0, 1, 1, 0), 1000" as learning data for the prediction model that forecasts first-month sales, "(0, 1, 1, 0), 300" as learning data for the prediction model that forecasts second-month sales, and "(0, 1, 1, 0), 100" as learning data for the prediction model that forecasts third-month sales. In this way, the learning data generation unit 45 uses the cluster IDs to which each existing product belongs to generate the learning data for each monthly prediction model.
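The pairing of one cluster vector with each month's sales can be sketched as follows. The function name `build_training_sets` and the dictionary layout are assumptions; the values for product 1 are the ones given in the text.

```python
def build_training_sets(cluster_vectors, monthly_sales):
    """Return one list of (features, target) pairs per month.

    cluster_vectors: {product: [0/1, ...]} for existing products.
    monthly_sales:   {product: [month-1 units, month-2 units, ...]}.
    """
    n_months = len(next(iter(monthly_sales.values())))
    datasets = [[] for _ in range(n_months)]
    for product, sales in monthly_sales.items():
        features = cluster_vectors[product]
        for month_index, units in enumerate(sales):
            # The same explanatory vector is reused for every month.
            datasets[month_index].append((features, units))
    return datasets


cluster_vectors = {"product 1": [0, 1, 1, 0]}
monthly_sales = {"product 1": [1000, 300, 100]}
datasets = build_training_sets(cluster_vectors, monthly_sales)
```

Each element of `datasets` feeds a separate model, so the month-1, month-2, and month-3 models see the same features with different targets.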

The learning unit 46 is a processing unit that trains the prediction model for each month. Specifically, the learning unit 46 uses the monthly learning data stored in the learning data DB 19 to perform supervised learning that generates a prediction model for each month. For example, the learning unit 46 performs the learning process using multiple regression analysis.

The learning unit 46 then stores the learning results in the learning result DB 20. The point at which the learning process ends can be set freely, for example when training with a predetermined number or more of learning data items has been completed, or when the error between the objective variable (label) and the output of the prediction model falls below a threshold.

The prediction processing unit 50 is a processing unit that forecasts the sales of new products using the trained prediction models. For example, the prediction processing unit 50 reads the parameters from the learning result DB 20 and constructs the prediction model for the first month, the prediction model for the second month, and the prediction model for the third month. The prediction processing unit 50 then inputs the cluster IDs "1, 0, 1, 0" associated with the new product into each prediction model as features, obtains each output, and stores the outputs in the prediction result DB 21. In this way, the prediction processing unit 50 forecasts the first-month, second-month, and third-month sales of the new product.
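Training and applying one monthly model can be sketched as a linear regression fitted by gradient descent on the squared error. This is only an illustration under stated assumptions: the text names multiple regression analysis but does not specify a fitting procedure, the function names `fit_linear` and `predict` are hypothetical, and the sales figures for products 2 and 3 are invented because the text gives figures only for product 1.

```python
def fit_linear(samples, learning_rate=0.1, epochs=5000):
    """Fit y = bias + w . x by batch gradient descent on the mean squared error."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            error = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += error / n
            for j, xi in enumerate(x):
                grad_w[j] += error * xi / n
        bias -= learning_rate * grad_b
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    return weights, bias


def predict(model, x):
    """Apply a fitted (weights, bias) pair to one cluster vector."""
    weights, bias = model
    return bias + sum(w * xi for w, xi in zip(weights, x))


# First-month learning data: cluster vector (C1a, C2a, C1b, C2b) and units sold.
month1 = [
    ([0, 1, 1, 0], 1000.0),  # product 1 (figure from the text)
    ([1, 0, 0, 1], 800.0),   # product 2 (invented figure)
    ([0, 1, 0, 1], 600.0),   # product 3 (invented figure)
]
model = fit_linear(month1)

# Application phase: feed the new product's cluster vector to the trained model.
forecast = predict(model, [1, 0, 1, 0])
```

The same two functions would be called once per month to obtain the second-month and third-month models. With only three products the fit is underdetermined, so the learned weights are indicative at best; a realistic setting would need many more existing products than clusters.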

[Concrete example]
Next, concrete examples of the learning phase and the application phase are described with reference to FIGS. 6 and 7. FIG. 6 is a diagram illustrating the learning phase according to the first embodiment. FIG. 7 is a diagram illustrating the application phase according to the first embodiment.

(Learning phase)
As shown in FIG. 6, the text information DB 16 stores, for each product, the document belonging to item a and the document belonging to item b. Specifically, the text information DB 16 stores "product 1, document 1a, document 1b", "product 2, document 2a, document 2b", "product 3, document 3a, document 3b", and "new product, document a, document b" as "product, item a, item b".

The word extraction unit 41 then extracts keywords from the documents of each product (S1). For example, the word extraction unit 41 extracts the keywords "K1a, K2a, K3a" from the documents of item a and the keywords "K1b, K2b, K3b" from the documents of item b.

Next, the weight calculation unit 42 calculates the TF-IDF of each keyword, and the selection unit 43 selects keywords (S2). For example, for product 1, K1a is not applicable, K2a has weight 0.7, K3a has weight 0.1, K1b has weight 0.8, K2b has weight 0.6, and K3b is not applicable. For product 2, K1a has weight 0.8, K2a is not applicable, K3a has weight 0.7, K1b is not applicable, K2b has weight 0.1, and K3b has weight 0.8.

For product 3, K1a is not applicable, K2a has weight 0.8, K3a has weight 0.3, K1b is not applicable, K2b has weight 0.3, and K3b has weight 0.5. For the new product, K1a has weight 0.7, K2a is not applicable, K3a has weight 0.9, K1b has weight 0.7, K2b has weight 0.6, and K3b is not applicable.

The clustering unit 44 then clusters the existing products and the new product using the results of the weight calculation and keyword selection (S3). For example, for item a, the clustering unit 44 classifies products containing K1a and K3a into cluster C1a and products containing K2a and K3a into cluster C2a. For item b, it classifies products containing K1b and K2b into cluster C1b and products containing K2b and K3b into cluster C2b.

この結果、商品１は、項目ａに関してＣ２ａに属し、項目ｂに関してＣ１ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，１，０」が生成される。商品２は、項目ａに関してＣ１ａに属し、項目ｂに関してＣ２ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，０，１」が生成される。 As a result, product 1 belongs to C2a with respect to item a and belongs to C1b with respect to item b, so "0, 1, 1, 0" is generated as the clustering result "C1a, C2a, C1b, C2b". Product 2 belongs to C1a with respect to item a and belongs to C2b with respect to item b, so "1, 0, 0, 1" is generated as the clustering result "C1a, C2a, C1b, C2b".

商品３は、項目ａに関してＣ２ａに属し、項目ｂに関してＣ２ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，０，１」が生成される。新商品は、項目ａに関してＣ１ａに属し、項目ｂに関してＣ１ｂに属するので、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，１，０」が生成される。 Since product 3 belongs to C2a with respect to item a and C2b with respect to item b, "0, 1, 0, 1" is generated as the clustering result "C1a, C2a, C1b, C2b". Since the new product belongs to C1a with respect to item a and C1b with respect to item b, "1, 0, 1, 0" is generated as the clustering result "C1a, C2a, C1b, C2b".
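The clustering results above are in effect dummy (0/1) vectors over the four clusters. A minimal sketch of that encoding follows, assuming the cluster membership of each product is already known from S3; the names `CLUSTER_COLUMNS`, `MEMBERSHIP`, and `to_dummy_vector` are hypothetical and do not appear in the specification.

```python
# Column order of the clustering result, as used in the text.
CLUSTER_COLUMNS = ["C1a", "C2a", "C1b", "C2b"]

# Cluster membership per product (item a cluster, item b cluster),
# copied from the worked example in the description.
MEMBERSHIP = {
    "product 1": ("C2a", "C1b"),
    "product 2": ("C1a", "C2b"),
    "product 3": ("C2a", "C2b"),
    "new product": ("C1a", "C1b"),
}


def to_dummy_vector(clusters):
    """Return a 0/1 vector with 1 for each cluster the product belongs to."""
    return [1 if column in clusters else 0 for column in CLUSTER_COLUMNS]


# Clustering result per product, e.g. product 1 -> [0, 1, 1, 0].
CLUSTER_VECTORS = {name: to_dummy_vector(c) for name, c in MEMBERSHIP.items()}
```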

その後、学習データ生成部４５は、既存商品と新商品とを含むクラスタリング結果のうち既存商品のクラスタリング結果と、既存商品の売上実績とを用いて、学習データを生成する（Ｓ４とＳ５）。 After that, the learning data generation unit 45 generates learning data using the clustering results of the existing products among the clustering results including the existing products and the new products and the sales results of the existing products (S4 and S5).

例えば、学習データ生成部４５は、商品１について３つの学習データを生成する。すなわち、学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、１か月目の売上「１０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、２か月目の売上「３００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、３か月目の売上「１００」を目的変数に設定した学習データを生成する。 For example, the learning data generator 45 generates three pieces of learning data for product 1 . That is, the learning data generation unit 45 sets the clustering result "0, 1, 1, 0" of the product 1 as explanatory variables, and generates learning data in which the first month's sales "1000" is set as the objective variable. . The learning data generation unit 45 sets the clustering result "0, 1, 1, 0" of the product 1 as explanatory variables, and generates learning data in which the second month's sales "300" is set as the objective variable. The learning data generating unit 45 sets the clustering result "0, 1, 1, 0" of the product 1 as explanatory variables, and generates learning data in which the third month's sales "100" is set as the objective variable.

同様に、学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、１か月目の売上「１５００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、２か月目の売上「１０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、３か月目の売上「７００」を目的変数に設定した学習データを生成する。 Similarly, the learning data generation unit 45 sets the clustering result "1, 0, 0, 1" of the product 2 as explanatory variables, and generates learning data in which the first month's sales "1500" is set as the objective variable. do. The learning data generation unit 45 sets the clustering result "1, 0, 0, 1" of the product 2 as explanatory variables, and generates learning data in which the second month's sales "1000" is set as the objective variable. The learning data generation unit 45 sets the clustering result "1, 0, 0, 1" of the product 2 as explanatory variables, and generates learning data in which the third month's sales "700" is set as the objective variable.

同様に、学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、１か月目の売上「５０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、２か月目の売上「２０００」を目的変数に設定した学習データを生成する。学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、３か月目の売上「５００」を目的変数に設定した学習データを生成する。 Similarly, the learning data generation unit 45 sets the clustering result "0, 1, 0, 1" of the product 3 as the explanatory variable, and generates the learning data in which the first month's sales "5000" is set as the objective variable. do. The learning data generation unit 45 generates learning data by setting the clustering result "0, 1, 0, 1" of the product 3 as an explanatory variable and setting the second month's sales "2000" as an objective variable. The learning data generating unit 45 sets the clustering result "0, 1, 0, 1" of the product 3 as an explanatory variable, and generates learning data in which the third month's sales "500" is set as an objective variable.
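The nine pieces of learning data described above can be grouped by month, one set per prediction model. The following is an illustrative sketch only; `build_training_sets` and the dictionary layout are assumptions made for this example, while the vectors and sales figures are those of the worked example.

```python
# Clustering results of the existing products (explanatory variables).
CLUSTER_VECTORS = {
    "product 1": [0, 1, 1, 0],
    "product 2": [1, 0, 0, 1],
    "product 3": [0, 1, 0, 1],
}

# Sales for months 1 to 3 of each existing product (objective variables).
MONTHLY_SALES = {
    "product 1": [1000, 300, 100],
    "product 2": [1500, 1000, 700],
    "product 3": [5000, 2000, 500],
}


def build_training_sets(vectors, sales):
    """Return {month: [(explanatory_vector, objective_value), ...]}."""
    n_months = len(next(iter(sales.values())))
    return {
        month + 1: [(vectors[name], sales[name][month]) for name in sales]
        for month in range(n_months)
    }


# One training set per month; each holds one sample per existing product.
TRAINING_SETS = build_training_sets(CLUSTER_VECTORS, MONTHLY_SALES)
```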

そして、学習部４６は、生成された各学習データを用いて、各予測モデルを学習する（Ｓ６）。すなわち、学習部４６は、クラスタリング結果を商品の特徴を表すベクトルデータ、月別の売上情報を正解情報として、予測モデル（学習モデル）を学習する。 Then, the learning unit 46 learns each prediction model using each generated learning data (S6). That is, the learning unit 46 learns a prediction model (learning model) using the clustering result as vector data representing product features and monthly sales information as correct information.

For example, the learning unit 46 uses the learning data "(0, 1, 1, 0), 1000" for product 1, "(1, 0, 0, 1), 1500" for product 2, and "(0, 1, 0, 1), 5000" for product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the first month.

Similarly, the learning unit 46 uses the learning data "(0, 1, 1, 0), 300" for product 1, "(1, 0, 0, 1), 1000" for product 2, and "(0, 1, 0, 1), 2000" for product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the second month.

Similarly, the learning unit 46 uses the learning data "(0, 1, 1, 0), 100" for product 1, "(1, 0, 0, 1), 700" for product 2, and "(0, 1, 0, 1), 500" for product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the third month.
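As a rough illustration of fitting one linear model per month, the sketch below uses least squares with a small ridge penalty and no intercept. This is a swapped-in simplification, not the regression procedure of the specification: the toy data has only three samples for four dummy variables, so plain multiple regression has no unique solution, and the penalty is an assumption added to make the system solvable. All function names are hypothetical, and the sketch is not expected to reproduce the output values quoted in the text.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ridge(samples, penalty=1e-6):
    """Fit y ~ w . x by least squares with a small ridge penalty (assumption)."""
    dim = len(samples[0][0])
    gram = [
        [sum(x[i] * x[j] for x, _ in samples) + (penalty if i == j else 0.0)
         for j in range(dim)]
        for i in range(dim)
    ]
    rhs = [sum(x[i] * y for x, y in samples) for i in range(dim)]
    return solve_linear_system(gram, rhs)


def predict(weights, x):
    """Linear prediction for one explanatory vector."""
    return sum(w * v for w, v in zip(weights, x))


# Month-1 learning data from the worked example.
MONTH_1_SAMPLES = [([0, 1, 1, 0], 1000), ([1, 0, 0, 1], 1500), ([0, 1, 0, 1], 5000)]
MONTH_1_WEIGHTS = fit_ridge(MONTH_1_SAMPLES)
```

The same `fit_ridge` call would be repeated with the month-2 and month-3 learning data to obtain the other two models.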

(Application phase)
In the application phase, as shown in FIG. 7, the prediction processing unit 50 extracts, from the clustering results obtained in S3 of FIG. 6, the new product's clustering result "1, 0, 1, 0" for "C1a, C2a, C1b, C2b".

そして、予測処理部５０は、新商品のクラスタリング結果「１，０，１，０」を、１か月用の予測モデル、２か月用の予測モデル、３か月用の予測モデルのそれぞれに入力する。その後、予測処理部５０は、１か月用の予測モデルの出力値「２５００」、２か月用の予測モデルの出力値「２０」、３か月用の予測モデルの出力値「３００」を取得する。 Then, the prediction processing unit 50 assigns the clustering result "1, 0, 1, 0" of the new product to the prediction model for one month, the prediction model for two months, and the prediction model for three months. input. After that, the prediction processing unit 50 outputs the output value “2500” of the prediction model for one month, the output value “20” of the prediction model for two months, and the output value “300” of the prediction model for three months. get.

As a result, the prediction processing unit 50 predicts sales of "2500 units" for the first month after the new product is released, "20 units" from the first month to the second month, and "300 units" from the second month to the third month.

[Process flow]
FIG. 8 is a flowchart showing the flow of processing. As shown in FIG. 8, when instructed to start processing (S101: Yes), the learning processing unit 40 reads the proposal data stored in the proposal DB 13 (S102).

続いて、学習処理部４０は、各企画書のデータからキーワードを抽出し（Ｓ１０３）、キーワードの重みを算出する（Ｓ１０４）。そして、学習処理部４０は、ストップワードリストや重みを用いて、キーワードの選定を実行する（Ｓ１０５）。 Subsequently, the learning processing unit 40 extracts keywords from the data of each proposal (S103), and calculates the weight of the keywords (S104). Then, the learning processing unit 40 selects keywords using the stop word list and weights (S105).

その後、学習処理部４０は、選定されたキーワードを用いてクラスタリングを実行し（Ｓ１０６）、商品とクラスタＩＤとを対応付けたクラスタリング結果を生成する（Ｓ１０７）。 After that, the learning processing unit 40 performs clustering using the selected keywords (S106), and generates a clustering result in which products and cluster IDs are associated with each other (S107).

そして、学習処理部４０は、新商品を含むクラスタリングのうち、既存商品のクラスタリング結果を説明変数として抽出し（Ｓ１０８）、各商品の月別の売上情報を用いて学習データを生成する（Ｓ１０９）。続いて、学習処理部４０は、学習データを用いて、予測モデルの学習を実行する（Ｓ１１０）。 Then, the learning processing unit 40 extracts the clustering results of the existing products from the clustering including the new products as explanatory variables (S108), and generates learning data using the monthly sales information of each product (S109). Subsequently, the learning processing unit 40 executes learning of the prediction model using the learning data (S110).

その後、学習が完了すると（Ｓ１１１：Ｙｅｓ）、予測処理部５０は、学習フェーズにおけるクラスタリング結果のうち、新商品のクラスタリング結果を学習済みの予測モデルに入力して（Ｓ１１２）、予測結果を取得する（Ｓ１１３）。 After that, when the learning is completed (S111: Yes), the prediction processing unit 50 inputs the clustering result of the new product among the clustering results in the learning phase to the learned prediction model (S112) to acquire the prediction result. (S113).

実施例１では、新商品を含めた状態でクラスタリングを実行する例を説明したが、この手法は、学習時に、新商品の企画書などが作成されている場合には有効な手法である。しかし、学習時に新商品の企画書がない場合も考えられる。そこで、実施例２では、学習時には既存商品の企画書を用いて予測モデルを学習し、予測時に新商品の企画書を用いて予測を実行する例を説明する。 In the first embodiment, an example in which clustering is performed including new products has been described, but this method is an effective method when a plan for new products is created at the time of learning. However, it is conceivable that there is no proposal for a new product at the time of learning. Therefore, in the second embodiment, an example will be described in which a prediction model is learned using proposals for existing products at the time of learning, and prediction is performed using proposals for new products at the time of prediction.

[Description of the demand forecasting device according to the second embodiment]
FIG. 9 is a diagram explaining the demand forecasting device 10 according to the second embodiment. As shown in FIG. 9, in the learning phase, the demand forecasting device 10 extracts keywords representing the content from the proposals of existing products that are already on the market. The demand forecasting device 10 then clusters the existing products using the extracted words to generate clusters. After that, the demand forecasting device 10 learns a prediction model for demand forecasting using learning data in which the clustering results of the existing products are set as explanatory variables and the sales information of the existing products is set as the objective variable.

学習完了後の適用フェーズでは、需要予測装置１０は、新商品の企画書からキーワードを抽出し、キーワードの一致数などを用いて、各商品と新商品とのテキスト間の類似度を算出する。そして、需要予測装置１０は、各商品と新商品とのテキスト間の類似度を用いて、既存商品のクラスタＩＤごとの重み付け加算を算出する。すなわち、需要予測装置１０は、各商品が属するクラスタへの新商品の依存度を算出する。その後、需要予測装置１０は、重み付け加算の結果を各予測モデルに入力して、予測モデルの出力結果を需要予測として取得する。 In the application phase after the completion of learning, the demand prediction device 10 extracts keywords from the proposals of the new products, and calculates the similarity between the texts of each product and the new product using the number of matching keywords. Then, the demand prediction device 10 calculates the weighted addition for each cluster ID of the existing product using the similarity between the texts of each product and the new product. That is, the demand prediction device 10 calculates the degree of dependence of the new product on the cluster to which each product belongs. After that, the demand forecasting device 10 inputs the results of the weighted addition to each forecasting model and acquires the output result of the forecasting model as a demand forecast.

[Concrete example]
Next, specific examples of the learning phase and the application phase will be described with reference to FIGS. 10 and 11. FIG. 10 is a diagram explaining the learning phase according to the second embodiment. FIG. 11 is a diagram explaining the application phase according to the second embodiment.

(Learning phase)
The difference from the first embodiment is that only the existing products are clustered, using only existing-product information and no new-product information. Specifically, as shown in FIG. 10, the text information DB 16 stores the document information of each product in the form "product, item a, item b": "product 1, document 1a, document 1b", "product 2, document 2a, document 2b", and "product 3, document 3a, document 3b".

そして、単語抽出部４１は、各商品の文書からキーワードを抽出する（Ｓ１０）。例えば、単語抽出部４１は、項目ａの文書からキーワード「Ｋ１ａ、Ｋ２ａ、Ｋ３ａ」を抽出し、項目ｂの文書からキーワード「Ｋ１ｂ、Ｋ２ｂ、Ｋ３ｂ」を抽出する。 Then, the word extraction unit 41 extracts a keyword from the document of each product (S10). For example, the word extraction unit 41 extracts keywords "K1a, K2a, K3a" from the document of item a, and extracts keywords "K1b, K2b, K3b" from the document of item b.

Subsequently, the weight calculation unit 42 calculates the TF-IDF of each keyword, and the selection unit 43 selects keywords (S11). For example, for product 1, K1a is not applicable, K2a has weight 0.7, K3a has weight 0.1, K1b has weight 0.8, K2b has weight 0.6, and K3b is not applicable. For product 2, K1a has weight 0.8, K2a is not applicable, K3a has weight 0.7, K1b is not applicable, K2b has weight 0.1, and K3b has weight 0.8. For product 3, K1a is not applicable, K2a has weight 0.8, K3a has weight 0.3, K1b is not applicable, K2b has weight 0.3, and K3b has weight 0.5.

そして、クラスタリング部４４は、重み算出やキーワード選定の結果を用いて、既存商品のクラスタリングを実行する（Ｓ１２）。例えば、クラスタリング部４４は、実施例１と同様、項目ａについてクラスタＣ１ａとＣ２ａに分類し、項目ｂについてクラスタＣ１ｂとＣ２ｂに分類する。 Then, the clustering unit 44 performs clustering of existing products using the results of weight calculation and keyword selection (S12). For example, as in the first embodiment, the clustering unit 44 classifies item a into clusters C1a and C2a, and classifies item b into clusters C1b and C2b.

この結果、商品１について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，１，０」が生成される。商品２について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「１，０，０，１」が生成される。商品３について、クラスタリング結果「Ｃ１ａ，Ｃ２ａ，Ｃ１ｂ，Ｃ２ｂ」として「０，１，０，１」が生成される。 As a result, for product 1, "0, 1, 1, 0" is generated as the clustering result "C1a, C2a, C1b, C2b". For product 2, "1, 0, 0, 1" is generated as the clustering result "C1a, C2a, C1b, C2b". For product 3, "0, 1, 0, 1" is generated as the clustering result "C1a, C2a, C1b, C2b".

その後、学習データ生成部４５は、既存商品のクラスタリング結果と既存商品の売上実績とを用いて、学習データを生成する（Ｓ１３）。例えば、学習データ生成部４５は、商品１について、商品１のクラスタリング結果「０，１，１，０」を説明変数に設定し、１か月目の売上「１０００」、２か月目の売上「３００」、３か月目の売上「１００」のそれぞれを目的変数に設定した３つの学習データを生成する。 After that, the learning data generation unit 45 generates learning data using the clustering result of the existing products and the actual sales of the existing products (S13). For example, the learning data generation unit 45 sets the clustering result "0, 1, 1, 0" of the product 1 as explanatory variables for the product 1, the first month sales "1000", the second month sales Three sets of learning data are generated with each of "300" and third-month sales "100" set as objective variables.

同様に、学習データ生成部４５は、商品２のクラスタリング結果「１，０，０，１」を説明変数に設定し、１か月目の売上「１５００」、２か月目の売上「１０００」、３か月目の売上「７００」それぞれを目的変数に設定した、３つの学習データを生成する。同様に、学習データ生成部４５は、商品３のクラスタリング結果「０，１，０，１」を説明変数に設定し、１か月目の売上「５０００」、２か月目の売上「２０００」、３か月目の売上「５００」それぞれを目的変数に設定した、３つの学習データを生成する。 Similarly, the learning data generation unit 45 sets the clustering result “1, 0, 0, 1” of the product 2 as explanatory variables, the first month sales “1500”, the second month sales “1000” , and third month sales "700" are set as objective variables to generate three sets of learning data. Similarly, the learning data generation unit 45 sets the clustering result “0, 1, 0, 1” of product 3 as explanatory variables, the first month sales “5000” and the second month sales “2000”. , and third month sales "500" are set as objective variables to generate three learning data.

Then, the learning unit 46 learns each prediction model using the generated learning data (S14). For example, the learning unit 46 uses the learning data "(0, 1, 1, 0), 1000" for product 1, "(1, 0, 0, 1), 1500" for product 2, and "(0, 1, 0, 1), 5000" for product 3 to learn, by multiple regression analysis, a prediction model that predicts sales for the first month.

(Application phase)
In the application phase, unlike the first embodiment, the new product's information is used to calculate the inter-text similarity between the new product and each existing product, and from it a feature quantity indicating how strongly the new product relates to each of the already classified clusters. Prediction is then executed with the new product's feature quantity as input.

As shown in FIG. 11, the text information DB 16 stores "new product, document a, document b" as the new product's proposal data "product, item a, item b". In this state, the prediction processing unit 50 extracts keywords from each document of the new product's proposal data, calculates the weight of each keyword using TF-IDF or the like, and selects keywords (S20). For example, the prediction processing unit 50 extracts the keywords "K1a, K3a" from document a of item a of the new product, and the keywords "K1b, K2b" from document b of item b. The prediction processing unit 50 then calculates a weight of "0.7" for K1a, "0.9" for K3a, "0.7" for K1b, and "0.6" for K2b. Here, K2a and K3b are assumed not to have been extracted.

Subsequently, the prediction processing unit 50 uses a technique such as cosine similarity to calculate the inter-text similarity between the weight information of the new product and the weight information of each existing product (S21). For example, for existing product 1, the prediction processing unit 50 calculates an inter-text similarity of "0.11" for item a and "1.00" for item b. Similarly, for existing product 2, it calculates an inter-text similarity of "0.98" for item a and "0.08" for item b. For existing product 3, it calculates an inter-text similarity of "0.28" for item a and "0.33" for item b. Note that the inter-text similarity is not limited to cosine similarity; a general similarity measure, such as the proportion of matching keywords among all keywords, can also be used.
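The similarity values in this example can be checked with a short cosine similarity computation over the keyword weights. The sketch below assumes that a keyword marked "not applicable" or not extracted contributes a weight of 0; the names `NEW_PRODUCT`, `EXISTING`, and `cosine_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Keyword weights ordered (K1, K2, K3) per item; missing keywords count as 0.
NEW_PRODUCT = {"a": [0.7, 0.0, 0.9], "b": [0.7, 0.6, 0.0]}
EXISTING = {
    "product 1": {"a": [0.0, 0.7, 0.1], "b": [0.8, 0.6, 0.0]},
    "product 2": {"a": [0.8, 0.0, 0.7], "b": [0.0, 0.1, 0.8]},
    "product 3": {"a": [0.0, 0.8, 0.3], "b": [0.0, 0.3, 0.5]},
}

# Inter-text similarity of the new product to each existing product, per item.
SIMILARITY = {
    name: {item: cosine_similarity(NEW_PRODUCT[item], weights[item])
           for item in NEW_PRODUCT}
    for name, weights in EXISTING.items()
}
```

Rounded to two decimals, these come out as 0.11 and 1.00 for product 1, 0.98 and 0.08 for product 2, and 0.28 and 0.33 for product 3, matching the values given in the text.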

After that, the prediction processing unit 50 performs a weighted addition of the dummy variables of the existing products' cluster IDs, using the inter-text similarities as weights (S22). In the above example, for product 1, the prediction processing unit 50 refers to the clustering result generated in S12 of FIG. 10 and assigns the inter-text similarity "0.11" to cluster C2a of item a, to which product 1 belongs, and the inter-text similarity "1.00" to cluster C1b of item b, to which product 1 belongs.

Similarly, for product 2, the prediction processing unit 50 refers to the clustering result generated in S12 of FIG. 10 and assigns the inter-text similarity "0.98" to cluster C1a of item a, to which product 2 belongs, and the inter-text similarity "0.08" to cluster C2b of item b, to which product 2 belongs. For product 3, the prediction processing unit 50 refers to the clustering result generated in S12 of FIG. 10 and assigns the inter-text similarity "0.28" to cluster C2a of item a, to which product 3 belongs, and the inter-text similarity "0.33" to cluster C2b of item b, to which product 3 belongs.

After that, the prediction processing unit 50 sums the inter-text similarities assigned to each cluster ID across the existing products to generate the feature quantity (feature vector) of the new product. In the above example, the prediction processing unit 50 sets C1a of item a to product 2's inter-text similarity "0.98", and sets C2a of item a to "0.39", the sum of product 1's inter-text similarity "0.11" and product 3's inter-text similarity "0.28". Similarly, the prediction processing unit 50 sets C1b of item b to product 1's inter-text similarity "1.00", and sets C2b of item b to "0.42", the sum of product 2's inter-text similarity "0.08" and product 3's inter-text similarity "0.33".
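The weighted addition of S22 can be sketched as follows, using the rounded similarities printed in the text as input. The function name `weighted_cluster_feature` and the data layout are assumptions made for illustration. With the rounded inputs the C2b component comes to 0.41; the value "0.42" in the text appears consistent with summing the unrounded cosine similarities (about 0.0807 + 0.3348) before rounding.

```python
CLUSTER_COLUMNS = ["C1a", "C2a", "C1b", "C2b"]

# Dummy variables of the existing products' cluster IDs (from S12).
DUMMY_VECTORS = {
    "product 1": [0, 1, 1, 0],
    "product 2": [1, 0, 0, 1],
    "product 3": [0, 1, 0, 1],
}

# Inter-text similarities to the new product, rounded as printed in the text.
SIMILARITIES = {
    "product 1": {"a": 0.11, "b": 1.00},
    "product 2": {"a": 0.98, "b": 0.08},
    "product 3": {"a": 0.28, "b": 0.33},
}


def weighted_cluster_feature(dummy_vectors, similarities):
    """Sum each product's dummy vector, weighted by its per-item similarity."""
    feature = [0.0] * len(CLUSTER_COLUMNS)
    for product, dummy in dummy_vectors.items():
        for index, column in enumerate(CLUSTER_COLUMNS):
            item = column[-1]  # "a" or "b", taken from the cluster name
            feature[index] += dummy[index] * similarities[product][item]
    return feature


# Feature vector of the new product over (C1a, C2a, C1b, C2b).
NEW_PRODUCT_FEATURE = weighted_cluster_feature(DUMMY_VECTORS, SIMILARITIES)
```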

Then, the prediction processing unit 50 inputs the addition result "0.98, 0.39, 1.00, 0.42" as the explanatory variables into each of the trained prediction models for the first, second, and third months, and obtains the prediction results (S23). In the above example, the prediction processing unit 50 predicts sales of "2500 units" for the first month after the new product is released, "20 units" from the first month to the second month, and "300 units" from the second month to the third month.

ところで、実施例１と実施例２では、月別の予測モデルを用いて、月別の需要予測を行う場合を説明したが、これに限定されるものではない。例えば、月別の需要予測結果を用いてスムージングを実行することにより、予測されていない期間の予測結果を推測することができる。 By the way, in Embodiment 1 and Embodiment 2, the case of performing monthly demand forecasting using the monthly forecasting model has been described, but the present invention is not limited to this. For example, by performing smoothing using the monthly demand forecast results, it is possible to estimate the forecast results for periods when no forecast is made.

For example, when the time granularity of the learning data (for example, four weeks) is coarser than the time granularity to be predicted (for example, weekly), the prediction processing unit 50 of the demand forecasting device 10 first obtains the prediction results and then divides them equally to match the desired time granularity. For instance, the prediction processing unit 50 divides each four-week predicted value into four, converting it into stepwise weekly predicted values.
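The equal division into a stepwise series can be sketched as below. The function name `split_evenly` and the sample figures in the usage note are hypothetical.

```python
def split_evenly(block_forecasts, parts=4):
    """Spread each coarse forecast evenly over `parts` finer periods."""
    return [value / parts for value in block_forecasts for _ in range(parts)]
```

For example, `split_evenly([400, 200])` turns two four-week forecasts into eight weekly values: four of 100.0 followed by four of 50.0.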

Next, the prediction processing unit 50 assumes that sales after release change over time according to a specific probability distribution. Since sales generally decay, a probability distribution such as the Weibull distribution, the log-normal distribution, or the log-logistic distribution is used.

そして、予測処理部５０は、複数の予測値全体にフィットするように、一般的な手法を用いて確率分布のパラメータを計算し、得られたパラメータを用いて、任意の時刻での新たな予測値（補間・補正した予測値）を計算する。つまり、時刻を代入すると新たな予測値が得られる。 Then, the prediction processing unit 50 calculates the parameters of the probability distribution using a general method so as to fit the entire plurality of prediction values, and uses the obtained parameters to make a new prediction at an arbitrary time. Calculate the value (interpolated/corrected predicted value). That is, substituting the time yields a new predicted value.
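One way to picture this fitting step is a curve of the form amplitude × Weibull density fitted to the predicted values by least squares. The specification only says that "a general method" is used, so the grid search below, the scaled-density model, and all function names are assumptions made for illustration. Times must be positive (for example, period midpoints).

```python
import math


def weibull_pdf(t, shape, scale):
    """Weibull probability density at t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def fit_scaled_weibull(times, values, shapes, scales):
    """Grid-search shape and scale; solve the amplitude in closed form."""
    best = None
    for shape in shapes:
        for scale in scales:
            g = [weibull_pdf(t, shape, scale) for t in times]
            denom = sum(x * x for x in g)
            if denom == 0.0:
                continue
            # Least-squares amplitude for this (shape, scale) pair.
            amplitude = sum(v * x for v, x in zip(values, g)) / denom
            error = sum((v - amplitude * x) ** 2 for v, x in zip(values, g))
            if best is None or error < best[0]:
                best = (error, amplitude, shape, scale)
    _, amplitude, shape, scale = best
    return amplitude, shape, scale


def smoothed_forecast(t, params):
    """Evaluate the fitted curve at an arbitrary time t > 0."""
    amplitude, shape, scale = params
    return amplitude * weibull_pdf(t, shape, scale)
```

Once the parameters are fitted, `smoothed_forecast` can be evaluated at any time, which corresponds to substituting a time to obtain a new predicted value.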

A specific example will now be described with reference to FIGS. 12 to 14. FIG. 12 is a diagram explaining an example of smoothing, and FIG. 13 is a diagram explaining another example of smoothing. FIG. 14 is a diagram explaining the smoothing result.

図１２では、スムージングを不規則変動の低減に用いる場合を説明する。例えば、週別の売上を独立に予測するため、予測結果に不規則変化（予測誤差）が含まれると予測結果がばらつく可能性がある。このため、複数週の予測結果を総合的に判断して予測値の補間や補正を行うことにより、不規則変化を抑えることができる。 FIG. 12 illustrates the case where smoothing is used to reduce random fluctuations. For example, since weekly sales are forecasted independently, forecast results may vary if irregular changes (forecast errors) are included in the forecast results. Therefore, irregular changes can be suppressed by interpolating or correcting predicted values by comprehensively judging prediction results for a plurality of weeks.

図１２に示す図は、週次の売上実績を目的変数として週次の予測モデルを学習し、学習済みの週次の予測モデルを用いて発売日から週次（各週）の売上予測を算出し、週次の予測値に対しスムージングを行った結果である。このようにすることで、予測値を補正することができるので、需要予測の精度を向上させることができる。なお、入力の予測値の単位は週次に限定されない。月次の予測値や４週分の予測値に対し同様の処理を行うことも可能である。 In the diagram shown in FIG. 12, a weekly forecast model is learned using the weekly sales performance as the objective variable, and the weekly (each week) sales forecast is calculated from the release date using the learned weekly forecast model. , is the result of smoothing the weekly predicted values. By doing so, the predicted value can be corrected, so the accuracy of the demand prediction can be improved. Note that the unit of the input predicted value is not limited to weekly. It is also possible to perform the same processing on the monthly predicted values and the predicted values for four weeks.

図１３では、学習データの時間粒度と予測の時間粒度のギャップ調整に用いる場合を説明する。例えば、予測モデルの学習に使用するデータの単位（例えば４週）と、予測が必要となる単位（例えば週次）にギャップがある場合がある。 FIG. 13 illustrates a case of adjusting the gap between the time granularity of learning data and the time granularity of prediction. For example, there may be a gap between the unit of data used for learning the prediction model (for example, 4 weeks) and the unit for which prediction is required (for example, weekly).

For example, weekly data may not exist because data is acquired and managed only in four-week units. Alternatively, weekly data may exist, but because the quantities are small there is too little data to build a good prediction model as-is, so the data must be aggregated into larger units (four weeks) for learning. In these cases, general methods yield prediction results only in the same unit as the learning data.

そこで、予測結果を補間および補正することにより、学習データとは異なる任意の時間粒度での予測結果を計算する。図１３では、４週分データで学習し得られた４週分の予測値に対してスムージングを行い、新たに週次の予測値が得られた結果である。つまり、４週分の予測値を４分割して、階段状の週次の予測値に一度変換したものを図示する。 Therefore, by interpolating and correcting the prediction result, the prediction result is calculated with an arbitrary time granularity different from that of the learning data. FIG. 13 shows the result of smoothing the prediction values for four weeks obtained by learning with the data for four weeks, and obtaining new prediction values for each week. In other words, the four weeks' worth of predicted values are divided into four and once converted into stepwise weekly predicted values.

このように、予測結果では得られない予測値を推定することができるので、学習データに依存することなく、需要予測を行うことができる。なお、週次に限らず、日次など任意の単位で算出を行うことができる。また、入力の予測値の単位は４週に限定されず、月次の予測値や週次の予測値に対し、同様の処理を行うこともできる。 In this way, a prediction value that cannot be obtained from the prediction result can be estimated, so demand prediction can be performed without depending on learning data. Note that the calculation can be performed in arbitrary units such as daily, not limited to weekly. Also, the unit of input predicted values is not limited to four weeks, and the same processing can be performed on monthly predicted values and weekly predicted values.

このようなスムージング手法を実施例１や実施例２に適用することで、図１４に示す結果を得ることができる。具体的には、月別の予測モデルから得られた月別の予測結果を用いて、週別の予測値を推定することができる。 By applying such a smoothing method to Example 1 and Example 2, the results shown in FIG. 14 can be obtained. Specifically, the forecast value for each week can be estimated using the forecast result for each month obtained from the forecast model for each month.

さて、これまで本発明の実施例について説明したが、本発明は上述した実施例以外にも、種々の異なる形態にて実施されてよいものである。 Although the embodiments of the present invention have been described so far, the present invention may be implemented in various different forms other than the embodiments described above.

［実施例の効果］ [Effect of Example]

The effects of the above embodiments will be described with reference to FIGS. 15 and 16. FIG. 15 is a diagram explaining the effects. FIG. 16 is a diagram explaining a comparison of effects. As shown in FIG. 15, the method according to the above embodiments performs machine learning using learning data in which the results of text mining and clustering on the aims, target segments, features, campaign information, and the like described in the proposal are set as explanatory variables, and product attributes such as past sales results are set as the objective variable.

このような学習により、上記実施例による手法は、新商品の初期需要と需要の推移を学習することができるとともに、累積の増加や需要傾向（パターン）をも学習することができる。すなわち、上記実施例による手法は、類似商品の需要パターンを要素の組合せに分解して学習することができる。 Through such learning, the method according to the above embodiment can learn not only the initial demand and the transition of demand for a new product, but also the cumulative increase and the trend (pattern) of demand. That is, the method according to the above embodiment can learn by decomposing the demand pattern of similar products into a combination of elements.

この結果、図１６に示すように、定数予測を行う一般手法Ａや、定型情報や数値情報のみを用いた機械学習を行う一般手法Ｂに比べて、誤差率を低減することができる。シミュレーションによれば、実施例による手法を用いることで、誤差率の中央値を１９．３％まで低減することができる。 As a result, as shown in FIG. 16, the error rate can be reduced compared to the general method A that performs constant prediction and the general method B that performs machine learning using only standard information and numerical information. According to simulations, the median error rate can be reduced to 19.3% by using the method according to the embodiment.

[Data, numerical values, etc.]
The numerical values, data examples, number of data items, label settings, and the like used in the above embodiment are merely examples and can be changed as desired. A keyword is one example of a feature word. An existing product is a past product; it may be a product whose sales have already ended or a product that is still on sale. Besides sales, the objective variable can also be, for example, the number of contracts for mobile phones. The above embodiment describes generating monthly prediction models from monthly sales information, but the method is not limited to this: daily, weekly, or yearly sales information can be used to generate prediction models at those granularities.
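Switching between daily, weekly, monthly, and yearly models mainly changes how the sales records are bucketed before learning. The following sketch shows one way to aggregate dated sales into each granularity. The function names, the key formats, and the use of ISO week numbering are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date


def period_key(day: date, granularity: str) -> str:
    """Map a calendar date to the label of the period it falls in."""
    if granularity == "daily":
        return day.isoformat()
    if granularity == "weekly":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if granularity == "monthly":
        return f"{day.year}-{day.month:02d}"
    if granularity == "yearly":
        return str(day.year)
    raise ValueError(f"unknown granularity: {granularity}")


def aggregate_sales(daily_sales: dict, granularity: str) -> dict:
    """Sum sales per period; each period's total becomes one objective value."""
    totals = defaultdict(int)
    for day, units in daily_sales.items():
        totals[period_key(day, granularity)] += units
    return dict(totals)
```

One prediction model would then be learned per period label at the chosen granularity, in the same way as the monthly models of the embodiment.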

Although an example using item a and item b of the proposal data has been described, the method is not limited to this: one or more items can be used, and the entire proposal can also be used as a single item. Documents other than proposals, such as product manuals, can also be used.

[System]
The processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed as desired unless otherwise specified.

Each component of each illustrated device is functionally conceptual and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to what is illustrated. All or part of the components can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the learning processing unit 40 and the prediction processing unit 50 can be implemented in separate devices.

Furthermore, all or any part of each processing function performed by each device can be implemented by a CPU and a program analyzed and executed by that CPU, or as hardware using wired logic.

[Hardware]
FIG. 17 is a diagram illustrating a hardware configuration example. As shown in FIG. 17, the demand forecasting device 10 has a communication device 10a, an HDD (hard disk drive) 10b, a memory 10c, and a processor 10d. The units shown in FIG. 17 are interconnected by a bus or the like.

The communication device 10a is a network interface card or the like and communicates with other servers. The HDD 10b stores the programs and DBs that operate the functions shown in FIG. 2.

The processor 10d reads, from the HDD 10b or the like, a program that executes the same processing as each processing unit shown in FIG. 2 and loads it into the memory 10c, thereby running a process that executes each function described with reference to FIG. 2 and other figures. That is, this process executes the same functions as the processing units of the demand forecasting device 10. Specifically, the processor 10d reads a program having the same functions as the learning processing unit 40, the prediction processing unit 50, and the like from the HDD 10b or the like. The processor 10d then runs a process that executes the same processing as the learning processing unit 40, the prediction processing unit 50, and the like.

In this way, the demand forecasting device 10 operates as an information processing device that executes the demand forecasting method by reading and executing the program. The demand forecasting device 10 can also realize the same functions as the above embodiment by reading the program from a recording medium with a medium reading device and executing the program thus read. The program referred to in this other embodiment is not limited to being executed by the demand forecasting device 10. For example, the present invention can be applied in the same way when another computer or server executes the program, or when such devices execute the program in cooperation.

10 Demand forecasting device
11 Communication unit
12 Storage unit
13 Proposal DB
14 Sales information DB
15 Monthly sales information DB
16 Text information DB
17 Weight information DB
18 Cluster DB
19 Learning data DB
20 Learning result DB
21 Prediction result DB
30 Control unit
40 Learning processing unit
41 Word extraction unit
42 Weight calculation unit
43 Selection unit
44 Clustering unit
45 Learning data generation unit
46 Learning unit
50 Prediction processing unit

Claims

1. A demand forecasting method in which a computer executes a process comprising:
extracting, based on preset conditions, feature words indicating attributes of each product from each document describing the attributes of an existing product whose sales have started or a new product whose sales have not started;
generating, from the appearance frequency of the feature words contained in each of the documents, clustering information indicating, for each product, a combination of degrees to which the product has the feature words; and
learning a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales performance of the existing product is set as an objective variable.

2. The demand forecasting method according to claim 1, wherein
the extracting extracts the feature words from each document corresponding to each of a plurality of existing products and from the document corresponding to the new product,
the generating clusters the plurality of existing products and the new product using the feature words extracted from each to generate the clustering information, and
the learning generates a plurality of pieces of learning data using, of the clustering information, the clustering information corresponding to each of the plurality of existing products, and learns the prediction model using the plurality of pieces of learning data.

3. The demand forecasting method according to claim 2, wherein the computer further executes a process of inputting, of the clustering information, the clustering information corresponding to the new product into the learned prediction model, and acquiring the output of the learned prediction model as a demand forecast for the new product.

4. The demand forecasting method according to claim 1, wherein
the extracting extracts the feature words from each document corresponding to each of a plurality of existing products,
the generating clusters the plurality of existing products using the feature words extracted from each to generate the clustering information, and
the learning generates a plurality of pieces of learning data using the clustering information of the plurality of existing products, and learns the prediction model using the plurality of pieces of learning data.

5. The demand forecasting method according to claim 4, wherein the computer further executes a process comprising:
extracting the feature words from the document corresponding to the new product;
calculating a degree of similarity between the new product and each of the plurality of existing products, using the feature words of the new product and the feature words of each of the plurality of existing products;
associating, using the degree of similarity, the new product with each cluster into which the plurality of existing products are clustered; and
inputting the result of associating the new product with each cluster into the learned prediction model, and acquiring the output of the learned prediction model as a demand forecast for the new product.

6. The demand forecasting method according to claim 1, wherein the learning learns, for each predetermined period, a prediction model that predicts demand for that period, using learning data in which the clustering information is set as an explanatory variable and the sales performance of the existing product for that predetermined period is set as an objective variable.

7. The demand forecasting method according to claim 6, wherein the computer further executes a process of inputting the clustering information corresponding to the new product into each learned prediction model, and acquiring the output of each learned prediction model as a demand forecast for the new product for each predetermined period.

8. The demand forecasting method according to claim 7, wherein the computer further executes a process of performing smoothing using the demand forecast results for the new product predicted for each of the predetermined periods, and interpolating the forecast result for each predetermined period.

9. A demand forecasting program that causes a computer to execute a process comprising:
extracting, based on preset conditions, feature words indicating attributes of each product from each document describing the attributes of an existing product whose sales have started or a new product whose sales have not started;
generating, from the appearance frequency of the feature words contained in each of the documents, clustering information indicating, for each product, a combination of degrees to which the product has the feature words; and
learning a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales performance of the existing product is set as an objective variable.

10. A demand forecasting device comprising:
an extraction unit that extracts, based on preset conditions, feature words indicating attributes of each product from each document describing the attributes of an existing product whose sales have started or a new product whose sales have not started;
a generation unit that generates, from the appearance frequency of the feature words contained in each of the documents, clustering information indicating, for each product, a combination of degrees to which the product has the feature words; and
a learning unit that learns a prediction model that predicts demand for the new product, using learning data in which the generated clustering information is set as an explanatory variable and the sales performance of the existing product is set as an objective variable.