JP6201077B1 - Investigation data processing apparatus and investigation data processing method

Info

Publication number: JP6201077B1
Application number: JP2017051688A
Authority: JP
Inventors: 玄田村; 伊佐片柳
Original assignee: Video Research Co Ltd
Current assignee: Video Research Co Ltd
Priority date: 2017-03-16
Filing date: 2017-03-16
Publication date: 2017-09-20
Anticipated expiration: 2037-03-16
Also published as: JP2018156299A

Abstract

[Problem] To provide a data processing service that accurately and efficiently fuses the response contents of survey data from a first survey and a second survey conducted on mutually different monitors (respondents), even when one or both sets of survey data change dynamically, so that the result can be used effectively as fused data. [Solution] New first survey data is identified among the first survey data (S201). A function is applied to the common items of both the new first survey data and the second survey data to calculate a group of distance-calculation scores (S202). The score groups are compared, and a distance indicating the degree of similarity between each monitor A' of the new first survey data and each monitor B of the second survey data is calculated (S203). For each monitor A' of the new first survey data, a monitor with a short total distance is identified among the monitors B of the second survey data, and the two are regarded as the same monitor and fused (S204). The fused data is saved (S205). [Selected drawing] FIG. 12

Description

The present invention relates to a survey data processing apparatus and a survey data processing method, and more particularly to an apparatus and a method for processing survey data indicating the results of a first survey and a second survey conducted on mutually different monitors.

Conducting various surveys such as questionnaire surveys for purposes such as media planning, and processing the survey data that records their results, is common practice across industries. Each company, for example, conducts its own surveys on its products and services and extracts useful information from the results. Some surveys, moreover, aim to collect multifaceted information such as purchasing, advertisement contact, and lifestyle from the same monitors. The results of such surveys are called "single-source data" and are a useful source of information when analyzing, on an individual basis, the correlation between advertising media and whether a product was purchased.

Meanwhile, techniques for fusing multiple databases have long been known. Such a technique is called data fusion and is used to overcome the drawback that acquiring single-source data is very costly.

Explained with reference to the conceptual diagram of data fusion in FIG. 15, two sets of survey data A and B acquired independently of each other are fused on the basis of the similarity between the monitors who answered each survey. That is, monitors whose answers to common items 1 to n are highly similar are regarded as the same monitor. This makes it possible to attach, to the record showing a given monitor's answers in survey data A (specifically, the answers to common items 1 to n and unique items A1 to An), the record showing the answers of the monitor in survey data B who is equated with that monitor (specifically, the answers to unique items B1 to Bn). As a result, fused data is generated in which the answers to unique items B1 to Bn, which were not recorded in survey data A, are added to the contents of survey data A.
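The fusion concept described above can be illustrated with a minimal Python sketch. The record layout (`common`, `unique_a`, `unique_b`), the Euclidean distance, and the sample values are assumptions made for illustration only; the text does not prescribe a particular similarity measure at this point.

```python
from math import sqrt

def distance(a, b):
    """Euclidean distance between two equal-length lists of common-item answers."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuse(survey_a, survey_b):
    """For each record in A, attach the unique items of the most similar record in B."""
    fused = []
    for rec_a in survey_a:
        # The B monitor with the smallest distance is regarded as the same monitor.
        donor = min(survey_b, key=lambda rec_b: distance(rec_a["common"], rec_b["common"]))
        fused.append({
            "id": rec_a["id"],
            "common": rec_a["common"],
            "unique_a": rec_a["unique_a"],
            "unique_b": donor["unique_b"],  # answers A never collected
            "donor_id": donor["id"],
        })
    return fused

# Hypothetical data: common items are (sex code, age).
survey_a = [
    {"id": "A1", "common": [1, 30], "unique_a": ["watches news"]},
    {"id": "A2", "common": [2, 55], "unique_a": ["watches drama"]},
]
survey_b = [
    {"id": "B1", "common": [2, 52], "unique_b": ["buys tea"]},
    {"id": "B2", "common": [1, 33], "unique_b": ["buys coffee"]},
]
fused = fuse(survey_a, survey_b)
```

Here A1 is paired with B2 and A2 with B1, so each record of survey data A gains answers to items that only survey B asked.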

As described above, data fusion makes it possible to turn even the results of surveys designed with a relatively small number of questions into single-source data. By using the pseudo single-source data obtained in this way, information useful for media planning and the like can be obtained at lower cost.

Data fusion methods include "unconstrained statistical matching" and "constrained statistical matching". Unconstrained statistical matching simply fuses data on the basis of the similarity between monitors. Because the processing is simple, it has the advantage that large volumes of data can be processed efficiently. Constrained statistical matching, on the other hand, fuses data under a constraint that seeks to preserve the mean and variance of the original data, so it has the advantage that the information held in the original data is carried over to the fused data more accurately.
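The difference between the two approaches can be sketched with a toy example. The sketch below is a simplified stand-in, not the matching algorithm of the patent: it assumes equally sized recipient and donor sets and models the constraint as "each donor is used exactly once", which trivially preserves the donor-side mean and variance. The distance matrix and donor values are hypothetical.

```python
from itertools import permutations
from statistics import mean

def unconstrained(dist):
    """Each recipient independently takes its nearest donor; donors may repeat."""
    return [min(range(len(row)), key=row.__getitem__) for row in dist]

def one_to_one(dist):
    """Brute-force the one-to-one assignment with minimal total distance."""
    n = len(dist)
    best = min(permutations(range(n)),
               key=lambda p: sum(dist[i][p[i]] for i in range(n)))
    return list(best)

# dist[i][j]: distance between recipient i and donor j (hypothetical values).
dist = [
    [1.0, 4.0, 6.0],
    [2.0, 5.0, 6.5],
    [3.0, 1.5, 9.0],
]
donor_values = [10, 20, 60]  # one donor-only item, e.g. weekly purchases

free = unconstrained(dist)   # donor 0 is reused, donor 2 is never used
tied = one_to_one(dist)      # every donor is used exactly once

free_mean = mean(donor_values[j] for j in free)
tied_mean = mean(donor_values[j] for j in tied)
```

In this toy case the unconstrained result drifts away from the donor mean of 30 because one donor is reused, while the one-to-one result reproduces it. Brute force over permutations is only practical for very small inputs.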

Constrained statistical matching, however, usually requires a long computation time for data fusion. In recent years, therefore, a technique called "distance-priority constrained statistical matching" has been developed, which shortens the computation time and has been verified to achieve nearly the same accuracy as the constrained method (see Patent Document 1).

Japanese Patent No. 4338486

As explained above, data fusion is an effective way to obtain single-source data from the results of surveys designed with a relatively small number of questions.
That is, by making effective use of the fused data obtained by fusing the results of multiple surveys, aggregations that were impossible before fusion become possible. Furthermore, if the series of processes involved is provided as a service by an ASP (Application Service Provider) or the like, it will benefit companies that want to use those survey results.

However, when one or both sets of survey data change dynamically, records that could be fused at the time data fusion was performed may later disappear from the survey data, and new records that did not exist at that time may later appear.
One example of such dynamically changing survey data is panel survey data subject to rotation. Specifically, when a panel survey studies the same subjects for a long time, the effects of survey habituation and learning may appear, and the sample may cease to track changes in the population. To avoid this, some panel surveys successively replace some or all of the monitors after a certain period (rotation, such as total or partial replacement). Besides such periodic rotation, monitors may also be replaced irregularly, with new monitors added to replenish those who have dropped out. Furthermore, dynamically changing survey data is not limited to survey data such as questionnaires: operation log data that can be acquired automatically from various devices, web logs, purchase histories, application usage histories, and large volumes of constantly changing customer data (so-called big data) can also be regarded as dynamically changing survey data.

As described above, various data fusion methods exist, such as "unconstrained statistical matching", "constrained statistical matching", and "distance-priority constrained statistical matching". Conventionally, however, for survey data in which one or both sets change dynamically, it has not been possible to accurately and efficiently fuse new records that did not exist at the time data fusion was performed.

The present invention has been made in view of the above problem. Its object is to realize a survey data processing apparatus and a survey data processing method capable of providing a data processing service that accurately and efficiently fuses the answers in the survey data of a first survey and a second survey conducted on mutually different monitors, even when one or both sets of survey data change dynamically, so that the fused data can be used more effectively.

According to the survey data processing apparatus of the present invention, the above problem is solved by a survey data processing apparatus that processes survey data indicating the results of a first survey and a second survey conducted on mutually different monitors, the apparatus comprising: a first survey data acquisition unit that acquires first survey data in which answers to common items included in both the first survey and the second survey are collected for the number of first survey monitors who answered the first survey; a second survey data acquisition unit that acquires second survey data in which answers to the common items and to second-survey-unique items included only in the second survey are collected for the number of second survey monitors who answered the second survey; and a processing execution unit that executes data fusion processing for fusing the first survey data acquired by the first survey data acquisition unit with the second survey data acquired by the second survey data acquisition unit to generate fused data indicating the answers of both the first survey data and the second survey data, wherein the data fusion processing includes: calculation processing that calculates values relating to the answers to the common items using a predetermined arithmetic expression and, using the result of comparing those values between the first survey monitors and the second survey monitors, calculates the degree of similarity of the answers to the common items; and allocation processing that, in an allocation pattern set on the basis of the calculated degree of similarity, allocates to a first survey monitor the same answers as a second survey monitor's answers to the second-survey-unique items, and wherein, when both or one of the first survey monitors and the second survey monitors have changed, first survey monitors and second survey monitors that have already been allocated by the allocation processing and whose allocation partners still exist at that time retain their allocation partners unchanged, and the calculation processing and the allocation processing are executed only for new first survey monitors or new second survey monitors.

According to the survey data processing method of the present invention, the above problem is also solved by a survey data processing method in which a computer processes survey data indicating the results of a first survey and a second survey conducted on mutually different monitors, wherein the computer executes: a step of acquiring first survey data in which answers to common items included in both the first survey and the second survey are collected for the number of first survey monitors who answered the first survey; a step of acquiring second survey data in which answers to the common items and to second-survey-unique items included only in the second survey are collected for the number of second survey monitors who answered the second survey; and a step of executing data fusion processing for fusing the acquired first survey data with the acquired second survey data to generate fused data indicating the answers of both the first survey data and the second survey data, wherein the data fusion processing includes: calculation processing that calculates values relating to the answers to the common items using a predetermined arithmetic expression and, using the result of comparing those values between the first survey monitors and the second survey monitors, calculates the degree of similarity of the answers to the common items; and allocation processing that, in an allocation pattern set on the basis of the calculated degree of similarity, allocates to a first survey monitor the same answers as a second survey monitor's answers to the second-survey-unique items, and wherein, when both or one of the first survey monitors and the second survey monitors have changed, first survey monitors and second survey monitors that have already been allocated by the allocation processing and whose allocation partners still exist at that time retain their allocation partners unchanged, and the calculation processing and the allocation processing are executed only for new first survey monitors or new second survey monitors.

In the survey data processing apparatus or survey data processing method of the present invention configured as described above, when both or one of the first survey monitors and the second survey monitors have changed, first survey monitors and second survey monitors that have already been allocated by the allocation processing and whose allocation partners still exist at that time retain their allocation partners unchanged, and the calculation processing and the allocation processing are executed anew only for new first survey monitors or new second survey monitors.
As a result, even when one or both sets of survey data from the first survey and the second survey conducted on mutually different monitors change dynamically, the fused data that has already been produced can be used as it is, and the answers can continue to be fused accurately and efficiently. By analyzing the information in the fused data, it then becomes possible to obtain information that could not be obtained before fusion and to draw up effective media plans based on that information.
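The carry-over behavior described above can be sketched as follows. This is an illustrative simplification under stated assumptions: each monitor is reduced to a single hypothetical scalar score, similarity is the absolute score difference, and a first survey monitor whose previous partner has disappeared is simply re-matched (the text above does not spell out that case). The function and variable names are invented for the example.

```python
def nearest(score, donors):
    """Return the id of the donor whose score is closest to `score`."""
    return min(donors, key=lambda donor_id: abs(donors[donor_id] - score))

def update_fusion(prev_assign, recipients, donors):
    """Carry over still-valid pairs; compute matches only where needed.

    prev_assign: {recipient_id: donor_id} from the previous fusion
    recipients, donors: {id: score} for the monitors present now
    """
    assign, recomputed = {}, []
    for rid, score in recipients.items():
        old = prev_assign.get(rid)
        if old is not None and old in donors:
            assign[rid] = old                      # partner still exists: keep
        else:
            assign[rid] = nearest(score, donors)   # new monitor: calculate
            recomputed.append(rid)
    return assign, recomputed

# Time t: A1 and A2 were fused with B1 and B2.
prev_assign = {"A1": "B1", "A2": "B2"}
# Time t+1: A1 has left the panel and A3 has joined.
recipients = {"A2": 0.3, "A3": 0.2}
donors = {"B1": 0.1, "B2": 0.8}

assign, recomputed = update_fusion(prev_assign, recipients, donors)
```

Note that A2 keeps B2 even though B1 would now be closer; only A3 goes through distance calculation and allocation, which is what keeps the existing fused data stable and the update cheap.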

As a preferred configuration of the above survey data processing apparatus, the predetermined arithmetic expression used in the calculation processing preferably does not change before and after both or one of the first survey monitors and the second survey monitors change.

With this configuration, when new survey data is fused, the predetermined arithmetic expression (a function such as fusion parameters) used to calculate the values relating to the answers to the common items (the distance-calculation scores) is not recomputed; the expression already in use is reused. This simplifies the processing, so that even a large volume of new survey data can be processed quickly.
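A small sketch of this reuse follows. It assumes, purely for illustration, that the "predetermined arithmetic expression" is a per-item standardization whose parameters (mean and standard deviation) are estimated once at time t; the source does not specify the actual form of the expression.

```python
from statistics import mean, pstdev

def fit_params(rows):
    """Estimate per-item (mean, standard deviation) once, at fusion time t."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]

def scores(row, params):
    """Apply the stored expression to one monitor's common-item answers."""
    return [(x - m) / s for x, (m, s) in zip(row, params)]

# Common-item answers (age, viewing-frequency code) available at time t.
rows_t = [[20, 1], [40, 3], [60, 5]]
params = fit_params(rows_t)           # computed once and stored

# A new monitor arriving at time t+1 is scored with the stored parameters;
# fit_params is not called again.
new_row = [50, 2]
new_scores = scores(new_row, params)
```

Because the parameters are frozen, scores for new monitors stay on the same scale as the scores that produced the existing pairs, and scoring each new record costs only a pass over its items.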

As another preferred configuration of the above survey data processing apparatus, when the survey period for the common items differs before and after both or one of the first survey monitors and the second survey monitors change, the processing execution unit preferably substitutes, in the calculation processing and the allocation processing, the answers to an item or period approximating the common items, or a statistic of those answers, for the answers to the common items.

With this configuration, even when new survey data does not hold exactly the same common items as the fused data because the survey period for the common items differs (for example, because a common item depends on the point in time), substituting the answers to an approximating item or period, or a statistic of those answers such as the mean or a representative value, for the answers to the common items allows the new survey data to be fused just as if it held the same common items.
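One way to picture this substitution is the sketch below. The period numbering, the `window` parameter, and the choice of the mean as the statistic are assumptions for illustration; the text allows any approximating item or period and any suitable statistic.

```python
from statistics import mean

def common_item_answer(answers, target_period, window=1):
    """Return the answer for target_period, or the mean of nearby periods as a stand-in.

    answers: {period number: answer value} for one monitor and one common item
    """
    if target_period in answers:
        return answers[target_period]
    nearby = [v for p, v in answers.items() if abs(p - target_period) <= window]
    if not nearby:
        raise KeyError(f"no answer within {window} period(s) of {target_period}")
    return mean(nearby)

# Hypothetical weekly viewing minutes; week 3 was never surveyed for this monitor.
viewing_minutes = {1: 120, 2: 90, 4: 150}

exact = common_item_answer(viewing_minutes, 2)      # answer exists
stand_in = common_item_answer(viewing_minutes, 3)   # mean of weeks 2 and 4
```

If no answer lies within the window, the function raises an error rather than guessing, so the caller can decide whether to widen the window or skip the item.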

As another preferred configuration of the above survey data processing apparatus, before both or one of the first survey monitors and the second survey monitors change, the processing execution unit preferably sets, in the allocation processing and according to a statistical solution method, an allocation pattern such that the aggregate results for the first survey agree between the first survey data and the fused data, and the aggregate results for the second survey agree between the second survey data and the fused data.

With this configuration, before both or one of the first survey monitors and the second survey monitors change, so-called "constrained statistical matching" is adopted, and the fusion processing is performed so that the aggregate results in the survey data before fusion agree with those in the survey data after fusion (the fused data). This preserves the mean and variance of the first survey data and the second survey data, so the information held in each set of survey data is carried over to the fused data more accurately.

As another preferred configuration of the above survey data processing apparatus, the first survey data preferably includes answers to first-survey-unique items included only in the first survey.

With this configuration, because the first survey data also includes, in addition to the common items, answers to first-survey-unique items included only in the first survey, richer fused data can be obtained.

As another preferred configuration of the above survey data processing apparatus, the first survey is preferably a survey on media contact, and the second survey is preferably a questionnaire survey that captures consumer attributes, product involvement, and media contact from multiple angles.

With this configuration, one set of survey data comes from a survey on media contact, such as the viewing of television and radio programs (including commercials), the browsing of information available on the Internet (websites and the like) and the viewing of video and music via mobile phones, smartphones, tablets, PCs, and the like, and the reading of newspapers and magazines, while the other comes from a questionnaire survey that captures consumer attributes, product involvement, media contact, and the like from multiple angles. This makes it possible to grasp the correlations between them and, based on those correlations, to select the advertising slots (broadcast times, television programs, and so on) that yield the greatest advertising effect, so the fused data can be used more effectively.

According to the present invention, it is possible to realize a survey data processing apparatus and a survey data processing method capable of providing a data processing service that accurately and efficiently fuses the answers in the survey data of a first survey and a second survey conducted on mutually different monitors, even when one or both sets of survey data change dynamically, so that the fused data can be used more effectively.
By analyzing the information in the fused data, it thus becomes possible to obtain information that could not be obtained before fusion and to draw up effective media plans based on that information. More specifically, it becomes possible, for example, to grasp the correlation between an item surveyed in one survey (for example, whether television was watched) and items surveyed in another survey (for example, living conditions and consumer attitudes), and, based on that correlation, to select the advertising slots (broadcast times, television programs, and so on) that yield the greatest advertising effect.

FIG. 1 is an explanatory diagram of the data processing service according to the present invention.
FIG. 2 is a diagram showing a communication system including the survey data processing apparatus of the present invention.
FIG. 3 shows, in (A), the answer results for the first survey and, in (B), the answer results for the second survey.
FIG. 4 is a diagram showing the hardware configuration of the survey data processing apparatus.
FIG. 5 is a diagram showing the configuration of the survey data processing apparatus from a functional perspective.
FIG. 6 is a diagram conceptually showing the contents of the data fusion processing.
FIG. 7 is a diagram showing the general flow of the series of processes involved in the data processing service.
FIG. 8 is a diagram showing the flow of the data fusion processing at time t.
FIG. 9 is an explanatory diagram of data fusion processing performed using "unconstrained statistical matching".
FIG. 10 is an explanatory diagram of data fusion processing performed using "constrained statistical matching".
FIG. 11 is a diagram conceptually showing a state in which the first survey data has changed.
FIG. 12 is a diagram showing the flow of the data fusion processing at time t+1, after fusion has been performed.
FIG. 13 is a diagram conceptually showing a state in which the first survey data has changed.
FIG. 14 is a diagram conceptually showing a state in which the first survey data has changed after the second survey data changed.
FIG. 15 is a diagram showing the concept of data fusion.

An embodiment of the present invention (hereinafter, "this embodiment") is described below with reference to the drawings.
This embodiment is described using a case in which an audience rating survey, an example of a survey on media contact, is the first survey, and a questionnaire survey that captures consumer attributes, product involvement, media contact, and the like from multiple angles is the second survey. In the relationship between the two surveys, the audience rating survey is treated here as primary and the questionnaire survey as secondary.
In the following description, "single-source data" means data indicating the answers to a relatively large number of questions (for example, several hundred) collected from the same monitor (respondent), and is data that captures multifaceted information such as purchasing, advertisement contact, and lifestyle. Specifically, the information in this data includes answers to questions about the monitor's attributes, namely demographic information (population-statistical attributes) and psychographic information (psychological attributes).
A survey on media contact is not limited to a survey on the viewing of television programs (including commercials) as in this embodiment. It may also cover the listening to radio programs and the like (including commercials), the browsing of information available on the Internet (websites and the like) and the viewing of video and music via mobile phones, smartphones, tablets, PCs, and the like, and the reading of newspapers and magazines.

<< About the Survey Data Processing Service >>
First, the survey data processing service realized by the present invention is described with reference to FIG. 1. FIG. 1 is an explanatory diagram of the survey data processing service.

The survey data processing service is provided by a survey company that manages the survey data processing apparatus of the present invention. It is a service that executes a series of data processing operations for fusing the survey data indicating the results of multiple surveys conducted on mutually different monitors (in this embodiment, two surveys: the first survey and the second survey).

Specifically, the survey company conducts an audience rating survey as the first survey. This survey examines the viewing of television programs and the like among randomly selected monitors. Once the survey company has collected viewing information from each monitor, it aggregates that information and stores it as the first survey data.
There are two types of audience rating, the household rating and the individual rating. This embodiment uses the individual rating. The invention is not limited to this, however, and can also be applied to the household rating, in which the survey unit is the household.
The individual rating takes each member of a household as the survey unit and indicates who, among all family members of a given age or older (for example, 4 years or older) in the household, watched which television programs and how much. As in this embodiment, it is used when viewers are to be divided by sex and other attributes in order to learn how much individuals with particular attributes watched. The household rating, on the other hand, takes the household as the survey unit and indicates the proportion of television-owning households that watched a television program.

The survey company also conducts a questionnaire survey as the second survey. This survey asks a plurality of questions of a plurality of monitors. Once the survey company has collected answers to all questions from each monitor, it aggregates the answers for the full number of monitors and stores them as the second survey data.
The questionnaire survey examines the same monitor from three viewpoints (consumer attributes, product involvement, and media contact) in order to capture consumers from multiple angles, so a relatively large number of questions is put to each monitor. Specifically, there are on the order of several hundred questions, covering the demographic and psychographic attributes of the monitor. In other words, the survey data indicating the results of the questionnaire survey conducted as the second survey corresponds to single-source data.
The monitors of the questionnaire survey are different from the monitors of the audience rating survey. The survey company may freely set the number of monitors asked to answer the questionnaire, but it is desirable to secure enough monitors to collect sufficient information as single-source data.

The survey company fuses the first survey data acquired in the audience rating survey (the first survey) with the second survey data acquired as single-source data in the questionnaire survey (the second survey). As a result, fusion data combining the two sets of survey data is generated.
As described above, by executing the data fusion process the survey company can consolidate the results of the first and second surveys into fusion data, which enables tabulations that are impossible with the data before fusion. The fusion data can also be provided to client companies such as manufacturers. A client company that receives it can analyze the information it contains and draw up effective media planning based on that information.

<< About the System for Providing the Survey Data Processing Service >>
Next, the system configuration for providing the data processing service described above will be described with reference to FIG. 2. FIG. 2 shows a communication system that includes the survey data processing apparatus of the present invention.
To provide the data processing service, the survey company owns a computer, more precisely a server computer (hereinafter, server 1). The server 1 corresponds to the survey data processing apparatus of the present invention and provides the data processing service as an ASP service.

To outline the functions of the server 1: as shown in FIG. 2, the server 1 is communicably connected, via an information communication network N1 such as the Internet, to each household covered by the audience rating survey.

Specifically, the survey company distributes a measuring device 21 to each selected household. The measuring device 21 is installed in the home of the selected household (strictly speaking, near where the television is installed) and is used to measure the viewing status of television programs, treating each member of the household as one survey subject. The data generated by the measuring device 21, which indicates the viewing status of television programs, is transmitted to external equipment over the information communication network N1. This data (hereinafter, audience rating data) indicates which survey subject watched which television program, when, and for how long.
Incidentally, in this embodiment the members of a household (that is, individuals) are the survey subjects, but the household itself may instead be treated as the survey subject.

As shown in FIG. 2, the server 1 is also communicably connected, via an information communication network N2 such as the Internet, to answering terminals 31 held by the monitors. By communicating with each answering terminal 31, the server 1 receives each monitor's answer data for each question of the questionnaire survey (hereinafter, individual answer data).

Here, the answering terminal 31 is a tablet-type communication terminal that the survey company distributes to each monitor asked to answer the questionnaire survey. The monitor reads each question on the touch panel of the answering terminal 31 and answers through touch operations on the panel. When the answering terminal 31 accepts an answer operation from the monitor, it generates individual answer data indicating the content of the answer and transmits that data to the server 1.
The answering terminal 31 is not limited to a tablet-type terminal and may be another communication terminal such as a smartphone or a notebook PC. Instead of using the answering terminal 31, the survey company may distribute paper questionnaires to the monitors and itself convert the contents of the collected, completed questionnaires into data.

When the server 1 has received from each answering terminal 31 a number of individual answer data items corresponding to the prescribed number of questions, it consolidates the information they indicate into the answer results for all monitors and stores the consolidated data, that is, the second survey data, in the server 1.

The server 1 also executes a process that fuses the first survey data and the second survey data stored in the server 1 to generate fusion data, that is, a data fusion process. Here, generating fusion data is not limited to producing data separate from the two original sets of survey data. It also includes producing fusion data by updating one of the two original sets so that it incorporates the other.
The data fusion process and the fusion data are described in detail in later sections.

Furthermore, when terminals operated by staff of the survey company or client company terminals (none of which are shown) are communicably connected to the server 1, the server 1 can, on receiving a data distribution request that such a terminal generates in response to an input operation, distribute the generated fusion data to the terminal that issued the request.

<< About the Survey Data >>
Next, the first survey data from the audience rating survey and the second survey data from the questionnaire survey will be described with reference to FIG. 3.
FIG. 3(A) shows an example of the results of the audience rating survey, which is the first survey. FIG. 3(B) shows an example of the results of the questionnaire survey, which is the second survey.

The first survey data D1, which tabulates the audience rating data indicating the results of the audience rating survey, contains information on monitor attributes such as gender, that is, demographic information, and information such as whether television X was watched. Each item is binarized as "0" or "1". However, the invention is not limited to this, and items may be entered in some way other than as "0" or "1".

In contrast, the second survey data D2, which tabulates the individual answer data indicating the results of the questionnaire survey, is single-source data as described above: it aggregates individual answer data for several hundred questions collected from, for example, more than 10,000 monitors.
For example, in addition to information on monitor attributes such as gender (demographic information) and information such as whether television X was watched, it contains psychographic (psychological attribute) information such as whether beer A was purchased. In the second survey data D2 as well, each item is binarized as "0" or "1", but the invention is not limited to this, and items may be entered in some way other than as "0" or "1".

As shown in FIGS. 3(A) and 3(B), the first survey data D1 and the second survey data D2 contain items common to both surveys (hereinafter, common items). The common items serve as the key in the later data fusion process. In the case shown in FIG. 3, the items on monitor attributes (gender and so on) and the items on television viewing status (whether television X was watched and so on) are common items.
The second survey data D2, on the other hand, also contains unique items that appear only in the second survey and not in the first. These items are set specially in the questionnaire survey, which the survey company conducts to grasp social trends by examining the same monitor from the three viewpoints of consumer attributes, product involvement, and media contact and thereby capturing consumers from multiple angles. For example, the item asking whether beer A was purchased, shown in FIG. 3(B), is a unique item.
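As a rough illustration of the table layout described above, the following Python sketch holds miniature versions of D1 and D2 as dictionaries and separates the common items from the unique items. The monitor IDs, item names, and values are hypothetical and are not taken from FIG. 3.

```python
# Hypothetical miniature survey tables: monitor ID -> {item name: 0 or 1}.
D1 = {
    "A1": {"male": 1, "watched_tv_x": 1},
    "A2": {"male": 0, "watched_tv_x": 0},
}
D2 = {
    "B1": {"male": 1, "watched_tv_x": 0, "bought_beer_a": 1},
    "B2": {"male": 0, "watched_tv_x": 0, "bought_beer_a": 0},
}


def item_names(table):
    """Collect every item name that appears in a survey table."""
    names = set()
    for answers in table.values():
        names.update(answers)
    return names


def split_items(first, second):
    """Return (common items, items unique to the second survey), both sorted."""
    first_items = item_names(first)
    second_items = item_names(second)
    common = sorted(first_items & second_items)
    unique = sorted(second_items - first_items)
    return common, unique


common_items, unique_items = split_items(D1, D2)
```

Here the set intersection plays the role of the common items used as the fusion key, and the set difference gives the items that only the questionnaire survey contains.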

<< About the Server Configuration >>
Next, the configuration of the server 1 will be described with reference to FIG. 4. FIG. 4 shows the hardware configuration of the server 1.
As shown in FIG. 4, the server 1 has as components a CPU 1a, a ROM 1b, a RAM 1c, a communication interface 1d (labeled "communication I/F" in FIG. 4), a hard disk drive 1e (labeled "HDD" in FIG. 4), an input device 1f such as a mouse and keyboard, and an output device 1g such as a display and printer. A program for making the server 1 perform its functions (hereinafter, the data processing program) is installed on the server 1 in advance. The data processing service of the server 1 is provided when the CPU 1a reads and executes this data processing program.

The hardware configuration of the server 1 is as described above. Below, the configuration of the server 1 is described again from a functional standpoint with reference to FIG. 5. FIG. 5 shows the functional configuration of the server 1.
As shown in FIG. 5, the server 1 has a data receiving unit 11, a data aggregation unit 12, a data storage unit 13, a process execution unit 14, and a data distribution unit 15. These units carry out the various processes executed by the server 1 and are realized by the hardware components of the server 1 working together with the data processing program. Each functional unit is described below.

The data receiving unit 11 communicates with devices connected to the server 1 via the information communication networks N1 and N2 and receives data from them. For example, it receives audience rating data from the measuring device 21 installed in each household and individual answer data from the answering terminals 31.

When the data receiving unit 11 receives audience rating data or individual answer data, the data aggregation unit 12 analyzes that data to identify the information it indicates and compiles the information into the table format illustrated in FIG. 3. That is, the data aggregation unit 12 aggregates the audience rating data transmitted from the measuring devices 21 installed in the households into the first survey data D1, covering all monitors and all items. It likewise aggregates the individual answer data transmitted from the answering terminals 31 into the second survey data D2, covering all monitors and all items.
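The aggregation step can be pictured with a minimal Python sketch. It assumes, purely for illustration, that each received piece of data arrives as a `(monitor_id, item, value)` tuple. The source does not specify the wire format, and the function name `aggregate` is hypothetical.

```python
def aggregate(records):
    """Compile (monitor_id, item, value) records into a table:
    monitor_id -> {item: value}, one row per monitor."""
    table = {}
    for monitor_id, item, value in records:
        # Create the monitor's row on first sight, then fill in the item.
        table.setdefault(monitor_id, {})[item] = value
    return table


# Hypothetical individual answer data received from two answering terminals.
records = [
    ("B1", "male", 1),
    ("B1", "bought_beer_a", 1),
    ("B2", "male", 0),
    ("B2", "bought_beer_a", 0),
]
table = aggregate(records)
```

The resulting dictionary has one row per monitor and one column per item, which mirrors the table format the text attributes to FIG. 3.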

The data storage unit 13 stores various kinds of data, and its main component is the hard disk drive 1e mounted in the server 1. The stored data includes the first survey data D1 and the second survey data D2 generated by the data aggregation unit 12. The data storage unit 13 also stores the fusion data generated when the process execution unit 14, described later, executes the data fusion process.

In this embodiment, the first survey data D1 and the second survey data D2 are stored on the hard disk drive 1e in the server 1, but the invention is not limited to this. The storage device for the first survey data D1 and the second survey data D2 may be provided separately from the server 1. For example, a database server that can communicate with the server 1 may be used as that storage device.

The process execution unit 14 reads the first survey data D1 and the second survey data D2 stored in the data storage unit 13 and executes the data fusion process that fuses them to generate fusion data.
The data fusion process is outlined below with reference to FIG. 6. FIG. 6 conceptually shows the content of the data fusion process.

The data fusion process fuses the survey data indicating the results of the first survey and the second survey, which were conducted on mutually different monitors, using the answers to their common items as the key. As shown in the lower part of FIG. 6, this process generates fusion data, that is, data in which the answers to the unique items contained only in the second survey (the second-survey unique items) are added to the results of the first survey.

In the data fusion process, the first survey data D1, indicating the results of the audience rating survey, is first compared with the second survey data D2, indicating the results of the questionnaire survey. Specifically, the common items contained in both surveys are identified, and the degree of similarity between the answers to those common items is calculated between the monitors who answered the two surveys.
As described above, the common items include items on monitor attributes such as gender and items on the viewing status of television programs (television viewing behavior).

In more detail, for each monitor asked to take part in the audience rating survey, the first survey (hereinafter, monitor A), the data fusion process searches the monitors asked to answer the questionnaire survey, the second survey (hereinafter, monitors B), for the monitor whose answers to the common items are most similar. To that end, the degree of similarity between monitor A and each monitor B is calculated. The method of calculating the degree of similarity is described in a later section.

After the degree of similarity has been calculated, combinations of monitors with a high degree of similarity are searched for. Specifically, for a given monitor A, the monitor with the highest degree of similarity is searched for among the monitors B. The monitor found (monitor B) and the monitor used as the reference (monitor A) are thereafter treated as a pair.

Then, within each pair, monitor A is assigned the same answers that monitor B gave to the unique items.
As described above, the unique items include items on involvement with products and services, such as whether beer A was purchased, as well as items on everyday attitudes and behavior, involvement with media and advertising, media contact, and so on.

Monitor A has answered only the items included in the audience rating survey, specifically the common items, and has not answered the unique items included only in the questionnaire survey. Assigning monitor A the same answers that monitor B gave to the unique items therefore gives a monitor who did not answer the questionnaire survey (monitor A) virtual answers to the unique items.

By assigning virtual answers to all monitors A according to the above procedure, the first survey data D1, which tabulates the audience rating data indicating the results of the audience rating survey, and the second survey data D2, which tabulates the individual answer data indicating the results of the questionnaire survey, are fused.
As a result, fusion data is generated as pseudo survey data that contains both the common items and the unique items from the results of different surveys conducted on different monitors, and the generated fusion data is stored in the data storage unit 13.
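The overview above (find the most similar monitor B for each monitor A, then copy that monitor's unique-item answers) can be sketched in Python as follows. This is only an illustration under stated assumptions: the similarity measure used here is a simple count of mismatching common-item answers, chosen for brevity, whereas the source defers its actual calculation method to a later section. All names and data are hypothetical.

```python
def mismatch_count(a, b, common):
    """Assumed dissimilarity: number of common items answered differently."""
    return sum(1 for item in common if a[item] != b[item])


def fuse(first, second, common, unique):
    """For each monitor A, pick the closest monitor B on the common items
    and attach that monitor's unique-item answers as virtual answers."""
    fused = {}
    for a_id, a in first.items():
        # min() keeps the first monitor B encountered when distances tie.
        donor_id = min(
            second, key=lambda b_id: mismatch_count(a, second[b_id], common)
        )
        row = {item: a[item] for item in common}
        for item in unique:
            row[item] = second[donor_id][item]
        fused[a_id] = row
    return fused


first = {
    "A1": {"male": 1, "watched_tv_x": 1},
    "A2": {"male": 0, "watched_tv_x": 0},
}
second = {
    "B1": {"male": 1, "watched_tv_x": 1, "bought_beer_a": 1},
    "B2": {"male": 0, "watched_tv_x": 0, "bought_beer_a": 0},
}
fused = fuse(first, second, ["male", "watched_tv_x"], ["bought_beer_a"])
```

In this toy data, A1 matches B1 exactly and A2 matches B2 exactly, so each monitor A inherits the beer A answer of its counterpart.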

As an additional note on the fusion data, the number of records in the fusion data is not necessarily equal to the number of monitors A or the number of monitors B. For example, when the number of monitors B is not a multiple of the number of monitors A, fusion data containing a number of records equal to the least common multiple of the two monitor counts may be generated.
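For reference, the least common multiple mentioned above can be computed as in the short Python sketch below. The monitor counts are hypothetical, and the source does not explain how records would be arranged to reach that count, so the sketch only does the arithmetic.

```python
from math import gcd


def fused_record_count(num_a, num_b):
    """Least common multiple of the two monitor counts."""
    return num_a * num_b // gcd(num_a, num_b)


# Hypothetical counts: 6 is not a multiple of 4, whereas 8 is.
count_not_multiple = fused_record_count(4, 6)
count_multiple = fused_record_count(4, 8)
```

With 4 monitors A and 6 monitors B the least common multiple is 12, which exceeds both counts. With 4 and 8 it is simply 8, the number of monitors B.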

When the data receiving unit 11 receives a fusion data distribution request issued from one of the terminals (not shown) or client company terminals communicably connected to the server 1, the data distribution unit 15 reads the fusion data from the data storage unit 13 and distributes it to that terminal.

<< Survey Data Processing Method >>
Next, the survey data processing method according to this embodiment will be described.
The survey data processing method according to this embodiment is performed using the server 1, which is a computer. In other words, the method is applied in the data processing that the server 1 executes as an ASP service (hereinafter, the data processing service). Below, the method is explained by describing the flow of the data processing service performed by the server 1 and each step within it.

The data processing service performed by the server 1 proceeds according to the flow shown in FIG. 7. FIG. 7 shows the general flow of the series of processes in the data processing service.

The data processing service begins with a step of acquiring the first survey data (S001). Here, the first survey data tabulates the audience rating data indicating the results of the audience rating survey.
In step S001, the data receiving unit 11 receives audience rating data from the measuring device 21 installed in each household. After a certain period has elapsed, the data aggregation unit 12 aggregates the audience rating data to generate the first survey data for all monitors. The first survey data acquired in step S001 is stored in the data storage unit 13.
Thus, in this embodiment, the first survey data is acquired through the cooperation of the data receiving unit 11 and the data aggregation unit 12. In this respect, the data receiving unit 11 and the data aggregation unit 12 constitute a first survey data acquisition unit that acquires the first survey data.

After the first survey data has been acquired, a step of acquiring the second survey data (S002) is executed.
In step S002, the data receiving unit 11 receives individual answer data from the answering terminal 31 of each monitor. Once individual answer data for all items has been collected from all monitors, the data aggregation unit 12 aggregates it to generate the second survey data covering all monitors and all items. The second survey data acquired in step S002 is stored in the data storage unit 13.
Thus, in this embodiment, the second survey data is acquired through the cooperation of the data receiving unit 11 and the data aggregation unit 12. In this respect, the data receiving unit 11 and the data aggregation unit 12 constitute a second survey data acquisition unit that acquires the second survey data.

In this embodiment the second survey data is acquired after the first survey data, but the invention is not limited to this: the first survey data may be acquired after the second survey data, or the two may be acquired at the same time.
When one or both of the first survey data and the second survey data change dynamically, only the changed portion of the data may be acquired.
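Acquiring only the changed portion can be sketched in Python as a comparison between what is already stored and what has newly arrived. The source does not state how changes are detected. Comparing answers per monitor ID is an assumption made here for illustration, and the function name and data are hypothetical.

```python
def changed_records(stored, incoming):
    """Return only the incoming rows that are new or differ from the stored rows."""
    return {
        monitor_id: answers
        for monitor_id, answers in incoming.items()
        if stored.get(monitor_id) != answers
    }


stored = {"A1": {"male": 1, "watched_tv_x": 1}}
incoming = {
    "A1": {"male": 1, "watched_tv_x": 1},  # unchanged, so it is skipped
    "A2": {"male": 0, "watched_tv_x": 1},  # new monitor, so it is kept
}
delta = changed_records(stored, incoming)
```

A row is kept when its monitor ID is absent from the stored table or when its answers differ from the stored answers.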

Once the two sets of survey data (the first survey data and the second survey data) have been stored in the data storage unit 13 through steps S001 and S002, the process execution unit 14 of the server 1 reads them and executes the data fusion process (S003).
Step S003 generates fusion data indicating the answers to both the audience rating survey and the questionnaire survey, and the generated fusion data is stored in the data storage unit 13. Step S003 may be executed automatically once the first survey data and the second survey data have been stored in the data storage unit 13 by steps S001 and S002, or it may be triggered when the server 1 accepts a predetermined input operation by a user of the server 1 (for example, an employee of the survey company).

Thereafter, as needed, for example when distribution of the fusion data is requested through one of the terminals or client company terminals, the data receiving unit 11 receives the request, and the data distribution unit 15 of the server 1 reads the fusion data from the data storage unit 13 and distributes it to the requesting terminal (S004).
When this series of steps has finished, one run of the data processing service at a given point in time (for example, time t) is complete.
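The flow S001 to S004 can be summarized with a small Python sketch. It models only the automatic-trigger variant, in which fusion runs as soon as both data sets are stored. The class name `SurveyServer`, its method names, and the placeholder fusion function are all hypothetical and stand in for details the text leaves to later sections.

```python
class SurveyServer:
    """Toy model of the S001-S004 flow (automatic-trigger variant)."""

    def __init__(self, fuse):
        self.fuse = fuse   # fusion function supplied by the caller (assumption)
        self.storage = {}  # stands in for the data storage unit 13

    def store_first(self, data):
        """S001: store the first survey data."""
        self.storage["first"] = data
        self._fuse_if_ready()

    def store_second(self, data):
        """S002: store the second survey data."""
        self.storage["second"] = data
        self._fuse_if_ready()

    def _fuse_if_ready(self):
        """S003: run fusion once both data sets are present."""
        if "first" in self.storage and "second" in self.storage:
            self.storage["fusion"] = self.fuse(
                self.storage["first"], self.storage["second"]
            )

    def distribute(self):
        """S004: return the fusion data on request (None if not yet generated)."""
        return self.storage.get("fusion")


# Placeholder fusion that merely bundles both inputs, for demonstration only.
server = SurveyServer(fuse=lambda first, second: {"first": first, "second": second})
before = server.distribute()
server.store_first({"A1": {"male": 1}})
server.store_second({"B1": {"male": 1, "bought_beer_a": 1}})
after = server.distribute()
```

Because the order of S001 and S002 does not matter to `_fuse_if_ready`, the sketch also accommodates the case where the second survey data is stored first.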

Next, step S003 of the data processing service, in which the data fusion process is executed, will be described in more detail with reference to FIG. 8. FIG. 8 shows the flow of the data fusion process at a given point in time (time t).
The data fusion process executed by the process execution unit 14 of the server 1 proceeds according to the flow of FIG. 8. Each step is described below.

In the data fusion process, the first survey data and the second survey data stored in the data storage unit 13 are first read, and the common items contained in both are designated (S101).
Specifically, the process execution unit 14 analyzes the first survey data and identifies the answers that a given monitor A included in the first survey data gave to the common questions. Similarly, it analyzes the second survey data and identifies the answers that a given monitor B included in the second survey data gave to the common questions.
In this embodiment, the answers on monitor attributes such as gender and on television viewing behavior are designated as common items.

次に、モニタＡの一人が、モニタＢのどの人と最も類似しているかを特定するために、モニタＡとモニタＢのそれぞれに共通項目を用いて合成変数（主成分分析により得られる主成分得点算出関数等）を作成し（Ｓ１０２）、工程Ｓ１０２により算出した合成変数（主成分分析により得られる主成分得点算出関数等）又は予め設定した任意の関数をデータ記憶部１３に保存する（Ｓ１０３）。
具体的には、処理実行部１４が、第１調査のモニタＡと第２調査のモニタＢとの距離を求めるために必要な値（距離計算用スコア群）を算出する際に使用する融合パラメータとなる関数を設定する。合成変数の作成には、例えば、統計学における主成分分析を行い生成される主成分得点を用いても良いし、共通項目それぞれに任意の係数を掛け合わせ、その後得られる総和を用いても良い。また、合成変数は単一でもよく、複数あってもよい。 Next, in order to identify which monitor B each monitor A most closely resembles, composite variables (such as a principal component score calculation function obtained by principal component analysis) are created for monitors A and B using the common items (S102), and the composite variables calculated in step S102 (such as the principal component score calculation function obtained by principal component analysis), or an arbitrary preset function, are stored in the data storage unit 13 (S103).
Specifically, the process execution unit 14 sets a function that serves as the fusion parameter used when calculating the values (the distance calculation score group) needed to obtain the distance between monitor A of the first survey and monitor B of the second survey. The composite variable may be, for example, a principal component score produced by principal component analysis, or the sum obtained after multiplying each common item by an arbitrary coefficient. There may be a single composite variable or several.
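The weighted-sum variant of the composite variable described above can be sketched as follows. This is only an illustration under stated assumptions: the item names (`gender`, `age_band`, `tv_hours`), the numeric coding of answers, and the coefficients are all hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List

Answers = Dict[str, float]  # common-item name -> numerically coded answer (assumed coding)


def make_weighted_sum(coefficients: Dict[str, float]) -> Callable[[Answers], float]:
    """Return a function mapping a monitor's common-item answers to one composite score."""
    def score(answers: Answers) -> float:
        # Multiply each common item by its coefficient and take the total.
        return sum(coef * answers[item] for item, coef in coefficients.items())
    return score


# Hypothetical fusion parameters: two composite variables (the text allows one or several).
score_functions = [
    make_weighted_sum({"gender": 1.0, "age_band": 0.5, "tv_hours": 0.0}),
    make_weighted_sum({"gender": 0.0, "age_band": 0.0, "tv_hours": 2.0}),
]


def score_group(answers: Answers) -> List[float]:
    """Apply every saved function to one monitor (the distance calculation score group)."""
    return [f(answers) for f in score_functions]


monitor_a = {"gender": 1.0, "age_band": 3.0, "tv_hours": 2.5}
print(score_group(monitor_a))  # [2.5, 5.0]
```

Keeping the functions in a list mirrors the idea of storing them (S103) so that the same functions can be applied again to later data.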

次に、第１調査データと第２調査データの双方の共通項目に上記の関数を適用して、距離計算用スコア群を算出する（Ｓ１０４）。
具体的には、処理実行部１４が、上記関数と共通項目を用いることにより、第１調査データと第２調査データの各モニタＡ、Ｂの値を求める。このとき、距離計算用スコア群が目的変数となり、共通項目が説明変数となる。 Next, a distance calculation score group is calculated by applying the above function to the common items of both the first survey data and the second survey data (S104).
Specifically, the process execution unit 14 obtains the values of the monitors A and B of the first survey data and the second survey data by using the function and the common item. At this time, the distance calculation score group becomes an objective variable, and the common item becomes an explanatory variable.

次に、上記工程Ｓ１０４により算出した距離計算用スコア群を比較して、第１調査データの各モニタＡと第２調査データの各モニタＢとの類似度合いを示す距離について距離計算を実行する（Ｓ１０５）。
本実施形態では、共通項目の各々について回答内容の違いを距離で表し、共通項目ごとの距離計算用スコア群を合計した総距離を以てモニタ間の類似度合いとしており、その総距離が小さい値になる程、属性が近いモニタであることを示している。具体的には、処理実行部１４が、各項目の距離計算用スコア群を合算し、その結果を以てモニタ間の類似度合いとする。 Next, the distance calculation score groups calculated in step S104 are compared, and distance calculation is performed for distances indicating the degree of similarity between each monitor A of the first survey data and each monitor B of the second survey data ( S105).
In the present embodiment, the difference in answer contents for each common item is expressed as a distance, and the total distance obtained by summing up the distance calculation score groups for each common item is used as the degree of similarity between monitors, and the total distance is a small value. This indicates that the monitor has a close attribute. Specifically, the process execution unit 14 adds up the distance calculation score groups of the respective items, and uses the result as the degree of similarity between the monitors.
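A minimal sketch of the total-distance calculation follows. The patent does not fix a particular metric, so the sum of absolute differences used here is an assumption, as are the score values; the monitor IDs follow the style used in the figures.

```python
from typing import Dict, List


def total_distance(scores_a: List[float], scores_b: List[float]) -> float:
    """Total distance between two monitors: per-score differences added up.

    Absolute difference is an assumed choice of per-item distance.
    """
    return sum(abs(x - y) for x, y in zip(scores_a, scores_b))


def distance_table(group_a: Dict[str, List[float]],
                   group_b: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Distance from every monitor A to every monitor B (smaller means more similar)."""
    return {
        a_id: {b_id: total_distance(sa, sb) for b_id, sb in group_b.items()}
        for a_id, sa in group_a.items()
    }


# Hypothetical score groups for two monitors A and three monitors B.
scores_a = {"A00001": [2.5, 5.0], "A00002": [1.0, 1.0]}
scores_b = {"B00001": [0.0, 0.0], "B00002": [2.0, 5.0], "B00003": [1.0, 2.0]}

table = distance_table(scores_a, scores_b)
print(table["A00001"])  # {'B00001': 7.5, 'B00002': 0.5, 'B00003': 4.5}
```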

次に、上記工程Ｓ１０５によって距離計算を実行した後、第１調査データの各モニタＡについて、総距離が近いモニタを第２調査データのモニタＢの中から特定し、同一のモニタとみなして紐付けて融合する（Ｓ１０６）。
このとき、本実施形態においては、第１調査データのモニタＡと第２調査データのモニタＢとの割り当てパターンを設定する際に、「制約なし統計的マッチング」又は「制約付き統計的マッチング」のいずれの手法を用いてもよい。 Next, after performing the distance calculation in the above-described step S105, for each monitor A of the first survey data, the monitor having the shortest total distance is specified from the monitors B of the second survey data, and is regarded as the same monitor. Then, they are fused (S106).
At this time, in the present embodiment, either “unconstrained statistical matching” or “constrained statistical matching” may be used when setting the allocation pattern between the monitors A of the first survey data and the monitors B of the second survey data.

例えば、処理を簡素化して、効率的に大量のデータを処理することが可能な「制約なし統計的マッチング」の手法を採用して、割り当てパターンを設定することができる。ここでは、事案を簡素化するために、図９に示すケースを例に挙げて説明することとする。
図９は、上記のような「制約なし統計的マッチング」を採用してデータ融合処理を実施した場合の説明図である。 For example, it is possible to set an allocation pattern by adopting a method of “unconstrained statistical matching” that can simplify processing and efficiently process a large amount of data. Here, in order to simplify the case, the case shown in FIG. 9 will be described as an example.
FIG. 9 is an explanatory diagram when the data fusion processing is performed by adopting the “unconstrained statistical matching” as described above.

図９に示すように、処理実行部１４は、第１調査データのモニタＡについて、上記工程１０６にて算出した類似度合いが最も高い第２調査データのモニタＢ、すなわち、最も類似した第２調査データのモニタＢを一つ選択して紐付ける。例えば、ここでは、第１調査データのモニタＡ（Ａ００００１）については、第２調査データのモニタＢ（Ｂ００００２）を、第１調査データのモニタＡ（Ａ００００２）については、第２調査データのモニタＢ（Ｂ００００３）を、最も類似するモニタと特定し、融合する。
これにより、第１調査データのモニタＡ（Ａ００００１）には第２調査データのモニタＢ（Ｂ００００２）の回答と同一の回答内容が仮想回答として割り当てられ、第１調査データのモニタＡ（Ａ００００２）には第２調査データのモニタＢ（Ｂ００００３）の回答と同一の回答が仮想回答として割り当てられる。すなわち、仮想回答が各モニタＡ００００１、Ａ００００１に対して割り当てられた結果、データ同士が融合し、最終的に、図９の下段に示す融合データが生成される。 As shown in FIG. 9, for each monitor A of the first survey data, the process execution unit 14 selects and links the one monitor B of the second survey data with the highest degree of similarity calculated in the above step 106, that is, the most similar monitor B of the second survey data. In this example, monitor B (B00002) of the second survey data is identified as the most similar monitor for monitor A (A00001) of the first survey data, and monitor B (B00003) of the second survey data for monitor A (A00002) of the first survey data, and each pair is fused.
As a result, monitor A (A00001) of the first survey data is assigned, as a virtual answer, the same response contents as monitor B (B00002) of the second survey data, and monitor A (A00002) of the first survey data is assigned, as a virtual answer, the same response contents as monitor B (B00003) of the second survey data. That is, as a result of virtual answers being assigned to monitors A00001 and A00002, the data are fused, and finally the fusion data shown in the lower part of FIG. 9 is generated.
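The unconstrained matching of FIG. 9 amounts to a nearest-neighbour lookup followed by copying the partner's answers. The sketch below assumes a precomputed distance table; the distances, the item names (`tv_rating`, `buys_product_x`) and the answer values are hypothetical, chosen so that the links match the pairs named in the text (A00001 with B00002, A00002 with B00003).

```python
from typing import Dict

# Hypothetical total distances (smaller means more similar).
distances = {
    "A00001": {"B00001": 7.5, "B00002": 0.5, "B00003": 4.5},
    "A00002": {"B00001": 2.0, "B00002": 5.0, "B00003": 1.0},
}

# Hypothetical survey contents: first survey answers, and second-survey-unique items.
first_survey = {"A00001": {"tv_rating": 12.3}, "A00002": {"tv_rating": 4.1}}
second_unique = {
    "B00001": {"buys_product_x": "no"},
    "B00002": {"buys_product_x": "yes"},
    "B00003": {"buys_product_x": "no"},
}


def match_unconstrained(dist: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each monitor A, pick the single monitor B with the smallest total distance."""
    return {a_id: min(row, key=row.get) for a_id, row in dist.items()}


def fuse(first: Dict[str, dict], unique: Dict[str, dict], links: Dict[str, str]) -> Dict[str, dict]:
    """Attach the linked monitor B's unique answers to monitor A as virtual answers."""
    return {a_id: {**answers, **unique[links[a_id]]} for a_id, answers in first.items()}


links = match_unconstrained(distances)
fused = fuse(first_survey, second_unique, links)
print(links)  # {'A00001': 'B00002', 'A00002': 'B00003'}
```

Because each monitor A is matched independently, the same monitor B may be chosen more than once and some monitors B may never be used, which is why this variant does not preserve the second survey's aggregate results.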

一方、融合データは、元データ（第１調査データ及び第２調査データ）の平均・分散を維持できるように、「制約付き統計的マッチング」の手法を採用して、割り当てパターンを設定することもできる。この「制約付き統計的マッチング」では、処理実行部１４が、モニタ同士間の類似度合いに基づいて割り当てパターンを統計的解法に従って設定する。 On the other hand, the fusion data can be set by assigning an allocation pattern using the “constrained statistical matching” method so that the average and variance of the original data (first survey data and second survey data) can be maintained. it can. In this “constrained statistical matching”, the process execution unit 14 sets an allocation pattern according to a statistical solution based on the degree of similarity between monitors.

具体的には、処理実行部１４は、割り当てパターンを設定するため手法として輸送問題の解法を採用し、当該解法により下記の前提条件（Ａ）、（Ｂ）の双方を満たすような割り当てパターンを設定することとしている。
（Ａ）第１調査に関する集計結果については、融合前の第１調査データと融合データとの間で同一とする。
（Ｂ）第２調査に関する集計結果については、融合前の第２調査データと融合データとの間で同一とする。
ここで、第１調査又は第２調査に関する集計結果とは、同調査に対する回答内容別にモニタ人数を集計した際の人数比率のことである。
また、輸送問題の解法により割り当てパターンを設定するにあたり、第１調査の各モニタ及び第２調査の各モニタに対して重み（ウェイト）を設定する。ここで、第１調査データの各モニタＡに対して設定される重みは、輸送問題における需要量に相当し、第２調査データの各モニタＢに対して設定される重みは、供給量に相当する。 Specifically, the process execution unit 14 adopts a solution method for the transportation problem as the technique for setting the allocation pattern, and uses it to set an allocation pattern that satisfies both of the following preconditions (A) and (B).
(A) The aggregate results for the first survey are identical between the first survey data before fusion and the fusion data.
(B) The aggregate results for the second survey are identical between the second survey data before fusion and the fusion data.
Here, the aggregate result for the first survey or the second survey means the proportions of monitors obtained when the number of monitors is tallied by response content for that survey.
In setting the allocation pattern by solving the transportation problem, a weight is set for each monitor of the first survey and each monitor of the second survey. The weight set for each monitor A of the first survey data corresponds to the demand amount in the transportation problem, and the weight set for each monitor B of the second survey data corresponds to the supply amount.

ここでは、事案を簡素化するために、図１０に示すケースを例に挙げて説明することとする。
図１０は、上記のような「制約付き統計的マッチング」を採用してデータ融合処理を実施した場合の説明図である。 Here, in order to simplify the case, the case shown in FIG. 10 will be described as an example.
FIG. 10 is an explanatory diagram when the data fusion processing is performed by adopting the “constrained statistical matching” as described above.

例えば、図１０に示すように、データ融合に係る２つの調査データのうちの一方（第１調査データ）がモニタ数２人（Ａ００００１〜Ａ００００２）のデータであり、他方（第２調査データ）がモニタ数３人（Ｂ００００１〜Ｂ００００３）である場合、第１調査データのモニタＡを第２調査データのモニタＢのいずれか一人以上と結びつけ、第２調査データのモニタＢを第１調査データのモニタＡのいずれか一人以上と結びつける。このとき、第２調査データのモニタＢは、第１調査データのモニタ１人と重み３で結びついてもよいし、図１０に示すように、第１調査データのモニタ１人と重み１（Ａ００００１とＢ００００２、Ａ００００２とＢ００００２）、他の第１調査データのモニタ１人と重み２（Ａ００００１とＢ００００１、Ａ００００２とＢ００００３）で結びついてもよい。また、第１調査データのモニタＡは、第２調査データのモニタ１人と重み２（Ａ００００１とＢ００００１、Ａ００００２とＢ００００３）で結びついてもよいし、第１調査のモニタ１人と重み１（Ａ００００１とＢ００００２）、他の第１調査のモニタ１人と重み１（Ａ００００２とＢ００００２）で結びついてもよい。
ただし、結びついた重みと第１調査データのモニタＡと第２調査データのモニタＢとの間の類似度（上記値から算出した類似度）の総和が最も小さくなるよう結び付ける。その結果、融合データは、元データ（第１調査データ及び第２調査データ）の平均・分散を維持することができる。この「制約付き統計的マッチング」では、このように設定した重み配分こそが、割り当てパターンに相当する。 For example, as shown in FIG. 10, suppose that one of the two sets of survey data to be fused (the first survey data) contains two monitors (A00001 to A00002) and the other (the second survey data) contains three monitors (B00001 to B00003). In that case, each monitor A of the first survey data is linked to one or more monitors B of the second survey data, and each monitor B of the second survey data is linked to one or more monitors A of the first survey data. A monitor B of the second survey data may be linked to a single monitor of the first survey data with weight 3, or, as shown in FIG. 10, to one monitor of the first survey data with weight 1 (A00001 and B00002, A00002 and B00002) and to another monitor of the first survey data with weight 2 (A00001 and B00001, A00002 and B00003). Likewise, a monitor A of the first survey data may be linked to one monitor of the second survey data with weight 2 (A00001 and B00001, A00002 and B00003), or to one monitor of the first survey with weight 1 (A00001 and B00002) and to another monitor of the first survey with weight 1 (A00002 and B00002).
However, the links are made so as to minimize the sum taken over the linked weights and the similarity (the similarity calculated from the above values) between the monitors A of the first survey data and the monitors B of the second survey data. As a result, the fusion data can maintain the mean and variance of the original data (the first survey data and the second survey data). In this “constrained statistical matching”, the weight distribution set in this way is what constitutes the allocation pattern.
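One way to read the constrained matching above is as the standard balanced transportation problem. The symbols below do not appear in the original text and are introduced only for illustration: $w^A_i$ is the weight (demand) of first-survey monitor $i$, $w^B_j$ is the weight (supply) of second-survey monitor $j$, $d_{ij}$ is the total distance between them, and $x_{ij}$ is the weight with which the two are linked. Taking the objective as a weight-times-distance sum is an assumption based on the usual form of the transportation problem.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i}\sum_{j} d_{ij}\, x_{ij} \\
\text{subject to}\quad
& \sum_{j} x_{ij} = w^{A}_{i} \quad \text{for every first-survey monitor } i, \\
& \sum_{i} x_{ij} = w^{B}_{j} \quad \text{for every second-survey monitor } j, \\
& x_{ij} \ge 0, \qquad \sum_{i} w^{A}_{i} = \sum_{j} w^{B}_{j}.
\end{aligned}
```

Under this reading, the first equality constraint keeps every monitor A at its original weight, which corresponds to precondition (A), and the second keeps every monitor B at its original weight, which corresponds to precondition (B).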

これにより、第１調査データのモニタＡ（Ａ００００１）には第２調査データのモニタＢ（Ｂ００００１）の回答及び第２調査データのモニタＢ（Ｂ００００２）の回答と同一の回答内容が仮想回答として割り当てられ、前者はモニタ２人分の回答、後者はモニタ１人分の回答として扱われる。また、第１調査データのモニタＡ（Ａ００００２）には第２調査データのモニタＢ（Ｂ００００２）の回答及び第２調査データのモニタＢ（Ｂ００００３）の回答と同一の回答内容が仮想回答として割り当てられ、前者はモニタ１人分の回答、後者はモニタ２人分の回答として扱われる。すなわち、仮想回答が各モニタＡ００００１、Ａ００００１に対して割り当てられた結果、データ同士が融合し、最終的に、図１０の下段に示す融合データが生成される。 As a result, monitor A (A00001) of the first survey data is assigned, as virtual answers, the same response contents as monitor B (B00001) and monitor B (B00002) of the second survey data; the former is treated as the answer of two monitors and the latter as the answer of one monitor. Likewise, monitor A (A00002) of the first survey data is assigned, as virtual answers, the same response contents as monitor B (B00002) and monitor B (B00003) of the second survey data; the former is treated as the answer of one monitor and the latter as the answer of two monitors. That is, as a result of virtual answers being assigned to monitors A00001 and A00002, the data are fused, and finally the fusion data shown in the lower part of FIG. 10 is generated.
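The FIG. 10 allocation can be reproduced with a small transportation-problem solver. The sketch below uses a textbook min-cost-flow method (successive shortest paths with Bellman-Ford); the patent only says that "a solution of the transportation problem" is used, so this choice of algorithm is an assumption. The weights (3 per monitor A, 2 per monitor B) are inferred from the link weights in the example, and the integer distances are hypothetical.

```python
from math import inf
from typing import Dict, Tuple

Link = Tuple[str, str]  # (monitor A id, monitor B id)


def solve_transport(demand: Dict[str, int], supply: Dict[str, int],
                    cost: Dict[Link, int]) -> Dict[Link, int]:
    """Return link weights minimizing sum(cost * weight) with all weights used up."""
    assert sum(demand.values()) == sum(supply.values()), "weights must balance"
    edges = []  # each edge: [from, to, remaining capacity, unit cost]

    def add_edge(u, v, cap, c):
        edges.append([u, v, cap, c])   # forward edge (even index)
        edges.append([v, u, 0, -c])    # residual edge (odd index)

    for b, s in supply.items():
        add_edge("source", ("B", b), s, 0)
    for a, d in demand.items():
        add_edge(("A", a), "sink", d, 0)
    link_index = {}
    for (a, b), c in cost.items():
        link_index[(a, b)] = len(edges)
        add_edge(("B", b), ("A", a), supply[b], c)

    nodes = {e[0] for e in edges}
    while True:
        # Bellman-Ford: cheapest path from source to sink in the residual graph.
        dist = {n: inf for n in nodes}
        dist["source"] = 0
        prev = {}
        for _ in range(len(nodes) - 1):
            for i, (u, v, cap, c) in enumerate(edges):
                if cap > 0 and dist[u] + c < dist[v]:
                    dist[v] = dist[u] + c
                    prev[v] = i
        if dist["sink"] == inf:
            break  # no more weight can be shipped
        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = inf, "sink"
        while v != "source":
            push = min(push, edges[prev[v]][2])
            v = edges[prev[v]][0]
        v = "sink"
        while v != "source":
            i = prev[v]
            edges[i][2] -= push
            edges[i ^ 1][2] += push
            v = edges[i][0]
    # The flow on a link equals the capacity accumulated on its residual edge.
    return {link: edges[i + 1][2] for link, i in link_index.items() if edges[i + 1][2] > 0}


demand = {"A00001": 3, "A00002": 3}                 # weights of monitors A (assumed)
supply = {"B00001": 2, "B00002": 2, "B00003": 2}    # weights of monitors B (assumed)
cost = {("A00001", "B00001"): 1, ("A00001", "B00002"): 2, ("A00001", "B00003"): 5,
        ("A00002", "B00001"): 5, ("A00002", "B00002"): 2, ("A00002", "B00003"): 1}

pattern = solve_transport(demand, supply, cost)
for link, weight in sorted(pattern.items()):
    print(link, weight)
```

With these assumed distances the result links A00001 to B00001 with weight 2 and to B00002 with weight 1, and A00002 to B00002 with weight 1 and to B00003 with weight 2, which is the pattern described for FIG. 10. For larger data sets a dedicated linear-programming or network-flow library would be the more practical choice.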

その後、融合結果（融合データ）を保存する処理（Ｓ１０７）が実行され、融合データがデータ記憶部１３に記憶されるようになる。そして、この時点でデータ融合処理が完了する。 Thereafter, a process of saving the fusion result (fusion data) (S107) is executed, and the fusion data is stored in the data storage unit 13. At this point, the data fusion process is completed.

なお、融合処理は、共通項目のうち、男女別等の特定項目の値別に融合することとしてもよい。その場合は、上記工程Ｓ１０４〜Ｓ１０６の処理はその値別に実行する。また、その場合は、融合結果を一つのファイルにして保存する。 Note that the fusion processing may be performed according to the value of a specific item such as gender among the common items. In that case, the process of said process S104-S106 is performed according to the value. In this case, the fusion result is saved as one file.

ここで、本実施形態においては、第１調査及び第２調査は、モニタのローテーションを行う調査であり、第１調査データＤ１及び第２調査データＤ２は、時間の経過と共に動的に変化するデータとなる。
具体的には、同一モニタを長い間調査していると、調査慣れや学習効果による影響が出てくることがあり、また、標本となるモニタが母集団の変化に対応しなくなることがあるので、このような状況を避けるため、本実施形態では、一定期間経過後に、第１調査及び第２調査のモニタＡ，Ｂの一部又は全部を遂次組み替えること（ローテーション）を行っている。また、上記のような定期的なローテーション以外にも、脱落したモニタを補充するため、新たなモニタを追加することにより、不規則にモニタを入れ替えるような場合もある。
なお、本実施形態における動的に変化するデータとは、アンケート調査等の調査データに限らず、各種機器から自動的に取得できる稼働ログデータや、Ｗｅｂログ、購買履歴、アプリ利用履歴等、又は、常に変化し続ける大量の顧客データ（いわゆるビッグデータ）等も、全て含み、本実施形態において適用することができる。 Here, in the present embodiment, the first survey and the second survey are surveys in which the monitors are rotated, and the first survey data D1 and the second survey data D2 are data that change dynamically over time.
Specifically, when the same monitors are surveyed for a long time, familiarity with the survey and learning effects may influence the results, and the sampled monitors may cease to reflect changes in the population. To avoid this, in the present embodiment, some or all of the monitors A and B of the first and second surveys are successively replaced (rotated) after a fixed period has elapsed. Besides such periodic rotation, monitors may also be replaced irregularly, with new monitors added to make up for monitors who have dropped out.
Note that dynamically changing data in this embodiment is not limited to survey data such as questionnaire surveys. It also includes operation log data that can be acquired automatically from various devices, Web logs, purchase histories, application usage histories and the like, as well as large volumes of constantly changing customer data (so-called big data), and this embodiment can be applied to all of them.

本実施形態においては、第１調査と第２調査は、変化のタイミングが異なるものである。具体的には、第１調査は、時点ｔから１経過するごとに所定割合でモニタＡの一部を逐次入れ替えるものである。また、第２調査は、時点ｔからｎ経過するごとにモニタＢの全部を総入れ替えするものである。
なお、本実施形態においては、モニタＢの全部を総入れ替えするものとして説明するが、これに限定されることはなく、第２調査のモニタＢは、第１調査のモニタＡと変化のタイミングが異なるものであればよく、例えば、時点ｔからｎ経過するごとにモニタＢの半分や１／３を部分的に入れ替えるものであってもよい。 In the present embodiment, the timing of change is different between the first survey and the second survey. Specifically, in the first survey, a part of the monitor A is sequentially replaced at a predetermined rate every time 1 elapses from the time point t. In the second investigation, every time n elapses from time t, the entire monitor B is totally replaced.
In this embodiment, the monitors B are described as being replaced in their entirety, but the invention is not limited to this. It suffices that the monitors B of the second survey change at a timing different from that of the monitors A of the first survey; for example, half or one third of the monitors B may be partially replaced each time n elapses from time t.

以下、第１調査データが動的に変化するデータである場合の具体的な事例について、図１１乃至図１３を例に挙げて説明する。
図１１及び図１３は、第１調査データが変化した状態を概念的に示す図である。図１２は、融合実施後の時点ｔ＋１におけるデータ融合処理の流れ示す図である。 Hereinafter, specific examples in the case where the first survey data is dynamically changing data will be described with reference to FIGS. 11 to 13 as examples.
FIGS. 11 and 13 are diagrams conceptually showing a state in which the first survey data has changed. FIG. 12 is a diagram showing the flow of the data fusion process at time t + 1, after the fusion has been performed.

図１１に示すように、融合実施時点（時点ｔ）では、第１調査のモニタＡと第２調査のモニタＢとは全て融合できているが、融合実施後（時点ｔ＋１）には、融合していたモニタＡの一部が第１調査データの中から存在しなくなり、融合実施時点（時点ｔ）には存在しなかった新たなモニタＡ’が、第１調査データの中に加わることになる。そのため、新たなモニタＡ’について、第２調査データの中から紐付ける相手を特定する必要がある。 As shown in FIG. 11, at the time the fusion is performed (time t), all monitors A of the first survey and monitors B of the second survey have been fused. After the fusion (time t + 1), however, some of the monitors A that had been fused no longer exist in the first survey data, and new monitors A ′ that did not exist at the time of the fusion (time t) are added to the first survey data. It is therefore necessary to identify, from the second survey data, a partner to link with each new monitor A ′.

上記のような場合、融合実施後の時点ｔ＋１における、上述したデータ処理サービスのうち、データ融合処理を実行する工程Ｓ００３について、図１２を参照しながら説明する。
融合実施後の時点ｔ＋１において、サーバ１の処理実行部１４により実行されるデータ融合処理は、図１２の流れにしたがって進行する。 In the above case, step S003 of executing the data fusion process among the data processing services described above at the time point t + 1 after the fusion will be described with reference to FIG.
At time t + 1, after the fusion has been performed, the data fusion process executed by the process execution unit 14 of the server 1 proceeds according to the flow of FIG. 12.

融合実施後のｔ＋１時点におけるデータ融合処理では、先ず、第１調査データのうち新規の第１調査データを特定する（Ｓ２０１）。
具体的には、処理実行部１４が、データ記憶部１３に記憶された第１調査データの内容を解析し、その第１調査データの中から新規の第１調査データ（モニタＡ’の回答内容に相当するデータ）を特定する。 In the data fusion process at time t + 1 after the fusion is performed, first, new first survey data is specified from the first survey data (S201).
Specifically, the process execution unit 14 analyzes the contents of the first survey data stored in the data storage unit 13 and identifies, from within that data, the new first survey data (the data corresponding to the response contents of monitor A ′).

次に、その新規の第１調査データと第２調査データの双方の共通項目に、上述の時点ｔにおけるデータ融合処理のＳ１０３にて保存した関数を適用して、距離計算用スコア群を算出する（Ｓ２０２）。
具体的には、処理実行部１４が、上記関数と共通項目を用いることにより、新規の第１調査データと第２調査データの各モニタＡ’、Ｂの合成変数を求める。このとき、新規の第１調査データについてのみ、上記処理を行うこと以外は、上述した時点ｔにおけるデータ融合処理のＳ１０４と同様の処理を実行する。 Next, the distance calculation score group is calculated by applying the function stored in S103 of the data fusion process at the time point t to the common items of both the new first survey data and the second survey data. (S202).
Specifically, the process execution unit 14 obtains a composite variable for each of the monitors A ′ and B of the new first survey data and the second survey data by using the function and the common item. At this time, the process similar to S104 of the data fusion process at the time point t described above is executed except that the process is performed only for the new first survey data.

このように、第１調査データに新規のモニタＡ’が加わった場合も、距離計算用スコア群を算出する際に使用する関数を再計算せず、既に使用し有効であると実証済みの関数を再利用することにより、処理が簡素化され、大量の新規調査データであっても、迅速に処理することができる。 Thus, even when a new monitor A ′ is added to the first survey data, the function used when calculating the distance calculation score group is not recalculated, and the function that has already been used and proven effective. By reusing, the processing is simplified, and even a large amount of new survey data can be processed quickly.

このとき、例えば、特定期間Ａのテレビの視聴傾向のように、共通項目が時点ｔに依存する等の理由により、新規の第１調査データが融合データ（又は第２調査データ）と全く同一の共通項目を保持していない場合があり得る。すなわち、図１１に示すように、融合実施時点（時点ｔ）においては、共通項目ｒ（ｔ）は、第１調査データ（モニタＡ）及び第２調査データ（モニタＢ）の双方に含まれているが、融合実施後（時点ｔ＋１）においては、新たなモニタＡ’の第１調査データの共通項目は、時点の経過に伴って共通項目ｒ（ｔ＋１）と変化し、融合データ（又は第２調査データ）と全く同一の共通項目を保持していないことになる。
そのため、本実施形態では、時点ｔより後ろ（ｔ＋１以降）の時点において、例えば、特定期間Ｂのテレビ視聴傾向の平均値等の統計量を算出し、時点ｔにおける共通項目ｒ（ｔ）に代替する。これにより、厳密には、時点ｔの特定期間Ａのテレビの視聴傾向とは異なるが、近似する時期の視聴傾向ということで代替でき、同一の共通項目を保持している場合と差異なく、新規の調査データを融合することができる。 At this time, the new first survey data may not hold exactly the same common items as the fusion data (or the second survey data), for example because a common item depends on the time point t, as with the TV viewing tendency in a specific period A. That is, as shown in FIG. 11, at the time of the fusion (time t) the common item r (t) is included in both the first survey data (monitor A) and the second survey data (monitor B). After the fusion (time t + 1), however, the common item in the first survey data of the new monitor A ′ has changed with the passage of time to the common item r (t + 1), so the new data does not hold exactly the same common items as the fusion data (or the second survey data).
Therefore, in the present embodiment, at a time point after time t (t + 1 or later), a statistic such as the average TV viewing tendency in a specific period B is calculated and substituted for the common item r (t) at time t. Strictly speaking, this differs from the TV viewing tendency in the specific period A at time t, but it can serve as a substitute because it is the viewing tendency of a nearby period, and the new survey data can be fused just as if the same common items were held.
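The substitution of a period-B statistic for the time-dependent common item can be sketched as below. The text names the average as one example of a statistic; the day labels, the viewing minutes, and the function name `substitute_common_item` are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def substitute_common_item(viewing_minutes: Dict[str, float], period_b: List[str]) -> float:
    """Average viewing over period B, used in place of the unavailable item r(t)."""
    return mean(viewing_minutes[day] for day in period_b)


# Hypothetical daily TV viewing minutes for a new monitor A', observed after time t.
new_monitor_viewing = {"day1": 120.0, "day2": 90.0, "day3": 150.0, "day4": 300.0}
period_b = ["day1", "day2", "day3"]  # assumed definition of "specific period B"

r_substitute = substitute_common_item(new_monitor_viewing, period_b)
print(r_substitute)  # 120.0
```

The substituted value can then be passed to the saved score function in the same way as the original common item would have been.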

次に、上記工程Ｓ２０２により算出した距離計算用スコア群を比較して、新規の第１調査データの各モニタＡ’と第２調査データの各モニタＢとの類似度合いを示す距離について距離計算を実行する（Ｓ２０３）。
具体的には、処理実行部１４が、新規の第１調査データの各モニタＡ’について各項目の距離計算用スコア群を合算し、その結果を以てモニタ間の類似度合いとする。 Next, the distance calculation score groups calculated in the above step S202 are compared, and distance calculation is performed for the distance indicating the degree of similarity between each monitor A ′ of the new first survey data and each monitor B of the second survey data. Execute (S203).
Specifically, the process execution unit 14 adds the distance calculation score groups of the respective items for each monitor A ′ of the new first survey data, and uses the result as the degree of similarity between the monitors.

次に、上記工程Ｓ２０３によって距離計算を実行した後、新規の第１調査データの各モニタＡ’について、総距離が近いモニタを第２調査データのモニタＢの中から特定し、同一のモニタとみなして紐付けて融合する（Ｓ２０４）。
このとき、本実施形態においては、新規の第１調査データのモニタＡ’と第２調査データのモニタＢとの割り当てパターンを設定する際に、処理の簡素化を目的として、「制約なし統計的マッチング」を採用してデータ融合処理を実施する。
具体的には、処理実行部１４は、新規の第１調査データのモニタＡ’について、上記工程２０３にて算出した類似度合いが最も高い第２調査データのモニタＢ、すなわち、最も類似した第２調査データのモニタＢを一つ選択して紐付け、そのモニタＢを最も類似するモニタと特定し、融合する。これにより、新規の第１調査データのモニタＡ’にも第２調査データ内のいずれかのモニタＢの回答と同一の回答内容が仮想回答として割り当てられ、最終的に、新たな融合データが生成される。 Next, after the distance calculation in step S203, for each monitor A ′ of the new first survey data, a monitor with a short total distance is identified from among the monitors B of the second survey data, regarded as the same monitor, linked, and fused (S204).
At this time, in the present embodiment, “unconstrained statistical matching” is adopted for the data fusion process when setting the allocation pattern between the monitors A ′ of the new first survey data and the monitors B of the second survey data, in order to simplify the processing.
Specifically, for each monitor A ′ of the new first survey data, the process execution unit 14 selects and links the one monitor B of the second survey data with the highest degree of similarity calculated in the above step 203, that is, the most similar monitor B of the second survey data, identifies that monitor B as the most similar monitor, and fuses the two. As a result, the monitor A ′ of the new first survey data is also assigned, as a virtual answer, the same response contents as one of the monitors B in the second survey data, and new fusion data is finally generated.

なお、本実施形態では、時点ｔ＋１において、第１調査データのモニタＡがモニタＡ’に変化したときに、融合相手であるモニタＡ，Ｂがその時点で未だ存在するモニタＡ，Ｂについてはそのままの融合相手を引継ぎ、新規の第１調査データのモニタＡ’についてのみ時点ｔ＋１におけるデータ融合処理をあらためて実行する。これにより、既に融合処理済みの融合データを生かすと共に、新規の第１調査データのモニタＡ’についても、正確且つ効率的に融合することができる。 In the present embodiment, when monitors A of the first survey data are replaced by monitors A ′ at time t + 1, pairs of monitors A and B that were fusion partners and both still exist at that time keep their existing partners, and the data fusion process at time t + 1 is executed anew only for the monitors A ′ of the new first survey data. This makes use of the fusion data that has already been processed, while also allowing the monitors A ′ of the new first survey data to be fused accurately and efficiently.
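The incremental flow at time t + 1 (S201 to S204) can be sketched as follows. It assumes that the score groups have already been computed with the function saved at time t (S202), that distance is the sum of absolute score differences, and that unconstrained matching is used; the function name `incremental_fusion`, the IDs and all score values are hypothetical.

```python
from typing import Dict, List


def incremental_fusion(previous_links: Dict[str, str],
                       current_a_scores: Dict[str, List[float]],
                       b_scores: Dict[str, List[float]]) -> Dict[str, str]:
    """Keep surviving links and match only the new monitors A' to their nearest monitor B."""
    # Carry over partners for pairs in which both monitors still exist.
    links = {a: b for a, b in previous_links.items()
             if a in current_a_scores and b in b_scores}
    # S201: new monitors are those present now without an inherited link.
    new_ids = [a for a in current_a_scores if a not in links]
    for a in new_ids:
        sa = current_a_scores[a]
        # S203 and S204: total distance to every monitor B, then take the closest.
        links[a] = min(
            b_scores,
            key=lambda b: sum(abs(x - y) for x, y in zip(sa, b_scores[b])),
        )
    return links


# Links established at time t.
links_t = {"A00001": "B00002", "A00002": "B00003"}
# At time t + 1, A00002 has dropped out and A00003 is a new monitor A'.
a_scores_t1 = {"A00001": [2.5, 5.0], "A00003": [0.5, 0.5]}
b_scores = {"B00001": [0.0, 0.0], "B00002": [2.0, 5.0], "B00003": [1.0, 2.0]}

links_t1 = incremental_fusion(links_t, a_scores_t1, b_scores)
print(links_t1)  # {'A00001': 'B00002', 'A00003': 'B00001'}
```

Only the new monitor triggers a distance calculation, so the cost of an update grows with the number of new monitors rather than with the size of the whole panel.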

その後、融合結果（融合データ）を保存する処理（Ｓ２０５）が実行され、融合データがデータ記憶部１３に記憶されるようになる。これにより、時点ｔ＋１におけるデータ融合処理が完了する。 Thereafter, a process of saving the fusion result (fusion data) (S205) is executed, and the fusion data is stored in the data storage unit 13. Thereby, the data fusion process at the time point t + 1 is completed.

また、他の事例として、以下、第２調査データが動的に変化するデータである場合の具体的な事例について、図１４を例に挙げて説明する。
図１４は、第２調査データが変化した状態を概念的に示す図である。 As another example, a specific example in the case where the second survey data is dynamically changing data will be described below with reference to FIG.
FIG. 14 is a diagram conceptually illustrating a state in which the second survey data has changed.

上述のように、第１調査と第２調査は、変化のタイミングが異なるものであり、第２調査は、時点ｔからｎ経過するごとにモニタＢの全部を総入れ替えするものである。
上述の図１１に示すように、融合実施時点（時点ｔ）では、第１調査のモニタＡと第２調査のモニタＢとは全て融合できているが、図１４に示すように、融合実施後（時点ｔ＋ｎ）には、融合していたモニタＢの全部が第２調査データの中から存在しなくなり、第２調査データは融合実施時点（時点ｔ）には存在しなかった新たなモニタＢ’に総入れ替えとなる。そのため、新たなモニタＢ’について、第１調査データの中から紐付ける相手を特定する必要がある。
この場合は、原則として、図８に示す融合実施の時点ｔにおいてサーバ１の処理実行部１４により実行されるデータ融合処理と同様の処理をあらためて実行する。なお、図１２に示す融合実施後の時点ｔ＋１においてサーバ１の処理実行部１４により実行されるデータ融合処理と同様の処理を、第２調査データのモニタＢ’について実行することとしてもよい。 As described above, the first survey and the second survey have different timings of change, and the second survey totally replaces all the monitors B every time n elapses from time t.
As shown in FIG. 11 described above, at the time of the fusion (time t) all monitors A of the first survey and monitors B of the second survey have been fused. As shown in FIG. 14, however, after the fusion (time t + n) none of the monitors B that had been fused remain in the second survey data, which is replaced in its entirety by new monitors B ′ that did not exist at the time of the fusion (time t). It is therefore necessary to identify, from the first survey data, a partner to link with each new monitor B ′.
In this case, in principle, a process similar to the data fusion process executed by the process execution unit 14 of the server 1 at the fusion execution time t shown in FIG. 8 is newly executed. Note that the same process as the data fusion process executed by the process execution unit 14 of the server 1 at the time point t + 1 after the fusion shown in FIG. 12 may be executed for the monitor B ′ of the second survey data.

また、さらに、融合実施後（時点ｔ＋ｎ＋１）には、第２調査データのモニタＢ’と融合していたモニタＡの一部が第１調査データの中から存在しなくなり、融合実施時点（時点ｔ＋ｎ）には存在しなかった新たなモニタＡ’が、第１調査データの中に加わることになる。そのため、新たなモニタＡ’について、第２調査データの中から紐付ける相手を特定する必要があるが、この場合も図１２に示す融合実施後の時点ｔ＋１において、サーバ１の処理実行部１４により実行されるデータ融合処理と同様の処理を実行する。 Furthermore, after that fusion (time t + n + 1), some of the monitors A that had been fused with the monitors B ′ of the second survey data no longer exist in the first survey data, and new monitors A ′ that did not exist at the time of the fusion (time t + n) are added to the first survey data. A partner must therefore be identified from the second survey data for each new monitor A ′. In this case too, the same process as the data fusion process executed by the process execution unit 14 of the server 1 at time t + 1 after the fusion, shown in FIG. 12, is executed.

＜＜その他の実施形態＞＞
上記の実施形態には、主として本発明の調査データ処理装置及び調査データ処理方法について説明した。しかし、上記の実施形態は、本発明の理解を容易にするためのものであり、本発明を限定するものではない。本発明は、その趣旨を逸脱することなく、変更、改良され得ると共に、本発明にはその等価物が含まれることはもちろんである。 << Other Embodiments >>
In the above embodiment, the investigation data processing apparatus and the investigation data processing method of the present invention have been mainly described. However, the above embodiment is for facilitating the understanding of the present invention, and does not limit the present invention. The present invention can be changed and improved without departing from the gist thereof, and the present invention includes the equivalents thereof.

上記の実施形態では、第１調査及び第２調査共に一の調査会社が実施している例について説明したが、これに限定されるものではない。例えば、第１調査を調査会社が実施し、第２調査を他の会社が実施する等、本発明は、異なる主体によって実施された調査結果を示す調査データを融合する場合であっても適用可能である。こうすることにより、顧客企業にとっては、独自に実施したアンケート調査等（例えば、自社が販売する商品や提供するサービスについて世間の反応を把握する目的から実施するアンケート調査等）の結果と、調査会社が実施した調査（例えば、視聴率調査等）とを融合することができ、各顧客企業のニーズに則した効果的なメディアプランニングを立案することが可能となる。 In the above embodiment, an example in which both the first survey and the second survey are conducted by a single survey company has been described, but the invention is not limited to this. The present invention is also applicable when fusing survey data that indicate the results of surveys conducted by different entities, for example when the first survey is conducted by a survey company and the second survey by another company. This allows a customer company to fuse the results of a questionnaire survey it has conducted itself (for example, one conducted to gauge public reaction to the products it sells or the services it provides) with a survey conducted by the survey company (for example, an audience rating survey), making it possible to draw up effective media planning suited to the needs of each customer company.

また、上記の実施形態では、独自項目が存在するのは第２調査だけとし、第１調査は共通項目のみで独自項目を含んでいないが、これに限定されるものではない。すなわち、本発明は、第１調査においても、第２調査には含まれず、且つ、第１調査のみにしか含まれない独自の項目（第１調査独自項目）が存在する場合であっても適用可能である。 In the above embodiment, unique items exist only in the second survey, and the first survey consists solely of common items with no unique items, but the invention is not limited to this. That is, the present invention is also applicable when the first survey likewise has unique items (first-survey-unique items) that are included only in the first survey and not in the second survey.

また、上記の実施形態では、２つの調査データを融合することにより、融合データを生成することとしたが、これに限定されるものではない。すなわち、３つ以上の複数の調査データであっても、共通項目が存在すれば、それらを融合して融合データを生成することができるので、そのようなケースについても本発明を適用することが可能である。 In the above embodiment, the fusion data is generated by fusing two sets of survey data, but the invention is not limited to this. Even with three or more sets of survey data, fusion data can be generated by fusing them as long as common items exist, so the present invention can be applied to such cases as well.

また、上記の実施形態では、シングルソースデータと非シングルソースデータとを融合して擬似シングルソースデータとしての融合データを生成するケースを例に挙げて説明したが、これに限定されるものではない。すなわち、本発明は、非シングルソースデータ同士を融合して融合データを生成するケース、及び、シングルソースデータ同士を融合して融合データを生成するケースのいずれにも適用可能である。 The above embodiment has been described using, as an example, the case in which single source data and non-single source data are fused to generate fusion data as pseudo single source data, but the invention is not limited to this. That is, the present invention is applicable both to the case in which non-single source data are fused with each other to generate fusion data and to the case in which single source data are fused with each other to generate fusion data.

また、上記の実施形態では、第１調査のモニタＡと第２調査のモニタＢとの間の類似度合いを評価する上で、両調査に共通して含まれる共通項目全ての回答内容を評価対象とすることとしたが、これに限定されるものではない。すなわち、複数ある共通項目のうち、一部のみを評価対象とすることとしてもよい。 In the above embodiment, in evaluating the degree of similarity between the monitor A of the first survey and the monitor B of the second survey, the response contents of all common items included in both surveys are evaluated. However, the present invention is not limited to this. That is, only a part of a plurality of common items may be evaluated.

1 Server (survey data processing apparatus)
1A CPU
1B ROM
1c RAM
1d Communication interface
1e Hard disk drive
1f Input device
1g Output device
11 Data receiving unit
12 Data aggregation unit
13 Data storage unit
14 Process execution unit
15 Data distribution unit
21 Measuring device
31 Answer terminal
D1, D2 Survey data
N1, N2 Information communication network

Claims

A survey data processing apparatus that processes survey data indicating the results of a first survey and a second survey conducted on mutually different monitors, comprising:
a first survey data acquisition unit that acquires first survey data in which response contents for common items included in both the first survey and the second survey are collected for each first survey monitor who answered the first survey;
a second survey data acquisition unit that acquires second survey data in which response contents for the common items and for second survey unique items included only in the second survey are collected for each second survey monitor who answered the second survey; and
a process execution unit that executes a data fusion process of fusing the first survey data acquired by the first survey data acquisition unit and the second survey data acquired by the second survey data acquisition unit to generate fusion data indicating the response contents of both the first survey data and the second survey data,
wherein the data fusion process includes:
a calculation process of calculating a value for the response contents for the common items using a predetermined arithmetic expression, and calculating the degree of similarity of the response contents for the common items from the result of comparing the values between the first survey monitors and the second survey monitors; and
an assignment process of assigning to the first survey monitors, in an assignment pattern set based on the calculated degree of similarity, the same response contents as the second survey monitors' response contents for the second survey unique items, and
wherein, when both or one of the first survey monitors and the second survey monitors change, for a first survey monitor and a second survey monitor for which assignment has already been performed by the assignment process and whose assignment partner still exists at that time, the assignment partner is carried over as is, and the calculation process and the assignment process are executed only for a new first survey monitor or a new second survey monitor.
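The carry-over behaviour recited above can be sketched as follows. This is only an illustration under several assumptions that are not stated in the claim: monitors are identified by string IDs, the degree of similarity is a sum of absolute score differences, each first survey monitor is simply paired with its nearest second survey monitor (so one second survey monitor may serve several first survey monitors), and a first survey monitor whose partner has left is handled in the same way as a new monitor. The sketch covers only the first survey side; all names are hypothetical.

```python
from typing import Dict, Mapping

Scores = Mapping[str, float]


def distance(a: Scores, b: Scores) -> float:
    """Illustrative distance: sum of absolute differences over shared items."""
    return sum(abs(a[k] - b[k]) for k in a.keys() & b.keys())


def update_assignments(
    previous: Mapping[str, str],
    first: Mapping[str, Scores],
    second: Mapping[str, Scores],
) -> Dict[str, str]:
    """Return a mapping from first survey monitor ID to second survey monitor ID."""
    # Carry over every existing pair whose two members are both still present.
    result = {a: b for a, b in previous.items() if a in first and b in second}
    # Run the calculation and assignment only for monitors without a partner.
    for a, scores_a in first.items():
        if a not in result:
            result[a] = min(second, key=lambda b: distance(scores_a, second[b]))
    return result


# Hypothetical panel change: A2 has left, A3 is new, B3 is new.
previous = {"A1": "B1", "A2": "B2"}
first = {"A1": {"age": 3.0}, "A3": {"age": 5.0}}
second = {"B1": {"age": 1.0}, "B2": {"age": 4.5}, "B3": {"age": 9.0}}

updated = update_assignments(previous, first, second)
```

In this example A1 keeps B1 even though B2 is now numerically closer, because the existing pair is carried over without recalculation; only A3 goes through the distance calculation.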

The survey data processing apparatus according to claim 1, wherein the predetermined arithmetic expression used in the calculation process does not change before and after both or one of the first survey monitors and the second survey monitors change.

The survey data processing apparatus according to claim 1 or 2, wherein, when the survey time for the common items differs before and after both or one of the first survey monitors and the second survey monitors change,
the process execution unit, in the calculation process and the assignment process, substitutes, as the response contents for the common items, response contents for an item or time that approximates the common items, or a statistical value of such response contents.
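One possible reading of this substitution is a simple fallback chain. The sketch below assumes that answers are numeric and keyed by an item name that encodes the survey period, that the approximate item is supplied by the caller, and that the statistical value is a mean over other monitors' answers. The function name, the key format, and the order of the fallbacks are all hypothetical and are not specified in the claim.

```python
from statistics import mean
from typing import Mapping, Optional, Sequence


def answer_for_common_item(
    answers: Mapping[str, float],
    item: str,
    approximate_item: Optional[str] = None,
    population_values: Sequence[float] = (),
) -> float:
    """Return the answer to use for `item`, substituting when it is missing."""
    if item in answers:
        return answers[item]  # the common item itself was answered
    if approximate_item is not None and approximate_item in answers:
        return answers[approximate_item]  # approximate item or nearby period
    if population_values:
        return mean(population_values)  # statistical value as a last resort
    raise KeyError(item)


# Hypothetical monitor who answered in February but not in March.
answers = {"tv_hours@2017-02": 2.5}

direct = answer_for_common_item(answers, "tv_hours@2017-02")
substituted = answer_for_common_item(
    answers, "tv_hours@2017-03", approximate_item="tv_hours@2017-02"
)
statistical = answer_for_common_item(
    {}, "tv_hours@2017-03", population_values=[1.0, 2.0, 3.0]
)
```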

The survey data processing apparatus according to any one of claims 1 to 3, wherein, before both or one of the first survey monitors and the second survey monitors change,
the process execution unit, in the assignment process, sets, according to a statistical solution method, the assignment pattern such that the aggregation result for the first survey matches between the first survey data and the fusion data and the aggregation result for the second survey matches between the second survey data and the fusion data.

The survey data processing apparatus according to claim 1, wherein the first survey data includes response contents for first survey unique items included only in the first survey.

The survey data processing apparatus according to claim 1, wherein the first survey is a survey on media contact, and
the second survey is a questionnaire survey that captures consumer attributes, product involvement, and media contact from multiple angles.

A survey data processing method in which a computer processes survey data indicating the results of a first survey and a second survey conducted on mutually different monitors,
wherein the computer executes:
a step of acquiring first survey data in which response contents for common items included in both the first survey and the second survey are collected for each first survey monitor who answered the first survey;
a step of acquiring second survey data in which response contents for the common items and for second survey unique items included only in the second survey are collected for each second survey monitor who answered the second survey; and
a step of executing a data fusion process of fusing the acquired first survey data and the acquired second survey data to generate fusion data indicating the response contents of both the first survey data and the second survey data,
wherein the data fusion process includes:
a calculation process of calculating a value for the response contents for the common items using a predetermined arithmetic expression, and calculating the degree of similarity of the response contents for the common items from the result of comparing the values between the first survey monitors and the second survey monitors; and
an assignment process of assigning to the first survey monitors, in an assignment pattern set based on the calculated degree of similarity, the same response contents as the second survey monitors' response contents for the second survey unique items, and
wherein, when both or one of the first survey monitors and the second survey monitors change, for a first survey monitor and a second survey monitor for which assignment has already been performed by the assignment process and whose assignment partner still exists at that time, the assignment partner is carried over as is, and the calculation process and the assignment process are executed only for a new first survey monitor or a new second survey monitor.