JP7326400B2

JP7326400B2 - Content processing method and content processing program

Info

Publication number: JP7326400B2
Application number: JP2021166510A
Authority: JP
Inventors: 橋本重治
Original assignee: Otsuka Chemical Co Ltd
Current assignee: Otsuka Chemical Co Ltd
Priority date: 2021-10-08
Filing date: 2021-10-08
Publication date: 2023-08-15
Anticipated expiration: 2041-10-08
Also published as: JP2023056970A; WO2023058417A1; JP2023145656A

Description

The present invention relates to a content processing method and a content processing program.

Companies, research institutes, and similar organizations need efficient techniques for searching content, including technical documents, in order to advance research and development, intellectual property strategy, and marketing strategy for their products and services. In patent searches in particular, obtaining information efficiently and without omissions is becoming increasingly important for preventing infringement of patent rights, acquiring rights, and understanding other companies' technologies.
The following conventional techniques exist.

For example, there is a technique that receives, for each of a plurality of evaluation target documents each containing a plurality of words, a user's evaluation of whether the document is a positive evaluation document related to a target theme or a negative evaluation document unrelated to it; extracts words from each evaluation target document and classifies them into positive words appearing only in positive evaluation documents, negative words appearing only in negative evaluation documents, and common words appearing in both; and calculates the theme relevance of each common word to the target theme based on the appearance frequency of each word and its adjacency to other words (see, for example, Patent Document 1).

There is also a technique in which sets, each consisting of information data with one or more keywords attached and a teacher signal indicating whether that information is necessary or unnecessary, are prepared in advance as teacher data. When unread information is newly input, a necessity signal predicting the user's need for that unread information is obtained from the one or more keywords attached to it and from the pairs of keywords and teacher signals. The necessity signal takes a larger value when many teacher signals for those keywords indicate "necessary" and a smaller value when many indicate "unnecessary" (see, for example, Patent Document 2).

Patent Document 1: Japanese Patent No. 5424393
Patent Document 2: Japanese Patent No. 3736564

An object of the disclosed technique is to reduce the operator's workload when a set of contents obtained, for example, by a search over contents including text is provided to the operator, by helping the operator grasp each content in the set more efficiently.

The disclosed technique provides a content processing method for determining a presentation priority for each of a plurality of contents, the method comprising: identifying the plurality of contents; receiving keyword information including a plurality of keywords designated by an operator and a weight corresponding to each of the keywords; deriving, for each of the contents, a total by summing, over the plurality of keywords, the product of the number of occurrences of each keyword and the weight corresponding to that keyword; and determining the presentation priority of each of the contents based on the total corresponding to that content.

In the disclosed technique, identifying the plurality of contents may include extracting a plurality of words related to the plurality of contents, and presenting the plurality of words to the operator so that the operator can identify the plurality of keywords from among them.

In the disclosed technique, the values that the weight can take may include zero.

In the disclosed technique, the extracting may include extracting the plurality of words from predetermined portions of the plurality of contents.

In the disclosed technique, the content may include at least one of text, images, and audio.

In the disclosed technique, extracting the plurality of words may include: receiving a positive or negative evaluation value given by the operator to each of the contents that the operator has reviewed among the plurality of contents; and identifying the plurality of words so that, among the words related to the contents given evaluation values, positive words more strongly related to contents given a positive evaluation value and negative words more strongly related to contents given a negative evaluation value can be presented in a distinguishable manner.

In the disclosed technique, receiving the keyword information may include receiving at least one of a modification of the designated keywords and a modification of the corresponding weights, and determining the priority may include, in response to receiving the modification, changing the priority so as to present the change in the priority corresponding to the contents given the evaluation values.

In the disclosed technique, changing the priority may include associating the evaluation value with the priority so that the operator can identify the evaluation value.

The disclosed technique may also be a program that causes a computer to execute the above method.

According to the disclosed technique, when a set of contents obtained, for example, by a search over contents including text is provided to the operator, the operator's workload can be reduced by helping the operator grasp each content in the set more efficiently.

FIGS. 1A and 1B show various sets of contents obtained by a search or the like.
FIG. 2A is an example of information on which the selection of positive keywords can be based. FIG. 2B shows a table in which the operator has designated positive keywords and corresponding weights (r).
FIG. 3A is an example of a graph on which the selection of negative keywords can be based. FIG. 3B shows a table in which the operator has designated negative keywords and corresponding weights (r).
FIG. 4A is a table in which, for each content in the set V, the totals of the products of keyword occurrence counts and the corresponding weights are arranged in descending order. FIG. 4B shows the result after all contents in the set V have been reviewed.
FIGS. 5A and 5B show examples of a user interface that assists the operator in changing the keyword selection and the corresponding weights.
FIG. 6A shows the processing flow of the embodiment. FIG. 6B is a flowchart showing the detailed processing of step S102 of estimating a plurality of contents.
FIG. 7 is a flowchart showing the detailed processing of step S112, in which the extracted words are presented to the operator.
FIG. 8A is a flowchart showing the detailed processing of step S104 of receiving keyword information including a plurality of keywords and corresponding weights (which may be positive or negative). FIG. 8B is a flowchart showing the detailed processing of step S108 of determining the presentation priority of the contents based on the total corresponding to each content.
FIG. 9 is a flowchart showing the detailed processing of step S312 of changing the priority, in response to receiving a modification, so as to present the change in the priority corresponding to the contents given evaluation values.
FIG. 10 is a block diagram showing the functions of the embodiment.
FIG. 11 is a diagram showing the hardware configuration of the embodiment.

In patent searches in particular, the search formula must be devised so that relevant patent documents are not missed and so that few unnecessary patents (noise documents) are included. A search formula is therefore examined and a set of patent documents is obtained. However, even a set obtained with such a carefully devised search formula contains many documents unrelated to the subject of the search (noise documents).

Reducing these noise documents requires narrowing the search further. However, narrowing the search carries the risk that important documents drop out of the search results. Conversely, searching so as to avoid missing important documents enlarges the resulting set, and the operator must spend considerable effort browsing (reviewing) it.

For example, when relevant patents are extracted by review, it is common to check whether a document contains words related to the target technical field. Likewise, a document that contains words unrelated to the subject of the search is often judged to be a noise document.
Consequently, appropriate search results can often be obtained by appropriately selecting terms related to the subject and terms unrelated to it, and building a search formula from them.

It should be noted, however, that a patent document consists of long text, and parts of that text may describe matters other than the technology the document is aimed at. For example, many documents describe experiments that did not work (counterexamples). Documents may also use words expressing degree, such as high or low, when describing the performance of a technology. When a term characterizing performance is used as a search term, a document may turn out to be noise if the degree expressed in it does not match the technology being sought.

Search formulas also support a NOT operation. It has been common practice to specify unrelated terms and incorporate a NOT operation into the search formula so as to obtain results that exclude documents containing those terms. With this approach, however, when a document contains counterexamples such as those described above, there is a risk that an important document drops out of the search results contrary to the searcher's intention.
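The risk of the NOT operation described above can be illustrated with a minimal Python sketch. The toy documents, the excluded term, and the helper names `tokenize` and `not_filter` are hypothetical and are not taken from the patent; the sketch assumes simple English word tokenization.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def not_filter(docs, excluded_term):
    """Return ids of documents that do NOT contain excluded_term."""
    return [doc_id for doc_id, text in docs.items()
            if excluded_term not in tokenize(text)]

# Hypothetical documents: D1 is relevant, but it mentions the excluded
# term once inside a failed comparative example (a counterexample).
docs = {
    "D1": "A resin with high heat resistance. Comparative example: a brittle resin failed.",
    "D2": "A brittle ceramic coating for tableware.",
    "D3": "A resin composition with high heat resistance and low cost.",
}

# NOT "brittle" removes the noise document D2, but also the relevant D1.
kept = not_filter(docs, "brittle")
```

In this toy case only D3 survives the filter, so the relevant document D1 is lost because of a single mention in a counterexample.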

The disclosed technique therefore proposes, among other things, assigning a priority to each content in a set of contents by using keywords related to the subject the operator is interested in, keywords unrelated to it, and corresponding weights. Using the priority to adjust the order in which contents are presented to the operator, or displaying each content together with its priority, makes the contents easier for the operator to use.

The keywords used in this embodiment can be set separately from the keywords used in the search formula, and need not be the same. The set of contents handled by this embodiment may have been obtained by a search using a search formula that includes keywords, or by other means, for example a set of contents collected using AI. In other words, this embodiment does not depend on how the target set of contents was collected.
In the disclosed embodiment, content means an expression, such as text, images, video, or audio, that includes subject matter expressed in language.

An operator who drafts search formulas has a certain level of knowledge and understanding of technical terms. The operator can therefore designate keywords closely related to contents that the operator judges to be important. The operator can also designate keywords closely related to contents that the operator judges to be unimportant (noise documents). In addition, the operator is expected to be able to designate synonyms and near-synonyms of each keyword.

A person with enough technical skill to create a search formula can often select (or designate) related and unrelated keywords without reviewing any contents. If the operator has also reviewed some of the contents in the search results, the operator should be able to select (or designate) keywords related and unrelated to the target contents more appropriately.

Furthermore, if a list of the terms contained in the set of search-result (or reviewed) contents is presented to the operator in an easily understandable form by means of text mining, statistical processing, or the like, selecting related and unrelated keywords becomes easier. Here, the operator cannot predict in advance which keywords the target contents contain. The target contents may also contain many synonyms, near-synonyms, and/or variant notations of the keywords the operator used in the search formula. Selecting keywords from a list of terms contained in the target contents can therefore improve the accuracy of the method.

In this specification, a keyword closely related to content that the operator judges to be important (positive content, that is, documents targeted by the search) is called a positive keyword. A keyword closely related to content that the operator judges to be unimportant (negative content, that is, noise documents) is called a negative keyword.
<Description of the sets of search results>
FIGS. 1A and 1B show various sets of contents obtained by a search.

As shown in FIG. 1A, a set V is obtained by narrowing down a population U of contents using a search formula or the like. Normally, the operator reviews the contents in the narrowed-down set V one by one and judges whether each content is important.

The positive set R1 in FIG. 1A is the set of contents that the operator has judged to be important among the contents of set V that the operator has reviewed. In other words, the positive set R1 is defined as the set of contents given a positive evaluation as a result of the operator's review (positive contents).

The negative set G1 is the set of contents that the operator has judged to be unimportant among the contents of set V that the operator has reviewed. In other words, the negative set G1 is defined as the set of contents given a negative evaluation as a result of the operator's review (negative contents).

The other set T1 is the set of contents, among the contents of set V that the operator has reviewed, that are neither important nor unnecessary (or whose review was insufficient to reach an adequate evaluation). In other words, the other set T1 is defined as the set of contents given an evaluation that is neither negative nor positive as a result of the operator's review.
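The relationship between set V and the sets R1, G1, and T1 can be sketched as a simple partition of the reviewed contents by evaluation. This is only an illustration: the `Rating` enum, the function `partition_reviewed`, and the sample ids are hypothetical names and data, not part of the patent.

```python
from enum import Enum

class Rating(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    OTHER = "other"

def partition_reviewed(ratings):
    """Split reviewed content ids into the sets R1, G1, and T1 by rating."""
    r1 = {cid for cid, r in ratings.items() if r is Rating.POSITIVE}
    g1 = {cid for cid, r in ratings.items() if r is Rating.NEGATIVE}
    t1 = {cid for cid, r in ratings.items() if r is Rating.OTHER}
    return r1, g1, t1

# Hypothetical data: set V holds six contents, four of them reviewed so far.
set_v = {1, 2, 3, 4, 5, 6}
ratings = {
    1: Rating.POSITIVE,
    2: Rating.NEGATIVE,
    3: Rating.OTHER,
    4: Rating.POSITIVE,
}

r1, g1, t1 = partition_reviewed(ratings)
unreviewed = set_v - set(ratings)  # contents of V not yet reviewed
```

Once every content in V has been reviewed, `unreviewed` is empty and the three sets cover V completely.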

Normally, the operator reviews the contents in the narrowed-down set V one by one until all of them have been reviewed, giving each content an evaluation such as positive, negative, or other. The viewpoint of evaluation can vary with the purpose of the operator's search. Search purposes are diverse: obtaining patent rights, obtaining material for invalidating other companies' patents, preventing infringement, understanding other companies' technologies, obtaining basic material for research and development, and so on. Needless to say, the viewpoint from which importance (priority) is judged differs with the search purpose, even for the same content.

FIG. 1B shows an example of the state in which the operator has finished reviewing all contents in the search-result set V. That is, set V comprises a positive set R0, a negative set G0, and an other set T0.
To obtain the sets shown in FIG. 1B, it is generally desirable for the operator to review all contents in set V.
<Embodiment>

In the embodiment described below, at least the positive set R0 can be estimated in advance by assigning higher priority to contents presumed to be more important to the operator.

With this embodiment, the operator can refer to the priority assigned to each content and preferentially browse (review) the contents that are likely to be important to the operator. By handling contents in descending order of priority, the operator can process the contents belonging to set V appropriately, more easily and in less time.
FIG. 2A is an example of information on which the selection of positive keywords can be based.
In FIG. 2A, keywords are listed along the horizontal axis, and "keyword index 1" is plotted on the vertical axis for each keyword.

Referring to the graph of FIG. 2A, the operator can select at least one positive keyword. Examples of "keyword index 1" in FIG. 2A include the following.

- Examples of keyword index 1
1. The number of occurrences of a word in some or all of the contents in the search-result set V
2. The average number of occurrences of a word per content given a positive evaluation in review
3. [The average number of occurrences of a word per content given a positive evaluation in review] - [the average number of occurrences of a word per content given a negative evaluation in review]
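The three example indices listed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the toy texts, and the English word tokenization are hypothetical, and the patent does not prescribe any particular implementation.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def total_occurrences(texts):
    """Example 1: occurrences of each word across the given contents."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

def mean_occurrences(texts):
    """Example 2: average occurrences of each word per content."""
    if not texts:
        return {}
    counts = total_occurrences(texts)
    return {word: c / len(texts) for word, c in counts.items()}

def positive_minus_negative(pos_texts, neg_texts):
    """Example 3: mean over positive contents minus mean over negative ones."""
    pos = mean_occurrences(pos_texts)
    neg = mean_occurrences(neg_texts)
    return {w: pos.get(w, 0.0) - neg.get(w, 0.0) for w in set(pos) | set(neg)}

# Hypothetical reviewed contents.
positive_texts = ["heat resistant resin resin", "resin film"]
negative_texts = ["ceramic film", "ceramic ceramic glaze"]

index1 = total_occurrences(positive_texts + negative_texts)
index2 = mean_occurrences(positive_texts)
index3 = positive_minus_negative(positive_texts, negative_texts)
```

In this toy data, a word such as "film" that appears equally often in positive and negative contents gets a difference of zero under the third index, which suggests it is a poor candidate for either kind of keyword.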

In the index examples above, the "number of occurrences" need not be the number of occurrences of the word in the entire content; it may be the number of occurrences in a part of the content. For example, when the contents include patent documents, occurrences may be counted only within the claims of each patent document.
FIG. 2B shows a table in which the operator has designated positive keywords and corresponding weights (r).

The weight (r) corresponding to a keyword is desirably set to a larger value greater than or equal to zero the more strongly the keyword is associated with contents that are important to the operator (that is, positive). Conversely, the weight (r) is desirably set to a value less than or equal to zero with a larger absolute value the more strongly the keyword is associated with contents that are unimportant to the operator (that is, negative). How the weight (r) is used is described later.

When, for example, the operator has not yet reviewed any contents, the graph of FIG. 2A may not exist at all. The operator may therefore designate positive keywords and corresponding weights (r) freely, based on the operator's own knowledge, without referring to the graph of FIG. 2A (or regardless of whether it exists). Alternatively, a list of the words contained in some or all of the contents belonging to set V may be presented to the operator, and the operator may select positive keywords from that list. It is desirable that at least one positive keyword be designated.
When a weight (r) is set to zero, the result is the same as if the corresponding positive or negative keyword had not been designated. Setting the weight (r) to zero therefore has the same effect as cancelling the designation of a positive or negative keyword, which simplifies cancelling a designated keyword.

The "synonyms/variant notations" may be designated automatically by a computer referring to a dictionary. Alternatively, the computer may refer to a dictionary to present candidate "synonyms/variant notations" to the operator and prompt the operator to choose among them, or the operator may set them directly. A word designated as a "synonym/variant notation" is desirably treated in the same way as (as the same word as) the corresponding positive keyword.
FIG. 3A is an example of a graph on which the selection of negative keywords can be based.
In FIG. 3A, "keyword index 2" is plotted on the vertical axis for each word.

Referring to the graph of FIG. 3A, the operator can select at least one negative keyword. Examples of "keyword index 2" in FIG. 3A include the following.
- Examples of keyword index 2
4. The number of occurrences of a word in some or all of the contents in the search-result set V
5. The average number of occurrences of a word per content given a negative evaluation in review
6. [The average number of occurrences of a word per content given a negative evaluation in review] - [the average number of occurrences of a word per content given a positive evaluation in review]
FIG. 3B shows a table in which the operator has designated negative keywords and corresponding weights (r).

The weight (r) is desirably set to a negative value with a larger absolute value the less important (that is, the more negative) the content is to the operator. How the weight (r) is used is described later.

When, for example, the operator has not yet reviewed any contents, the graph of FIG. 3A may not exist at all. The operator may therefore designate negative keywords and corresponding weights (r) freely, based on the operator's own knowledge, without referring to the graph of FIG. 3A (or regardless of whether it exists). Alternatively, a list of the words contained in some or all of the contents belonging to set V may be presented to the operator, and the operator may select negative keywords from that list. It is desirable that at least one negative keyword be designated. When a weight (r) is set to zero, the result is the same as if the corresponding negative keyword had not been designated.

The "synonyms/variant notations" may be designated automatically by a computer referring to a dictionary. Alternatively, the computer may refer to a dictionary to present candidate "synonyms/variant notations" to the operator and prompt the operator to choose among them, or the operator may set them directly. A word designated as a "synonym/variant notation" is desirably treated in the same way as (as the same word as) the corresponding negative keyword.
The positive and negative evaluations described above are examples of evaluation values.
FIG. 4 is a table in which the totals (Total(m)) corresponding to each content in the search-result set V are arranged in descending order.

For example, for a content m belonging to the search result set V, the total Total(m), obtained by summing over all keywords the product of the number of appearances of each keyword and the weight corresponding to that keyword, is defined as follows.

$$\mathrm{Total}(m) = \sum_{n=1}^{u} c(m)_n \, r_n$$

where
m: a number identifying each content included in the search result set V (one of the consecutive integers starting from 1)
u: the number of positive keywords plus the number of negative keywords, i.e., the total number of keywords
c(m)_n: the number of appearances of the n-th keyword (positive or negative) in content m
r_n: the weight r corresponding to the n-th keyword
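As a minimal sketch of this definition, the following Python function computes Total(m) for one content. The function name `total`, the representation of a content as a plain string, and the use of substring counting for c(m)_n are assumptions made for illustration; the source does not specify how appearances are counted.

```python
def total(text: str, weights: dict[str, float]) -> float:
    """Return Total(m) = sum over n of c(m)_n * r_n for one content.

    text    -- the content m as a plain string (assumed representation)
    weights -- maps each keyword to its weight r_n; negative keywords
               carry weights <= 0
    """
    # str.count gives the number of non-overlapping occurrences, used
    # here as a stand-in for the appearance count c(m)_n.
    return sum(text.count(keyword) * r for keyword, r in weights.items())


# Hypothetical content and weights, chosen only to make the arithmetic visible.
example_text = "heater heater sensor pump"
example_weights = {"heater": 5, "sensor": 2, "pump": -3}
example_total = total(example_text, example_weights)  # 2*5 + 1*2 + 1*(-3) = 9
```

A keyword with weight zero contributes nothing to the sum, so it behaves exactly as if it had been left out of `weights`.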

The above total, i.e., the sum Total(m) over keywords of the product of the number of appearances of each keyword in content m of the search result set V and the weight corresponding to that keyword, is an example of the priority of content m.

The weight corresponding to a positive keyword is desirably a value greater than or equal to zero, and the weight corresponding to a negative keyword is desirably a value less than or equal to zero.
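The sign convention can be checked mechanically before any totals are computed. The helper below is a hedged sketch: the name `sign_violations` and the use of two separate dictionaries for positive and negative keywords are assumptions, and because the source only calls the convention desirable, the function reports offending keywords rather than rejecting them.

```python
def sign_violations(positive: dict[str, float], negative: dict[str, float]) -> list[str]:
    """List keywords whose weight breaks the desired sign convention.

    Positive keywords should have weight >= 0 and negative keywords
    should have weight <= 0. Zero is acceptable on both sides.
    """
    bad = [kw for kw, r in positive.items() if r < 0]
    bad += [kw for kw, r in negative.items() if r > 0]
    return bad


# Hypothetical keyword sets: "valve" is listed as negative but has weight +1.
example_bad = sign_violations({"heater": 5, "sensor": 0}, {"pump": -3, "valve": 1})
```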

In a given content, the more often a given positive keyword appears, the more likely it is that the content is closely related to the technical subject matter the operator is looking for. In addition, the larger the weight of that positive keyword, the more likely it is that a content containing it is close to the technical subject matter the operator is looking for.

In a given content, the more often a given negative keyword appears, the more likely it is that the content is only distantly related to the technical subject matter the operator is looking for. In addition, the larger the absolute value of that negative keyword's weight (a value less than or equal to zero), the more likely it is that a content containing it is remote from the technical subject matter the operator is looking for.

Accordingly, for a given content, the product of the number of appearances of a keyword (positive or negative) and the weight corresponding to that keyword is one element of an index of how important the content is to the operator. The total of these products over all keywords (positive or negative) contained in the content can then serve as an index (priority) of how close the content is to the technical subject matter the operator is looking for.
It can therefore be estimated that the larger the total (priority) corresponding to a content, the closer that content is to the technical subject matter the operator is looking for.
FIG. 4A is a table in which the totals of the products of the number of keyword appearances and the corresponding keyword weights for the respective contents included in set V are arranged in descending order.

In table 400 of FIG. 4A, the higher a content is positioned in the table, the higher its estimated priority. Conversely, the lower a content is positioned in the table, the lower its estimated priority.

For example, the content with content number 45, positioned at the top of table 400 in FIG. 4A, is the content with the largest value of the total Total(m). That is, for this content the sum of the products of the number of appearances of each keyword (positive or negative) and the corresponding weight is the largest. In table 400 of FIG. 4A, contents positioned higher tend to contain more positive keywords with large weights, and are judged to be higher-priority contents.
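The descending arrangement of table 400 amounts to a sort on the totals. The sketch below assumes the totals are already available as a mapping from content number to Total(m); the function name `rank_contents` and the sample values are hypothetical and are not taken from FIG. 4A.

```python
def rank_contents(totals: dict[int, float]) -> list[tuple[int, float]]:
    """Return (content number, Total(m)) pairs, largest total first.

    Python's sort is stable, so contents with equal totals keep the
    order in which they appear in the input mapping.
    """
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical totals for four contents.
example_ranking = rank_contents({12: 3.0, 45: 17.0, 7: -2.0, 112: 9.0})
# The first entry is the highest-priority content.
```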

FIG. 4B shows the result after all the contents included in set V have been reviewed. The horizontal axis indicates the priority rank. The vertical axis indicates the proportion of the contents judged to be important that are contained in the m contents ranked from priority rank 1 (No. = 1) to priority rank m (the inclusion rate of contents judged to be important).

For example, graph 410 shows that the 43 contents from priority rank 1 to priority rank 43 on the horizontal axis contain 90% of the contents in set V that were judged to be important. In other words, by assigning priorities to the contents using the present embodiment and reviewing the 43 highest-priority contents among the 200 contents belonging to the search result set V, 90% of the important contents in set V can be found through review.
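A curve of the kind plotted in graph 410 can be computed once the contents are ordered by priority and the important ones are known. The following is a sketch under assumptions: the function names, the list-of-ids input format, and the small sample data are all hypothetical, and the figures 43, 200, and 90% in the text come from the patent's own example, not from this code.

```python
def important_found_curve(ranked_ids: list[int], important_ids: set[int]) -> list[float]:
    """For each rank m, return the fraction of important contents
    contained in the top-m contents of the priority ordering."""
    if not important_ids:
        return []
    found = 0
    curve = []
    for content_id in ranked_ids:
        if content_id in important_ids:
            found += 1
        curve.append(found / len(important_ids))
    return curve


def reviews_needed(curve: list[float], target: float):
    """Return the smallest rank m at which the curve reaches the target
    fraction, or None if the target is never reached."""
    for m, fraction in enumerate(curve, start=1):
        if fraction >= target:
            return m
    return None


# Hypothetical ordering of five contents, three of which are important.
example_curve = important_found_curve([45, 112, 12, 7, 30], {45, 12, 30})
example_m = reviews_needed(example_curve, 0.6)
```

Reading the curve at a target such as 0.9 gives the number of top-ranked contents the operator would need to review to reach that coverage.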

Furthermore, by reviewing 100 contents in descending order of priority, the operator can find through review 100% of the contents in set V that are judged to be important. In this way, the present embodiment can provide the operator with content priorities so that the 200 contents included in the search result set V can be reviewed efficiently.
FIG. 5 shows an example of a user interface that assists the operator in changing the selection of keywords and the corresponding weights.

FIG. 5 assumes that 17 of the 200 contents included in set V have been reviewed. For example, assume that 6 of them were judged to be closely related to the operator's purpose and important (positive set R1 in FIG. 1), 8 were judged to be unrelated to the operator's purpose and unnecessary (negative set G1 in FIG. 1), and 3 were judged to belong to the other category (other set T1 in FIG. 1).

The reviewed-content table 520 of FIG. 5A lists the reviewed contents in descending order of Total(m). Column i shows black fill, white, or hatching according to the set (R1, G1, or T1) shown in FIG. 1 to which each content belongs. Referring to this identification in column i, the operator can modify the keywords or corresponding weights shown in table 520 so that as many as possible of the contents belonging to positive set R1 are positioned toward the top of table 520. Arranging for the reviewed contents that belong to positive set R1 to be positioned toward the top of table 520 raises the likelihood that, among the contents not yet reviewed, the positive ones will also receive larger values of Total(m). In other words, it raises the likelihood that positive contents not yet reviewed will be presented with higher priority.

Table 500 of FIG. 5A displays the positive and negative keywords already specified and their corresponding weights 502. The operator can add new keywords to table 520 or delete keywords from it. In addition, a user interface may be provided that lets the operator easily change a weight by sliding the mark 506 of the slide bar 504 left or right with a mouse or the like. Alternatively, the operator may be allowed to change a weight by key input or the like.
Note that setting a weight to zero has the same effect as deleting the keyword (i.e., excluding that keyword from consideration).
FIG. 5B shows a state in which the order of the contents has changed, with corresponding changes in the displayed Total(m) and column i, after a keyword weight has been changed.

The reviewed-content table 501 of FIG. 5B shows an example in which the mark 506 for the keyword "heater" in the reviewed-content table 500 of FIG. 5A has been moved from "5" to "1", changing the weight from "5" to "1". In response to this weight change, the order of the contents in table 521 of FIG. 5B has changed according to the values of Total(m).
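The effect of moving the slider can be reproduced by recomputing the totals with the modified weight and sorting again. This is a sketch only: the per-content keyword counts, the content numbers, and the keywords other than "heater" are invented for illustration, and only the change of the "heater" weight from 5 to 1 mirrors the text.

```python
def totals_for(counts: dict[int, dict[str, int]], weights: dict[str, float]) -> dict[int, float]:
    """Compute Total(m) for every content from per-keyword appearance counts."""
    return {
        cid: sum(c.get(keyword, 0) * r for keyword, r in weights.items())
        for cid, c in counts.items()
    }


def priority_order(counts: dict[int, dict[str, int]], weights: dict[str, float]) -> list[int]:
    """Return content numbers ordered by descending Total(m)."""
    t = totals_for(counts, weights)
    return sorted(t, key=t.get, reverse=True)


# Hypothetical appearance counts for three reviewed contents.
counts = {
    1: {"heater": 4},
    2: {"heater": 1, "sensor": 3},
    3: {"sensor": 2, "pump": 1},
}
before = priority_order(counts, {"heater": 5, "sensor": 4, "pump": -3})
after = priority_order(counts, {"heater": 1, "sensor": 4, "pump": -3})
```

With the heavier "heater" weight, content 1 leads the list; after the weight is lowered it falls to last place, which is the kind of reordering the interface is meant to make visible.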

Compared with the reviewed-content table 520 of FIG. 5A, the reviewed-content table 521 of FIG. 5B places more of the positive contents in the reviewed positive set R1 at higher positions. In addition, the lowest-ranked content of the positive set in table 520 of FIG. 5A (m=112) rises from 11th place in table 520 of FIG. 5A to 7th place in table 521 of FIG. 5B.

From the above, the operator can easily recognize that the pattern of keywords and corresponding weights in table 501 of FIG. 5B is preferable to the pattern in table 500 of FIG. 5A.
The deletion and addition of keywords are not illustrated with drawings, but those skilled in the art will understand that the order of the contents changes in response to keyword modifications.

By trying modifications to the keywords or the corresponding weights as appropriate, the operator can arrange for as many as possible of the contents belonging to the reviewed positive set R1 to be positioned toward the top of table 521 of FIG. 5B.

The user interfaces of FIGS. 5A and 5B are examples, and the user interface is not limited to them. In FIG. 5, table 520 of FIG. 5A and table 521 of FIG. 5B include only reviewed contents, but they may instead include all contents belonging to set V. In that case as well, the operator may modify the keywords or weights so that the contents belonging to the reviewed positive set R1 are positioned as high in the table as possible. Alternatively, the operator may modify the keywords or weights so that the lowest-ranked content among those belonging to the reviewed positive set R1 is ranked as high as possible.
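The second criterion, the rank of the lowest-ranked positive content, is easy to express as a single number that can be compared across weight patterns. The sketch below assumes a priority ordering is already available as a list of content numbers; the function name and sample orderings are hypothetical.

```python
def lowest_positive_rank(ordered_ids: list[int], positive_ids: set[int]):
    """Return the 1-based rank of the lowest-ranked content that belongs
    to the reviewed positive set, or None if no such content is listed.

    A smaller return value means the positive contents sit closer to
    the top of the table.
    """
    ranks = [rank for rank, cid in enumerate(ordered_ids, start=1) if cid in positive_ids]
    return max(ranks) if ranks else None


# Two hypothetical orderings of the same five contents under different weights.
positive = {3, 5, 9}
rank_a = lowest_positive_rank([3, 8, 5, 1, 9], positive)
rank_b = lowest_positive_rank([3, 5, 9, 8, 1], positive)
```

Under this measure the second ordering would be preferred, because its worst-placed positive content sits higher in the list.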

The pattern of keywords and corresponding weights that the operator recognizes as preferable is then determined. Using the determined pattern, the total (i.e., the priority) of each content belonging to set V is provided to the operator. By reviewing the contents in descending order of priority, the operator can preferentially review the contents presumed to be important to the operator.
FIG. 6A shows the processing flow of the embodiment. The processing flow shown in FIG. 6A is described below.

[Step S102] A plurality of contents (i.e., the contents belonging to set V) are identified. For example, the operator identifies the plurality of target contents by using a search expression to obtain the search result set V from the content population U.
[Step S104] Keyword information including a plurality of keywords and corresponding weights (which may be positive or negative) is received.

[Step S106] For each of the plurality of contents, the product of the number of appearances of each keyword and the corresponding weight is calculated, and the total of these products over the plurality of keywords is derived.

[Step S108] The presentation priority of the plurality of contents is determined based on the total corresponding to each content. Note that the use of this total (priority) is not limited to prioritizing the order in which contents are presented to the operator. For example, the total (priority) may be displayed together with a content so that the operator can recognize the estimated importance of the displayed content.
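Steps S102 to S108 can be strung together in a few lines. The sketch below is an illustration under assumptions, not the patented implementation: the search expression is reduced to a single required term, contents are plain strings, appearances are counted with substring matching, and the function name `process_contents` is hypothetical.

```python
def process_contents(population: dict[int, str], search_term: str,
                     weights: dict[str, float]) -> list[int]:
    """Return content numbers of set V in descending order of priority."""
    # S102: identify set V as the contents of population U matching the search.
    set_v = {cid: text for cid, text in population.items() if search_term in text}

    # S104: the keyword information (keywords and weights) arrives via `weights`.

    # S106: for each content, sum appearance count times weight over the keywords.
    totals = {
        cid: sum(text.count(keyword) * r for keyword, r in weights.items())
        for cid, text in set_v.items()
    }

    # S108: the presentation priority follows the totals, largest first.
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical population U of four short contents.
population_u = {
    1: "heater control with heater sensor",
    2: "pump control",
    3: "heater pump pump control",
    4: "display panel",
}
example_order = process_contents(population_u, "control", {"heater": 5, "pump": -3})
```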

With the total (priority) corresponding to each content obtained by the above processing, the operator can process the contents belonging to set V efficiently.
FIG. 6B is a flowchart showing the detailed processing of step S102 for identifying the plurality of contents. The processing is described below.

[Step S112] A plurality of words related to the plurality of contents are extracted. The plurality of contents may be all of the contents belonging to set V or only some of them. Alternatively, words may be extracted from only a portion of each content (for example, the claims in the case of a patent document).

[Step S114] The extracted words are presented to the operator. Positive keywords or negative keywords may be identified based on the presented words. Alternatively, positive keywords or negative keywords determined by the operator independently may be identified.
Presenting the words to the operator makes it easier for the operator to identify positive keywords or negative keywords.
FIG. 7 is a flowchart showing the detailed processing of step S112 for extracting a plurality of words related to the plurality of contents. The processing is described below.

[Step S202] A positive or negative evaluation value given by the operator to each of the plurality of contents the operator has reviewed is received. Through this processing, the positive set R1 and the negative set G1 shown in FIG. 1 are identified. The other set T1 may also be identified, although it does not necessarily have to be.

[Step S204] The plurality of words are identified such that, among the words related to the plurality of contents given evaluation values, positive words more strongly related to the contents given a positive evaluation value and negative words more strongly related to the contents given a negative evaluation value can be presented in a distinguishable manner.

As already described with reference to FIGS. 2 and 3, this processing makes it easier for the operator to determine the positive keywords, the negative keywords, or their corresponding weights.
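One way to separate positive words from negative words in steps S202 and S204 is to compare how many reviewed contents of each kind contain a word. The sketch below is an assumption made for illustration: the criterion actually used is the one described with FIGS. 2 and 3, which this passage does not restate, and the tokenizer here handles only lowercase Latin letters, so Japanese text would need a morphological analyzer instead.

```python
import re
from collections import Counter


def split_words(reviewed: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split words into those seen in more positive than negative contents
    and those seen in more negative than positive contents.

    reviewed -- (text, label) pairs, where label is "positive" or "negative";
                any other label (e.g. the other set T1) is ignored.
    """
    pos_df, neg_df = Counter(), Counter()
    for text, label in reviewed:
        words = set(re.findall(r"[a-z]+", text.lower()))  # one vote per content
        if label == "positive":
            pos_df.update(words)
        elif label == "negative":
            neg_df.update(words)
    positive_words = sorted(w for w in pos_df if pos_df[w] > neg_df[w])
    negative_words = sorted(w for w in neg_df if neg_df[w] > pos_df[w])
    return positive_words, negative_words


# Hypothetical reviewed contents with operator-assigned evaluation values.
example_reviewed = [
    ("heater with sensor", "positive"),
    ("heater control", "positive"),
    ("pump control", "negative"),
    ("pump with valve", "negative"),
]
example_positive, example_negative = split_words(example_reviewed)
```

Words that appear equally often on both sides, such as "control" and "with" here, fall into neither list. Raw counts favor whichever set has more reviewed contents, so a real implementation would probably normalize by set size.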

FIG. 8A is a flowchart showing the detailed processing of step S104, which receives keyword information including a plurality of keywords and corresponding weights (which may be positive or negative). The processing is described below.

[Step S302] At least one of a modification of the designated plurality of keywords and a modification of the corresponding weights is received. This processing yields more suitable keywords (positive or negative keywords) or corresponding weights.

FIG. 8B is a flowchart showing the detailed processing of step S108, which determines the presentation priority of the plurality of contents based on the total corresponding to each content. The processing is described below.
[Step S312] In response to receiving the modification, the priority is changed so as to present the change in priority corresponding to the contents given evaluation values.
Through this processing, a priority desirable for the operator is assigned to each content belonging to set V.
The operator can use this priority to carry out content review and similar tasks efficiently.

FIG. 9 is a flowchart showing the detailed processing of step S312, which changes the priority, in response to receiving the modification, so as to present the change in priority corresponding to the contents given evaluation values. The processing is described below.
[Step S402] The evaluation values are associated with the priorities so that the operator can identify the evaluation values. As described with reference to FIGS. 5A and 5B, this processing allows the operator to determine the keyword modifications (positive or negative keywords) or the corresponding weights so that the totals (priorities) corresponding to the contents belonging to positive set R1 become as high as possible.
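Step S402 can be pictured as attaching each content's evaluation value to its row in the priority ordering, which is the role column i plays in FIGS. 5A and 5B. In the sketch below, the function name, the tuple layout, and the "unreviewed" placeholder are assumptions; the set names R1, G1, and T1 follow FIG. 1.

```python
def labelled_ranking(totals: dict[int, float], labels: dict[int, str]) -> list[tuple]:
    """Return rows of (rank, content number, Total(m), evaluation label),
    ordered by descending Total(m).

    labels -- maps reviewed content numbers to "R1", "G1" or "T1";
              contents without an entry are marked "unreviewed".
    """
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, cid, value, labels.get(cid, "unreviewed"))
        for rank, (cid, value) in enumerate(ordered, start=1)
    ]


# Hypothetical totals and evaluation values.
example_rows = labelled_ranking(
    {45: 17, 112: 9, 12: 3, 7: -2},
    {45: "R1", 7: "G1", 12: "T1"},
)
```

Recomputing these rows after each keyword or weight modification lets the operator see at a glance whether the R1 rows have moved upward.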

FIG. 10 is a block diagram showing the functions of the embodiment. The block diagram is described below.
The content identification unit 1002 identifies, for example, various information about the contents from the search results.

The word extraction unit 1004 receives, for example, the positive set R1, the negative set G1, or the other set T1, extracts the words present in these sets, and can present them to the operator. The word extraction unit 1004 may extract words from all or some of the contents belonging to set V. Note that there may be cases in which the word extraction unit 1004 does not operate. In such cases, the keyword identification unit 1006 and the weight determination unit 1008 described next may operate under the operator's direction to identify the keywords and their weights.

The keyword identification unit 1006 identifies the keywords (positive or negative keywords). The keywords may be selected by the operator from the presented list of words. Alternatively, keywords specified by the operator independently may be used.
The weight determination unit 1008 can determine the weights corresponding to the keywords based on instructions (or modification instructions) from the operator.
The dictionary storage unit 1010 is used to extract synonyms, near-synonyms, and/or variant notations corresponding to a keyword.
The content priority determination unit 1012 calculates the total (priority) corresponding to each content, as described above.
The calculated priority is used by the operator to process the contents efficiently.
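The role of the dictionary storage unit 1010 can be sketched as a lookup that expands a keyword into its variants before counting, so that synonyms and variant notations are treated as the same word. The dictionary format, the function name, and the sample entries below are assumptions for illustration only.

```python
def count_with_synonyms(text: str, keyword: str, synonyms: dict[str, list[str]]) -> int:
    """Count appearances of a keyword together with its registered variants.

    synonyms -- maps a keyword to its synonyms / variant notations
                (a stand-in for the contents of the dictionary storage unit).
    """
    variants = {keyword, *synonyms.get(keyword, ())}
    # Caveat: if one variant is a substring of another (for example
    # "heater" and "heaters"), the same passage is counted twice.
    return sum(text.count(variant) for variant in variants)


# Hypothetical dictionary entry and content.
example_dictionary = {"heater": ["heating element"]}
example_count = count_with_synonyms(
    "heater unit, heating element, heater", "heater", example_dictionary
)
```

The combined count would then take the place of c(m)_n for that keyword, with the keyword's own weight applied once.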

FIG. 11 shows the hardware configuration of the embodiment. The hardware configuration includes a CPU 3001; a ROM 3002 in which the program, database, and/or data of the present embodiment can be stored; a RAM 3003; a network interface 3005; an input interface 3006; a display interface 3007; and an external memory interface 3008. These hardware components are interconnected by a bus 3004.

The network interface 3005 is connected to a network 3015. The network 3015 may be a wired LAN, a wireless LAN, the Internet, a telephone network, or the like. An input unit 3016 is connected to the input interface 3006. A display unit 3017 is connected to the display interface 3007. A storage medium 3018 is connected to the external memory interface 3008. The storage medium 3018 may be a RAM, a ROM, a CD-ROM, a DVD-ROM, a hard disk, a memory card, a USB memory, or the like.
The programs and methods that implement the embodiments described above can be executed by a computer having the hardware configuration shown in FIG. 11.

The embodiments described above are not mutually exclusive; part of one embodiment may be incorporated into another embodiment, or part of one embodiment may be replaced with part of another embodiment.

In addition, the order of the flows in the illustrated flowcharts may be changed as long as no inconsistency arises. Likewise, as long as no inconsistency arises, a single illustrated flow may be executed multiple times at different timings. Multiple steps may be executed simultaneously. Each step may be implemented by executing a program stored in a memory (non-transitory memory).

Part of the programs of the disclosed embodiments may be implemented by a general-purpose program such as an operating system, or by hardware. In addition, the disclosed programs may be executed in a distributed manner on multiple pieces of hardware.

The programs that implement the embodiments described above can be executed by a computer having the hardware configuration shown in FIG. 11. The programs of the embodiments may also be implemented as a method executed by a computer.

It goes without saying that the above embodiments do not limit the invention described in the claims and are to be treated as examples. Furthermore, the text, audio, and the like that may be included in the contents targeted by the disclosed technique and the claimed invention are not limited to any specific language; they may be expressed in any language, and multiple languages may be mixed.

1002 Content identification unit
1004 Word extraction unit
1006 Keyword identification unit
1008 Weight determination unit
1010 Dictionary storage unit
1012 Content priority determination unit

Claims

1. A content processing method in which a computer executes content processing for determining a presentation priority for each of a plurality of contents, the method comprising:
identifying the plurality of contents, including receiving an evaluation value given by an operator to each of a plurality of contents, among the plurality of contents, that the operator has reviewed;
receiving keyword information including a plurality of keywords designated by the operator and a weight corresponding to each of the plurality of keywords, including receiving a modification of the weights corresponding to the designated plurality of keywords;
deriving, for each of the plurality of contents, a total corresponding to that content by summing, over the plurality of keywords, the product of the number of appearances of each keyword and the weight corresponding to that keyword; and
determining the presentation priority of each of the plurality of contents based on the total corresponding to that content, including, in response to receiving the modification of the weights, changing the presentation priority based on the total derived using the modification, and displaying on a display the change in the presentation priority corresponding to the contents given the evaluation values, in association with the evaluation values, so that the operator can recognize the change.

2. The content processing method according to claim 1, wherein the evaluation value includes a positive evaluation value or a negative evaluation value.

3. The content processing method according to claim 1 or 2, wherein receiving the keyword information further includes receiving a modification of the plurality of keywords.

4. The content processing method according to any one of claims 1 to 3, wherein identifying the plurality of contents includes:
extracting, from the plurality of contents, a plurality of words related to the plurality of contents; and
presenting the plurality of words to the operator so that the operator identifies the plurality of keywords from among the plurality of words.

5. The content processing method according to claim 4, wherein the extracting extracts the plurality of words from a predetermined portion of the plurality of contents.

6. The content processing method according to claim 4 or 5, wherein the extracting includes identifying the plurality of words such that, among the words related to the plurality of contents given the evaluation values, positive words more strongly related to the contents given the positive evaluation value and negative words more strongly related to the contents given the negative evaluation value are presented in a distinguishable manner.

7. The content processing method according to any one of claims 1 to 6, wherein the weight corresponding to each of the plurality of keywords is a positive value (including zero) or a negative value (including zero).

8. The content processing method according to any one of claims 1 to 7, wherein the content includes at least one of text, images, and audio.

9. A program that causes a computer to execute the method according to any one of claims 1 to 8.