JP2021033994A

JP2021033994A - Text processing method, apparatus, device and computer readable storage medium

Info

Publication number: JP2021033994A
Application number: JP2019209171A
Authority: JP
Inventors: シーホングオ; Xihong Guo; シンユグオ; xin yu Guo; アンシンリー; Anxin Li; ランチン; Lan Chen; 大志池田; Hiroshi Ikeda; 吉村　健; Takeshi Yoshimura; 健吉村; 拓藤本; Hiroshi Fujimoto
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2019-08-20
Filing date: 2019-11-19
Publication date: 2021-03-01
Anticipated expiration: 2039-11-19
Also published as: JP7414357B2; CN112487136A

Abstract

To provide a text processing method, apparatus, device and computer readable storage medium that can efficiently extract and generate a summary from text. SOLUTION: A text processing device 400 comprises: a pre-processing unit configured to pre-process source text and generate a plurality of word vectors for a plurality of words; a sentence vector determination unit configured to determine a plurality of sentence vectors on the basis of a plurality of initial recommendation weight vectors and the plurality of word vectors; a recommendation probability determination unit configured to adjust the plurality of initial recommendation weight vectors on the basis of the relevance of each sentence vector to the other sentence vectors of the plurality of sentence vectors, and to determine a recommendation probability distribution for the plurality of words; and an output unit configured to determine the word to be output based on the recommendation probability distribution. SELECTED DRAWING: Figure 4

Description

The present disclosure relates to the field of text processing, and specifically to a text processing method, apparatus, device, and computer-readable storage medium.

In a conventional text generation process, the output of a text-generating network is the result of learning from training data. For example, in scenarios that generate text such as summaries, the correct answers in much of the training data are concentrated in the first few sentences of the text content, so a network trained on such data also tends to generate new text from the first sentences of the content. Current text processing methods therefore lack an efficient means of summarizing and extracting text content.

The present disclosure provides a text processing method, apparatus, device, and computer-readable storage medium for efficiently extracting and generating a summary from text.

In one aspect of the present disclosure, there is provided a text processing apparatus comprising: a preprocessing unit configured to preprocess source text to generate a plurality of word vectors for a plurality of words; a sentence vector determination unit configured to determine a plurality of sentence vectors based on a plurality of initial recommended weight vectors and the plurality of word vectors; a recommended probability determination unit configured to adjust the plurality of initial recommended weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, thereby determining a recommended probability distribution for the plurality of words; and an output unit configured to determine a word to be output based on the recommended probability distribution.

In some embodiments, the sentence vector determination unit is configured to process the plurality of word vectors using an encoding neural network to determine a current encoding hidden state vector corresponding to each word vector, and to determine, based on each initial recommended weight vector and the current encoding hidden state vectors, the sentence vector corresponding to that initial recommended weight vector.

In some embodiments, the output unit is configured to determine a current decoding hidden state vector using a decoding neural network based on the current encoding hidden state vectors, to determine a current word probability distribution using the current encoding hidden state vectors and the current decoding hidden state vector, and to determine the word to be output based on the current word probability distribution and the recommended probability distribution.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is configured to adjust the attention probability distribution using the recommended probability distribution to determine an adjusted attention probability distribution, to determine an output word probability distribution as a weighted sum of the generation probability distribution and the adjusted attention probability distribution, and to determine the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is configured to determine weights for the generation probability distribution, the attention probability distribution, and the recommended probability distribution, to determine the output word probability distribution based on those weights, and to determine the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the recommended probability determination unit further includes a relevance determination subunit configured, for each sentence vector, to combine that sentence vector with another sentence vector to generate a combined sentence vector, and to process the combined sentence vector using a relevance matrix, thereby determining the relevance between that sentence vector and the other sentence vector.

In some embodiments, the recommended probability determination unit further includes an adjustment subunit configured to determine a recommended coefficient for each sentence vector based on the relevance between that sentence vector and each of the other sentence vectors of the plurality of sentence vectors; to adjust each initial recommended weight vector using the recommended coefficient of the sentence vector corresponding to that initial recommended weight vector, thereby obtaining an adjusted word probability vector; and to determine the recommended probability distribution for the plurality of words based on the adjusted word probability vectors.

In another aspect of the present disclosure, there is provided a text processing method comprising: preprocessing source text to generate a plurality of word vectors for a plurality of words; determining a plurality of sentence vectors based on a plurality of initial recommended weight vectors and the plurality of word vectors; adjusting the plurality of initial recommended weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, thereby determining a recommended probability distribution for the plurality of words; and determining a word to be output based on the recommended probability distribution.

In some embodiments, determining the plurality of sentence vectors based on the plurality of initial recommended weight vectors and the plurality of word vectors includes: processing the plurality of word vectors using an encoding neural network to determine a current encoding hidden state vector corresponding to each word vector; and determining, based on each initial recommended weight vector and the current encoding hidden state vectors, the sentence vector corresponding to that initial recommended weight vector.

In some embodiments, determining the word to be output based on the recommended probability distribution includes: determining a current decoding hidden state vector using a decoding neural network based on the current encoding hidden state vectors; determining a current word probability distribution using the current encoding hidden state vectors and the current decoding hidden state vector; and determining the word to be output based on the current word probability distribution and the recommended probability distribution.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and determining the word to be output based on the current word probability distribution and the recommended probability distribution includes: adjusting the attention probability distribution using the recommended probability distribution to determine an adjusted attention probability distribution; determining an output word probability distribution as a weighted sum of the generation probability distribution and the adjusted attention probability distribution; and determining the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and determining the word to be output based on the current word probability distribution and the recommended probability distribution includes: determining weights for the generation probability distribution, the attention probability distribution, and the recommended probability distribution; determining the output word probability distribution based on those weights; and determining the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors is determined as follows: for each sentence vector, that sentence vector is combined with another sentence vector to generate a combined sentence vector, and the combined sentence vector is processed using a relevance matrix to determine the relevance between that sentence vector and the other sentence vector.

In some embodiments, adjusting the plurality of initial recommended weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors to determine the recommended probability distribution for the plurality of words includes, as performed by the adjustment subunit of the recommended probability determination unit: determining a recommended coefficient for each sentence vector based on the relevance between that sentence vector and each of the other sentence vectors of the plurality of sentence vectors; adjusting each initial recommended weight vector using the recommended coefficient of the sentence vector corresponding to that initial recommended weight vector, thereby obtaining an adjusted word probability vector; and determining the recommended probability distribution for the plurality of words based on the adjusted word probability vectors.

In yet another aspect of the present disclosure, there is provided a text processing device including a processor and a memory storing computer-readable program instructions, wherein the text processing method described above is performed when the computer-readable program instructions are executed by the processor.

In yet another aspect of the present disclosure, there is provided a computer-readable storage medium storing computer-readable instructions that, when executed by a computer, cause the computer to perform the text processing method described above.

According to the text processing method, apparatus, device, and computer-readable storage medium of the present disclosure, the ability of a text summarization method to understand text content is improved based on the relevance between each word in the text and the sentences composed of those words, so that the text content can be better abstracted and summarized and a summary of the text can be generated.

The above and other objects, features, and advantages of the present invention will become apparent from the more detailed description below, based on the embodiments of the present invention and the accompanying drawings. The drawings are used to provide a further understanding of the embodiments of the present disclosure, form part of this specification, and serve together with the embodiments to explain the present disclosure; they do not limit the present disclosure. In the drawings, the same reference numerals denote the same components or steps.
A schematic flowchart of the text processing method according to the present disclosure. A schematic diagram of determining the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors according to an embodiment of the present disclosure. A schematic diagram of determining the output word probability distribution according to an embodiment of the present disclosure. A schematic diagram of determining the output word probability distribution using the generation probability distribution and the adjusted attention probability distribution according to an embodiment of the present disclosure. A schematic diagram of determining the output word probability distribution using the generation probability distribution, the attention probability distribution, and the recommended probability distribution according to an embodiment of the present disclosure. A schematic block diagram of a text processing apparatus according to an embodiment of the present disclosure. A schematic diagram of a computing device according to an embodiment of the present disclosure.

The technical solutions in the embodiments of the present disclosure are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments that a person skilled in the art can obtain from the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.

Unless otherwise defined, technical or scientific terms used herein have the ordinary meaning understood by a person of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and similar terms used herein do not indicate any order, quantity, or importance, and are used only to distinguish different components. Similarly, words such as "include" or "comprise" mean that the element or article preceding the word covers the elements or articles listed after the word and their equivalents, without excluding other elements or articles. Terms such as "connected" or "coupled" are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect. "Up," "down," "left," "right," and the like indicate only relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may change accordingly.

FIG. 1 shows a schematic flowchart of the text processing method according to the present disclosure. As shown in FIG. 1, in step S102, the source text is preprocessed to generate a plurality of word vectors for a plurality of words.

When the text processing method is executed by a computer, the computer cannot process text data directly, so the source text must first be converted into numerical data. For example, the content of the source text may be one or more sentences. The preprocessing includes performing word segmentation on each sentence to split it into a plurality of words, and converting each word into a word vector of a predetermined dimension. This conversion can be performed, for example, by word embedding.
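The preprocessing of step S102 can be illustrated with a minimal sketch. It is not the method of the disclosure: whitespace splitting stands in for a real word segmenter, and the embedding table is filled with random values rather than trained ones. The names `tokenize`, `build_embeddings`, and `preprocess` are hypothetical.

```python
import random

def tokenize(sentence):
    # Assumption: whitespace splitting stands in for a real word segmenter.
    return sentence.lower().split()

def build_embeddings(vocab, dim, seed=0):
    # Hypothetical embedding table: one fixed random vector per distinct word.
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in sorted(vocab)}

def preprocess(sentences, dim=4):
    # Segment every sentence into words, then map each word to its vector.
    words = [w for s in sentences for w in tokenize(s)]
    table = build_embeddings(set(words), dim)
    return words, [table[w] for w in words]

words, word_vectors = preprocess(["The cat sat", "the dog ran"])
```

Each occurrence of the same word maps to the same vector, and every vector has the predetermined dimension `dim`.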

In step S104, a plurality of sentence vectors S are determined based on a plurality of initial recommended weight vectors and the plurality of word vectors.

In some embodiments, for each time step, the plurality of word vectors generated in step S102 may be processed with an encoding neural network to determine a current encoding hidden state vector corresponding to each word vector. In some implementations, the encoding neural network may be implemented as a long short-term memory (LSTM) network. It will be understood that the encoding neural network may also be implemented as any machine learning model capable of encoding word vectors.

Taking the word vectors generated in step S102 as input, the encoding neural network can output, for the current time step, the current encoding hidden state vectors h_1, h_2, h_3, ... corresponding to the word vectors x_1, x_2, x_3, ..., respectively. The number of encoding hidden state vectors and the number of word vectors may be the same or different. For example, if k word vectors are generated from the source text, the encoding neural network can process these k word vectors to generate k corresponding encoding hidden state vectors, where k is an integer greater than 1.

Next, based on each initial recommended weight vector and the current encoding hidden state vectors, the sentence vector corresponding to that initial recommended weight vector is determined.

In some embodiments, an initial recommended weight vector W may be represented as the vector [w_1, w_2, ..., w_k], where the number of elements of W equals the number of encoding hidden state vectors. Each element of W is the weight coefficient applied to the corresponding encoding hidden state vector when the sentence vector is determined from the current encoding hidden state vectors. Using these weight coefficients, the encoding hidden state vectors corresponding to the word vectors input to the encoding neural network can be combined to form a sentence vector containing information from each word vector. The sentence vector referred to here may be an abstract sentence vector, which need not correspond one-to-one to the sentences in the input text. A sentence vector S may contain information from some or all of the word vectors generated in step S102.

In some implementations, the sentence vector S may be expressed as a weighted average of the current encoding hidden state vectors h_1, h_2, ..., h_k. For example, S may be expressed as W * h, where W = [w_1, w_2, ..., w_k] and h = [h_1, h_2, ..., h_k]^T. Accordingly, a predetermined number of pre-trained initial recommended weight vectors W_1, W_2, ..., W_n can be used to obtain a predetermined number of sentence vectors S_1, S_2, ..., S_n. Here, n and m are integers greater than 1.
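The relation S = W * h can be sketched as follows. The hidden state values and weight vectors are made-up numbers chosen for readability, not trained values, and `sentence_vector` is a hypothetical helper name.

```python
def sentence_vector(weights, hidden_states):
    # weights: one initial recommended weight vector W = [w_1, ..., w_k]
    # hidden_states: k encoding hidden state vectors, each of dimension d
    if len(weights) != len(hidden_states):
        raise ValueError("W needs one weight per encoding hidden state vector")
    d = len(hidden_states[0])
    # S[j] = sum_i w_i * h_i[j], i.e. the product W * h
    return [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(d)]

# Illustrative values only (assumption): k = 3 hidden states of dimension d = 2.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# n = 2 initial recommended weight vectors yield n = 2 sentence vectors.
W = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
S = [sentence_vector(w, h) for w in W]
```

With these values, the first sentence vector mixes the first two hidden states equally, while the second copies the third hidden state.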

In step S106, the plurality of initial recommended weight vectors are adjusted based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, thereby determining a recommended probability distribution for the plurality of words.

FIG. 2 shows a schematic diagram of determining the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors according to an embodiment of the present disclosure. FIG. 2 describes the principle of the present disclosure using five word vectors as an example, but the scope of the present disclosure is not limited to this, and the text processing method may be realized with any other number of word vectors.

As shown in FIG. 2, x_1, x_2, x_3, x_4, and x_5 are word vectors generated from the source text, each corresponding to a word in the source text. The encoding neural network is used to generate the encoding hidden state vectors h_1, h_2, h_3, h_4, and h_5 corresponding to x_1, x_2, x_3, x_4, and x_5, respectively.

FIG. 2 shows three initial recommended weight vectors W_1, W_2, and W_3. The present disclosure is not limited to this, and the text processing method may be realized with any other number of initial recommended weight vectors. As shown in FIG. 2, the sentence vectors S_1, S_2, and S_3 are determined using the initial recommended weight vectors W_1, W_2, and W_3.

For each of the sentence vectors S_1, S_2, and S_3, that sentence vector is combined with another sentence vector to generate a combined sentence vector. The combined sentence vector contains the information of the at least two sentence vectors that were combined. The principle of the present disclosure is described below using the relevance between two sentence vectors as an example, but a person skilled in the art may also combine three or more sentence vectors and determine the relevance among the combined sentence vectors.

For example, as shown in FIG. 2, the relevance λ_{1,2} between sentence vectors S_1 and S_2, the relevance λ_{1,3} between sentence vectors S_1 and S_3, and the relevance λ_{2,3} between sentence vectors S_2 and S_3 can be calculated.

In some implementations, the sentence vector can be concatenated with another sentence vector to obtain a combined sentence vector of higher dimension. For example, if the dimension of a sentence vector S is d, concatenating sentence vectors S_1 and S_2 yields a combined sentence vector S_{1,2} of dimension 2d, where d is an integer greater than 1.

Note that when the relevance between S_1 and S_2 is calculated for S_1, the sentence vectors are concatenated with S_1 first and S_2 second. When the relevance between S_2 and S_1 is calculated for S_2, they are concatenated with S_2 first and S_1 second. In this case, the combined sentence vector S_{1,2} differs from the combined sentence vector S_{2,1}.

In other implementations, a combined sentence vector is generated by applying a vector operation (for example, addition, subtraction, or a vector product) to the two sentence vectors. In this case, the combined sentence vector S_{1,2} and the combined sentence vector S_{2,1} may be the same.

In practice, a person skilled in the art can generate a combined sentence vector that combines the information of at least two sentence vectors in any manner.
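The two ways of combining sentence vectors described above can be compared in a short sketch. The function names `combine_concat` and `combine_add` and the numeric values are illustrative assumptions; addition is only one of the vector operations the text mentions.

```python
def combine_concat(a, b):
    # Concatenation: order matters, result has dimension 2d.
    return list(a) + list(b)

def combine_add(a, b):
    # Element-wise addition: order does not matter, result has dimension d.
    return [x + y for x, y in zip(a, b)]

# Illustrative sentence vectors of dimension d = 2 (assumed values).
s1 = [1.0, 2.0]
s2 = [3.0, 4.0]

s12_concat = combine_concat(s1, s2)
s21_concat = combine_concat(s2, s1)
s12_add = combine_add(s1, s2)
s21_add = combine_add(s2, s1)
```

Concatenation gives S_{1,2} and S_{2,1} as different vectors of dimension 2d, whereas addition gives the same vector in both orders.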

The combined sentence vector is then processed using a relevance matrix to determine the relevance between the sentence vector and the other sentence vector. In some embodiments, the relevance λ_{1,2} between sentence vectors S_1 and S_2 may be expressed as λ_{1,2} = S_{1,2} * Z, where S_{1,2} denotes the combined sentence vector of S_1 and S_2, and Z denotes a trained relevance matrix. Z can be used to calculate the relevance coefficient λ_{1,2} between S_1 and S_2. In some embodiments, the relevance matrix Z projects the combined sentence vector S_{1,2} onto a real-valued relevance coefficient.
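A minimal sketch of λ_{1,2} = S_{1,2} * Z follows, assuming concatenation as the combination and treating Z as a 2d x 1 matrix stored as a flat list. In the disclosure Z is trained; the values here are made up for illustration, and `relevance` is a hypothetical helper name.

```python
def relevance(s_a, s_b, Z):
    # Concatenate s_a first, s_b second, then project onto a real number with Z.
    combined = list(s_a) + list(s_b)
    if len(combined) != len(Z):
        raise ValueError("Z needs one entry per element of the combined vector")
    return sum(c * z for c, z in zip(combined, Z))

# Illustrative values only (assumption): d = 2, so Z has 2d = 4 entries.
s1 = [1.0, 2.0]
s2 = [3.0, 4.0]
Z = [1.0, 0.0, 0.5, -1.0]

lam_12 = relevance(s1, s2, Z)  # relevance computed for S_1
lam_21 = relevance(s2, s1, Z)  # relevance computed for S_2
```

Because the concatenation order differs, `lam_12` and `lam_21` are generally not equal, which matches the remark that S_{1,2} and S_{2,1} differ under concatenation.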

上記の方法によって、文ベクトルＳ_１、Ｓ_２…、Ｓ_ｎのうちの任意の２つの文ベクトルの間の関連性を計算することができる。 By the above method, the relevance between any two of the sentence vectors S_1, S_2, ..., S_n can be calculated.
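The concatenation-based relevance computation described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names (`concat`, `relevance`, `pairwise_relevance`) and the toy values are hypothetical, and Z is taken here as a vector of length 2d so that S_{1,2} * Z yields one real number. In the disclosure, Z is a trained parameter.

```python
from itertools import permutations

def concat(u, v):
    """Concatenate two d-dimensional sentence vectors into a 2d-dimensional one."""
    return list(u) + list(v)

def relevance(s_a, s_b, z):
    """lambda_{a,b} = S_{a,b} * Z, with Z assumed to be a length-2d vector."""
    combined = concat(s_a, s_b)          # S_{a,b}: s_a first, s_b second
    return sum(c * w for c, w in zip(combined, z))

def pairwise_relevance(sentences, z):
    """Relevance for every ordered pair (i, j) with i != j."""
    n = len(sentences)
    return {(i, j): relevance(sentences[i], sentences[j], z)
            for i, j in permutations(range(n), 2)}

# Toy values (hypothetical): three sentence vectors with d = 2.
sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z = [0.5, -0.5, 1.0, 2.0]                # stands in for the trained Z
lam = pairwise_relevance(sentences, z)
```

Because the order of concatenation matters, `lam[(0, 1)]` and `lam[(1, 0)]` generally differ, which matches the remark that S_{1,2} and S_{2,1} are different in this implementation.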

上記の任意の１つの文ベクトルに対して、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性に基づいて、当該文ベクトルの推奨係数を確定することができる。いくつかの実現形態において、当該文ベクトルの推奨係数は、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性の合計として表されてもよい。 For any one of the above sentence vectors, the recommended coefficient of the sentence vector can be determined based on the relevance between the sentence vector and each of the other sentence vectors among the plurality of sentence vectors. In some implementations, the recommended coefficient of the sentence vector may be expressed as the sum of the relevance values between the sentence vector and each of the other sentence vectors among the plurality of sentence vectors.

例えば、文ベクトルＳ_１の推奨係数は、Σλ_１＝λ_１,２＋λ_１,３＋…λ_１,ｍとして表され、文ベクトルＳ_２の推奨係数は、Σλ_２＝λ_２,１＋λ_２,３＋…λ_２,ｍとして表され、このように、各文ベクトルの推奨係数を確定することができる。 For example, the recommended coefficient of the sentence vector S_1 is expressed as Σλ_1 = λ_{1,2} + λ_{1,3} + ... + λ_{1,m}, and the recommended coefficient of the sentence vector S_2 is expressed as Σλ_2 = λ_{2,1} + λ_{2,3} + ... + λ_{2,m}. The recommended coefficient of each sentence vector can be determined in this way.

他の実現形態において、文ベクトルの推奨係数は、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性の加重和として表されても良い。予め確定された重み係数を利用して各文ベクトルとの他の文ベクトルとの関連性に対して重み付け加算を行ってもよい。 In other implementations, the recommended coefficients of the sentence vector may be expressed as a weighted sum of the relationships between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors. Weighting addition may be performed on the relationship between each sentence vector and other sentence vectors using a predetermined weighting coefficient.
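As a rough illustration of the plain and weighted sums described above, the sketch below computes a recommended coefficient per sentence vector from a table of pairwise relevance values. The function name `recommended_coefficients`, the dictionary layout keyed by index pairs, and the numbers are assumptions made for this example only.

```python
def recommended_coefficients(lam, n, weights=None):
    """Return [sum_j w_ij * lambda_{i,j} for j != i] for i in 0..n-1.

    lam     -- dict mapping (i, j) to the relevance lambda_{i,j}
    weights -- optional dict of predetermined weighting coefficients;
               None gives the plain (unweighted) sum
    """
    coeffs = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue                  # a sentence is not compared with itself
            w = 1.0 if weights is None else weights[(i, j)]
            total += w * lam[(i, j)]
        coeffs.append(total)
    return coeffs

# Toy relevance values (hypothetical) for three sentence vectors.
lam = {(0, 1): 2.5, (0, 2): 1.0,
       (1, 0): 0.5, (1, 2): 1.5,
       (2, 0): 2.0, (2, 1): 0.25}
coeffs = recommended_coefficients(lam, 3)
```

Passing a `weights` dictionary with the same keys switches the function from the plain sum to the weighted sum of the second implementation.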

上記の推奨係数は、調整後の単語確率ベクトルを取得するために、対応する文ベクトルを生成するための初期推奨重みベクトルの調整に用いられることができる。例えば、図２に示すように、文ベクトルＳ_１、Ｓ_２及びＳ_３に対応する推奨係数Σλ_１、Σλ_２及びΣλ_３を利用して初期推奨重みベクトルＷ_１、Ｗ_２、Ｗ_３を処理することができる。 The above recommended coefficients can be used to adjust the initial recommended weight vectors that generate the corresponding sentence vectors, in order to obtain adjusted word probability vectors. For example, as shown in FIG. 2, the initial recommended weight vectors W_1, W_2, and W_3 can be processed using the recommended coefficients Σλ_1, Σλ_2, and Σλ_3 corresponding to the sentence vectors S_1, S_2, and S_3.

前述したように、推奨係数は、文ベクトルと他の文ベクトルとの関連性に基づいて確定されるものである。テキストの要約の生成過程でテキストのコンテンツを要約する必要があるため、他の文ベクトルとの関連性が高いほど、当該文ベクトルに含まれる単語ベクトルの情報がテキストのコンテンツの中で重要度が高く、その結果、テキストの要約の内容になる可能性が高いと考えられる。 As mentioned above, the recommended coefficient is determined based on the relationship between the sentence vector and other sentence vectors. Since it is necessary to summarize the content of the text in the process of generating the summary of the text, the higher the relevance to other sentence vectors, the more important the information of the word vector contained in the sentence vector is in the content of the text. High, and as a result, is likely to be the content of a text summary.

いくつかの実施例では、各文ベクトルの推奨係数を、当該文ベクトルに対応する単語確率ベクトルに掛けることにより、その単語確率ベクトルに含まれる、各単語ベクトルの符号化隠れ状態ベクトルに対する重み係数を調整することができる。例えば、調整後のｉ番目の単語確率ベクトルＷ_ｉ’は、Ｗ_ｉ’＝Σλ_ｉ＊Ｗ_ｉとして表され得る。 In some embodiments, the recommended coefficient of each sentence vector is multiplied by the word probability vector corresponding to that sentence vector, thereby adjusting the weighting coefficients that the word probability vector holds for the coded hidden state vector of each word vector. For example, the adjusted i-th word probability vector W_i' can be expressed as W_i' = Σλ_i * W_i.

各文ベクトルの推奨係数を利用して当該文ベクトルの単語確率ベクトルを調整した後、上記の方法により得た調整後の複数の単語確率ベクトルＷ’を利用して前記複数の単語の推奨確率分布を確定してもよい。 After the word probability vector of each sentence vector has been adjusted using that sentence vector's recommended coefficient, the recommended probability distribution of the plurality of words may be determined using the adjusted word probability vectors W' obtained by the above method.

いくつかの実施例において、推奨確率分布Ｐ_Ｖは、上記の方法により得た調整後の複数の単語確率ベクトルＷ’の和であるＰ_Ｖ＝ΣＷ_ｉ’として表されてもよい。いくつかの実現形態において、推奨確率分布Ｐ_Ｖは、調整後の複数の単語確率ベクトルＷ_ｉ’の加重和として表されてもよい。 In some embodiments, the recommended probability distribution P_V may be expressed as the sum of the adjusted word probability vectors W' obtained by the above method, that is, P_V = ΣW_i'. In some implementations, the recommended probability distribution P_V may be expressed as a weighted sum of the adjusted word probability vectors W_i'.
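The two steps W_i' = Σλ_i * W_i and P_V = ΣW_i' can be sketched as below. The helper names `adjust` and `recommended_distribution` and the toy numbers are hypothetical. The text does not state whether P_V is normalized afterwards, so this sketch leaves the sum as it is.

```python
def adjust(weight_vectors, coeffs):
    """W_i' = (recommended coefficient of sentence i) * W_i, element-wise."""
    return [[c * w for w in vec] for vec, c in zip(weight_vectors, coeffs)]

def recommended_distribution(adjusted):
    """P_V = sum over i of W_i', giving one score per source word."""
    k = len(adjusted[0])
    return [sum(vec[j] for vec in adjusted) for j in range(k)]

# Toy values (hypothetical): two weight vectors over k = 3 source words.
W = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5]]
coeffs = [2.0, 4.0]                      # recommended coefficients
W_adj = adjust(W, coeffs)
P_V = recommended_distribution(W_adj)
```

In this toy case the second word receives the largest score because both sentence vectors place weight on it.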

図１を参照し、ステップＳ１０８において、前記推奨確率分布に基づいて、出力すべき単語を確定してもよい。 With reference to FIG. 1, in step S108, the word to be output may be determined based on the recommended probability distribution.

ステップＳ１０６で出力する推奨確率分布は、入力したソーステキスト内の各単語のソーステキストの中で重要度を示すことができ、ここで、推奨確率分布内の確率が大きいほど、現在の時間ステップについて、当該単語のソーステキスト内の重要度が高いと考える。そして、いくつかの例において、推奨確率分布内の確率の最大である単語を現在の時間ステップに出力すべき単語として確定してもよい。 The recommended probability distribution output in step S106 can indicate the importance, within the source text, of each word in the input source text. The larger a word's probability in the recommended probability distribution, the more important that word is considered to be in the source text for the current time step. In some examples, the word with the highest probability in the recommended probability distribution may be determined as the word to be output at the current time step.

いくつかの実施例において、推奨確率基づいて、現在の生成式のネットワーク（ＧｅｎｅｒａｔｉｖｅＮｅｔｗｏｒｋｓ）によって生成された単語確率分布を調整することにより、出力単語確率分布を確定してもよい。 In some embodiments, the output word probability distribution may be determined by adjusting the word probability distribution generated by the current generative network (Generative Networks) based on the recommended probabilities.

各時間ステップについて、前記現在の符号化隠れ状態ベクトルに基づいて、復号化ニューラルネットワークを利用して現在の復号化隠れ状態ベクトルを確定することができる。前記現在の符号化隠れ状態ベクトルと現在の復号化隠れ状態ベクトルを利用して現在の単語確率分布を確定することができる。前記現在の単語確率分布と前記推奨確率分布に基づいて、現在の時間ステップについての出力単語確率分布を確定し、出力単語確率分布から最大の確率を有する単語ベクトルに対応する単語を、現在の時間ステップに出力すべき単語として選定することができる。 For each time step, the current decoded hidden state vector can be determined using the decoding neural network based on the current coded hidden state vector. The current word probability distribution can be determined by using the current coded hidden state vector and the current decoded hidden state vector. Based on the current word probability distribution and the recommended probability distribution, the output word probability distribution for the current time step is determined, and the word corresponding to the word vector having the maximum probability from the output word probability distribution is obtained from the output word probability distribution at the current time. It can be selected as a word to be output to the step.

ここで、前記現在の単語確率分布は、注意（Ａｔｔｅｎｔｉｏｎ）確率分布であってもよい。前記注意確率分布は、前記入力テキストにおける単語がテキストの要約における単語となる確率分布を示す。 Here, the current word probability distribution may be an Attention probability distribution. The attention probability distribution indicates a probability distribution in which a word in the input text becomes a word in a text summary.

図３Ａは、本開示の実施例による、出力単語確率分布の確定の模式図を示す。図３Ａに示すように、推奨確率分布Ｐ_Ｖを利用して前記注意確率分布を調整することで、調整後の注意確率分布を形成することができる。 FIG. 3A shows a schematic diagram of determining the output word probability distribution according to an embodiment of the present disclosure. As shown in FIG. 3A, the attention probability distribution is adjusted using the recommended probability distribution P_V to form the adjusted attention probability distribution.

一実現形態において、現在の時間ステップについての符号化隠れ状態ベクトルと復号化隠れ状態ベクトルに基づいて注意確率分布を確定することができる。例えば、式（１）を利用して上記の注意確率分布を確定することができる。 In one implementation, the attention probability distribution can be determined based on the coded hidden state vector and the decoded hidden state vector for the current time step. For example, the attention probability distribution can be determined using equation (1).
ここで、ｔは現在の時間ステップを示し、ａ^ｔは現在の時間ステップについての注意確率分布を示し、ｓｏｆｔｍａｘは正規化指数関数であり、ｅ^ｔは、式（２）により以下のように確定される。 Here, t denotes the current time step, a^t denotes the attention probability distribution for the current time step, softmax is the normalized exponential function, and e^t is determined by equation (2).
ここで、ｖ^Ｔ、Ｗ_ｈ、Ｗ_Ｓ、ｂ_ａｔｔｎは、ポインター生成ネットワーク（Ｐｏｉｎｔｅｒ−Ｇｅｎｅｒａｔｏｒ Ｎｅｔｗｏｒｋｓ）における学習パラメータであり、ｈ_ｉは現在の符号化隠れ状態ベクトルであり、ｓ_ｔは現在の復号化隠れ状態ベクトルである。 Here, v^T, W_h, W_S, and b_attn are learned parameters of the pointer-generator network (Pointer-Generator Networks), h_i is the current coded hidden state vector, and s_t is the current decoded hidden state vector.
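The images of equations (1) and (2) are not reproduced in this text. Given the symbols listed above (v^T, W_h, W_S, b_attn, h_i, s_t, softmax), the equations plausibly take the standard attention form of the pointer-generator network. The following is a reconstruction under that assumption, not a verbatim copy of the original equations:

```latex
\begin{align}
a^t &= \mathrm{softmax}\left(e^t\right) \tag{1} \\
e_i^t &= v^{T} \tanh\left(W_h h_i + W_S s_t + b_{attn}\right) \tag{2}
\end{align}
```

Under this reading, e_i^t scores how well the i-th coded hidden state vector matches the current decoded hidden state vector, and the softmax turns these scores into a distribution over the words of the input text.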

いくつかの実施例において、前記推奨確率分布を利用して前記注意確率分布を調整し、調整後の注意確率分布を確定する。 In some embodiments, the recommended probability distribution is used to adjust the attention probability distribution and determine the adjusted attention probability distribution.

例えば、式（３）を利用して調整後の注意確率分布ａ’を確定することができる。 For example, the adjusted attention probability distribution a' can be determined using equation (3).
ここで、ｔは現在の時間ステップであり、ａ’^ｔは現在の時間ステップについての調整後の注意確率分布を示し、ｅ^ｔは式（２）により確定されたパラメータである。 Here, t is the current time step, a'^t denotes the adjusted attention probability distribution for the current time step, and e^t is the parameter determined by equation (2).

調整後の注意確率分布を利用して、前記入力テキストにおける単語がテキストの要約における単語となる確率分布を確定することができる。例えば、入力テキストから確率の最大である単語を出力すべき単語として選定する。 The adjusted attention probability distribution can be used to determine the probability distribution in which a word in the input text becomes a word in a text summary. For example, the word with the maximum probability is selected from the input text as the word to be output.

いくつかの実施例において、前記現在の単語確率分布は、生成確率分布Ｐ_{ｖｏｃａｂ}をさらに含む。前記生成単語確率分布は、前記文字エンティティ辞書（ｔｅｘｔ ｅｎｔｉｔｙ ｄｉｃｔｉｏｎａｒｙ）における単語がテキストの要約における単語となる確率分布を示す。 In some embodiments, the current word probability distribution further includes a generation probability distribution P_{vocab}. The generation probability distribution indicates the probability that each word in the text entity dictionary becomes a word in the text summary.

図３Ｂは、本開示の実施例による、生成確率分布と調整後の注意確率分布を利用して出力単語確率分布を確定する模式図を示す。 FIG. 3B shows a schematic diagram for determining the output word probability distribution using the generation probability distribution and the adjusted attention probability distribution according to the embodiment of the present disclosure.

いくつかの実施例において、コンテキストベクトルと現在の時間ステップについての復号化隠れ状態ベクトルに基づいて、上記の生成確率分布を確定することができる。例えば、さらに、式（４）と式（５）を利用して上記の生成確率分布Ｐ_{ｖｏｃａｂ}を確定することができる。 In some embodiments, the generation probability distribution can be determined based on the context vector and the decoded hidden state vector for the current time step. For example, the generation probability distribution P_{vocab} can be determined using equations (4) and (5).
ここで、Ｖ’、Ｖ、ｂ、ｂ’は、ポインター生成ネットワークにおける学習パラメータであり、ｈ_ｔ ^*は注意確率分布に基づいて確定されたコンテキストベクトルである。例えば、式（４）を利用して確定ｈ_ｔ ^*を確定することができる。 Here, V', V, b, and b' are learned parameters of the pointer-generator network, and h_t^* is a context vector determined based on the attention probability distribution. For example, h_t^* can be determined using equation (4).
ここで、ａ_ｉ ^ｔは式（１）で確定された注意確率分布ａ^ｔにおけるｉ番目の元素であり、ｈ_ｉは現在のｉ番目の符号化隠れ状態ベクトルである。 Here, a_i^t is the i-th element of the attention probability distribution a^t determined by equation (1), and h_i is the current i-th coded hidden state vector.
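The equation images for the context vector and the generation probability distribution are likewise missing from this text. Based on the symbols named above (V', V, b, b', h_t^*, a_i^t, h_i, s_t), one plausible reading follows the standard pointer-generator formulation. The lines below are a reconstruction under that assumption, and the equation numbers are left out because the text is not clear about which number belongs to which equation:

```latex
\begin{align}
h_t^{*} &= \sum_i a_i^t \, h_i \\
P_{vocab} &= \mathrm{softmax}\left(V'\left(V\left[s_t, h_t^{*}\right] + b\right) + b'\right)
\end{align}
```

Here [s_t, h_t^*] denotes the concatenation of the decoded hidden state vector and the context vector.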

そして、前記生成確率分布と前記調整後の注意確率分布を重み付け加算することにより、出力単語確率分布を確定することができる。 Then, the output word probability distribution can be determined by weighting and adding the generation probability distribution and the adjusted attention probability distribution.

いくつかの実施例において、現在の時間ステップについての符号化隠れ状態ベクトル、復号化隠れ状態ベクトル、注意確率分布及び１つ前の時間ステップでの復号化ニューラルネットワークの出力に基づいて、生成確率分布及び調整後の注意確率分布の第１の重みＰ_ｇｅｎを確定することができる。 In some embodiments, a first weight P_gen for the generation probability distribution and the adjusted attention probability distribution can be determined based on the coded hidden state vector, the decoded hidden state vector, and the attention probability distribution for the current time step, and on the output of the decoding neural network at the previous time step.

例えば、前記生成確率分布と前記調整後の注意確率分布に対して加重和を計算するための第１の重みＰ_ｇｅｎは、式（６）として表され得る。 For example, the first weight P_gen for calculating the weighted sum of the generation probability distribution and the adjusted attention probability distribution can be expressed by equation (6).
ここで、σは、活性化関数、例えばｓｉｇｍｏｉｄ関数を示し、ｗ_ｈ ^Ｔ、ｗ_ｓ ^Ｔ、ｗ_ｘ ^Ｔ及びｂ_ｐｔｒは訓練パラメータであり、ｈ_ｔ ^*は時間ステップｔに式（４）により確定したパラメータであり、ｓ_ｔは時間ステップｔでの復号化隠れ状態ベクトルであり、ｘ_ｔは時間ステップｔでの復号化ニューラルネットワークの入力、つまり、１つ前の時間ステップｔ−１での復号化ニューラルネットワークの出力である。式（６）により確定された第１の重みＰ_ｇｅｎはスカラーとして実現されてもよい。第１の重みＰ_ｇｅｎを利用して生成確率分布Ｐ_{ｖｏｃａｂ}と調整後の注意確率分布ａ’^ｔを重み平均して出力単語確率分布を取得することができる。 Here, σ denotes an activation function, for example the sigmoid function; w_h^T, w_s^T, w_x^T, and b_ptr are training parameters; h_t^* is the parameter determined by equation (4) at time step t; s_t is the decoded hidden state vector at time step t; and x_t is the input of the decoding neural network at time step t, that is, the output of the decoding neural network at the previous time step t-1. The first weight P_gen determined by equation (6) may be realized as a scalar. Using the first weight P_gen, the generation probability distribution P_{vocab} and the adjusted attention probability distribution a'^t can be averaged with weights to obtain the output word probability distribution.
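A minimal sketch of the scalar gate and the weighted average follows. Equation (6) is not reproduced in this text, so the form P_gen = σ(w_h^T h_t^* + w_s^T s_t + w_x^T x_t + b_ptr) is an assumption inferred from the listed parameters. Applying P_gen to P_{vocab} and (1 - P_gen) to a'^t is also an assumption taken from the usual pointer-generator convention. Both distributions are taken to be indexed over the same word list for simplicity, and all function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generation_weight(h_star, s_t, x_t, w_h, w_s, w_x, b_ptr):
    """Assumed form of equation (6): a scalar gate in the range (0, 1)."""
    return sigmoid(dot(w_h, h_star) + dot(w_s, s_t) + dot(w_x, x_t) + b_ptr)

def mix(p_gen, p_vocab, a_adj):
    """Weighted average of P_vocab and the adjusted attention distribution."""
    return [p_gen * pv + (1.0 - p_gen) * a for pv, a in zip(p_vocab, a_adj)]

# Toy values (hypothetical); all-zero parameters give p_gen = 0.5.
p_gen = generation_weight([1.0, 2.0], [0.5], [3.0],
                          [0.0, 0.0], [0.0], [0.0], 0.0)
output_dist = mix(p_gen, [0.2, 0.8], [0.6, 0.4])
```

With a gate of 0.5 the output distribution is the plain average of the two inputs. A trained gate would shift the balance at each time step.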

図３Ｃは、本開示の実施例による、生成確率分布、注意確率分布及び推奨確率分布を利用して出力単語確率分布を確定する模式図を示す。 FIG. 3C shows a schematic diagram for determining the output word probability distribution using the generation probability distribution, the attention probability distribution, and the recommended probability distribution according to the embodiment of the present disclosure.

図３Ｃに示すように、前記生成確率分布、前記注意確率分布及び前記推奨確率分布を重み付け加算して出力単語確率分布を確定することができる。一実現形態において、現在の時間ステップについての符号化隠れ状態ベクトル、復号化隠れ状態ベクトル、注意確率分布、推奨確率分布及び１つ前の時間ステップでの復号化ニューラルネットワークの出力に基づいて、前記生成確率分布、前記注意確率分布及び前記推奨確率分布を重み付け加算するための第２の重みＰ_ｇｅｎ２を確定することができる。 As shown in FIG. 3C, the output word probability distribution can be determined by weighting and adding the generation probability distribution, the attention probability distribution, and the recommended probability distribution. In one implementation, a second weight P_gen2 for weighting and adding the generation probability distribution, the attention probability distribution, and the recommended probability distribution can be determined based on the coded hidden state vector, the decoded hidden state vector, the attention probability distribution, and the recommended probability distribution for the current time step, and on the output of the decoding neural network at the previous time step.

式（７）を利用して前記生成確率分布、前記注意確率分布及び前記推奨確率分布を重み付け加算するための第２の重みＰ_ｇｅｎ２を確定することができる。 The second weight P_gen2 for weighting and adding the generation probability distribution, the attention probability distribution, and the recommended probability distribution can be determined using equation (7).
ここで、σは活性化関数、例えばｓｉｇｍｏｉｄ関数を示し、ｗ_ｈ ^Ｔ、ｗ_ｓ ^Ｔ、ｗ_ｘ ^Ｔ、ｗ_Ｖ ^Ｔ及びｂ_ｐｔｒは訓練パラメータであり、ｈ_ｔ ^*は時間ステップｔに式（４）により確定されたパラメータであり、ｓ_ｔは時間ステップｔでの復号化隠れ状態ベクトルであり、ｘ_ｔは時間ステップｔでの復号化ニューラルネットワークの入力であり、つまり、１つ前の時間ステップｔ−１での復号化ニューラルネットワークの出力であり、Ｐ_Ｖは時間ステップｔでの推奨確率分布である。 Here, σ denotes an activation function, for example the sigmoid function; w_h^T, w_s^T, w_x^T, w_V^T, and b_ptr are training parameters; h_t^* is the parameter determined by equation (4) at time step t; s_t is the decoded hidden state vector at time step t; x_t is the input of the decoding neural network at time step t, that is, the output of the decoding neural network at the previous time step t-1; and P_V is the recommended probability distribution at time step t.

式（７）により確定された重みＰ_ｇｅｎ２は、３次元のベクトルとして実現し、ここで、当該３次元のベクトルにおける元素は、生成確率分布Ｐ_ｇｅｎ、それぞれ注意確率分布ａ_ｔ及び推奨確率分布Ｐ_Ｖの重み係数を示す。 The weight P_gen2 determined by equation (7) is realized as a three-dimensional vector, whose elements respectively indicate the weighting coefficients of the generation probability distribution P_gen, the attention probability distribution a_t, and the recommended probability distribution P_V.
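The three-way weighted addition can be illustrated as below. The sketch does not compute P_gen2 itself, since equation (7) is not reproduced in this text. It simply assumes that a three-element weight vector summing to 1 is already available, and that all three distributions are indexed over the same word list. The names `combine_three` and `pick_word` and the toy values are hypothetical.

```python
def combine_three(weights, p_generation, attention, p_recommended):
    """Weighted sum of the generation, attention, and recommended distributions."""
    w_gen, w_att, w_rec = weights
    return [w_gen * g + w_att * a + w_rec * r
            for g, a, r in zip(p_generation, attention, p_recommended)]

def pick_word(words, dist):
    """Return the word with the largest probability in dist."""
    best = max(range(len(dist)), key=lambda i: dist[i])
    return words[best]

# Toy values (hypothetical).
words = ["cat", "sat", "mat"]
p_gen2 = (0.5, 0.25, 0.25)               # assumed to sum to 1
output_dist = combine_three(p_gen2,
                            [0.2, 0.8, 0.0],   # generation distribution
                            [0.4, 0.4, 0.2],   # attention distribution
                            [0.0, 0.2, 0.8])   # recommended distribution
chosen = pick_word(words, output_dist)
```

If the three weights sum to 1 and each input is a probability distribution, the result is again a probability distribution, so the word with the maximum probability can be selected directly.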

上記のテキスト処理で用いられるモデルの訓練パラメータは、予め定められた訓練データセットを用いて訓練されるものである。例えば、訓練データを上記のテキスト処理モデルに入力し、符号化ニューラルネットワーク、復号化ニューラルネットワーク、及び文ベクトル間の関連性を確定するための初期推奨重みベクトルを用いて、ソーステキストの単語ベクトルを処理することにより、上記のように訓練された出力単語確率分布を得ることができる。上記のテキスト処理モデルにおける訓練パラメータは、訓練された出力単語確率分布における正解の単語の確率損失を算出することにより調整されることができる。ここで、本開示に係るテキスト生成ネットワークの損失関数は、以下のように表され得る。 The training parameters of the model used in the above text processing are trained using a predetermined training data set. For example, the training data is input to the text processing model, and the word vectors of the source text are processed using the coded neural network, the decoding neural network, and the initial recommended weight vectors for determining the relevance between sentence vectors, so that the output word probability distribution of the training is obtained as described above. The training parameters of the text processing model can be adjusted by calculating the probability loss of the correct word in the output word probability distribution of the training. Here, the loss function of the text generation network according to the present disclosure can be expressed as follows.
ここで、ｗ_ｔ ^*は時間ステップｔについての正解単語の時間ステップｔでの訓練の出力単語確率分布の確率値であり、Ｔは生成シーケンス全体にわたる合計時間ステップである。テキスト生成ネットワークの全体的な損失は、生成シーケンス全体にわたるすべての時間ステップでの損失値を統計することによって確定されることができる。 Here, w_t^* is the probability value, in the output word probability distribution of the training at time step t, of the correct word for time step t, and T is the total number of time steps over the entire generated sequence. The overall loss of the text generation network can be determined by aggregating the loss values at all time steps over the entire generated sequence.

上記のテキスト処理モデルのパラメータに対する訓練は、上記の損失が最小になるようにテキスト処理モデルの訓練パラメータを調整することによって実現できる。 Training on the parameters of the text processing model described above can be achieved by adjusting the training parameters of the text processing model so that the above loss is minimized.
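The loss equation itself is not reproduced in this text. A common choice for a network of this kind is the negative log-likelihood of the correct word, averaged over the T time steps. The sketch below assumes that form, and the function name `sequence_loss` is hypothetical.

```python
import math

def sequence_loss(correct_word_probs):
    """Average negative log-likelihood over all time steps (assumed form).

    correct_word_probs -- for each time step t, the probability that the
                          output word probability distribution assigns to
                          the correct word at that step
    """
    total_steps = len(correct_word_probs)          # T
    return -sum(math.log(p) for p in correct_word_probs) / total_steps

# Toy values (hypothetical): a three-step generated sequence.
loss_good = sequence_loss([0.9, 0.8, 0.95])
loss_bad = sequence_loss([0.2, 0.1, 0.3])
```

The loss falls as the model assigns higher probability to the correct words, so minimizing it pushes the output word probability distribution toward the reference summary.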

本開示に係るテキスト処理方法によれば、例えば、テキストの要約のコンテンツを生成する際に、入力されたテキストにおける各単語からなる文ベクトルの間の相関性に基づいて、入力されたテキストにおける単語の当該テキストのコンテンツにおける重要度を確定することができ、テキストのコンテンツに対する単語の重要度に基づいて、生成されたテキストのコンテンツを確定するといった技術的効果を奏する。本開示では、要約を生成する場合を例に挙げて原理を説明したが、本開示の内容はこれに限定されない。本開示の原理から逸脱することなく、本開示に係るテキスト処理方法を、テキスト拡張、テキスト書き換え等の他の応用シーンに適用することもできる。 According to the text processing method according to the present disclosure, for example, when generating the content of the text summary, the words in the input text are based on the correlation between the sentence vectors consisting of each word in the input text. It is possible to determine the importance of the text in the content of the text, and it has a technical effect of determining the content of the generated text based on the importance of the word to the content of the text. In the present disclosure, the principle has been described by taking the case of generating a summary as an example, but the content of the present disclosure is not limited to this. The text processing method according to the present disclosure can be applied to other application scenes such as text extension and text rewriting without departing from the principle of the present disclosure.

図４は本開示の実施例によるテキスト処理装置の模式的なブロック図を示す。図４に示すように、テキスト処理装置４００は、前処理ユニット４１０と、文ベクトル確定ユニット４２０と、推奨確率確定ユニット４３０と、出力ユニット４４０とを含む。 FIG. 4 shows a schematic block diagram of the text processing apparatus according to the embodiment of the present disclosure. As shown in FIG. 4, the text processing apparatus 400 includes a preprocessing unit 410, a sentence vector determination unit 420, a recommended probability determination unit 430, and an output unit 440.

前処理ユニット４１０は、ソーステキストに対して前処理を行って、前記複数の単語のための複数の単語ベクトルを生成するように配置される。例えば、ワード埋め込み（ｗｏｒｄｅｍｂｅｄｄｉｎｇ）によりこの前処理を実現することができる。 The pre-processing unit 410 is arranged to perform pre-processing on the source text to generate a plurality of word vectors for the plurality of words. For example, this preprocessing can be realized by word embedding.

文ベクトル確定ユニット４２０は、複数の初期推奨重みベクトルと前記複数の単語ベクトルに基づいて、複数の文ベクトルＳを確定するように配置される。 The sentence vector determination unit 420 is arranged so as to determine the plurality of sentence vectors S based on the plurality of initial recommended weight vectors and the plurality of word vectors.

いくつかの実施例において、各時間ステップについて、符号化ニューラルネットワークを利用して前処理ユニット４１０により生成された複数の単語ベクトルを処理して、各単語ベクトルにそれぞれ対応する現在の符号化隠れ状態ベクトルを確定することができる。 In some embodiments, for each time step, a coded neural network is used to process multiple word vectors generated by the preprocessing unit 410, and the current coded hidden states corresponding to each word vector are respectively. The vector can be fixed.

前処理ユニット４１０により生成された単語ベクトルを入力とし、符号化ニューラルネットワークは、現在の時間ステップに各単語ベクトルｘ_１、ｘ_２、ｘ_３…にそれぞれ対応する現在の符号化隠れ状態ベクトルｈ_１、ｈ_２、ｈ_３…を出力することができる。符号化隠れ状態ベクトルの数と単語ベクトルの数は、同じであってもよいし、異なってもよい。例えば、ソーステキストに基づいてｋ個の単語ベクトルを生成する場合、符号化ニューラルネットワークは、これらｋ個の単語ベクトルを処理して対応するｋ個の符号化隠れ状態ベクトルを生成する。ｋは１より大きい整数である。 Taking the word vectors generated by the preprocessing unit 410 as input, the coded neural network can output, at the current time step, the current coded hidden state vectors h_1, h_2, h_3, ... corresponding to the word vectors x_1, x_2, x_3, ..., respectively. The number of coded hidden state vectors and the number of word vectors may be the same or different. For example, when k word vectors are generated based on the source text, the coded neural network processes these k word vectors to generate the corresponding k coded hidden state vectors. Here, k is an integer greater than 1.

次に、各初期推奨重みベクトルと前記現在の符号化隠れ状態ベクトルに基づいて、当該初期推奨重みベクトルに対応する文ベクトルを確定することができる。 Next, the sentence vector corresponding to the initial recommended weight vector can be determined based on each initial recommended weight vector and the current coded hidden state vector.

いくつかの実施例において、初期推奨重みベクトルＷは、ベクトル［ｗ_１、ｗ_２…、ｗ_ｋ］として表され得る。ここで、Ｗの元素の数は符号化隠れ状態ベクトルの数と同じである。ここで、初期推奨重みベクトルＷにおける各元素は、現在の符号化隠れ状態ベクトルを利用して文ベクトルを確定する際の各符号化隠れ状態ベクトルための重み係数を示す。これらの重み係数を利用して、符号化ニューラルネットワーク入力から入力された各単語ベクトルの符号化隠れ状態ベクトルの情報を組み合わせることで、各単語ベクトル情報が含まれる文ベクトルを形成する。いくつかの実現形態において、文ベクトルＳは、現在の符号化隠れ状態ベクトルｈ_１、ｈ_２…ｈ_ｎの重み平均値として表され得る。そして、予め訓練された所定数の初期推奨重みベクトルＷ_１、Ｗ_２…、Ｗ_ｎを利用して所定数の文ベクトルＳ_１、Ｓ_２…、Ｓ_ｎを得る。 In some embodiments, the initial recommended weight vector W can be expressed as a vector [w_1, w_2, ..., w_k]. The number of elements of W is the same as the number of coded hidden state vectors. Each element of the initial recommended weight vector W indicates the weighting coefficient for the corresponding coded hidden state vector when the sentence vector is determined from the current coded hidden state vectors. Using these weighting coefficients, the information of the coded hidden state vectors of the word vectors input to the coded neural network is combined to form a sentence vector that contains the information of each word vector. In some implementations, the sentence vector S can be expressed as the weighted average of the current coded hidden state vectors h_1, h_2, ..., h_n. A predetermined number of sentence vectors S_1, S_2, ..., S_n are then obtained using a predetermined number of pre-trained initial recommended weight vectors W_1, W_2, ..., W_n.
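The way an initial recommended weight vector combines coded hidden state vectors into a sentence vector can be sketched as follows. The function name `sentence_vector` and the toy values are hypothetical. In the disclosure, the weight vectors are pre-trained and the hidden states come from the coded neural network.

```python
def sentence_vector(weights, hidden_states):
    """S = sum_j w_j * h_j: weighted combination of coded hidden state vectors."""
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]

# Toy values (hypothetical): k = 3 hidden states of dimension 2,
# and two initial recommended weight vectors W_1 and W_2.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0]]
S = [sentence_vector(w, hidden) for w in W]
```

Each weight vector yields one sentence vector, so the number of sentence vectors equals the number of initial recommended weight vectors.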

推奨確率処理ユニット４３０は、各文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルとの関連性に基づいて前記複数の初期推奨重みベクトルを調整することにより、前記複数の単語のための推奨確率分布を確定するように配置される。 The recommended probability processing unit 430 for the plurality of words by adjusting the plurality of initial recommended weight vectors based on the relationship between each sentence vector and the other sentence vector among the plurality of sentence vectors. Arranged to establish the recommended probability distribution.

図４に示すように、推奨確率処理ユニット４３０は、関連性確定サブユニット４３１及び調整サブユニット４３２を含む。 As shown in FIG. 4, the recommended probability processing unit 430 includes the association determination subunit 431 and the adjustment subunit 432.

関連性確定サブユニット４３１は、文ベクトルの間の関連性を確定するように配置される。例えば、各文ベクトルを他の文ベクトルと組み合わせて、組合せ文ベクトルを生成することができる。 The association determination subunit 431 is arranged so as to establish the association between the sentence vectors. For example, each sentence vector can be combined with another sentence vector to generate a combination sentence vector.

いくつかの実現形態において、当該文ベクトルを他の文ベクトルと接続して、より次元の高い組合せ文ベクトルを得ることができる。例えば、文ベクトルＳの次元がｄである場合、文ベクトルＳ_１とＳ_２を接続して２ｄ次元の組合せ文ベクトルＳ_１,２を取得する。ここで、ｄは１より大きい整数である。 In some implementations, the sentence vector can be concatenated with another sentence vector to obtain a higher-dimensional combination sentence vector. For example, when the dimension of a sentence vector S is d, the sentence vectors S_1 and S_2 are concatenated to obtain a 2d-dimensional combination sentence vector S_{1,2}. Here, d is an integer greater than 1.

他の実現形態において、２つの文ベクトルのベクトル間の演算（例えば、加算、減算、ベクトル積等である）を行って組合せ文ベクトルを生成する。この場合、組合せ文ベクトルＳ_１,２と組合せ文ベクトルＳ_２,１とは、同じであってもよい。 In other implementations, an operation between the two sentence vectors (for example, addition, subtraction, or a vector product) is performed to generate the combination sentence vector. In this case, the combination sentence vector S_{1,2} and the combination sentence vector S_{2,1} may be the same.

次に、関連性行列を利用して前記組合せ文ベクトルを処理することにより、当該文ベクトルと当該他の文ベクトルとの関連性を確定する。いくつかの実施例において、文ベクトルＳ_１とＳ_２との関連性λ_１,２は、λ＝Ｓ_１,２＊Ｚとして表され得る。ここで、Ｓ_１,２は文ベクトルＳ_１とＳ_２との組合せ文ベクトルであり、Ｚは訓練済みの関連性行列を示す。Ｚを利用してＳ_１とＳ_２との関連性係数λ_１,２を算出することができる。いくつかの実施例において、関連性行列Ｚは、組合せ文ベクトルＳ_１,２を実数としての関連性係数に投影することができる。 Next, the relevance between the sentence vector and the other sentence vector is determined by processing the combination sentence vector with the relevance matrix. In some embodiments, the relevance λ_{1,2} between the sentence vectors S_1 and S_2 can be expressed as λ_{1,2} = S_{1,2} * Z. Here, S_{1,2} is the combination sentence vector of S_1 and S_2, and Z denotes the trained relevance matrix. Z can be used to calculate the relevance coefficient λ_{1,2} between S_1 and S_2. In some embodiments, the relevance matrix Z projects the combination sentence vector S_{1,2} onto a real-valued relevance coefficient.

上記の方法により、文ベクトルＳ_１、Ｓ_２…、Ｓ_ｎのうちの任意の２つの文ベクトルの間の関連性を算出することができる。 By the above method, the relevance between any two of the sentence vectors S_1, S_2, ..., S_n can be calculated.

調整サブユニット４３２は、上述した任意の文ベクトルに対し、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性に基づいて、当該文ベクトルの推奨係数を確定するように配置される。いくつかの実現形態において、当該文ベクトルの推奨係数は、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性の合計として表されてもよい。 The adjustment subunit 432 is arranged to determine, for any of the above sentence vectors, the recommended coefficient of that sentence vector based on the relevance between the sentence vector and each of the other sentence vectors among the plurality of sentence vectors. In some implementations, the recommended coefficient of the sentence vector may be expressed as the sum of the relevance values between the sentence vector and each of the other sentence vectors among the plurality of sentence vectors.

他の実施例において、文ベクトルの推奨係数は、当該文ベクトルと前記複数の文ベクトルのうちの他の文ベクトルの夫々との関連性の加重和として表されてもよい。予め確定された重み係数を利用して、各文ベクトルと他の文ベクトルとの関連性を重み付け加算してもよい。 In another embodiment, the recommended coefficient of the sentence vector may be expressed as a weighted sum of the relationships between the sentence vector and each of the other sentence vectors among the plurality of sentence vectors. The relationship between each sentence vector and another sentence vector may be weighted and added by using a predetermined weighting coefficient.

上記推奨係数は、調整後の単語確率ベクトルを得るために、対応する文ベクトルを生成するための初期推奨重みベクトルの調整に用いられることができる。 The above recommended coefficients can be used to adjust the initial recommended weight vector to generate the corresponding sentence vector in order to obtain the adjusted word probability vector.

前述したように、推奨係数は、文ベクトルと他の文ベクトルとの関連性に基づいて確定されるものである。テキストの要約の生成過程でテキストのコンテンツを要約する必要があるため、他の文ベクトルとの関連性が高いほど、当該文ベクトルに含まれる単語ベクトルの情報がテキストのコンテンツの中で重要度が高く、その結果、テキストの要約の内容にる可能性が高いと考えられる。 As mentioned above, the recommended coefficient is determined based on the relationship between the sentence vector and other sentence vectors. Since it is necessary to summarize the content of the text in the process of generating the summary of the text, the higher the relevance to other sentence vectors, the more important the information of the word vector contained in the sentence vector is in the content of the text. High, and as a result, is likely to be the content of the text summary.

いくつかの実施例において、調整サブユニット４３２は、各文ベクトルの推奨係数を当該文ベクトルに対応する単語確率ベクトルに掛けることにより、その単語確率ベクトルに含まれる、各単語ベクトルの符号化隠れ状態ベクトルに対する重み係数を調整することができる。例えば、調整後のｉ番目の単語確率ベクトルＷ_ｉ’は、Ｗ_ｉ’＝Σλ_ｉ*Ｗ_ｉとして表され得る。 In some embodiments, the adjustment subunit 432 multiplies the recommended coefficient of each sentence vector by the word probability vector corresponding to that sentence vector, thereby adjusting the weighting coefficients that the word probability vector holds for the coded hidden state vector of each word vector. For example, the adjusted i-th word probability vector W_i' can be expressed as W_i' = Σλ_i * W_i.

各文ベクトルの推奨係数を利用して当該文ベクトルの単語確率ベクトルを調整した後、調整サブユニット４３２は、以上のように得た調整された複数の単語確率ベクトルＷ’に基づいて前記複数の単語の推奨確率分布を確定することができる。 After the word probability vector of each sentence vector has been adjusted using that sentence vector's recommended coefficient, the adjustment subunit 432 can determine the recommended probability distribution of the plurality of words based on the adjusted word probability vectors W' obtained as described above.

いくつかの実施例において、推奨確率分布Ｐ_Ｖは、上記の方法により得た調整後の複数の単語確率ベクトルＷ’の和であるＰ_Ｖ＝ΣＷ_ｉ’として表されてもよく、即ち、を利用する。いくつかの実現形態において、推奨確率分布Ｐ_Ｖは、調整後の複数の単語確率ベクトルＷ_ｉ’の加重和として表されてもよい。 In some embodiments, the recommended probability distribution P_V may be expressed as the sum of the adjusted word probability vectors W' obtained by the above method, that is, P_V = ΣW_i'. In some implementations, the recommended probability distribution P_V may be expressed as a weighted sum of the adjusted word probability vectors W_i'.

出力ユニット４４０は、前記推奨確率分布に基づいて出力すべき単語を確定するように構成される。 The output unit 440 is configured to determine the words to be output based on the recommended probability distribution.

いくつかの実施例において、推奨確率基づいて、現在の生成式のネットワークによって生成された単語確率分布を調整することにより、出力単語確率分布を確定してもよい。 In some embodiments, the output word probability distribution may be determined by adjusting the word probability distribution generated by the network of current generation formulas based on the recommended probabilities.

ここで、前記現在の単語確率分布は、注意確率分布ａ^ｔであってもよい。前記注意確率分布は、前記入力テキストにおける単語がテキストの要約における単語となる確率分布を示す。一実現形態において、現在の時間ステップについての符号化隠れ状態ベクトルと復号化隠れ状態ベクトルに基づいて注意確率分布を確定することができる。 Here, the current word probability distribution may be the attention probability distribution a^t. The attention probability distribution indicates the probability that each word in the input text becomes a word in the text summary. In one implementation, the attention probability distribution can be determined based on the coded hidden state vector and the decoded hidden state vector for the current time step.

いくつかの実施例において、前記推奨確率分布を利用して前記注意確率分布を調整することで、調整後の注意確率分布ａ’^ｔを確定することができる。調整後の注意確率分布を利用して、前記入力テキストにおける単語がテキストの要約における単語となる確率分布を確定することができる。例えば、入力テキストから確率の最大である単語を出力すべき単語として選定することができる。 In some embodiments, the attention probability distribution is adjusted using the recommended probability distribution to determine the adjusted attention probability distribution a'^t. The adjusted attention probability distribution can be used to determine the probability that each word in the input text becomes a word in the text summary. For example, the word with the maximum probability can be selected from the input text as the word to be output.

いくつかの実施例において、前記現在の単語確率分布は、生成確率分布Ｐ_{ｖｏｃａｂ}をさらに含む。前記生成単語確率分布は、前記文字エンティティ辞書における単語がテキストの要約における単語となる確率分布を示す。上記のコンテキストベクトルと現在の時間ステップについての復号化隠れ状態ベクトルに基づいて上記の生成確率分布を確定することができる。そして、前記生成確率分布と前記調整後の注意確率分布を重み付け加算することにより、出力単語確率分布を確定することができる。 In some embodiments, the current word probability distribution further includes a generation probability distribution P_{vocab}. The generation probability distribution indicates the probability that each word in the text entity dictionary becomes a word in the text summary. The generation probability distribution can be determined based on the context vector and the decoded hidden state vector for the current time step. The output word probability distribution can then be determined by weighting and adding the generation probability distribution and the adjusted attention probability distribution.

In some embodiments, the output word probability distribution can be determined as a weighted sum of the generation probability distribution, the attention probability distribution, and the recommended probability distribution. In one implementation, a second weight P_gen2 for this weighted sum can be determined based on the encoding hidden state vector and the decoding hidden state vector for the current time step, the attention probability distribution, the recommended probability distribution, and the output of the decoding neural network at the previous time step. The second weight P_gen2 can be implemented as a three-dimensional vector whose elements are the weight coefficients of the generation probability distribution P_gen, the attention probability distribution a_t, and the recommended probability distribution P_V, respectively.
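The three-way weighted sum can be sketched in the same style. The sketch assumes that the three weight coefficients are obtained by applying a softmax to three scores, so that they sum to one; the scores themselves would come from a learned function of the inputs listed above, which is not modeled here. The names `mix_three` and `weight_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Normalize three raw scores into weights that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_three(p_vocab, source_words, attention, recommended, weights):
    """Weighted sum of generation, attention, and recommended distributions."""
    w_gen, w_att, w_rec = weights
    out = {}
    for word, p in p_vocab.items():
        out[word] = out.get(word, 0.0) + w_gen * p
    for word, a, r in zip(source_words, attention, recommended):
        out[word] = out.get(word, 0.0) + w_att * a + w_rec * r
    return out


p_vocab = {"the": 0.5, "text": 0.3, "summary": 0.2}  # hypothetical values
source_words = ["the", "network", "text"]
attention = [0.2, 0.5, 0.3]
recommended = [0.1, 0.6, 0.3]

# Stand-in for the learned scores behind the three-dimensional weight vector.
weight_scores = [1.0, 0.5, 0.2]
weights = softmax(weight_scores)

dist = mix_three(p_vocab, source_words, attention, recommended, weights)
output_word = max(dist, key=dist.get)
```

Because each input is a probability distribution and the weights sum to one, the mixed result is again a probability distribution.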

The training parameters used in the text processing device described above are trained using a predetermined training data set. For example, training data is input to the text processing device, and the word vectors of the source text are processed using the encoding neural network, the decoding neural network, and the initial recommended weight vectors used to determine the relevance between sentence vectors, which yields a training output word probability distribution as described above. The training parameters of the text processing model can be adjusted by calculating the loss from the probability assigned to the correct word in this training output word probability distribution. The loss function of the text generation network according to the present disclosure can be expressed by equation (8).

Here, w_t^* denotes the probability value, in the training output word probability distribution at time step t, of the correct word for time step t, and T is the total number of time steps over the entire generated sequence. The overall loss of the text generation network can be determined by aggregating the loss values at all time steps over the entire generated sequence.
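Equation (8) is not reproduced in this text. As an illustration only, the following sketch assumes the loss takes the form commonly used for sequence generation models: the negative log of the probability assigned to the correct word at each time step, averaged over the T time steps. The function name `sequence_loss` and the sample probabilities are hypothetical.

```python
import math


def sequence_loss(correct_word_probs):
    """Average negative log-likelihood of the correct words over all time steps.

    correct_word_probs[t] is the probability that the output word probability
    distribution at time step t assigns to the correct word for that step.
    """
    total_steps = len(correct_word_probs)  # T
    step_losses = [-math.log(p) for p in correct_word_probs]
    return sum(step_losses) / total_steps


# Probabilities of the correct word at three time steps (hypothetical values).
loss = sequence_loss([0.5, 0.25, 1.0])
```

Under this assumed form, the loss falls toward zero as the model assigns probabilities closer to one to the correct words, which is what minimizing the loss during training encourages.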

The parameters of the text processing device can be trained by adjusting its training parameters so that the above loss is minimized.

According to the text processing device of the present disclosure, when generating, for example, the content of a text summary, the importance of each word of the input text to the content of that text can be determined based on the correlation between the sentence vectors composed of the words in the input text, and the content of the generated text can be determined based on that importance. Although the principle has been described using summary generation as an example, the present disclosure is not limited to this. The text processing method according to the present disclosure can also be applied to other application scenarios, such as text expansion and text rewriting, without departing from the principle of the present disclosure.

Note that the method or device according to the embodiments of the present disclosure may be implemented by the architecture of the computing device shown in FIG. 5. FIG. 5 shows the architecture of the computing device. As shown in FIG. 5, the computing device 500 may include a bus 510, one or more CPUs 520, a read-only memory (ROM) 530, a random access memory (RAM) 540, a communication port 550 connected to a network, an input/output component 560, a hard disk 570, and the like. A storage device in the computing device 500, such as the ROM 530 or the hard disk 570, can store various data or files according to the present disclosure that are used for processing and/or communication of the method for detecting a target in video, as well as program instructions executed by the CPU. The computing device 500 may also include a user interface 580. Of course, the architecture shown in FIG. 5 is merely exemplary, and when implementing different devices, one or more components of the computing device shown in FIG. 5 may be omitted according to actual need.

The embodiments of the present application may also be implemented as a computer-readable storage medium. The computer-readable storage medium according to the embodiments of the present application stores computer-readable instructions. When the computer-readable instructions are executed by a processor, the method according to the embodiments of the present application described with reference to the above drawings can be performed. The computer-readable storage medium includes, for example, but is not limited to, volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.

Those skilled in the art will understand that various changes and improvements can be made to the content disclosed herein. For example, the various devices or components described above may be implemented in hardware, or in software, firmware, or a combination of some or all of these.

Also, as used in the present application and the claims, terms such as "a", "an", "a kind of" and/or "the" do not refer only to the singular and may also include the plural, unless the context clearly indicates otherwise. In general, the terms "include" and "have" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.

Further, although this specification makes various references to certain units of the system according to the embodiments of the present disclosure, any number of different units may be used and run on a client and/or a server. The units are merely exemplary, and different units may be used in different aspects of the system and method.

Flowcharts are used in the present disclosure to explain the operations performed by the system according to the embodiments of the present invention. Note that the operations described above or below do not necessarily have to be executed in the exact order given. Instead, the various steps may be performed in reverse order or simultaneously. In addition, other operations may be added to these processes, or one or more steps may be removed from these processes.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. Terms such as those defined in general dictionaries should be construed as having a meaning consistent with their meaning in the context of the relevant technology, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Although the present invention has been described above, the present invention is not limited to this description. While several exemplary embodiments of the present invention have been described, those skilled in the art will readily appreciate that many modifications can be made to the exemplary embodiments without departing from the novel teachings and advantages of the present invention. Accordingly, all such modifications are intended to be included within the scope of the present invention as defined by the claims. The above is a description of the present invention and should not be understood as limiting the present invention to the specific embodiments disclosed; modifications to the disclosed embodiments and to other embodiments are intended to be included within the scope of the appended claims. The present invention is defined by the claims and their equivalents.

Claims

A text processing device comprising:
a preprocessing unit arranged to preprocess source text and generate a plurality of word vectors for a plurality of words;
a sentence vector determination unit arranged to determine a plurality of sentence vectors based on a plurality of initial recommended weight vectors and the plurality of word vectors;
a recommended probability determination unit arranged to adjust the plurality of initial recommended weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, and to determine a recommended probability distribution for the plurality of words; and
an output unit arranged to determine a word to be output based on the recommended probability distribution.

The text processing device according to claim 1, wherein the sentence vector determination unit is arranged to:
process the plurality of word vectors using an encoding neural network to determine a current encoding hidden state vector corresponding to each word vector; and
determine, based on each initial recommended weight vector and the current encoding hidden state vectors, the sentence vector corresponding to that initial recommended weight vector.

The text processing device according to claim 2, wherein the output unit is arranged to:
determine a current decoding hidden state vector based on the current encoding hidden state vectors using a decoding neural network;
determine a current word probability distribution using the current encoding hidden state vectors and the current decoding hidden state vector; and
determine the word to be output based on the current word probability distribution and the recommended probability distribution.

The text processing device according to claim 3, wherein the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is arranged to:
adjust the attention probability distribution using the recommended probability distribution to determine an adjusted attention probability distribution;
determine an output word probability distribution as a weighted sum of the generation probability distribution and the adjusted attention probability distribution; and
determine the word with the highest probability in the output word probability distribution as the word to be output.

The text processing device according to claim 3, wherein the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is arranged to:
determine weights for the generation probability distribution, the attention probability distribution, and the recommended probability distribution, and determine an output word probability distribution based on the weights; and
determine the word with the highest probability in the output word probability distribution as the word to be output.

The text processing device according to any one of claims 1 to 5, wherein the recommended probability determination unit further includes a relevance determination subunit arranged to:
for each sentence vector, combine the sentence vector with another sentence vector to generate a combined sentence vector; and
process the combined sentence vector using a relevance matrix to determine the relevance between the sentence vector and the other sentence vector.

The text processing device according to claim 6, wherein the recommended probability determination unit further includes an adjustment subunit arranged to:
for each sentence vector, determine a recommended coefficient of the sentence vector based on the relevance between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors;
for each initial recommended weight vector, adjust the initial recommended weight vector using the recommended coefficient of the sentence vector corresponding to that initial recommended weight vector, to obtain an adjusted word probability vector; and
determine the recommended probability distribution for the plurality of words based on the adjusted word probability vectors.

A text processing method comprising:
preprocessing source text to generate a plurality of word vectors for a plurality of words;
determining a plurality of sentence vectors based on a plurality of initial recommended weight vectors and the plurality of word vectors;
adjusting the plurality of initial recommended weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, to determine a recommended probability distribution for the plurality of words; and
determining a word to be output based on the recommended probability distribution.

A text processing device comprising:
a processor; and
a memory storing computer-readable program instructions,
wherein the text processing method according to claim 8 is executed when the computer-readable program instructions are executed by the processor.

A computer-readable storage medium storing computer-readable instructions,
wherein the computer-readable instructions, when executed by a computer, cause the computer to execute the text processing method according to claim 8.