JP7414357B2

JP7414357B2 - Text processing methods, apparatus, devices and computer readable storage media

Info

Publication number: JP7414357B2
Application number: JP2019209171A
Authority: JP
Inventors: シーホングオ; シンユグオ; アンシンリー; ランチン; 大志池田; 健吉村; 拓藤本
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2019-08-20
Filing date: 2019-11-19
Publication date: 2024-01-16
Anticipated expiration: 2039-11-19
Also published as: JP2021033994A; CN112487136A

Description

The present disclosure relates to the field of text processing, and specifically to a text processing method, apparatus, device, and computer-readable storage medium.

In a conventional text generation process, the output of a text-generating network is the result of learning from training data. For example, when generating text such as a summary, the correct answers in much of the training data are concentrated in the first few sentences of the text, so a network trained on such data also tends to generate new text content from those first few sentences. Current text processing methods therefore lack an efficient means of summarizing and extracting text content.

The present disclosure provides a text processing method, apparatus, device, and computer-readable storage medium for efficiently extracting and generating a summary from text.

In one aspect of the present disclosure, a text processing apparatus is provided, comprising: a preprocessing unit configured to preprocess a source text to generate a plurality of word vectors for a plurality of words; a sentence vector determination unit configured to determine a plurality of sentence vectors based on a plurality of initial recommendation weight vectors and the plurality of word vectors; a recommendation probability determination unit configured to adjust the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, thereby determining a recommendation probability distribution for the plurality of words; and an output unit configured to determine a word to be output based on the recommendation probability distribution.

In some embodiments, the sentence vector determination unit is configured to process the plurality of word vectors with an encoding neural network to determine a current encoded hidden state vector corresponding to each word vector, and to determine, based on each initial recommendation weight vector and the current encoded hidden state vectors, the sentence vector corresponding to that initial recommendation weight vector.

In some embodiments, the output unit is configured to determine a current decoded hidden state vector with a decoding neural network based on the current encoded hidden state vectors, to determine a current word probability distribution using the current encoded hidden state vectors and the current decoded hidden state vector, and to determine the word to be output based on the current word probability distribution and the recommendation probability distribution.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is configured to adjust the attention probability distribution using the recommendation probability distribution to determine an adjusted attention probability distribution, to compute a weighted sum of the generation probability distribution and the adjusted attention probability distribution to determine an output word probability distribution, and to determine the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is configured to determine weights for the generation probability distribution, the attention probability distribution, and the recommendation probability distribution, to determine the output word probability distribution based on those weights, and to determine the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the recommendation probability determination unit further includes a relevance determination subunit configured to, for each sentence vector, combine that sentence vector with another sentence vector to generate a combined sentence vector, and to process the combined sentence vector with a relevance matrix to determine the relevance between that sentence vector and the other sentence vector.

In some embodiments, the recommendation probability determination unit further includes an adjustment subunit configured to determine a recommendation coefficient for each sentence vector based on the relevance between that sentence vector and each of the other sentence vectors among the plurality of sentence vectors; to adjust each initial recommendation weight vector using the recommendation coefficient of the sentence vector corresponding to that initial recommendation weight vector, thereby obtaining an adjusted word probability vector; and to determine the recommendation probability distribution of the plurality of words based on the adjusted word probability vectors.

In another aspect of the present disclosure, a text processing method is provided, comprising: preprocessing a source text to generate a plurality of word vectors for a plurality of words; determining a plurality of sentence vectors based on a plurality of initial recommendation weight vectors and the plurality of word vectors; adjusting the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, thereby determining a recommendation probability distribution for the plurality of words; and determining a word to be output based on the recommendation probability distribution.

In some embodiments, determining the plurality of sentence vectors based on the plurality of initial recommendation weight vectors and the plurality of word vectors includes: processing the plurality of word vectors with an encoding neural network to determine a current encoded hidden state vector corresponding to each word vector; and determining, based on each initial recommendation weight vector and the current encoded hidden state vectors, the sentence vector corresponding to that initial recommendation weight vector.

In some embodiments, determining the word to be output based on the recommendation probability distribution includes: determining a current decoded hidden state vector with a decoding neural network based on the current encoded hidden state vectors; determining a current word probability distribution using the current encoded hidden state vectors and the current decoded hidden state vector; and determining the word to be output based on the current word probability distribution and the recommendation probability distribution.

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and determining the word to be output based on the current word probability distribution and the recommendation probability distribution includes: adjusting the attention probability distribution using the recommendation probability distribution to determine an adjusted attention probability distribution; computing a weighted sum of the generation probability distribution and the adjusted attention probability distribution to determine an output word probability distribution; and determining the word with the highest probability in the output word probability distribution as the word to be output.
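The weighted-sum embodiment above can be illustrated with a minimal sketch. The text here does not specify how the recommendation probability distribution adjusts the attention probability distribution, so the element-wise product followed by renormalization below is purely an assumption, as are the function names, the fixed mixing weight `gen_weight`, and the simplification that all distributions are keyed by word.

```python
def normalize(dist):
    """Rescale a {word: score} mapping so the values sum to 1."""
    total = sum(dist.values())
    if total <= 0.0:
        raise ValueError("distribution has no probability mass")
    return {word: score / total for word, score in dist.items()}


def adjust_attention(attention, recommendation):
    # ASSUMPTION: element-wise product, then renormalization.
    # The source only says the attention distribution is "adjusted".
    product = {w: p * recommendation.get(w, 0.0) for w, p in attention.items()}
    return normalize(product)


def output_distribution(generation, adjusted_attention, gen_weight):
    # Weighted sum of the generation and adjusted attention distributions.
    # gen_weight is a hypothetical mixing weight in [0, 1].
    words = set(generation) | set(adjusted_attention)
    return {
        w: gen_weight * generation.get(w, 0.0)
        + (1.0 - gen_weight) * adjusted_attention.get(w, 0.0)
        for w in words
    }


def pick_word(dist):
    """Return the word with the highest probability."""
    return max(dist, key=dist.get)


generation = {"cat": 0.5, "dog": 0.25, "ran": 0.25}
attention = {"cat": 0.5, "dog": 0.5}
recommendation = {"cat": 0.25, "dog": 0.75}

adjusted = adjust_attention(attention, recommendation)
final = output_distribution(generation, adjusted, gen_weight=0.5)
chosen = pick_word(final)
```

In this toy input the recommendation distribution shifts attention toward "dog", which is enough to make it the selected output word even though the generation distribution alone favors "cat".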

In some embodiments, the current word probability distribution includes a generation probability distribution and an attention probability distribution, and determining the word to be output based on the current word probability distribution and the recommendation probability distribution includes: determining weights for the generation probability distribution, the attention probability distribution, and the recommendation probability distribution; determining the output word probability distribution based on those weights; and determining the word with the highest probability in the output word probability distribution as the word to be output.

In some embodiments, the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors is determined as follows: for each sentence vector, that sentence vector is combined with another sentence vector to generate a combined sentence vector, and the combined sentence vector is processed with a relevance matrix to determine the relevance between that sentence vector and the other sentence vector.

In some embodiments, adjusting the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, thereby determining the recommendation probability distribution for the plurality of words, includes: determining a recommendation coefficient for each sentence vector based on the relevance between that sentence vector and each of the other sentence vectors among the plurality of sentence vectors; adjusting each initial recommendation weight vector using the recommendation coefficient of the sentence vector corresponding to that initial recommendation weight vector, thereby obtaining an adjusted word probability vector; and determining the recommendation probability distribution of the plurality of words based on the adjusted word probability vectors.

In yet another aspect of the present disclosure, a text processing device is provided that includes a processor and a memory storing computer-readable program instructions, wherein the text processing method described above is performed when the computer-readable program instructions are executed by the processor.

In yet another aspect of the present disclosure, a computer-readable storage medium storing computer-readable instructions is provided, wherein the computer-readable instructions, when executed by a computer, cause the computer to perform the text processing method described above.

With the text processing method, apparatus, device, and computer-readable storage medium according to the present disclosure, a text summarization method can, based on the relevance between each word in the text and the sentences composed of those words, better understand the text content, abstract and summarize that content more suitably, and generate a summary of the text.

The above and other objects, features, and advantages of the present invention will become apparent from the more detailed description below, which is based on the embodiments of the present invention and the accompanying drawings. The drawings provide a further understanding of the embodiments of the present disclosure, constitute a part of this specification, and serve together with the embodiments to explain the present disclosure; they do not limit the present disclosure. In the drawings, the same reference numerals denote the same components or steps.
FIG. 1 shows a schematic flowchart of the text processing method according to the present disclosure. FIG. 2 shows a schematic diagram of determining the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, according to an embodiment of the present disclosure. The remaining drawings show, in order and each according to an embodiment of the present disclosure: a schematic diagram of determining the output word probability distribution; a schematic diagram of determining the output word probability distribution using the generation probability distribution and the adjusted attention probability distribution; a schematic diagram of determining the output word probability distribution using the generation probability distribution, the attention probability distribution, and the recommendation probability distribution; a schematic block diagram of the text processing apparatus; and a schematic diagram of a computing device.

The technical solutions in the embodiments of the present disclosure are described below clearly and completely, in conjunction with the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments that a person skilled in the art can obtain from the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.

Unless otherwise defined, technical or scientific terms used herein have the ordinary meaning understood by a person of ordinary skill in the art to which the present invention belongs. "First", "second", and similar terms used herein do not indicate any order, quantity, or importance; they are used only to distinguish different components. Likewise, "including", "comprising", and similar words mean that the element or article preceding the word encompasses the elements or articles listed after the word and their equivalents, without excluding other elements or articles. "Connected", "coupled", and similar terms are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect. "Up", "down", "left", "right", and the like indicate only relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may change accordingly.

FIG. 1 shows a schematic flowchart of the text processing method according to the present disclosure. As shown in FIG. 1, in step S102, a source text is preprocessed to generate a plurality of word vectors for a plurality of words.

When the text processing method is executed by a computer, the computer cannot process text data directly, so the source text must first be converted into numerical data. The content of the source text may be, for example, one or more sentences. The preprocessing includes performing word segmentation on each sentence to split it into a plurality of words, and converting each of those words into a word vector of a predetermined dimension. This conversion can be performed, for example, by word embedding.
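As a rough illustration of step S102, the sketch below splits a source text into sentences, segments each sentence into words, and maps each word to a fixed-dimension vector. Everything in it is an assumption made for illustration: the whitespace tokenizer, the period-based sentence split, and the randomly initialized lookup table that stands in for a trained word embedding.

```python
import random


def tokenize(sentence):
    # Naive whitespace segmentation (illustrative only); languages without
    # word delimiters would need a dedicated word segmenter.
    return sentence.lower().split()


def build_embedding_table(words, dim, seed=0):
    # Hypothetical stand-in for a trained embedding: each distinct word
    # gets a fixed pseudo-random vector of length `dim`.
    rng = random.Random(seed)
    table = {}
    for word in words:
        if word not in table:
            table[word] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table


def preprocess(source_text, dim=4):
    """Return the word sequence and one word vector per word."""
    sentences = [s.strip() for s in source_text.split(".") if s.strip()]
    words = [w for s in sentences for w in tokenize(s)]
    table = build_embedding_table(words, dim)
    return words, [table[w] for w in words]


words, word_vectors = preprocess("The cat sat. The dog ran.")
```

Repeated words share the same vector, which is the property a real embedding lookup would also have.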

In step S104, a plurality of sentence vectors S are determined based on a plurality of initial recommendation weight vectors and the plurality of word vectors.

In some embodiments, for each time step, the word vectors generated in step S102 are processed with an encoding neural network to determine the current encoded hidden state vector corresponding to each word vector. In some implementations, the encoding neural network may be implemented as a long short-term memory (LSTM) network. It will be understood that the encoding neural network may also be implemented as any machine learning model capable of encoding word vectors.

Taking the word vectors generated in step S102 as input, the encoding neural network can output, for the current time step, the current encoded hidden state vectors h_1, h_2, h_3, ... corresponding respectively to the word vectors x_1, x_2, x_3, .... The number of encoded hidden state vectors and the number of word vectors may be the same or different. For example, if k word vectors are generated from the source text, the encoding neural network can process those k word vectors to generate k corresponding encoded hidden state vectors, where k is an integer greater than 1.
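The input and output shape of the encoder can be sketched as follows. This is not an LSTM: it is a deliberately simplified element-wise recurrent update, used here only as a stand-in to show k word vectors going in and k encoded hidden state vectors coming out. The function name and the two gain parameters are hypothetical.

```python
import math


def encode(word_vectors, input_gain=0.5, recurrent_gain=0.5):
    """Toy recurrent encoder: one hidden state vector per input word vector.

    Each hidden state mixes the current word vector with the previous
    hidden state, so later states carry information from earlier words.
    """
    hidden = [0.0] * len(word_vectors[0])
    states = []
    for x in word_vectors:
        hidden = [
            math.tanh(input_gain * xi + recurrent_gain * hi)
            for xi, hi in zip(x, hidden)
        ]
        states.append(hidden)
    return states


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # k = 3 word vectors
h = encode(x)                              # k = 3 hidden state vectors
```

A trained LSTM would replace the body of `encode`; the surrounding steps only rely on receiving one hidden state per word vector.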

Next, based on each initial recommendation weight vector and the current encoded hidden state vectors, the sentence vector corresponding to that initial recommendation weight vector is determined.

In some embodiments, an initial recommendation weight vector W may be expressed as the vector [w_1, w_2, ..., w_k], where the number of elements of W equals the number of encoded hidden state vectors. Each element of W represents the weight coefficient applied to the corresponding encoded hidden state vector when a sentence vector is determined from the current encoded hidden state vectors. Using these weight coefficients, the encoded hidden state vectors corresponding to the word vectors input to the encoding neural network can be combined to form a sentence vector that contains information from each word vector. The sentence vector referred to here may be an abstract sentence vector, which need not correspond one-to-one to the sentences contained in the input text. A sentence vector S may contain information from some or all of the word vectors generated in step S102.

In some implementations, the sentence vector S may be expressed as a weighted average of the current encoded hidden state vectors h_1, h_2, ..., h_k. For example, S may be expressed as S = W * h, where W = [w_1, w_2, ..., w_k] and h = [h_1, h_2, ..., h_k]^T. Accordingly, a predetermined number of pre-trained initial recommendation weight vectors W_1, W_2, ..., W_n can be used to obtain a predetermined number of sentence vectors S_1, S_2, ..., S_n. Here, n and m are integers greater than 1.
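The relation S = W * h can be written out directly. In the minimal sketch below, each initial recommendation weight vector holds one coefficient per encoded hidden state vector, and the sentence vector is the resulting weighted combination. The function name and the numeric values are illustrative assumptions; in the disclosure the weight vectors are pre-trained.

```python
def sentence_vector(weights, hidden_states):
    """Compute S = W * h for W = [w_1..w_k] and h = [h_1..h_k]^T."""
    if len(weights) != len(hidden_states):
        raise ValueError("need one weight per encoded hidden state vector")
    dim = len(hidden_states[0])
    return [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(dim)
    ]


# k = 3 hidden state vectors of dimension d = 2 (made-up values)
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# n = 2 initial recommendation weight vectors (made-up values)
initial_weights = [
    [0.5, 0.5, 0.0],  # W_1 draws on the first two words
    [0.0, 0.0, 1.0],  # W_2 draws on the last word only
]

sentence_vectors = [sentence_vector(W, hidden_states) for W in initial_weights]
```

Because each W_i can place weight on any subset of hidden states, the resulting sentence vectors are abstract and need not line up with the literal sentences of the source text.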

In step S106, the plurality of initial recommendation weight vectors are adjusted based on the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, thereby determining a recommendation probability distribution for the plurality of words.

FIG. 2 shows a schematic diagram of determining the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors, according to an embodiment of the present disclosure. FIG. 2 uses five word vectors as an example to describe the principle of the present disclosure, but the scope of the present disclosure is not limited to this; the text processing method may be implemented with any other number of word vectors.

As shown in FIG. 2, x_1, x_2, x_3, x_4, x_5 are word vectors generated from the source text, each corresponding to a word in the source text. The encoding neural network is used to generate the encoded hidden state vectors h_1, h_2, h_3, h_4, h_5 corresponding respectively to x_1, x_2, x_3, x_4, x_5.

FIG. 2 shows three initial recommendation weight vectors W_1, W_2, W_3. The present disclosure is not limited to this; the text processing method may be implemented with any other number of initial recommendation weight vectors. As shown in FIG. 2, the initial recommendation weight vectors W_1, W_2, W_3 are used to determine the sentence vectors S_1, S_2, and S_3.

For each of the sentence vectors S_1, S_2, S_3, that sentence vector is combined with another sentence vector to generate a combined sentence vector. The combined sentence vector contains the information of the at least two sentence vectors that were combined. The principle of the present disclosure is explained below using the relevance between two sentence vectors as an example, but a person skilled in the art may also combine three or more sentence vectors and determine the relevance among them.

For example, as shown in FIG. 2, the relevance λ_{1,2} between sentence vectors S_1 and S_2, the relevance λ_{1,3} between S_1 and S_3, and the relevance λ_{2,3} between S_2 and S_3 can be calculated.

In some implementations, a sentence vector can be concatenated with another sentence vector to obtain a combined sentence vector of higher dimension. For example, if the dimension of a sentence vector S is d, concatenating S_1 and S_2 yields a combined sentence vector S_{1,2} of dimension 2d, where d is an integer greater than 1.

When the relevance between S_1 and S_2 is calculated for S_1, the two vectors are concatenated with S_1 first and S_2 second. When the relevance between S_2 and S_1 is calculated for S_2, they are concatenated with S_2 first and S_1 second. In this case, the combined sentence vector S_{1,2} differs from the combined sentence vector S_{2,1}.

In other implementations, the combined sentence vector is generated by applying a vector operation to the two sentence vectors (for example, addition, subtraction, or a vector product). In this case, the combined sentence vector S_{1,2} may be the same as the combined sentence vector S_{2,1}.

In practice, a person skilled in the art can generate a combined sentence vector that combines the information of at least two sentence vectors in any manner.
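The difference between the two combination styles described above can be seen in a few lines. This sketch contrasts order-sensitive concatenation with element-wise addition, which is one of the vector operations the text mentions; the function names and values are illustrative.

```python
def combine_by_concatenation(first, second):
    # Order matters: the result has dimension 2d and S_{1,2} != S_{2,1}.
    return list(first) + list(second)


def combine_by_addition(first, second):
    # Order does not matter: the result keeps dimension d and
    # S_{1,2} == S_{2,1}.
    return [a + b for a, b in zip(first, second)]


s1 = [1.0, 2.0]  # d = 2
s2 = [3.0, 4.0]

s12_concat = combine_by_concatenation(s1, s2)
s21_concat = combine_by_concatenation(s2, s1)
s12_sum = combine_by_addition(s1, s2)
s21_sum = combine_by_addition(s2, s1)
```

With concatenation, the relevance computed "for S_1" and "for S_2" can differ, because the downstream projection sees the two vectors in different positions.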

The combined sentence vector is then processed with a relevance matrix to determine the relevance between the sentence vector and the other sentence vector. In some embodiments, the relevance λ_{1,2} between sentence vectors S_1 and S_2 may be expressed as λ_{1,2} = S_{1,2} * Z, where S_{1,2} denotes the combined sentence vector of S_1 and S_2 and Z denotes a trained relevance matrix. Z is used to calculate the relevance coefficient λ_{1,2} between S_1 and S_2. In some embodiments, the relevance matrix Z projects the combined sentence vector S_{1,2} onto a real-valued relevance coefficient.
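A minimal sketch of the projection λ_{1,2} = S_{1,2} * Z follows, assuming the concatenation variant so that S_{1,2} has 2d elements and Z is a 2d-by-1 matrix. The values of Z are invented for illustration; in the disclosure Z is trained.

```python
def relevance(combined, Z):
    """Project a combined sentence vector onto a real number: lambda = S * Z.

    `combined` is a row vector with 2d elements and `Z` is a 2d x 1 matrix,
    given as a list of single-element rows.
    """
    if len(combined) != len(Z):
        raise ValueError("Z needs one row per element of the combined vector")
    return sum(c * row[0] for c, row in zip(combined, Z))


Z = [[0.5], [0.25], [-0.5], [0.25]]  # made-up relevance matrix, 2d = 4

s12 = [1.0, 2.0, 3.0, 4.0]  # S_1 followed by S_2
s21 = [3.0, 4.0, 1.0, 2.0]  # S_2 followed by S_1

lambda_12 = relevance(s12, Z)
lambda_21 = relevance(s21, Z)
```

Because the concatenation order differs, the two coefficients are not equal here, which matches the note that S_{1,2} and S_{2,1} are different vectors.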

上記の方法によって、文ベクトルＳ_１、Ｓ_２…、Ｓｎのうちの任意の２つの文ベクトルの間の関連性を計算することができる。 By the above method, it is possible to calculate the relationship between any two sentence vectors among the sentence vectors S ₁ , S _{2 .} . . , Sn.

For any one of the above sentence vectors, a recommendation coefficient for that sentence vector can be determined based on its relevance to each of the other sentence vectors among the plurality of sentence vectors. In some implementations, the recommendation coefficient may be expressed as the sum of the relevances between that sentence vector and each of the other sentence vectors.

For example, the recommendation coefficient of sentence vector S_1 is expressed as Σλ_1 = λ_{1,2} + λ_{1,3} + ... + λ_{1,m}, and that of sentence vector S_2 as Σλ_2 = λ_{2,1} + λ_{2,3} + ... + λ_{2,m}. In this way, the recommendation coefficient of each sentence vector can be determined.

In other implementations, the recommendation coefficient of a sentence vector may be expressed as a weighted sum of the relevances between that sentence vector and each of the other sentence vectors. The weighted addition may use predetermined weighting coefficients.
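Both variants of the recommendation coefficient, the plain sum and the weighted sum, can be sketched in one helper. The function name `recommendation_coefficients`, the dictionary layout keyed by 0-based index pairs, and the toy relevance values are assumptions made for illustration only.

```python
def recommendation_coefficients(lam, n, weights=None):
    """Return [sum_j w_{i,j} * lambda_{i,j} for j != i] for each sentence i.

    lam     -- dict mapping (i, j) to the relevance lambda_{i,j}
    weights -- optional dict of predetermined weighting coefficients;
               when omitted, every weight is 1 (plain sum).
    """
    coeffs = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue
            w = 1.0 if weights is None else weights[(i, j)]
            total += w * lam[(i, j)]
        coeffs.append(total)
    return coeffs


# Hypothetical relevance values for three sentence vectors.
lam = {(0, 1): 0.5, (0, 2): 0.25,
       (1, 0): 0.5, (1, 2): 1.0,
       (2, 0): 0.25, (2, 1): 1.0}

plain = recommendation_coefficients(lam, 3)
doubled = recommendation_coefficients(lam, 3, {pair: 2.0 for pair in lam})
```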

The above recommendation coefficients can be used to adjust the initial recommendation weight vectors that were used to generate the corresponding sentence vectors, thereby obtaining adjusted word probability vectors. For example, as shown in FIG. 2, the initial recommendation weight vectors W_1, W_2 and W_3 can be processed using the recommendation coefficients Σλ_1, Σλ_2 and Σλ_3 corresponding to the sentence vectors S_1, S_2 and S_3.

As described above, the recommendation coefficient is determined from the relevance between a sentence vector and the other sentence vectors. Because generating a text summary requires condensing the content of the text, a sentence vector that is more relevant to the other sentence vectors is considered to carry word-vector information that is more important within the text, and that information is therefore more likely to appear in the summary.

In some embodiments, the recommendation coefficient of each sentence vector is multiplied by the word probability vector corresponding to that sentence vector, which adjusts the weighting coefficients that the word probability vector holds for the encoded hidden state vector of each word vector. For example, the adjusted i-th word probability vector W_i' may be expressed as W_i' = Σλ_i * W_i.

After the word probability vector of each sentence vector has been adjusted using that sentence vector's recommendation coefficient, the recommendation probability distribution for the plurality of words may be determined from the adjusted word probability vectors W' obtained as above.

In some embodiments, the recommendation probability distribution P_V may be expressed as the sum of the adjusted word probability vectors, P_V = ΣW_i'. In some implementations, P_V may instead be expressed as a weighted sum of the adjusted word probability vectors W_i'.
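The scaling W_i' = Σλ_i * W_i and the summation P_V = ΣW_i' can be sketched as below. The helper names and toy numbers are hypothetical. The text does not state whether P_V is renormalized afterwards, so this sketch leaves the sum unnormalized.

```python
def adjust_weight_vectors(weight_vectors, coeffs):
    """W_i' = coeff_i * W_i: scale every element of W_i by its coefficient."""
    return [[c * w for w in vec] for vec, c in zip(weight_vectors, coeffs)]


def recommendation_distribution(adjusted):
    """P_V = sum_i W_i': element-wise sum over the adjusted vectors."""
    return [sum(column) for column in zip(*adjusted)]


# Hypothetical values: two weight vectors over three word positions.
W = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25]]
coeffs = [2.0, 4.0]

W_adj = adjust_weight_vectors(W, coeffs)
P_V = recommendation_distribution(W_adj)
```

Each entry of `P_V` corresponds to one word position of the source text, so a larger entry marks a word whose sentence vectors are more strongly related to the rest of the text.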

Referring to FIG. 1, in step S108, the word to be output may be determined based on the recommendation probability distribution.

The recommendation probability distribution output in step S106 indicates the importance, within the source text, of each word of the input source text: the larger a word's probability in the distribution, the more important that word is considered to be in the source text for the current time step. In some examples, the word with the highest probability in the recommendation probability distribution may therefore be determined as the word to be output at the current time step.

In some embodiments, the output word probability distribution may be determined by adjusting, based on the recommendation probability distribution, the word probability distribution produced by an existing generative network (Generative Networks).

For each time step, a current decoded hidden state vector can be determined from the current encoded hidden state vectors using a decoding neural network. A current word probability distribution can be determined from the current encoded hidden state vectors and the current decoded hidden state vector. Based on the current word probability distribution and the recommendation probability distribution, the output word probability distribution for the current time step is determined, and the word corresponding to the word vector with the maximum probability in the output word probability distribution can be selected as the word to be output at the current time step.

Here, the current word probability distribution may be an attention (Attention) probability distribution. The attention probability distribution indicates the probability that each word in the input text becomes a word in the text summary.

FIG. 3A shows a schematic diagram of determining the output word probability distribution according to an embodiment of the present disclosure. As shown in FIG. 3A, an adjusted attention probability distribution can be formed by adjusting the attention probability distribution using the recommendation probability distribution P_V.

In one implementation, the attention probability distribution can be determined based on the encoded hidden state vectors and the decoded hidden state vector for the current time step. For example, the attention probability distribution can be determined using equation (1).

$$a^t = \mathrm{softmax}(e^t) \tag{1}$$

Here, t denotes the current time step, a^t denotes the attention probability distribution for the current time step, softmax is the normalized exponential function, and e^t is determined by equation (2) as follows.

$$e_i^t = v^T \tanh(W_h h_i + W_S s_t + b_{attn}) \tag{2}$$

Here, v^T, W_h, W_S and b_attn are learned parameters of the pointer-generator network (Pointer-Generator Networks), h_i is the current encoded hidden state vector, and s_t is the current decoded hidden state vector.
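The attention computation described by equations (1) and (2) can be sketched in plain Python as follows. This is an illustrative sketch of standard additive attention as used in pointer-generator networks; the helper names `matvec`, `softmax`, `attention` and all parameter values are hypothetical toy choices, not trained values.

```python
import math


def softmax(xs):
    """Normalized exponential function, shifted by max(xs) for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]


def attention(enc_states, dec_state, W_h, W_S, b_attn, v):
    """Return (e^t, a^t) with e_i^t = v^T tanh(W_h h_i + W_S s_t + b_attn)."""
    ws = matvec(W_S, dec_state)
    scores = []
    for h in enc_states:
        wh = matvec(W_h, h)
        hidden = [math.tanh(a + b + c) for a, b, c in zip(wh, ws, b_attn)]
        scores.append(sum(vi * x for vi, x in zip(v, hidden)))
    return scores, softmax(scores)


# Hypothetical toy parameters: 2-dimensional states, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
e_t, a_t = attention(enc_states, [0.0, 0.0], identity, identity,
                     [0.0, 0.0], [1.0, 0.0])
```

In this toy setup the first and third source positions share the same encoded state and therefore receive the same attention probability.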

In some embodiments, the attention probability distribution is adjusted using the recommendation probability distribution to determine an adjusted attention probability distribution.

For example, the adjusted attention probability distribution a'^t can be determined using equation (3).

Here, t is the current time step, a'^t denotes the adjusted attention probability distribution for the current time step, and e^t is the parameter determined by equation (2).

The adjusted attention probability distribution can be used to determine the probability that each word in the input text becomes a word in the text summary. For example, the word with the highest probability is selected from the input text as the word to be output.
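One possible way to realize this adjustment is sketched below. The exact form of equation (3) is not reproduced in this text, so the sketch assumes, purely for illustration, that each attention score e_i^t is multiplied by the corresponding entry of P_V before the softmax. The function name `adjusted_attention` and the toy values are hypothetical.

```python
import math


def softmax(xs):
    """Normalized exponential function, shifted by max(xs) for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def adjusted_attention(scores, p_v):
    # ASSUMPTION: element-wise product of P_V and e^t followed by softmax.
    # This is one plausible reading, not the confirmed form of equation (3).
    return softmax([p * e for p, e in zip(p_v, scores)])


# Hypothetical values: equal raw scores, unequal recommendation values.
e_t = [1.0, 1.0, 1.0]
p_v = [0.5, 2.0, 1.0]
a_adj = adjusted_attention(e_t, p_v)

# Select the source position with the highest adjusted probability.
best = max(range(len(a_adj)), key=a_adj.__getitem__)
```

With equal raw scores, the position favored by the recommendation values ends up with the highest adjusted probability, which matches the intent of steering the output toward important words.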

In some embodiments, the current word probability distribution further includes a generation probability distribution P_vocab. The generation probability distribution indicates the probability that each word in the text entity dictionary (text entity dictionary) becomes a word in the text summary.

FIG. 3B shows a schematic diagram of determining the output word probability distribution using the generation probability distribution and the adjusted attention probability distribution according to an embodiment of the present disclosure.

In some embodiments, the generation probability distribution can be determined based on the context vector and the decoded hidden state vector for the current time step. For example, the generation probability distribution P_vocab can be determined using equations (4) and (5).

$$P_{vocab} = \mathrm{softmax}(V'(V[s_t, h_t^*] + b) + b') \tag{5}$$

Here, V', V, b and b' are learned parameters of the pointer-generator network, and h_t^* is the context vector determined from the attention probability distribution. For example, h_t^* can be determined using equation (4).

$$h_t^* = \sum_i a_i^t h_i \tag{4}$$

Here, a_i^t is the i-th element of the attention probability distribution a^t determined by equation (1), and h_i is the current i-th encoded hidden state vector.
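The context vector and the generation probability distribution can be sketched as follows, assuming the usual pointer-generator form in which the decoded hidden state and the context vector are concatenated and passed through two linear layers. All helper names, matrix shapes and toy parameter values are hypothetical.

```python
import math


def softmax(xs):
    """Normalized exponential function, shifted by max(xs) for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]


def context_vector(attn, enc_states):
    """h_t^* = sum_i a_i^t * h_i."""
    dim = len(enc_states[0])
    return [sum(a * h[d] for a, h in zip(attn, enc_states)) for d in range(dim)]


def generation_distribution(s_t, h_star, V, b, V_prime, b_prime):
    """P_vocab = softmax(V'(V[s_t, h_t^*] + b) + b')."""
    x = list(s_t) + list(h_star)  # concatenation [s_t, h_t^*]
    hidden = [y + bi for y, bi in zip(matvec(V, x), b)]
    logits = [y + bi for y, bi in zip(matvec(V_prime, hidden), b_prime)]
    return softmax(logits)


# Hypothetical toy values: 1-dim decoder state, 2-dim encoder states, 3 words.
h_star = context_vector([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
V_prime = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
P_vocab = generation_distribution([1.0], h_star, V, [0.0, 0.0],
                                  V_prime, [0.0, 0.0, 0.0])
```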

Then, the output word probability distribution can be determined as a weighted sum of the generation probability distribution and the adjusted attention probability distribution.

In some embodiments, a first weight P_gen for the generation probability distribution and the adjusted attention probability distribution can be determined based on the encoded hidden state vectors, the decoded hidden state vector and the attention probability distribution for the current time step, and on the output of the decoding neural network at the previous time step.

For example, the first weight P_gen used to compute the weighted sum of the generation probability distribution and the adjusted attention probability distribution may be expressed as equation (6).

$$P_{gen} = \sigma(w_h^T h_t^* + w_s^T s_t + w_x^T x_t + b_{ptr}) \tag{6}$$

Here, σ denotes an activation function, for example the sigmoid function; w_h^T, w_s^T, w_x^T and b_ptr are training parameters; h_t^* is the parameter determined by equation (4) at time step t; s_t is the decoded hidden state vector at time step t; and x_t is the input of the decoding neural network at time step t, that is, the output of the decoding neural network at the previous time step t-1. The first weight P_gen determined by equation (6) may be implemented as a scalar. Using the first weight P_gen, the output word probability distribution can be obtained as a weighted average of the generation probability distribution P_vocab and the adjusted attention probability distribution a'^t.
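The first weight and the resulting mixture can be sketched as follows. The sketch assumes, in the manner of pointer-generator networks, that P_gen weights the generation distribution and (1 - P_gen) weights the adjusted attention, and that attention over source positions is added onto the dictionary entries of the corresponding source words. The names `first_weight`, `mix`, `source_ids` and the toy values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def first_weight(h_star, s_t, x_t, w_h, w_s, w_x, b_ptr):
    """P_gen = sigma(w_h^T h_t^* + w_s^T s_t + w_x^T x_t + b_ptr), a scalar."""
    return sigmoid(dot(w_h, h_star) + dot(w_s, s_t) + dot(w_x, x_t) + b_ptr)


def mix(p_gen, p_vocab, attn_adj, source_ids):
    """Weighted average of P_vocab and the adjusted attention.

    source_ids[i] is the dictionary index of the i-th source word
    (an assumed mapping from source positions to dictionary entries).
    """
    out = [p_gen * p for p in p_vocab]
    for a, idx in zip(attn_adj, source_ids):
        out[idx] += (1.0 - p_gen) * a
    return out


# Hypothetical toy values; zero parameters give P_gen = sigmoid(0) = 0.5.
p_gen = first_weight([0.5, 0.5], [1.0], [1.0], [0.0, 0.0], [0.0], [0.0], 0.0)
out = mix(p_gen, [0.5, 0.25, 0.25], [0.5, 0.5], [2, 2])
```

Because both source positions map to dictionary entry 2 in this toy example, that entry accumulates the copied probability mass on top of its generated share.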

FIG. 3C shows a schematic diagram of determining the output word probability distribution using the generation probability distribution, the attention probability distribution and the recommendation probability distribution according to an embodiment of the present disclosure.

As shown in FIG. 3C, the output word probability distribution can be determined as a weighted sum of the generation probability distribution, the attention probability distribution and the recommendation probability distribution. In one implementation, a second weight P_gen2 for this weighted sum can be determined based on the encoded hidden state vectors, the decoded hidden state vector, the attention probability distribution and the recommendation probability distribution for the current time step, and on the output of the decoding neural network at the previous time step.

The second weight P_gen2 for the weighted sum of the generation probability distribution, the attention probability distribution and the recommendation probability distribution can be determined using equation (7).

$$P_{gen2} = \sigma(w_h^T h_t^* + w_s^T s_t + w_x^T x_t + w_V^T P_V + b_{ptr}) \tag{7}$$

Here, σ denotes an activation function, for example the sigmoid function; w_h^T, w_s^T, w_x^T, w_V^T and b_ptr are training parameters; h_t^* is the parameter determined by equation (4) at time step t; s_t is the decoded hidden state vector at time step t; x_t is the input of the decoding neural network at time step t, that is, the output of the decoding neural network at the previous time step t-1; and P_V is the recommendation probability distribution at time step t.

The weight P_gen2 determined by equation (7) is implemented as a three-dimensional vector whose elements are the weighting coefficients of the generation probability distribution P_vocab, the attention probability distribution a^t and the recommendation probability distribution P_V, respectively.
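A three-way mixture driven by such a three-element weight can be sketched as below. The sketch assumes that the attention and recommendation distributions are defined over source positions and are first mapped onto dictionary entries, and that the result is renormalized; neither detail is stated in the text. The names `scatter`, `mix_three`, `source_ids` and the toy values are hypothetical.

```python
def scatter(position_probs, source_ids, vocab_size):
    """Accumulate per-position probabilities onto dictionary entries."""
    out = [0.0] * vocab_size
    for p, idx in zip(position_probs, source_ids):
        out[idx] += p
    return out


def mix_three(p_gen2, p_vocab, attn, p_v, source_ids):
    """Weighted sum of the three distributions, renormalized (assumption)."""
    g, a, v = p_gen2  # weights for P_vocab, a^t and P_V respectively
    attn_vocab = scatter(attn, source_ids, len(p_vocab))
    pv_vocab = scatter(p_v, source_ids, len(p_vocab))
    mixed = [g * x + a * y + v * z
             for x, y, z in zip(p_vocab, attn_vocab, pv_vocab)]
    total = sum(mixed)
    return [m / total for m in mixed]


# Hypothetical toy values: two source words mapped to dictionary ids 1 and 2.
out = mix_three([0.5, 0.25, 0.25],
                [0.5, 0.5, 0.0],   # P_vocab
                [1.0, 0.0],        # attention over source positions
                [0.0, 1.0],        # recommendation over source positions
                [1, 2])
```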

The training parameters of the model used in the above text processing are trained using a predetermined training data set. For example, training data is input into the text processing model, and the word vectors of the source text are processed using the encoding neural network, the decoding neural network, and the initial recommendation weight vectors used to determine the relevance between sentence vectors, which yields an output word probability distribution for training. The training parameters of the text processing model can then be adjusted by calculating the loss on the probability of the correct word in that output word probability distribution. The loss function of the text generation network according to the present disclosure can be expressed as follows.

$$loss = \frac{1}{T}\sum_{t=0}^{T} -\log P(w_t^*) \tag{8}$$

Here, w_t^* is the correct word for time step t, P(w_t^*) is its probability value in the output word probability distribution obtained in training at time step t, and T is the total number of time steps over the entire generated sequence. The overall loss of the text generation network can be determined by aggregating the loss values at all time steps over the generated sequence.

The parameters of the text processing model can be trained by adjusting them so that the above loss is minimized.
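The loss described above can be sketched as an average negative log-likelihood of the correct word over all time steps. Averaging, rather than summing, is an assumption borrowed from the usual pointer-generator formulation; the function name `sequence_loss` and the toy distributions are hypothetical.

```python
import math


def sequence_loss(step_distributions, target_ids):
    """Mean over time steps of -log P(w_t^*).

    step_distributions[t] is the output word probability distribution at
    step t, and target_ids[t] is the dictionary index of the correct word.
    """
    losses = [-math.log(dist[target])
              for dist, target in zip(step_distributions, target_ids)]
    return sum(losses) / len(losses)


# Hypothetical two-step sequence over a two-word dictionary.
loss = sequence_loss([[0.5, 0.5], [0.25, 0.75]], [0, 0])
```

A training loop would lower this value by adjusting the model parameters, for instance with gradient descent, which the text leaves unspecified.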

According to the text processing method of the present disclosure, when generating, for example, the content of a text summary, the importance of each word of the input text within the content of that text can be determined based on the correlation between sentence vectors composed of the words of the input text, and the content of the generated text can be determined based on that importance. Although the principle has been explained using summary generation as an example, the present disclosure is not limited to this. The text processing method according to the present disclosure can also be applied to other application scenarios, such as text expansion and text rewriting, without departing from its principles.

FIG. 4 shows a schematic block diagram of a text processing device according to an embodiment of the present disclosure. As shown in FIG. 4, the text processing device 400 includes a preprocessing unit 410, a sentence vector determining unit 420, a recommendation probability determining unit 430, and an output unit 440.

The preprocessing unit 410 is configured to preprocess the source text to generate a plurality of word vectors for the plurality of words. For example, this preprocessing can be implemented by word embedding (word embedding).

The sentence vector determining unit 420 is configured to determine a plurality of sentence vectors S based on a plurality of initial recommendation weight vectors and the plurality of word vectors.

In some embodiments, for each time step, the plurality of word vectors generated by the preprocessing unit 410 can be processed using an encoding neural network to determine a current encoded hidden state vector corresponding to each word vector.

Taking the word vectors generated by the preprocessing unit 410 as input, the encoding neural network can output, at the current time step, the current encoded hidden state vectors h_1, h_2, h_3, ... corresponding to the word vectors x_1, x_2, x_3, ..., respectively. The number of encoded hidden state vectors and the number of word vectors may be the same or different. For example, when k word vectors are generated from the source text, the encoding neural network processes these k word vectors to generate k corresponding encoded hidden state vectors, where k is an integer greater than 1.

Next, based on each initial recommendation weight vector and the current encoded hidden state vectors, the sentence vector corresponding to that initial recommendation weight vector can be determined.

In some embodiments, an initial recommendation weight vector W may be represented as a vector [w_1, w_2, ..., w_k], where the number of elements of W equals the number of encoded hidden state vectors. Each element of W is the weighting coefficient applied to one encoded hidden state vector when a sentence vector is determined from the current encoded hidden state vectors. Using these weighting coefficients, the information in the encoded hidden state vectors of the word vectors input to the encoding neural network is combined to form a sentence vector that contains the information of each word vector. In some implementations, the sentence vector S may be expressed as a weighted average of the current encoded hidden state vectors h_1, h_2, ..., h_k. A predetermined number of sentence vectors S_1, S_2, ..., S_n are then obtained using a predetermined number of pre-trained initial recommendation weight vectors W_1, W_2, ..., W_n.
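Forming sentence vectors from the encoded hidden state vectors can be sketched as follows. The sketch reads "weighted average" as a weighted sum divided by the total weight, which is an assumption; the function name `sentence_vector` and the toy vectors are hypothetical.

```python
def sentence_vector(weights, hidden_states):
    """Weighted average of the encoded hidden state vectors h_1..h_k.

    weights       -- one initial recommendation weight vector [w_1..w_k]
    hidden_states -- k encoded hidden state vectors of equal dimension
    """
    total = sum(weights)
    dim = len(hidden_states[0])
    return [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) / total
        for d in range(dim)
    ]


# Hypothetical values: k = 3 hidden states, n = 2 weight vectors.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, 0.25, 0.25],
     [0.25, 0.25, 0.5]]
S = [sentence_vector(w, hidden) for w in W]
```

Each weight vector emphasizes different words, so the n resulting vectors act as n different "views" of the same source text.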

The recommendation probability determining unit 430 is configured to determine the recommendation probability distribution for the plurality of words by adjusting the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors among the plurality of sentence vectors.

As shown in FIG. 4, the recommendation probability determining unit 430 includes a relevance determining subunit 431 and an adjustment subunit 432.

The relevance determining subunit 431 is configured to determine the relevance between sentence vectors. For example, each sentence vector can be combined with another sentence vector to generate a combined sentence vector.

In some implementations, the sentence vector is concatenated with the other sentence vector to obtain a combined sentence vector of higher dimension. For example, when the dimension of a sentence vector S is d, the sentence vectors S_1 and S_2 are concatenated to obtain a 2d-dimensional combined sentence vector S_{1,2}, where d is an integer greater than 1.

In other implementations, the combined sentence vector is generated by an operation between the two sentence vectors (for example, addition, subtraction, or a vector product). In this case, the combined sentence vector S_{1,2} and the combined sentence vector S_{2,1} may be identical.
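The difference between the two ways of combining sentence vectors can be illustrated with a short sketch. Concatenation is order-dependent and doubles the dimension, whereas element-wise addition is symmetric and keeps the dimension d. The function names and toy vectors are hypothetical.

```python
def combine_concat(s_a, s_b):
    """Concatenation: a 2d-dimensional vector, order-dependent."""
    return list(s_a) + list(s_b)


def combine_add(s_a, s_b):
    """Element-wise addition: a d-dimensional vector, order-independent."""
    return [x + y for x, y in zip(s_a, s_b)]


# Hypothetical 2-dimensional sentence vectors.
s1 = [1.0, 2.0]
s2 = [3.0, 4.0]

s12_concat = combine_concat(s1, s2)
s21_concat = combine_concat(s2, s1)
s12_add = combine_add(s1, s2)
s21_add = combine_add(s2, s1)
```

With addition, S_{1,2} equals S_{2,1}, so the relevance of the pair needs to be computed only once; subtraction, by contrast, would not be symmetric.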

Next, by processing the combined sentence vector with a relevance matrix, the relevance between the sentence vector and the other sentence vector is determined. In some embodiments, the relevance λ_{1,2} between sentence vectors S_1 and S_2 may be expressed as λ_{1,2} = S_{1,2} * Z, where S_{1,2} is the combined sentence vector of S_1 and S_2, and Z denotes a trained relevance matrix. In some embodiments, the relevance matrix Z projects the combined sentence vector S_{1,2} onto a real-valued relevance coefficient.

By the above method, the relevance between any two of the sentence vectors S_1, S_2, ..., S_n can be calculated.

The adjustment subunit 432 is configured to determine, for any of the above sentence vectors, a recommendation coefficient for that sentence vector based on its relevance to each of the other sentence vectors among the plurality of sentence vectors. In some implementations, the recommendation coefficient may be expressed as the sum of the relevances between that sentence vector and each of the other sentence vectors.

In other embodiments, the recommendation coefficient of a sentence vector may be expressed as a weighted sum of the relevances between that sentence vector and each of the other sentence vectors. The weighted addition may use predetermined weighting coefficients.

The above recommendation coefficients can be used to adjust the initial recommendation weight vectors that were used to generate the corresponding sentence vectors, thereby obtaining adjusted word probability vectors.

As described above, the recommendation coefficient is determined from the relevance between a sentence vector and the other sentence vectors. Because generating a text summary requires condensing the content of the text, a sentence vector that is more relevant to the other sentence vectors is considered to carry word-vector information that is more important within the text, and that information is therefore more likely to appear in the summary.

In some embodiments, the adjustment subunit 432 multiplies the recommendation coefficient of each sentence vector by the word probability vector corresponding to that sentence vector, which adjusts the weighting coefficients that the word probability vector holds for the encoded hidden state vector of each word vector. For example, the adjusted i-th word probability vector W_i' may be expressed as W_i' = Σλ_i * W_i.

After the word probability vector of each sentence vector has been adjusted using that sentence vector's recommendation coefficient, the adjustment subunit 432 can determine the recommendation probability distribution for the plurality of words based on the adjusted word probability vectors W' obtained as above.

In some embodiments, the recommendation probability distribution P_V may be expressed as the sum of the adjusted word probability vectors, P_V = ΣW_i'. In some implementations, P_V may instead be expressed as a weighted sum of the adjusted word probability vectors W_i'.

The output unit 440 is configured to determine the word to be output based on the recommendation probability distribution.

In some embodiments, the output word probability distribution may be determined by adjusting, based on the recommendation probability distribution, the word probability distribution produced by an existing generative network.

Here, the current word probability distribution may be an attention probability distribution a^t. The attention probability distribution indicates the probability that each word in the input text becomes a word in the text summary. In one implementation, the attention probability distribution can be determined based on the encoded hidden state vectors and the decoded hidden state vector for the current time step.

In some embodiments, an adjusted attention probability distribution a'^t can be determined by adjusting the attention probability distribution using the recommendation probability distribution. The adjusted attention probability distribution can be used to determine the probability that each word in the input text becomes a word in the text summary. For example, the word with the highest probability can be selected from the input text as the word to be output.

In some embodiments, the current word probability distribution further includes a generation probability distribution P_vocab. The generation probability distribution indicates the probability that each word in the text entity dictionary becomes a word in the text summary. The generation probability distribution can be determined based on the above context vector and the decoded hidden state vector for the current time step. The output word probability distribution can then be determined as a weighted sum of the generation probability distribution and the adjusted attention probability distribution.

In some embodiments, the output word probability distribution can be determined as a weighted sum of the generation probability distribution, the attention probability distribution and the recommendation probability distribution. In one implementation, a second weight P_gen2 for this weighted sum can be determined based on the encoded hidden state vectors, the decoded hidden state vector, the attention probability distribution and the recommendation probability distribution for the current time step, and on the output of the decoding neural network at the previous time step. The second weight P_gen2 is implemented as a three-dimensional vector whose elements are the weighting coefficients of the generation probability distribution P_vocab, the attention probability distribution a^t and the recommendation probability distribution P_V, respectively.

The training parameters used in the text processing device described above are trained using a predetermined training data set. For example, training data is input to the text processing device, and the word vectors of the source text are processed using the encoding neural network, the decoding neural network, and the initial recommendation weight vectors used to determine the relevance between sentence vectors, which yields the output word probability distribution produced during training. The training parameters of the text processing model can then be adjusted by calculating the loss on the probability of the correct word in that output word probability distribution. The loss function of the text generation network according to the present disclosure can be expressed by equation (8).

Here, w_t^* is the probability value that the output word probability distribution produced during training assigns, at time step t, to the correct word for time step t, and T is the total number of time steps over the entire generated sequence. The overall loss of the text generation network can be determined by aggregating the loss values at all time steps over the entire generated sequence.
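Equation (8) itself is not reproduced in this text. As one reading consistent with the description, and assuming the negative log-likelihood form commonly used for sequence generation with the per-step losses averaged over the sequence (the text only says the per-step losses are aggregated), the loss could be written as:

```latex
\mathrm{loss}_t = -\log w_t^{*},
\qquad
\mathrm{loss} = \frac{1}{T} \sum_{t=1}^{T} \mathrm{loss}_t
```

Under this assumption, a correct word that receives probability close to 1 contributes a loss close to 0, while a correct word that receives low probability contributes a large loss.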

The parameters of the text processing device can be trained by adjusting its training parameters so that the above loss is minimized.

With the text processing device according to the present disclosure, for example when generating a text summary, the importance of each word of the input text within the content of that text can be determined based on the correlation between the sentence vectors formed from the words of the input text, and the content of the generated text can be determined based on that importance. Although the present disclosure explains the principle using summary generation as an example, its content is not limited thereto. Without departing from the principles of the present disclosure, the text processing method according to the present disclosure can also be applied to other application scenarios such as text expansion and text rewriting.

A method or apparatus according to an embodiment of the present disclosure may be implemented by the computing device architecture shown in FIG. 5. As shown in FIG. 5, the computing device 500 may include a bus 510, one or more CPUs 520, a read-only memory (ROM) 530, a random access memory (RAM) 540, a communication port 550 connected to a network, an input/output component 560, a hard disk 570, and the like. A storage device in the computing device 500, such as the ROM 530 or the hard disk 570, can store various data or files according to the present disclosure that are used in processing and/or communication for the method for detecting a target in a video, as well as program instructions executed by the CPU. The computing device 500 may also include a user interface 580. The architecture shown in FIG. 5 is merely exemplary, and when implementing a different device, one or more components of the computing device shown in FIG. 5 may be omitted as actually required.

Embodiments of the present application may also be implemented as a computer-readable storage medium. A computer-readable storage medium according to embodiments of the present application stores computer-readable instructions. When the computer-readable instructions are executed by a processor, the methods according to the embodiments of the present application described with reference to the figures above can be performed. The computer-readable storage medium includes, for example and without limitation, volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.

Those skilled in the art will understand that various modifications and improvements can be made to what is disclosed herein. For example, the various devices or components described above may be implemented in hardware, or in software, firmware, or a combination of some or all of these.

In addition, as used in this application and the claims, terms such as "a," "an," "one kind of," and/or "the" do not refer exclusively to the singular and may also include the plural, unless the context clearly indicates otherwise. In general, the terms "comprising" and "having" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or apparatus may include other steps or elements.

Furthermore, although this specification makes various references to certain units of a system according to embodiments of the present disclosure, any number of different units may be used and may run on a client and/or a server. The units are merely exemplary, and different aspects of the system and method may use different units.

Flowcharts are used in this disclosure to explain the operations performed by a system according to embodiments of the present invention. The operations described above or below need not be performed exactly in the order given. Rather, the various steps may be performed in reverse order or simultaneously. In addition, other operations may be added to these processes, or one or more steps may be removed from them.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The present invention has been described above, but it is not limited to this description. Although several exemplary embodiments of the present invention have been described, those skilled in the art will readily appreciate that many changes can be made to the exemplary embodiments without departing from the novel teachings and advantages of the present invention. Accordingly, all such changes are intended to be included within the scope of the invention as defined in the claims. The foregoing is illustrative of the present invention and should not be understood as limiting it to the particular embodiments disclosed; modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The invention is defined by the claims and their equivalents.

Claims

1. A text processing device comprising:
a preprocessing unit arranged to preprocess a source text to generate a plurality of word vectors for a plurality of words;
a sentence vector determination unit arranged to determine a plurality of sentence vectors based on a plurality of initial recommendation weight vectors and the plurality of word vectors;
a recommendation probability determination unit arranged to adjust the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors, and thereby determine a recommendation probability distribution for the plurality of words; and
an output unit arranged to determine a word to be output based on the recommendation probability distribution,
wherein the recommendation probability determination unit further includes a relevance determination subunit arranged to, for each sentence vector:
combine the sentence vector with another sentence vector to generate a combined sentence vector, and
determine the relevance between the sentence vector and the other sentence vector by processing the combined sentence vector using a relevance matrix,
and wherein the recommendation probability determination unit further includes an adjustment subunit arranged to:
determine a recommendation coefficient of the sentence vector based on the relevance between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors, the recommendation coefficient of the sentence vector being expressed as the sum of the relevances between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors,
for each initial recommendation weight vector, adjust the initial recommendation weight vector using the recommendation coefficient of the sentence vector corresponding to that initial recommendation weight vector to obtain an adjusted word probability vector, and
determine the recommendation probability distribution of the plurality of words based on the adjusted word probability vectors.

2. The text processing device according to claim 1, wherein the sentence vector determination unit is arranged to:
process the plurality of word vectors using an encoding neural network to determine a current encoding hidden state vector corresponding to each word vector; and
determine, based on each initial recommendation weight vector and the current encoding hidden state vector, the sentence vector corresponding to that initial recommendation weight vector.

3. The text processing device according to claim 2, wherein the output unit is arranged to:
determine a current decoding hidden state vector using a decoding neural network based on the current encoding hidden state vector;
determine a current word probability distribution using the current encoding hidden state vector and the current decoding hidden state vector; and
determine the word to be output based on the current word probability distribution and the recommendation probability distribution.

4. The text processing device according to claim 3, wherein the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is arranged to:
adjust the attention probability distribution using the recommendation probability distribution to determine an adjusted attention probability distribution;
determine an output word probability distribution as a weighted sum of the generation probability distribution and the adjusted attention probability distribution; and
determine the word with the maximum probability in the output word probability distribution as the word to be output.

5. The text processing device according to claim 3, wherein the current word probability distribution includes a generation probability distribution and an attention probability distribution, and the output unit is arranged to:
determine weights for the generation probability distribution, the attention probability distribution, and the recommendation probability distribution, and determine an output word probability distribution based on the weights; and
determine the word with the maximum probability in the output word probability distribution as the word to be output.

6. A text processing method comprising:
preprocessing, by a text processing device, a source text to generate a plurality of word vectors for a plurality of words;
determining, by the text processing device, a plurality of sentence vectors based on a plurality of initial recommendation weight vectors and the plurality of word vectors;
adjusting, by the text processing device, the plurality of initial recommendation weight vectors based on the relevance between each sentence vector and the other sentence vectors of the plurality of sentence vectors to determine a recommendation probability distribution for the plurality of words; and
determining, by the text processing device, a word to be output based on the recommendation probability distribution,
wherein the adjusting of the plurality of initial recommendation weight vectors includes, for each sentence vector, combining the sentence vector with another sentence vector to generate a combined sentence vector, and determining the relevance between the sentence vector and the other sentence vector by processing the combined sentence vector using a relevance matrix,
and wherein the adjusting of the plurality of initial recommendation weight vectors further includes: determining a recommendation coefficient of the sentence vector based on the relevance between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors, the recommendation coefficient of the sentence vector being expressed as the sum of the relevances between the sentence vector and each of the other sentence vectors of the plurality of sentence vectors; for each initial recommendation weight vector, adjusting the initial recommendation weight vector using the recommendation coefficient of the sentence vector corresponding to that initial recommendation weight vector to obtain an adjusted word probability vector; and determining the recommendation probability distribution of the plurality of words based on the adjusted word probability vectors.

7. A text processing device comprising:
a processor; and
a memory in which computer-readable program instructions are stored,
wherein the text processing method of claim 6 is performed when the computer-readable program instructions are executed by the processor.

8. A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a computer, cause the computer to perform the text processing method of claim 6.