JP2023078411A

JP2023078411A - Information processing method, model training method, apparatus, appliance, medium and program product

Info

Publication number: JP2023078411A
Application number: JP2023048430A
Authority: JP
Inventors: ファルー; Hua Lu; スーチーバオ; Siqi Bao; ファンフア; Fan Wang; ファンワン; Fang Wang; ファウー; Hua Wu; シューウェイファン; Shiwei Huang
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-08-10
Filing date: 2023-03-24
Publication date: 2023-06-06
Also published as: CN115292467B; CN115292467A

Abstract

PROBLEM TO BE SOLVED: To provide an information processing method, model training method, apparatus, appliance, medium and program product adapted to acquire a target response sentence having high interaction quality.

SOLUTION: An information processing method inputs an initial interaction sentence to a trained interaction model having high interaction accuracy to acquire a target response sentence having high interaction quality. The interaction model is acquired by performing training based on a corrected response-sample sentence, a second candidate response-sample sentence and a recall response-sample sentence, where a plurality of candidate response-sample sentences are acquired by inputting an initial interaction-sample sentence to an initial interaction model. The second candidate response-sample sentence is any of the plurality of candidate response-sample sentences; the corrected response-sample sentence is a sentence having high interaction quality acquired by correcting a first response-sample sentence out of the candidate response-sample sentences; and the recall response-sample sentence is a training sample sentence other than the initial interaction-sample sentence and the plurality of candidate response-sample sentences.

SELECTED DRAWING: Figure 2

Description

The present disclosure relates to the field of computer technology, in particular to the fields of artificial intelligence and speech technology, and specifically to an information processing method, a model training method, an apparatus, a device, a medium, and a program product.

With the development of natural language processing technology, machine learning models can be used in the field of smart dialogue: a dialogue model replies based on a sentence input by a user, thereby interacting with the user.

At present, dialogue models have low dialogue accuracy and poor dialogue quality.

The present disclosure provides an information processing method, a model training method, an apparatus, a device, a medium, and a program product.

According to one aspect of the present disclosure, an information processing method is provided, the method comprising:
obtaining an initial dialogue sentence; and
inputting the initial dialogue sentence into a trained dialogue model to obtain a target reply sentence,
wherein the dialogue model is a model obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence; a plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.

According to another aspect of the present disclosure, a model training method is provided, the method comprising:
obtaining an initial dialogue sample sentence;
inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate reply sample sentences;
correcting a first candidate reply sample sentence among the plurality of candidate reply sample sentences to obtain a corrected reply sample sentence; and
training the initial dialogue model based on the corrected reply sample sentence, a second candidate reply sample sentence among the plurality of candidate reply sample sentences, and a recall reply sample sentence to obtain a dialogue model,
wherein the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.

According to another aspect of the present disclosure, an information processing apparatus is provided, the apparatus comprising:
an acquisition module for obtaining an initial dialogue sentence; and
an input module for inputting the initial dialogue sentence into a trained dialogue model to obtain a target reply sentence,
wherein the dialogue model is a model obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence; a plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.

According to another aspect of the present disclosure, a model training apparatus is provided, the apparatus comprising:
a sentence acquisition module for obtaining an initial dialogue sample sentence;
a sentence input module for inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate reply sample sentences;
a correction module for correcting a first candidate reply sample sentence among the plurality of candidate reply sample sentences to obtain a corrected reply sample sentence; and
a training module for training the initial dialogue model based on the corrected reply sample sentence, a second candidate reply sample sentence among the plurality of candidate reply sample sentences, and a recall reply sample sentence to obtain a dialogue model,
wherein the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.

According to another aspect of the present disclosure, an electronic device is provided, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor,
wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above method.

According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, the computer instructions causing a computer to perform the above method.

According to another aspect of the present disclosure, a computer program is provided which, when executed by a processor, implements the steps of the above method.

In some embodiments of the present disclosure, a dialogue model is obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence. A plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence of high dialogue quality obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences. By continuing to train the initial dialogue model on the corrected reply sample sentence, the second candidate reply sample sentence, and the recall reply sample sentence, a dialogue model with high dialogue accuracy is obtained; inputting an initial dialogue sentence into this dialogue model then yields a target reply sentence of high dialogue quality.

It should be understood that the content described in this section is not intended to identify key or critical features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.

The drawings are used for a better understanding of the present technical solution and do not limit the present disclosure.
FIG. 1 is a schematic flowchart of the information processing method provided by Embodiment 1 of the present disclosure.
FIG. 2 is a schematic flowchart of the model training method provided by Embodiment 2 of the present disclosure.
FIG. 3 is a flowchart of the information processing method provided by Embodiment 3 of the present disclosure.
FIG. 4 is a schematic structural diagram of the information processing apparatus provided by an exemplary embodiment of the present disclosure.
FIG. 5 is a schematic structural diagram of the model training apparatus provided by an exemplary embodiment of the present disclosure.
FIG. 6 is a schematic block diagram of an exemplary electronic device for implementing embodiments of the present disclosure.

Exemplary embodiments of the present disclosure are described below in conjunction with the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those skilled in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.

Note that in the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of relevant user personal information all comply with the relevant laws and regulations and do not violate public order and good morals.

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing. Artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.

In the field of dialogue systems, large-scale dialogue models trained on social media comment data are emerging one after another. However, because social media comment scenarios differ from real human conversation scenarios, the generation ability of these models is unsatisfactory.

A generative dialogue model generates a plurality of candidate replies at inference time and then uses generation scores to evaluate and sort the replies. However, sorting based on generation scores cannot effectively place high-quality replies at the top.
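
The generation-score ranking described above can be sketched as follows. This is only an illustration: the source does not define the generation score, so the sketch assumes it is the length-normalized mean of per-token log-probabilities, and `Candidate`, `generation_score`, and `rank_by_generation_score` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    # Assumed: per-token log-probabilities reported by the generator.
    token_logprobs: List[float]


def generation_score(candidate: Candidate) -> float:
    """Length-normalized log-likelihood (an assumed definition of the score)."""
    return sum(candidate.token_logprobs) / len(candidate.token_logprobs)


def rank_by_generation_score(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates from highest to lowest generation score."""
    return sorted(candidates, key=generation_score, reverse=True)


candidates = [
    Candidate("Rainy days go well with music and chocolate", [-0.9, -1.1, -0.8, -1.0]),
    Candidate("Rainy days are nice", [-0.2, -0.3, -0.1]),
]
ranked = rank_by_generation_score(candidates)
```

With these made-up numbers the short, generic reply is ranked first, which illustrates the kind of mismatch between generation score and reply quality that the paragraph above points to.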

To address the above technical problems, in some embodiments of the present disclosure, a dialogue model is obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence. A plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence of high dialogue quality obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences. By continuing to train the initial dialogue model on the corrected reply sample sentence, the second candidate reply sample sentence, and the recall reply sample sentence, a dialogue model with high dialogue accuracy is obtained; inputting an initial dialogue sentence into this dialogue model then yields a target reply sentence of high dialogue quality.

The technical solutions provided by the embodiments of the present disclosure are described in detail below in conjunction with the drawings.

FIG. 1 is a schematic flowchart of the information processing method provided by Embodiment 1 of the present disclosure. As shown in FIG. 1, the method includes the following steps S101 to S102.

S101: obtain an initial dialogue sentence.
S102: input the initial dialogue sentence into a trained dialogue model to obtain a target reply sentence.
The dialogue model is a model obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence. A plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.
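
Steps S101 and S102 can be sketched as a single function. This is a minimal illustration, not the patented implementation: the dialogue model is treated as an arbitrary callable, and `process`, `DialogueModel`, and `stub_model` are hypothetical names, with `stub_model` standing in for a trained model.

```python
from typing import Callable

# Assumption: a dialogue model is any callable from an input sentence to a reply.
DialogueModel = Callable[[str], str]


def process(initial_sentence: str, dialogue_model: DialogueModel) -> str:
    """S101/S102: take the initial dialogue sentence and return the target reply."""
    if not initial_sentence.strip():
        raise ValueError("initial dialogue sentence must not be empty")
    return dialogue_model(initial_sentence)


def stub_model(sentence: str) -> str:
    """Placeholder for a trained dialogue model; returns canned replies."""
    canned = {"How is the weather today?": "It is sunny today"}
    return canned.get(sentence, "I am not sure.")


target_reply = process("How is the weather today?", stub_model)
```

Passing the model in as a parameter keeps the sketch independent of where the model runs, whether on a server or on the terminal device itself.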

In this embodiment, the above method may be executed by a server or by a terminal device.

When the method is executed by a server, the implementation form of the server is not limited. For example, the server may be a server apparatus such as a general-purpose server, a cloud server, a cloud host, or a virtual center. A server mainly comprises a processor, a hard disk, memory, a system bus, and the like, in line with a general-purpose computer architecture.

When the method is executed by a terminal device, the implementation form of the terminal device is not limited. Terminal devices include, but are not limited to, personal computers, tablet computers, smartphones, and smart wearable devices.

In this embodiment, a dialogue model is obtained by training based on a corrected reply sample sentence, a second candidate reply sample sentence, and a recall reply sample sentence. A plurality of candidate reply sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate reply sample sentence is any one of the plurality of candidate reply sample sentences; the corrected reply sample sentence is a sentence of high dialogue quality obtained by correcting a first reply sample sentence among the candidate reply sample sentences; and the recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences. By continuing to train the initial dialogue model on the corrected reply sample sentence, the second candidate reply sample sentence, and the recall reply sample sentence, a dialogue model with high dialogue accuracy is obtained. An initial dialogue sentence is then obtained and input into the dialogue model to obtain a target reply sentence of high dialogue quality.

The technical solution of the present disclosure is described below with reference to application scenarios.

Application scenario 1: A smartphone responds to the initial dialogue sentence "How is the weather today?" spoken by a user and uploads the initial dialogue sentence to a server. The server inputs the initial dialogue sentence into the trained dialogue model to obtain the target reply sentence "It is sunny today" and sends the target reply sentence down to the smartphone. The smartphone plays the target reply sentence "It is sunny today" as speech.

Application scenario 2: A smartphone responds to the initial dialogue sentence "How is the weather today?" spoken by a user and inputs the initial dialogue sentence into a locally integrated dialogue model to obtain the target reply sentence "It is sunny today". The smartphone plays the target reply sentence "It is sunny today" as speech.

Before the dialogue model can be used, the initial dialogue model must be trained to obtain the dialogue model. The process of training the dialogue model is described below.

FIG. 2 is a schematic flowchart of the model training method provided by Embodiment 2 of the present disclosure. As shown in FIG. 2, the method includes the following steps S201 to S204.

S201: obtain an initial dialogue sample sentence.

S202: input the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate reply sample sentences.

S203: correct a first candidate reply sample sentence among the plurality of candidate reply sample sentences to obtain a corrected reply sample sentence.

S204: train the initial dialogue model based on the corrected reply sample sentence, a second candidate reply sample sentence among the plurality of candidate reply sample sentences, and a recall reply sample sentence to obtain a dialogue model.
The recall reply sample sentence is a training sample sentence other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences.

The training apparatus used to train the above dialogue model may be any type of computer apparatus; the embodiments of the present disclosure place no limitation on this.

Note that the initial dialogue model may itself be an already-trained model; however, its accuracy is low, and the quality of dialogue conducted with it is poor.

An initial dialogue sample sentence is obtained and input into the initial dialogue model to obtain a plurality of candidate reply sample sentences. A first candidate reply sample sentence among the plurality of candidate reply sample sentences is corrected to obtain a corrected reply sample sentence; a second candidate reply sample sentence is selected at random from the plurality of candidate reply sample sentences; and a recall reply sample sentence is selected from the training sample sentences other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences. The corrected reply sample sentence, the second candidate reply sample sentence, and the recall reply sample sentence constitute one group of training data. The above steps are repeated to obtain the training data set used for model training.

Note that, to increase the coverage of the data set, the initial dialogue sample sentences are drawn from data sets in fields as varied as possible, for example news, social media, literature, and real-life dialogue.

In the above embodiment, the first candidate reply sample sentence among the plurality of candidate reply sample sentences is corrected to obtain the corrected reply sample sentence. For example, an operation such as copying, correcting, or composing is performed on the first candidate reply sample sentence to obtain the corrected reply sample sentence.

For example, in response to an operation of entering an initial dialogue sample sentence in a labeling interface, the initial dialogue sample sentence "It rains every day and I'm in a bad mood" is obtained. The initial dialogue sample sentence is input into the initial dialogue model to obtain the plurality of candidate reply sample sentences "Rainy days go well with music and chocolate", "Rainy days are perfect for sleeping", "I feel down too, because nobody keeps me company", "Rainy days are nice", "Me too! I don't like rainy days", "Yes, it's a problem that I can't go out", and "Yes, I hate rainy days too".

The first candidate reply sample sentence "Rainy days go well with music and chocolate" is corrected to obtain the corrected reply sample sentence "I think rainy days go well with music and chocolate"; the second candidate reply sample sentence "Rainy days are perfect for sleeping" is selected at random from the plurality of candidate reply sample sentences; and the recall reply sample sentence "It is sunny today" is selected from the training sample sentences other than the initial dialogue sample sentence and the plurality of candidate reply sample sentences. The corrected reply sample sentence "I think rainy days go well with music and chocolate", the second candidate reply sample sentence "Rainy days are perfect for sleeping", and the recall reply sample sentence "It is sunny today" constitute one group of training data.
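
The construction of one group of training data can be sketched as follows. This is a hedged illustration: `build_training_group` is a hypothetical helper, the correction itself is assumed to have been done already (for example by a human annotator), and the candidate list is shortened for brevity.

```python
import random
from typing import List, Sequence, Tuple


def build_training_group(
    initial_sample: str,
    candidates: List[str],
    corrected: str,
    training_samples: Sequence[str],
    rng: random.Random,
) -> Tuple[str, str, str]:
    """Return (corrected, second candidate, recall) for one initial sample sentence."""
    # Second candidate: chosen at random from the model's candidate replies.
    second_candidate = rng.choice(candidates)
    # Recall: a training sample that is neither the input nor one of the candidates.
    excluded = {initial_sample, *candidates}
    recall_pool = [s for s in training_samples if s not in excluded]
    if not recall_pool:
        raise ValueError("no training sample left to use as a recall sentence")
    recall = rng.choice(recall_pool)
    return corrected, second_candidate, recall


initial = "It rains every day and I'm in a bad mood"
candidates = [
    "Rainy days go well with music and chocolate",
    "Rainy days are perfect for sleeping",
    "Rainy days are nice",
]
corrected = "I think rainy days go well with music and chocolate"
pool = [initial, *candidates, "It is sunny today"]

group = build_training_group(initial, candidates, corrected, pool, random.Random(0))
```

Calling the helper once per initial dialogue sample sentence and collecting the results would give the training data set. The recall sentence here is forced to be "It is sunny today" only because the toy pool contains a single eligible sentence.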

In the above embodiment, the initial dialogue model is trained based on the corrected reply sample sentence, the second candidate reply sample sentence among the plurality of candidate reply sample sentences, and the recall reply sample sentence to obtain the dialogue model. In one possible implementation, the corrected reply sample sentence, the second candidate reply sample sentence, and the recall reply sample sentence are input into the sentence generation model of the initial dialogue model to obtain an actual reply sentence, a probability of the corrected reply sample sentence, a probability of the second candidate reply sample sentence, and a probability of the recall reply sample sentence. Based on the actual reply sentence and these three probabilities, the initial sentence generation model and the initial sentence decision model of the initial dialogue model are jointly trained to obtain the dialogue model.

In one embodiment, the joint training proceeds as follows. A loss function is determined based on the actual reply sentence and the corrected reply sample sentence. Based on the loss function, the initial sentence generation model and the initial sentence decision model are jointly trained to obtain the dialogue model, with the training targets that the probability of the corrected reply sample sentence is greater than the probability of the second candidate reply sample sentence, the probability of the corrected reply sample sentence is greater than the probability of the recall reply sample sentence, and the probability of the second candidate reply sample sentence is greater than the probability of the recall reply sample sentence.
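
One way to express these training targets numerically is sketched below. The source does not give the form of the loss, so everything here is an assumption: the generation term is taken to be the mean negative log-likelihood of the corrected reply's tokens, the ordering targets are expressed as hinge penalties with a hypothetical `margin`, and the two terms are combined with a hypothetical `weight`.

```python
import math
from typing import Sequence


def generation_loss(token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood of the corrected reply's tokens (assumed form)."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def pairwise_ranking_loss(p_high: float, p_low: float, margin: float = 0.1) -> float:
    """Hinge penalty that is zero once p_high exceeds p_low by at least `margin`."""
    return max(0.0, margin - (p_high - p_low))


def joint_loss(
    token_probs: Sequence[float],
    p_corrected: float,
    p_candidate: float,
    p_recall: float,
    margin: float = 0.1,
    weight: float = 1.0,
) -> float:
    """Generation term plus penalties for corrected > candidate > recall."""
    ranking = (
        pairwise_ranking_loss(p_corrected, p_candidate, margin)
        + pairwise_ranking_loss(p_corrected, p_recall, margin)
        + pairwise_ranking_loss(p_candidate, p_recall, margin)
    )
    return generation_loss(token_probs) + weight * ranking


# Probabilities in the desired order incur no ranking penalty.
well_ordered = joint_loss([0.9, 0.8], p_corrected=0.9, p_candidate=0.5, p_recall=0.1)
# Reversed probabilities are penalized on all three pairs.
mis_ordered = joint_loss([0.9, 0.8], p_corrected=0.1, p_candidate=0.5, p_recall=0.9)
```

In an actual training setup the probabilities would come from the sentence decision model and the loss would be minimized by gradient descent over both models; the plain floats above only show how the three ordering targets shape the objective.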

In combination with the descriptions of the above embodiments, FIG. 3 is a flowchart of an information processing method provided by Embodiment 3 of the present disclosure. As shown in FIG. 3, the method includes the following steps S301 to S304.

S301: the terminal device obtains an initial dialogue sentence in response to a voice input operation.

S302: the terminal device sends the initial dialogue sentence to the server.

S303: the server receives the initial dialogue sentence, inputs the initial dialogue sentence into the dialogue model to obtain a target response sentence, and sends the target response sentence to the terminal device.

S304: the terminal device receives the target response sentence and plays the target response sentence back as speech.
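Steps S301 to S304 can be sketched as a small terminal/server exchange. This is only an illustration: the class names `Terminal` and `Server`, the callables `recognize` and `speak`, and the direct method call that stands in for network transmission are all assumptions not found in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Server:
    # Trained dialogue model, represented here as a plain text-to-text callable.
    dialogue_model: Callable[[str], str]

    def handle(self, initial_sentence: str) -> str:
        # S303: feed the received sentence to the dialogue model and return the reply.
        return self.dialogue_model(initial_sentence)


@dataclass
class Terminal:
    server: Server
    recognize: Callable[[bytes], str]  # hypothetical speech-to-text step
    speak: Callable[[str], None]       # hypothetical text-to-speech playback

    def on_voice_input(self, audio: bytes) -> str:
        sentence = self.recognize(audio)      # S301: obtain the initial dialogue sentence
        reply = self.server.handle(sentence)  # S302: send to server; S303 runs there
        self.speak(reply)                     # S304: play the target response as speech
        return reply
```

In a real deployment the call to `server.handle` would be replaced by a request over a network, but the division of work between the two sides stays the same.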

In this embodiment, the implementation form of the server is not limited. For example, the server may be a server device such as a general-purpose server, a cloud server, a cloud host, or a virtual center. The server mainly comprises a processor, a hard disk, a memory, a system bus, and the like, and is of a general-purpose computer architecture type.

In this embodiment, the implementation form of the terminal device is not limited. The terminal device includes, but is not limited to, any of a personal computer, a tablet computer, a smartphone, and a smart wearable device.

For the implementation of each step in this embodiment, reference may be made to the descriptions of the above embodiments, which are not repeated here. This embodiment also obtains the beneficial effects of the corresponding parts of the above embodiments.

FIG. 4 is a schematic structural diagram of an information processing device 40 provided by an exemplary embodiment of the present disclosure. The information processing device 40 includes an acquisition module 41 and an input module 42.

The acquisition module 41 acquires an initial dialogue sentence.

The input module 42 inputs the initial dialogue sentence into a trained dialogue model to obtain a target response sentence.
The dialogue model is a model obtained by training based on a modified response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate response sample sentence is any one of the plurality of candidate response sample sentences. The modified response sample sentence is a sentence obtained by modifying a first response sample sentence among the candidate response sample sentences. The recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

Optionally, when inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence, the input module 42:
within the dialogue model, inputs the initial dialogue sentence into the sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each candidate response sentence; and
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model of the dialogue model to obtain the target response sentence.

Optionally, when inputting the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model of the dialogue model to obtain the target response sentence, the input module 42:
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model, and selects, from among the plurality of candidate response sentences, the candidate with the highest probability as the target response sentence.
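The generate-then-select behavior of the input module 42 can be sketched as follows. The function name `respond` and the representation of the sentence generation model as a callable returning `(sentence, probability)` pairs are assumptions made for illustration; the selection itself is the highest-probability choice described above.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (candidate response sentence, its probability)


def respond(initial_sentence: str,
            generate: Callable[[str], List[Candidate]]) -> str:
    """Return the candidate response with the highest probability."""
    # Sentence generation model: initial sentence -> scored candidates.
    candidates = generate(initial_sentence)
    if not candidates:
        raise ValueError("the generation model returned no candidates")
    # Sentence decision model (simplified): pick the most probable candidate.
    best_sentence, _ = max(candidates, key=lambda pair: pair[1])
    return best_sentence
```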

FIG. 5 is a schematic structural diagram of a model training device 50 provided by an exemplary embodiment of the present disclosure. The model training device 50 includes a sentence acquisition module 51, a sentence input module 52, a correction module 53, and a training module 54.
The sentence acquisition module 51 acquires an initial dialogue sample sentence.
The sentence input module 52 inputs the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences.
The correction module 53 modifies a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a modified response sample sentence.
The training module 54 trains the initial dialogue model based on the modified response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence, to obtain a dialogue model.
The recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.
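How the three training sentences relate to one another can be sketched as below. The function name `build_training_triple`, the choice of the first two list entries as the "first" and "second" candidates, and the random draw of the recall sentence are assumptions for illustration; the disclosure only requires that the recall sentence be a training sample other than the initial dialogue sample sentence and the candidates.

```python
import random
from typing import Callable, List, Sequence, Tuple


def build_training_triple(initial_sample: str,
                          candidates: Sequence[str],
                          corpus: List[str],
                          modify: Callable[[str], str],
                          rng: random.Random) -> Tuple[str, str, str]:
    """Return (modified, second candidate, recall) sample sentences."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidate response sample sentences")
    # Assumed selection: index 0 is modified, index 1 is used unmodified.
    first, second = candidates[0], candidates[1]
    modified = modify(first)
    # Recall pool: training sentences that are neither the initial dialogue
    # sample sentence nor any of the generated candidates.
    excluded = {initial_sample, *candidates}
    pool = [s for s in corpus if s not in excluded]
    if not pool:
        raise ValueError("no training sentence is available as a recall sample")
    return modified, second, rng.choice(pool)
```

Here `modify` stands in for the correction step performed by the correction module 53, however that correction is actually carried out.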

Optionally, when training the initial dialogue model based on the modified response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model, the training module 54:
inputs the modified response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into the sentence generation model of the initial dialogue model to obtain an actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence; and
jointly trains the initial sentence generation model and the initial sentence decision model of the initial dialogue model, based on the actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model.

Optionally, when jointly training the initial sentence generation model and the initial sentence decision model of the initial dialogue model, based on the actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model, the training module 54:
determines a loss function based on the actual response sentence and the modified response sample sentence; and
based on the loss function, jointly trains the initial sentence generation model and the initial sentence decision model to obtain the dialogue model, with the training target that the probability of the modified response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the modified response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.

For the devices of the above embodiments, the specific manner in which each module performs its operations has already been described in detail in the embodiments of the corresponding methods and is not described again here.

According to an embodiment of the present disclosure, the present disclosure further provides an electronic device and a readable storage medium.
According to an embodiment of the present disclosure, the present disclosure further provides a computer program which, when executed by a processor, implements the information processing method or the model training method provided by the present disclosure.

FIG. 6 is a schematic block diagram of an exemplary electronic device 600 for implementing embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

As shown in FIG. 6, the electronic device 600 includes a computing unit 601 that can perform various appropriate operations and processing according to a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. The RAM 603 may also store various programs and data necessary for the operation of the electronic device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

A plurality of components of the electronic device 600 are connected to the I/O interface 605, including an input unit 606 such as a keyboard or a mouse; an output unit 607 such as various types of displays and speakers; a storage unit 608 such as a magnetic disk or an optical disk; and a communication unit 609 such as a network card, a modem, or a wireless communication transceiver. The communication unit 609 enables the electronic device 600 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.

The computing unit 601 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 601 performs the methods and processing described above, for example, the information processing method and the model training method. For example, in some embodiments, the information processing method and the model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the information processing method and the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the information processing method and the model training method in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (e.g., as a data server), a computing system that includes a middleware component (e.g., an application server), a computing system that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, and front-end components. The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, and a blockchain network.

A computer system may include a client and a server. The client and the server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that addresses the difficult management and weak business scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.

It should be understood that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, and no limitation is imposed herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above specific implementations do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims

1. An information processing method, comprising:
obtaining an initial dialogue sentence; and
inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence,
wherein the dialogue model is a model obtained by training based on a modified response sample sentence, a second candidate response sample sentence, and a recall response sample sentence; a plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the modified response sample sentence is a sentence obtained by modifying a first response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

2. The information processing method according to claim 1, wherein inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence comprises:
within the dialogue model, inputting the initial dialogue sentence into a sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each candidate response sentence; and
inputting the plurality of candidate response sentences and the probability of each candidate response sentence into a sentence decision model of the dialogue model to obtain the target response sentence.

3. The information processing method according to claim 2, wherein inputting the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model of the dialogue model to obtain the target response sentence comprises:
inputting the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model, and selecting, from among the plurality of candidate response sentences, the candidate with the highest probability as the target response sentence.

4. A model training method, comprising:
obtaining an initial dialogue sample sentence;
inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
modifying a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a modified response sample sentence; and
training the initial dialogue model based on the modified response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence, to obtain a dialogue model,
wherein the recall response sample sentence is a sample sentence, among training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

5. The model training method according to claim 4, wherein training the initial dialogue model based on the modified response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model comprises:
inputting the modified response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into a sentence generation model of the initial dialogue model to obtain an actual response sentence, a probability of the modified response sample sentence, a probability of the second candidate response sample sentence, and a probability of the recall response sample sentence; and
jointly training an initial sentence generation model and an initial sentence decision model of the initial dialogue model, based on the actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model.

6. The model training method according to claim 5, wherein jointly training the initial sentence generation model and the initial sentence decision model of the initial dialogue model, based on the actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model comprises:
determining a loss function based on the actual response sentence and the modified response sample sentence; and
based on the loss function, jointly training the initial sentence generation model and the initial sentence decision model to obtain the dialogue model, with a training target that the probability of the modified response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the modified response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.

7. An information processing device, comprising:
an acquisition module for acquiring an initial dialogue sentence; and
an input module for inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence,
wherein the dialogue model is a model obtained by training based on a modified response sample sentence, a second candidate response sample sentence, and a recall response sample sentence; a plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the modified response sample sentence is a sentence obtained by modifying a first response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

8. The information processing device according to claim 7, wherein, when inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence, the input module:
within the dialogue model, inputs the initial dialogue sentence into a sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each candidate response sentence; and
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into a sentence decision model of the dialogue model to obtain the target response sentence.

9. The information processing device according to claim 8, wherein, when inputting the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model of the dialogue model to obtain the target response sentence, the input module:
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence decision model, and selects, from among the plurality of candidate response sentences, the candidate with the highest probability as the target response sentence.

10. A model training device, comprising:
a sentence acquisition module for acquiring an initial dialogue sample sentence;
a sentence input module for inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
a correction module for modifying a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a modified response sample sentence; and
a training module for training the initial dialogue model based on the modified response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence, to obtain a dialogue model,
wherein the recall response sample sentence is a sample sentence, among training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

11. The model training device according to claim 10, wherein, when training the initial dialogue model based on the modified response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model, the training module:
inputs the modified response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into a sentence generation model of the initial dialogue model to obtain an actual response sentence, a probability of the modified response sample sentence, a probability of the second candidate response sample sentence, and a probability of the recall response sample sentence; and
jointly trains an initial sentence generation model and an initial sentence decision model of the initial dialogue model, based on the actual response sentence, the probability of the modified response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model.

12. The model training device according to claim 11, wherein, when training the initial sentence generation model and the initial sentence decision model of the initial dialogue model based on the probability of the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model, the training module is configured to:
determine a loss function based on the actual response sentence and the corrected response sample sentence; and
jointly train, based on the loss function, the initial sentence generation model and the initial sentence decision model to obtain the dialogue model, with the training target that the probability of the corrected response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the corrected response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.
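
One possible reading of this training target is sketched below. The claims do not give the form of the loss, so the sketch assumes a negative log-likelihood term on the actual and corrected responses plus pairwise hinge penalties that enforce the three orderings (corrected above second candidate, corrected above recall, second candidate above recall). The function names, the `margin`, and the `rank_weight` parameter are all illustrative assumptions.

```python
import math


def nll(prob: float) -> float:
    """Negative log-likelihood of a single probability in (0, 1]."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(prob)


def hinge(higher: float, lower: float, margin: float) -> float:
    """Penalty that is zero once `higher` exceeds `lower` by at least `margin`."""
    return max(0.0, margin - (higher - lower))


def joint_loss(p_actual: float, p_corrected: float,
               p_candidate: float, p_recall: float,
               margin: float = 0.1, rank_weight: float = 1.0) -> float:
    # Generation term (assumed form): fit the actual and corrected responses.
    generation = nll(p_actual) + nll(p_corrected)
    # Ranking term: corrected > second candidate > recall, and corrected > recall.
    ranking = (hinge(p_corrected, p_candidate, margin)
               + hinge(p_corrected, p_recall, margin)
               + hinge(p_candidate, p_recall, margin))
    return generation + rank_weight * ranking
```

When the three probabilities are already ordered with gaps of at least `margin`, the ranking term vanishes and only the generation term remains; any violated ordering adds a positive penalty, which is what pushes joint training toward the stated target.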

An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor,
wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1 to 3 or claims 4 to 6.

A non-transitory computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions cause a computer to perform the method according to any one of claims 1 to 3 or claims 4 to 6.

A computer program that, when executed by a processor, implements the steps of the method according to any one of claims 1 to 3 or claims 4 to 6.