CN114817478A - Text-based question answering method, device, computer equipment and storage medium - Google Patents

Text-based question answering method, device, computer equipment and storage medium

Info

Publication number
CN114817478A
Authority
CN
China
Prior art keywords
answer
question
text
semantic
semantic block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210524669.5A
Other languages
Chinese (zh)
Inventor
王伟
张黔
陈焕坤
郑毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Runlian Software System Shenzhen Co Ltd
Original Assignee
Runlian Software System Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Runlian Software System Shenzhen Co Ltd filed Critical Runlian Software System Shenzhen Co Ltd
Priority to CN202210524669.5A priority Critical patent/CN114817478A/en
Publication of CN114817478A publication Critical patent/CN114817478A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the application belong to the field of artificial intelligence and relate to a text-based question answering method, device, computer equipment and storage medium. The method comprises: acquiring a question text and its corresponding answer text; segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks; combining the question text with each semantic block to obtain a plurality of question-answer value evaluation texts; inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain the question-answer value score of the semantic block in that text; screening the semantic blocks according to their question-answer value scores to obtain a semantic block queue comprising at least one semantic block; and inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text. The application improves the accuracy of text-based question answering.

Description

Text-based question answering method, device, computer equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence, and in particular to a text-based question answering method, device, computer equipment and storage medium.

Background

With the development of computer technology, computers are increasingly applied to natural language processing. Text-based intelligent question answering is an important topic in natural language processing. In a typical scenario, a question text is obtained first, and the answer information corresponding to the question text is then located within a longer answer text. Text-based intelligent question answering usually feeds the long text into a model built on a neural network, and the model outputs the answer information. Pre-trained language models have a self-attention mechanism that can learn the semantic relationships contained in language, and they perform well in text-based intelligent question answering.

Training a pre-trained language model consumes a large amount of computing resources, and such models also limit the length of the input text. In practice, the text to be processed is usually long. The usual workaround is to split the text directly into multiple sub-texts, each shorter than the model's maximum input length, and then feed them to the model for question answering. However, this loses much important semantic information and results in low question answering accuracy.

Summary of the Invention

The purpose of the embodiments of the present application is to provide a text-based question answering method, device, computer equipment and storage medium, so as to solve the problem of low question answering accuracy.

To solve the above technical problem, the embodiments of the present application provide a text-based question answering method, which adopts the following technical solution:

acquiring a question text and its corresponding answer text;

segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;

combining the question text with each semantic block to obtain a plurality of question-answer value evaluation texts;

inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text;

screening the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;

inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

To solve the above technical problem, the embodiments of the present application also provide a text-based question answering device, which adopts the following technical solution:

a text acquisition module, configured to acquire a question text and its corresponding answer text;

a text segmentation module, configured to segment the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;

a combination module, configured to combine the question text with each semantic block to obtain a plurality of question-answer value evaluation texts;

a text input module, configured to input each question-answer value evaluation text into a question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text;

a screening module, configured to screen the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;

an answer extraction module, configured to input the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

To solve the above technical problem, the embodiments of the present application also provide a computer device, which adopts the following technical solution:

acquiring a question text and its corresponding answer text;

segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;

combining the question text with each semantic block to obtain a plurality of question-answer value evaluation texts;

inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text;

screening the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;

inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

To solve the above technical problem, the embodiments of the present application also provide a computer-readable storage medium, which adopts the following technical solution:

acquiring a question text and its corresponding answer text;

segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;

combining the question text with each semantic block to obtain a plurality of question-answer value evaluation texts;

inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text;

screening the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;

inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects. A question text and its corresponding answer text are acquired, and the answer text is segmented according to a preset semantic segmentation algorithm, which ensures that the resulting semantic blocks carry sufficiently rich semantic information. The question text is combined with each semantic block to obtain a plurality of question-answer value evaluation texts. Each evaluation text is input into the question-answer value evaluation model to obtain a question-answer value score, which measures the contribution and value of the semantic block in that evaluation text to question answering, so that semantic blocks with higher question-answer value can be selected to form the semantic block queue. The question text and the semantic block queue are then input into the answer information extraction model, so that the model can accurately output the answer information based on the semantic blocks with higher question-answer value. The application preserves the semantic information of the semantic blocks during text segmentation and extracts the answer information from the blocks with higher question-answer value, which improves question answering accuracy.

Brief Description of the Drawings

To illustrate the solutions in the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;

FIG. 2 is a flowchart of an embodiment of the text-based question answering method according to the present application;

FIG. 3 is a schematic structural diagram of an embodiment of the text-based question answering device according to the present application;

FIG. 4 is a schematic structural diagram of an embodiment of the computer device according to the present application.

Detailed Description

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the specification of the application are only for the purpose of describing specific embodiments and are not intended to limit the application. The terms "comprising" and "having", and any variations thereof, in the specification, the claims and the above description of the drawings are intended to cover non-exclusive inclusion. The terms "first", "second" and the like in the specification, the claims or the above drawings are used to distinguish different objects, not to describe a specific order.

Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. Appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links or fiber optic cables.

A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browsers, shopping applications, search applications, instant messaging tools, email clients and social platform software.

The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers and desktop computers.

The server 105 may be a server that provides various services, such as a background server that supports the pages displayed on the terminal devices 101, 102, 103.

It should be noted that the text-based question answering method provided by the embodiments of the present application is generally executed by the server, and accordingly the text-based question answering device is generally deployed in the server.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks and servers, depending on implementation needs.

With continued reference to FIG. 2, a flowchart of an embodiment of the text-based question answering method according to the present application is shown. The text-based question answering method includes the following steps:

Step S201: acquire a question text and its corresponding answer text.

In this embodiment, the electronic device on which the text-based question answering method runs (for example, the server shown in FIG. 1) may communicate with the terminal through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G/5G, WiFi, Bluetooth, WiMAX, Zigbee and UWB (ultra-wideband) connections, as well as other wireless connection methods now known or developed in the future.

Specifically, the question text and the answer text corresponding to the question text are acquired. The question text poses the question, and the answer text contains the answer to it. The present application aims to locate the specific, direct answer fragment within the answer text.

Step S202: segment the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks.

Specifically, the answer text is segmented into a plurality of semantic blocks. Following the preset semantic segmentation algorithm, the present application can identify text elements that carry actual semantics in the answer text (for example, words of specific categories) and segment the answer text based on the identified text elements, which ensures that every resulting semantic block carries sufficiently rich semantic information.

Step S203: combine the question text with each semantic block to obtain a plurality of question-answer value evaluation texts.

Specifically, the question text is combined with the semantic blocks one by one to obtain a plurality of question-answer value evaluation texts. In one embodiment, for the i-th semantic block obtained by segmentation, the question-answer value evaluation text has the form [[CLS]Query[SEP]S_i], where CLS and SEP denote separator tokens, Query denotes the question text, and S_i denotes the i-th semantic block.
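The combination in step S203 can be sketched in Python as plain string concatenation. This is only an illustration: the function name `build_evaluation_texts` is hypothetical, and in a real system built on a pre-trained language model the tokenizer would normally insert the `[CLS]` and `[SEP]` tokens itself.

```python
from typing import List


def build_evaluation_texts(question: str, blocks: List[str]) -> List[str]:
    """Pair the question with every semantic block as [CLS]Query[SEP]S_i."""
    # Hypothetical helper: the literal token strings mirror the form given
    # in the text; a tokenizer would usually add them in practice.
    return [f"[CLS]{question}[SEP]{block}" for block in blocks]


texts = build_evaluation_texts(
    "Where is the capital of China?",
    ["China has a long history.", "The capital of China is Beijing."],
)
```

Each element of `texts` corresponds to exactly one semantic block, so the score the evaluation model assigns to a text can be attributed to that block.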

Step S204: input each question-answer value evaluation text into the question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text.

The question-answer value evaluation model is used to evaluate the value, or contribution, of a semantic block to question answering.

Specifically, each semantic block, combined with the question text as described above, is input into the question-answer value evaluation model to obtain its question-answer value score. In general, the more useful a semantic block is for question answering, the higher its question-answer value score.

In one embodiment, the question-answer value evaluation model is built on a pre-trained language model, which may be a model such as BERT or RoBERTa. The question-answer value evaluation model needs to be trained in advance.

Step S205: screen the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block.

Specifically, the question-answer value score is a number; the higher its value, the greater the role the semantic block plays in question answering. The semantic blocks can be screened according to their question-answer value scores to obtain a semantic block queue that contains at least one semantic block.

In one embodiment, a preset value score threshold is obtained, each question-answer value score is compared with the threshold, and the semantic blocks whose scores exceed the threshold are selected to build the semantic block queue.

In one embodiment, the semantic blocks are sorted in descending order of question-answer value score, and at least one semantic block is then selected to obtain the semantic block queue.
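The two screening embodiments can be sketched as follows. The function names and the parameter `k` are hypothetical. The fallback that returns the single best block when no score exceeds the threshold is also an assumption, added only so that the queue always holds at least one block as the step requires.

```python
from typing import List, Sequence


def select_by_threshold(blocks: Sequence[str], scores: Sequence[float],
                        threshold: float) -> List[str]:
    """Keep the blocks whose score is strictly greater than the threshold."""
    queue = [b for b, s in zip(blocks, scores) if s > threshold]
    if not queue and blocks:
        # Assumption: fall back to the best block so the queue is never empty.
        best = max(range(len(blocks)), key=lambda i: scores[i])
        queue = [blocks[best]]
    return queue


def select_top_k(blocks: Sequence[str], scores: Sequence[float],
                 k: int = 1) -> List[str]:
    """Sort blocks by score in descending order and keep the first k (k >= 1)."""
    ranked = sorted(zip(blocks, scores), key=lambda pair: pair[1], reverse=True)
    return [b for b, _ in ranked[:max(k, 1)]]


blocks = ["a", "b", "c"]
scores = [0.2, 0.9, 0.6]
above = select_by_threshold(blocks, scores, 0.5)
top_two = select_top_k(blocks, scores, k=2)
```

Note that the threshold variant preserves the original block order, while the top-k variant orders the queue by score.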

Step S206: input the question text and the semantic block queue into the answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

The answer information extraction model may be a model for extracting answer information from the semantic block queue. Answer information here means the text fragment directly related to the question text. Although the answer text contains the answer to the question, it may be long and contain much other information. The answer information extraction model therefore extracts, from the semantic block queue, the text fragment directly related to the question text. For example, the question text is "Where is the capital of China?", and the answer text is a longer introductory passage that contains the sentence "The capital of China is Beijing". The answer information extraction model of the present application aims to extract "Beijing" from the answer text as the answer information.

Specifically, the question text and the semantic block queue are input into the answer information extraction model. The model can determine, among the semantic blocks in the queue, the text fragment most closely related to the question text and output it as the answer information.

In one embodiment, the answer information extraction model is built on a pre-trained language model, which may be a model such as BERT or RoBERTa.

In this embodiment, a question text and its corresponding answer text are acquired, and the answer text is segmented according to a preset semantic segmentation algorithm, which ensures that the resulting semantic blocks carry sufficiently rich semantic information. The question text is combined with each semantic block to obtain a plurality of question-answer value evaluation texts. Each evaluation text is input into the question-answer value evaluation model to obtain a question-answer value score, which measures the contribution and value of the semantic block in that evaluation text to question answering, so that semantic blocks with higher question-answer value can be selected to form the semantic block queue. The question text and the semantic block queue are then input into the answer information extraction model, so that the model can accurately output the answer information based on the semantic blocks with higher question-answer value. The application preserves the semantic information of the semantic blocks during text segmentation and extracts the answer information from the blocks with higher question-answer value, which improves question answering accuracy.
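Steps S202 to S206 can be tied together in a short Python sketch. The segmenter, the evaluation model and the extraction model are passed in as callables. The toy stand-ins below (sentence splitting, word overlap, last word of the best block) are assumptions made purely so the example runs; they are not the models described in this application, and all function names are hypothetical.

```python
from typing import Callable, List


def answer_question(
    question: str,
    answer_text: str,
    segment: Callable[[str], List[str]],
    score: Callable[[str], float],
    extract: Callable[[str, List[str]], str],
    top_k: int = 1,
) -> str:
    """Run steps S202-S206 with caller-supplied components."""
    blocks = segment(answer_text)                                # S202
    eval_texts = [f"[CLS]{question}[SEP]{b}" for b in blocks]    # S203
    scores = [score(t) for t in eval_texts]                      # S204
    ranked = sorted(zip(blocks, scores), key=lambda p: p[1], reverse=True)
    queue = [b for b, _ in ranked[:max(top_k, 1)]]               # S205
    return extract(question, queue)                              # S206


# Toy stand-ins (assumptions, not the models of the application).
def toy_segment(text: str) -> List[str]:
    # Split on full stops instead of using the semantic segmentation algorithm.
    return [s.strip() for s in text.split(".") if s.strip()]


def toy_score(eval_text: str) -> float:
    # Score a block by how many words it shares with the question.
    question, block = eval_text[len("[CLS]"):].split("[SEP]", 1)
    q_words = set(question.lower().strip("?").split())
    return float(len(q_words & set(block.lower().split())))


def toy_extract(question: str, queue: List[str]) -> str:
    # Return the last word of the highest-ranked block.
    return queue[0].split()[-1]


answer = answer_question(
    "Where is the capital of China?",
    "China has a long history. The capital of China is Beijing.",
    toy_segment, toy_score, toy_extract,
)
```

With these stand-ins, the second sentence shares the most words with the question, so it forms the queue and the sketch returns "Beijing", matching the example given for step S206.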

Further, step S202 may include: identifying target words in the answer text; and segmenting the answer text according to the target words and a preset text length condition to obtain a plurality of semantic blocks, wherein either the number of target words in a semantic block equals a preset quantity threshold and the text length of the block lies within a preset length interval, or the text length of the block equals the right endpoint of the preset length interval. The quantity threshold and the preset length interval are parameters of the preset semantic segmentation algorithm.

Specifically, natural language processing is performed on the answer text to identify the target words in it. A target word may be a word of a specific category in the answer text. In one embodiment, the target words may be the named entities and verbs in the answer text. The identified target words serve as a reference element during text segmentation.

The semantic segmentation algorithm constrains text segmentation along two dimensions: target words and text length. Based on the algorithm, the answer text is segmented according to the target words and the preset text length condition. The text length condition means that the text length of each extracted semantic block should lie within the preset length interval [B_min, B_max], where B_min and B_max are preset values; for example, B_min = 32 and B_max = 64.

In every case, a text of length B_min is taken first. If that text already contains N_s target words, where N_s is the preset quantity threshold, the text of length B_min is used directly as a semantic block, and segmentation continues from the next position in the answer text.

If the number of target words in the text of length B_min falls short of the preset quantity threshold N_s, the search for target words continues in the text between length B_min and length B_max until the number of target words found reaches N_s. The answer text is then cut at the position of the last target word found, yielding a semantic block, and segmentation continues from the next position in the answer text.

If the number of target words in the text of length B_min falls short of N_s and still falls short when the search reaches length B_max, the answer text is cut at length B_max, yielding a semantic block, and segmentation continues from the next position in the answer text.
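The three cases above can be sketched in Python. Several simplifications here are assumptions and not stated in the text: the input is a list of tokens, length is measured in tokens, target words are given as a set of token indices (the named entity and verb recognition is out of scope), and a first part that holds N_s or more target words is accepted as a block. The function name `segment_tokens` is hypothetical.

```python
from typing import List, Sequence, Set


def segment_tokens(tokens: Sequence[str], target_idx: Set[int],
                   b_min: int = 32, b_max: int = 64,
                   n_s: int = 3) -> List[List[str]]:
    """Split tokens into semantic blocks of length between b_min and b_max."""
    blocks: List[List[str]] = []
    start, n = 0, len(tokens)
    while start < n:
        # Always take the first b_min tokens of the remaining text.
        end = min(start + b_min, n)
        count = sum(1 for i in range(start, end) if i in target_idx)
        # Extend one token at a time, up to b_max, until n_s targets are found.
        limit = min(start + b_max, n)
        while count < n_s and end < limit:
            if end in target_idx:
                count += 1
            end += 1  # the block now ends just after token end - 1
        blocks.append(list(tokens[start:end]))
        start = end  # continue from the next position
    return blocks


tokens = list("abcdefghij")
with_targets = segment_tokens(tokens, {1, 6}, b_min=3, b_max=5, n_s=1)
no_targets = segment_tokens(tokens, set(), b_min=3, b_max=5, n_s=1)
```

In `with_targets`, the first block stops at length B_min because it already holds a target, the second block is extended until it reaches the target at index 6, and the last block is simply the remaining text. In `no_targets`, every block is cut at length B_max.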

In one embodiment, a semantic block includes a first part and a second part, where the text length of the first part is Bmin. During segmentation, a text of length Bmin is always extracted first to obtain the first part. Target words are then searched for in the text between length Bmin and length Bmax until the number of target words found reaches the preset quantity threshold Ns. The answer text is then truncated with the position of the last target word found as the end position to obtain the second part; one semantic block is obtained from the first part and the second part, and segmentation continues from the next position in the answer text. If the number of target words still has not reached the preset quantity threshold Ns when the search, starting from length Bmin, arrives at length Bmax, the answer text is truncated with length Bmax as the end position to obtain the second part; one semantic block is obtained from the first part and the second part, and segmentation continues from the next position in the answer text.

In this embodiment, the answer text is segmented according to the target words and the text length condition to obtain a plurality of semantic blocks. Each semantic block satisfies the text length condition while also containing a sufficient number of target words, which ensures that the semantic block carries sufficiently rich semantic information.

Further, the above step S204 may include: for each question-answer value evaluation text, inputting the question-answer value evaluation text into each sub-model of the question-answer value evaluation model to obtain a plurality of question-answer value sub-scores; and performing a linear operation on the question-answer value sub-scores to obtain the question-answer value score of the semantic block in the question-answer value evaluation text.

Specifically, the question-answer value evaluation model may include a plurality of parallel sub-models, and each sub-model may consist of a pre-trained language model, a fully connected layer, and an activation function. The pre-trained language model may include models such as Bert, Roberta, LMO, and GPT. The activation function may be a function such as relu or sigmoid. In one embodiment, the question-answer value evaluation model may consist of a pre-trained language model component, a fully connected layer, and an activation function, where the pre-trained language model component may be composed of a plurality of pre-trained language models. The question-answer value score output by the question-answer value evaluation model may lie between 0 and 1.

For each semantic block, the question-answer value evaluation text is input into each sub-model of the question-answer value evaluation model, and each sub-model outputs a question-answer value sub-score for the semantic block in the question-answer value evaluation text. A linear operation, for example a direct summation, may be performed on the question-answer value sub-scores to obtain the question-answer value score of the semantic block.

In one embodiment, the average of the question-answer value sub-scores is calculated to obtain the question-answer value score of the semantic block.

In one embodiment, the weight of each question-answer value sub-score is determined by a preset weighting algorithm, for example the analytic hierarchy process, and the weighted sum of the question-answer value sub-scores is then calculated according to the weights to obtain the question-answer value score of the semantic block. Assigning different weights to the question-answer value sub-scores reflects the differences between the sub-models and between the sub-scores they output, which ensures the accuracy of the question-answer value score calculation.
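As a non-limiting illustration, the three linear operations mentioned above (summation, averaging, and weighted sum) may be sketched in Python as follows. The function name `combine_sub_scores` and its `mode` parameter are hypothetical, and the weights are simply passed in by the caller; how they are derived (for example by the analytic hierarchy process) is outside the scope of this sketch.

```python
from typing import Optional, Sequence


def combine_sub_scores(sub_scores: Sequence[float],
                       weights: Optional[Sequence[float]] = None,
                       mode: str = "mean") -> float:
    """Combine the sub-scores of one semantic block into a single value score."""
    if not sub_scores:
        raise ValueError("at least one sub-score is required")
    if mode == "sum":
        return sum(sub_scores)
    if mode == "mean":
        return sum(sub_scores) / len(sub_scores)
    if mode == "weighted":
        if weights is None or len(weights) != len(sub_scores):
            raise ValueError("weighted mode needs one weight per sub-score")
        return sum(w * s for w, s in zip(weights, sub_scores))
    raise ValueError(f"unknown mode: {mode}")
```

If the sub-scores each lie between 0 and 1, the mean, or a weighted sum whose weights are non-negative and sum to 1, also lies between 0 and 1; a plain sum does not have this property.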

In this embodiment, the question-answer value evaluation model may include a plurality of different sub-models, each of which outputs a question-answer value sub-score for a semantic block. A linear operation is performed on the sub-scores output by the sub-models to obtain a question-answer value score that combines the evaluation results of multiple sub-models, which ensures the accuracy of the question-answer value score.

Further, before the above step S204, the method may also include: acquiring a plurality of question-answer value training texts, where each question-answer value training text is obtained by combining a training question text with one of the semantic blocks obtained by segmenting a training answer text; inputting each question-answer value training text into an initial question-answer value evaluation model to obtain a question-answer value prediction score for the semantic block in each question-answer value training text; acquiring a semantic block label for the semantic block in each question-answer value training text, where the semantic block label identifies whether the semantic block is associated with the answer information; calculating a value evaluation loss based on the obtained question-answer value prediction scores and the semantic block labels; and adjusting the model parameters of the initial question-answer value evaluation model according to the value evaluation loss until the model converges, to obtain the question-answer value evaluation model.

Specifically, the present application requires the question-answer value evaluation model to be obtained by training in advance. During training, a training question text and a training answer text are acquired, and the training answer text is segmented according to the semantic segmentation algorithm to obtain a plurality of semantic blocks. The training question text is combined with each semantic block to obtain a plurality of question-answer value training texts.

The question-answer value training texts are input into the initial question-answer value evaluation model to obtain the question-answer value prediction score of the semantic block in each question-answer value training text. The question-answer value prediction score is the question-answer value score produced during the model training phase. Each semantic block has a semantic block label, which indicates whether the semantic block is associated with the answer information. For example, the semantic block label is 1 when the semantic block is associated with the answer information and 0 when it is not.

The value evaluation loss loss(Q) can be calculated from the question-answer value prediction scores and the semantic block labels: loss(Q) = CrossEntropy(dis(Q), label(Q)), where CrossEntropy denotes the cross-entropy loss function, dis(Q) denotes the question-answer value prediction score of each semantic block, and label(Q) denotes the semantic block label of each semantic block.
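As a non-limiting illustration, loss(Q) may be sketched in Python as follows. The application names only CrossEntropy; the binary form of the cross entropy, the averaging over semantic blocks, and the clamping constant `eps` are assumptions made for this sketch, chosen because the labels are 0 or 1 and the prediction scores lie between 0 and 1.

```python
import math
from typing import Sequence


def value_evaluation_loss(pred_scores: Sequence[float],
                          labels: Sequence[int],
                          eps: float = 1e-12) -> float:
    """Mean binary cross entropy between prediction scores dis(Q) and labels label(Q)."""
    if len(pred_scores) != len(labels) or not pred_scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(pred_scores, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(pred_scores)
```

The loss is small when blocks labelled 1 receive scores near 1 and blocks labelled 0 receive scores near 0, and grows as the predictions move away from the labels.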

In one embodiment, after the question-answer value prediction scores are obtained, the semantic blocks are screened according to the question-answer value prediction scores to obtain a semantic block queue, and the value evaluation loss loss(Q) is then calculated from the question-answer value prediction scores and the semantic block labels of the semantic blocks in the semantic block queue: loss(Q) = CrossEntropy(dis(Q), label(Q)), where CrossEntropy denotes the cross-entropy loss function, dis(Q) denotes the question-answer value prediction score of each semantic block in the semantic block queue, and label(Q) denotes the semantic block label of each semantic block in the semantic block queue.

In this embodiment, a plurality of question-answer value training texts are acquired and input into the initial question-answer value evaluation model to obtain the question-answer value prediction score of the semantic block in each question-answer value training text; the semantic block label of each semantic block is acquired; and the value evaluation loss is calculated from the question-answer value prediction scores and the semantic block labels and used to adjust the initial question-answer value evaluation model, so that the trained question-answer value evaluation model can accurately output the question-answer value score of the semantic block in a question-answer value evaluation text.

Further, the above step S205 may include: sorting the semantic blocks in descending order of question-answer value score to obtain a candidate queue; adding the semantic blocks one by one to an initial semantic block queue in the order in which they appear in the candidate queue, and counting the current queue length of the initial semantic block queue; when the current queue length equals a preset length threshold, determining the current initial semantic block queue as the semantic block queue; and when the current queue length is greater than the preset length threshold, truncating the most recently added semantic block according to the length threshold to obtain the semantic block queue.

Specifically, the semantic blocks are sorted in descending order of their question-answer value scores to obtain the candidate queue. The initial semantic block queue may start out empty; the semantic blocks in the candidate queue are then added to it one by one in descending order, and the current queue length of the initial semantic block queue is counted after each addition.

If the current queue length is below the preset length threshold, semantic blocks continue to be added. If the current queue length exactly equals the length threshold (for example, the length threshold is 512 and 8 semantic blocks of length 64 each have been added), no further semantic blocks are added and the current initial semantic block queue is determined as the semantic block queue. If the current queue length exceeds the length threshold after a semantic block is added, the most recently added semantic block is truncated according to the length threshold so that the length of the remaining queue equals the length threshold, yielding the semantic block queue. For example, suppose the queue length is 384 after 6 semantic blocks have been added, and the 7th, 8th, and 9th semantic blocks have lengths 40, 42, and 50 respectively. After the 9th semantic block is added, the current queue length is 516, which exceeds the length threshold of 512, so the 9th semantic block is truncated. The length threshold may be the maximum text length that the answer information extraction model accepts as input.
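As a non-limiting illustration, the queue construction described above may be sketched in Python as follows. Semantic blocks are represented here as plain strings and length is measured in characters; the function name `build_block_queue` and the decision to cut the excess from the end of the last block are assumptions made for illustration.

```python
from typing import List, Sequence


def build_block_queue(blocks: Sequence[str], scores: Sequence[float],
                      max_len: int = 512) -> List[str]:
    """Select the highest-scoring blocks until the queue length reaches max_len."""
    # Candidate queue: blocks in descending order of question-answer value score.
    ranked = sorted(zip(scores, blocks), key=lambda pair: pair[0], reverse=True)
    queue: List[str] = []
    total = 0
    for _, block in ranked:
        if total >= max_len:
            break  # queue is already full
        remaining = max_len - total
        if len(block) > remaining:
            block = block[:remaining]  # truncate the last block added
        queue.append(block)
        total += len(block)
    return queue
```

With the figures from the example above (six blocks of length 64 followed by blocks of length 40, 42, and 50), the ninth block is cut to 46 characters so that the queue length is exactly 512.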

In this embodiment, the semantic blocks are sorted in descending order of question-answer value score and then added one by one to the initial semantic block queue, so that the semantic blocks most valuable for question answering are selected. When the current queue length exceeds the length threshold, the most recently added semantic block is truncated, which ensures that the length of the semantic block queue does not exceed the maximum text length that the answer information extraction model accepts as input.

Further, the above step S206 may include: combining the question text with each semantic block in the semantic block queue to obtain a plurality of question-answer texts; inputting each question-answer text into the answer information extraction model to obtain answer position information; and determining, based on the answer position information, the answer information in the answer text that corresponds to the question text.

Specifically, the question text is combined with each semantic block in the semantic block queue to obtain a plurality of question-answer texts. In one embodiment, for the i-th semantic block in the semantic block queue, the question-answer text has the form [[CLS]Query[SEP]Si], where CLS and SEP denote separators, Query denotes the question text, and Si denotes the i-th semantic block.
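As a non-limiting illustration, assembling the question-answer texts in the form [CLS]Query[SEP]Si may be sketched in Python as follows. The function name `build_qa_texts` is hypothetical, and the separators are written as literal strings for readability; in an actual system a tokenizer would typically insert them as special tokens instead.

```python
from typing import List, Sequence


def build_qa_texts(query: str, block_queue: Sequence[str]) -> List[str]:
    """Pair the question text with every semantic block in the queue."""
    return [f"[CLS]{query}[SEP]{block}" for block in block_queue]
```

Each returned string corresponds to one semantic block, so the index of a question-answer text also identifies the semantic block it was built from.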

Each question-answer text is input into the answer information extraction model, which processes the question-answer texts and outputs answer position information. The answer position information may be a vector that represents the start position and the end position of the answer information within a semantic block; it may also include the number of the semantic block, indicating which semantic block contains the answer information.

Based on the answer position information, the text fragment directly related to the question text can be extracted from the answer text to obtain the answer information.
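As a non-limiting illustration, extracting the answer fragment from the answer position information may be sketched in Python as follows. The `AnswerPosition` structure, its field names, and the convention that the end position is inclusive are assumptions made for illustration; the application states only that the position information conveys a start position, an end position, and optionally a semantic block number.

```python
from typing import NamedTuple, Sequence


class AnswerPosition(NamedTuple):
    block_index: int  # which semantic block contains the answer
    start: int        # start position inside that block
    end: int          # end position inside that block (inclusive, by assumption)


def extract_answer(block_queue: Sequence[str], pos: AnswerPosition) -> str:
    """Cut the answer fragment out of the semantic block named by `pos`."""
    block = block_queue[pos.block_index]
    if not (0 <= pos.start <= pos.end < len(block)):
        raise ValueError("answer position lies outside the semantic block")
    return block[pos.start:pos.end + 1]
```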

In this embodiment, the question text is combined with each semantic block in the semantic block queue to obtain the question-answer texts, which are input into the answer information extraction model. Because the semantic blocks in the semantic block queue contribute the most value to question answering, the answer information extraction model can accurately output the answer position information, and the answer information is thereby obtained accurately.

Further, before the above step S206, the method may also include: acquiring question-answer training texts, where the question-answer training texts are obtained by combining a training question text with each semantic block in a training semantic block queue; inputting the question-answer training texts into an initial answer information extraction model to obtain answer position prediction information for the semantic blocks in the question-answer training texts; acquiring answer position label information for the semantic blocks in the question-answer training texts; calculating a position evaluation loss based on the obtained answer position prediction information and the answer position label information; and adjusting the model parameters of the initial answer information extraction model according to the position evaluation loss until the model converges, to obtain the answer information extraction model.

Specifically, the answer information extraction model needs to be obtained by training in advance. During training, question-answer training texts are acquired; they are obtained by combining a training question text with each semantic block in a training semantic block queue, where the training semantic block queue comes from the training answer text corresponding to the training question text. The answer information extraction model may be trained on the basis of the question-answer value evaluation model; that is, the question-answer value evaluation model is trained first, and the trained question-answer value evaluation model is used to generate the training semantic block queue.

The question-answer training texts are input into the initial answer information extraction model, which predicts the position of the answer inside the semantic block of each question-answer training text and outputs answer position prediction information. The answer position label information is then acquired; it records, in the form of labels, the position of the answer inside the semantic block of each question-answer training text.

The answer position prediction information and the answer position label information may be vectors composed of 0s and 1s, where 1 indicates that the position belongs to the answer information and 0 indicates that it does not.
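As a non-limiting illustration, converting such a 0/1 vector into start and end positions may be sketched in Python as follows. The function name `mask_to_spans` and the choice to return every contiguous run of 1s (rather than a single span) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mask_to_spans(mask: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end inclusive, for each contiguous run of 1s."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(mask):
        if value == 1 and start is None:
            start = i                      # a run of answer positions begins
        elif value != 1 and start is not None:
            spans.append((start, i - 1))   # the run ended at the previous position
            start = None
    if start is not None:
        spans.append((start, len(mask) - 1))
    return spans
```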

The position evaluation loss can be calculated from the answer position prediction information and the answer position label information, and the loss function may be a cross-entropy loss function. With the goal of minimizing the position evaluation loss, the model parameters of the initial answer information extraction model are adjusted until the model converges, yielding the answer information extraction model.

When optimizing the initial question-answer value evaluation model and the initial answer information extraction model, methods such as SGD and Adam may be used.

It can be understood that the initial answer information extraction model is trained on the training semantic block queue, and that the training semantic block queue is generated from the question-answer value scores output by the question-answer value evaluation model. The semantic blocks in the training semantic block queue therefore have relatively high value for question answering, and the initial answer information extraction model is trained on higher-quality training samples, which can shorten its training time and reduce training overhead.

In this embodiment, the question-answer training texts are input into the initial answer information extraction model to obtain the answer position prediction information, the position evaluation loss is calculated from this prediction information and the answer position label information, and the model is adjusted according to the position evaluation loss to obtain the answer information extraction model. This ensures that the answer information extraction model can accurately extract the answer information from the question-answer texts.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (Read-Only Memory, ROM), or may be a random access memory (Random Access Memory, RAM) or the like.

It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed in sequence but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

With further reference to FIG. 3, as an implementation of the method shown in FIG. 2, the present application provides an embodiment of a text-based question answering apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.

As shown in FIG. 3, the text-based question answering apparatus 300 of this embodiment includes a text acquisition module 301, a text segmentation module 302, a combination module 303, a text input module 304, a screening module 305, and an answer extraction module 306, wherein:

The text acquisition module 301 is configured to acquire a question text and its corresponding answer text.

The text segmentation module 302 is configured to segment the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks.

The combination module 303 is configured to combine the question text with each semantic block to obtain a plurality of question-answer value evaluation texts.

The text input module 304 is configured to input each question-answer value evaluation text into the question-answer value evaluation model to obtain the question-answer value score of the semantic block in each question-answer value evaluation text.

The screening module 305 is configured to screen the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block.

The answer extraction module 306 is configured to input the question text and the semantic block queue into the answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

In this embodiment, the question text and its corresponding answer text are acquired, and the answer text is segmented according to the preset semantic segmentation algorithm, which ensures that the resulting semantic blocks carry sufficiently rich semantic information. The question text is combined with each semantic block to obtain a plurality of question-answer value evaluation texts, which are input into the question-answer value evaluation model to obtain question-answer value scores. The question-answer value score measures the contribution and value of the semantic block in a question-answer value evaluation text to question answering, so semantic blocks with higher question-answer value are selected to form the semantic block queue. The question text and the semantic block queue are then input into the answer information extraction model, so that the model can accurately output the answer information based on the semantic blocks with higher question-answer value. The present application preserves the semantic information of the semantic blocks during text segmentation and selects semantic blocks with higher question-answer value for answer information extraction, which improves the accuracy of question answering.

In some optional implementations of this embodiment, the text segmentation module 302 may include a target word recognition sub-module and a text segmentation sub-module, wherein:

The target word recognition sub-module is configured to identify the target words in the answer text.

The text segmentation sub-module is configured to segment the answer text according to the target words and the preset text length condition to obtain a plurality of semantic blocks, where either the number of target words in a semantic block equals the preset quantity threshold and the text length of the semantic block lies within the preset length interval, or the text length of the semantic block equals the value of the right endpoint of the preset length interval. The quantity threshold and the preset length interval are parameters of the preset semantic segmentation algorithm.

In this embodiment, the answer text is segmented according to the target words and the text length condition to obtain a plurality of semantic blocks. Each semantic block satisfies the text length condition while also containing a sufficient number of target words, which ensures that the semantic block carries sufficiently rich semantic information.

In some optional implementations of this embodiment, the text input module 304 may include a text input sub-module and a score operation sub-module, wherein:

The text input sub-module is configured to, for each question-answer value evaluation text, input the question-answer value evaluation text into each sub-model of the question-answer value evaluation model to obtain a plurality of question-answer value sub-scores.

The score operation sub-module is configured to perform a linear operation on the question-answer value sub-scores to obtain the question-answer value score of the semantic block in the question-answer value evaluation text.

In this embodiment, the question-answer value evaluation model may include a plurality of different sub-models, each of which outputs a question-answer value sub-score for a semantic block. A linear operation is performed on the sub-scores output by the sub-models to obtain a question-answer value score that combines the evaluation results of multiple sub-models, which ensures the accuracy of the question-answer value score.

在本实施例的一些可选的实现方式中,基于文本的问答装置300还可以包括:问答训练获取模块、问答训练输入模块、标签获取模块、价值损失计算模块以及价值模型调整模块,其中:In some optional implementations of this embodiment, the text-based question answering apparatus 300 may further include: a question and answer training acquisition module, a question and answer training input module, a label acquisition module, a value loss calculation module, and a value model adjustment module, wherein:

问答训练获取模块,用于获取多个问答价值训练文本;各问答价值训练文本由训练问题文本和分割训练答案文本得到的各语义块分别进行组合得到;The question and answer training acquisition module is used to obtain multiple question and answer value training texts; each question and answer value training text is obtained by combining the training question text and each semantic block obtained by dividing the training answer text;

问答训练输入模块,用于将各问答价值训练文本输入初始问答价值评估模型,得到各问答价值训练文本中语义块的问答价值预测分数;The question and answer training input module is used to input each question and answer value training text into the initial question and answer value evaluation model, and obtain the question and answer value prediction score of the semantic block in each question and answer value training text;

标签获取模块,用于获取各问答价值训练文本中语义块的语义块标签;语义块标签标识语义块是否关联于答案信息;The label acquisition module is used to obtain the semantic block labels of the semantic blocks in each question-and-answer value training text; the semantic block labels identify whether the semantic blocks are related to the answer information;

价值损失计算模块,用于基于得到的问答价值预测分数和语义块标签计算价值评估损失;The value loss calculation module is used to calculate the value evaluation loss based on the obtained question and answer value prediction scores and semantic block labels;

价值模型调整模块,用于根据价值评估损失调整初始问答价值评估模型的模型参数,直至模型收敛,得到问答价值评估模型。The value model adjustment module is used to adjust the model parameters of the initial question and answer value assessment model according to the value assessment loss, until the model converges, and the question and answer value assessment model is obtained.

In this embodiment, multiple question-answer value training texts are acquired and input into the initial question-answer value evaluation model to obtain a predicted question-answer value score for the semantic block in each training text; the semantic block label of each semantic block is acquired; the value evaluation loss is calculated from the predicted scores and the semantic block labels, and the initial question-answer value evaluation model is adjusted accordingly, so that the trained question-answer value evaluation model can accurately output the question-answer value score of the semantic block in a question-answer value evaluation text.
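
As an illustration of the value evaluation loss, the following Python sketch computes a binary cross-entropy between predicted scores and semantic block labels. The source does not name the loss function; binary cross-entropy, the assumption that predicted scores lie in (0, 1), and the function name are illustrative choices.

```python
import math
from typing import Sequence


def value_evaluation_loss(
    predicted_scores: Sequence[float],
    block_labels: Sequence[int],
    eps: float = 1e-12,
) -> float:
    """Mean binary cross-entropy between predicted scores and labels.

    A label of 1 marks a semantic block associated with the answer
    information and 0 marks one that is not. Binary cross-entropy is an
    assumed choice; the source only says a loss is computed from the
    predicted scores and the labels.
    """
    if len(predicted_scores) != len(block_labels):
        raise ValueError("scores and labels must have the same length")
    if not predicted_scores:
        raise ValueError("at least one training example is required")
    total = 0.0
    for p, y in zip(predicted_scores, block_labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(predicted_scores)
```

A training loop would repeatedly compute this loss on batches of training texts and update the model parameters until the loss stops decreasing.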

In some optional implementations of this embodiment, the screening module 305 may include a descending-order sub-module, a semantic block adding sub-module, a current determination sub-module, and a semantic block truncation sub-module, wherein:

The descending-order sub-module is configured to sort the semantic blocks in descending order of question-answer value score to obtain a candidate queue.

The semantic block adding sub-module is configured to add the semantic blocks one by one to an initial semantic block queue, following their order in the candidate queue, and to track the current queue length of the initial semantic block queue.

The current determination sub-module is configured to determine the current initial semantic block queue as the semantic block queue when the current queue length equals a preset length threshold.

The semantic block truncation sub-module is configured to truncate the most recently added semantic block according to the length threshold when the current queue length exceeds the preset length threshold, yielding the semantic block queue.

In this embodiment, the semantic blocks are sorted in descending order of question-answer value score and then added one by one to the initial semantic block queue, so that the semantic blocks most valuable for question answering are selected. When the current queue length exceeds the length threshold, the most recently added semantic block is truncated promptly, ensuring that the length of the semantic block queue does not exceed the maximum text length that the answer information extraction model accepts as input.
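
The screening procedure can be sketched in Python as follows. Measuring queue length in characters is an assumption (the source does not state the unit), and the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def build_block_queue(
    blocks: Sequence[str],
    scores: Sequence[float],
    max_length: int,
) -> List[str]:
    """Select the highest-scoring semantic blocks within a length budget.

    Length is counted in characters here, which is an assumption; a real
    system might count tokens instead.
    """
    if len(blocks) != len(scores):
        raise ValueError("one score is required per semantic block")
    if max_length <= 0:
        raise ValueError("max_length must be positive")

    # Candidate queue: blocks sorted by score, highest first.
    ranked = sorted(zip(scores, blocks), key=lambda pair: pair[0], reverse=True)

    queue: List[str] = []
    current_length = 0
    for _, block in ranked:
        queue.append(block)
        current_length += len(block)
        if current_length == max_length:
            break  # budget met exactly: the queue is final
        if current_length > max_length:
            # Budget exceeded: cut the overflow off the last added block.
            overflow = current_length - max_length
            queue[-1] = block[: len(block) - overflow]
            break
    return queue
```

If all blocks together are shorter than the threshold, the loop simply ends and every block is kept in score order.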

In some optional implementations of this embodiment, the answer extraction module 306 may include a combination sub-module, a question-answer input sub-module, and an answer determination sub-module, wherein:

The combination sub-module is configured to combine the question text separately with each semantic block in the semantic block queue to obtain multiple question-answer texts.

The question-answer input sub-module is configured to input each question-answer text into the answer information extraction model to obtain answer position information.

The answer determination sub-module is configured to determine, based on the answer position information, the answer information in the answer text that corresponds to the question text.

In this embodiment, the question text is combined separately with each semantic block in the semantic block queue to obtain question-answer texts, which are input into the answer information extraction model. Because the semantic blocks in the semantic block queue contribute the most value to question answering, the answer information extraction model can accurately output the answer position information, and the answer information is thus obtained accurately.
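
A minimal Python sketch of this extraction flow is given below. The `extractor` callable stands in for the answer information extraction model and is hypothetical, as are the `[SEP]` separator, the `(start, end, confidence)` return shape, and the rule of keeping the highest-confidence span; the source does not specify how results from different question-answer texts are reconciled.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical model output: (start, end, confidence), where start and end
# are character offsets into the question-answer text (end exclusive).
Span = Tuple[int, int, float]
Extractor = Callable[[str], Optional[Span]]


def extract_answer(
    question: str,
    block_queue: Sequence[str],
    extractor: Extractor,
    separator: str = "[SEP]",
) -> Optional[str]:
    """Combine the question with each block and keep the best answer span."""
    best_answer: Optional[str] = None
    best_confidence = float("-inf")
    for block in block_queue:
        qa_text = f"{question}{separator}{block}"  # one question-answer text
        span = extractor(qa_text)
        if span is None:
            continue  # the model found no answer in this block
        start, end, confidence = span
        if confidence > best_confidence:
            best_confidence = confidence
            best_answer = qa_text[start:end]
    return best_answer
```

Since each semantic block is a piece of the answer text, the span sliced from a question-answer text corresponds to a span of the original answer text.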

In some optional implementations of this embodiment, the text-based question answering apparatus 300 may further include a training text acquisition module, a training text input module, a position label acquisition module, a position loss calculation module, and an extraction model adjustment module, wherein:

The training text acquisition module is configured to acquire question-answer training texts; the question-answer training texts are obtained by combining a training question text separately with each semantic block in a training semantic block queue.

The training text input module is configured to input the question-answer training texts into an initial answer information extraction model to obtain predicted answer position information for the semantic blocks in the question-answer training texts.

The position label acquisition module is configured to acquire the answer position label information of the semantic blocks in the question-answer training texts.

The position loss calculation module is configured to calculate a position evaluation loss based on the predicted answer position information and the answer position label information.

The extraction model adjustment module is configured to adjust the model parameters of the initial answer information extraction model according to the position evaluation loss until the model converges, yielding the answer information extraction model.

In this embodiment, the question-answer training texts are input into the initial answer information extraction model to obtain predicted answer position information, the position evaluation loss is calculated from the predicted answer position information and the answer position label information, and the model is adjusted according to the position evaluation loss to obtain the answer information extraction model, which ensures that the answer information extraction model can accurately extract answer information from question-answer texts.
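
One common way to compute such a position loss is cross-entropy over predicted start and end positions, sketched below in Python. The source does not specify the form of the answer position information or the loss; representing predictions as per-position logits and averaging the start and end losses are assumptions.

```python
import math
from typing import Sequence


def _negative_log_likelihood(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for a single target index (log-sum-exp form)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


def position_evaluation_loss(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    start_label: int,
    end_label: int,
) -> float:
    """Average the cross-entropy of the start and end position predictions.

    start_label and end_label are the labeled answer positions; treating
    them as indices into per-position logits is an assumed representation.
    """
    start_loss = _negative_log_likelihood(start_logits, start_label)
    end_loss = _negative_log_likelihood(end_logits, end_label)
    return 0.5 * (start_loss + end_loss)
```

The loss shrinks as the model assigns higher logits to the labeled start and end positions, which is the signal used to adjust the model parameters.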

To solve the above technical problem, an embodiment of the present application further provides a computer device. Refer to FIG. 4, which is a block diagram of the basic structure of the computer device of this embodiment.

The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to one another through a system bus. Note that the figure shows only the computer device 4 with components 41-43; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.

The computer device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The computer device can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice control device, or the like.

The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or internal memory of the computer device 4. In other embodiments, the memory 41 may be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 4. The memory 41 may also include both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is generally used to store the operating system and the various application software installed on the computer device 4, such as the computer-readable instructions of the text-based question answering method. In addition, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 42 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is generally used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to run the computer-readable instructions stored in the memory 41 or to process data, for example to run the computer-readable instructions of the text-based question answering method.

The network interface 43 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 4 and other electronic devices.

The computer device provided in this embodiment can execute the text-based question answering method described above. The text-based question answering method here may be the text-based question answering method of any of the above embodiments.

In this embodiment, a question text and its corresponding answer text are acquired, and the answer text is segmented according to a preset semantic segmentation algorithm, ensuring that the resulting semantic blocks carry sufficiently rich semantic information. The question text is combined separately with each semantic block to obtain multiple question-answer value evaluation texts. The question-answer value evaluation texts are input into the question-answer value evaluation model to obtain question-answer value scores, which measure the contribution and value of the semantic block in each evaluation text to question answering, so that semantic blocks with higher question-answer value are selected to form the semantic block queue. The question text and the semantic block queue are then input into the answer information extraction model, so that the model can accurately output the answer information based on the semantic blocks with higher question-answer value. The present application preserves the semantic information of the semantic blocks during text segmentation and extracts the answer information from semantic blocks with higher question-answer value, which improves the accuracy of question answering.

The present application further provides another implementation, namely a computer-readable storage medium storing computer-readable instructions that are executable by at least one processor, so that the at least one processor performs the steps of the text-based question answering method described above.

In this embodiment, a question text and its corresponding answer text are acquired, and the answer text is segmented according to a preset semantic segmentation algorithm, ensuring that the resulting semantic blocks carry sufficiently rich semantic information. The question text is combined separately with each semantic block to obtain multiple question-answer value evaluation texts. The question-answer value evaluation texts are input into the question-answer value evaluation model to obtain question-answer value scores, which measure the contribution and value of the semantic block in each evaluation text to question answering, so that semantic blocks with higher question-answer value are selected to form the semantic block queue. The question text and the semantic block queue are then input into the answer information extraction model, so that the model can accurately output the answer information based on the semantic blocks with higher question-answer value. The present application preserves the semantic information of the semantic blocks during text segmentation and extracts the answer information from semantic blocks with higher question-answer value, which improves the accuracy of question answering.

From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.

Obviously, the embodiments described above are only some of the embodiments of the present application, not all of them. The accompanying drawings show preferred embodiments of the present application but do not limit its patent scope. The present application may be implemented in many different forms; rather, these embodiments are provided so that the disclosure of the present application will be understood more thoroughly and comprehensively. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific implementations, or replace some of their technical features with equivalents. Any equivalent structure made using the contents of the specification and drawings of the present application, and applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.

Claims (10)

1. A text-based question answering method, characterized by comprising the following steps:
acquiring a question text and its corresponding answer text;
segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;
combining the question text separately with each semantic block to obtain a plurality of question-answer value evaluation texts;
inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain a question-answer value score of the semantic block in each question-answer value evaluation text;
screening the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;
inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

2. The text-based question answering method according to claim 1, characterized in that the step of segmenting the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks comprises:
identifying target words in the answer text;
segmenting the answer text according to the target words and a preset text length condition to obtain a plurality of semantic blocks; wherein the number of target words in a semantic block equals a preset quantity threshold and the text length of the semantic block lies within a preset length interval; or the text length of the semantic block equals the value of the right endpoint of the preset length interval; the quantity threshold and the preset length interval being parameters of the preset semantic segmentation algorithm.

3. The text-based question answering method according to claim 1, characterized in that the step of inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain a question-answer value score of the semantic block in each question-answer value evaluation text comprises:
for each question-answer value evaluation text, inputting the question-answer value evaluation text into each sub-model of the question-answer value evaluation model to obtain a plurality of question-answer value sub-scores;
performing a linear operation on the question-answer value sub-scores to obtain the question-answer value score of the semantic block in the question-answer value evaluation text.

4. The text-based question answering method according to claim 1, characterized in that, before the step of inputting each question-answer value evaluation text into a question-answer value evaluation model to obtain a question-answer value score of the semantic block in each question-answer value evaluation text, the method further comprises:
acquiring a plurality of question-answer value training texts, the question-answer value training texts being obtained by combining a training question text separately with each semantic block obtained by segmenting a training answer text;
inputting each question-answer value training text into an initial question-answer value evaluation model to obtain a predicted question-answer value score of the semantic block in each question-answer value training text;
acquiring a semantic block label of the semantic block in each question-answer value training text, the semantic block label indicating whether the semantic block is associated with the answer information;
calculating a value evaluation loss based on the predicted question-answer value scores and the semantic block labels;
adjusting the model parameters of the initial question-answer value evaluation model according to the value evaluation loss until the model converges, to obtain the question-answer value evaluation model.

5. The text-based question answering method according to claim 1, characterized in that the step of screening the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block comprises:
sorting the semantic blocks in descending order of question-answer value score to obtain a candidate queue;
adding the semantic blocks one by one to an initial semantic block queue, following their order in the candidate queue, and tracking the current queue length of the initial semantic block queue;
when the current queue length equals a preset length threshold, determining the current initial semantic block queue as the semantic block queue;
when the current queue length exceeds the preset length threshold, truncating the most recently added semantic block according to the length threshold to obtain the semantic block queue.

6. The text-based question answering method according to claim 1, characterized in that the step of inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text comprises:
combining the question text separately with each semantic block in the semantic block queue to obtain a plurality of question-answer texts;
inputting each question-answer text into the answer information extraction model to obtain answer position information;
determining, based on the answer position information, the answer information in the answer text that corresponds to the question text.

7. The text-based question answering method according to claim 6, characterized in that, before the step of inputting the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text, the method further comprises:
acquiring question-answer training texts, the question-answer training texts being obtained by combining a training question text separately with each semantic block in a training semantic block queue;
inputting the question-answer training texts into an initial answer information extraction model to obtain predicted answer position information of the semantic blocks in the question-answer training texts;
acquiring answer position label information of the semantic blocks in the question-answer training texts;
calculating a position evaluation loss based on the predicted answer position information and the answer position label information;
adjusting the model parameters of the initial answer information extraction model according to the position evaluation loss until the model converges, to obtain the answer information extraction model.

8. A text-based question answering apparatus, characterized by comprising:
a text acquisition module, configured to acquire a question text and its corresponding answer text;
a text segmentation module, configured to segment the answer text according to a preset semantic segmentation algorithm to obtain a plurality of semantic blocks;
a combination module, configured to combine the question text separately with each semantic block to obtain a plurality of question-answer value evaluation texts;
a text input module, configured to input each question-answer value evaluation text into a question-answer value evaluation model to obtain a question-answer value score of the semantic block in each question-answer value evaluation text;
a screening module, configured to screen the semantic blocks according to the question-answer value scores to obtain a semantic block queue containing at least one semantic block;
an answer extraction module, configured to input the question text and the semantic block queue into an answer information extraction model to obtain the answer information in the answer text that corresponds to the question text.

9. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements the steps of the text-based question answering method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the text-based question answering method according to any one of claims 1 to 7.
CN202210524669.5A 2022-05-13 2022-05-13 Text-based question answering method, device, computer equipment and storage medium Pending CN114817478A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210524669.5A CN114817478A (en) 2022-05-13 2022-05-13 Text-based question answering method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114817478A true CN114817478A (en) 2022-07-29

Family

ID=82514824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210524669.5A Pending CN114817478A (en) 2022-05-13 2022-05-13 Text-based question answering method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114817478A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150026106A1 (en) * 2012-02-23 2015-01-22 National Institute Of Information And Communcations Technology Non-factoid question-answering system and computer program
CN108090127A (en) * 2017-11-15 2018-05-29 北京百度网讯科技有限公司 Question and answer text evaluation model is established with evaluating the method, apparatus of question and answer text
CN108228568A (en) * 2018-01-24 2018-06-29 上海互教教育科技有限公司 A kind of mathematical problem semantic understanding method
CN111797634A (en) * 2020-06-04 2020-10-20 语联网(武汉)信息技术有限公司 Document segmentation method and device
CN112560491A (en) * 2020-12-11 2021-03-26 北京百炼智能科技有限公司 Information extraction method and device based on AI technology and storage medium
CN112784574A (en) * 2021-02-02 2021-05-11 网易(杭州)网络有限公司 Text segmentation method and device, electronic equipment and medium
CN113157867A (en) * 2021-04-29 2021-07-23 阳光保险集团股份有限公司 Question answering method and device, electronic equipment and storage medium
CN114186048A (en) * 2021-12-14 2022-03-15 深圳壹账通智能科技有限公司 Question answering method, device, computer equipment and medium based on artificial intelligence
CN114328796A (en) * 2021-08-19 2022-04-12 腾讯科技(深圳)有限公司 Question and answer index generation method, question and answer model processing method, device and storage medium

Non-Patent Citations (1)

Title
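丁长林; 蔡东风; 王裴岩: "Text segmentation technique for patent abstracts based on a classification algorithm", Journal of Shandong University (Natural Science), no. 05, 27 April 2012 (2012-04-27), pages 68 *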

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN117216208A (en) * 2023-09-01 2023-12-12 北京开普云信息科技有限公司 Question and answer method, device, storage medium and equipment based on long document
WO2025092056A1 (en) * 2023-10-31 2025-05-08 抖音视界有限公司 Question-and-answer data generation method and apparatus, and computer device and storage medium
CN117592567A (en) * 2023-11-21 2024-02-23 广州方舟信息科技有限公司 Medicine question-answer model training method, device, electronic equipment and storage medium
CN117592567B (en) * 2023-11-21 2024-05-28 广州方舟信息科技有限公司 Medicine question-answer model training method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112685565B (en) Text classification method based on multi-modal information fusion and related equipment
WO2021121198A1 (en) Semantic similarity-based entity relation extraction method and apparatus, device and medium
CN114780727A (en) Text classification method and device based on reinforcement learning, computer equipment and medium
CN114817478A (en) Text-based question answering method, device, computer equipment and storage medium
CN114780746A (en) Knowledge graph-based document retrieval method and related equipment thereof
CN111831826A (en) Training method, classification method and device of cross-domain text classification model
CN113901836B (en) Word sense disambiguation method, device and related equipment based on contextual semantics
CN112653798A (en) Intelligent customer service voice response method and device, computer equipment and storage medium
CN111930792A (en) Data resource labeling method and device, storage medium and electronic equipment
CN113722438A (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN111695337A (en) Method, device, equipment and medium for extracting professional terms in intelligent interview
CN116610784A (en) Insurance business scene question-answer recommendation method and related equipment thereof
CN118070072A (en) Problem processing method, device, equipment and storage medium based on artificial intelligence
CN114428838A (en) Content recall method, apparatus, computer equipment and storage medium
CN117131273A (en) Resource search methods, devices, computer equipment, media and products
JP7499946B2 (en) Method and device for training sorting model for intelligent recommendation, method and device for intelligent recommendation, electronic device, storage medium, and computer program
CN110532448B (en) Document classification method, device, equipment and storage medium based on neural network
CN116204624A (en) Response method, device, electronic device and storage medium
CN115687934A (en) Intention recognition method and device, computer equipment and storage medium
CN113360602B (en) Method, apparatus, device and storage medium for outputting information
CN115238077A (en) Artificial intelligence-based text analysis method, device, equipment and storage medium
CN114912958A (en) Seat calling-out method, device, computer equipment and storage medium
CN114461749A (en) Data processing method, device, electronic device and medium for dialogue content
CN112364649B (en) Named entity identification method and device, computer equipment and storage medium
CN118916453A (en) Intelligent operation and maintenance method based on self-developed GPT model and related equipment thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination