WO2023045752A1 - Method and device for building a knowledge base and generating answer sentences - Google Patents

Info

Publication number
WO2023045752A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
answer
knowledge base
extraction
events
Prior art date
Application number
PCT/CN2022/117228
Other languages
English (en)
French (fr)
Inventor
韩炯
王勇
Original Assignee
北京京东拓先科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东拓先科技有限公司
Publication of WO2023045752A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G06F16/355 - Class or cluster creation or modification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00 - ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring

Definitions

  • The present disclosure relates to the field of computer technology, specifically to the field of artificial intelligence, and in particular to a method and device for building a knowledge base and generating answer sentences.
  • Embodiments of the present disclosure provide a knowledge base construction method, apparatus, device, and storage medium.
  • An embodiment of the present disclosure provides a method for constructing a knowledge base. The method includes: obtaining historical medical inquiry sentences of multiple users; performing event extraction on the historical medical inquiry sentences to obtain multiple extraction events; and constructing, based on the extraction events and the corresponding answer sentences, a question-and-answer knowledge base used to generate answer sentences when a user consults a doctor.
  • An embodiment of the present disclosure provides a method for generating an answer sentence. The method includes: obtaining the target medical inquiry sentences of a target user within a historical preset time period; performing event extraction on the target medical inquiry sentences to obtain target extraction events; searching, based on the target extraction events, the question-and-answer knowledge base for the target answer sentence corresponding to the target medical inquiry sentences, wherein the question-and-answer knowledge base is obtained by the method described in any of the above implementations; and, in response to determining that the target answer sentence is found, pushing the target answer sentence to the user.
  • An embodiment of the present disclosure provides a knowledge base construction device, which includes: an acquisition module configured to acquire historical medical inquiry sentences of multiple users; an extraction module configured to perform event extraction on the historical medical inquiry sentences to obtain multiple extraction events; and a construction module configured to build a question-and-answer knowledge base based on the extraction events and the corresponding answer sentences, so as to generate answer sentences when a user consults a doctor.
  • An embodiment of the present disclosure provides a device for generating answer sentences, which includes: an inquiry module configured to acquire the target medical inquiry sentences of a target user within a historical preset time period; an acquisition module configured to perform event extraction on the target medical inquiry sentences to obtain target extraction events; a search module configured to search, based on the target extraction events, the question-and-answer knowledge base for the target answer sentence corresponding to the target medical inquiry sentences, wherein the question-and-answer knowledge base is obtained by the method described in any of the above implementations; and a push module configured to push the target answer sentence to the user in response to determining that the target answer sentence is found.
  • An embodiment of the present disclosure provides an electronic device, which includes one or more processors and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any of the above implementations.
  • An embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored. When the program is executed by a processor, the method described in any of the above implementations is implemented.
  • FIG. 1 is an exemplary system architecture diagram to which the present disclosure can be applied.
  • FIG. 2 is a flowchart of an embodiment of the knowledge base construction method according to the present disclosure.
  • FIG. 3 is a schematic diagram of an application scenario of the knowledge base construction method according to the present disclosure.
  • FIG. 4 is a flowchart of another embodiment of the knowledge base construction method according to the present disclosure.
  • FIG. 5 is a schematic diagram of an embodiment of a method for generating an answer sentence according to the present disclosure.
  • FIG. 6 is a schematic diagram of an embodiment of a knowledge base construction device according to the present disclosure.
  • FIG. 7 is a schematic diagram of an embodiment of a device for generating answer sentences according to the present disclosure.
  • FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a server according to an embodiment of the present disclosure.
  • FIG. 1 shows an exemplary system architecture 100 to which an embodiment of the knowledge base construction method of the present disclosure can be applied.
  • a system architecture 100 may include terminal devices 101 , 102 , 103 , a network 104 and a server 105 .
  • the network 104 is used as a medium for providing communication links between the terminal devices 101 , 102 , 103 and the server 105 .
  • Network 104 may include various connection types, such as wires, wireless communication links, or fiber optic cables, among others.
  • the terminal devices 101, 102, 103 interact with the server 105 via the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, 103, for example, diagnostic applications, communication applications, and the like.
  • The terminal devices 101, 102, and 103 may be hardware or software.
  • When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to mobile phones and notebook computers.
  • When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, to provide knowledge base construction services), or as a single piece of software or software module. No specific limitation is made here.
  • The server 105 can be a server that provides various services, for example: obtaining historical medical inquiry sentences of multiple users; performing event extraction on the historical medical inquiry sentences to obtain multiple extraction events; and constructing, based on the extraction events and the corresponding answer sentences, a question-and-answer knowledge base used to generate answer sentences when a user consults a doctor.
  • The server 105 may be hardware or software.
  • When the server 105 is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • When the server 105 is software, it can be implemented as multiple pieces of software or software modules (for example, to provide knowledge base construction services), or as a single piece of software or software module. No specific limitation is made here.
  • Each part (such as each unit, subunit, module, and submodule) included in the knowledge base construction device may be set entirely in the server 105, entirely in the terminal devices 101, 102, 103, or distributed between the server 105 and the terminal devices 101, 102, 103.
  • The numbers of terminal devices, networks and servers in FIG. 1 are only illustrative. Depending on implementation needs, there can be any number of terminal devices, networks and servers.
  • FIG. 2 shows a schematic flowchart 200 of a knowledge base construction method applicable to the present disclosure.
  • The knowledge base construction method includes the following steps 201-203.
  • Step 201: acquire historical medical inquiry sentences of multiple users.
  • The execution subject (the server 105 or the terminal devices 101, 102, and 103 shown in FIG. 1) can obtain the historical medical inquiry sentences of multiple users from various channels, for example, chat records between users and doctors stored in the back-end database of a health application, popular science articles in public accounts or periodicals, and chat records in community forums, which is not limited in this disclosure.
  • Step 202: perform event extraction on the historical medical inquiry sentences to obtain multiple extraction events.
  • The execution subject may use an event extraction algorithm to perform event extraction on the historical medical inquiry sentences to obtain multiple extraction events.
  • Event extraction refers to extracting events of interest to users from unstructured information and presenting them to users in a structured manner.
  • The event extraction algorithm can be an event extraction algorithm from existing or future-developed technology, for example, an event extraction algorithm based on pattern matching, machine learning, or neural networks, which is not limited in this disclosure.
  • The execution subject can evaluate the effect of the event extraction algorithm with either of the following: the micro-averaged F value (denoted as F), which is based on the recall (denoted as R) and the precision (denoted as P), or the error recognition cost (denoted as C).
  • In the error recognition cost method:
  • Cmiss is the cost of a miss (a missed detection).
  • Cfa is the cost of a false positive.
  • Ltar is the prior probability for the system to make a positive judgment, which is usually set as a constant value according to the specific application.
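  • The formulas for F and C are not reproduced in this text. The following is a minimal sketch under stated assumptions: F is taken to be the usual harmonic mean of P and R computed over pooled counts, and C is taken to be the detection-cost form C = Cmiss × Pmiss × Ltar + Cfa × Pfa × (1 − Ltar) that is common in detection evaluations. Both forms, the miss and false-alarm rates Pmiss and Pfa, and the function names are assumptions, not recovered from the disclosure.

```python
from typing import Tuple


def precision_recall_f(true_positives: int,
                       false_positives: int,
                       false_negatives: int) -> Tuple[float, float, float]:
    """Return (P, R, F) from pooled counts of extracted events.

    Assumed form: F = 2 * P * R / (P + R).
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    p = true_positives / predicted if predicted else 0.0
    r = true_positives / actual if actual else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


def error_recognition_cost(p_miss: float, p_fa: float,
                           c_miss: float, c_fa: float, l_tar: float) -> float:
    """Assumed form: C = Cmiss * Pmiss * Ltar + Cfa * Pfa * (1 - Ltar)."""
    return c_miss * p_miss * l_tar + c_fa * p_fa * (1.0 - l_tar)
```

  • Under these assumptions a lower C and a higher F both indicate a better extraction algorithm; the constants passed for Cmiss, Cfa and Ltar would be chosen per application, as the text notes for Ltar.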
  • Step 203: based on the extraction events and the corresponding answer sentences, build a question-and-answer knowledge base used to generate answer sentences when a user consults a doctor.
  • The execution subject can construct the question-and-answer knowledge base according to the multiple extraction events obtained and the answer sentence corresponding to each of them, so as to generate answer sentences when a user consults a doctor.
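  • Steps 201-203 can be sketched as follows. This is an illustration only: the keyword lexicon stands in for a real event extraction algorithm (pattern matching is one of the options the text names), the answers are placeholders, and all names (`EVENT_KEYWORDS`, `extract_events`, `build_knowledge_base`) are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword lexicon standing in for a real event extraction model.
EVENT_KEYWORDS: Dict[str, str] = {
    "skin": "skin disease",
    "high pressure": "high pressure",
    "low pressure": "low pressure",
}


def extract_events(sentence: str) -> List[str]:
    """Step 202 (pattern-matching stand-in): events whose keyword occurs."""
    text = sentence.lower()
    return [event for keyword, event in EVENT_KEYWORDS.items() if keyword in text]


def build_knowledge_base(history: List[Tuple[str, str]]) -> Dict[str, str]:
    """Step 203: map each extraction event to its corresponding answer."""
    knowledge_base: Dict[str, str] = {}
    for inquiry, answer in history:
        for event in extract_events(inquiry):
            # Keep the first answer seen for an event (an assumed policy).
            knowledge_base.setdefault(event, answer)
    return knowledge_base


# Step 201: historical (inquiry, answer) pairs; the answers are placeholders.
history = [
    ("How to treat skin diseases?", "placeholder answer about skin disease"),
    ("Is my high pressure 120 normal?", "placeholder answer about high pressure"),
]
kb = build_knowledge_base(history)
```

  • A flat event-to-answer dictionary is the simplest structure that satisfies step 203; the disclosure does not prescribe a storage format.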
  • FIG. 3 is a schematic diagram of an application scenario of the knowledge base construction method according to this embodiment.
  • The execution subject 301 obtains the historical medical inquiry sentences of multiple users.
  • For example, the historical medical inquiry sentences 304 input by user 302 via terminal device 303 are "How to treat skin diseases?" and "Why is the skin uncomfortable?"; the historical medical inquiry sentences 307 input by user 305 via terminal device 306 are "Is my high pressure 120 normal?" and "Is my low pressure 70 normal?".
  • The execution subject performs event extraction 308 on the above historical medical inquiry sentences and obtains extraction events 309: extraction event 1 "skin disease, treatment", extraction event 2 "high pressure", and extraction event 3 "low pressure".
  • The answer sentences corresponding to each extraction event are obtained, and a question-and-answer knowledge base 310 is constructed based on the extraction events and the corresponding answer sentences, so as to generate answer sentences when a user consults a doctor.
  • With further reference to FIG. 4, a flow 400 of another embodiment of the knowledge base construction method is shown.
  • The process 400 of the knowledge base construction method in this embodiment may include the following steps 401-404.
  • Step 401: acquire historical medical inquiry sentences of multiple users.
  • For implementation details and technical effects of step 401, reference may be made to the description of step 201; details are not repeated here.
  • Step 402: perform event extraction on the historical medical inquiry sentences to obtain multiple extraction events.
  • For implementation details and technical effects of step 402, reference may be made to the description of step 202; details are not repeated here.
  • Step 403: cluster the extraction events to obtain multiple clustering events.
  • The execution subject may use a clustering algorithm to cluster the multiple extraction events to obtain multiple clustering events.
  • The clustering algorithm can be a clustering algorithm from existing or future-developed technology, for example, the K-means algorithm, the Single-Pass incremental clustering algorithm, or the HAC (Hierarchical Agglomerative Clustering) algorithm, which is not limited in the present disclosure.
  • The clustering process is governed by two quantities. Sim(Ek, E1) is the inter-class similarity between Ek and E1, calculated with the cosine similarity formula; Thread is a threshold that specifies the end condition of clustering. Because the present disclosure mainly clusters extraction events, an inter-class similarity threshold Thread of 0.2 is a reasonable choice.
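  • The step-by-step clustering procedure is not reproduced in this text. The following is a minimal sketch of agglomerative clustering that stops once the best inter-class similarity falls below the threshold 0.2. It assumes that events are represented as bag-of-words vectors and that inter-class similarity is the cosine, (a · b) / (|a| |b|), between cluster centroids; the representation, the centroid linkage, and the names `cosine`, `centroid` and `hac` are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(members: List[Counter]) -> Dict[str, float]:
    """Mean term vector of a cluster (assumed inter-class representation)."""
    total: Counter = Counter()
    for vec in members:
        total.update(vec)
    return {term: count / len(members) for term, count in total.items()}


def hac(events: List[str], thread: float = 0.2) -> List[List[str]]:
    """Merge the most similar pair of clusters until Sim < thread."""
    clusters = [[e] for e in events]
    vectors = [[Counter(e.lower().split())] for e in events]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(centroid(vectors[i]), centroid(vectors[j]))
                if sim > best:
                    best, pair = sim, (i, j)
        if best < thread:  # end condition of clustering
            break
        i, j = pair  # j > i, so popping j leaves index i valid
        clusters[i] += clusters.pop(j)
        vectors[i] += vectors.pop(j)
    return clusters
```

  • For example, `hac(["high pressure", "low pressure", "skin disease treatment"])` groups the two pressure events (cosine 0.5) and leaves the skin disease event in its own cluster (cosine 0 to the merged cluster).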
  • Step 404: construct a question-and-answer knowledge base based on the clustering events, the extraction events and the corresponding answer sentences.
  • The execution subject can construct the question-and-answer knowledge base according to the multiple clustering events obtained from the multiple extraction events, the multiple extraction events themselves, and the answer sentence corresponding to each extraction event, for use in generating answer sentences when a user consults a doctor.
  • For example, the multiple extraction events are extraction event 1 "how to treat skin diseases?" and extraction event 2 "how to treat stomach diseases?". The execution subject clusters extraction event 1 to obtain clustering event 1 "skin disease consultation", and clusters extraction event 2 to obtain clustering event 2 "stomach disease consultation". Further, the execution subject can build a question-and-answer knowledge base based on clustering event 1, clustering event 2, extraction event 1 and its corresponding answer sentence, and extraction event 2 and its corresponding answer sentence.
  • Compared with the embodiment of FIG. 2, the process 400 in this embodiment highlights clustering the extraction events to obtain multiple clustering events and constructing the question-and-answer knowledge base from the clustering events, the extraction events and the corresponding answer sentences. This helps provide users with treatment plans better suited to their own conditions and improves the quality of each question-and-answer entry in the knowledge base.
  • With further reference to FIG. 5, a flow 500 of an embodiment of a method for generating an answer sentence according to the present disclosure is shown.
  • The method for generating an answer sentence includes the following steps 501-504.
  • Step 501: acquire the target medical inquiry sentences of the target user within a historical preset time period.
  • the execution subject can obtain the target medical inquiry sentences of the target user within the historical preset time period in a wired or wireless manner.
  • the wireless connection method may include but not limited to 3G/4G connection, WiFi connection, Bluetooth connection, WiMAX connection, Zigbee connection, UWB (ultra wideband) connection, and other currently known or future wireless connection methods.
  • the target user may be any user who needs to consult a doctor.
  • The preset time period can be set according to experience and actual needs. For example, it can be a period during which the doctor is not online, such as within 3 minutes from the time the doctor goes offline, or within 2 minutes from the time the user asks the first question while the doctor is not online, which is not limited in the present disclosure.
  • the target medical questioning sentence may be one sentence or multiple sentences, which is not limited in the present disclosure.
  • Step 502: perform event extraction on the target medical inquiry sentences to obtain target extraction events.
  • After the execution subject obtains the target medical inquiry sentences within the historical preset time period, it can first merge them into a single document in chronological order, and then perform event extraction on the document to obtain the target extraction events.
  • For example, the target medical inquiry sentences of the target user acquired by the execution subject within the historical preset time period are "Is my high pressure 120 normal?" and "Is my low pressure 70 normal?". The execution subject performs event extraction on these sentences and obtains the target extraction events "high pressure" and "low pressure".
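  • The merging of inquiry sentences into one document by time can be sketched as follows. The message representation (timestamp, text), the space used as a separator, and the function name `merge_recent_inquiries` are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def merge_recent_inquiries(messages: List[Tuple[datetime, str]],
                           now: datetime,
                           window: timedelta) -> str:
    """Join the messages sent within `window` before `now`, oldest first."""
    recent = [(ts, text) for ts, text in messages if now - window <= ts <= now]
    recent.sort(key=lambda item: item[0])  # chronological order
    return " ".join(text for _, text in recent)


now = datetime(2022, 9, 6, 10, 5)
messages = [
    (datetime(2022, 9, 6, 10, 4), "Is my low pressure 70 normal?"),
    (datetime(2022, 9, 6, 10, 3), "Is my high pressure 120 normal?"),
    (datetime(2022, 9, 6, 9, 50), "Hello"),
]
document = merge_recent_inquiries(messages, now, timedelta(minutes=3))
```

  • Event extraction then runs once over `document` rather than once per sentence, which lets related sentences share context.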
  • Step 503: based on the target extraction events, search the question-and-answer knowledge base for the target answer sentence corresponding to the target medical inquiry sentences.
  • After the execution subject obtains a target extraction event, it can directly search the question-and-answer knowledge base for the answer sentence of the extraction event that corresponds to the target extraction event, and determine that answer sentence as the answer corresponding to the target medical inquiry sentence. Alternatively, it can first cluster the target extraction events to obtain target clustering events and search by cluster: first find in the knowledge base the clustering event corresponding to the target clustering event, then search under that clustering event entry for the extraction event corresponding to the target extraction event, and determine the answer sentence of that extraction event as the answer corresponding to the target medical inquiry sentence.
  • The question-and-answer knowledge base is the knowledge base obtained by the method described in the embodiment corresponding to FIG. 2 or FIG. 4, which will not be repeated here.
  • For example, the target medical inquiry sentences are "Is my high pressure 120 normal?" and "Is my low pressure 70 normal?". The execution subject performs event extraction on them and obtains the target extraction events "high pressure" and "low pressure". The execution subject can search the question-and-answer knowledge base directly for the answer sentences corresponding to "high pressure" and "low pressure". Alternatively, it can first cluster the target extraction events "high pressure" and "low pressure" to obtain the target clustering event "blood pressure standard", find the clustering event corresponding to "blood pressure standard" in the question-and-answer knowledge base, find the extraction events "high pressure" and "low pressure" under that clustering event entry, and then determine the answer sentences corresponding to "high pressure" and "low pressure".
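  • The two lookup paths of step 503 can be sketched as follows, assuming the knowledge base is stored as a nested mapping from clustering event to extraction event to answer sentence. The storage layout, the placeholder answers, and the name `find_answer` are assumptions.

```python
from typing import Dict, Optional

# Assumed layout: clustering event -> {extraction event -> answer sentence}.
ClusteredKB = Dict[str, Dict[str, str]]


def find_answer(kb: ClusteredKB,
                target_event: str,
                target_cluster: Optional[str] = None) -> Optional[str]:
    """Return the answer for target_event, or None if it is not found.

    With target_cluster, search only under that clustering event entry;
    without it, search every entry directly.
    """
    if target_cluster is not None:
        return kb.get(target_cluster, {}).get(target_event)
    for entries in kb.values():
        if target_event in entries:
            return entries[target_event]
    return None


kb: ClusteredKB = {
    "blood pressure standard": {
        "high pressure": "placeholder answer about high pressure",
        "low pressure": "placeholder answer about low pressure",
    },
}
```

  • Searching by cluster first narrows the candidate set, which matters more as the knowledge base grows; the direct search is simpler when no clustering events are available.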
  • Step 504: in response to determining that the target answer sentence is found, push the target answer sentence to the user.
  • The target answer sentence can be pushed to the user in a wired or wireless manner.
  • The method further includes: in response to determining that no target answer sentence is found, updating the question-and-answer knowledge base based on the target extraction event.
  • If the execution subject does not find the target answer sentence in the question-and-answer knowledge base, it can update the knowledge base according to the target extraction event and the corresponding answer sentence to obtain an updated question-and-answer knowledge base. In this way, the knowledge base is continuously updated and the accuracy of the generated answer sentences is improved.
  • The method further includes: in response to detecting a correction instruction for the target answer sentence, correcting the target answer sentence based on the correction instruction.
  • After the execution subject pushes the answer sentence to the user, if it detects a correction instruction for the target answer sentence, it can correct the target answer sentence according to the content indicated by the correction instruction, obtain the corrected target answer sentence, and push the corrected target answer sentence to the user.
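  • The update-on-miss and correction behaviours can be sketched as follows. The text does not say where the answer for a missed event comes from or whether a correction is written back to the knowledge base; this sketch assumes the answer is supplied by a doctor and that corrections are stored. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def answer_or_learn(kb: Dict[str, str],
                    event: str,
                    doctor_answer: Optional[str] = None) -> Tuple[Optional[str], bool]:
    """Return (answer, found). On a miss, store doctor_answer if one is given."""
    if event in kb:
        return kb[event], True
    if doctor_answer is not None:
        kb[event] = doctor_answer  # update the knowledge base on a miss
    return doctor_answer, False


def apply_correction(kb: Dict[str, str], event: str, corrected_answer: str) -> str:
    """Replace the stored answer (assumed write-back) and return it for pushing."""
    kb[event] = corrected_answer
    return corrected_answer


kb = {"high pressure": "placeholder answer A"}
```

  • With this policy the second user who asks about a previously unseen event receives the stored answer without waiting for a doctor.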
  • The present disclosure provides an embodiment of a knowledge base construction device, which corresponds to the method embodiment shown in FIG. 2, and the device can be applied to various electronic devices.
  • the knowledge base construction device 600 of this embodiment includes: an acquisition module 601 , an extraction module 602 and a construction module 603 .
  • the obtaining module 601 may be configured to obtain historical medical inquiry sentences of multiple users.
  • the extraction module 602 may be configured to perform event extraction on historical medical inquiry sentences to obtain multiple extracted events.
  • the construction module 603 can be configured to construct a question and answer knowledge base based on the extracted events and corresponding answer sentences, so as to generate answer sentences when the user asks a doctor.
  • The construction module is further configured to cluster the extraction events to obtain multiple clustering events, and to build a question-and-answer knowledge base based on the clustering events, the extraction events and the corresponding answer sentences.
  • The present disclosure provides an embodiment of a device for generating an answer sentence. This device embodiment corresponds to the method embodiment shown in FIG. 5, and the device can be applied to various electronic devices.
  • The apparatus 700 for generating answer sentences in this embodiment includes: an inquiry module 701, an acquisition module 702, a search module 703 and a push module 704.
  • The inquiry module 701 may be configured to acquire the target medical inquiry sentences of the target user within a historical preset time period.
  • the obtaining module 702 may be configured to perform event extraction on the target medical inquiry sentence to obtain target extraction events.
  • the search module 703 may be configured to search the question answer knowledge base for the target answer sentence corresponding to the target medical inquiry sentence based on the target extraction event.
  • the push module 704 may be configured to push the target answer sentence to the user in response to determining that the target answer sentence is found.
  • the apparatus further includes: an update module configured to update the question-answer knowledge base based on the target extraction event in response to determining that the target answer sentence is not found.
  • the device further includes: a correction module configured to, in response to detecting a correction instruction for the target response sentence, correct the target response sentence based on the correction instruction.
  • the present disclosure also provides an electronic device and a readable storage medium.
  • FIG. 8 is a block diagram of an electronic device 800 for the knowledge base construction method according to an embodiment of the present disclosure.
  • Electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions, are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the electronic device includes: one or more processors 801, a memory 802, and interfaces for connecting various components, including high-speed interfaces and low-speed interfaces.
  • the various components are interconnected using different buses and can be mounted on a common motherboard or otherwise as desired.
  • The processor may process instructions executed within the electronic device, including instructions stored in or on the memory, to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface).
  • In other implementations, multiple processors and/or multiple buses may be used with multiple memories, if desired.
  • Likewise, multiple electronic devices may be connected, with each device providing some of the necessary operations (e.g., as a server array, a set of blade servers, or a multi-processor system).
  • In FIG. 8, one processor 801 is taken as an example.
  • the memory 802 is a non-transitory computer-readable storage medium provided in the present disclosure.
  • the memory stores instructions executable by at least one processor, so that the at least one processor executes the knowledge base construction method provided in the present disclosure.
  • the non-transitory computer-readable storage medium of the present disclosure stores computer instructions, and the computer instructions are used to make a computer execute the knowledge base construction method provided by the present disclosure.
  • the memory 802 can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as program instructions/modules corresponding to the knowledge base construction method in the embodiments of the present disclosure (for example, Acquisition module 601, extraction module 602, construction module 603 shown in Fig. 6).
  • the processor 801 executes various functional applications and data processing of the server by running non-transitory software programs, instructions and modules stored in the memory 802, that is, implements the knowledge base construction method in the above method embodiments.
  • the memory 802 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created by use of electronic equipment for knowledge base construction, and the like.
  • the memory 802 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
  • The memory 802 may optionally include memories located remotely relative to the processor 801, and these remote memories may be connected to the electronic device for knowledge base construction through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The electronic device for the knowledge base construction method may further include: an input device 803 and an output device 804.
  • The processor 801, the memory 802, the input device 803, and the output device 804 may be connected through a bus or in other ways. In FIG. 8, connection through a bus is taken as an example.
  • The input device 803 can receive input numeric or character information. Examples include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick and other input devices.
  • The output device 804 may include a display device, an auxiliary lighting device (e.g., an LED), a tactile feedback device (e.g., a vibration motor), and the like.
  • the display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
  • Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be special-purpose or general-purpose, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • The terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device for providing machine instructions and/or data to a programmable processor (for example, magnetic disks, optical disks, memories, and programmable logic devices (PLDs)), including a machine-readable medium that receives machine instructions as machine-readable signals.
  • The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
  • To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
  • The systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
  • A computer system may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact through a communication network.
  • The relationship of client and server arises from computer programs that run on the respective computers and have a client-server relationship to each other.
  • Steps may be reordered, added, or deleted using the various forms of flow shown above.
  • For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order; no limitation is imposed herein as long as the desired result of the technical solution disclosed in the present disclosure can be achieved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

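The present disclosure discloses a knowledge base construction method and apparatus. One or more implementations of the method include: acquiring historical inquiry statements of multiple users; performing event extraction on the historical inquiry statements to obtain multiple extracted events; and constructing a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.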

Description

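Method and apparatus for knowledge base construction and response statement generation
This disclosure claims priority to Chinese patent application No. 202111127364.2, filed on September 26, 2021 and entitled "Method and apparatus for knowledge base construction and response statement generation", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, specifically to the field of artificial intelligence, and in particular to a method and apparatus for knowledge base construction and response statement generation.
Background
In medical and health applications, the doctor's assistant is occasionally unable to reply to messages in time, whereas patients generally want quick feedback on their questions.
At present, automatic response mostly relies on keyword-based information matching, which has the following drawbacks: 1. Messages sent by patients are often not phrased professionally and may contain few technical terms, so they fail to hit the lexicon and no accurate reply can be given. 2. Automatic response systems currently on the market have not built a complete knowledge base of doctor-patient interaction records; even when they return an answer, they may return several candidate answers rather than a single accurate one.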
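Summary
Embodiments of the present disclosure provide a knowledge base construction method, apparatus, device, and storage medium.
In one or more embodiments of the present disclosure, a knowledge base construction method is provided. The method includes: acquiring historical inquiry statements of multiple users; performing event extraction on the historical inquiry statements to obtain multiple extracted events; and constructing a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
In one or more embodiments of the present disclosure, a method for generating a response statement is provided. The method includes: acquiring target inquiry statements of a target user within a preset historical time period; performing event extraction on the target inquiry statements to obtain target extracted events; searching, based on the target extracted events, a question-answer knowledge base for a target response statement corresponding to the target inquiry statements, where the question-answer knowledge base is obtained by the method described in any of the above implementations; and, in response to determining that the target response statement has been found, pushing the target response statement to the user.
In one or more embodiments of the present disclosure, a knowledge base construction apparatus is provided. The apparatus includes: an acquisition module configured to acquire historical inquiry statements of multiple users; an extraction module configured to perform event extraction on the historical inquiry statements to obtain multiple extracted events; and a construction module configured to construct a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
In one or more embodiments of the present disclosure, an apparatus for generating a response statement is provided. The apparatus includes: an inquiry module configured to acquire target inquiry statements of a target user within a preset historical time period; an obtaining module configured to perform event extraction on the target inquiry statements to obtain target extracted events; a search module configured to search, based on the target extracted events, a question-answer knowledge base for a target response statement corresponding to the target inquiry statements, where the question-answer knowledge base is obtained by the method described in any of the above implementations; and a push module configured to push the target response statement to the user in response to determining that the target response statement has been found.
In one or more embodiments of the present disclosure, an electronic device is provided. The electronic device includes one or more processors and a storage apparatus storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above implementations.
In one or more embodiments of the present disclosure, a computer-readable medium is provided, on which a computer program is stored; when executed by a processor, the program implements the method described in any of the above implementations.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.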
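Brief Description of the Drawings
FIG. 1 is a diagram of an exemplary system architecture to which the present disclosure can be applied;
FIG. 2 is a flowchart of one embodiment of the knowledge base construction method according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the knowledge base construction method according to the present disclosure;
FIG. 4 is a flowchart of another embodiment of the knowledge base construction method according to the present disclosure;
FIG. 5 is a schematic diagram of one embodiment of the method for generating a response statement according to the present disclosure;
FIG. 6 is a schematic diagram of one embodiment of the knowledge base construction apparatus according to the present disclosure;
FIG. 7 is a schematic diagram of one embodiment of the apparatus for generating a response statement according to the present disclosure;
FIG. 8 is a schematic structural diagram of a computer system of a server suitable for implementing embodiments of the present disclosure.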
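Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
It should be noted that the embodiments of the present disclosure and the features in the embodiments may be combined with each other where no conflict arises. The present disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the knowledge base construction method of the present disclosure can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
The terminal devices 101, 102, and 103 interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, and 103, for example, diagnosis applications and communication applications.
The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with display screens, including but not limited to mobile phones and laptop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide a knowledge base construction service) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server that provides various services, for example: acquiring historical inquiry statements of multiple users; performing event extraction on the historical inquiry statements to obtain multiple extracted events; and constructing a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
It should be noted that the server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide a knowledge base construction service) or as a single piece of software or software module. No specific limitation is imposed here.
It should be pointed out that the knowledge base construction method provided by the embodiments of the present disclosure may be executed by the server 105, by the terminal devices 101, 102, and 103, or by the server 105 and the terminal devices 101, 102, and 103 in cooperation. Correspondingly, the parts of the knowledge base construction apparatus (for example, its units, subunits, modules, and submodules) may all be arranged in the server 105, may all be arranged in the terminal devices 101, 102, and 103, or may be distributed between the server 105 and the terminal devices 101, 102, and 103.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.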
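FIG. 2 shows a flow diagram 200 of a knowledge base construction method to which the present disclosure can be applied. In this embodiment, the knowledge base construction method includes the following steps 201 to 203.
Step 201: acquire historical inquiry statements of multiple users.
In this embodiment, the execution body (for example, the server 105 or the terminal devices 101, 102, 103 shown in FIG. 1) may acquire the historical inquiry statements of multiple users from a variety of sources, for example, chat records between users and doctors stored in the back-end database of a health application, popular-science articles from public accounts or journals, or chat records in community forums. The present disclosure does not limit this.
Step 202: perform event extraction on the historical inquiry statements to obtain multiple extracted events.
In this embodiment, after acquiring the historical inquiry statements of multiple users, the execution body may apply an event extraction algorithm to the historical inquiry statements to obtain multiple extracted events.
Here, event extraction means extracting the events a user is interested in from unstructured information and presenting them to the user in structured form. The event extraction algorithm may be any event extraction algorithm in the existing art or developed in the future, for example, an algorithm based on pattern matching, on machine learning, or on neural networks. The present disclosure does not limit this.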
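As an illustration of the pattern-matching option mentioned above, the following is a minimal sketch only. The `Event` type, the `PATTERNS` list, and the `extract_event` function are hypothetical names introduced here, and the two regular expressions are hand-written to fit the example statements used elsewhere in this description; the disclosure itself does not prescribe any particular patterns.

```python
import re
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Structured form of an inquiry: what it is about and the action asked for."""
    topic: str
    action: Optional[str]


# Hypothetical hand-written patterns, keyed to the example statements only.
PATTERNS = [
    re.compile(r"(?P<topic>\w+?)怎么(?P<action>治)"),  # e.g. "皮肤病怎么治?"
    re.compile(r"(?P<topic>高压|低压)\d+"),            # e.g. "我高压120正常吗?"
]


def extract_event(statement: str) -> Optional[Event]:
    """Return the first pattern match as an Event, or None if nothing matches."""
    for pattern in PATTERNS:
        match = pattern.search(statement)
        if match:
            groups = match.groupdict()
            # The second pattern has no "action" group, so .get() yields None.
            return Event(groups["topic"], groups.get("action"))
    return None
```

A statement that matches no pattern yields `None`, which is exactly the weakness of pure pattern matching that motivates the learned alternatives also listed above.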
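Here, the execution body may evaluate the performance of the event extraction algorithm in either of the following ways:
the micro-averaged F-value method, based on recall (denoted R) and precision (denoted P); or the detection error cost (denoted C) method, based on the miss rate (denoted L) and the false alarm rate (denoted M), where:
$$F = \frac{2 \times P \times R}{P + R}$$
$$C = C_{miss} \times L \times L_{tar} + C_{fa} \times M \times (1 - L_{tar})$$
$C_{miss}$ is the cost of one miss, $C_{fa}$ is the cost of one false alarm, and $L_{tar}$ is the prior probability that the system makes a positive decision, usually set to a constant for the specific application. These formulas show that there is no simple inverse relationship between the two evaluation methods, so an appropriate conversion should be applied when comparing the performance of two algorithms that were assessed under different evaluation methods.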
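The two evaluation measures above can be computed directly. The sketch below assumes that precision, recall, miss rate, and false alarm rate have already been measured; the function and parameter names are illustrative and do not come from the disclosure.

```python
def f_measure(precision: float, recall: float) -> float:
    """F = 2PR / (P + R); defined as 0.0 when both P and R are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_cost(
    c_miss: float,
    miss_rate: float,
    c_fa: float,
    false_alarm_rate: float,
    l_tar: float,
) -> float:
    """C = C_miss * L * L_tar + C_fa * M * (1 - L_tar)."""
    return c_miss * miss_rate * l_tar + c_fa * false_alarm_rate * (1 - l_tar)
```

A higher F indicates better extraction, whereas a lower C does, which is one reason the two scores cannot be compared without conversion.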
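Step 203: construct a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
In this embodiment, the execution body may construct the question-answer knowledge base from the acquired extracted events and the response statement corresponding to each of them, for use in generating response statements when a user makes a medical inquiry.
Continuing with FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the knowledge base construction method according to this embodiment.
In the application scenario of FIG. 3, the execution body 301 acquires historical inquiry statements of multiple users. For example, the historical inquiry statements 304 entered by user 302 via terminal device 303 are "How is a skin disease treated?" (皮肤病怎么治?) and "Why does my skin feel uncomfortable?" (为什么皮肤不舒服?); the historical inquiry statements 307 entered by user 305 via terminal device 306 are "Is my systolic pressure of 120 normal?" (我高压120正常吗?) and "Is my diastolic pressure of 70 normal?" (我低压70正常吗?). The execution body then performs event extraction 308 on these historical inquiry statements and obtains extracted events 309: extracted event 1 "skin disease, treat" (皮肤病,治), extracted event 2 "systolic pressure" (高压), and extracted event 3 "diastolic pressure" (低压). It then acquires the response statement corresponding to each extracted event and, based on the extracted events and their corresponding response statements, constructs the question-answer knowledge base 310 for use in generating response statements when a user makes a medical inquiry.
The knowledge base construction method of the present disclosure acquires historical inquiry statements of multiple users, performs event extraction on them to obtain multiple extracted events, and constructs a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry. By building a knowledge base tailored to the medical inquiry scenario, it improves the efficiency and accuracy of generating automatic response statements.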
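Steps 201 to 203 can be sketched as a small pipeline. This is only an illustration under stated assumptions: the knowledge base is modeled as a plain dictionary from extracted event to response statements, `keyword_extractor` is a hypothetical stand-in for a real event extraction algorithm, and the answer strings used with it are placeholders, since the disclosure gives no answer texts.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical keyword list standing in for a real event extractor.
KEYWORDS = ("皮肤病", "高压", "低压")


def keyword_extractor(statement: str) -> Optional[str]:
    """Toy extractor: return the first known keyword found in the statement."""
    for keyword in KEYWORDS:
        if keyword in statement:
            return keyword
    return None


def build_knowledge_base(
    qa_pairs: Iterable[Tuple[str, str]],
    extract_event: Callable[[str], Optional[str]],
) -> Dict[str, List[str]]:
    """Map each extracted event to the distinct answers given for it."""
    kb: Dict[str, List[str]] = defaultdict(list)
    for question, answer in qa_pairs:
        event = extract_event(question)
        if event is None:
            continue  # no event could be extracted; skip this record
        if answer not in kb[event]:
            kb[event].append(answer)
    return dict(kb)
```

Passing the extractor in as a parameter keeps the construction step independent of whichever extraction algorithm is chosen.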
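With further reference to FIG. 4, a flow 400 of yet another embodiment of the knowledge base construction method is shown. The flow 400 of the knowledge base construction method of this embodiment may include the following steps 401 to 404.
Step 401: acquire historical inquiry statements of multiple users.
In this embodiment, for the implementation details and technical effects of step 401, refer to the description of step 201; they are not repeated here.
Step 402: perform event extraction on the historical inquiry statements to obtain multiple extracted events.
In this embodiment, for the implementation details and technical effects of step 402, refer to the description of step 202; they are not repeated here.
Step 403: cluster the extracted events to obtain multiple clustered events.
In this embodiment, the execution body may apply a clustering algorithm to the multiple extracted events to obtain multiple clustered events.
Here, the clustering algorithm may be any clustering algorithm in the existing art or developed in the future, for example, the K-means algorithm, the Single-Pass incremental clustering algorithm, or the HAC (Hierarchical Agglomerative Clustering) algorithm. The present disclosure does not limit this.
Specifically, for a set of extracted events S = {S1, S2, S3, ..., Sn}, the clustering process is as follows:
(1) Initialize the set of classes of extracted events E = {E1, E2, E3, ..., En}, where Ei = Si.
(2) If Sim(Ek, El) < Thread holds for every pair of classes Ek, El, stop the clustering process; otherwise, go to (3). Here Sim(Ek, El) is the inter-class similarity between Ek and El, computed by the following formula (the cosine similarity formula), and Thread is a threshold that specifies the termination condition of clustering. Because the clustering here targets extracted events, an inter-class similarity threshold Thread of 0.2 is considered reasonable.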
Figure PCTCN2022117228-appb-000001
(3) Find the two classes Ei, Ej with the largest inter-class similarity Sim(Ei, Ej), merge them, update the set E, and go to (2).
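The three-step procedure above is a form of agglomerative clustering and can be sketched as follows. Several choices here are assumptions rather than part of the disclosure: each event is vectorized as a bag of characters, a class is represented by the sum of its members' vectors, and Sim is taken as the cosine similarity between two class vectors (the original formula image is not available in this text). The default threshold of 0.2 follows the value suggested above.

```python
import math
from collections import Counter
from itertools import combinations
from typing import List


def vectorize(event: str) -> Counter:
    """Assumed representation: character counts of the event text."""
    return Counter(event)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_events(events: List[str], threshold: float = 0.2) -> List[List[str]]:
    # Step (1): every event starts as its own class.
    clusters = [[event] for event in events]
    vectors = [vectorize(event) for event in events]
    while len(clusters) > 1:
        # Step (3): find the most similar pair of classes (i < j).
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cosine(vectors[pair[0]], vectors[pair[1]]),
        )
        # Step (2): stop once even the best pair falls below the threshold.
        if cosine(vectors[i], vectors[j]) < threshold:
            break
        clusters[i].extend(clusters[j])
        vectors[i] = vectors[i] + vectors[j]
        del clusters[j]
        del vectors[j]
    return clusters
```

With the example events, "高压" and "低压" share one character and merge, while "皮肤病" shares none with them and stays in its own class.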
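Step 404: construct a question-answer knowledge base based on the clustered events, the extracted events, and their corresponding response statements.
In this embodiment, the execution body may construct the question-answer knowledge base from the clustered events corresponding to the acquired extracted events, the extracted events themselves, and the response statement corresponding to each extracted event, for use in generating response statements when a user makes a medical inquiry.
Specifically, suppose the extracted events are extracted event 1 "How is a skin disease treated?" (皮肤病怎么治?) and extracted event 2 "How is a stomach disease treated?" (胃病怎么治?). The execution body clusters extracted event 1 to obtain clustered event 1 "skin disease inquiry" (皮肤病问诊), and clusters extracted event 2 to obtain clustered event 2 "stomach disease inquiry" (胃病问诊). The execution body may then construct the question-answer knowledge base based on clustered event 1, clustered event 2, extracted event 1 and its corresponding response statement, and extracted event 2 and its corresponding response statement.
Compared with the embodiment corresponding to FIG. 2, the flow 400 of the knowledge base construction method in this embodiment highlights clustering the extracted events to obtain multiple clustered events, and constructing the question-answer knowledge base based on the clustered events, the extracted events, and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry. This helps provide users with treatment plans better suited to their own conditions, strengthens the association between the question-answer pairs in the knowledge base, and further improves the efficiency of finding response statements in the knowledge base.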
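Continuing with FIG. 5, a flow 500 of one embodiment of the method for generating a response statement according to the present disclosure is shown. The method for generating a response statement includes the following steps 501 to 504.
Step 501: acquire target inquiry statements of a target user within a preset historical time period.
In this embodiment, the execution body may acquire the target inquiry statements of the target user within the preset historical time period over a wired or wireless connection. Wireless connections may include, but are not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, and UWB (ultra wideband) connections, as well as other wireless connection methods now known or developed in the future.
Here, the target user may be any user who needs a medical inquiry. The preset time period may be set according to experience and actual needs, for example, a preset period during which the doctor is offline, such as within 3 minutes of the moment the doctor goes offline, or within 2 minutes of the moment the user asks the first inquiry question while the doctor is offline. The present disclosure does not limit this.
The target inquiry statements may consist of one sentence or multiple sentences; the present disclosure does not limit this.
Step 502: perform event extraction on the target inquiry statements to obtain target extracted events.
In this embodiment, after acquiring the target inquiry statements within the preset historical time period, the execution body may first merge the target inquiry statements into a document in chronological order and then perform event extraction on the document to obtain the target extracted events.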
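Merging the statements of a time window into one document can be sketched as below. The message format (a timestamp paired with text), the function name, and the newline separator are assumptions made for illustration; the window length would be whatever preset period applies, such as the 3-minute example given earlier.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def merge_statements(
    messages: Iterable[Tuple[datetime, str]],
    window_start: datetime,
    window: timedelta,
) -> str:
    """Join the messages that fall inside the window, ordered by time."""
    window_end = window_start + window
    in_window = [
        (timestamp, text)
        for timestamp, text in messages
        if window_start <= timestamp < window_end
    ]
    in_window.sort(key=lambda item: item[0])
    return "\n".join(text for _, text in in_window)
```

The resulting document is then passed to event extraction as a single input, so that events spread over several short messages are extracted together.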
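Specifically, suppose the target inquiry statements acquired by the execution body for the target user within the preset historical time period are "Is my systolic pressure of 120 normal?" (我高压120正常吗?) and "Is my diastolic pressure of 70 normal?" (我低压70正常吗?). The execution body performs event extraction on the target inquiry statements and obtains the target extracted events "systolic pressure" (高压) and "diastolic pressure" (低压).
Step 503: based on the target extracted events, search the question-answer knowledge base for a target response statement corresponding to the target inquiry statements.
In this embodiment, after obtaining a target extracted event, the execution body may search the question-answer knowledge base directly for the extracted event that matches the target extracted event and take that event's response statement as the answer to the target inquiry statements. Alternatively, it may first cluster the target extracted events to obtain a target clustered event and search the knowledge base by cluster: it first finds the clustered event that matches the target clustered event, then finds, under that clustered event's entry, the extracted event that matches the target extracted event, and takes that extracted event's response statement as the answer to the target inquiry statements.
The question-answer knowledge base is the knowledge base obtained by the method described in the embodiment corresponding to FIG. 2 or FIG. 4, which is not repeated here.
Specifically, suppose the target inquiry statements are "Is my systolic pressure of 120 normal?" and "Is my diastolic pressure of 70 normal?". The execution body performs event extraction on them and obtains the target extracted events "systolic pressure" (高压) and "diastolic pressure" (低压). The execution body may look up the response statements for "systolic pressure" and "diastolic pressure" in the question-answer knowledge base directly. Alternatively, it may first cluster the two target extracted events to obtain the target clustered event "blood pressure standard" (血压标准), find the matching clustered event in the knowledge base, then find under that clustered event's entry the extracted events "systolic pressure" and "diastolic pressure" that match the target extracted events, and thereby determine their response statements.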
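The two lookup strategies described in step 503 can be contrasted in a short sketch. It assumes the clustered knowledge base is stored as a two-level dictionary (clustered event, then extracted event, then response statements); this layout and the function names are illustrative choices, not something the disclosure specifies.

```python
from typing import Dict, List, Optional

# Assumed layout: clustered event -> extracted event -> response statements.
ClusteredKB = Dict[str, Dict[str, List[str]]]


def lookup_direct(kb: ClusteredKB, event: str) -> Optional[List[str]]:
    """Scan every cluster entry for the extracted event."""
    for events in kb.values():
        if event in events:
            return events[event]
    return None


def lookup_by_cluster(
    kb: ClusteredKB, cluster_label: str, event: str
) -> Optional[List[str]]:
    """Go to the cluster entry first, then look for the event inside it."""
    return kb.get(cluster_label, {}).get(event)
```

The cluster-first variant inspects only one cluster entry, which is the efficiency gain the clustered knowledge base is intended to provide; the direct variant may have to scan every entry.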
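Step 504: in response to determining that the target response statement has been found, push the target response statement to the user.
In this embodiment, if the execution body finds the target response statement in the question-answer knowledge base, it may push the target response statement to the user over a wired or wireless connection.
In some optional implementations, the method further includes: in response to determining that the target response statement has not been found, updating the question-answer knowledge base based on the target extracted event.
In this implementation, if the execution body does not find the target response statement in the question-answer knowledge base, it may update the knowledge base according to the target extracted event and its corresponding response statement, obtaining an updated question-answer knowledge base.
By updating the question-answer knowledge base based on the target extracted event whenever no target response statement is found, this implementation keeps the knowledge base continuously up to date and improves the accuracy of the generated response statements.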
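One way to realize the update-on-miss behavior is sketched below. The disclosure does not say where the missing response statement comes from, so this sketch assumes it is supplied later (for example, by a doctor) and keeps unanswered events in a pending list until then; `respond_or_queue`, `add_answer`, and the flat dictionary layout are all hypothetical.

```python
from typing import Dict, List, Optional


def respond_or_queue(
    kb: Dict[str, List[str]], pending: List[str], event: str
) -> Optional[str]:
    """Return a stored answer, or record the event as unanswered."""
    answers = kb.get(event)
    if answers:
        return answers[0]
    if event not in pending:
        pending.append(event)  # remember the miss so the KB can be updated
    return None


def add_answer(
    kb: Dict[str, List[str]], pending: List[str], event: str, answer: str
) -> None:
    """Update the knowledge base once an answer for the event is available."""
    kb.setdefault(event, []).append(answer)
    if event in pending:
        pending.remove(event)
```

After `add_answer` runs, the same inquiry is answered from the knowledge base on its next occurrence.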
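In some optional implementations, the method further includes: in response to detecting a correction instruction for the target response statement, correcting the target response statement based on the correction instruction.
In this implementation, after pushing the response statement to the user, if the execution body detects a correction instruction for the target response statement, it may correct the target response statement according to the content indicated by the correction instruction, obtain the corrected target response statement, and push the corrected target response statement to the user.
By correcting the target response statement based on a detected correction instruction, this implementation further improves the accuracy of the pushed response statements.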
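A correction step could look like the sketch below. The shape of the correction instruction is an assumption: here it carries the event concerned and the replacement text. Writing the corrected text back into the knowledge base is a further assumption, since the disclosure only states that the corrected statement is pushed to the user.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Correction:
    """Hypothetical correction instruction: which event, and the new answer."""
    event: str
    corrected_answer: str


def apply_correction(kb: Dict[str, str], correction: Correction) -> str:
    """Return the corrected answer to push; also store it (an assumption)."""
    kb[correction.event] = correction.corrected_answer
    return correction.corrected_answer
```

The return value is the statement that would be pushed to the user in place of the original one.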
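With further reference to FIG. 6, as an implementation of the methods shown in the figures above, the present disclosure provides an embodiment of a knowledge base construction apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
As shown in FIG. 6, the knowledge base construction apparatus 600 of this embodiment includes an acquisition module 601, an extraction module 602, and a construction module 603.
The acquisition module 601 may be configured to acquire historical inquiry statements of multiple users.
The extraction module 602 may be configured to perform event extraction on the historical inquiry statements to obtain multiple extracted events.
The construction module 603 may be configured to construct a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
In some optional implementations of this embodiment, the construction module is further configured to cluster the extracted events to obtain multiple clustered events, and to construct the question-answer knowledge base based on the clustered events, the extracted events, and their corresponding response statements.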
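With further reference to FIG. 7, as an implementation of the methods shown in the figures above, the present disclosure provides an embodiment of an apparatus for generating a response statement. This apparatus embodiment corresponds to the method embodiment shown in FIG. 5, and the apparatus can be applied in various electronic devices.
As shown in FIG. 7, the apparatus 700 for generating a response statement of this embodiment includes an inquiry module 701, an obtaining module 702, a search module 703, and a push module 704.
The inquiry module 701 may be configured to acquire target inquiry statements of a target user within a preset historical time period.
The obtaining module 702 may be configured to perform event extraction on the target inquiry statements to obtain target extracted events.
The search module 703 may be configured to search, based on the target extracted events, the question-answer knowledge base for a target response statement corresponding to the target inquiry statements.
The push module 704 may be configured to push the target response statement to the user in response to determining that the target response statement has been found.
In some optional implementations of this embodiment, the apparatus further includes an update module configured to update the question-answer knowledge base based on the target extracted event in response to determining that the target response statement has not been found.
In some optional implementations of this embodiment, the apparatus further includes a correction module configured to correct the target response statement based on a correction instruction in response to detecting the correction instruction for the target response statement.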
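According to embodiments of the present disclosure, the present disclosure further provides an electronic device and a readable storage medium.
FIG. 8 is a block diagram of an electronic device 800 for the knowledge base construction method according to an embodiment of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the present disclosure described and/or claimed herein.
As shown in FIG. 8, the electronic device includes one or more processors 801, a memory 802, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected using different buses and may be mounted on a common motherboard or in other ways as needed. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if needed. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). One processor 801 is taken as an example in FIG. 8.
The memory 802 is the non-transitory computer-readable storage medium provided by the present disclosure. The memory stores instructions executable by at least one processor, so that the at least one processor executes the knowledge base construction method provided by the present disclosure. The non-transitory computer-readable storage medium of the present disclosure stores computer instructions that cause a computer to execute the knowledge base construction method provided by the present disclosure.
As a non-transitory computer-readable storage medium, the memory 802 may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the knowledge base construction method in the embodiments of the present disclosure (for example, the acquisition module 601, the extraction module 602, and the construction module 603 shown in FIG. 6). By running the non-transitory software programs, instructions, and modules stored in the memory 802, the processor 801 executes the various functional applications and data processing of the server, that is, implements the knowledge base construction method in the above method embodiments.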
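The memory 802 may include a program storage area and a data storage area. The program storage area may store an operating system and an application required by at least one function; the data storage area may store data created through use of the electronic device for knowledge base construction, and the like. In addition, the memory 802 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 802 may optionally include memories located remotely from the processor 801, and these remote memories may be connected through a network to the electronic device for knowledge base construction. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the knowledge base construction method may further include an input apparatus 803 and an output apparatus 804. The processor 801, the memory 802, the input apparatus 803, and the output apparatus 804 may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 8.
The input apparatus 803 can receive input numeric or character information; examples include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick, and other input apparatuses. The output apparatus 804 may include a display device, an auxiliary lighting apparatus (for example, an LED), a tactile feedback apparatus (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.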
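Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be special-purpose or general-purpose, and can receive data and instructions from a storage system, at least one input apparatus, and at least one output apparatus, and transmit data and instructions to the storage system, the at least one input apparatus, and the at least one output apparatus.
These computing programs (also called programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, magnetic disks, optical disks, memories, and programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described here can be implemented on a computer having a display apparatus (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing apparatus (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of apparatuses can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
The systems and techniques described here can be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described here), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. Clients and servers are generally remote from each other and typically interact through a communication network. The relationship of client and server arises from computer programs that run on the respective computers and have a client-server relationship to each other.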
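The technical solutions according to the embodiments of the present disclosure help enable timely intervention in users' adverse health conditions.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order; no limitation is imposed herein as long as the desired result of the technical solutions disclosed in the present disclosure can be achieved.
The specific implementations above do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.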

Claims (12)

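  1. A knowledge base construction method, the method comprising:
    acquiring historical inquiry statements of multiple users;
    performing event extraction on the historical inquiry statements to obtain multiple extracted events;
    constructing a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
  2. The method according to claim 1, wherein constructing the question-answer knowledge base based on the extracted events and their corresponding response statements comprises:
    clustering the extracted events to obtain multiple clustered events;
    constructing the question-answer knowledge base based on the clustered events, the extracted events, and their corresponding response statements.
  3. A method for generating a response statement, the method comprising:
    acquiring target inquiry statements of a target user within a preset historical time period;
    performing event extraction on the target inquiry statements to obtain target extracted events;
    searching, based on the target extracted events, a question-answer knowledge base for a target response statement corresponding to the target inquiry statements, wherein the question-answer knowledge base is obtained by the method according to any one of claims 1-2;
    in response to determining that the target response statement has been found, pushing the target response statement to the user.
  4. The method according to claim 3, the method further comprising:
    in response to determining that the target response statement has not been found, updating the question-answer knowledge base based on the target extracted events.
  5. The method according to claim 3, the method further comprising:
    in response to detecting a correction instruction for the target response statement, correcting the target response statement based on the correction instruction.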
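  6. A knowledge base construction apparatus, the apparatus comprising:
    an acquisition module configured to acquire historical inquiry statements of multiple users;
    an extraction module configured to perform event extraction on the historical inquiry statements to obtain multiple extracted events;
    a construction module configured to construct a question-answer knowledge base based on the extracted events and their corresponding response statements, for use in generating response statements when a user makes a medical inquiry.
  7. The apparatus according to claim 6, wherein the construction module is further configured to:
    cluster the extracted events to obtain multiple clustered events;
    construct the question-answer knowledge base based on the clustered events, the extracted events, and their corresponding response statements.
  8. An apparatus for generating a response statement, the apparatus comprising:
    an inquiry module configured to acquire target inquiry statements of a target user within a preset historical time period;
    an obtaining module configured to perform event extraction on the target inquiry statements to obtain target extracted events;
    a search module configured to search, based on the target extracted events, a question-answer knowledge base for a target response statement corresponding to the target inquiry statements, wherein the question-answer knowledge base is obtained by the method according to any one of claims 1-2;
    a push module configured to push the target response statement to the user in response to determining that the target response statement has been found.
  9. The apparatus according to claim 8, the apparatus further comprising:
    an update module configured to update the question-answer knowledge base based on the target extracted events in response to determining that the target response statement has not been found.
  10. The apparatus according to claim 8, the apparatus further comprising:
    a correction module configured to correct the target response statement based on a correction instruction in response to detecting the correction instruction for the target response statement.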
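  11. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the method according to any one of claims 1-5.
  12. A non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the method according to any one of claims 1-5.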
PCT/CN2022/117228 2021-09-26 2022-09-06 Method and apparatus for knowledge base construction and response statement generation WO2023045752A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111127364.2 2021-09-26
CN202111127364.2A CN113836284A (zh) 2021-09-26 2021-09-26 Method and apparatus for knowledge base construction and response statement generation

Publications (1)

Publication Number Publication Date
WO2023045752A1 true WO2023045752A1 (zh) 2023-03-30

Family

ID=78970295

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/117228 WO2023045752A1 (zh) 2021-09-26 2022-09-06 Method and apparatus for knowledge base construction and response statement generation

Country Status (2)

Country Link
CN (1) CN113836284A (zh)
WO (1) WO2023045752A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113836284A (zh) * 2021-09-26 2021-12-24 北京京东拓先科技有限公司 Method and apparatus for knowledge base construction and response statement generation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107993724A (zh) * 2017-11-09 2018-05-04 易保互联医疗信息科技(北京)有限公司 Method and apparatus for medical intelligent question-answering data processing
US20180239811A1 (en) * 2017-02-21 2018-08-23 International Business Machines Corporation Question-answer pair generation
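CN110263142A (zh) * 2019-06-27 2019-09-20 北京百度网讯科技有限公司 Method and apparatus for outputting information
CN111221954A (zh) * 2020-01-09 2020-06-02 珠海格力电器股份有限公司 Method, apparatus, storage medium, and terminal for constructing a home appliance repair question-answer library
US10719667B1 (en) * 2015-06-30 2020-07-21 Google Llc Providing a natural language based application program interface
CN111930908A (zh) * 2020-08-10 2020-11-13 腾讯科技(深圳)有限公司 Artificial-intelligence-based answer recognition method and apparatus, medium, and electronic device
CN113836284A (zh) * 2021-09-26 2021-12-24 北京京东拓先科技有限公司 Method and apparatus for knowledge base construction and response statement generation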

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
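CN107358052A (zh) * 2017-07-18 2017-11-17 广州有宠网络科技股份有限公司 System and method for artificial-intelligence consultation on pet diseases
CN108491486B (zh) * 2018-03-14 2020-11-24 东软集团股份有限公司 Method, apparatus, terminal device, and storage medium for simulating patient consultation dialogue
CN110309377B (zh) * 2018-03-22 2023-08-15 阿里巴巴集团控股有限公司 Methods and apparatuses for semantic normalization, question pattern generation, and response determination
CN109036588A (zh) * 2018-09-10 2018-12-18 百度在线网络技术(北京)有限公司 Method, apparatus, device, and computer-readable medium for online consultation
CN111159363A (zh) * 2018-11-06 2020-05-15 航天信息股份有限公司 Knowledge-base-based method and apparatus for determining answers to questions
CN110188248A (zh) * 2019-05-28 2019-08-30 新华网股份有限公司 Data processing method, apparatus, and electronic device based on a news question-answering interaction system
CN111538816B (zh) * 2020-07-09 2020-10-20 平安国际智慧城市科技股份有限公司 Question-answering method, apparatus, electronic device, and medium based on AI recognition
CN112559656A (zh) * 2020-12-09 2021-03-26 河海大学 Method for constructing an event logic graph based on hydrological events
CN112863630A (zh) * 2021-01-20 2021-05-28 中国科学院自动化研究所 Personalized precision medicine question-answering system based on data and knowledge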

Also Published As

Publication number Publication date
CN113836284A (zh) 2021-12-24

Similar Documents

Publication Publication Date Title
US20210375479A1 (en) Method and apparatus for processing electronic medical record data, device and medium
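CN109906449B (zh) Search method and apparatus
US11735315B2 (en) Method, apparatus, and device for fusing features applied to small target detection, and storage medium
WO2021109787A1 (zh) Synonym mining method, synonym dictionary application method, medical synonym mining method, medical synonym dictionary application method, synonym mining apparatus, and storage medium
CN112329964A (zh) Method, apparatus, device, and storage medium for pushing information
CN107403067A (zh) Intelligent triage server, terminal, and system based on a medical knowledge base
CN112347769B (zh) Method, apparatus, electronic device, and storage medium for generating an entity recognition model
CN111341456B (zh) Method, apparatus, and readable storage medium for generating a diabetic foot knowledge graph
KR102424085B1 (ko) Machine-assisted dialogue system, and apparatus and method for medical condition inquiry
CN111460116B (zh) Question-answering method, question-answering system, electronic device, and storage medium
CN112509690B (zh) Method, apparatus, device, and storage medium for quality control
US10521433B2 (en) Domain specific language to query medical data
CN111460095B (zh) Question-answer processing method, apparatus, electronic device, and storage medium
WO2023178971A1 (zh) Internet registration method, apparatus, device, and storage medium for medical treatment
WO2023045752A1 (zh) Method and apparatus for knowledge base construction and response statement generation
US20230018489A1 (en) Method for acquiring structured question-answering model, question-answering method and corresponding apparatus
WO2021127012A1 (en) Unsupervised taxonomy extraction from medical clinical trials
CN112863701A (zh) Contactless intelligent consultation system
CN112115697A (zh) Method, apparatus, server, and storage medium for determining target text
CN112509691B (zh) Differential diagnosis prompting method, apparatus, electronic device, and storage medium
CN110502625A (zh) Medical question answering method, apparatus, device, and computer-readable storage medium
CN111785340B (zh) Medical data processing method, apparatus, device, and storage medium
US11600389B2 (en) Question generating method and apparatus, inquiring diagnosis system, and computer readable storage medium
CN116955646A (zh) Knowledge graph generation method and apparatus, storage medium, and electronic device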
EP4224481A1 (en) Currency transaction method, system and apparatus based on blockchain

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22871801

Country of ref document: EP

Kind code of ref document: A1