WO2019080419A1 - Method for constructing a standard knowledge base, electronic device, and storage medium - Google Patents

Method for constructing a standard knowledge base, electronic device, and storage medium

Info

Publication number
WO2019080419A1
Authority
WO
WIPO (PCT)
Prior art keywords
answer
question
keyword
meaning
word
Prior art date
Application number
PCT/CN2018/076484
Other languages
English (en)
Chinese (zh)
Inventor
卢川
高祎璠
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019080419A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems

Definitions

  • The answers of intelligent customer service robots are all set in advance and are usually saved in the basic database as pairs, with one question corresponding to one answer. To enable the robot to answer intelligently, as many question-answer pairs as possible must therefore be maintained when the basic database is constructed, so maintaining the basic database is a large task that consumes considerable labor.
  • A method for constructing a standard knowledge base, comprising the steps of: S1, constructing an answer file: collecting answers and parsing them into a single file in a uniform format, the file being a table or a text.
  • S4, forming question-answer pairs: according to the generation rule for question-answer pairs, the content at the corresponding position in the answer file is embedded into the corresponding change item of the question template to generate a question, the content at the corresponding position in the answer file is obtained to generate an answer, and the generated question and answer are linked and saved as a question-answer pair.
  • FIG. 5 is a flowchart of the question template in Embodiment 2 of the method of the present application.
  • FIG. 8 is a schematic diagram of the answer file in table form in the method of the present application.
  • The electronic device 2 is an apparatus capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are set or stored in advance.
  • The electronic device 2 can be a smartphone, a tablet, a laptop, a desktop computer, a rack server, a blade server, or a tower server (including a stand-alone server or a server cluster composed of multiple servers).
  • The electronic device 2 includes at least, but is not limited to, a memory 21, a processor 22, a network interface 23, and a standard knowledge base construction system 20, which can communicate with each other via a system bus. Among them:
  • FIG. 2 shows a schematic diagram of the program modules of an embodiment of the standard knowledge base construction system 20.
  • The standard knowledge base construction system 20 can be divided into a file receiving module 201 and a template setting module 202.
  • S11, collecting answers;
  • S12, splitting each answer into a word sequence consisting of a plurality of keywords;
  • S13, obtaining the two meaning keywords that represent the meaning of the answer in each word sequence;
  • S14, de-duplicating and classifying the meaning keywords;
  • S15, using one class of meaning keywords as the first row of the table and the other class as the first column, with the cell where the first row and the first column intersect left blank;
  • S16, obtaining, in each word sequence, the numerical keyword that represents the value of the answer;
  • S17, filling each numerical keyword into the cell where the row and the column of the two meaning keywords of its word sequence intersect.
  • a question is generated at the change positions and temporarily stored;
  • S42, obtaining the numerical keyword in the cell where the row and the column of the two meaning keywords of the generated question intersect, and temporarily storing it as the answer;
  • S43, associating the temporarily stored question and answer with each other;
  • S44, determining whether the meaning keyword at the current position corresponding to the first change item is the last word in the first row or first column in which it is located; if yes, performing step S46, otherwise performing step S45;
  • S45, shifting the current position corresponding to the first change item by one along the first row or first column in which its meaning keyword is located, resetting the current position corresponding to the first change item, and performing step S41;
  • S46, determining whether the meaning keyword at the current position corresponding to the second change item is the last word in the first column or first row in which it is located; if yes, performing step S48, otherwise performing step S47;
  • S47, the current position corresponding to the second change item is sequentially shifted
  • The two meaning keywords in the second cell of the first row and the third cell of the first column of the table are embedded at the positions of the two change items in the aforementioned question template, generating the question "What is the income of health insurance in the first quarter?"
  • The corresponding answer is the value "5246286" in the cell where the second column and the third row intersect. This continues until the meaning keywords in the second cell of the first row and the last cell of the first column have been obtained; then the meaning keyword in the third cell of the first row is taken, and the meaning keywords in each cell of the first column are obtained in turn.
  • The question-answer pairs are saved to the standard knowledge base.
  • S40', obtaining the position of the first word sequence separator in the text as the position of the current word sequence separator, and the positions of the keyword separators before the first word sequence separator as the positions of the current keyword separators;
  • S41', according to the generation rule for question-answer pairs, obtaining the meaning keywords before each current keyword separator, embedding them into the change items of the question template to generate a question, and temporarily storing the question;
  • S42', obtaining the numerical keyword before the current word sequence separator;
  • S43', saving the associated question and answer;
  • S44', determining whether the current word sequence separator is the last word sequence separator in the answer file; if yes, performing step S47', otherwise performing step S45';
  • S45', shifting the position of the current word sequence separator to the next one in sequence and resetting the position of the current word sequence separator;
  • S46', resetting the positions of the current keyword separators to the positions of the keyword separators before the current word sequence separator, and performing step S41';
  • S47', ending.
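
The table-based steps listed above (S11 to S17 for building the answer table, then S41 onward for walking it) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `WORD_SEQUENCES`, `TEMPLATE`, `build_table` and `generate_pairs` are hypothetical, the template wording is inferred from the single example question quoted above, every value other than "5246286" is a made-up placeholder, and skipping empty cells is an added assumption.

```python
# Hypothetical word sequences after S12/S13/S16:
# (meaning keyword for the first row, meaning keyword for the first column, numerical keyword).
# Only "first quarter" / "health insurance" / "5246286" appears in the source;
# the remaining values are placeholders for illustration.
WORD_SEQUENCES = [
    ("first quarter", "life insurance", "100"),
    ("first quarter", "health insurance", "5246286"),
    ("second quarter", "life insurance", "200"),
    ("second quarter", "health insurance", "300"),
]

# Hypothetical question template: constant wording plus two change items.
TEMPLATE = "What is the income of {row_kw} in the {col_kw}?"


def build_table(sequences):
    """S14-S17: de-duplicate the meaning keywords and fill the grid."""
    col_kws = list(dict.fromkeys(s[0] for s in sequences))  # goes into the first row
    row_kws = list(dict.fromkeys(s[1] for s in sequences))  # goes into the first column
    # S15: the cell where the first row and first column intersect stays blank.
    table = [[""] + col_kws] + [[kw] + [""] * len(col_kws) for kw in row_kws]
    for col_kw, row_kw, value in sequences:
        r = row_kws.index(row_kw) + 1
        c = col_kws.index(col_kw) + 1
        table[r][c] = value  # S17: fill the intersecting cell
    return table


def generate_pairs(table, template):
    """Walk the table as nested loops: fix one first-row cell, visit every
    first-column cell, then move to the next first-row cell."""
    pairs = []
    for c in range(1, len(table[0])):
        for r in range(1, len(table)):
            if table[r][c] == "":
                continue  # assumption: cells with no collected answer are skipped
            question = template.format(row_kw=table[r][0], col_kw=table[0][c])
            pairs.append((question, table[r][c]))
    return pairs


table = build_table(WORD_SEQUENCES)
pairs = generate_pairs(table, TEMPLATE)
```

With this input, the second cell of the first row ("first quarter") and the third cell of the first column ("health insurance") produce the example question from the text, paired with the value stored where that column and row intersect. The explicit position-shifting and last-word checks of S44 to S47 collapse into the two `for` loops here.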

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for constructing a standard knowledge base, in the field of database maintenance. The method comprises the following steps: constructing an answer file (S1); constructing a question template (S2); setting constant items and question words (S3); and presetting a generation rule for question-answer pairs and forming question-answer pairs (S4). With this method, data can be imported in batches. In addition, because question-answer pairs are generated automatically according to a rule, the workload of maintaining a basic database is reduced and work efficiency is considerably improved.
PCT/CN2018/076484 2017-10-26 2018-02-12 Method for constructing a standard knowledge base, electronic device, and storage medium WO2019080419A1 (fr)
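
The four steps S1 to S4 named in the abstract can be sketched end to end for a text-form answer file. This is a rough illustration under stated assumptions, not the claimed method: the separators `,` and `;`, the class `QuestionTemplate`, the functions `build_answer_file` and `form_pairs`, and the rule mapping keywords to change items are all hypothetical, and the second sample answer is a made-up placeholder.

```python
from dataclasses import dataclass

KEYWORD_SEP = ","   # assumed keyword separator
SEQUENCE_SEP = ";"  # assumed word sequence separator


@dataclass
class QuestionTemplate:
    """S2/S3: a question word and constant items around two change items."""
    question_word: str
    constants: tuple  # (text before item 1, text between the items, text after item 2)

    def fill(self, first, second):
        before, between, after = self.constants
        return f"{self.question_word}{before}{first}{between}{second}{after}"


def build_answer_file(answers):
    """S1: write the collected answers into one uniformly formatted text."""
    return "".join(KEYWORD_SEP.join(fields) + SEQUENCE_SEP for fields in answers)


def form_pairs(answer_text, template):
    """S4: meaning keywords fill the template, the last field becomes the answer."""
    pairs = []
    for sequence in answer_text.split(SEQUENCE_SEP):
        if not sequence:
            continue  # the trailing separator leaves one empty chunk
        *meaning_kws, value = sequence.split(KEYWORD_SEP)
        # Assumed generation rule: change item 1 takes the second keyword,
        # change item 2 takes the first keyword.
        question = template.fill(meaning_kws[1], meaning_kws[0])
        pairs.append((question, value))
    return pairs


answers = [
    ("first quarter", "health insurance", "5246286"),
    ("second quarter", "health insurance", "200"),  # placeholder value
]
template = QuestionTemplate("What", (" is the income of ", " in the ", "?"))
answer_text = build_answer_file(answers)
qa_pairs = form_pairs(answer_text, template)
```

Because every answer ends up in one delimited text, a whole batch can be turned into question-answer pairs in a single pass, which is the batch-import benefit the abstract describes. Values that themselves contain the separator characters would need escaping, which this sketch does not handle.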

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711031785.9A CN107832374A (zh) 2017-10-26 2017-10-26 Method for constructing a standard knowledge base, electronic device, and storage medium
CN201711031785.9 2017-10-26

Publications (1)

Publication Number Publication Date
WO2019080419A1 (fr) 2019-05-02

Family

ID=61650999

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/076484 WO2019080419A1 (fr) 2017-10-26 2018-02-12 Method for constructing a standard knowledge base, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN107832374A (fr)
WO (1) WO2019080419A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
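CN109710747B (zh) * 2019-01-16 2021-04-06 北京猎户星空科技有限公司 Information processing method and apparatus, and electronic device
CN110334197A (zh) * 2019-06-28 2019-10-15 科大讯飞股份有限公司 Corpus processing method and related apparatus
CN112328762B (zh) * 2020-11-04 2023-12-19 平安科技(深圳)有限公司 Method and apparatus for generating question-answer corpora based on a text generation model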

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
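CN101261690A (zh) * 2008-04-18 2008-09-10 北京百问百答网络技术有限公司 System and method for automatic question generation
CN104978396A (zh) * 2015-06-02 2015-10-14 百度在线网络技术(北京)有限公司 Method and apparatus for generating question-and-answer items based on a knowledge base
CN107220296A (zh) * 2017-04-28 2017-09-29 北京拓尔思信息技术股份有限公司 Method for generating a question-answer knowledge base, method for training a neural network, and device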

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10366621B2 (en) * 2014-08-26 2019-07-30 Microsoft Technology Licensing, Llc Generating high-level questions from sentences
CN104933097B (zh) * 2015-05-27 2019-04-16 百度在线网络技术(北京)有限公司 Data processing method and apparatus for retrieval

Also Published As

Publication number Publication date
CN107832374A (zh) 2018-03-23

Similar Documents

Publication Publication Date Title
US10621281B2 (en) Populating values in a spreadsheet using semantic cues
WO2020186786A1 (fr) File processing method and apparatus, computer device, and storage medium
WO2019062001A1 (fr) Intelligent robot customer service method, electronic apparatus, and computer-readable storage medium
CN111666401B (zh) Graph-structure-based official document recommendation method and apparatus, computer device, and medium
WO2019076062A1 (fr) Function page customization method and application server
WO2019062010A1 (fr) Semantic recognition method, electronic device, and computer-readable storage medium
US11321361B2 (en) Genealogical entity resolution system and method
US10748166B2 (en) Method and system for mining churn factor causing user churn for network application
WO2019062078A1 (fr) Intelligent customer service method, electronic apparatus, and computer-readable storage medium
WO2019085463A1 (fr) Service request recommendation method, application server, and computer-readable storage medium
CN112286934A (zh) Database table import method, apparatus, device, and medium
WO2019080419A1 (fr) Method for constructing a standard knowledge base, electronic device, and storage medium
CN114528413B (zh) Knowledge graph updating method and system supported by crowdsourced annotation, and readable storage medium
CN104516635A (zh) Managing content display
US20230004979A1 (en) Abnormal behavior detection method and apparatus, electronic device, and computer-readable storage medium
CN112507098B (zh) Question processing method and apparatus, electronic device, storage medium, and program product
WO2021169626A1 (fr) Word-library-based matching recommendation method, apparatus, and device, and storage medium
US20150379112A1 (en) Creating an on-line job function ontology
CN106484699A (zh) Method and apparatus for generating database query fields
CN113010542A (zh) Service data processing method and apparatus, computer device, and storage medium
CN115391439A (zh) Document data export method and apparatus, electronic device, and storage medium
CN115905630A (zh) Graph database query method, apparatus, device, and storage medium
US10671668B2 (en) Inferring graph topologies
CN110737432A (zh) Script-assisted design method and apparatus based on a root-word table
CN115204889A (zh) Text processing method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18870348

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.10.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18870348

Country of ref document: EP

Kind code of ref document: A1