WO2021053457A1 - Language statement processing in computing system - Google Patents

Language statement processing in computing system

Info

Publication number
WO2021053457A1
WO2021053457A1 PCT/IB2020/058338 IB2020058338W WO2021053457A1 WO 2021053457 A1 WO2021053457 A1 WO 2021053457A1 IB 2020058338 W IB2020058338 W IB 2020058338W WO 2021053457 A1 WO2021053457 A1 WO 2021053457A1
Authority
WO
WIPO (PCT)
Prior art keywords
action
higher order
input
syntax tree
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IB2020/058338
Other languages
English (en)
French (fr)
Inventor
Oleg Sidorkin
Sergey BATIN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
IBM China Investment Co Ltd
IBM United Kingdom Ltd
International Business Machines Corp
Original Assignee
IBM China Investment Co Ltd
IBM United Kingdom Ltd
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-18
Filing date
2020-09-08
Publication date
2021-03-25
Application filed by IBM China Investment Co Ltd, IBM United Kingdom Ltd, International Business Machines Corp
Priority to GB2204446.5A (GB2602238A)
Priority to CN202080063909.5A (CN114375447B)
Priority to JP2022516431A (JP7558258B2)
Publication of WO2021053457A1
Legal status: Ceased (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/041 Abduction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • the present invention relates generally to language statement processing in a computer system and more particularly to a computer program product, system, and method for using higher order actions to annotate a syntax tree with real data for concepts used to generate an answer to a question.
  • a concept comprises an object in a domain of the question and relates to other concepts in an ontology of the domain; a plurality of actions in the domain of the question have parameters.
  • a higher order action specifies an input, action parameters, and an output.
  • the higher order action is processed to determine an element in the information space of the syntax tree corresponding to the input of the higher order action and to determine an action of the plurality of actions having parameters matching the action parameters.
  • the determined element is provided as input to the determined action to produce output of the determined action that is processed according to the output specified by the higher order action.
  • the information space of the syntax tree is further annotated with the output from the higher order action to use to provide an answer to the question.
  • the subject matter of the embodiments may optionally include an embodiment in which the higher order action specifies a constraint for the input.
  • Determining the element in the information space of the syntax tree corresponding to the input of the higher order action comprises determining whether the information space of the syntax tree includes a concept satisfying the constraint of the input of the higher order action.
  • the determined action is applied to the concept satisfying the constraint of the input of the higher order action.
  • the output from the higher order action provides real data for the concept corresponding to the input of the higher order action.
  • the higher order action is used to determine concepts satisfying the constraint of the input so that the determined action can be applied to the concept satisfying the constraint to provide output of data for that concept to annotate in the information space of the syntax tree.
  • This allows a higher order action to trigger the application of other qualifying actions to the concepts in the information space, generating data for those concepts and improving the determination of answers to the question represented in the syntax tree.
  • a matching strength is determined for each of the actions, and the action having the highest matching strength is selected to process the element in the information space.
  • a concept may comprise an attribute in the data model of the domain.
  • the pattern matching module 108 enables mapping of natural language in the question 114 to concepts in the ontology of the domain.
  • the concepts that annotate a token in the syntax tree 116 allow the system to consider and recognize these concepts as part of the token of the question 114 when processing the question 114.
  • the memory/storage 104 may comprise a suitable volatile or non-volatile memory for storing programs to execute and information used by the program 110 to execute.
  • program modules such as the program components 106 through 124 may comprise routines, programs, objects, components, logic, data structures, and so on that perform particular tasks or implement particular abstract data types.
  • the programs 106, 108, 110, 112, 200, 300 may comprise program code loaded into memory and executed by a processor. Alternatively, some or all of the functions may be implemented in hardware devices, such as in Application Specific Integrated Circuits (ASICs) or executed by separate dedicated processors.
  • the constraint 202 may comprise a qualification for the concepts, and comprise a subject (the input 204), a predicate, and an object, where the predicate constrains the subject to the object.
  • a subClassOf .list constrains the symbol a to be any list, so that a concept node in the annotated syntax tree 118 must comprise a list in order to be subject to the action 200i.
  • the input 204 may comprise a tree of concepts that must be matched for the action to be selected for execution.
  • the output 206 may comprise a flat list of concepts or real data comprising the concept. Concepts within a signature can further be qualified with additional specifications.
  • the default parameter for the input may be a concept. However, the input may also comprise real data produced for a concept by another action, such as real product data, real invoice data, etc.
  • An example of an action that searches a database for a product by name may have a signature of "Product (optional :WithName(data :UserString)) -> data Product", such that for a concept in the tree being constrained as having the name ":UserString", the data Product will be outputted, comprising the products having the product name.
  • the semantic action module 110 performs a loop of operations at blocks 406 through 426 for each higher order action 300, in the domain of the question 114. If (at block 408) there is no matching element in the tree 118, then control proceeds to block 426 to process a next higher order action 300, until all higher order actions 300 are processed. If (at block 408) there is at least one element, e.g., concept or data, in the annotated syntax tree 118 matching the input 304 of the higher order action 300, and satisfying the constraint 302, then for each matching element j satisfying the constraint 302, control proceeds (at block 410) to perform blocks 412 through 426 in FIG. 4b to apply the higher order action i to element j.
  • FIG. 7 shows an information space of the syntax tree 700 representing the syntax tree 600 in node form, where the nodes having the terms in the question 114, shown as solid filled, are linked to concept nodes shown with a white center.
  • FIG. 8 shows a further annotated information space of a syntax tree 800, such as annotated syntax tree 120, after actions 200 are applied to add real data nodes linked to the concept nodes, where the real data nodes generated by the actions are shown as white nodes with a color center.
  • the variants panel 802 shows the actions used to generate the real data nodes, shown as white nodes with a solid center, for the concepts, shown as solid nodes with a white center.
  • the annotated syntax tree 800 may be processed to generate the candidate answers to the questions by considering the real data generated for the concepts linked to the question tokens or terms.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • System memory 906 can include computer system readable media in the form of volatile memory, such as random access memory (RAM) 910 and/or cache memory 912.
  • Computer system/server 902 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • storage system 913 can be provided for reading from and writing to a non-removable, non-volatile magnetic media (not shown and typically called a "hard drive").
  • network adapter 924 communicates with the other components of computer system/server 902 via bus 908. It should be understood that, although not shown, other hardware and/or software components could be used in conjunction with computer system/server 902. Examples include, but are not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems.
  • Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise.
  • devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries.
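
The processing described in the definitions above (match a higher order action's input and constraint against elements of the annotated tree, pick the domain action whose parameters match best, then annotate the element with the output) can be sketched roughly as follows. This is only an illustrative sketch, not the claimed implementation: the class names, the flat-list stand-in for the information space of the syntax tree, the handling of only a `subClassOf` predicate, and the use of parameter overlap as the "matching strength" are all assumptions made for the example, and the product catalog is made-up sample data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Node:
    """An element of the information space: a concept, optionally with real data."""
    concept: str
    superclasses: set[str] = field(default_factory=set)
    data: Any = None
    annotations: list[Node] = field(default_factory=list)


@dataclass
class Action:
    """An ordinary domain action with named parameters (hypothetical shape)."""
    name: str
    parameters: tuple[str, ...]
    run: Callable[[Node], Any]


@dataclass
class HigherOrderAction:
    """Specifies an input, an optional constraint on it, action parameters, and an output."""
    input_concept: str
    action_parameters: tuple[str, ...]
    output_concept: str
    # (predicate, object); the subject of the constraint is the input itself.
    constraint: Optional[tuple[str, str]] = None


def satisfies(node: Node, hoa: HigherOrderAction) -> bool:
    """True if the node matches the input and the constraint of the higher order action."""
    if node.concept != hoa.input_concept:
        return False
    if hoa.constraint is None:
        return True
    predicate, obj = hoa.constraint
    # Assumption: only the subClassOf predicate is modeled in this sketch.
    return predicate == "subClassOf" and obj in node.superclasses


def matching_strength(action: Action, hoa: HigherOrderAction) -> int:
    """Assumed metric: number of parameters shared with the higher order action."""
    return len(set(action.parameters) & set(hoa.action_parameters))


def apply_higher_order_actions(tree, hoas, actions):
    """For each higher order action and each matching element, run the best action."""
    for hoa in hoas:
        for node in [n for n in tree if satisfies(n, hoa)]:
            candidates = [a for a in actions if matching_strength(a, hoa) > 0]
            if not candidates:
                continue
            best = max(candidates, key=lambda a: matching_strength(a, hoa))
            # Annotate the element with the output produced by the selected action.
            node.annotations.append(Node(hoa.output_concept, data=best.run(node)))
    return tree


# Made-up sample data and actions, for illustration only.
catalog = {"widget": [{"sku": "W-1", "name": "widget"}]}
find_exact = Action("find_product_by_exact_name", ("name", "exact"),
                    lambda node: catalog.get(node.data, []))
find_fuzzy = Action("find_product_fuzzy", ("name",), lambda node: [])
hoa = HigherOrderAction("ProductName", ("name", "exact"), "Product",
                        constraint=("subClassOf", "UserString"))
tree = [Node("ProductName", {"UserString"}, data="widget"), Node("Invoice")]
apply_higher_order_actions(tree, [hoa], [find_exact, find_fuzzy])
```

In this sketch the `ProductName` element satisfies the constraint, so it receives a `Product` annotation carrying the catalog rows, while the `Invoice` element is left untouched. A real system would presumably walk a tree rather than a flat list and would use whatever matching-strength measure the domain defines.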

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • General Engineering & Computer Science
  • Physics & Mathematics
  • General Physics & Mathematics
  • Computational Linguistics
  • Artificial Intelligence
  • Data Mining & Analysis
  • Computing Systems
  • Software Systems
  • Evolutionary Computation
  • Mathematical Physics
  • General Health & Medical Sciences
  • Health & Medical Sciences
  • Audiology, Speech & Language Pathology
  • Databases & Information Systems
  • Life Sciences & Earth Sciences
  • Molecular Biology
  • Biophysics
  • Biomedical Technology
  • Machine Translation
  • Information Retrieval, Db Structures And Fs Structures Therefor
PCT/IB2020/058338 2019-09-18 2020-09-08 Language statement processing in computing system Ceased WO2021053457A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
GB2204446.5A GB2602238A (en) 2019-09-18 2020-09-08 Language statement processing in computing system
CN202080063909.5A CN114375447B (zh) 2019-09-18 2020-09-08 计算系统中的语言语句处理
JP2022516431A JP7558258B2 (ja) 2019-09-18 2020-09-08 コンピュータ・システムにおける言語発話処理

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/575,341 2019-09-18
US16/575,341 US11379738B2 (en) 2019-09-18 2019-09-18 Using higher order actions to annotate a syntax tree with real data for concepts used to generate an answer to a question

Publications (1)

Publication Number Publication Date
WO2021053457A1 (en) 2021-03-25

Family

ID=74869682

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2020/058338 Ceased WO2021053457A1 (en) 2019-09-18 2020-09-08 Language statement processing in computing system

Country Status (5)

Country Link
US (3) US11379738B2
JP (1) JP7558258B2
CN (1) CN114375447B
GB (1) GB2602238A
WO (1) WO2021053457A1

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022043675A2 (en) 2020-08-24 2022-03-03 Unlikely Artificial Intelligence Limited A computer implemented method for the automated analysis or use of data
US11989507B2 (en) 2021-08-24 2024-05-21 Unlikely Artificial Intelligence Limited Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12073180B2 (en) 2021-08-24 2024-08-27 Unlikely Artificial Intelligence Limited Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12067362B2 (en) 2021-08-24 2024-08-20 Unlikely Artificial Intelligence Limited Computer implemented methods for the automated analysis or use of data, including use of a large language model
CN114443822B (zh) * 2021-12-24 2023-05-26 科大讯飞(苏州)科技有限公司 用于建筑领域的多模态问答的方法、系统和计算设备
US20250112819A1 (en) * 2023-10-02 2025-04-03 Schlumberger Technology Corporation Monitoring an industrial facilty employing industrial internet of things (iiot) sensors using a tactical compute application

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080235199A1 (en) * 2007-03-19 2008-09-25 Yunyao Li Natural language query interface, systems, and methods for a database
US20090089045A1 (en) * 2007-09-28 2009-04-02 Douglas Bruce Lenat Method of transforming natural language expression into formal language representation
CN102521239A (zh) * 2011-11-14 2012-06-27 江苏联著实业有限公司 一种基于owl的互联网问答信息匹配系统及其匹配方法
CN108804521A (zh) * 2018-04-27 2018-11-13 南京柯基数据科技有限公司 一种基于知识图谱的问答方法及农业百科问答系统
CN110188331A (zh) * 2019-06-03 2019-08-30 腾讯科技(深圳)有限公司 模型训练方法、对话系统评价方法、装置、设备及存储介质

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7822699B2 (en) 2005-11-30 2010-10-26 Microsoft Corporation Adaptive semantic reasoning engine
US9875494B2 (en) 2013-04-16 2018-01-23 Sri International Using intents to analyze and personalize a user's dialog experience with a virtual personal assistant
US9471689B2 (en) 2014-05-29 2016-10-18 International Business Machines Corporation Managing documents in question answering systems
US9588961B2 (en) 2014-10-06 2017-03-07 International Business Machines Corporation Natural language processing utilizing propagation of knowledge through logical parse tree structures
US10262062B2 (en) * 2015-12-21 2019-04-16 Adobe Inc. Natural language system question classifier, semantic representations, and logical form templates
US9569729B1 (en) 2016-07-20 2017-02-14 Chenope, Inc. Analytical system and method for assessing certain characteristics of organizations
CN108268582B (zh) * 2017-07-14 2021-05-07 阿里巴巴(中国)有限公司 信息查询方法及装置
US20190034540A1 (en) 2017-07-28 2019-01-31 Insight Engines, Inc. Natural language search with semantic mapping and classification
CN107885844A (zh) * 2017-11-10 2018-04-06 南京大学 基于分类检索的自动问答方法及系统
CN108363743B (zh) * 2018-01-24 2020-06-02 清华大学深圳研究生院 一种智能问题生成方法、装置和计算机可读存储介质
CN110888966B (zh) * 2018-09-06 2024-05-10 微软技术许可有限责任公司 自然语言问答

Also Published As

Publication number Publication date
CN114375447B (zh) 2025-07-08
GB202204446D0 (en) 2022-05-11
CN114375447A (zh) 2022-04-19
US20220284326A1 (en) 2022-09-08
US20250117574A1 (en) 2025-04-10
US11379738B2 (en) 2022-07-05
JP7558258B2 (ja) 2024-09-30
JP2022548624A (ja) 2022-11-21
US20210081814A1 (en) 2021-03-18
US11842290B2 (en) 2023-12-12
GB2602238A (en) 2022-06-22

Similar Documents

Publication Publication Date Title
US11842290B2 (en) Using functions to annotate a syntax tree with real data used to generate an answer to a question
US12175204B2 (en) Aspect prompting framework for language modeling
US11816439B2 (en) Multi-turn dialogue response generation with template generation
JP7387714B2 (ja) 限られた知識ドメイン内でナレッジグラフを構築するための技術
US11934441B2 (en) Generative ontology learning and natural language processing with predictive language models
US11507828B2 (en) Unsupervised hypernym induction machine learning
US11755657B2 (en) Training a question-answer dialog system to avoid adversarial attacks
WO2019224629A1 (en) Training data expansion for natural language classification
US11238111B2 (en) Response generation
US11036941B2 (en) Generating a plurality of document plans to generate questions from source text
US20190130251A1 (en) Neural question answering system
JP2023002475A (ja) コンピュータシステム、コンピュータプログラムおよびコンピュータで実装される方法(因果関係知識の識別および抽出)
US20230267342A1 (en) Iterative answer and supplemental information extraction for machine reading comprehension
JP2020064621A (ja) 敵対的生成ネットワークを用いるユーザフレンドリな説明生成
KR102434666B1 (ko) 사전 데이터베이스를 활용하여 음성 데이터에 기반한 텍스트를 생성하기 위한 방법 및 컴퓨팅 장치
US20230111052A1 (en) Self-learning annotations to generate rules to be utilized by rule-based system
US20220215285A1 (en) Hybrid user contributed rules and machine learning framework
CN113569017A (zh) 一种模型处理方法、装置、电子设备及存储介质
US20200364304A1 (en) Automatic evaluation of artificial intelligence-based processes
US20220075960A1 (en) Interactive Communication System with Natural Language Adaptive Components
WO2023111748A1 (en) Automated few-shot learning techniques for artificial intelligence-based query answering systems
US12608643B2 (en) Generating workflow representations using reinforced feedback analysis
US20250377864A1 (en) Language-model-based code requirement automation
US20230186190A1 (en) Ticket embedding based on multi-dimensional it data
US20260119790A1 (en) Automatic document analysis and modification systems and applications

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20866323

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022516431

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 202204446

Country of ref document: GB

Kind code of ref document: A

Free format text: PCT FILING DATE = 20200908

ENPC Correction to former announcement of entry into national phase, pct application did not enter into the national phase

Ref country code: GB

122 Ep: pct application non-entry in european phase

Ref document number: 20866323

Country of ref document: EP

Kind code of ref document: A1

WWG Wipo information: grant in national office

Ref document number: 202080063909.5

Country of ref document: CN