WO2015195308A1 - Natural language processing system - Google Patents

Natural language processing system

Info

Publication number
WO2015195308A1
Authority
WO
WIPO (PCT)
Prior art keywords
natural language
text
audible
pattern
processing stages
Prior art date
Application number
PCT/US2015/033481
Other languages
English (en)
Inventor
Brian Duane Clevenger
Thomas P. NEWBERRY
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing
Publication of WO2015195308A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/06 Arrangements for sorting, selecting, merging, or comparing data on individual record carriers
    • G06F7/10 Selecting, i.e. obtaining data of one kind from those record carriers which are identifiable by data of a second kind from a mass of ordered or randomly-distributed record carriers

Definitions

  • the present principles relate generally to methods and apparatus for natural language control of an embedded system.
  • AIML Artificial Intelligence Markup Language
  • AI artificial intelligence
  • AIML engines permit very simple string matching and substitution expressions. This reduces the CPU resources required to match across a large number of expressions, which is particularly important because a single command might require multiple recursive passes to process.
  • the problem with this approach is that a very large number of expressions is required to implement an AI personality, which in turn requires a large amount of memory. This makes AIML impractical on an embedded system with limited available memory.
  • AIML evaluates commands against a flat list of expressions grouped into a few priorities. Because of the simple expression syntax, it's possible for engines to perform optimizations and quickly eliminate many non-wildcard expressions that won't match by ordering them and doing a binary search. This isn't possible with more complex expressions, however.
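The ordering-plus-binary-search optimization mentioned above can be illustrated with a minimal sketch. It is not taken from any particular AIML engine; the literal expressions and the `find_literal` helper are hypothetical, and the normalization (trimming and lower-casing) is an assumption.

```python
from bisect import bisect_left

# Hypothetical literal (non-wildcard) expressions, kept sorted so that
# a binary search can rule most of them out in O(log n) comparisons.
LITERALS = sorted(["disable the firewall", "hello", "what is your name"])


def find_literal(command: str) -> bool:
    """Return True if the normalized command equals one of the sorted literals."""
    key = command.strip().lower()
    i = bisect_left(LITERALS, key)
    return i < len(LITERALS) and LITERALS[i] == key
```

This only works because each expression is a fixed string with a total order; a regular expression has no such ordering, which is why the optimization does not carry over to more complex expressions.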
  • AIML is designed to mimic human interaction, not to perform actions based on the interaction, such as configuring a device, or to incorporate external data sources when processing responses, for example, status and statistics from a modem.
  • AIML provides only limited context information that can be used to alter behavior based on previous interactions. This makes it difficult in practice to design AI scenarios that walk a user through a specific set of steps, for example, a set of steps to troubleshoot a particular problem.
  • an apparatus for text processing comprising a natural language electronic text processor having a plurality of electronic data processing stages.
  • the apparatus further comprises a first one of the processing stages receiving a natural language text string and recursively comparing the natural language text string with at least one stored text string until either no more stored text strings remain for comparison, or until the comparison is true.
  • the apparatus further comprises a second one of the processing stages generating a status signal indicative of the comparison being true, a third one of the processing stages executing a command in response to the natural language text if the comparison is true, and a fourth one of the processing stages generating an output message.
  • a method for processing text comprises receiving natural language text, and recursively comparing the natural language text with at least one stored text string that is not natural language text until either no more strings remain for comparison, or until the comparison is true.
  • the method further comprises generating a status signal indicative of the comparison, generating a command in response to the text, and generating an output message if the comparison is true.
  • Figure 1 shows an exemplary pattern element under the present principles.
  • Figure 2 shows a block diagram of a pattern group element under the present principles.
  • Figure 3 shows a relationship between a pattern group element and a pattern element under the present principles.
  • Figure 4 shows a block diagram of an Artificial Intelligence Context under the present principles.
  • Figure 5 shows a flow diagram of the operation of an Artificial Intelligence engine.
  • Figures 6 and 6a show a flow diagram of the operation of a Pattern Group Element of Figure 2, and the Pattern Group Element operating with audio.
  • Figure 7 shows one embodiment of an apparatus under the present principles.
  • Figure 8 shows one embodiment of a method under the present principles.
  • the present principles are directed to a system for processing natural language text from a user, interpreting and performing actions based on that text, and formulating appropriate text responses.
  • the system mimics artificial intelligence and allows a user to interact with a system as if it were a person and not a machine.
  • both technical and non-technical users of complex devices search for a function through an interface to configure a particular feature. It can be difficult, particularly for non-technical users, to find the proper function that they want. For example, a user may wish to disable a firewall, but does not know where the interface for this feature is located. It would be preferable, instead of navigating to a firewall settings page and clicking a disable button, for a user to simply type "Please turn off my firewall" or "Disable the firewall". The system would perform the requested action and respond appropriately with something like "Ok. I turned off the firewall."
  • the architecture defined by the present principles makes it easy to both control devices and use information from external data sources in formulating responses. This is one advantage of the present principles over AIML engines. Another advantage is the ability to include context information to alter behavior based on previous interactions, such as walking a user through a series of steps, for example, troubleshooting. The described system achieves this with a pattern group used in conjunction with a pattern stack and a context stack, both described in a later section.
  • FIG. 1 shows an exemplary pattern element.
  • Each pattern element takes a command as an input and provides a match indicator and a command as outputs.
  • a pattern element can optionally provide a response as an output.
  • a pattern element can also read and write data from external sources.
  • a pattern element can read or write to an AI context that stores context information from previously processed commands.
  • a pattern element may need to write a new configuration value to perform an action specified by a user or read status information to formulate a response. It is expected that different pattern elements will need to access different external data sources based on the actions the specific pattern element needs to perform. Some pattern elements may not require any external data interaction.
  • When a pattern element processes an input command, it can set a response, modify the input command, or take no action. If the pattern element takes no action, the output command must be equal to the input command and the pattern element must set the match indicator to false. Otherwise, the output match indicator must be set to true.
  • the AI engine recursively evaluates the commands against pattern elements until either a response is set or until no pattern sets the match indicator.
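The input/output contract of a pattern element described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `PatternResult` and `RegexPattern` are hypothetical, and backing the element with a regular-expression substitution is only one possible choice.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatternResult:
    matched: bool                   # match indicator
    command: str                    # output command
    response: Optional[str] = None  # optional response for the user


class RegexPattern:
    """Hypothetical pattern element backed by a regular-expression substitution."""

    def __init__(self, regex: str, replacement: Optional[str] = None,
                 response: Optional[str] = None):
        self.regex = re.compile(regex)
        self.replacement = replacement
        self.response = response

    def process(self, command: str) -> PatternResult:
        if self.regex.search(command) is None:
            # No action: command passes through, match indicator is false.
            return PatternResult(False, command)
        new_command = command
        if self.replacement is not None:
            new_command = self.regex.sub(self.replacement, command)
        # The element "acted" only if it set a response or changed the command.
        acted = self.response is not None or new_command != command
        return PatternResult(acted, new_command, self.response if acted else None)
```

Tying the match indicator to "a response was set or the command changed" keeps a caller that loops on the match indicator from re-processing an unchanged command forever.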
  • the purpose of pattern elements is to break down the processing of text into common elements that can be used in processing many different types of commands. This is probably best described with some examples.
  • In computer science, a regular expression is a sequence of characters that forms a search pattern, used mostly for pattern matching on strings.
  • a pattern element can be used to expand common contractions by implementing the regular expression s/^(who…
  • Another pattern expression might convert common negative responses to "no", such as s/^(nope…
  • In many cases it is useful to chain a top-level pattern to a set of sub-patterns. As an example, if implementing patterns to provide the user with definitions for certain terms, it can be useful to have the top-level pattern s/^what does ([a-z ]+) (mean…
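The three kinds of substitution described above can be sketched with Python's `re` module. The expressions in the text are incomplete, so the alternation lists below (`who|what|where|when|how`, `nope|nah|no way`) and the function names are illustrative assumptions rather than the source's own expressions.

```python
import re
from typing import Optional


def expand_contractions(command: str) -> str:
    """Expand contractions such as "what's" to "what is" (word list is assumed)."""
    return re.sub(r"\b(who|what|where|when|how)'s\b", r"\1 is", command)


def normalize_negative(command: str) -> str:
    """Map a few assumed negative replies onto the canonical command "no"."""
    return re.sub(r"^(nope|nah|no way)$", "no", command)


def extract_term(command: str) -> Optional[str]:
    """Top-level pattern for definitions: capture the term being asked about."""
    m = re.match(r"^what does ([a-z ]+) mean$", command)
    return m.group(1) if m else None
```

For example, `extract_term("what does port forwarding mean")` yields the term `port forwarding`, which a set of sub-patterns could then look up.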
  • Figure 2 shows a block diagram of a pattern group element. Notice that the inputs and outputs of a pattern group element are identical to the pattern element described in Figure 1. This means a pattern group element is a type of pattern element. This relationship is depicted in the UML diagram in Figure 3.
  • a pattern group element contains a top level pattern element and one or more sub-pattern elements.
  • the contained pattern elements can be any type of pattern element, so a pattern group element can contain other pattern group elements. This allows a hierarchy to be defined in which a set of patterns is evaluated only if the top-level pattern matches. Defining hierarchies like this greatly reduces the number of pattern elements that must be evaluated during each pass: if a top-level pattern in a pattern group does not match, none of the patterns in the sub-pattern list will be evaluated. In cases where patterns need to be grouped together without any top-level conditional criteria, a simple top-level pattern that does nothing but set its match indicator to true can be used.
  • FIG. 4 shows a block diagram of the AI Context.
  • the AI Context is used to store information across multiple interactions and also to define the set of patterns to use when evaluating an expression.
  • the AI Context contains a pattern stack. When the AI engine processes input received from a user, it sends the input only to the top pattern on the pattern stack. This will typically be a pattern group element that contains a hierarchy of patterns. It does this repeatedly until either a response is set or until the match indicator is set to false. As patterns are evaluated, they have the ability to push new patterns on the stack or pop them off. This allows pattern elements to modify the set of patterns that the AI Engine will use on the next pass. This can be useful when implementing an interactive troubleshooting scenario.
  • the modified pattern group can be popped off the stack, restoring the base set of patterns defined by the default pattern group.
  • the AI context provides a Context Stack and a Global Context.
  • the Global Context is simply an associative array of variables that patterns can use to preserve state information between interactions.
  • the Context Stack works in the same way as the Global Context except patterns have the ability to push an entire set of variables onto the stack or pop an entire set of variables off the stack. This is useful when a defined set of variables is only needed for a limited time. Again, the troubleshooting scenario is an example of an
  • a new set of variables related to the scenario can be pushed on the context stack.
  • these variables specific to the scenario can be popped off the stack, thus restoring the variables to the original state before the troubleshooting scenario began.
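The AI Context described above (pattern stack, Context Stack, and Global Context) can be sketched as a small class. The class and method names are hypothetical, and the lookup order in `get` (top of the context stack first, then the global context) is an assumption; the text does not specify how the two stores are combined.

```python
from typing import Any, Dict, List


class AIContext:
    """Hypothetical container for the pattern stack and the two variable stores."""

    def __init__(self, default_pattern: Any):
        self.pattern_stack: List[Any] = [default_pattern]
        self.global_context: Dict[str, Any] = {}       # persists across interactions
        self.context_stack: List[Dict[str, Any]] = []  # scenario-specific variable sets

    def top_pattern(self) -> Any:
        return self.pattern_stack[-1]

    def push_pattern(self, pattern: Any) -> None:
        self.pattern_stack.append(pattern)

    def pop_pattern(self) -> Any:
        if len(self.pattern_stack) == 1:
            raise IndexError("the default pattern group cannot be popped")
        return self.pattern_stack.pop()

    def push_variables(self, variables: Dict[str, Any]) -> None:
        self.context_stack.append(dict(variables))

    def pop_variables(self) -> Dict[str, Any]:
        return self.context_stack.pop()

    def get(self, name: str, default: Any = None) -> Any:
        # Assumed lookup order: scenario variables shadow global ones.
        if self.context_stack and name in self.context_stack[-1]:
            return self.context_stack[-1][name]
        return self.global_context.get(name, default)
```

A troubleshooting pattern would call `push_pattern` and `push_variables` when the scenario starts and the matching `pop_` methods when it ends, which restores both the default pattern group and the original variables.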
  • Figure 5 shows one embodiment of a method implementing the AI engine.
  • a command is received from the user. This command is sent to the top element of the pattern stack for processing at 502.
  • If the pattern element sets a response, the response is sent to the user at 506 and then processing returns to 501 to retrieve the next command from the user. If the response is not set at 503, then processing moves to 504 to check if the match indicator is set. If the match indicator is set, the command is replaced with the command output from the pattern and processing returns to 502 to process the updated command again.
  • If the match indicator is not set, it means that the AI engine was unable to process the command from the user and processing then proceeds to 507 where a default response is sent to the user, such as "I'm sorry. I don't understand", for example. Processing then returns to step 501 to retrieve the next command from the user.
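The engine loop of Figure 5 can be sketched as a single function that handles one user command. This is a hedged illustration: patterns are modeled as plain callables returning a `(match indicator, command, response)` tuple, and the `max_passes` guard is an added assumption that does not appear in the flow diagram.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[bool, str, Optional[str]]  # (match indicator, command, response)
Pattern = Callable[[str], Result]

DEFAULT_RESPONSE = "I'm sorry. I don't understand"


def respond(pattern_stack: List[Pattern], command: str, max_passes: int = 50) -> str:
    """Process one command received at 501 and return the text sent to the user."""
    for _ in range(max_passes):  # assumed safety guard against endless rewriting
        # 502: send the command to the top element of the pattern stack.
        matched, new_command, response = pattern_stack[-1](command)
        if response is not None:     # 503: a response was set
            return response          # 506: send it to the user
        if not matched:              # 504: match indicator not set
            return DEFAULT_RESPONSE  # 507: default response
        command = new_command        # replace the command and return to 502
    return DEFAULT_RESPONSE


def demo_pattern(command: str) -> Result:
    """Hypothetical pattern: rewrites "nope" to "no", then answers "no"."""
    if command == "nope":
        return True, "no", None
    if command == "no":
        return True, "no", "Ok, I will leave the firewall on."
    return False, command, None
```

The top of the stack is re-read on every pass because a pattern may have pushed or popped pattern groups while processing the previous pass.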
  • FIG. 6 is a flow chart showing the internal operation of the Pattern Group Element described in Figure 2. Processing begins at 601 where a command is received as an input. At 602 the input command is stored in temporary memory as CMD1. At 603, the input command (CMD1) is then sent to the Top Level Pattern Element for processing. At 604, the Top Level Pattern Element is checked to see if the match indicator is set. If the match indicator is not set, then processing moves to 613, the output command is set to the original input command (CMD1), the match indicator is set to false, and processing ends. If, at 604, the match indicator is set, processing moves to 605 and the output command for the top level pattern is stored in temporary memory as CMD2.
  • Processing then moves to 606 to check if there are any more patterns in the Sub-Pattern List. If no more patterns exist, processing then proceeds to 613. Otherwise, processing proceeds to 607 and the next pattern is retrieved from the Sub-Pattern List.
  • CMD2 is sent to the pattern retrieved at 607 for processing.
  • If the response is set at 609, processing proceeds to 611, where the output command, match indicator, and response from the pattern are sent as outputs, and processing ends. If at 609 the response is not set, processing proceeds to 610 to check if the match indicator is set. If the match indicator is not set at 610, processing proceeds back to 606 to process any remaining patterns in the Sub-Pattern List. If at 610 the match indicator is set, then the command and match indicator from the last pattern are set as outputs and processing ends.
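The Figure 6 flow can be sketched as a callable class, with the step numbers from the text as comments. The class name `PatternGroup` and the tuple-based pattern signature are assumptions made for illustration; the top-level pattern's own response is ignored here because the flow as described does not mention it.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[bool, str, Optional[str]]  # (match indicator, command, response)
Pattern = Callable[[str], Result]


class PatternGroup:
    """Hypothetical pattern group: a top-level pattern guarding a sub-pattern list."""

    def __init__(self, top: Pattern, subs: List[Pattern]):
        self.top = top
        self.subs = subs

    def __call__(self, command: str) -> Result:
        cmd1 = command                              # 601, 602: store input as CMD1
        matched, cmd2, _ = self.top(cmd1)           # 603, 605: top-level output is CMD2
        if not matched:                             # 604
            return False, cmd1, None                # 613: sub-patterns never run
        for sub in self.subs:                       # 606, 607: next sub-pattern
            sub_matched, out, response = sub(cmd2)  # CMD2 is sent to the sub-pattern
            if response is not None:                # 609
                return sub_matched, out, response   # 611
            if sub_matched:                         # 610
                return True, out, None
        return False, cmd1, None                    # 613: list exhausted
```

Because a `PatternGroup` instance has the same call signature as any other pattern, it can itself appear in another group's sub-pattern list, which gives the hierarchy described for Figure 3.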
  • Figure 6a shows the same flow chart of Figure 6, however, in another embodiment, the system can interface to an audio to text conversion device to receive audio commands, and convert them to text. At the output, the response messages can be converted from text to audio.
  • the audio to text converter, as well as the text to audio converter could be standalone units, or integrated as part of the present system.
  • FIG. 7 shows a hardware block diagram of an exemplary embedded system 700, in accordance with an embodiment of the present principles.
  • the present principles are not limited to the embedded system 700 shown and described in FIG. 7 and, thus, other systems 700 having, e.g., different configuration and/or different elements, can also be used in accordance with the teachings of the present principles.
  • input text is received from a user to an input of system 700.
  • the input is in signal connectivity with a first input of electronic data processing stage 710.
  • Processing stage 710 performs a comparison of the text input with stored text strings.
  • the stored text strings can be in memory (not shown) within system 700, or input from external memory via another input (not shown) of system 700 and sent to processing element 710.
  • processing stage 710 processes a plurality of portions of the input text string one after another.
  • Control logic within processing stage 710 controls the processing of the input text string portions, so that when a match is made, a next portion can be searched against the stored text strings. If a match is not made on any particular portion, processing stage 710 can decide to abort comparisons on additional text string portions, as the input text will not be a valid command.
  • Processing stage 710 has a signal output that is sent to electronic data processing stages 720, 730 and 740.
  • When processing stage 710 finds a match for an entire text string, or all of the text string portions, it generates signals in response to the match.
  • Processing stage 720 is used to generate the match indicator signal that provides a metric indicative of whether the input text string has matched a stored text string.
  • the match indicator signal can be a single binary signal, indicating true or false, or a multilevel signal indicative of the level of match. For example, if the input text string has been processed in portions as described in the preceding embodiment, a multilevel match indicator signal can provide a measure of the degree to which the entire string has been matched up to that point.
  • Processing stage 730 is used to generate a command. Commands are used to react to the input text string, such as to control other devices, for example. If processing stage 710 has made a match to the input text string, the command signal can be used to modify the input text command. If processing stage 710 has not made a match for the input text string, processing stage 730 can simply pass the input text on to another circuit.
  • Processing stage 740 is used to generate a response to the input text, which is sent to a user.
  • the response can be a text string stating, for example, that a configuration setting has been altered, or giving the status of a device. If no matches are made to the input text string, the response signal can be used to state, "Invalid command", for example.
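The portion-by-portion comparison and the multilevel match indicator described for stages 710 and 720 can be sketched as follows. Treating each word as a "portion" and reporting the matched fraction as the "level of match" are both assumptions; the text does not define either, and `match_level` is a hypothetical name.

```python
from typing import List, Tuple


def match_level(input_text: str, stored: List[str]) -> Tuple[float, str]:
    """Return the best match level in [0, 1] and the stored string that achieved it."""
    words = input_text.lower().split()
    best_level, best_string = 0.0, ""
    for candidate in stored:
        cand_words = candidate.lower().split()
        matched = 0
        for a, b in zip(words, cand_words):
            if a != b:
                break  # abort further comparisons once a portion fails to match
            matched += 1
        total = max(len(words), len(cand_words))
        level = matched / total if total else 0.0
        if level > best_level:
            best_level, best_string = level, candidate
    return best_level, best_string
```

A level of 1.0 corresponds to the binary "true" case; a caller wanting a single binary signal could simply test `level == 1.0`.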
  • FIG. 8 shows an embodiment of a method 800 for processing text input, in accordance with an embodiment of the present principles.
  • the method commences at start block 801, then proceeds to step 810 for receiving text from a user. Control then proceeds to step 820 for comparing the text with stored strings of text. Following step 820, step 830 determines whether the comparison in step 820 was true, that is, whether a match was found. If so, control proceeds to steps 850, 860, and 870. If not, control proceeds to step 840 for determining whether there are more strings to compare to the input text. If there are, control proceeds back to step 820 for comparing the text with another stored string. As long as no stored string matches the input text, control sequences from step 820 to step 830, step 840, and then back to step 820.
  • If step 830 determines that there is no match and step 840 determines that there are also no more strings, control proceeds to step 860 for generating a response to a user.
  • a response might be "Invalid command", "Error", or another response, for example.
  • If, at step 830, a determination is made that a match has been found, control proceeds to step 850 for generating a match indicator, step 860 for optionally generating a response to a user, and step 870 for generating a command.
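Method 800 can be sketched as one loop over the stored strings. In this illustration the stored strings are regular expressions, and each maps to a `(command, response)` pair; that table layout, the `process_text` name, and the `firewall.disable` command identifier are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical table: stored pattern -> (command, response).
STORED: Dict[str, Tuple[str, str]] = {
    r"(please )?(disable|turn off) (the|my) firewall":
        ("firewall.disable", "Ok. I turned off the firewall."),
}


def process_text(text: str, stored: Dict[str, Tuple[str, str]]
                 ) -> Tuple[bool, Optional[str], str]:
    """Return (match indicator, command, response) for the text received at 810."""
    normalized = text.strip().lower()
    for pattern, (command, response) in stored.items():  # 820, looping via 840
        if re.fullmatch(pattern, normalized):            # 830: comparison is true
            return True, command, response               # 850, 870, 860
    return False, None, "Invalid command"                # 860: no strings remain
```

With the table above, `process_text("Please turn off my firewall", STORED)` returns a true match indicator together with the command and its confirmation message, while unrecognized text falls through to the "Invalid command" response.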
  • use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
  • DSP digital signal processor
  • ROM read-only memory
  • RAM random access memory
  • any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
  • any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function.
  • the present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
  • such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C).
  • This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
  • teachings of the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof. Most preferably, the teachings of the present principles are implemented as a combination of hardware and software. Moreover, the software may be
  • the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces.
  • CPU central processing units
  • RAM random access memory
  • I/O input/output
  • the computer platform may also include an operating system and microinstruction code.
  • the various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU.
  • various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to a natural language text processing system (700) that determines whether a match is made with one of a plurality of stored text strings (710), or with portions of a text string. In response to a match being made (720), the system outputs a command signal as well as a match indicator signal (730). In addition, a response is optionally returned to the user to report the status of a device, configuration information, or the status of the command (740). The system can operate hierarchically, so that portions of an input text string can be processed sequentially. This feature advantageously reduces the number of pattern elements that must be searched because, if a top-level pattern does not match, its sub-patterns need not be evaluated.
PCT/US2015/033481 2014-06-19 2015-06-01 Natural language processing system WO2015195308A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462014433P 2014-06-19 2014-06-19
US62/014,433 2014-06-19

Publications (1)

Publication Number Publication Date
WO2015195308A1 (fr) 2015-12-23

Family

ID=53385988

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/033481 WO2015195308A1 (fr) 2014-06-19 2015-06-01 Natural language processing system

Country Status (2)

Country Link
TW (1) TW201617932A (fr)
WO (1) WO2015195308A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999001829A1 (fr) * 1997-06-30 1999-01-14 Lernout & Hauspie Speech Products N.V. Command parsing and rewrite system
US6665640B1 (en) * 1999-11-12 2003-12-16 Phoenix Solutions, Inc. Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries
US20070050191A1 (en) * 2005-08-29 2007-03-01 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US8165886B1 (en) * 2007-10-04 2012-04-24 Great Northern Research LLC Speech interface system and method for control and interaction with applications on a computing system
US20100332235A1 (en) * 2009-06-29 2010-12-30 Abraham Ben David Intelligent home automation

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
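CN112965968A (zh) * 2021-03-04 2021-06-15 Hunan University (湖南大学) Heterogeneous data schema matching method based on an attention mechanism
CN112965968B (zh) * 2021-03-04 2023-10-24 Hunan University (湖南大学) Heterogeneous data schema matching method based on an attention mechanism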

Also Published As

Publication number Publication date
TW201617932A (zh) 2016-05-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15728727; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 15728727; Country of ref document: EP; Kind code of ref document: A1)