KR20220027220A

KR20220027220A - Predictive Similarity Scoring Subsystem in Natural Language Understanding (NLU) Frameworks

Info

Publication number: KR20220027220A
Application number: KR1020227003413A
Authority: KR
Inventors: 에드윈 사푸가이; 종건 박; 앤 캐서린 히튼-던랩
Original assignee: 서비스나우, 인크.
Priority date: 2019-07-02
Filing date: 2020-07-01
Publication date: 2022-03-07
Also published as: JP2022539138A; AU2020299384B2; WO2021003423A1; JP7346609B2; AU2020299634B2; JP7383737B2; US20210004537A1; JP2022538861A; US20210004442A1; AU2020299608A1; KR20220025026A; JP2022538591A; AU2020299634A1; US20210004443A1; US11556713B2; WO2021003311A1; KR20220027198A; AU2020299384A1; JP7420842B2; AU2020299608B2

Abstract

The present embodiments include an agent automation framework having a similarity scoring subsystem that performs meaning representation similarity scoring to facilitate the extraction of artifacts for addressing an utterance. The similarity scoring subsystem identifies a CCG form of an utterance-based meaning representation and queries a database to retrieve a comparison function list that enables quantification of the similarity between the meaning representation and candidates within a search space. The comparison functions enable the similarity scoring subsystem to perform the computationally cheapest and/or most efficient comparisons before other comparisons. The similarity scoring subsystem may determine an initial similarity score between a particular meaning representation and the candidates of the search space, and then prune dissimilar candidates from the search space. Selective search space pruning enables the similarity scoring subsystem to iteratively compare more data of the meaning representation against the search space via increasingly complex comparison functions, while narrowing the search space to potentially matching candidates.

Description

Predictive Similarity Scoring Subsystem in Natural Language Understanding (NLU) Frameworks

Cross-References

This application claims priority to and the benefit of U.S. Provisional Application No. 62/869,817, entitled "PREDICTIVE SIMILARITY SCORING SUBSYSTEM IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK," filed July 2, 2019, which is incorporated herein by reference in its entirety for all purposes. This application is also related to U.S. Provisional Application No. 62/869,864, entitled "SYSTEM AND METHOD FOR PERFORMING A MEANING SEARCH USING A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK"; U.S. Provisional Application No. 62/869,826, entitled "DERIVING MULTIPLE MEANING REPRESENTATIONS FOR AN UTTERANCE IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK"; and U.S. Provisional Application No. 62/869,811, entitled "PINNING ARTIFACTS FOR EXPANSION OF SEARCH KEYS AND SEARCH SPACES IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK," each of which was filed July 2, 2019 and is incorporated herein by reference in its entirety for all purposes.

This disclosure relates generally to the fields of natural language understanding (NLU) and artificial intelligence (AI), and more particularly to a predictive similarity scoring subsystem for NLU.

This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present disclosure, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.

Cloud computing relates to the sharing of computing resources that are generally accessed via the Internet. In particular, a cloud computing infrastructure allows users, such as individuals and/or enterprises, to access a shared pool of computing resources, such as servers, storage devices, networks, applications, and/or other computing-based services. By doing so, users are able to access customized computing resources located at remote locations, and these resources may be used to perform a variety of computing functions (e.g., storing and/or processing large quantities of computing data). For enterprise and other organizational users, cloud computing provides the flexibility to access cloud computing resources without incurring large up-front costs, such as purchasing expensive network equipment or investing large amounts of time in establishing a private network infrastructure. Instead, by utilizing cloud computing resources, users are able to redirect their own resources to focus on their enterprise's core functions.

Such a cloud computing service may host a virtual agent, such as a chat agent, that is designed to automatically respond to issues in a client instance based on natural language requests from a user of the client instance. For example, a user may provide a request to a virtual agent for assistance with a password issue, where the virtual agent is part of a Natural Language Processing (NLP) or Natural Language Understanding (NLU) system. NLP is a general area of computer science and AI that involves some form of processing of natural language input. Examples of areas addressed by NLP include language translation, speech generation, parse tree extraction, part-of-speech identification, and others. NLU is a sub-area of NLP that specifically focuses on understanding user utterances. Examples of areas addressed by NLU include question-answering (e.g., reading comprehension questions), article summarization, and others. For example, an NLU system may use algorithms to reduce human language (e.g., spoken or written) into a set of known symbols for consumption by a downstream virtual agent. NLP is generally used to interpret free text for further analysis. Current approaches to NLP are typically based on deep learning, which is a type of AI that examines and uses patterns in data to improve the understanding of a program.

However, existing virtual agents that apply NLU techniques to identify intent and entity matches within a search space may overextend their computing resources when attempting to derive meaning from a received user utterance and determine a suitable response to it. Indeed, during a meaning search, certain existing approaches may directly query the entire collection of stored user utterances within the search space to derive meaning for the received user utterance, thereby consuming considerable processing and memory resources over an extended period of time. As such, existing approaches may be unable to efficiently handle complex user utterances in a manner suitable for real-time engagement with the user, and/or may fail to simultaneously generate suitable and timely responses to multiple user utterances.

A summary of certain embodiments disclosed herein is set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these certain embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be set forth below.

The present embodiments are directed to an agent automation framework that is designed to extract meaning from user utterances, such as requests received by a virtual agent, and to respond suitably to these user utterances. To perform these tasks, the agent automation framework includes an NLU framework and an intent-entity model having defined intents and entities that are associated with sample utterances. The NLU framework includes a meaning extraction subsystem that is designed to generate meaning representations for the sample utterances of the intent-entity model. The NLU framework thereby generates, from these meaning representations, a searchable understanding model or search space, in which each meaning representation represents a different understanding or interpretation of an underlying utterance. The meaning extraction subsystem also generates a meaning representation based on an utterance received from a user, where that meaning representation is a search key to be compared against the search space. As such, a meaning search subsystem of the disclosed NLU framework is designed to search the meaning representations of the understanding model to locate matches for the meaning representation of the received user utterance. The meaning extraction subsystem may subsequently extract intents and entities from the matching meaning representations to facilitate appropriate agent responses and/or actions.

For NLU frameworks that rely heavily on search to extract understanding from received user utterances, such as the present meaning-search-based NLU techniques, the search process applied to large understanding models, or even to combinations of understanding models, may in certain cases evaluate a substantial number of meaning representations. As such, it is presently recognized that pruning the search space, to remove meaning representations that are on their face incompatible with the meaning representation of the user utterance, may enable the NLU framework to incorporate larger search spaces or to handle the search spaces of the present embodiments more adeptly. For embodiments that utilize a search space generated from multiple understanding models, the techniques disclosed herein may deliver improved user satisfaction based on meaning searches that consider multiple different business aspects (e.g., sales, building information, service tickets), each associated with a corresponding understanding model. As discussed below, the present embodiments generally improve the operation of the meaning search subsystem by leveraging similarity scoring capabilities to prune large search spaces and thereby improve the identification of matching meaning representations within an understanding model. In particular, the present embodiments use low-cost, less resource-intensive pruning criteria during the earlier portions of the meaning match process to successively shrink the search space, and then apply more resource-intensive pruning criteria to the reduced search space. As will be understood, the present techniques thereby transform the hard problem posed by NLU into a manageable, less resource-intensive search problem that does not waste resources evaluating meaning representations in the search space that are superficially unrelated to the search key.

More specifically, the present embodiments are directed to a similarity scoring subsystem of the meaning search subsystem for efficiently comparing meaning representations derived from a received user utterance with meaning representations derived from the sample utterances of the intent-entity model. As will be understood, the similarity scoring subsystem iteratively determines a similarity score for each pair of compared meaning representations based on a progressive set of mathematical comparison functions. For example, to compare a particular meaning representation to a collection of meaning representations within an understanding model, the similarity scoring subsystem may first determine a cognitive construction grammar (CCG) form for the particular meaning representation. As set forth by CCG techniques, the CCG form class membership of the particular meaning representation is established by the shape of its utterance tree structure, as well as by the part-of-speech annotations of its nodes. Based on the CCG form of the particular meaning representation, the similarity scoring subsystem may then query a form class database to retrieve mathematical comparison function lists that enable comparisons between the particular meaning representation and the meaning representations of the understanding model. Additionally, the similarity scoring subsystem may disregard or prune meaning representations of the understanding model that do not have a CCG form compatible with that of the particular meaning representation.
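The form-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Node` class, the `ccg_form` signature (tree shape plus part-of-speech tags), the dictionary standing in for the form class database, and the rule that "compatible" means "identical form" are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical utterance-tree node: a token, its POS tag, and its dependents."""
    word: str
    pos: str
    children: List["Node"] = field(default_factory=list)


def ccg_form(node: Node) -> tuple:
    """Build a hashable signature from the tree shape and the POS tags only.

    Assumption: two trees belong to the same form class when their
    signatures are equal; the tokens themselves are ignored.
    """
    return (node.pos, tuple(ccg_form(child) for child in node.children))


def lookup_and_prune(
    key: Node,
    search_space: List[Node],
    form_class_db: Dict[tuple, List[str]],
) -> Tuple[List[str], List[Node]]:
    """Return the comparison function list for the key's form, plus the
    candidates whose form is compatible (here: identical) with the key's."""
    form = ccg_form(key)
    functions = form_class_db.get(form, [])
    compatible = [cand for cand in search_space if ccg_form(cand) == form]
    return functions, compatible
```

A dictionary keyed by the form signature keeps the lookup cheap, so candidates with an incompatible form are discarded before any vector comparison is attempted.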

As described herein, each mathematical comparison function list includes an ordered collection of comparison functions that iteratively consider respective numbers of nodes of the meaning representations being compared. In particular, the comparison functions are ordered so that the similarity scoring subsystem first applies the computationally cheapest and/or most efficient function, and thereby determines an initial or preliminary similarity score between the particular meaning representation and the meaning representations of the understanding model. For example, the similarity scoring subsystem may use an initial function to consider whether the root node of the particular meaning representation is suitably similar to the root node of each comparable meaning representation of the understanding model. The similarity scoring subsystem may then narrow its focus to the suitably similar meaning representations before applying a subsequent function that considers both the root node and a first dependent node of each meaning representation, or before applying any other more resource-intensive comparison function. Accordingly, the iterative application of selective node uncovering and/or of more expensive comparison functions described herein hones in on potentially matching meaning representation candidates of the understanding model, while iteratively considering more features of the compared meaning representations via increasingly complex comparison functions. As such, the present techniques for predictive similarity scoring enable targeted discovery of meaning representation matches, thereby providing computational benefits to generative spaces, such as NLU, in which the meaning representation size (e.g., number of nodes) and the search space size may be vast.
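The cheapest-first ordering and progressive pruning described above can be illustrated with a short sketch. Everything in it is an assumption chosen for illustration: each meaning representation is flattened to a list of node vectors (root first), similarity is the mean cosine similarity over the uncovered nodes, and a single fixed threshold decides which candidates survive each stage. The source does not specify these choices.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Meaning = List[Vector]  # hypothetical: node vectors, root first, then dependents
CompareFn = Callable[[Meaning, Meaning], float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compare_first_n(n: int) -> CompareFn:
    """Build a comparison function that uncovers only the first n nodes."""
    def compare(key: Meaning, cand: Meaning) -> float:
        k = min(n, len(key), len(cand))
        if k == 0:
            return 0.0
        return sum(cosine(key[i], cand[i]) for i in range(k)) / k
    return compare


def progressive_search(
    key: Meaning,
    search_space: List[Meaning],
    functions: List[CompareFn],  # ordered from cheapest to most expensive
    threshold: float,
) -> List[Tuple[Meaning, float]]:
    """Apply each function in order, pruning candidates below the threshold
    so that costlier functions only see the surviving candidates."""
    survivors = [(cand, 0.0) for cand in search_space]
    for fn in functions:
        scored = [(cand, fn(key, cand)) for cand, _ in survivors]
        survivors = [(cand, score) for cand, score in scored if score >= threshold]
    return survivors
```

For example, calling `progressive_search(key, space, [compare_first_n(1), compare_first_n(2)], 0.8)` first compares root nodes only, then compares the root and first dependent for the candidates that remain.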

Various aspects of this disclosure may be better understood upon reading the following detailed description and upon reference to the drawings, in which:
FIG. 1 is a block diagram of an embodiment of a cloud computing system in which embodiments of the present technique may operate;
FIG. 2 is a block diagram of an embodiment of a multi-instance cloud architecture in which embodiments of the present technique may operate;
FIG. 3 is a block diagram of a computing device utilized in a computing system that may be present in FIG. 1 or 2, in accordance with aspects of the present technique;
FIG. 4A is a schematic diagram illustrating an embodiment of an agent automation framework including an NLU framework that is part of a client instance hosted by the cloud computing system, in accordance with aspects of the present technique;
FIG. 4B is a schematic diagram illustrating an alternative embodiment of the agent automation framework in which portions of the NLU framework are part of an enterprise instance hosted by the cloud computing system, in accordance with aspects of the present technique;
FIG. 5 is a flow diagram illustrating an embodiment of a process by which an agent automation framework, including an NLU framework and a behavior engine framework, extracts intents and/or entities from a user utterance and responds to the user utterance, in accordance with aspects of the present technique;
FIG. 6 is a block diagram illustrating an embodiment of the NLU framework, including a meaning extraction subsystem and a meaning search subsystem, in which the meaning extraction subsystem generates meaning representations from a received user utterance to yield an utterance meaning model and generates meaning representations from sample utterances of an understanding model to yield the understanding model, and in which the meaning search subsystem compares the meaning representations of the utterance meaning model to the meaning representations of the understanding model to extract artifacts (e.g., intents and entities) from the received user utterance, in accordance with aspects of the present technique;
FIG. 7 is a diagram illustrating an example of an utterance tree generated for an utterance, in accordance with an embodiment of the present approach;
FIG. 8 is an information flow diagram illustrating an embodiment of the meaning search subsystem analyzing a search space defined by the understanding model to determine or identify matching meaning representations, which enable the NLU framework to extract artifacts from the received user utterance, in accordance with aspects of the present technique;
FIG. 9 is an information flow diagram illustrating an embodiment of a similarity scoring subsystem that may be implemented within the meaning search subsystem of the NLU framework to retrieve mathematical comparison function lists, which enable efficient comparisons between any suitable number of meaning representations, in accordance with aspects of the present technique;
FIG. 10 is a flow diagram illustrating an embodiment of a process by which the similarity scoring subsystem retrieves mathematical comparison function lists that enable comparison between the utterance-based meaning representation of FIG. 8 and the search space, in accordance with aspects of the present technique;
FIG. 11 is a diagram of an embodiment of the similarity scoring subsystem of the meaning search subsystem using one mathematical comparison function list to compare a first meaning representation to a second meaning representation, in accordance with aspects of the present technique;
FIG. 12 is a schematic diagram illustrating an embodiment of the similarity scoring subsystem applying a mathematical comparison function list to selectively narrow the search space and identify meaning representations that match the utterance-based meaning representation, in accordance with aspects of the present technique; and
FIG. 13 is a flow diagram of an embodiment of a process by which the similarity scoring subsystem implements a mathematical comparison function list to identify matching meaning representations from the search space, in accordance with aspects of the present technique.

One or more specific embodiments will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.

As used herein, the term "computing system" or "computing device" refers to an electronic computing device such as, but not limited to, a single computer, virtual machine, virtual container, host, server, laptop, and/or mobile device, or to a plurality of electronic computing devices working together to perform the function described as being performed on or by the computing system. As used herein, the term "machine-readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store one or more instructions or data structures. The term "non-transitory machine-readable medium" shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the computing system and that cause the computing system to perform any one or more of the methodologies of the present subject matter, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such instructions. The term "non-transitory machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of non-transitory machine-readable media include, but are not limited to, non-volatile memory, including by way of example, semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices), magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks.

As used herein, the terms "application," "engine," and "plug-in" refer to one or more sets of computer software instructions (e.g., computer programs and/or scripts) executable by one or more processors of a computing system to provide particular functionality. Computer software instructions can be written in any suitable programming language, such as C, C++, C#, Pascal, Fortran, Perl, MATLAB, SAS, SPSS, JavaScript, AJAX, and JAVA. Such computer software instructions can comprise a standalone application with data input and data display modules. Alternatively, the disclosed computer software instructions can be classes that are instantiated as distributed objects. The disclosed computer software instructions can also be component software, for example JAVABEANS or ENTERPRISE JAVABEANS. Additionally, the disclosed applications or engines can be implemented in computer software, computer hardware, or a combination thereof.

As used herein, the term "framework" refers to a system of applications and/or engines, as well as any other supporting data structures, libraries, modules, and any other supporting functionality, that cooperate to perform one or more overall functions. In particular, a "natural language understanding framework" or "NLU framework" comprises a collection of computer programs designed to process and derive meaning (e.g., intents, entities, artifacts) from natural language utterances based on an understanding model. As used herein, a "behavior engine" or "BE," also known as a reasoning agent or RA/BE, refers to a rule-based agent, such as a virtual agent, designed to interact with users based on a conversation model. For example, a "virtual agent" may refer to a particular example of a BE that is designed to interact with users via natural language requests in a particular conversational or communication channel. With this in mind, the terms "virtual agent" and "BE" are used interchangeably herein. By way of specific example, a virtual agent may be or include a chat agent that interacts with users via natural language requests and responses in a chat room environment. Other examples of virtual agents may include an email agent, a forum agent, a ticketing agent, a telephone call agent, and so forth, which interact with users in the context of email, forum posts, autoreplies to service tickets, phone calls, and so forth.

As used herein, an "intent" refers to a desire or goal of a user, which may relate to an underlying purpose of a communication, such as an utterance. As used herein, an "entity" refers to an object, subject, or some other parameterization of an intent. It is noted that, in the present embodiments, certain entities are treated as parameters of a corresponding intent. More specifically, certain entities (e.g., time and location) may be globally recognized and extracted for all intents, while other entities are intent-specific (e.g., merchandise entities associated with purchase intents) and are generally extracted only when found within the intents that define them. As used herein, "artifact" collectively refers to both intents and entities of an utterance. As used herein, an "understanding model" is a collection of models used by the NLU framework to infer the meaning of natural language utterances. An understanding model may include a vocabulary model that associates certain tokens (e.g., words or phrases) with particular word vectors, an intent-entity model, an entity model, or a combination thereof. As used herein, an "intent-entity model" refers to a model that associates particular intents with particular sample utterances, wherein entities associated with the intent may be encoded as a parameter of the intent within the sample utterances of the model. As used herein, the term "agents" may refer to computer-generated personas (e.g., chat agents or other virtual agents) that interact with one another within a conversational channel. As used herein, a "corpus" refers to a captured body of source data that includes interactions between various users and virtual agents, wherein the interactions include communications or conversations within one or more suitable types of media (e.g., a help line, a chat room or message string, an email string). As used herein, an "utterance tree" refers to a data structure that stores a meaning representation of an utterance. As discussed, an utterance tree has a tree structure (e.g., a dependency parse tree structure) that represents the syntactic structure of the utterance, wherein nodes of the tree structure store vectors (e.g., word vectors, subtree vectors) that encode the semantic meaning of the utterance.
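As a rough illustration of the utterance tree defined above, the following Python sketch stores a token, a part-of-speech tag, and a vector at each node of a small dependency-style tree. The class name `UtteranceNode`, the `shape` signature, and the averaging rule used for subtree vectors are assumptions made for this example; the disclosure does not prescribe a particular implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UtteranceNode:
    """One node of a hypothetical utterance tree."""
    token: str                  # surface word or phrase
    pos: str                    # part-of-speech tag
    vector: List[float]         # word vector encoding semantic meaning
    children: List["UtteranceNode"] = field(default_factory=list)

    def shape(self) -> str:
        """Structural signature of the subtree: POS tags in nested order."""
        if not self.children:
            return self.pos
        inner = " ".join(child.shape() for child in self.children)
        return f"{self.pos}({inner})"

    def subtree_vector(self) -> List[float]:
        """Average this node's vector with its children's subtree vectors.

        Averaging is only one possible way to form a subtree vector.
        """
        vectors = [self.vector] + [c.subtree_vector() for c in self.children]
        count = len(vectors)
        return [sum(v[i] for v in vectors) / count
                for i in range(len(self.vector))]


# "reset my password": the verb is the root, the noun depends on it,
# and the pronoun depends on the noun.
tree = UtteranceNode("reset", "VERB", [1.0, 0.0], [
    UtteranceNode("password", "NOUN", [0.0, 1.0], [
        UtteranceNode("my", "PRON", [0.5, 0.5]),
    ]),
])
```

In this sketch, `tree.shape()` captures the syntactic form of the utterance, while `tree.subtree_vector()` summarizes its semantic content; both kinds of information are what the later comparison steps draw on.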

As used herein, "source data" or "conversation logs" may include any suitable captured interactions between various agents, including but not limited to chat logs, email strings, documents, help documentation, frequently asked questions (FAQs), forum entries, items in support ticketing, recordings of help line calls, and so forth. As used herein, an "utterance" refers to a single natural language statement made by a user or agent that may include one or more intents. As such, an utterance may be part of a previously captured corpus of source data, and an utterance may also be a new statement received from a user as part of an interaction with a virtual agent. As used herein, "machine learning" or "ML" may be used to refer to any suitable statistical form of artificial intelligence capable of being trained using machine learning techniques, including supervised, unsupervised, and semi-supervised learning techniques. For example, in certain embodiments, ML techniques may be implemented using a neural network (NN) (e.g., a deep neural network (DNN), a recurrent neural network (RNN), a recursive neural network). As used herein, a "vector" (e.g., a word vector, an intent vector, a subject vector, a subtree vector) refers to a linear algebra vector that is an ordered n-dimensional list (e.g., a 300-dimensional list) of floating-point values (e.g., a 1×N or an N×1 matrix) that provides a mathematical representation of the semantic meaning of a portion (e.g., a word or phrase, an intent, an entity, a token) of an utterance. As used herein, "domain specificity" refers to how attuned a system is to correctly extracting intents and entities expressed in actual conversations in a given domain and/or conversational channel. As used herein, an "understanding" of an utterance refers to an interpretation or a construction of the utterance by the NLU framework. As such, it may be appreciated that different understandings of an utterance may be associated with different meaning representations having different structures (e.g., different nodes, different relationships between nodes), different part-of-speech taggings, and so forth.
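To make the vector definition above concrete, the sketch below treats a vector as an ordered list of floating-point values and compares two such vectors with cosine similarity, one of the comparison measures this description mentions later. The function name `cosine_similarity` and the convention of returning 0.0 for a zero-length vector are assumptions of this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score 1.0 and orthogonal vectors score 0.0, so the value can serve as a rough measure of how close two encoded meanings are.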

As mentioned, a computing platform may include a chat agent, or another similar virtual agent, that is designed to automatically respond to user requests to perform functions or address issues on the platform via NLU techniques. The disclosed NLU framework is based on principles of cognitive construction grammar (CCG), in which an aspect of the meaning or understanding of a natural language utterance can be determined based on the form (e.g., syntactic structure, shape) and semantic meaning of the utterance. The disclosed NLU framework is capable of generating multiple meaning representations for an utterance, wherein each meaning representation may be an utterance tree that represents a particular understanding of the utterance. As such, the disclosed NLU framework is capable of generating an understanding model having multiple meaning representations for certain sample utterances, which expands the search space of the meaning search and thereby improves operation of the NLU framework. However, it is presently recognized that, when attempting to derive user intent from natural language utterances, certain NLU frameworks may perform searches against a search space of excessively large size because the search space includes meaning representations that are incomparable and/or dissimilar to the meaning representation of the user-based utterance. Moreover, certain NLU frameworks may apply a non-tailored or single-cost comparison function to the entire search space, which limits the performance of the meaning search for search spaces that exceed a particular scale threshold, depending on the available processing and memory resources.

Accordingly, the present embodiments are generally directed to an agent automation framework having a meaning search subsystem that is designed to leverage CCG techniques to enhance meaning search. As discussed herein, the meaning search subsystem can iteratively and progressively narrow the search space that is explored for matching meaning representations. Indeed, these matching meaning representations define a subset of the search space that enables the identification of intent or entity matches for received user utterances. More specifically, the present embodiments are directed to a similarity scoring subsystem of the meaning search subsystem that efficiently compares meaning representations determined from a received user utterance with the meaning representations of sample utterances that define the search space. As will be understood, based on a progressive set of mathematical comparison functions, the similarity scoring subsystem iteratively determines progressively more accurate similarity scores between a pair of compared meaning representations and prunes the search space based on the successive iterations of the similarity score.

For example, to compare a particular meaning representation to the collection of meaning representations defining the search space, the similarity scoring subsystem may first determine a CCG form for the particular meaning representation. As previously mentioned, the CCG form class membership of each particular meaning representation is set by the shape and the semantic meanings of the nodes that form the tree structure (e.g., utterance tree) of the particular meaning representation. Based on the CCG form of the particular meaning representation, the similarity scoring subsystem is designed to query a form class database to retrieve a mathematical comparison function list, which enables quantitative comparisons between respective pairs of meaning representations. Indeed, the mathematical comparison function list includes nested functions that individually specify how increasingly accurate and/or precise determinations of the similarity between respective portions of the particular meaning representation and the meaning representations of the search space can be performed. For a compared pair of meaning representations that does not have a mathematical comparison function list, the similarity scoring subsystem may prune the associated meaning representations from the search space (e.g., immediately assign the lowest possible score without performing any comparisons), thereby conserving resources for the remaining, potentially matching meaning representations.
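The lookup step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory dictionary `FORM_CLASS_DB` stands in for the form class database, keying it by a pair of form labels is a choice made for the example, and the names `compare_root`, `lookup_functions`, `initial_score`, and `MIN_SCORE` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Representation = Dict[str, str]
ComparisonFn = Callable[[Representation, Representation], float]

MIN_SCORE = 0.0  # assumed lowest possible similarity score


def compare_root(a: Representation, b: Representation) -> float:
    """Cheap comparison: do the two representations share a root token?"""
    return 1.0 if a["root"] == b["root"] else 0.0


# Hypothetical stand-in for the form class database: each pair of form
# labels maps to an ordered list of comparison functions.
FORM_CLASS_DB: Dict[Tuple[str, str], List[ComparisonFn]] = {
    ("verb-led", "verb-led"): [compare_root],
}


def lookup_functions(form_a: str, form_b: str) -> Optional[List[ComparisonFn]]:
    """Return the comparison function list for a pair of forms, if any."""
    return FORM_CLASS_DB.get((form_a, form_b))


def initial_score(query: Representation, candidate: Representation) -> float:
    """Score a pair, or assign the minimum score when no list exists."""
    functions = lookup_functions(query["form"], candidate["form"])
    if functions is None:
        # No comparison functions: the pair is incomparable, so skip
        # every comparison and return the lowest possible score.
        return MIN_SCORE
    return functions[0](query, candidate)


query = {"form": "verb-led", "root": "reset"}
same_form = {"form": "verb-led", "root": "reset"}
other_form = {"form": "noun-led", "root": "reset"}
```

Here `initial_score(query, other_form)` returns the minimum score without calling any comparison function, which mirrors the idea of pruning incomparable representations before spending resources on them.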

In more detail, each mathematical comparison function list includes an ordered collection of comparison functions (e.g., vector algebra, cosine similarities, progressive functions, calls to other databases or structures) that consider at least a portion of the nodes of the respective meaning representations being compared. As recognized herein, the comparison functions are ordered so that the similarity scoring subsystem performs the computationally cheapest and/or most efficient comparison first. The similarity scoring subsystem may determine an initial similarity score between the particular meaning representation and the comparable meaning representations remaining within the search space. For example, the similarity scoring subsystem may use the computationally cheapest function to consider whether the particular meaning representation is suitably similar to each meaning representation in the search space. This determination generally provides the least accurate but most efficient prediction of which regions of the search space merit further investigation. The similarity scoring subsystem may then prune dissimilar meaning representations from the search space before applying a subsequent function to further consider each remaining comparable pair of compared meaning representations. Thus, the progressive data utilization of the comparison functions enables the similarity scoring subsystem to iteratively consider additional features of the compared meaning representations via increasingly complex comparison functions, while narrowing the search space to potentially matching candidates. As such, the present techniques for predictive similarity scoring enable targeted discovery of meaning representation matches, thereby providing computational benefits that reduce computational overhead and improve efficiency for agent automation systems implementing these techniques. Additionally, because the search capacity of the agent automation system is increased, the search space can be constructed from multiple understanding models, enabling natural language agent responses to address multiple different aspects of the business that the agent automation system supports.
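The cheapest-first scoring and pruning loop described above can be sketched as below. This is an illustrative reading rather than the disclosed implementation: the two stage functions, the single shared `threshold`, the flat dictionary representations, and the name `progressive_search` are all assumptions of the example.

```python
import math
from typing import Any, Callable, Dict, List, Tuple

Representation = Dict[str, Any]
Stage = Callable[[Representation, Representation], float]


def root_match(query: Representation, cand: Representation) -> float:
    """Cheapest stage: compare only the root tokens."""
    return 1.0 if query["root"] == cand["root"] else 0.0


def vector_cosine(query: Representation, cand: Representation) -> float:
    """Costlier stage: cosine similarity over the full vectors."""
    a, b = query["vector"], cand["vector"]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def progressive_search(
    query: Representation,
    search_space: Dict[str, Representation],
    stages: List[Stage],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Apply stages from cheapest to costliest, pruning after each one."""
    remaining = dict(search_space)
    scores = {name: 0.0 for name in remaining}
    for stage in stages:
        # Score only the candidates that survived the previous stage.
        scores = {name: stage(query, cand) for name, cand in remaining.items()}
        # Prune dissimilar candidates before the next, costlier stage.
        remaining = {name: cand for name, cand in remaining.items()
                     if scores[name] >= threshold}
    ranked = [(name, scores[name]) for name in remaining]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


query = {"root": "reset", "vector": [1.0, 0.0]}
search_space = {
    "reset_password": {"root": "reset", "vector": [1.0, 0.0]},
    "reset_router": {"root": "reset", "vector": [0.0, 1.0]},
    "order_laptop": {"root": "order", "vector": [1.0, 0.0]},
}
matches = progressive_search(query, search_space,
                             [root_match, vector_cosine], threshold=0.5)
```

In this toy search space, the root comparison removes `order_laptop` before any vector arithmetic runs, and the cosine stage then removes `reset_router`, so the costlier function is evaluated on two candidates instead of three.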

With the preceding in mind, the following figures relate to various types of generalized system architectures or configurations that may be employed to provide services to an organization in a multi-instance framework and on which the present approaches may be employed. Correspondingly, these system and platform examples may also relate to systems and platforms on which the techniques discussed herein may be implemented or otherwise utilized. Turning now to FIG. 1, a schematic diagram of an embodiment of a cloud computing system 10 in which embodiments of the present disclosure may operate is illustrated. The cloud computing system 10 may include a client network 12, a network 18 (e.g., the Internet), and a cloud-based platform 20. In some implementations, the cloud-based platform 20 may be a configuration management database (CMDB) platform. In one embodiment, the client network 12 may be a local private network, such as a local area network (LAN), having a variety of network devices that include, but are not limited to, switches, servers, and routers. In another embodiment, the client network 12 represents an enterprise network that could include one or more LANs, virtual networks, data centers 22, and/or other remote networks. As shown in FIG. 1, the client network 12 is able to connect to one or more client devices 14A, 14B, and 14C so that the client devices are able to communicate with each other and/or with the network hosting the platform 20. The client devices 14 may be computing systems and/or other types of computing devices, generally referred to as Internet of Things (IoT) devices, that access cloud computing services, for example, via a web browser application or via an edge device 16 that may act as a gateway between the client devices 14 and the platform 20. FIG. 1 also illustrates that the client network 12 includes an administration or managerial device, agent, or server, such as a management, instrumentation, and discovery (MID) server 17, that facilitates communication of data between the client network 12 and the network hosting the platform 20, other external applications, data sources, and services. Although not specifically illustrated in FIG. 1, the client network 12 may also include a connecting network device (e.g., a gateway or router) or a combination of devices that implement a customer firewall or intrusion protection system.

For the illustrated embodiment, FIG. 1 illustrates that the client network 12 is coupled to the network 18. The network 18 may include one or more computing networks, such as other LANs, wide area networks (WANs), the Internet, and/or other remote networks, to transfer data between the client devices 14A-C and the network hosting the platform 20. Each of the computing networks within the network 18 may contain wired and/or wireless programmable devices that operate in the electrical and/or optical domain. For example, the network 18 may include wireless networks, such as cellular networks (e.g., a Global System for Mobile Communications (GSM) based cellular network), IEEE 802.11 networks, and/or other suitable radio-based networks. The network 18 may also employ any number of network communication protocols, such as Transmission Control Protocol (TCP) and Internet Protocol (IP). Although not explicitly shown in FIG. 1, the network 18 may include a variety of network devices, such as servers, routers, network switches, and/or other network hardware devices configured to transport data over the network 18.

In FIG. 1, the network hosting the platform 20 may be a remote network (e.g., a cloud network) that is able to communicate with the client devices 14 via the client network 12 and the network 18. The network hosting the platform 20 provides additional computing resources to the client devices 14 and/or the client network 12. For example, by utilizing the network hosting the platform 20, users of the client devices 14 are able to build and execute applications for various enterprise, IT, and/or other organization-related functions. In one embodiment, the network hosting the platform 20 is implemented on one or more data centers 22, where each data center could correspond to a different geographic location. Each of the data centers 22 includes a plurality of virtual servers 24 (also referred to herein as application nodes, application servers, virtual server instances, application instances, or application server instances), where each virtual server 24 can be implemented on a physical computing system, such as a single electronic computing device (e.g., a single physical hardware server), or across multiple computing devices (e.g., multiple physical hardware servers). Examples of virtual servers 24 include, but are not limited to, a web server (e.g., a unitary Apache installation), an application server (e.g., a unitary JAVA Virtual Machine), and/or a database server (e.g., a unitary relational database management system (RDBMS) catalog).

To utilize computing resources within the platform 20, network operators may choose to configure the data centers 22 using a variety of computing infrastructures. In one embodiment, one or more of the data centers 22 are configured using a multi-tenant cloud architecture, such that one of the server instances 24 handles requests from, and serves, multiple customers. Data centers 22 with a multi-tenant cloud architecture commingle and store data from multiple customers, where multiple customer instances are assigned to one of the virtual servers 24. In a multi-tenant cloud architecture, the particular virtual server 24 distinguishes between and segregates data and other information of the various customers. For example, a multi-tenant cloud architecture could assign a particular identifier to each customer in order to identify and segregate the data from each customer. Generally, implementing a multi-tenant cloud architecture may suffer from various drawbacks, such as a failure of a particular one of the server instances 24 causing outages for all customers allocated to that particular server instance.

In another embodiment, one or more of the data centers 22 are configured using a multi-instance cloud architecture to provide every customer with its own unique customer instance or instances. For example, a multi-instance cloud architecture could provide each customer instance with its own dedicated application server and dedicated database server. In other examples, the multi-instance cloud architecture could deploy a single physical or virtual server 24 and/or other combinations of physical and/or virtual servers 24, such as one or more dedicated web servers, one or more dedicated application servers, and one or more database servers, for each customer instance. In a multi-instance cloud architecture, multiple customer instances could be installed on one or more respective hardware servers, where each customer instance is allocated certain portions of the physical server resources, such as computing memory, storage, and processing power. By doing so, each customer instance has its own unique software stack that provides the benefits of data isolation, relatively less downtime for customers to access the platform 20, and customer-driven upgrade schedules. An example of implementing a customer instance within a multi-instance cloud architecture is discussed in more detail below with reference to FIG. 2.

FIG. 2 is a schematic diagram of an embodiment of a multi-instance cloud architecture 40 in which embodiments of the present disclosure may operate. FIG. 2 illustrates that the multi-instance cloud architecture 40 includes the client network 12 and the network 18, which connect to two (e.g., paired) data centers 22A and 22B that may be geographically separated from one another. Using FIG. 2 as an example, a network environment and service provider cloud infrastructure client instance 42 (also referred to herein as a client instance 42) is associated with (e.g., supported and enabled by) dedicated virtual servers (e.g., virtual servers 24A, 24B, 24C, and 24D) and dedicated database servers (e.g., virtual database servers 44A and 44B). Stated another way, the virtual servers 24A-24D and the virtual database servers 44A and 44B are not shared with other client instances and are specific to the respective client instance 42. In the depicted example, to facilitate availability of the client instance 42, the virtual servers 24A-24D and the virtual database servers 44A and 44B are allocated to two different data centers 22A and 22B, so that one of the data centers 22 acts as a backup data center. Other embodiments of the multi-instance cloud architecture 40 could include other types of dedicated virtual servers, such as a web server. For example, the client instance 42 could be associated with (e.g., supported and enabled by) the dedicated virtual servers 24A-24D, the dedicated virtual database servers 44A and 44B, and additional dedicated virtual web servers (not shown in FIG. 2).

Although FIGS. 1 and 2 illustrate specific embodiments of the cloud computing system 10 and the multi-instance cloud architecture 40, respectively, the disclosure is not limited to the specific embodiments illustrated in FIGS. 1 and 2. For instance, although FIG. 1 illustrates that the platform 20 is implemented using data centers, other embodiments of the platform 20 are not limited to data centers and can utilize other types of remote network infrastructures. Moreover, other embodiments of the present disclosure may combine one or more different virtual servers into a single virtual server or, conversely, perform operations attributed to a single virtual server using multiple virtual servers. For instance, using FIG. 2 as an example, the virtual servers 24A, 24B, 24C, and 24D and the virtual database servers 44A and 44B may be combined into a single virtual server. Moreover, the present approaches may be implemented in other architectures or configurations, including, but not limited to, multi-tenant architectures, generalized client/server implementations, and/or even on a single physical processor-based device configured to perform some or all of the operations discussed herein. Similarly, though virtual servers or machines may be referenced to facilitate discussion of an implementation, physical servers may instead be employed as appropriate. The use and discussion of FIGS. 1 and 2 are only examples to facilitate ease of description and explanation and are not intended to limit the disclosure to the specific examples illustrated therein.

As may be appreciated, the respective architectures and frameworks discussed with respect to FIGS. 1 and 2 incorporate computing systems of various types (e.g., servers, workstations, client devices, laptops, tablet computers, cellular telephones, and so forth) throughout. For the sake of completeness, a brief, high-level overview of components typically found in such systems is provided. This overview is intended only to provide a high-level, generalized view of components typical of such computing systems, and should not be viewed as limiting in terms of the components discussed or omitted from discussion.

By way of background, the present approach may be implemented using one or more processor-based systems such as the one shown in FIG. 3. Likewise, applications and/or databases used in the present approach may be stored, employed, and/or maintained on such processor-based systems. Systems such as that shown in FIG. 3 may be present in a distributed computing environment, a networked environment, or another multi-computer platform or architecture. Likewise, systems such as that shown in FIG. 3 may be used to support, or communicate with, one or more virtual environments or computational instances on which the present approach may be implemented.

With this in mind, an example computer system may include some or all of the computer components depicted in FIG. 3. FIG. 3 generally illustrates a block diagram of example components of a computing system 80 and their potential interconnections or communication paths, such as along one or more buses. As illustrated, the computing system 80 may include various hardware components such as, but not limited to, one or more processors 82, one or more buses 84, memory 86, input devices 88, a power source 90, a network interface 92, a user interface 94, and/or other computer components useful in performing the functions described herein.

The one or more processors 82 may include one or more microprocessors capable of executing instructions stored in the memory 86. Additionally or alternatively, the one or more processors 82 may include application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or other devices designed to perform some or all of the functions discussed herein without calling instructions from the memory 86.

With respect to other components, the one or more buses 84 include suitable electrical channels to provide data and/or power between the various components of the computing system 80. The memory 86 may include any tangible, non-transitory, computer-readable storage media. Although shown as a single block in FIG. 3, the memory 86 can be implemented using multiple physical units of the same or different types in one or more physical locations. The input devices 88 correspond to structures for inputting data and/or commands to the one or more processors 82. For example, the input devices 88 may include a mouse, touchpad, touchscreen, keyboard, and the like. The power source 90 can be any suitable source for powering the various components of the computing system 80, such as line power and/or a battery source. The network interface 92 may include one or more transceivers capable of communicating with other devices over one or more networks (e.g., a communication channel). The network interface 92 may provide a wired network interface or a wireless network interface. The user interface 94 may include a display configured to display text or images transferred to it from the one or more processors 82. In addition and/or as an alternative to the display, the user interface 94 may include other devices for interfacing with a user, such as lights (e.g., LEDs), speakers, and the like.

It should be understood that the cloud-based platform 20 discussed above provides an example of an architecture that may use NLU technologies. In particular, the cloud-based platform 20 may include or store a large corpus of source data that can be mined to facilitate the generation of a number of outputs, including an intent-entity model. For example, the cloud-based platform 20 may include ticketing source data containing requests for changes or repairs to particular systems, dialog between the requester and a service technician or an administrator attempting to address the issue, a description of how the ticket was eventually resolved, and so forth. The generated intent-entity model can then serve as a basis for classifying intents in future requests, and can be used to generate and improve a conversation model to support a virtual agent that can automatically address future issues within the cloud-based platform 20 based on natural language requests from users. As such, in certain embodiments described herein, the disclosed agent automation framework is incorporated into the cloud-based platform 20, while in other embodiments the agent automation framework may be hosted and executed (separately from the cloud-based platform 20) by a suitable system that is communicatively coupled to the cloud-based platform 20 to process utterances, as discussed below.

With the foregoing in mind, FIG. 4A illustrates an agent automation framework 100 (also referred to herein as an agent automation system 100) associated with a client instance 42. More specifically, FIG. 4A illustrates an example of a portion of a service provider cloud infrastructure, including the cloud-based platform 20 discussed above. The cloud-based platform 20 is connected to a client device 14D via the network 18 to provide a user interface to network applications executing within the client instance 42 (e.g., via a web browser of the client device 14D). The client instance 42 is supported by virtual servers similar to those described with respect to FIG. 2, and is illustrated here to show support for the disclosed functionality within the client instance 42. The cloud provider infrastructure is generally configured to support a plurality of end-user devices, such as the client device 14D, concurrently, wherein each end-user device is in communication with the single client instance 42. The cloud provider infrastructure may also be configured to support any number of client instances, such as the client instance 42, concurrently, with each of the instances in communication with one or more end-user devices. As mentioned above, an end user may also interface with the client instance 42 using an application that is executed within a web browser.

The embodiment of the agent automation framework 100 illustrated in FIG. 4A includes a behavior engine (BE) 102, an NLU framework 104, and a database 106, which are communicatively coupled within the client instance 42. The BE 102 may host or include any suitable number of virtual agents or personas that interact with the user of the client device 14D via natural language user requests 122 (also referred to herein as user utterances 122 or utterances 122) and agent responses 124 (also referred to herein as agent utterances 124). It may be noted that, in actual implementations, the agent automation framework 100 may include a number of other suitable components, including a meaning extraction subsystem, a meaning search subsystem, and so forth, in accordance with the present disclosure.

For the embodiment illustrated in FIG. 4A, the database 106 may be a database server instance (e.g., database server instance 44A or 44B, as discussed with respect to FIG. 2) or a collection of database server instances. The illustrated database 106 stores an intent-entity model 108, a conversation model 110, a corpus of utterances 112, and a collection of rules 114 in one or more tables (e.g., relational database tables) of the database 106. The intent-entity model 108 stores associations or relationships between particular intents and particular entities via particular sample utterances. In certain embodiments, the intent-entity model 108 may be authored by a designer using a suitable authoring tool. In other embodiments, the agent automation framework 100 generates the intent-entity model 108 from the corpus of utterances 112 and the collection of rules 114 stored in one or more tables of the database 106. In some embodiments, the intent-entity model 108 may also be determined based on a combination of authored and ML techniques. In any case, it should be understood that the disclosed intent-entity model 108 may associate any suitable combination of intents and/or entities with respective ones of the corpus of utterances 112. For the embodiments discussed below, sample utterances of the intent-entity model 108 are used to generate meaning representations of an understanding model to define the search space for a meaning search.
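As a rough illustration of how sample utterances can tie intents to entities in the model 108, the following Python sketch uses in-memory records rather than database tables. The class names, field names, and intent labels are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class SampleUtterance:
    """One sample utterance with the intent and entities it is associated with."""
    text: str
    intent: str
    entities: dict = field(default_factory=dict)


@dataclass
class IntentEntityModel:
    """Hypothetical in-memory stand-in for the stored intent-entity model."""
    samples: list = field(default_factory=list)

    def add(self, text, intent, **entities):
        self.samples.append(SampleUtterance(text, intent, dict(entities)))

    def utterances_for(self, intent):
        # Sample utterances are the link between an intent and its entities.
        return [s.text for s in self.samples if s.intent == intent]


model = IntentEntityModel()
model.add("I need to reset my password", "reset_password", item="password")
model.add("Please change my password", "reset_password", item="password")
model.add("I want to buy a shirt", "purchase", item="shirt")
```

In a deployment along the lines described above, the same associations would live in relational tables; the sketch only shows the shape of the data.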

For the embodiment illustrated in FIG. 4A, the conversation model 110 stores associations between intents of the intent-entity model 108 and particular responses and/or actions, which generally define the behavior of the BE 102. In certain embodiments, at least a portion of the associations within the conversation model are manually created or predefined by a designer of the BE 102 based on how the designer wants the BE 102 to respond to particular identified artifacts in processed utterances. It should be noted that, in different embodiments, the database 106 may include other database tables storing other information related to intent classification, such as tables storing information regarding compilation model template data (e.g., class compatibility rules, class-level scoring coefficients, tree-model comparison algorithms, tree substructure vectorization algorithms), meaning representations, and so forth.

For the illustrated embodiment, the NLU framework 104 includes an NLU engine 116 and a vocabulary manager 118. It may be appreciated that the NLU framework 104 may include any suitable number of other components. In certain embodiments, the NLU engine 116 is designed to perform a number of functions of the NLU framework 104, including generating word vectors (e.g., intent vectors, subject or entity vectors, subtree vectors) from words or phrases of utterances, as well as determining distances (e.g., Euclidean distances) between these vectors. For example, the NLU engine 116 is generally capable of producing a respective intent vector for each intent of an analyzed utterance. As such, a similarity measure or distance between two different utterances can be calculated using the respective intent vectors produced by the NLU engine 116 for the two intents, wherein the similarity measure provides an indication of similarity in meaning between the two intents.
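The distance computation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the NLU engine 116 itself: the four-dimensional vectors are toy values, and the mapping from distance to a similarity score in `similarity` is an assumed convention that the text does not specify.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map a distance in [0, inf) to a score in (0, 1] (assumed convention)."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


# Toy intent vectors (hypothetical values; real vectors would be much larger).
reset_password = [0.9, 0.1, 0.0, 0.3]
change_password = [0.8, 0.2, 0.1, 0.3]
buy_shirt = [0.0, 0.9, 0.7, 0.1]

closer = similarity(reset_password, change_password)
farther = similarity(reset_password, buy_shirt)
```

With these toy values, the two password-related vectors score as more similar to each other than either does to the shopping vector, which is the behavior a similarity measure over intent vectors is meant to provide.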

The vocabulary manager 118 addresses out-of-vocabulary words and symbols that were not encountered by the NLU framework 104 during vocabulary training. For example, in certain embodiments, the vocabulary manager 118 can identify and replace synonyms and domain-specific meanings of words and acronyms within utterances analyzed by the agent automation framework 100 (e.g., based on the collection of rules 114), which can improve the performance of the NLU framework 104 in properly identifying intents and entities within context-specific utterances. Additionally, to accommodate the tendency of natural language to adopt new usages for pre-existing words, in certain embodiments the vocabulary manager 118 handles the repurposing of words previously associated with other intents or entities based on a change in context. For example, the vocabulary manager 118 could handle a situation in which, in the context of utterances from a particular client instance and/or conversation channel, the word "bike" actually refers to a motorcycle rather than a bicycle.
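The "bike" example can be illustrated with a small rule-based substitution pass in Python. This is only a sketch under stated assumptions: the rule tables, the channel name `motorcycle_forum`, and the `normalize` function are hypothetical, and the actual format of the rules 114 is not described here.

```python
import re

# Hypothetical rule tables; the real rules 114 are stored in the database 106.
GLOBAL_RULES = {"pwd": "password"}
CHANNEL_RULES = {
    "motorcycle_forum": {"bike": "motorcycle"},
}


def normalize(utterance, channel=None):
    """Replace words using global rules plus any rules for the given channel."""
    rules = dict(GLOBAL_RULES)
    rules.update(CHANNEL_RULES.get(channel, {}))

    def swap(match):
        word = match.group(0)
        # Look up the lowercased word; leave unknown words untouched.
        return rules.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, utterance)
```

For instance, `normalize("My bike needs a new pwd", "motorcycle_forum")` rewrites both words, while the same sentence without a channel only expands the abbreviation. Keeping channel rules separate from global rules is one simple way to make a substitution depend on context.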

Once the intent-entity model 108 and the conversation model 110 have been created, the agent automation framework 100 is designed to receive a user utterance 122 (in the form of a natural language request) and to appropriately take action to address the request. For example, for the embodiment illustrated in FIG. 4A, the BE 102 is a virtual agent that receives, via the network 18, the utterance 122 (e.g., a natural language request in a chat communication) submitted by the client device 14D disposed on the client network 12. The BE 102 provides the utterance 122 to the NLU framework 104, and the NLU engine 116, along with the various subsystems of the NLU framework discussed below, processes the utterance 122 based on the intent-entity model 108 to derive artifacts (e.g., intents and/or entities) within the utterance. Based on the artifacts derived by the NLU engine 116, as well as the associations within the conversation model 110, the BE 102 performs one or more particular predefined actions. For the illustrated embodiment, the BE 102 also provides a response 124 (e.g., a virtual agent utterance 124 or confirmation) to the client device 14D via the network 18, for example, indicating actions performed by the BE 102 in response to the received user utterance 122. Additionally, in certain embodiments, the utterance 122 may be added to the utterances 112 stored in the database 106 for continued learning within the NLU framework 104.

It may be appreciated that, in other embodiments, one or more components of the agent automation framework 100 and/or the NLU framework 104 may be otherwise arranged, situated, or hosted for improved performance. For example, in certain embodiments, one or more portions of the NLU framework 104 may be hosted by an instance (e.g., a shared instance, an enterprise instance) that is separate from, and communicatively coupled to, the client instance 42. It is presently recognized that such embodiments can advantageously reduce the size of the client instance 42, improving the efficiency of the cloud-based platform 20. In particular, in certain embodiments, one or more components of the similarity scoring subsystem discussed below may be hosted by a separate instance (e.g., an enterprise instance) that is communicatively coupled to the client instance 42, as well as to other client instances, to enable improved meaning searching for suitable matching meaning representations within the search space, and thereby the identification of artifact matches for the utterance 122.

With the foregoing in mind, FIG. 4B illustrates an alternative embodiment of the agent automation framework 100 in which portions of the NLU framework 104 are instead executed by a separate, shared instance (e.g., enterprise instance 125) that is hosted by the cloud-based platform 20. The illustrated enterprise instance 125 is communicatively coupled to exchange data related to artifact mining and classification with any suitable number of client instances via a suitable protocol (e.g., via suitable Representational State Transfer (REST) requests/responses). As such, for the design illustrated in FIG. 4B, by hosting a portion of the NLU framework as a shared resource accessible to multiple client instances 42, the size of the client instance 42 can be substantially reduced (e.g., compared to the embodiment of the agent automation framework 100 illustrated in FIG. 4A) and the overall efficiency of the agent automation framework 100 can be improved.

In particular, the NLU framework 104 illustrated in FIG. 4B is divided into three distinct components that perform distinct processes within the NLU framework 104. These components include a shared NLU trainer 126 hosted by the enterprise instance 125, a shared NLU annotator 127 hosted by the enterprise instance 125, and an NLU predictor 128 hosted by the client instance 42. It may be appreciated that the organizations illustrated in FIGS. 4A and 4B are merely examples and that, in other embodiments, other organizations of the NLU framework 104 and/or the agent automation framework 100 may be used in accordance with the present disclosure.

For the embodiment of the agent automation framework 100 illustrated in FIG. 4B, the shared NLU trainer 126 is designed to receive the corpus of utterances 112 from the client instance 42 and to perform semantic mining (e.g., including semantic parsing, grammar engineering, and so forth) to facilitate generation of the intent-entity model 108. Once the intent-entity model 108 has been generated, when the BE 102 receives the user utterance 122 provided by the client device 14D, the NLU predictor 128 passes the utterance 122 and the intent-entity model 108 to the shared NLU annotator 127 for parsing and annotation of the utterance 122. The shared NLU annotator 127 performs semantic parsing, grammar engineering, and so forth of the utterance 122 based on the intent-entity model 108, and returns annotated utterance trees of the utterance 122 to the NLU predictor 128 of the client instance 42. The NLU predictor 128 then uses these annotated structures of the utterance 122, discussed below in greater detail, to identify matching intents from the intent-entity model 108, such that the BE 102 can perform one or more actions based on the identified intents. It may be appreciated that the shared NLU annotator 127 may correspond to the meaning extraction subsystem 150, and the NLU predictor may correspond to the meaning search subsystem 152, of the NLU framework 104, as discussed below.

FIG. 5 is a flow diagram depicting a process 145 by which the behavior engine (BE) 102 and the NLU framework 104 perform their respective roles within an embodiment of the agent automation framework 100. For the illustrated embodiment, the NLU framework 104 processes a received user utterance 122 to extract artifacts 140 (e.g., intents and/or entities) based on the intent-entity model 108. The extracted artifacts 140 may be implemented as a collection of symbols that represent the intents and entities of the user utterance 122 in a form that is consumable by the BE 102. As such, these extracted artifacts 140 are provided to the BE 102, which processes the received artifacts 140 based on the conversation model 110 to determine suitable actions 142 (e.g., changing a password, creating a record, purchasing an item, closing an account) and/or virtual agent utterances 124 in response to the received user utterance 122. As indicated by the arrow 144, the process 145 can repeat continuously as the agent automation framework 100 receives and addresses additional user utterances 122 from the same user and/or other users in a conversational format.
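The hand-off in process 145 can be sketched as two small Python functions. The keyword matcher below is a deliberately naive placeholder for the NLU framework 104 and is not the meaning search this disclosure describes; the intent names, action names, and response strings are all hypothetical.

```python
def extract_artifacts(utterance):
    """Placeholder for the NLU framework: returns (kind, name) artifact symbols."""
    text = utterance.lower()
    artifacts = []
    if "password" in text:
        artifacts.append(("intent", "change_password"))
    if "close" in text and "account" in text:
        artifacts.append(("intent", "close_account"))
    return artifacts


# Hypothetical conversation model: intent -> (action, agent utterance).
CONVERSATION_MODEL = {
    "change_password": ("start_password_change", "I can help you change your password."),
    "close_account": ("start_account_closure", "I have started closing your account."),
}


def behavior_engine(artifacts):
    """Map extracted artifacts to actions and agent utterances."""
    actions, responses = [], []
    for kind, name in artifacts:
        if kind == "intent" and name in CONVERSATION_MODEL:
            action, response = CONVERSATION_MODEL[name]
            actions.append(action)
            responses.append(response)
    return actions, responses


def handle(utterance):
    """One pass of the loop: utterance -> artifacts -> actions and responses."""
    return behavior_engine(extract_artifacts(utterance))
```

Calling `handle` once per incoming utterance corresponds to one trip around the arrow 144. The point of the split is that the behavior engine only ever sees artifact symbols, never raw text.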

As illustrated in FIG. 5, it may be appreciated that, in certain situations, no further action or communication may occur once the suitable actions 142 have been performed. Additionally, it should be noted that, while the user utterance 122 and the agent utterance 124 are discussed herein as being conveyed using a written conversational medium or channel (e.g., chat, email, ticketing system, text messages, forum posts), in other embodiments, voice-to-text and/or text-to-voice modules or plugins could be included to translate a spoken user utterance 122 into text and/or to translate a text-based agent utterance 124 into speech, enabling a voice-interactive system in accordance with the present disclosure. Furthermore, in certain embodiments, both the user utterance 122 and the virtual agent utterance 124 may be stored in the database 106 (e.g., in the corpus of utterances 112) to enable continued learning of new structure and vocabulary within the agent automation framework 100.

As mentioned, the NLU framework 104 includes two primary subsystems that cooperate to convert the hard problem of NLU into a manageable search problem: namely, a meaning extraction subsystem and a meaning search subsystem. For example, FIG. 6 is a block diagram illustrating the roles of the meaning extraction subsystem 150 and the meaning search subsystem 152 of the NLU framework 104 within an embodiment of the agent automation framework 100. For the illustrated embodiment, the right-hand portion 154 of FIG. 6 illustrates the meaning extraction subsystem 150 of the NLU framework 104 receiving the intent-entity model 108, which includes sample utterances 155 for each of the various artifacts of the model. The meaning extraction subsystem 150 generates an understanding model 157 that includes meaning representations 158 (e.g., utterance tree structures) of the sample utterances 155 of the intent-entity model 108. In other words, the understanding model 157 is a translated or augmented version of the intent-entity model 108 that includes the meaning representations 158 to enable searching (e.g., comparison and matching) by the meaning search subsystem 152, as discussed in greater detail below. As such, it may be appreciated that the right-hand portion 154 of FIG. 6 is generally performed in advance of receiving the user utterance 122, such as on a routine, scheduled basis or in response to updates to the intent-entity model 108.

For the embodiment illustrated in FIG. 6, the left-hand portion 156 illustrates the meaning extraction subsystem 150 also receiving and processing the user utterance 122 to generate an utterance meaning model 160 having at least one meaning representation 162. As discussed in greater detail below, these meaning representations 158 and 162 are data structures having a form that captures the grammatical syntactic structure of one understanding of an utterance, wherein subtrees of the data structures include subtree vectors that encode the semantic meanings of portions of the utterance. As such, for a given utterance, a corresponding meaning representation captures both syntactic and semantic meaning in a common meaning representation format that enables searching, comparison, and matching by the meaning search subsystem 152, as discussed in greater detail below. Accordingly, the meaning representations 162 of the utterance meaning model 160 can be generally thought of as being like a search key, while the meaning representations 158 of the understanding model 157 define a search space in which the search key can be sought. The meaning search subsystem 152 therefore searches the meaning representations 158 of the understanding model 157 to locate one or more artifacts that match the meaning representation 162 of the utterance meaning model 160, as discussed below, thereby generating the extracted artifacts 140.

As an example of one of the meaning representations 158, 162 disclosed herein, FIG. 7 is a diagram illustrating an example of an utterance tree 166 generated for an utterance. As may be appreciated, the utterance tree 166 is a data structure that is generated by the meaning extraction subsystem 150 based on the user utterance 122 or, alternatively, based on one of the sample utterances 155. For the example illustrated in FIG. 7, the utterance tree 166 is based on the example utterance, "I want to go to the store next to the mall today to buy a blue collared shirt and black pants and also return some defective batteries." The illustrated utterance tree 166 includes a set of nodes 202 (e.g., nodes 202A, 202B, 202C, 202D, 202E, 202F, 202G, 202H, 202I, 202J, 202K, 202L, 202M, 202N, and 202P) arranged in a tree structure, each node representing a particular word or phrase of the example utterance. It may be noted that each of the nodes 202 may also be described as representing a particular subtree of the utterance tree 166, and that a subtree can include one or more nodes 202.

The form or shape of the utterance tree 166 illustrated in FIG. 7 is determined by the meaning extraction subsystem 150 and represents the syntactic, grammatical meaning of the example utterance. More specifically, a prosody subsystem of the meaning extraction subsystem 150 breaks the utterance into intent segments, while a structure subsystem of the meaning extraction subsystem 150 constructs the utterance tree 166 from these intent segments. Each of the nodes 202 stores or references a respective word vector (e.g., token) that is determined by the vocabulary subsystem to indicate the semantic meaning of the particular word or phrase of the utterance. As mentioned, each word vector is an ordered n-dimensional list (e.g., a 300-dimensional list) of floating-point values (e.g., a 1×N or N×1 matrix) that provides a mathematical representation of the semantic meaning of a portion of an utterance.
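The node structure described above can be pictured with a minimal sketch. The class name `UtteranceNode`, its fields, and the three-dimensional toy vectors are illustrative assumptions and do not come from the disclosure, which describes word vectors of, for example, 300 dimensions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UtteranceNode:
    """Hypothetical node of an utterance tree: one word or phrase plus its word vector."""
    token: str
    vector: List[float]                 # ordered list of floating-point values
    pos_tag: Optional[str] = None       # optional class annotation (e.g., "verb")
    children: List["UtteranceNode"] = field(default_factory=list)

    def subtree_tokens(self) -> List[str]:
        """Return the tokens of the subtree rooted at this node, depth first."""
        tokens = [self.token]
        for child in self.children:
            tokens.extend(child.subtree_tokens())
        return tokens


def count_nodes(node: UtteranceNode) -> int:
    """Count the nodes in the subtree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children)


# Toy tree for a fragment of the example utterance (vectors are made up).
root = UtteranceNode("want", [0.1, 0.3, -0.2], "verb", [
    UtteranceNode("I", [0.0, 0.5, 0.1], "subject"),
    UtteranceNode("go", [0.4, -0.1, 0.2], "verb", [
        UtteranceNode("store", [0.2, 0.2, 0.6], "object"),
    ]),
])
```

Because every node is also the root of a subtree, the same type serves for both whole utterance trees and their parts.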

Additionally, in other embodiments, each of the nodes 202 may be annotated by the structure subsystem with additional information about the word or phrase represented by the node to form an annotated embodiment of the utterance tree 166. For example, each of the nodes 202 may include a respective tag, identifier, shading, or cross-hatching that indicates the class annotation of the respective node. In particular, for the example utterance tree 166 illustrated in FIG. 7, certain subtrees or nodes (e.g., nodes 202A, 202B, 202C, and 202D) may be annotated with part-of-speech labels or tags to be verb nodes, certain subtrees or nodes (e.g., nodes 202E, 202F, 202G, 202H, 202I, and 202J) may be annotated to be subject or object nodes, and certain subtrees or nodes (e.g., nodes 202K, 202L, 202M, 202N, and 202P) may be annotated by the structure subsystem to be modifier nodes (e.g., subject modifier nodes, object modifier nodes, verb modifier nodes). These class annotations may then be used by the meaning search subsystem 152 when comparing meaning representations that are generated from annotated utterance trees. As such, it may be appreciated that the utterance tree 166, from which the meaning representations are generated, serves as a basis (e.g., an initial basis) for artifact extraction.

As recognized herein, to facilitate generation of the extracted artifacts 140, the meaning search subsystem 152 can determine the similarity between any two or more meaning representations, such as one or more of the meaning representations 162 of the utterance meaning model 160 and one or more of the meaning representations 158 of the understanding model 157. For example, FIG. 8 is an information flow diagram illustrating an embodiment of the meaning search subsystem 152 operating within a search space 250. As mentioned, the search space 250 of the present embodiment is populated with, and defined by, the meaning representations 158 of the understanding model 157. In other embodiments, the NLU framework 104 may generate the search space 250 based on uncovering meaning representations 158 generated by multiple understanding models, such as understanding models that are each suited to a particular context or domain. Accordingly, after the user utterances 122 are received, segmented into potential artifacts (e.g., by the prosody subsystem), and converted into respective meaning representations 162 (e.g., by the meaning extraction subsystem 150), the meaning search subsystem 152 compares the meaning representations 162 of the user utterances 122 to the meaning representations 158 of the search space 250. Indeed, as discussed in more detail below, the meaning search subsystem 152 may identify any suitable matching meaning representations 158 from the search space 250, enabling the NLU framework 104 to identify the extracted artifacts 140 therefrom.

In some embodiments, the meaning search subsystem 152 identifies one or multiple meaning representations 158 as suitable matches for each of the meaning representations 162 of the user utterance 122. For example, in response to receiving three meaning representations 162 corresponding to one or more user utterances 122, the meaning search subsystem 152 of certain embodiments returns one or more matching meaning representations 158 of the search space 250, provided that a meaning representation match can be found. The meaning search subsystem 152 may also score the matching meaning representations 158 and/or the artifacts therein with an accompanying confidence level, to facilitate appropriate agent responses 124 and/or actions 142 for the artifacts 140 most likely extracted from the meaning representations 158.

As recognized herein, the meaning search subsystem 152 may utilize a predictive similarity scoring scheme to progressively render more accurate similarity scores between the meaning representations 158 and 162, which facilitates targeted pruning of the meaning representations 158 and determination of the extracted artifacts 140. In such embodiments, the predictive similarity scoring scheme enables searches to be performed over large-scale manifestations of the search space 250, such as those generated based on the meaning representations 158 of multiple or expansive understanding models 157. In general, the similarity scoring subsystem of the meaning search subsystem 152 operates by first identifying mathematical comparison function lists that enable comparison between a particular utterance-based meaning representation 162 and each meaning representation 158 (e.g., search space meaning representation) within the search space 250. For meaning representations 158 that do not have a form comparable to that of the meaning representation 162, the similarity scoring subsystem may identify the meaning representations 158 as incompatible and prune them from the search space 250. The similarity scoring subsystem then implements the respective mathematical comparison function lists by applying the broadest or least expensive function therein to generate an initial similarity score between the particular meaning representation 162 and the remaining, comparable meaning representations 158 of the search space 250. As discussed below, because the meaning representations 158 that are not suitably similar to the particular meaning representation 162 are pruned from the search space 250, the meaning search subsystem 152 may proceed to use more computationally expensive functions (e.g., functions that consider additional nodes, functions that consider additional dimensions of the same number of nodes, functions that query databases such as dictionaries or external language models). The meaning search subsystem 152 thereby systematically narrows in on the suitable meaning representations 158 that provide the extracted artifacts 140.
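The cheap-first, prune-then-refine loop described above can be sketched as follows. This is only an illustration under stated assumptions: the function `predictive_score`, the per-stage thresholds, and the two toy comparison functions (`root_match` and `jaccard`, which operate on plain token lists rather than meaning representations) are hypothetical stand-ins, not functions named in the disclosure.

```python
from typing import Any, Callable, Dict, Sequence


def predictive_score(
    key: Any,
    candidates: Dict[str, Any],
    functions: Sequence[Callable[[Any, Any], float]],
    thresholds: Sequence[float],
) -> Dict[str, float]:
    """Apply comparison functions from cheapest to most expensive,
    pruning candidates whose score falls below each stage's threshold."""
    surviving = dict(candidates)
    scores: Dict[str, float] = {}
    for compare, threshold in zip(functions, thresholds):
        # Only the candidates that survived earlier stages are scored again.
        scores = {name: compare(key, cand) for name, cand in surviving.items()}
        surviving = {n: c for n, c in surviving.items() if scores[n] >= threshold}
    return {name: scores[name] for name in surviving}


def root_match(a: Sequence[str], b: Sequence[str]) -> float:
    """Cheap stage (assumed): compare only the first (root) token."""
    return 1.0 if a[0] == b[0] else 0.0


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """More expensive stage (assumed): token overlap across all nodes."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


key = ["buy", "shirt", "blue"]
search_space = {
    "purchase": ["buy", "shirt", "red"],
    "return": ["return", "battery"],
    "buy_pants": ["buy", "pants", "black"],
}
result = predictive_score(key, search_space, [root_match, jaccard], [1.0, 0.5])
```

In this toy run the cheap root comparison removes one candidate before the costlier overlap function is evaluated, which is the efficiency argument the passage makes.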

By way of example, FIG. 9 is an information flow diagram illustrating an embodiment of a similarity scoring subsystem 260 that may be implemented within the meaning search subsystem 152 of the NLU framework 104. As discussed below, the similarity scoring subsystem 260 retrieves mathematical comparison functions and uses them to iteratively compare any suitable number of meaning representations to one another via increasingly expensive functions. As an example, the present embodiment of FIG. 9 is directed to the functions of the similarity scoring subsystem 260 by which a first meaning representation 262 and a second meaning representation 264 are compared to the meaning representations 158 of the search space 250, though it should be understood that the techniques discussed below are applicable to each meaning representation of the NLU framework 104. As may be appreciated, the first meaning representation 262 may correspond to a first one of the meaning representations 162 discussed above, and the second meaning representation 264 may correspond to a second one of the meaning representations 162. In other embodiments, the meaning representations 262, 264 may each be derived from an utterance 266, which is primarily discussed herein as corresponding to the user utterance 122 but may instead correspond to one of the sample utterances 155 discussed above.

In general, each meaning representation 262, 264 belongs to zero, one, or multiple cognitive construction grammar (CCG) form classes, which are assigned based on the shape (e.g., utterance tree structure and part-of-speech tagging) of the meaning representation 262, 264. In other words, based on CCG techniques, the similarity scoring subsystem 260 recognizes that each meaning representation 262, 264 has a shape or structure (e.g., defined by an utterance tree or other suitable meta-structure) that includes part-of-speech tags for nodes (e.g., word vectors and/or combinations of word vectors) that are collectively mappable to CCG forms. The similarity scoring subsystem 260 may therefore perform searches based on the shapes of the meaning representations 262, 264 to identify suitable matching meaning representations 158 that include artifact matches for the meaning representations 262, 264.

In the illustrated embodiment, the similarity scoring subsystem 260 includes a form class database 270, which includes a form class table 272 therein. Although primarily discussed as a table, the form class table 272 may be embodied in any suitable data structure in other embodiments. In some embodiments, the form class database 270 and the form class table 272 may be stored within the database 106 of the agent automation framework 100. As recognized herein, each entry 275 (e.g., form class entry) of the form class table 272 describes a one-to-one form class comparison (also referred to as a CCG form class comparison) supported by the meaning search subsystem 152. In particular, the form class table 272 includes a first axis 273 associated with the CCG form of a first meaning representation and a second axis 274 associated with the CCG form of a second meaning representation being compared. Each axis label is associated with a form pattern for each of the respective CCG forms that the similarity scoring subsystem 260 supports, such as a verb-led phrase, a noun-led phrase, and so forth, and is represented by a suitable function identifier within a supported CCG form range of f₁ through f_N. Thus, it should be understood that the form pattern for a particular meaning representation defines CCG form class membership for that particular meaning representation.

In the present embodiment, the form class table 272 includes a respective one of the entries 275 for each intersection of two of the CCG forms, indicating whether the two associated CCG forms are comparable and, if so, providing instructions regarding performance of the comparison. It should be understood that the form class table 272 may include any suitable number of entries 275 corresponding to each possible permutation of compared CCG form classes. Notably, meaning representations that each belong to the same CCG form class are inherently comparable to one another, which is represented by the comparison function list, discussed below, indicated within each entry 275 along a central diagonal 276 of the form class table 272. As presently illustrated, the form class table 272 has a line of reflection symmetry along the central diagonal 276, indicating that the comparison functions of the present embodiment of the form class table 272 are commutative. That is, comparing a first meaning representation to a second meaning representation yields the same result as comparing the second meaning representation to the first meaning representation. In other embodiments, the form class table 272 may not include the line of reflection symmetry, thereby enabling the similarity scoring subsystem 260 to tailor the comparison function list, discussed below, based on the order or direction in which the meaning representations are being compared. As a particular example, one entry 275 of the form class table 272 may specify that a meaning representation having a verb-led CCG form can be compared to other meaning representations having a verb-led CCG form, a noun-led CCG form, and so forth. In the present embodiments, the similarity scoring subsystem 260 determines that a pair of meaning representations is not comparable in response to determining that the entry 275 for the comparison is empty (e.g., null, undefined), and therefore does not perform comparisons between the incomparable meaning representations.
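One way to picture a symmetric form class table with empty entries is the sketch below. It assumes, purely for illustration, that entries are keyed by an unordered pair of form names and that a missing key stands for an empty (null) entry; the form names and function identifiers are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional

# Keying each entry by an unordered pair makes the table symmetric by construction:
# (verb-led, noun-led) and (noun-led, verb-led) resolve to the same entry.
FormClassTable = Dict[FrozenSet[str], List[str]]

form_class_table: FormClassTable = {
    frozenset({"verb-led"}): ["compare_roots", "compare_full_tree"],  # diagonal entry
    frozenset({"noun-led"}): ["compare_roots"],                       # diagonal entry
    frozenset({"verb-led", "noun-led"}): ["compare_roots"],           # off-diagonal entry
}


def lookup_entry(table: FormClassTable, form_a: str, form_b: str) -> Optional[List[str]]:
    """Return the comparison function list for two CCG forms,
    or None when the entry is empty and the forms are not comparable."""
    return table.get(frozenset({form_a, form_b}))


def are_comparable(table: FormClassTable, form_a: str, form_b: str) -> bool:
    """Two forms are comparable only when their entry is present and non-empty."""
    return bool(lookup_entry(table, form_a, form_b))
```

An embodiment without the line of reflection symmetry could instead key entries by an ordered pair, so that the direction of comparison selects a different function list.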

As mentioned, the entry 275 of the form class table 272 for each supported CCG form class comparison of the similarity scoring subsystem 260 also includes, or directs the similarity scoring subsystem 260 to, a mathematical comparison function list 278 (e.g., form-algebra function list, processing rules) having one or multiple functions 280 (e.g., comparison functions). The functions 280 of each mathematical comparison function list 278 are a set of nested functions that provide progressively more expensive scoring functions, which enable each of the meaning representations 262, 264 to be compared to the search space 250, as described in more detail below. The mathematical comparison function list 278 may include vector algebra, cosine similarity functions, queries to external databases, and/or any other suitable mathematical functions or formulas that the similarity scoring subsystem 260 may employ to determine similarity scores between any suitable number of meaning representations. It should be understood that the functions 280 may further define a previous function of the mathematical comparison function list 278 or, alternatively, may be completely independent of the previous functions 280. In some embodiments, the mathematical comparison function list 278 for each entry 275 of the form class table 272 is manually specified by linguists or users, derived by ML techniques, or the like.
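As a hedged illustration of a comparison function list ordered from least to most expensive, the sketch below uses cosine similarity, which the passage names as one possible ingredient. The function names, the choice to represent a meaning representation as a flat list of node vectors with the root first, and the pairwise averaging in the second function are all assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Standard cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def compare_roots(a: List[Vector], b: List[Vector]) -> float:
    """Cheapest function (assumed): compare only the root node vectors."""
    return cosine_similarity(a[0], b[0])


def compare_all_nodes(a: List[Vector], b: List[Vector]) -> float:
    """Costlier function (assumed): average similarity over aligned node pairs."""
    pairs = list(zip(a, b))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(x, y) for x, y in pairs) / len(pairs)


# Ordered list: each later function considers more of the representation.
comparison_function_list: List[Callable[[List[Vector], List[Vector]], float]] = [
    compare_roots,
    compare_all_nodes,
]

key_vectors = [[1.0, 0.0], [0.0, 1.0]]
candidate_vectors = [[1.0, 0.0], [1.0, 0.0]]
stage_scores = [fn(key_vectors, candidate_vectors) for fn in comparison_function_list]
```

Here the root-only function reports a perfect match, while the function that considers the additional node lowers the score, showing how later functions refine the earlier estimate.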

In general, the functions 280 of the mathematical comparison function list 278 each score the similarity between the meaning representations 262, 264 and the comparable meaning representations 158 of the search space 250 by assigning a similarity score above a particular threshold score in response to the considered portions of the meaning representations 262, 264 suitably matching the meaning representations 158 of the search space 250. In certain embodiments, the functions 280 may assign zeros to, or otherwise penalize, the similarity score associated with respective meaning representations 158 of the search space 250 in response to the respective meaning representations 158 excluding or not matching important or significant nodes of the corresponding search key meaning representation 262, 264. As may be appreciated, the similarity scoring subsystem 260 does not compare a meaning representation to another meaning representation whose CCG form is unsuitable for comparison under the form class compatibility rules of the form class database 270, as denoted by the empty entries 275 of the form class table 272.
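The penalty for missing important nodes might look like the following sketch. How importance is decided is not specified in this passage, so the `important_tokens` set, the token-level matching, and the choice of zeroing the score (rather than some other penalty) are assumptions.

```python
from typing import Iterable, Set


def penalized_score(
    base_score: float,
    key_tokens: Iterable[str],
    candidate_tokens: Iterable[str],
    important_tokens: Set[str],
) -> float:
    """Zero the score when the candidate lacks any important token of the search key."""
    candidate = set(candidate_tokens)
    for token in key_tokens:
        if token in important_tokens and token not in candidate:
            return 0.0
    return base_score
```

For instance, `penalized_score(0.9, ["buy", "shirt"], ["buy", "pants"], {"shirt"})` yields 0.0 because the important token "shirt" is absent from the candidate.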

In other embodiments, the similarity scoring subsystem may immediately assign a similarity score of zero to incomparable pairs of meaning representations. In further embodiments, the similarity scoring subsystem 260 may perform the comparison by implementing a mathematical comparison function list 278 having functions 280 that cause the similarity scoring subsystem 260 to generate a similarity score of zero between the non-comparable meaning representations. In such embodiments, because the mathematical comparison function lists 278 may naturally cause the similarity scoring subsystem 260 to assign zero or null similarity scores to meaning representations 158 having CCG forms unsuitable for comparison with the meaning representations 262, 264, the form class table 272 may include an appropriate mathematical comparison function list 278 in each entry 275 of the form class table 272.

Moreover, in certain embodiments, the similarity scoring subsystem 260 may receive representations of multiple expressions of the utterance 266 from the utterance meaning model 160. For example, the meaning representations 262, 264 may be included within the utterance meaning model 160 as representing alternative forms for the utterance 266. Generally, each of the meaning representations 262, 264 (produced by the meaning extraction subsystem 150 and included in the utterance meaning model 160) represents a suitably distinct meaning representation that corresponds to the artifacts of the utterance 266. By considering each comparable pair of the meaning representations 262, 264, the similarity scoring subsystem 260 of the present embodiment may evaluate multiple interpretations of the utterance 266 to provide a more thorough search for, or to cast a larger net for, the corresponding extracted artifacts 140.

Referring briefly to FIG. 6 to discuss a particular example, the meaning extraction subsystem 150 of certain embodiments may determine that the utterance 266 "book meeting" has a first alternative meaning representation corresponding to "a request to book or schedule a meeting" and a second alternative meaning representation corresponding to "a meeting about a book." The meaning extraction subsystem 150 may then incorporate these alternative meaning representations within the utterance meaning model 160. By propagating both of these alternative meaning representations through the meaning search subsystem 152 to later stages of the similarity scoring process, the similarity scoring subsystem 260 is more likely to identify a suitable mathematical comparison function list 278 that enables identification of the appropriate extracted artifacts 140.

With the above description of the components of the similarity scoring subsystem 260 in mind, FIG. 10 is a flow diagram illustrating an embodiment of a process 300 by which the similarity scoring subsystem 260 retrieves one of the mathematical comparison function lists 278 that enables comparison between the first meaning representation 262 of FIG. 9 and the meaning representations 158 of the search space 250 of FIG. 8. It should be understood that the process 300 may be repeated in parallel, or performed individually, for each utterance-based meaning representation 162 considered by the similarity scoring subsystem 260 of the meaning search subsystem 152. As mentioned, the meaning search subsystem 152, which includes the similarity scoring subsystem 260 therein, may receive the meaning representation 262 for the utterance 266 from the utterance meaning model 160 defined by the meaning extraction subsystem 150. In other embodiments, the similarity scoring subsystem 260 may retrieve the meaning representation 262 directly from the meaning extraction subsystem 150. The steps illustrated and/or described as part of the meaning extraction subsystem 150 may be stored in a suitable memory (e.g., memory 86) and executed by a suitable processor (e.g., processor 82) associated with the client instance 42 or the enterprise instance 125 (e.g., within the data center 22, hosted by the cloud-based platform 20).

As such, the similarity scoring subsystem 260 begins the illustrated embodiment of the process 300 by determining (block 302) a CCG form 304 (e.g., cognitive construction grammar form) of the meaning representation 262. As mentioned, the CCG form 304 may be any suitable description of the shape formed by the nodes of the meaning representation 262, analyzed with respect to the part-of-speech annotations or tagging of the tokens (e.g., words or phrases) within the nodes. For example, the similarity scoring subsystem 260 may identify that the meaning representation 262 has a CCG form 304 corresponding to a verb-pronoun-noun phrase, a noun-verb-direct adjective phrase, a noun-verb-adverb phrase, and so forth.

In the present embodiment, the similarity scoring subsystem 260 determines the CCG form 304 for the meaning representation 262 by applying a set of processing rules 306 (e.g., form processing rules) and assigning the CCG form 304 to the meaning representation 262 based on its correspondence to the set of processing rules 306. In other embodiments, the similarity scoring subsystem 260 may determine the CCG form 304 in any other suitable manner, such as by receiving the CCG form 304 from the meaning extraction subsystem 150 or by ML-based pattern matching techniques. As an example of ML-based pattern matching, the similarity scoring subsystem 260 may determine a suitable or closest-matching CCG form 304 for the meaning representation 262 by comparing the shape of the meaning representation 262, with its part-of-speech tags, to the meaning representations 158 of the understanding model 157 and identifying similarities therebetween. As may be appreciated, the similarity scoring subsystem 260 may include any suitable plug-ins or other processing components that enable determination of the CCG form 304 of the meaning representation 262.
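A rule-based determination of the CCG form could be sketched as a lookup of part-of-speech patterns, as below. The rule format (an exact tag-sequence match), the constant `PROCESSING_RULES`, and the function `determine_ccg_form` are hypothetical; the disclosure does not specify how the processing rules 306 are encoded.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical processing rules: a part-of-speech pattern mapped to a CCG form name.
PROCESSING_RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("verb", "pronoun", "noun"), "verb-pronoun-noun"),
    (("noun", "verb", "adverb"), "noun-verb-adverb"),
]


def determine_ccg_form(
    pos_tags: Sequence[str],
    rules: Sequence[Tuple[Tuple[str, ...], str]] = PROCESSING_RULES,
) -> Optional[str]:
    """Return the first CCG form whose pattern matches the tag sequence exactly,
    or None when no rule applies."""
    tags = tuple(pos_tags)
    for pattern, form in rules:
        if tags == pattern:
            return form
    return None
```

Returning `None` when no rule matches leaves room for the alternative paths the passage mentions, such as receiving the form from another subsystem or falling back to pattern matching.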

Once the CCG form 304 for the semantic representation 262 is identified, the similarity scoring subsystem 260 determines zero, one, or multiple matching form classes 312 for the CCG form 304 from the form class database 270 (block 310). The matching form classes 312 may be identified by pattern matching, such as by querying table-type embodiments of the form class database 270. More specifically, the similarity scoring subsystem 260 of the present embodiments finds corresponding entries 275 within the form class table 272 that have mathematical comparison function lists 278 therein, thereby determining which form classes the semantic representation 262 can be compared against. Indeed, as noted above, each entry 275 of the form class table 272 for a particular CCG form 304 may include one of the mathematical comparison function lists 278 for each CCG form that can be compared with the CCG form 304 of the semantic representation 262 (e.g., form class compatibility), as well as the set of functions 280 by which similarity can be computed.

Accordingly, the similarity scoring subsystem 260 following the process 300 retrieves, from each entry 275 of the form class table 272, the respective mathematical comparison function list 278 that represents an identified match between the CCG form 304 of the semantic representation 262 and the CCG form of another semantic representation associated with that entry 275 (e.g., each matching form class 312) (block 314). In the present embodiment, the similarity scoring subsystem 260 identifies one matching form class 312 and outputs one mathematical comparison function list 278, though it should be understood that any other suitable number of mathematical comparison function lists 278 may be retrieved, depending on the number of form class matches. For example, the similarity scoring subsystem 260 may retrieve a first mathematical comparison function list 278 that directs how to compare the semantic representation 262 with a semantic representation having the same CCG form 304, as well as a second mathematical comparison function list 278 that enables comparison between the semantic representation 262 and another semantic representation having a different CCG form 304.
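The table lookup described above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the form class table is an in-memory dictionary keyed by a pair of CCG form labels; the names `FORM_CLASS_TABLE`, `lookup_function_lists`, `compare_roots`, and `compare_all_tokens` are hypothetical and do not come from the disclosure.

```python
from typing import Callable, Dict, List, Tuple

ComparisonFn = Callable[[dict, dict], float]


def compare_roots(a: dict, b: dict) -> float:
    """Cheap comparison: 1.0 when the root tokens are identical, else 0.0."""
    return 1.0 if a["root"] == b["root"] else 0.0


def compare_all_tokens(a: dict, b: dict) -> float:
    """Costlier comparison: Jaccard overlap of all tokens."""
    ta, tb = set(a["tokens"]), set(b["tokens"])
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


# Hypothetical table: (query form, candidate form) -> function list,
# ordered from least to most expensive.
FORM_CLASS_TABLE: Dict[Tuple[str, str], List[ComparisonFn]] = {
    ("verb-noun", "verb-noun"): [compare_roots, compare_all_tokens],
    ("verb-noun", "verb-adjective-noun"): [compare_roots],
}


def lookup_function_lists(form: str) -> Dict[str, List[ComparisonFn]]:
    """Return one ordered function list per compatible candidate form.

    The result may hold zero, one, or several entries.
    """
    return {
        candidate: functions
        for (query, candidate), functions in FORM_CLASS_TABLE.items()
        if query == form
    }
```

Here a query with the form `"verb-noun"` yields two function lists (one for a same-form comparison and one for a cross-form comparison), while an unsupported form yields an empty result.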

To illustrate use of the mathematical comparison function list 278, FIG. 11 is a diagram of an embodiment of the similarity scoring subsystem 260 of the semantic search subsystem 152 using one mathematical comparison function list 278 to compare the semantic representation 262 with a search space semantic representation 330 (e.g., of the search space 250). In the present embodiment, the search space semantic representation 330 is one of the semantic representations 158 of the search space 250, though it should be understood that the present techniques may be used to compare any suitable semantic representations. As discussed above, the mathematical comparison function list 278 includes an ordered set of functions 280, in which the functions 280 positioned deeper within the list progressively use more computational resources and/or more data (e.g., a larger number of nodes) from the compared semantic representations 262, 330. As recognized herein, the mathematical comparison function list 278 is ordered with the less computationally expensive functions 280 first and proceeds to the more expensive functions as potential semantic representations 158 of the search space 250 are pruned. As such, the similarity scoring subsystem 260 generally determines more accurate similarity scores 340 (e.g., predictive similarity scores) via the subsequently applied, more expensive functions 280.

For example, during a first comparison 350, the similarity scoring subsystem 260 may implement a first function 352 to compare a root node 354 of the semantic representation 262 with a search space root node 356 of the search space semantic representation 330. As discussed above, the first function 352 is the least expensive function 280 of the mathematical comparison function list 278. The similarity scoring subsystem 260 may then determine a first similarity score 360 that describes a local (e.g., low-depth, cheapest to compute) similarity between the semantic representations 262, 330. Given that the first function 352 is the least expensive function of the mathematical comparison function list 278, it should be understood that the root nodes 354, 356 are merely examples of the portions of the semantic representations 262, 330 compared via the first function 352, and that other portions, up to and including the entirety of each semantic representation 262, 330, may be compared in the first comparison 350. The similarity scores 340 may be any suitable mathematical expression of the similarity or correspondence between the forms, including part-of-speech tags, of two semantic representations, such as a value assigned within a similarity scoring range of 0 to 1, 0 to 5, 0 to 10, and so forth. It should be understood that the similarity scores 340 described herein are distinct from the previously described similarity measures determined between two intent vectors, at least because the similarity scores 340 describe the overall meaning-based and form-based similarity between the considered portions of two semantic representations.

In the present embodiment of FIG. 11, the root nodes 354, 356 evaluated during the first comparison 350 are illustrated as hollow circles. As illustrated by the solid circles, the remaining dependent nodes of the semantic representations 262, 330 are effectively "covered" (e.g., not considered) during the first comparison 350. Extending this convention to the subsequent comparisons discussed below, nodes illustrated as hollow circles (e.g., considered nodes 362) are evaluated in the respective comparison, whereas nodes illustrated as solid circles (e.g., unconsidered nodes 364) are not.

In general, the similarity scoring subsystem 260 leverages the ordering of the functions 280, from least expensive to most expensive, so that the more expansive and computationally expensive comparisons positioned later within the mathematical comparison function list 278 are performed only on a focused portion of the search space 250. In some embodiments, the similarity scoring subsystem 260 may selectively "uncover" (e.g., consider) more nodes of the semantic representations 262, 330 as it proceeds through the subsequent comparisons.

For example, after determining the first similarity score 360, the similarity scoring subsystem 260 of the present embodiment proceeds to implement a second function 368 within a second comparison 370, which generates a more accurate second similarity score 372 based on an increased number of considered nodes 362. The similarity scoring subsystem 260 then performs a third comparison 374 that considers the same number of nodes and generates a further refined third similarity score 376 via a third function 378. The third function 378 is more expensive because it considers a greater number of dimensions of the word vectors associated with the considered nodes 362, consults an external database, or performs any other suitable functions or operations. The similarity scoring subsystem 260 subsequently performs a final comparison 380 that generates a most accurate final similarity score 382 via a fourth function 384, which is generally the most expensive function 280 used. More details regarding the example comparisons 350, 370, 374, 380 between the semantic representations 262, 330 are provided below.

Additionally, although not illustrated, it should be understood that the similarity scoring subsystem 260 may implement weights or coefficients (e.g., derived from ML techniques) within the comparisons to adjust the contributions of the various parts of speech to the similarity score in a context-dependent manner. For example, when processing search engine requests, the similarity scoring subsystem 260 may increase the contribution of modifiers relative to nouns or verbs. Alternatively, during purchase order requests provided to a chatbot embodiment of the NLU framework 104, the similarity scoring subsystem 260 may increase the contribution of nouns relative to modifiers or verbs. In some embodiments, the similarity scoring subsystem 260 learns the weights from the semantic representations 158 within the NLU framework 104 during operation, and may therefore develop to become more conversant over time.
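As a rough illustration of such context-dependent weighting, the following Python sketch combines per-node scores into a single score using part-of-speech weights. The function name, the weighted-mean formula, and the example weight values are assumptions made for this sketch; the disclosure does not specify how the weights enter the comparison.

```python
from typing import Dict, List, Tuple


def weighted_similarity(
    node_scores: List[Tuple[str, float]],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of per-node scores.

    node_scores holds (part-of-speech tag, score) pairs; tags that are
    missing from weights default to a weight of 1.0.
    """
    total = sum(weights.get(pos, 1.0) for pos, _ in node_scores)
    if total == 0:
        return 0.0
    weighted = sum(weights.get(pos, 1.0) * score for pos, score in node_scores)
    return weighted / total


# Hypothetical per-node scores for one pair of compared representations.
node_scores = [("verb", 1.0), ("noun", 0.8), ("modifier", 0.0)]

# Hypothetical context-specific weights.
search_weights = {"modifier": 2.0}   # search requests: emphasize modifiers
purchase_weights = {"noun": 2.0}     # purchase orders: emphasize nouns
```

With these illustrative values, emphasizing the poorly matching modifier lowers the combined score, while emphasizing the well-matching noun raises it.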

As recognized herein, the similarity scoring subsystem 260 of certain embodiments may also analyze the similarity between two semantic representations having different sizes by comparing suitable combinations of the subtree vectors of the semantic representations. For example, the similarity scoring subsystem 260 may determine a combination (e.g., a centroid or weighted average) of the subtree vectors of those nodes of the longer semantic representation that have no comparable counterpart in the shorter semantic representation. The similarity scoring subsystem 260 of these embodiments may then analyze the modified semantic representation (which includes the combination in place of the nodes lacking counterparts) against the shorter semantic representation, in accordance with the present techniques.
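The centroid option mentioned above can be sketched in a few lines of Python. The sketch assumes, hypothetically, that nodes are aligned by position so that the first `n_matched` subtree vectors of the longer representation have counterparts in the shorter one; how nodes are actually aligned is not stated in the text, and the function names are illustrative.

```python
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def collapse_extra_nodes(
    longer: Sequence[Sequence[float]],
    n_matched: int,
) -> List[Sequence[float]]:
    """Keep the matched subtree vectors and replace the rest by one centroid."""
    head, tail = list(longer[:n_matched]), longer[n_matched:]
    return head + [centroid(tail)] if tail else head
```

For instance, collapsing `[[1.0, 0.0], [0.0, 2.0], [2.0, 4.0]]` with one matched node leaves the first vector untouched and merges the other two into `[1.0, 3.0]`, so the modified representation has fewer vectors to compare.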

In certain embodiments, the similarity scoring subsystem 260 may apply particular interdependent embodiments of the functions 280 to expand the portion of considered nodes for one, the other, or both of the semantic representations 262, 330, based on the particular content of the semantic representations 262, 330, the particular context in which the similarity scoring subsystem 260 is operating, the desired granularity of the similarity scoring process, and so forth. For example, when a particularly deep similarity scoring process is desired, the similarity scoring subsystem 260 may expand the number of considered nodes 362 of the semantic representation 262 without doing the same for the search space semantic representation 330. During the next comparison, the similarity scoring subsystem 260 may collapse the considered nodes 362 of the semantic representation 262 back to their previous extent and instead expand the considered nodes 362 of the search space semantic representation 330, so that each permutation of considered nodes 362 is evaluated. In other embodiments, in which a rapid similarity scoring process is performed, the similarity scoring subsystem 260 may instead expand the considered nodes 362 of both semantic representations 262, 330 for each comparison. Further, although the semantic representations 262, 330 are illustrated as having identical forms and lengths, it should be understood that the similarity scoring subsystem 260 may iteratively compare any suitable semantic representations having non-identical forms and/or lengths.

It is also recognized that, in some situations, the semantic representation 262 of the utterance 266 corresponds to multiple CCG forms 304, such as both a broad noun-led CCG form and a more specific CCG form (e.g., noun-adjective-verb). As such, the similarity scoring subsystem 260 of certain embodiments may identify multiple form classes supported by the form class database 270 into which the semantic representation 262 fits. In these embodiments, the similarity scoring subsystem 260 may advantageously generate similarity scores for each CCG form 304 that describes the semantic representation 262. The similarity scoring subsystem 260 may then aggregate the similarity scores for each assigned CCG form of the semantic representation 262, improving operation of the agent automation system 100 by increasing the scope of the semantic search.

In any case, the similarity scoring subsystem 260 may apply a reconciliation function to the similarity scores for each CCG form of the semantic representation, thereby reconciling (e.g., consolidating) the respective similarity scoring results from the comparisons performed under the multiple CCG form 304 interpretations of the semantic representation 262. In certain embodiments, the reconciliation function may be stored within the form class database 270. As recognized herein, the reconciliation function enables the similarity scoring subsystem 260 to generate and output an aggregate refined similarity score (e.g., an aggregate similarity score) that transcends the limitations arising from a single interpretation of the CCG form 304 of the semantic representation 262, for example by retaining the maximum or a weighted average of the similarity scores. Further, the reconciliation function of certain embodiments may be individually tailored for each combination of CCG forms 304, for each client, and for each domain in which the similarity scoring subsystem 260 operates.
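The two aggregation choices named above for the reconciliation (adjustment) function, a maximum or a weighted average, can be sketched as follows. The function name `reconcile`, the dictionary layout, and the form labels are assumptions for illustration only.

```python
from typing import Dict, Optional


def reconcile(
    scores_by_form: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-CCG-form similarity scores into one aggregate score.

    Without weights the maximum is kept; with weights a weighted average
    is returned (forms missing from weights contribute nothing).
    """
    if not scores_by_form:
        raise ValueError("no similarity scores to reconcile")
    if weights is None:
        return max(scores_by_form.values())
    total = sum(weights.get(form, 0.0) for form in scores_by_form)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weights.get(form, 0.0) * score for form, score in scores_by_form.items()
    )
    return weighted / total
```

Given hypothetical scores of 0.6 under a broad noun-led form and 0.9 under a noun-adjective-verb form, the maximum strategy returns 0.9, while weights of 1 and 3 give a weighted average of 0.825.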

FIG. 12 is a flow diagram illustrating an embodiment of a process 400 by which the similarity scoring subsystem 260 may use the mathematical comparison function list 278 to iteratively identify matching semantic representations from the search space 250. As discussed above, the search space 250 is defined by the semantic representations 162 of at least one understanding model 157. As will be appreciated, the process 400 enables the similarity scoring subsystem 260 to predictively evaluate the similarity between each of the semantic representations 162 of the utterance semantic model 160 (corresponding to the semantic representation 262 introduced above) and the vast number of semantic representations 158 within the search space 250. The process 400 may be stored in a suitable memory (e.g., memory 86) and executed by a suitable processor (e.g., processor(s) 82) associated with the client instance 42 or the enterprise instance 125, as discussed above with respect to FIGS. 3, 4A, and 4B.

Specifically, the similarity scoring subsystem 260 of the illustrated embodiment iterates through each semantic representation 262 of the utterance semantic model 160 with a for-each loop (block 402). It should be understood that the similarity scoring subsystem 260 may implement any other suitable processing scheme in place of the for-each loop that enables the generation of a refined similarity score 414 for each CCG form 304. For example, the similarity scoring subsystem 260 may alternatively implement a do-while loop, a for loop, a while loop, a do-until loop, and so forth. In any case, for each semantic representation 262 of the utterance semantic model 160, the similarity scoring subsystem determines the CCG form of the respective semantic representation 262 (block 404) and retrieves the associated mathematical comparison function list 278 from the form class database 270. When initializing the iteration parameters for the process 400 based on the CCG form, the similarity scoring subsystem 260 also selects the first function 280 of the mathematical comparison function list 278 (block 406) and defines the search subspace of interest to initially be the entire search space 250.

In the illustrated embodiment, the process 400 compares the semantic representation 262 derived from the user utterance 122 with the comparable semantic representations 158 of the search subspace (block 410), generating a set 412 of similarity scores that correspond to the comparisons of the semantic representation 262 against the comparable semantic representations of the search subspace. In some embodiments, the similarity scoring subsystem 260 may determine the set 412 of similarity scores based on a distance between the semantic vectors (e.g., subtree vectors) of the compared semantic representations. As mentioned, the similarity scoring subsystem 260 implementing the first function 352 uses the least amount of computational resources. As such, the similarity scoring subsystem 260 may perform this initial CCG form search and similarity scoring more rapidly and/or efficiently than other search systems, which may use a single-complexity comparison function to exhaustively compare the semantic representation of a user utterance with every semantic representation in a search space or understanding model.
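The text says only that scores may be based on a distance between semantic vectors, without naming a metric. As one possible illustration, the Python sketch below uses cosine similarity rescaled to the 0 to 1 range; both the choice of cosine similarity and the function name `cosine_similarity_score` are assumptions, not part of the disclosure.

```python
import math
from typing import Sequence


def cosine_similarity_score(u: Sequence[float], v: Sequence[float]) -> float:
    """Map the cosine of the angle between u and v from [-1, 1] onto [0, 1].

    Returns 0.0 when either vector has zero length.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return (dot / (norm_u * norm_v) + 1.0) / 2.0
```

Under this rescaling, identical directions score 1.0, orthogonal vectors score 0.5, and opposite directions score 0.0, which fits the 0 to 1 scoring range mentioned earlier.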

For example, turning now to FIG. 13, the figure illustrates a schematic diagram of an embodiment of the similarity scoring subsystem 260 applying the mathematical comparison function list 278 to selectively refine the search space 250 into suitable search subspaces 502. For example, during a first comparison 500, the similarity scoring subsystem 260 may apply the first function 352 to compare the semantic representation 262 with the comparable semantic representations 158 within the search subspace 502, which is initialized to be the entirety of the search space 250. As recognized herein, this application of the least accurate and most efficient function 352 enables the similarity scoring subsystem 260 to efficiently perform a first-pass search across the search subspace 502. In the present embodiment, the first function 352 considers the root node 354 of the semantic representation 262, though it should be understood that another suitable portion of the semantic representation 262 (e.g., other nodes or combinations of nodes) may be analyzed via the first function 352.

Returning to FIG. 12, the similarity scoring subsystem 260 removes or prunes semantic representations 158 from the search subspace 502 (block 414), where the pruned semantic representations 158 are those having similarity scores of the set 412 that are below a threshold similarity score (e.g., a predetermined threshold score). In certain embodiments, the threshold similarity score is a predefined value, within the range of possible similarity scores, that is stored as a parameter within the similarity scoring subsystem 260. Accordingly, the similarity scoring subsystem 260 may shrink the search subspace 502 so that subsequent functions 280 are applied efficiently to a reduced number of semantic representations 158 of the search subspace 502. Indeed, returning to FIG. 14, the search subspace 502 is narrowed (e.g., constricted, cropped) after the first comparison 500, removing from the search subspace 502 the semantic representations 158 that are associated with a similarity score of the set 412 that is below the threshold.

In some embodiments, the similarity scoring subsystem 260 receives a user-defined value as the threshold similarity score, calibrating the threshold below which semantic representations 158 of the search subspace 502 are disregarded. The similarity scoring subsystem 260 may also update the threshold similarity score based on ML techniques, based on the particular context in which the similarity scoring subsystem 260 is currently operating, and so forth. For example, the similarity scoring subsystem 260 may implement a relatively high or selective threshold similarity score (e.g., at least a 90% match) for contexts in which the particular search subspace 502 is very large and/or to further accelerate the predictive similarity scoring process. Additionally, the threshold similarity score may be individually selected or updated after each function 280 is applied. More specifically, the similarity scoring subsystem 260 disclosed herein may decrease the value or selectivity of the threshold similarity score for a subsequent comparison in response to determining that more than a threshold number of semantic representations 158 satisfied the threshold similarity score in a previous comparison.

Returning to the process 400 of FIG. 12, the similarity scoring subsystem 260 determines whether the CCG form search presented by block 410 is to continue (block 416). As recognized herein, the similarity scoring subsystem 260 may decide whether to continue the CCG form search based on whether one or multiple suitable stopping conditions are met. For example, the similarity scoring subsystem 260 may end the CCG form search in response to every semantic representation having been pruned from the search subspace 502 (e.g., indicating that there are no matches), a threshold number of semantic representations remaining within the search subspace (e.g., indicating the most likely matches), the most recently applied function 280 indicating that stopping conditions embedded within the function 280 have been met, all functions 280 of the mathematical comparison function list 278 having been applied, and so forth.
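The stopping conditions listed above can be gathered into a single predicate, as in the following minimal Python sketch. The function name, the parameter names, and the default target size are hypothetical; the disclosure does not say how the conditions are represented or combined.

```python
def should_stop(
    remaining: int,
    functions_applied: int,
    total_functions: int,
    target_size: int = 1,
    function_requested_stop: bool = False,
) -> bool:
    """Return True when any of the listed stopping conditions holds."""
    return (
        remaining == 0                           # everything was pruned: no match
        or remaining <= target_size              # only the likeliest matches remain
        or function_requested_stop               # condition embedded in a function
        or functions_applied >= total_functions  # function list is exhausted
    )
```

For example, with 50 candidates remaining after the first of four functions the search continues, whereas it stops once the subspace is empty or the last function has been applied.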

In response to determining at block 416 that the CCG form search should continue, the similarity scoring subsystem 260 selects the next function 280 of the mathematical comparison function list 278 (block 420). Then, as indicated by arrow 422, the similarity scoring subsystem 260 returns to block 410 to compare the semantic representation 262 with the remaining comparable semantic representations 158 of the search subspace 502. Accordingly, the similarity scoring subsystem 260 refines (e.g., modifies, updates) the set 412 of similarity scores associated with the remaining comparable semantic representations of the search subspace 502 by using the more expensive function 280 of the mathematical comparison function list 278. After each comparison, the similarity scoring subsystem 260 may store an array of the various sets 412 of similarity scores generated via the process 400 or, alternatively, may replace each previously generated similarity score of the set 412 with its more accurate counterpart. Indeed, because more processing resources are used during application of the subsequent functions 280, the set 412 of similarity scores generally improves in accuracy and/or precision as more functions 280 are applied. Based on the set 412 of similarity scores, the similarity scoring subsystem 260 performing the illustrated process 400 prunes from the search space 250 the semantic representations 158 associated with respective similarity scores of the set 412 that are below the threshold similarity score (block 404).
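The compare, prune, and select-next-function loop of the process 400 can be summarized in a short Python sketch. It assumes toy semantic representations stored as dictionaries and a single fixed threshold; `cascade_search`, `root_match`, `token_overlap`, and the sample data are all hypothetical stand-ins rather than the functions 280 themselves.

```python
from typing import Callable, Dict, List

ComparisonFn = Callable[[dict, dict], float]


def root_match(query: dict, candidate: dict) -> float:
    """Cheap first-pass comparison of root tokens only."""
    return 1.0 if query["root"] == candidate["root"] else 0.0


def token_overlap(query: dict, candidate: dict) -> float:
    """Costlier comparison: Jaccard overlap of all tokens."""
    q, c = set(query["tokens"]), set(candidate["tokens"])
    return len(q & c) / len(q | c) if q | c else 0.0


def cascade_search(
    query: dict,
    search_space: Dict[str, dict],
    functions: List[ComparisonFn],
    threshold: float,
) -> Dict[str, float]:
    """Apply functions cheapest-first, pruning low scorers after each pass."""
    subspace = dict(search_space)  # starts as the entire search space
    scores: Dict[str, float] = {}
    for fn in functions:
        # Score only the survivors, replacing their earlier, coarser scores.
        scores = {name: fn(query, cand) for name, cand in subspace.items()}
        subspace = {n: c for n, c in subspace.items() if scores[n] >= threshold}
        scores = {n: scores[n] for n in subspace}
        if not subspace:  # everything pruned: no match
            break
    return scores


query = {"root": "order", "tokens": ["order", "new", "laptop"]}
space = {
    "a": {"root": "order", "tokens": ["order", "laptop"]},
    "b": {"root": "order", "tokens": ["order", "pizza", "now", "please"]},
    "c": {"root": "reset", "tokens": ["reset", "password"]},
}
result = cascade_search(query, space, [root_match, token_overlap], threshold=0.5)
```

In this toy run the root comparison removes candidate `c`, so the costlier token comparison runs on only two candidates and then removes `b`, leaving `a` as the sole surviving match.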

Referring again to FIG. 13, to refine the set 412 of similarity scores and prune the search subspace 502, the similarity scoring subsystem 260 applies a second function 368 to compare the semantic representation 262 with the semantic representations 158 remaining in the reduced-size search subspace 502. The similarity scoring subsystem 260 may thus further shrink the search subspace 502 to suitable candidates that meet the threshold similarity score. In certain embodiments, the structure of the function 180 guides the uncovering or expansion of respective nodes of the compared semantic representations, for example based inherently on the terms within the functions 280 of the mathematical comparison function list 278. For example, the first function 352 may include a single term that compares the root node 354 of the semantic representation with the semantic representations 158 of the search subspace 502, whereas the second function 368 may include one or more terms that compare an expanded portion of the semantic representation 262 with the semantic representations 158.

Accordingly, for a given comparison, the similarity scoring subsystem 260 iteratively applies progressively more accurate and more expensive functions 280 to the surviving (e.g., in-beam) semantic representations 158 of the search subspace 502. Continuing the discussion of the process 400 of FIG. 12 with reference to FIG. 13, the similarity scoring subsystem 260 may implement a third function 378 during a third comparison 516 to compare the same uncovered portion 510 of the semantic representation 262 with a further pruned embodiment of the search subspace 502, and so forth. In this way, the similarity scoring subsystem 260 of the present embodiment is designed to implement a final function 384 during a final comparison 524 to compare the entirety 526 of the semantic representation 262 with a final embodiment of the search subspace 502, thereby reserving the most computationally intensive final function 384 for a markedly reduced number of remaining semantic representation 158 candidates. Indeed, in certain cases, the final comparison 524 of FIG. 13 may leverage the entirety of the information available within the semantic representation 262 to produce a set of semantic representations 158 from the search space 250 that suitably match the semantic representation 262. In other embodiments, as discussed above, the final function 384 may consider a portion of the semantic representation 262 via any other suitable resource-intensive process, such as querying an external language model.
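The idea of uncovering more nodes with each successive function can be sketched as follows. This is an illustrative assumption only: the tree is a plain labeled tree, similarity is a Jaccard overlap of node labels, and `Node`, `labels_to_depth`, and `compare_to_depth` are hypothetical names. The disclosure does not prescribe this scoring rule.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Hypothetical tree node of a semantic representation."""
    label: str
    children: List["Node"] = field(default_factory=list)


def labels_to_depth(node: Node, depth: int) -> Set[str]:
    """Collect labels of the nodes uncovered down to the given depth (0 = root only)."""
    found = {node.label}
    if depth > 0:
        for child in node.children:
            found |= labels_to_depth(child, depth - 1)
    return found


def compare_to_depth(a: Node, b: Node, depth: int) -> float:
    """Assumed scoring rule: Jaccard overlap of the labels uncovered so far."""
    la, lb = labels_to_depth(a, depth), labels_to_depth(b, depth)
    return len(la & lb) / len(la | lb)


utterance = Node("reset", [Node("password")])
candidate = Node("reset", [Node("router")])

root_only = compare_to_depth(utterance, candidate, depth=0)  # root nodes only
expanded = compare_to_depth(utterance, candidate, depth=1)   # root plus dependents
```

Here the root-only comparison cannot tell the two trees apart, while the expanded comparison lowers the score once the differing dependent nodes are uncovered, which is the kind of refinement the later, more expensive functions are meant to provide.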

Returning to FIG. 12, the similarity scoring subsystem 260 may thus determine at block 416 that the stopping conditions of the CCG form search have been met, and then output the set 430 of matching semantic representations identified from the search subspace 502. The similarity scoring subsystem 260 can therefore efficiently identify the set 430 of matching semantic representations and provide it to other components of the NLU framework 104 for subsequent determination of the extracted artifacts 140.

Technical effects of the present disclosure include providing an agent automation framework that implements a semantic search subsystem capable of adeptly narrowing (e.g., windowing) a search space populated with semantic representations derived from sample utterances, thereby improving the identification of semantic representations that suitably match the semantic representation of a received user utterance. The present embodiments relate in particular to a similarity scoring subsystem of the semantic search subsystem that efficiently and predictively compares semantic representations via CCG techniques. That is, based on the identified CCG form of a particular semantic representation, the similarity scoring subsystem may determine a mathematical comparison function list that enables quantifications of the similarity between semantic representations. The comparison functions of this list are ordered sequentially so that the similarity scoring subsystem performs the most efficient and least expensive comparisons between the semantic representations first and determines similarity scores between them. Based on the similarity scores, the similarity scoring subsystem may iteratively identify particularly similar semantic representations within the search space, prune (e.g., shrink, reduce) the search space to these semantic representations, and then implement more computationally intensive comparison functions on the same or an increased number of nodes of the semantic representations. That is, iterative application of selective node uncovering and/or increased resource utilization generally narrows the potentially matching semantic representations within the search space while considering more data of the compared semantic representations via increasingly complex comparison functions. As such, the present techniques for predictive similarity scoring enable targeted discovery of semantic representation matches that allow the NLU framework to efficiently extract artifacts, while reducing computational overhead to a level suitable for enterprise-level natural language engagement with multiple users.
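The lookup from an identified CCG form to an ordered comparison function list might be sketched as below. The table contents, the form names ("verb-led", "noun-led"), the numeric costs, and the helper `function_list_for` are all hypothetical; the disclosure does not specify how the form class database stores its entries.

```python
from typing import Dict, List, Tuple

# Hypothetical form class database: each CCG form maps to comparison
# function names paired with an assumed relative cost.
FORM_CLASS_DB: Dict[str, List[Tuple[int, str]]] = {
    "verb-led": [
        (100, "compare_full_tree"),
        (1, "compare_root"),
        (10, "compare_root_and_dependents"),
    ],
    "noun-led": [
        (1, "compare_root"),
        (100, "compare_full_tree"),
    ],
}


def function_list_for(ccg_form: str) -> List[str]:
    """Return the function names for a CCG form, cheapest first.

    An unknown form yields an empty list, meaning no comparable entry exists.
    """
    entries = FORM_CLASS_DB.get(ccg_form, [])
    return [name for _cost, name in sorted(entries)]


ordered = function_list_for("verb-led")
```

Sorting by cost at lookup time is one design option; an implementation could equally store each list already ordered so that retrieval involves no extra work.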

The specific embodiments described above have been shown by way of example, and it should be understood that these embodiments may be susceptible to various modifications and alternative forms. It should be further understood that the claims are not intended to be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure.

The techniques presented and claimed herein are referenced and applied to material objects and concrete examples of a practical nature that demonstrably improve the present technical field and, as such, are not abstract, intangible, or purely theoretical. Further, if any claims appended to the end of this specification contain one or more elements designated as "means for [perform]ing [a function]..." or "step for [perform]ing [a function]...", it is intended that such elements are to be interpreted under 35 U.S.C. 112(f). However, for any claims containing elements designated in any other manner, it is intended that such elements are not to be interpreted under 35 U.S.C. 112(f).

Claims

An agent automation system comprising:
a memory configured to store a natural language understanding (NLU) framework comprising a similarity scoring subsystem having a form class database; and
a processor configured to execute instructions that cause the similarity scoring subsystem of the NLU framework to perform actions,
the actions comprising:
receiving a semantic representation of a user utterance;
identifying a cognitive construction grammar (CCG) form of the semantic representation;
determining at least one form class entry in the form class database that matches the CCG form of the semantic representation; and
retrieving a mathematical comparison function list from the at least one form class entry, wherein the mathematical comparison function list enables the similarity scoring subsystem to compare at least a portion of the semantic representation with at least a search space portion of a search space semantic representation to determine a similarity score therebetween.

The agent automation system of claim 1,
wherein the list of mathematical comparison functions comprises an ordered set of functions that enable the similarity scoring subsystem to progressively use more computationally expensive functions to compare the semantic representation with the search space semantic representation.

The agent automation system of claim 1,
wherein the search space semantic representation is one of a plurality of search space semantic representations defining a search space of the NLU framework, and wherein the instructions cause the similarity scoring subsystem to perform actions comprising searching the search space for a subset of the plurality of search space semantic representations.

The agent automation system of claim 1,
wherein the instructions cause the similarity scoring subsystem to perform actions comprising determining the similarity score between the semantic representation and the search space semantic representation by comparing the semantic representation with the search space semantic representation via the mathematical comparison function list.

5. The agent automation system of claim 4,
wherein the similarity scoring subsystem compares the semantic representation with the search space semantic representation by:
comparing a first root node of the semantic representation with a second root node of the search space semantic representation to determine the similarity score; and
comparing the first root node and a first dependent node of the semantic representation with the second root node and a second dependent node of the search space semantic representation to refine the similarity score.

6. The agent automation system of claim 5,
wherein the instructions cause the similarity scoring subsystem to perform actions comprising:
determining whether the similarity score is below a predetermined threshold score; and
in response to determining that the similarity score is below the predetermined threshold score, disregarding the search space semantic representation in subsequent comparisons with the semantic representation.

The agent automation system of claim 1,
wherein the semantic representation comprises an utterance tree structure comprising a root node and at least one dependent node semantically coupled to the root node.

8. The agent automation system of claim 7,
wherein the at least a portion of the semantic representation is the root node and the at least a search space portion of the search space semantic representation is a search space root node of the search space semantic representation, such that the similarity score quantifies the similarity between a subtree vector of the root node and a subtree vector of the search space root node.

The agent automation system of claim 1,
wherein the CCG form comprises a first CCG form, the at least one form class entry comprises a first form class entry, the mathematical comparison function list comprises a first mathematical comparison function list, and the instructions cause the similarity scoring subsystem to perform actions comprising:
identifying a second CCG form of the semantic representation;
determining a second form class entry in the form class database that matches the second CCG form of the semantic representation; and
retrieving a second mathematical comparison function list from the second form class entry.

10. The agent automation system of claim 9,
wherein the instructions cause the similarity scoring subsystem to perform actions comprising:
determining a first similarity score by comparing the semantic representation with the search space semantic representation via the first mathematical comparison function list;
determining a second similarity score by comparing the semantic representation with the search space semantic representation via the second mathematical comparison function list; and
aggregating the first similarity score and the second similarity score into an aggregate similarity score via an adjustment function, wherein the aggregate similarity score enables the similarity scoring subsystem to determine whether the search space semantic representation is a match for the semantic representation.

A method of operating a similarity scoring subsystem of an agent automation system, comprising:
for each search space semantic representation of a plurality of search space semantic representations defining a search space of the agent automation system, generating a plurality of initial similarity scores by comparing a first portion of a semantic representation with a search space portion of the plurality of search space semantic representations via a plurality of comparison functions;
pruning from the search space dissimilar search space semantic representations of the plurality of search space semantic representations, wherein each dissimilar search space semantic representation provides a respective similarity score of the plurality of initial similarity scores that is below a predetermined score threshold;
increasing a number of considered nodes between the semantic representation and the plurality of search space semantic representations remaining in the search space;
for each of the plurality of search space semantic representations remaining in the search space, generating a plurality of refined similarity scores by comparing the considered nodes of the semantic representation with the plurality of search space semantic representations via the plurality of comparison functions;
pruning from the search space dissimilar search space semantic representations that provide a respective similarity score of the plurality of refined similarity scores that is below the predetermined score threshold; and
identifying a semantic representation match for the semantic representation from any search space semantic representations of the plurality of search space semantic representations remaining in the search space.

12. The method of claim 11,
wherein increasing the number of considered nodes comprises considering additional nodes of the semantic representation and additional search space nodes of the plurality of search space semantic representations remaining in the search space.

13. The method of claim 11, comprising:
after pruning from the search space any dissimilar search space semantic representations having a respective similarity score of the plurality of refined similarity scores that is below the predetermined score threshold, determining whether each node of the semantic representation has been considered; and
in response to determining that at least one node of the semantic representation has not been considered, returning to increasing the number of considered nodes between the semantic representation and the plurality of search space semantic representations remaining in the search space.

14. The method of claim 11, comprising determining the plurality of comparison functions by:
identifying a cognitive construction grammar (CCG) form of the semantic representation;
determining at least one form class entry in a form class database that matches the CCG form of the semantic representation; and
retrieving, from the at least one form class entry, a mathematical comparison function list including the plurality of comparison functions.

15. The method of claim 11,
wherein determining the at least one form class entry that matches the CCG form of the semantic representation comprises identifying that the form class database contains a mathematical comparison function list for comparing the CCG form of the semantic representation with a comparable CCG form of at least one search space semantic representation of the plurality of search space semantic representations.

16. The method of claim 11, comprising:
determining that the semantic representation includes a plurality of CCG forms;
determining a respective refined similarity score for comparing the semantic representation, having a respective CCG form interpretation from the plurality of CCG forms, with the plurality of search space semantic representations; and
aggregating the respective refined similarity score for each CCG form interpretation into an aggregate similarity score via an adjustment function of the similarity scoring subsystem.

17. The method of claim 16,
wherein the adjustment function specifies that the aggregate similarity score is a maximum or weighted average of the respective refined similarity scores for each CCG form.

A tangible, non-transitory computer-readable medium storing instructions,
wherein the instructions, when executed by one or more processors of an agent automation system, cause a similarity scoring subsystem of the agent automation system to:
identify a cognitive construction grammar (CCG) form of a semantic representation corresponding to a received user utterance;
determine at least one form class entry in a form class database that matches the CCG form of the semantic representation;
retrieve a mathematical comparison function list corresponding to the CCG form of the semantic representation, wherein the mathematical comparison function list enables the similarity scoring subsystem to progressively compare the semantic representation with a search space semantic representation; and
compare the semantic representation with the search space semantic representation by:
iteratively applying each function of the mathematical comparison function list to determine a similarity score that quantifies a degree of similarity between a first considered portion of the semantic representation and a second considered portion of the search space semantic representation; and
refining the similarity score by iteratively applying subsequent functions of the mathematical comparison function list until the similarity score indicates a degree of similarity between the entirety of the semantic representation and at least a portion of the search space semantic representation, or until the subsequent function is the most expensive function of the mathematical comparison function list.

19. The medium of claim 18,
wherein the search space semantic representation is one of a plurality of search space semantic representations defining a search space of the agent automation system.

20. The medium of claim 19,
wherein the instructions cause the similarity scoring subsystem of the agent automation system to:
after each refinement of the similarity score, narrow the search space to similar search space semantic representations of the plurality of search space semantic representations that provide respective similarity scores above a predetermined score threshold; and
after each similarity score indicates a degree of similarity between the entirety of the semantic representation and the respective search space semantic representation, identify the similar search space semantic representations remaining in the search space as matches for the semantic representation.