CN107885744B - Conversational data analysis - Google Patents


Info

Publication number
CN107885744B
Authority
CN
China
Prior art keywords
data analysis
user
data
information
content item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610867019.5A
Other languages
Chinese (zh)
Other versions
CN107885744A (en
Inventor
侯智涛
楼建光
梁潇
张博
张海东
张冬梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Priority to CN202211627592.0A (CN115858730A)
Priority to CN201610867019.5A (CN107885744B)
Priority to PCT/US2017/052839 (WO2018063924A1)
Priority to EP17780278.2A (EP3519988A1)
Priority to US16/338,061 (US11423229B2)
Publication of CN107885744A
Priority to US17/813,435 (US20220405479A1)
Application granted
Publication of CN107885744B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present disclosure relate to conversational data analysis. After a data analysis request is received from a user, heuristic information may be determined based on the request. The heuristic information referred to herein is not the result of the data analysis request, but information that can be used to guide the conversation forward. Based on such heuristic information, the user may provide supplemental information associated with the data analysis request, for example by clarifying the meaning of the request or by making a related further analysis request. Based on the supplemental information from the user, the user can be provided with meaningful data analysis results that meet the user's real needs. In this way, data analysis becomes more accurate and efficient, and the user has a good experience while obtaining genuinely helpful information.

Description

Conversational Data Analysis

Background

Data analysis plays a very important role in many application areas, such as data-driven decision-making systems. Users can submit data queries to data analysis tools in order to query data and create visual reports from a desired perspective. To make data analysis more convenient and easier to use, schemes have been proposed that apply natural language processing to the user interface for data analysis. Natural language processing refers to technology that uses computers to process human language, enabling computers to understand human language.

Traditional data analysis schemes based on natural language processing mainly rely on a single input box. When a data analysis request in natural language is received from the user, the machine performs the corresponding operation and provides the corresponding result. For simple or basic data analysis requests, such schemes can usually produce the corresponding data analysis results. For complex data analysis requests, however, existing schemes often fail to correctly understand the user's real intent and therefore cannot provide the data analysis results the user needs.

Summary

To address the above and other potential problems, embodiments of the present disclosure provide a two-way conversational data analysis method and device. According to embodiments of the present disclosure, a user can complete a data analysis request in a conversation with a machine. After a data analysis request is received from the user, heuristic information can be determined based on the request. The heuristic information referred to here is not the result of the data analysis request, but information that can be used to guide the conversation forward. Based on such heuristic information, the user may provide supplemental information associated with the data analysis request, for example by clarifying the meaning of the request or by making a related further analysis request. Based on the supplemental information from the user, the user can be provided with meaningful data analysis results that meet the user's real needs. In this way, data analysis becomes more accurate and efficient, and the user has a good experience while obtaining genuinely helpful information.

This Summary is provided to introduce, in simplified form, a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the present disclosure, nor is it intended to limit the scope of the present disclosure.

Brief Description of the Drawings

The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

FIG. 1 shows a block diagram of a computing environment 100 in which one or more embodiments of the present disclosure may be implemented;

FIG. 2 shows a schematic diagram of a data set 200 for data analysis according to an embodiment of the present disclosure;

FIG. 3 shows a schematic diagram 300 of data analysis performed on the data set 200 according to an embodiment of the present disclosure;

FIG. 4A shows a schematic diagram 400 of data analysis performed on the data set 200 according to an embodiment of the present disclosure;

FIG. 4B shows a schematic diagram 450 of a data analysis process in which a two-way conversation is conducted based on the heuristic information of FIG. 4A according to an embodiment of the present disclosure;

FIG. 5 shows a flowchart of a method 500 for data analysis according to an embodiment of the present disclosure;

FIG. 6 shows a flowchart of a method 600 for data analysis according to an embodiment of the present disclosure;

FIG. 7 shows a user interface 700 for multiple conversations according to an embodiment of the present disclosure; and

FIG. 8 shows a user interface 800 according to an embodiment of the present disclosure.

Throughout the drawings, the same or similar reference numerals denote the same or similar elements.

Detailed Description

Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the scope of protection of the present disclosure.

Generally, "data analysis" as described in the embodiments of the present disclosure refers to the process of analyzing a large amount of collected data (hereinafter referred to as a "data set") using appropriate statistical analysis methods, extracting useful information, and forming conclusions, so as to study the data in detail and summarize it.

The term "heuristic information" as used in the embodiments of the present disclosure refers to information used to guide the conversation between the user and the data analysis device, such as information that guides the user to clarify a data analysis request or information that offers the user extended data analysis results. Heuristic information is different from the result produced in response to the user's data analysis request (hereinafter also referred to as the "data analysis result").

The term "content item" as used in the embodiments of the present disclosure refers to a semantic unit used to characterize data in a data set, such as a word or phrase relating to a place, time, date, event, brand, category, or the like.

The term "code snippet" as used in the embodiments of the present disclosure refers to a piece of code for implementing one or more operations associated with a content item. When this code is run with the content item as input, the resulting output can serve as part or all of the result of the data analysis request.

As used in the present disclosure, the term "include" and its variants are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment". Definitions of other terms are given in the description below.

Traditionally, data analysis schemes adopt a one-way conversation, which can provide data analysis results only for simple or basic data analysis requests. When a user inputs a complex data analysis request, traditional schemes usually have difficulty understanding it, causing the system to report an error or return an incorrect data analysis result. As a result, the user cannot obtain the data analysis result he or she really wants, the user's needs are not met, and the data analysis loses its purpose.

To this end, the present disclosure proposes a data analysis method and device based on two-way conversation, which can not only receive a data analysis request from a user but also generate heuristic information by analyzing that request. As used herein, the term "heuristic information" refers to information that is used to guide the data analysis conversation forward and that is not itself a data analysis result. For example, heuristic information can prompt the user to provide further explanation or supplementation, yielding a question that the device can understand. Heuristic information may also be extended information, related to the user's current analysis, that the data analysis device proactively recommends to the user. Such extended information may be obtained, for example, by the data analysis device from the analyzed data through data mining methods. In this way, the method and device of the embodiments of the present disclosure can provide data analysis results that better meet the user's needs, significantly improving the user experience.

The basic principles and several example implementations of the present disclosure are described below with reference to FIGS. 1 to 8. FIG. 1 shows a block diagram of a computing environment 100 in which a data analysis device of embodiments of the present disclosure may be implemented. It should be understood that the computing environment 100 shown in FIG. 1 is merely exemplary and should not constitute any limitation on the functionality or scope of the embodiments described herein.

As shown in FIG. 1, the computing environment 100 includes a user 101 and a computing system/server 105 in the form of a general-purpose computing device. The computing system/server 105 may be used to implement the data analysis device of embodiments of the present disclosure (hereinafter also referred to as the "data analysis device 105"). The user 101 may interact with the computing system/server 105 to submit a data analysis request 102 and obtain the desired data analysis result 180. Components of the computing system/server 105 may include, but are not limited to, one or more processors or processing units 110, a memory 120, a storage device 130, one or more communication units 140, one or more input devices 150, and one or more output devices 160. The processing unit 110 may be a physical or virtual processor and can perform various processing according to programs stored in the memory 120. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capability of the computing system/server 105.

The computing system/server 105 typically includes a plurality of computer storage media. Such media may be any available media accessible to the computing system/server 105, including but not limited to volatile and non-volatile media and removable and non-removable media. The memory 120 may be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. The storage device 130 may be a removable or non-removable medium and may include a machine-readable medium, such as a flash drive, a magnetic disk, or any other medium that can be used to store information and/or data 170 (e.g., a data set 172) and that can be accessed within the computing system/server 105. It should be understood that the above description is merely exemplary; the data set 172 may be stored not only in the storage device 130 but also in a network storage device or any other suitable form of storage.

The computing system/server 105 may further include additional removable/non-removable, volatile/non-volatile storage media. Although not shown in FIG. 1, a magnetic disk drive for reading from or writing to a removable, non-volatile magnetic disk and an optical disk drive for reading from or writing to a removable, non-volatile optical disk may be provided. In these cases, each drive may be connected to a bus (not shown) by one or more data media interfaces. The memory 120 may include one or more program products 122 having one or more sets of program modules configured to perform the functions of the various embodiments described herein.

The communication unit 140 enables communication with other computing devices over a communication medium. Additionally, the functions of the components of the computing system/server 105 may be implemented in a single computing cluster or in multiple computing machines that can communicate over communication connections. Thus, the computing system/server 105 may operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another general network node.

The input device 150 may be one or more of a variety of input devices, such as a mouse, a keyboard, a trackball, or a voice input device. The output device 160 may be one or more output devices, such as a display, a speaker, or a printer. As needed, the computing system/server 105 may also communicate through the communication unit 140 with one or more external devices (not shown), such as storage devices and display devices, with one or more devices that enable a user to interact with the computing system/server 105, or with any device (e.g., a network card or a modem) that enables the computing system/server 105 to communicate with one or more other computing devices. Such communication may be performed via an input/output (I/O) interface (not shown).

As shown in FIG. 1, the storage device 130 stores data 170, which includes a data set 172 (for example, statistics on shark attacks on humans in each year). The computing system/server 105 can receive, through the input device 150, a data analysis request 102 for the data set 172 input by the user 101, determine, based on the data analysis request 102, heuristic information 103 for guiding the conversation, and provide the heuristic information 103 to the user 101 through the output device 160 in order to guide the user 101 to provide supplemental information associated with the data analysis request. The computing system/server 105 can then complete the data analysis based on the supplemental information and obtain a data analysis result 180 that meets the user's needs. The data analysis result 180 may be presented as graphics, tables, text, audio, video, or any combination thereof. It should be understood that the data analysis result 180 may be presented in any suitable form; the forms described above are merely exemplary and are not intended to limit the scope of the present disclosure.

Embodiments of the present disclosure are further described below through specific examples. FIG. 2 shows a schematic diagram of a data set 200 for data analysis according to an embodiment of the present disclosure. Although FIG. 2 shows the data set 200 in the form of a multidimensional table, it should be understood that the data set 200 may take any suitable form, and the example in FIG. 2 is not intended to limit the scope of the present disclosure. The data set 200 may be implemented as the data set 172 in the data analysis device 105 of FIG. 1.

In some embodiments, the data set 200 may be a single table stored in a database, a comma-separated values (CSV) file, or a file in any other suitable form, or it may be obtained by joining multiple tables. As shown in FIG. 2, in this example the data set 200 is a table of shark attack records from around the world, with multiple rows and columns. Each record is a row of the table, and the columns "country" 210, "gender" 220, "fatality" 230, "activity" 240, "number of attacks" 250, and "year" 260 are the dimensions of the data. A data model may be established in advance for the data set 200; it may include one or more content items and one or more operations associated with those content items. The content items may include the dimensions of the data and may also include other content items derived from them according to a predetermined algorithm.
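The relationship between the data set, its dimensions, and the data model can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the row values are invented, and the names `DataModel`, `total_attacks_by`, and `operation_for` do not come from the disclosure. The sketch only shows one plausible way to associate content items with operations.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]

# Invented sample rows; only the column names follow the dimensions of FIG. 2.
ROWS: List[Row] = [
    {"country": "Australia", "gender": "M", "fatal": True, "activity": "fishing", "attacks": 3, "year": 1960},
    {"country": "USA", "gender": "F", "fatal": False, "activity": "surfing", "attacks": 2, "year": 1960},
    {"country": "USA", "gender": "M", "fatal": False, "activity": "swimming", "attacks": 1, "year": 1961},
]


def total_attacks_by(dimension: str) -> Callable[[List[Row]], Dict[Any, int]]:
    """Build an operation that sums the attack counts per value of one dimension."""
    def operation(rows: List[Row]) -> Dict[Any, int]:
        totals: Dict[Any, int] = {}
        for row in rows:
            totals[row[dimension]] = totals.get(row[dimension], 0) + row["attacks"]
        return totals
    return operation


@dataclass
class DataModel:
    """Hypothetical data model: content items mapped to the operations defined for them."""
    operations: Dict[str, Callable[[List[Row]], Any]] = field(default_factory=dict)

    def operation_for(self, content_item: str) -> Optional[Callable[[List[Row]], Any]]:
        # Returns None when no operation has been defined for the content item.
        return self.operations.get(content_item)


MODEL = DataModel(operations={
    "year": total_attacks_by("year"),
    "country": total_attacks_by("country"),
})

print(MODEL.operation_for("year")(ROWS))
print(MODEL.operation_for("dangerous"))
```

In this sketch a content item with no registered operation simply yields `None`, which is the situation the conversational approach is meant to handle.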

Data analysis tasks for the data set 200 may include a variety of online analytical processing (OLAP) techniques, such as aggregation, slicing and dicing, drill-down, and roll-up. Data analysis tasks may also include pattern mining, for example of trends, outliers or anomalies, and correlations. A complex data analysis task may include multiple subtasks. By parsing a data analysis request, based on its semantics, into operations corresponding to a query language (e.g., SQL, DAX, or MDX), a data analysis task can perform those operations on the data set 200 to obtain the result of the data analysis request.
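As a rough illustration of mapping an analysis step onto SQL, the following sketch uses Python's built-in `sqlite3` module with an in-memory table. The table name `attacks`, its columns, the sample rows, and the helper `aggregate` are all assumptions for illustration; the disclosure does not specify a schema or a query generator.

```python
import sqlite3

# In-memory table with invented rows; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attacks (country TEXT, activity TEXT, fatal INTEGER, year INTEGER, n INTEGER)"
)
conn.executemany(
    "INSERT INTO attacks VALUES (?, ?, ?, ?, ?)",
    [
        ("Australia", "fishing", 1, 1960, 3),
        ("USA", "surfing", 0, 1960, 2),
        ("USA", "swimming", 0, 1961, 1),
    ],
)

# Column names cannot be bound as SQL parameters, so they are checked against a whitelist.
ALLOWED_COLUMNS = {"country", "activity", "fatal", "year"}


def aggregate(group_col, year=None):
    """Sum attack counts per value of group_col, optionally sliced to a single year."""
    if group_col not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown dimension: {group_col}")
    sql = f"SELECT {group_col}, SUM(n) FROM attacks"
    params = ()
    if year is not None:
        sql += " WHERE year = ?"
        params = (year,)
    sql += f" GROUP BY {group_col} ORDER BY {group_col}"
    return conn.execute(sql, params).fetchall()


by_year = aggregate("year")                    # aggregation over the whole table
by_activity_1960 = aggregate("activity", 1960)  # slice on year, then break down by activity
print(by_year)
print(by_activity_1960)
```

The second call corresponds loosely to slicing on one year and drilling down into a finer dimension.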

According to some embodiments of the present disclosure, the data analysis device 105 may receive data analysis requests 102 in various forms from the user 101. Such a request may be a simple short sentence or a complex sentence, such as a combination of several simple sentences or a long sentence with many qualifiers. FIG. 3 shows a schematic diagram 300 of data analysis performed on the data set 200 according to an embodiment of the present disclosure. In the embodiment of FIG. 3, the data analysis request 102 input by the user 101 is "Please list dangerous countries by year". After receiving the data analysis request 102, the data analysis device 105 identifies one or more content items in it, such as "year", "dangerous", and "country".

The data analysis device 105 then compares the identified content items with the data model established in advance for the data set 200 to determine the operations associated with the identified content items. In this embodiment, the content items "year" and "country" both have associated operations defined in the data model, but the content item "dangerous" has no corresponding operation defined. The data analysis device 105 therefore cannot determine the operation associated with the content item "dangerous". This uncertainty does not arise because the meaning of the word "dangerous" is itself unclear; rather, it is uncertain what operation should be performed on the data set based on that word.

In this case, the data analysis device 105 may generate a question about the data analysis request 102, such as "Please explain: what does 'dangerous' mean in 'dangerous countries'?" This question prompts the user 101 to provide clarifying information about the content item "dangerous", thereby guiding the conversation between the user 101 and the data analysis device 105.
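The three steps above (identify content items, compare them with the data model, and ask about any item without a defined operation) can be sketched as follows. The keyword lookup is a deliberately naive stand-in for real natural language processing, and the names `CANDIDATE_ITEMS`, `KNOWN_OPERATIONS`, `identify_content_items`, and `heuristic_information` are hypothetical.

```python
import re

# Hypothetical vocabulary mapping words in a request to content items.
CANDIDATE_ITEMS = {
    "year": "year",
    "country": "country",
    "countries": "country",
    "dangerous": "dangerous",
}

# Content items for which the (assumed) data model defines an operation.
KNOWN_OPERATIONS = {"year": "group by year", "country": "group by country"}


def identify_content_items(request):
    """Return the content items mentioned in the request, in order, without duplicates."""
    items = []
    for word in re.findall(r"[a-z]+", request.lower()):
        item = CANDIDATE_ITEMS.get(word)
        if item is not None and item not in items:
            items.append(item)
    return items


def heuristic_information(request):
    """Return a clarifying question if some content item has no operation, else None."""
    unresolved = [i for i in identify_content_items(request) if i not in KNOWN_OPERATIONS]
    if unresolved:
        return f"Please explain: what does '{unresolved[0]}' mean in your request?"
    return None


print(heuristic_information("Please list dangerous countries by year"))
```

A request whose content items all have operations produces no question, so the analysis can proceed directly.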

On receiving the above heuristic information 103, the user 101 may input clarifying information, for example, "the number of fatal attacks is higher than 100". This clarifying information further explains the meaning of the content item "dangerous". Thus, according to embodiments of the present disclosure, the data analysis conversation does not terminate or report an error merely because the operation corresponding to some item in the analysis request is uncertain. Instead, the system keeps the data analysis conversation going by prompting the user to input clarifying information.

Since "fatal" and "number of attacks" are both content items in the established data model, the data analysis device 105 can look up the operations corresponding to these content items and perform them on the data set 200. In this embodiment, the data analysis device 105 determines that Australia and the United States are the countries with more than 100 fatal attacks, that is, the "dangerous countries" referred to by the user 101. In addition, the data analysis device 105 provides a chart of the number of attacks per year for these two countries, so that the user can examine the related information further.
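One way to picture how the clarification turns into an executable operation is the sketch below. It assumes a very simple clarification format ("... higher than N") and invented attack counts; `parse_threshold` and `dangerous_countries` are illustrative names, not functions described in the disclosure.

```python
import re
from collections import defaultdict

# Invented (country, fatal, attacks) rows chosen so that two countries exceed 100.
ROWS = [
    ("Australia", True, 120),
    ("USA", True, 90),
    ("USA", True, 30),
    ("Brazil", True, 40),
    ("USA", False, 500),
]


def parse_threshold(clarification):
    """Extract N from phrases such as 'higher than N'; returns None if absent."""
    match = re.search(r"(?:higher|greater|more) than (\d+)", clarification.lower())
    return int(match.group(1)) if match else None


def dangerous_countries(rows, threshold):
    """Countries whose total number of fatal attacks exceeds the threshold."""
    fatal_totals = defaultdict(int)
    for country, fatal, attacks in rows:
        if fatal:
            fatal_totals[country] += attacks
    return sorted(c for c, total in fatal_totals.items() if total > threshold)


threshold = parse_threshold("The number of fatal attacks is higher than 100")
print(dangerous_countries(ROWS, threshold))
```

Non-fatal attacks are ignored here because the clarification restricts the count to fatal attacks.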

With this two-way conversation, the data analysis device 105 can supplement the data analysis request 102 by having the user 101 provide clarifying information, and thereby obtain data analysis results that better meet the user's needs. This approach reduces the likelihood that the data analysis device 105 fails to obtain a data analysis result or obtains an incorrect one, significantly improving the user experience.

In addition to, or as a supplement to, prompting the user to provide clarifying information, the data analysis device 105 may also provide the user with heuristic information that extends the data analysis result. FIG. 4A shows a schematic diagram 400 of data analysis performed on the data set 200 according to an embodiment of the present disclosure. In the embodiment shown in FIG. 4A, the data analysis request 102 input by the user 101 is "query by year". After receiving the data analysis request 102, the data analysis device 105 identifies the content item "year" in the request and determines from the data model one or more operations associated with "year". Performing these operations yields a curve 410 of the number of attacks per year. In addition, the data analysis device 105 applies the data analysis result to one or more predefined operation templates, thereby performing an extended analysis of the outlier 411 in the curve 410 and obtaining the following heuristic information: "Would you like to know more about the outlier in 1960?", together with the options "OK" and "No, thanks".
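The disclosure does not say how the outlier 411 is detected. As one possible approach, the sketch below flags years whose total lies more than two population standard deviations from the mean (a z-score test, chosen here purely for illustration) and turns the first hit into a follow-up prompt. The yearly totals are invented.

```python
from statistics import mean, pstdev

# Invented yearly totals with one obvious spike.
TOTALS_BY_YEAR = {1957: 10, 1958: 12, 1959: 11, 1960: 40, 1961: 9, 1962: 10, 1963: 11}


def find_outlier_years(totals_by_year, z_threshold=2.0):
    """Years whose value deviates from the mean by more than z_threshold std devs."""
    values = list(totals_by_year.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [y for y, v in totals_by_year.items() if abs(v - mu) / sigma > z_threshold]


def outlier_prompt(totals_by_year):
    """Build heuristic information (question plus options) for the first outlier, if any."""
    outliers = find_outlier_years(totals_by_year)
    if not outliers:
        return None
    question = f"Would you like to know more about the outlier in {outliers[0]}?"
    return question, ["OK", "No, thanks"]


print(outlier_prompt(TOTALS_BY_YEAR))
```

A z-score test is sensitive to the choice of threshold and to very short series; other detection methods would fit the same prompt-building step.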

According to an embodiment of the present disclosure, a predefined operation template may be a set of one or more operations established from historical statistics, the profile or preferences of the user 101, the access records of multiple users, and the like. In some embodiments, a predefined operation template may be an analysis of outliers, an analysis of data trends, an analysis of maximum or minimum values, and so on. It should be understood that the above description of predefined operation templates is merely exemplary and is not intended to limit the scope of the present disclosure in any way. Those skilled in the art will appreciate that a predefined operation template can be implemented in any suitable form.
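As a rough illustration of what an outlier-analysis template might look like, the following Python sketch flags values that lie far from the mean and turns the first one into a prompt with two options. The function names, the z-score rule and the threshold of 2.0 are assumptions made for this example; the disclosure does not specify how outliers are detected.

```python
from statistics import mean, pstdev


def outlier_template(series, threshold=2.0):
    """Hypothetical template: return (label, value) pairs whose value lies
    more than `threshold` population standard deviations from the mean."""
    values = list(series.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [(label, value) for label, value in series.items()
            if abs(value - mu) / sigma > threshold]


def suggest(series):
    """Turn the first detected outlier into heuristic information:
    a question plus the options offered to the user."""
    outliers = outlier_template(series)
    if not outliers:
        return None
    label, _ = outliers[0]
    return {
        "prompt": f"Want to know about the {label} outlier?",
        "options": ["OK", "No, thanks"],
    }
```

A trend template or a maximum/minimum template could follow the same shape: a function over the analysis result that returns either nothing or a prompt with options.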

FIG. 4B shows a schematic diagram 450 of a data analysis process in which a two-way dialogue is conducted based on the heuristic information of FIG. 4A, according to an embodiment of the present disclosure. In the embodiment shown in FIG. 4B, the user 101 inputs supplementary information in response to the heuristic information provided by the data analysis device 105, for example by typing "OK" or clicking the button labeled "OK". After receiving the supplementary information, the data analysis device 105 obtains the corresponding data analysis result using either a predefined operation template or an operation determined anew in the operation model from the supplementary information input by the user.

Still referring to the example of FIG. 2, the result includes the text 451, "If this outlier is broken down by activity, 'fishing' accounted for the largest number of attacks among all activities in 1960", a chart 452, and further heuristic information 453, namely "Fishing has 2 main aspects; which one would you like to know about?" together with the three buttons "Male", "Non-fatal" and "No, thanks". Based on the further heuristic information 453, the user 101 can continue to provide supplementary information, for example by selecting one of the three buttons "Male", "Non-fatal" and "No, thanks", and thereby obtain the corresponding data analysis result.

With this two-way dialogue, the data analysis device 105 can extend the data analysis result by providing heuristic information to the user 101, and can thus offer, from multiple angles, data analysis results that are more likely to meet the user's further needs. This effectively increases the likelihood that the user obtains the further data analysis results they want, and significantly improves the user experience.

Several example embodiments of the data analysis method and device using the two-way dialogue are described in more detail below. FIG. 5 shows a flowchart of a method 500 for data analysis according to an embodiment of the present disclosure. It should be understood that the method 500 may be performed by the processing unit 110 described with reference to FIG. 1.

At 510, a data analysis request for a data set is received from a user in a dialog. Taking the embodiment of FIG. 1 as an example, the user 101 provides the data analysis request 102 to the data analysis device 105, for example through the input device 150 of the data analysis device 105. For instance, the user 101 may enter the data analysis request 102 in a dialog box as text, speech or a combination of the two; may enter it by clicking or touching a button, drop-down box, graphic, curve or text; or may enter it by dragging a predetermined control, graphic, text or the like. It should be understood that these ways of entering the data analysis request 102 are given for discussion purposes only; they are not limiting and are not intended to restrict the scope of the present disclosure in any way.

In some embodiments, when the data analysis device 105 receives a data analysis request 102 from the user as text, speech or a combination of the two, it may save the request in memory or in a predetermined storage space for later use. When it detects that the user 101 has clicked or touched a button, drop-down box, graphic, curve or text, the data analysis device 105 may determine one or more events associated with that click or touch and derive from those events the information relating to the received data analysis request. Likewise, when it detects that the user 101 has dragged a predetermined control, graphic, text or the like, the data analysis device 105 may determine one or more events associated with the drag and derive from those events the information relating to the received data analysis request.

At 520, heuristic information is determined based on the data analysis request. In embodiments of the present disclosure, heuristic information is information that is distinct from the result produced for the user's data analysis request and that guides the conversation between the user and the data analysis device so that it is not interrupted and does not end in an error. For example, heuristic information may lead the user to clarify certain concepts in the data analysis request they entered, or may offer the user other extended information related to the data analysis result.

According to embodiments of the present disclosure, the data analysis device 105 may determine the heuristic information in a variety of ways. In some embodiments, the data analysis device 105 may extract content items from the data analysis request, for example words or phrases concerning place, time, date, event, brand, category and the like. In some alternative embodiments, the data analysis device 105 may also determine, based on the extracted content items, further content items that are highly relevant to them. The data analysis device 105 may then determine whether at least one operation to be applied to the data set can be determined from the extracted content items and/or the further content items.

For example, the data analysis device 105 may perform linguistic analysis on the data analysis request in order to determine the part of speech of each word in the request, such as "noun", "pronoun" or "adverb"; to determine the word's syntactic function, such as "adverbial", "attributive" or "predicate"; and/or to determine other linguistic properties of the word. It should be understood that this linguistic analysis can be carried out with conventional linguistic analysis algorithms (for example, a Part-Of-Speech (POS) tagging algorithm), which are not described further here.

Optionally, the data analysis device 105 may also detect the context of the data analysis request. To do so, the data analysis device 105 may examine what the user entered within a predetermined period, or within a predetermined number of sentences, before entering the data analysis request, in order to determine the circumstances in which the request was made, which content item a pronoun in the request refers to, what has been omitted from the request, and so on.

The data analysis device 105 may then attempt to determine the at least one operation based on the content items obtained above, the linguistic analysis result, the context and a predefined data model. The predefined data model may include content items and, for each content item, one or more associated operations. In one embodiment, each operation of a content item may, for example, be tied to a different linguistic pattern. In that case, the data analysis device 105 can determine the linguistic pattern of the content item from the linguistic analysis result and the context, and then determine the operation of the content item that corresponds to that linguistic pattern.

In some embodiments, if no operation can be determined, the data analysis device 105 may conclude that the identified content item has a meaning it cannot understand and that the user needs to clarify it. In this case, the data analysis device 105 may generate, based on the identified content item, a question about the data analysis request that prompts the user to provide clarifying information about the content item.
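The lookup described above, from content items to operations, with a clarifying question as the fallback, can be sketched in Python as follows. This is a minimal sketch under several assumptions: the data model is a plain dictionary, content items are found by simple tokenization with a stop-word list and a one-entry lemma table, and every name here (`DATA_MODEL`, `extract_content_items`, `plan`) is hypothetical. The linguistic analysis and context handling described in the text are not modeled.

```python
import re

# Hypothetical data model: content item -> operations defined for it.
DATA_MODEL = {
    "year": ["group_by_year"],
    "country": ["group_by_country"],
}

# Tiny stand-ins for real linguistic analysis (assumptions for this sketch).
STOPWORDS = {"please", "list", "by"}
LEMMAS = {"countries": "country"}


def extract_content_items(request):
    """Tokenize the request, drop stop words and normalize known plurals."""
    words = re.findall(r"[a-z]+", request.lower())
    return [LEMMAS.get(w, w) for w in words if w not in STOPWORDS]


def plan(request):
    """Return the operations to run, or a clarifying question when some
    content item has no operation defined in the data model."""
    items = extract_content_items(request)
    unknown = [item for item in items if item not in DATA_MODEL]
    if unknown:
        return {
            "kind": "question",
            "text": f"Please explain what '{unknown[0]}' means in your request.",
        }
    operations = [op for item in items for op in DATA_MODEL[item]]
    return {"kind": "operations", "operations": operations}
```

For the request "Please list dangerous countries by year", this sketch would ask about "dangerous", because only "year" and "country" have operations in the assumed data model.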

Alternatively, in other embodiments, if the data analysis device 105 is able to determine one or more operations based on the identified content items, it may determine a code fragment for implementing the at least one operation and determine the heuristic information based on that code fragment. According to embodiments of the present disclosure, a code fragment is, for example, a program or a piece of code that implements the operations associated with a content item. The content item may, for example, be the input, or part of the input, of the code fragment, and may have different categories, purposes or uses. A code fragment may comprise one or more operations executed in a particular order. A code fragment may be a program generated on demand, dynamically and/or automatically; it may also be a predefined program stored in a particular memory. It should be understood that code fragments are flexibly configurable and can be implemented in any suitable programming language or format; nothing here is intended to limit the scope of the present disclosure in any way.

In some embodiments, if the data analysis device 105 determines multiple code fragments based on the identified content items, it may rank them, for example by scoring each code fragment according to the linguistic analysis result for the data analysis request and/or the context information and then sorting the code fragments by score. A higher score means that the code fragment is more likely to satisfy the user's data analysis needs. The data analysis device 105 may provide the user 101 with the data analysis result obtained from the highest-scoring code fragment. In addition, the data analysis device 105 may provide the user 101, as heuristic information, with options corresponding to the lower-scoring code fragments or with the results obtained from those code fragments. Such heuristic information contains information that extends the data analysis result, which increases the likelihood of delivering a data analysis result that meets the user's needs.
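The ranking step can be illustrated with a short Python sketch. The scoring rule used here (two points for each keyword shared with the request, one for each keyword shared with the context) is purely an assumption chosen for the example, as are the `Fragment` class and the function names; the disclosure only says that fragments are scored from the linguistic analysis result and/or the context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical stand-in for a code fragment and the content items it covers."""
    name: str
    keywords: frozenset


def score(fragment, request_items, context_items):
    # Assumed weighting: matches in the request count double.
    return (2 * len(fragment.keywords & request_items)
            + len(fragment.keywords & context_items))


def rank(fragments, request_items, context_items):
    """Return (best fragment, remaining fragments in descending score order).
    `fragments` must be non-empty."""
    ranked = sorted(
        fragments,
        key=lambda f: score(f, request_items, context_items),
        reverse=True,
    )
    return ranked[0], ranked[1:]
```

The best fragment would be executed to produce the main result, while the remaining ones could be offered to the user as options.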

According to some alternative embodiments of the present disclosure, the data analysis device 105 may also determine the heuristic information by extending the data analysis request based on the content items in the request, the result for the request, predetermined extension rules and the like. In some embodiments, the data analysis device 105 may determine other operations associated with the result for the data analysis request, for example by extracting content items from the result and looking up matching operations in a pre-established data model. The data analysis device 105 may then obtain a code fragment based on the determined operations and run it to obtain a result, which is subsequently provided to the user 101 as heuristic information.

Alternatively, in other embodiments, the data analysis device 105 may apply the content items in the data analysis request, the content items extracted from the result for the request, and so on to predetermined extension rules, thereby obtaining one or more extended content items associated with the existing content items. The data analysis device 105 may then attempt to determine the operations associated with the extended content items, obtain the corresponding code fragments, and run those code fragments to obtain extended analysis results.

At 530, the heuristic information is provided to the user so that the user can provide, based on it, supplementary information associated with the data analysis request. In some embodiments, the data analysis device 105 may provide the extended analysis results obtained above to the user as heuristic information for the user to choose from.

In some alternative embodiments, the data analysis device 105 may instead provide the user 101 only with identifiers of the code fragments associated with the extended content items as heuristic information, for example by presenting them as serial numbers, keywords or the like. Only when the user clicks or enters the corresponding serial number or keyword does the data analysis device 105 run the corresponding code fragment to obtain the extended analysis result. This avoids unnecessary consumption of system resources and improves efficiency and speed.
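One way to picture this deferred execution is the Python sketch below, in which only the identifiers are exposed up front and a fragment runs when its identifier is selected. The class name `LazyFragments` is hypothetical, fragments are modeled as zero-argument callables, and the caching of results is an extra assumption that the text does not mention.

```python
class LazyFragments:
    """Hypothetical registry: show identifiers first, run a fragment
    only when the user picks its identifier."""

    def __init__(self, fragments):
        # `fragments` maps an identifier (serial number or keyword)
        # to a zero-argument callable that produces the extended result.
        self._fragments = dict(fragments)
        self._cache = {}

    def identifiers(self):
        """Identifiers to present to the user as heuristic information."""
        return sorted(self._fragments)

    def run(self, identifier):
        """Execute the selected fragment on first use and reuse the result
        afterwards (the caching is an assumption of this sketch)."""
        if identifier not in self._cache:
            self._cache[identifier] = self._fragments[identifier]()
        return self._cache[identifier]
```

Fragments that the user never selects are never executed, which is where the resource saving described above would come from.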

In other alternative embodiments, at 530, the data analysis device 105 may provide the user, as heuristic information, with the question about the data analysis request that was generated from the identified content item, in order to prompt the user 101 to provide clarifying information about that content item.

According to embodiments of the present disclosure, after receiving the heuristic information, the user 101 may provide supplementary information associated with the data analysis request. The supplementary information may be clarifying information that answers the question raised by the data analysis device 105 about the data analysis request, or it may be a selection indicating whether to view extended analysis results and which of them to view. After receiving the supplementary information from the user, the data analysis device 105 may determine the data analysis result associated with the supplementary information and provide it to the user. In this way, the likelihood that the user obtains the desired data analysis result is effectively increased, and the user experience is significantly improved.

According to embodiments of the present disclosure, the data analysis device 105 stores at least one of the data analysis request and the supplementary information as a user profile, so that relevant data analysis results can later be provided based on that profile. The user profile may include the user's data analysis history and can accurately reflect the user's data analysis habits.

In some embodiments, even if the user has not submitted any data analysis request, the data analysis device 105 can automatically mine the data in the data set according to the user profile and proactively provide heuristic information to the user. In some alternative or additional embodiments, the data analysis device 105 may also obtain the profiles of other users and determine a data analysis strategy based on the user's profile and/or the other users' profiles. For example, when most users have wanted to find outliers in the data over a recent period, the data analysis strategy may be "analyze outliers". It should be understood that this is merely an example. In embodiments of the present disclosure, the data analysis strategy may also include analyzing averages, extreme values or other information in the data, may include the priority or execution order of these analyses, and may include any other suitable kind of analysis. The data analysis device 105 may then analyze the data according to the determined strategy, obtain heuristic information and provide it to the user. Based on that heuristic information, the user can decide which data analysis results to view. In this way, the user can be helped to understand the data better, and the data analysis dialogue can be guided effectively.
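A very simple way to derive such a strategy is to count how often each kind of analysis appears across user profiles and use the counts as a priority order. The Python sketch below does exactly that; representing a profile as a list of analysis names and ranking purely by frequency are assumptions made for illustration, not something the disclosure prescribes.

```python
from collections import Counter


def choose_strategy(profiles):
    """Return analysis kinds ordered from most to least requested.

    `profiles` is an iterable of per-user histories, each a list of
    analysis names such as "outliers" or "trend" (hypothetical labels).
    """
    counts = Counter(kind for profile in profiles for kind in profile)
    return [kind for kind, _ in counts.most_common()]
```

The first entry of the returned list plays the role of the dominant strategy, such as "analyze outliers", and the rest give an execution order for the remaining analyses.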

An embodiment of the data analysis method using the two-way dialogue according to the present disclosure is now described in more detail with reference to FIG. 6. FIG. 6 shows a flowchart of a method 600 for data analysis according to an embodiment of the present disclosure. It should be understood that the method 600 may be performed by the processing unit 110 described with reference to FIG. 1 and may be regarded as one specific implementation of the method 500. It should also be understood that the method 600 is merely exemplary and not limiting; operations in the method 600 may be added or removed as appropriate and may be performed in any other suitable order.

At 610, the data analysis device 105 receives a data analysis request for a data set from the user. Still referring to the embodiment shown in FIG. 3, the data analysis request received by the data analysis device 105 from the user 101 is, for example, "Please list dangerous countries by year", as shown at 310. At 620, the data analysis device 105 extracts content items from the data analysis request. Continuing with the embodiment of FIG. 3, the content items that the data analysis device 105 extracts from the request "Please list dangerous countries by year" are, for example, "year", "dangerous" and "country".

At 630, the data analysis device 105 determines whether an operation to be applied to the data set can be determined based on the content items. In the above embodiment, the content items "year" and "country" both have associated operations defined in the data model, but the content item "dangerous" has no corresponding operation defined, so the data analysis device 105 cannot determine an operation associated with the content item "dangerous".

At 640, a question about the data analysis request is generated based on the content item and serves as the heuristic information. For example, the data analysis device 105 may generate a question about the data analysis request 102 such as "Please explain what 'dangerous' means in 'dangerous country'?". At 680, the data analysis device 105 provides this heuristic information to the user, that is, it presents the question to the user, as shown at 320 in FIG. 3.

At 690, supplementary information is received from the user. In the embodiment shown in FIG. 3, the user 101 enters the following clarifying information, as shown at 330, to explain the meaning of the content item "dangerous": "the number of fatal attacks is higher than 100". The method 600 then returns to 620, where the data analysis device 105 extracts content items from the user's supplementary information. For example, the data analysis device 105 may extract the content items "fatal", "number of attacks" and so on from the supplementary information "the number of fatal attacks is higher than 100".

Then, at 630, the data analysis device 105 determines whether an operation to be applied to the data set can be determined based on the extracted content items. In this embodiment, "fatal" and "number of attacks" are both assumed to be content items in the established data model, so at 640 the data analysis device 105 can determine from these content items the corresponding operations to be applied to the data set 200. Then, at 650, the data analysis device 105 determines a code fragment for implementing the determined operations. At 660, the data analysis device 105 may execute the code fragment to generate the corresponding data analysis result and provide that result to the user 101.

It should be understood that the above example is merely illustrative and not limiting. In other embodiments according to the present disclosure, if the data analysis device 105 determines at 630 that an operation to be applied to the data set still cannot be obtained from the content items determined from the supplementary information (for example, "fatal" and "number of attacks"), the data analysis device 105 may continue to provide heuristic information to the user 101 so that the user 101 can provide further clarifying information.
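The clarification path through the method 600 (steps 620, 630, 640, 680 and 690, repeated until the content items resolve) can be sketched as a small loop. The Python below is only an approximation under stated assumptions: content items are single words found by a naive tokenizer, the data model is a dictionary passed in by the caller, user replies are supplied up front as a list, a turn limit is imposed, and, as in the walkthrough above, only the latest reply is analyzed on each pass. All names are hypothetical.

```python
import re

# Assumed stop words; real extraction would rely on linguistic analysis.
STOPWORDS = {"please", "list", "by", "the", "of", "is"}


def extract(text):
    """Step 620 (simplified): keep lowercase words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def run_dialog(request, replies, data_model, max_turns=3):
    """Return (questions asked, operations found or None)."""
    questions = []
    pending = request
    replies = iter(replies)
    for _ in range(max_turns):
        items = extract(pending)                                # step 620
        unknown = [i for i in items if i not in data_model]     # step 630
        if items and not unknown:
            # Stand-in for steps 650 and 660: report the operations found.
            return questions, [data_model[i] for i in items]
        target = unknown[0] if unknown else "your request"
        questions.append(f"Please explain what '{target}' means.")  # steps 640, 680
        pending = next(replies, None)                           # step 690
        if pending is None:
            break
    return questions, None
```

With a data model that knows "year", "countries", "fatal" and "attacks", the request "Please list dangerous countries by year" followed by the simplified reply "fatal attacks" would produce one question about "dangerous" and then resolve to the operations for "fatal" and "attacks".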

In other embodiments according to the present disclosure, if the data analysis request received by the data analysis device 105 at 610 is the one shown at 420 in FIG. 4A, namely "check by year", the data analysis device 105 identifies the content item "year" in the request at 620. Because operations associated with "year" are stored in the data model, the data analysis device 105 determines at 630 that an operation to be applied to the data set can be determined based on the content item.

Then, at 650, the data analysis device 105 determines a code fragment for implementing the above operations. At 660, the data analysis device 105 may execute the code fragment to generate the corresponding data analysis result and provide that result to the user 101, as shown at 430.

In some embodiments, the data analysis device 105 may also determine heuristic information based on the data analysis result. As shown in the embodiment of FIG. 6, after providing the data analysis result (that is, the curve of the number of attacks per year) to the user 101 at 660, the data analysis device 105 may further determine an extended analysis result based on that data analysis result at 670, and provide the extended analysis result to the user 101 as heuristic information at 680. The heuristic information is, for example, "Want to know more about the 1960 outlier?", as shown at 440 in FIG. 4, together with the corresponding candidate options 441 and 442.

At 690, the data analysis device 105 receives supplementary information from the user, for example the "OK" indicated by 443 in FIG. 4B. In some embodiments, through context analysis, the data analysis device 105 may interpret the supplementary information entered by the user as "I want to know more about the 1960 outlier". Then, at 620, the data analysis device 105 extracts content items from that supplementary information, for example "1960" and "outlier".

Then, at 630, the data analysis device 105 determines whether an operation to be applied to the data set can be determined based on the extracted content items. In this embodiment, because "1960" and "outlier" are both content items in the established data model, the data analysis device 105 can determine from them, at 640, the corresponding operations to be applied to the data set 200. Then, at 650, the data analysis device 105 determines a code fragment for implementing the determined operations. At 660, the data analysis device 105 executes the code fragment to generate the corresponding data analysis result and provides that result to the user 101, as shown at 451 and 452.

It should be understood that the above example is merely illustrative and not limiting. In other embodiments according to the present disclosure, after providing the above data analysis result, the data analysis device 105 may continue to provide heuristic information to the user 101, for example 453 in FIG. 4B, so that the user can view further related data analysis results.

According to embodiments of the present disclosure, the user interface between the user 101 and the data analysis device 105 may include one or more dialogs. In some embodiments, the dialog described above in connection with the embodiments of FIGS. 5 and 6 may be regarded as a first dialog in the user interface. Upon receiving another data analysis request from the user 101, the data analysis device 105 may create, by drag-and-drop, a second dialog that is distinct from the first dialog, and may provide the user with the data analysis result for that other request in the second dialog. FIG. 7 shows a user interface 700 with multiple dialogs according to an embodiment of the present disclosure. In the embodiment shown in FIG. 7, while the first dialog 710 is in progress, the user 101 creates a second dialog 720 by clicking or dragging the text or button "gender" (shown at 701) on the left side of the user interface. In the second dialog, the data analysis device 105 may provide the result for the data analysis request 721, "gender", as shown at 722. In this way, the user can carry out several data analysis tasks at the same time in several dialogs, which effectively improves productivity and makes the device more convenient to use.

According to further embodiments of the present disclosure, the user may be offered, in each dialog, an option to display only the parts related to data analysis results. FIG. 8 shows a user interface 800 according to an embodiment of the present disclosure. In FIG. 8, by clicking the relevant button or control in the user interface, the user makes only the parts 810, 820 and 830 that relate to data analysis results visible. The user can arrange the views and resize them by dragging the parts 810, 820 and 830. In this mode, the parts relating to data analysis requests, heuristic information and the like are hidden. In this way, the user can conveniently obtain the required data analysis results, for example by directly generating a report or chart, which makes the device easier to use and improves the user experience.

According to embodiments of the present disclosure, the data analysis device 105 can determine the user's preferences or interests by learning from the data analysis requests of the user 101, and thereby build a user profile. The data analysis device 105 can also update and refine the user profile during the dialogue with the user 101 in order to better understand and analyze the user's needs and intentions. For example, for the data set 200 of shark attack records, if the user 101 is preparing to write an article analyzing shark attacks in Australia, the user most likely does not need data analysis concerning the United States and needs only data analysis concerning Australia. In this case, the data analysis device 105 may give a higher score to code fragments composed of operations relating to the content item "Australia". The data analysis results that the data analysis device 105 provides to the user are then very likely to concern Australia, and thus better meet the user's needs.

The methods and functions described herein may be performed at least in part by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), and the like.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or a server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In addition, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.

Some example implementations of the present disclosure are listed below.

Embodiments of the present disclosure include a computer-implemented method. The method includes: receiving, in a dialog, a data analysis request for a data set from a user; determining, based on the data analysis request, heuristic information for guiding the dialog, the heuristic information being different from a result of the data analysis request; and providing the heuristic information to the user to enable the user to provide, based on the heuristic information, supplemental information associated with the data analysis request.

In some embodiments, determining the heuristic information includes: extracting a content item from the data analysis request; determining whether at least one operation to be applied to the data set can be determined based on the identified content item; and in response to being unable to determine the at least one operation based on the content item, generating a question for the data analysis request based on the identified content item, the question prompting the user to provide clarifying information about the content item.

In some embodiments, determining whether at least one operation to be applied to the data set can be determined based on the identified content item includes: performing linguistic analysis on the data analysis request; detecting a context of the data analysis request; and attempting to determine the at least one operation based on the content item, the result of the linguistic analysis, the context, and a predefined data model.
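
The flow described in the preceding embodiments (extract content items, try to map them to operations through a data model, and otherwise ask a clarifying question) can be sketched as below. This is a simplified sketch under stated assumptions: the `DATA_MODEL` contents, the stop-word list, the whitespace-based `extract_content_items` function, and the wording of the question are hypothetical, and the linguistic analysis and context detection steps are deliberately omitted.

```python
# Hypothetical data model: content items mapped to the operations defined for them.
DATA_MODEL = {
    "country": ["filter", "group_by"],
    "year": ["filter", "trend"],
    "attacks": ["count", "sum"],
}

# Assumed stop words that are not treated as content items.
STOP_WORDS = {"show", "me", "the", "by", "of", "in", "for"}


def extract_content_items(request):
    """Naive extraction: lowercase, strip end punctuation, drop stop words."""
    tokens = request.lower().strip("?.! ").split()
    return [t for t in tokens if t not in STOP_WORDS]


def heuristic_information(request, data_model):
    """Return (operations, questions) for a request.

    Items defined in the data model yield their operations; items that are
    undefined yield a clarifying question that lists the known content items.
    """
    operations, questions = {}, []
    for item in extract_content_items(request):
        if item in data_model:
            operations[item] = data_model[item]
        else:
            options = ", ".join(sorted(data_model))
            questions.append(f"What do you mean by '{item}'? Known items: {options}.")
    return operations, questions


operations, questions = heuristic_information("Show me attacks by danger", DATA_MODEL)
```

Here "attacks" resolves to its operations, while "danger" is not defined in the data model, so a question is generated instead of an analysis result for that item.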

In some embodiments, the method may further include: in response to being able to determine the at least one operation based on the identified content item, determining a code snippet for implementing the at least one operation; and determining the heuristic information based on the code snippet.

In some embodiments, determining the heuristic information includes generating the heuristic information by extending the data analysis request based on at least one of: a content item in the data analysis request, a result of the data analysis request, and a predetermined expansion rule.
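
One way to read "extending the data analysis request" with predetermined expansion rules is as template-based generation of follow-up suggestions. The sketch below assumes that reading; the `EXPANSION_RULES` table, its templates, and the `expand_request` function are hypothetical and are not taken from the disclosure.

```python
# Hypothetical expansion rules: for each data dimension, templates for follow-up requests.
EXPANSION_RULES = {
    "country": [
        "Compare {value} with other countries",
        "Show the yearly trend for {value}",
    ],
    "year": ["Show the same result for the year before {value}"],
}


def expand_request(content_items):
    """Generate follow-up suggestions from the content items of a request.

    content_items maps a dimension name to the value found in the request.
    Dimensions without a rule produce no suggestion.
    """
    suggestions = []
    for dimension, value in content_items.items():
        for template in EXPANSION_RULES.get(dimension, []):
            suggestions.append(template.format(value=value))
    return suggestions


suggestions = expand_request({"country": "Australia"})
```

The suggestions are offered alongside, not instead of, the result of the original request, so the user can pick one as supplemental information for the next turn of the dialog.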

In some embodiments, the method may further include: in response to receiving the supplemental information from the user, determining a data analysis result associated with the supplemental information; and providing the determined data analysis result to the user.

In some embodiments, the dialog is a first dialog in a user interface, and the method may further include: in response to receiving a further data analysis request from the user, establishing, in a drag-and-drop manner, a second dialog different from the first dialog; and providing, in the second dialog, a data analysis result for the further data analysis request to the user.

In some embodiments, the method may further include: storing at least one of the data analysis request and the supplemental information as a user profile, so as to provide relevant data analysis results based on the user profile.

Embodiments of the present disclosure include an electronic device, including: a processing unit; and a memory coupled to the processing unit and storing instructions that, when executed by the processing unit, perform the following actions: receiving, in a dialog, a data analysis request for a data set from a user; determining, based on the data analysis request, heuristic information for guiding the dialog, the heuristic information being different from a result of the data analysis request; and providing the heuristic information to the user to enable the user to provide, based on the heuristic information, supplemental information associated with the data analysis request.

In some embodiments, determining the heuristic information may include: extracting a content item from the data analysis request; determining whether at least one operation to be applied to the data set can be determined based on the identified content item; and in response to being unable to determine the at least one operation based on the content item, generating a question for the data analysis request based on the identified content item, the question prompting the user to provide clarifying information about the content item.

In some embodiments, determining whether at least one operation to be applied to the data set can be determined based on the identified content item may include: performing linguistic analysis on the data analysis request; detecting a context of the data analysis request; and attempting to determine the at least one operation based on the content item, the result of the linguistic analysis, the context, and a predefined data model.

In some embodiments, the actions may further include: in response to being able to determine the at least one operation based on the identified content item, determining a code snippet for implementing the at least one operation; and determining the heuristic information based on the code snippet.

In some embodiments, determining the heuristic information may include generating the heuristic information by extending the data analysis request based on at least one of: a content item in the data analysis request, a result of the data analysis request, and a predetermined expansion rule.

In some embodiments, the actions may further include: in response to receiving the supplemental information from the user, determining a data analysis result associated with the supplemental information; and providing the determined data analysis result to the user.

In some embodiments, the dialog is a first dialog in a user interface, and the actions may further include: in response to receiving a further data analysis request from the user, establishing, in a drag-and-drop manner, a second dialog different from the first dialog; and providing, in the second dialog, a data analysis result for the further data analysis request to the user.

In some embodiments, the actions may further include: storing at least one of the data analysis request and the supplemental information as a user profile, so as to provide relevant data analysis results based on the user profile.

Embodiments of the present disclosure also provide a computer program product stored in a non-transitory computer storage medium and including machine-executable instructions that, when run in a device, cause the device to: receive, in a dialog, a data analysis request for a data set from a user; determine, based on the data analysis request, heuristic information for guiding the dialog, the heuristic information being different from a result of the data analysis request; and provide the heuristic information to the user to enable the user to provide, based on the heuristic information, supplemental information associated with the data analysis request.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: extract a content item from the data analysis request; determine whether at least one operation to be applied to the data set can be determined based on the identified content item; and in response to being unable to determine the at least one operation based on the content item, generate a question for the data analysis request based on the identified content item, the question prompting the user to provide clarifying information about the content item.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: perform linguistic analysis on the data analysis request; detect a context of the data analysis request; and attempt to determine the at least one operation based on the content item, the result of the linguistic analysis, the context, and a predefined data model.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: in response to being able to determine the at least one operation based on the identified content item, determine a code snippet for implementing the at least one operation; and determine the heuristic information based on the code snippet.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to generate the heuristic information by extending the data analysis request based on at least one of: a content item in the data analysis request, a result of the data analysis request, and a predetermined expansion rule.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: in response to receiving the supplemental information from the user, determine a data analysis result associated with the supplemental information; and provide the determined data analysis result to the user.

In some embodiments, the dialog is a first dialog in a user interface, and the machine-executable instructions, when run in the device, further cause the device to: in response to receiving a further data analysis request from the user, establish, in a drag-and-drop manner, a second dialog different from the first dialog; and provide, in the second dialog, a data analysis result for the further data analysis request to the user.

In some embodiments, the machine-executable instructions, when run in the device, further cause the device to: store at least one of the data analysis request and the supplemental information as a user profile, so as to provide relevant data analysis results based on the user profile.

Although the present disclosure has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (20)

1. A computer-implemented method, comprising:
receiving a data analysis request for a data set from a user in a dialog;
extracting a content item from the data analysis request;
comparing the extracted content item to a data model for the data set, the data model including a plurality of content items defined in the data model and one or more operations associated with each of the plurality of content items to be applied to the data set, the one or more operations based on at least one of: historical statistics, a profile or preference of the user, or a record of access by a plurality of users;
based on the comparison, determining that the extracted content item and associated operations to be applied to the data set are undefined in the data model;
in response to the determination, generating heuristic information for directing the dialog to prompt the user to provide clarification information associated with the extracted content item and different from a result of the request for data analysis; and
providing the heuristic information to the user to enable the user to provide supplemental information associated with the data analysis request based on the heuristic information.
2. The method of claim 1, wherein generating the heuristic information for directing the dialog to prompt the user to provide the clarification information associated with the extracted content item comprises:
generating a question for the data analysis request based on the extracted content item.
3. The method of claim 1, further comprising:
performing linguistic analysis on the data analysis request;
detecting a context of the data analysis request; and
generating the heuristic information in response to determining that the associated operation to be applied to the dataset cannot be determined based on the linguistic analysis results, the context, and the associated operation that is not defined in the data model.
4. The method of claim 1, further comprising:
extracting at least one additional content item from the data analysis request;
determining that the at least one additional content item is one of the plurality of content items and has at least one associated operation defined in the data model;
determining a code snippet for implementing the at least one associated operation defined in the data model; and
determining the heuristic information based on the code snippet.
5. The method of claim 1, wherein the heuristic information is generated based on at least one of:
the extracted content item in the data analysis request, and
predefined rules for extending the data analysis request.
6. The method of claim 1, further comprising:
receiving the supplemental information from the user;
in response to receiving the supplemental information from the user, determining a data analysis result associated with the supplemental information; and
providing the determined data analysis results to the user.
7. The method of claim 6, wherein the determined data analysis results are provided to the user as one of:
a graph;
a table;
a text;
audio; and
video.
8. The method of claim 6, wherein the supplemental information received from the user includes at least one content item from the plurality of content items in the data model, and determining the data analysis results associated with the supplemental information includes applying the corresponding one or more operations defined for the at least one content item in the data model to the data set.
9. The method of claim 1, further comprising:
receiving the supplemental information from the user; and
storing at least one of the data analysis request and the supplemental information as a user profile to provide relevant data analysis results based on the user profile.
10. The method of claim 1, wherein generating the heuristic information for directing the dialog to prompt the user to provide the clarification information associated with the extracted content item further comprises:
providing a list of options selectable by the user.
11. The method of claim 1, wherein the data set includes a table having a plurality of rows representing data records and a plurality of columns representing data dimensions of the data records, and the plurality of content items in the data model includes at least the data dimensions of the data records.
12. A computing device, comprising:
a processing unit; and
a memory coupled to the processing unit and storing instructions that, when executed by the processing unit, cause the computing device to perform a set of operations comprising:
receiving a data analysis request for a data set from a user in a dialog;
extracting a content item from the data analysis request;
comparing the extracted content item to a data model for the data set, the data model including a plurality of content items defined in the data model and one or more operations associated with each of the plurality of content items to be applied to the data set, the one or more operations based on at least one of: historical statistics, a profile or preference of the user, or a record of access by a plurality of users;
based on the comparison, determining that the extracted content item and associated operations to be applied to the data set are undefined in the data model;
in response to the determination, generating heuristic information for directing the dialog to prompt the user to provide clarification information associated with the extracted content item and different from a result of the data analysis request; and
providing the heuristic information to the user to enable the user to provide supplemental information associated with the data analysis request based on the heuristic information.
13. The computing device of claim 12, wherein generating the heuristic information for directing the dialog to prompt the user to provide the clarification information associated with the extracted content item comprises:
generating a question for the data analysis request based on the content item.
14. The computing device of claim 12, wherein the heuristic information is generated based on at least one of:
the extracted content item in the data analysis request, and
predefined rules for extending the data analysis request.
15. The computing device of claim 12, the set of operations further comprising:
receiving the supplemental information from the user;
in response to receiving the supplemental information from the user, determining a data analysis result associated with the supplemental information; and
providing the determined data analysis results to the user.
16. The computing device of claim 12, wherein the dialog is a first dialog in a user interface, and the set of operations further comprises:
in response to receiving a second data analysis request from the user, establishing a second dialog different from the first dialog in a drag-and-drop manner; and
providing data analysis results for the second data analysis request to the user in the second dialog.
17. The computing device of claim 12, the set of operations further comprising:
receiving the supplemental information from the user; and
storing at least one of the data analysis request and the supplemental information as a user profile to provide relevant data analysis results based on the user profile.
18. The computing device of claim 12, the set of operations further comprising:
performing linguistic analysis on the data analysis request;
detecting a context of the data analysis request; and
generating the heuristic information in response to determining that the associated operation to be applied to the dataset cannot be determined based on the linguistic analysis results, the context, and the associated operation that is not defined in the data model.
19. The computing device of claim 12, the set of operations further comprising:
extracting at least one additional content item from the data analysis request;
determining that the at least one additional content item is one of the plurality of content items and has at least one associated operation defined in the data model;
determining a code snippet for implementing the at least one associated operation defined in the data model; and
determining the heuristic information based on the code snippet.
20. A non-transitory machine-readable medium storing machine-executable instructions that, when executed in a device, cause the device to perform operations comprising:
receiving a data analysis request for a data set from a user in a dialog;
extracting a content item from the data analysis request;
comparing the extracted content item to a data model for the data set, the data model including a plurality of content items defined in the data model and one or more operations associated with each of the plurality of content items to be applied to the data set, the one or more operations based on at least one of: historical statistics, a profile or preference of the user, or a record of access by a plurality of users;
based on the comparison, determining that the extracted content item and associated operations to be applied to the data set are undefined in the data model;
in response to the determination, generating heuristic information for directing the dialog to prompt the user to provide clarification information associated with the extracted content item and different from a result of the data analysis request; and
providing the heuristic information to the user to enable the user to provide supplemental information associated with the data analysis request based on the heuristic information.
CN201610867019.5A 2016-09-29 2016-09-29 Conversational data analysis Active CN107885744B (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
CN202211627592.0A CN115858730A (en) 2016-09-29 2016-09-29 Conversational data analysis
CN201610867019.5A CN107885744B (en) 2016-09-29 2016-09-29 Conversational data analysis
PCT/US2017/052839 WO2018063924A1 (en) 2016-09-29 2017-09-22 Conversational data analysis
EP17780278.2A EP3519988A1 (en) 2016-09-29 2017-09-22 Conversational data analysis
US16/338,061 US11423229B2 (en) 2016-09-29 2017-09-22 Conversational data analysis
US17/813,435 US20220405479A1 (en) 2016-09-29 2022-07-19 Conversational data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610867019.5A CN107885744B (en) 2016-09-29 2016-09-29 Conversational data analysis

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202211627592.0A Division CN115858730A (en) 2016-09-29 2016-09-29 Conversational data analysis

Publications (2)

Publication Number Publication Date
CN107885744A CN107885744A (en) 2018-04-06
CN107885744B true CN107885744B (en) 2023-01-03

Family

ID=60020626

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201610867019.5A Active CN107885744B (en) 2016-09-29 2016-09-29 Conversational data analysis
CN202211627592.0A Pending CN115858730A (en) 2016-09-29 2016-09-29 Conversational data analysis

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202211627592.0A Pending CN115858730A (en) 2016-09-29 2016-09-29 Conversational data analysis

Country Status (4)

Country Link
US (2) US11423229B2 (en)
EP (1) EP3519988A1 (en)
CN (2) CN107885744B (en)
WO (1) WO2018063924A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10896297B1 (en) 2017-12-13 2021-01-19 Tableau Software, Inc. Identifying intent in visual analytical conversations
US11055489B2 (en) * 2018-10-08 2021-07-06 Tableau Software, Inc. Determining levels of detail for data visualizations using natural language constructs
US11537276B2 (en) 2018-10-22 2022-12-27 Tableau Software, Inc. Generating data visualizations according to an object model of selected data sources
US11314817B1 (en) 2019-04-01 2022-04-26 Tableau Software, LLC Methods and systems for inferring intent and utilizing context for natural language expressions to modify data visualizations in a data visualization interface
US11455339B1 (en) 2019-09-06 2022-09-27 Tableau Software, LLC Incremental updates to natural language expressions in a data visualization user interface
US10997217B1 (en) 2019-11-10 2021-05-04 Tableau Software, Inc. Systems and methods for visualizing object models of database tables
US11714807B2 (en) * 2019-12-24 2023-08-01 Sap Se Platform for conversation-based insight search in analytics systems
US12217000B1 (en) * 2021-09-10 2025-02-04 Tableau Software, LLC Optimizing natural language analytical conversations using platform-specific input and output interface functionality

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101124578A (en) * 2005-01-14 2008-02-13 国际商业机器公司 Sharable multi-tenant reference data utility and repository, including value enhancement and on-demand data delivery and methods of operation
CN101364229A (en) * 2008-10-06 2009-02-11 中国移动通信集团设计院有限公司 A Data Warehouse Host Resource Prediction Method Based on Time Capacity Analysis
CN103295148A (en) * 2012-02-27 2013-09-11 埃森哲环球服务有限公司 Digital consumer data model and customer analytic record
CN104077347A (en) * 2013-03-26 2014-10-01 国际商业机器公司 Method and a system for profiling social trendsetters on a communications network

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070294229A1 (en) 1998-05-28 2007-12-20 Q-Phrase Llc Chat conversation methods traversing a provisional scaffold of meanings
US7798417B2 (en) * 2000-01-03 2010-09-21 Snyder David M Method for data interchange
WO2002073331A2 (en) 2001-02-20 2002-09-19 Semantic Edge Gmbh Natural language context-sensitive and knowledge-based interaction environment for dynamic and flexible product, service and information search and presentation applications
US8015143B2 (en) 2002-05-22 2011-09-06 Estes Timothy W Knowledge discovery agent system and method
AU2003293071A1 (en) 2002-11-22 2004-06-18 Roy Rosser Autonomous response engine
WO2007134402A1 (en) 2006-05-24 2007-11-29 Mor(F) Dynamics Pty Ltd Instant messaging system
US8788517B2 (en) * 2006-06-28 2014-07-22 Microsoft Corporation Intelligently guiding search based on user dialog
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8949377B2 (en) 2008-05-21 2015-02-03 The Delfin Project, Inc. Management system for a conversational system
US8375014B1 (en) * 2008-06-19 2013-02-12 BioFortis, Inc. Database query builder
US9292577B2 (en) * 2010-09-17 2016-03-22 International Business Machines Corporation User accessibility to data analytics
JP6087899B2 (en) 2011-03-31 2017-03-01 マイクロソフト テクノロジー ライセンシング,エルエルシー Conversation dialog learning and conversation dialog correction
US20120253789A1 (en) * 2011-03-31 2012-10-04 Microsoft Corporation Conversational Dialog Learning and Correction
US9842168B2 (en) 2011-03-31 2017-12-12 Microsoft Technology Licensing, Llc Task driven user intents
US20120260263A1 (en) 2011-04-11 2012-10-11 Analytics Intelligence Limited Method, system and program for data delivering using chatbot
US20120306741A1 (en) 2011-06-06 2012-12-06 Gupta Kalyan M System and Method for Enhancing Locative Response Abilities of Autonomous and Semi-Autonomous Agents
KR101402506B1 (en) 2011-12-01 2014-06-03 라인 가부시키가이샤 System and method for providing information interactively by instant messaging application
US9020824B1 (en) 2012-03-09 2015-04-28 Google Inc. Using natural language processing to generate dynamic content
US9575963B2 (en) 2012-04-20 2017-02-21 Maluuba Inc. Conversational agent
US9424233B2 (en) * 2012-07-20 2016-08-23 Veveo, Inc. Method of and system for inferring user intent in search input in a conversational interaction system
US9465833B2 (en) * 2012-07-31 2016-10-11 Veveo, Inc. Disambiguating user intent in conversational interaction system for large corpus information retrieval
US9269354B2 (en) 2013-03-11 2016-02-23 Nuance Communications, Inc. Semantic re-ranking of NLU results in conversational dialogue applications
US10133546B2 (en) * 2013-03-14 2018-11-20 Amazon Technologies, Inc. Providing content on multiple devices
US10572473B2 (en) 2013-10-09 2020-02-25 International Business Machines Corporation Optimized data visualization according to natural language query
US9189742B2 (en) * 2013-11-20 2015-11-17 Justin London Adaptive virtual intelligent agent
WO2015100362A1 (en) 2013-12-23 2015-07-02 24/7 Customer, Inc. Systems and methods for facilitating dialogue mining
US10133530B2 (en) * 2014-05-19 2018-11-20 Allstate Insurance Company Electronic display systems connected to vehicles and vehicle-based systems
US9335911B1 (en) * 2014-12-29 2016-05-10 Palantir Technologies Inc. Interactive user interface for dynamic data analysis exploration and query processing
US10719524B1 (en) * 2015-04-15 2020-07-21 Arimo, LLC Query template based architecture for processing natural language queries for data analysis
EP3142028A3 (en) * 2015-09-11 2017-07-12 Google, Inc. Handling failures in processing natural language queries through user interactions
CN105512228B (en) * 2015-11-30 2018-12-25 Beijing Guangnian Wuxian Technology Co., Ltd. Two-way question-and-answer data processing method and system based on an intelligent robot
US20180005149A1 (en) * 2016-07-04 2018-01-04 Musigma Business Solutions Pvt. Ltd. Guided analytics system and method
US10453074B2 (en) * 2016-07-08 2019-10-22 Asapp, Inc. Automatically suggesting resources for responding to a request

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101124578A (en) * 2005-01-14 2008-02-13 International Business Machines Corporation Sharable multi-tenant reference data utility and repository, including value enhancement and on-demand data delivery and methods of operation
CN101364229A (en) * 2008-10-06 2009-02-11 China Mobile Group Design Institute Co., Ltd. A Data Warehouse Host Resource Prediction Method Based on Time Capacity Analysis
CN103295148A (en) * 2012-02-27 2013-09-11 Accenture Global Services Limited Digital consumer data model and customer analytic record
CN104077347A (en) * 2013-03-26 2014-10-01 International Business Machines Corporation Method and a system for profiling social trendsetters on a communications network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Tools for analyzing qualitative data: the history and relevance of qualitative data analysis of software; Linda S et al.; 《Handbook of Research on Educational Communications and Technology》; 20130101; 221-236 *
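Research on the application of big data technology in the small and micro business of commercial banks; Xiong Fuping; 《Commercial Bank Management》; 20160910; 53-55 *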

Also Published As

Publication number Publication date
EP3519988A1 (en) 2019-08-07
CN115858730A (en) 2023-03-28
CN107885744A (en) 2018-04-06
US11423229B2 (en) 2022-08-23
WO2018063924A1 (en) 2018-04-05
US20220405479A1 (en) 2022-12-22
US20190236144A1 (en) 2019-08-01

Similar Documents

Publication Publication Date Title
CN107885744B (en) Conversational data analysis
Yu et al. FlowSense: A natural language interface for visual data exploration within a dataflow system
US9621601B2 (en) User collaboration for answer generation in question and answer system
US10733197B2 (en) Method and apparatus for providing information based on artificial intelligence
US10515147B2 (en) Using statistical language models for contextual lookup
US20170308571A1 (en) Techniques for utilizing a natural language interface to perform data analysis and retrieval
US9817821B2 (en) Translation and dictionary selection by context
US20150161242A1 (en) Identifying and Displaying Relationships Between Candidate Answers
CN111401058B (en) Attribute value extraction method and device based on named entity recognition tool
JP2021101361A (en) Method, device, apparatus and storage medium for generating event topics
US11842154B2 (en) Visually correlating individual terms in natural language input to respective structured phrases representing the natural language input
US20160171063A1 (en) Modeling actions, consequences and goal achievement from social media and other digital traces
US11481733B2 (en) Automated interfaces with interactive keywords between employment postings and candidate profiles
US20220215186A1 (en) Machine learning enabled text analysis with support for unstructured data
CN117788172A (en) Data asset assessment method, device, equipment and medium based on large model
US20220092452A1 (en) Automated machine learning tool for explaining the effects of complex text on predictive results
US20220092508A1 (en) Method and system for generating contextual narrative for deriving insights from visualizations
JP2018198044A (en) Apparatus and method for generating multiple-event pattern query
WO2021120878A1 (en) Book graph-based book display method, computing device, and storage medium
EP2800014A1 (en) Method for searching curriculum vitae's on a job portal website, server and computer program product therefore
US20240289554A1 (en) Method and system for personalized embedding search engine
US20230196035A1 (en) Identifying zones of interest in text transcripts using deep learning
CN118410153B (en) Multi-mode enhanced dialogue processing and responding system and method based on user intention
US20240346252A1 (en) Automated analysis of computer systems using machine learning
CN118981527A (en) Question answering method, device, electronic device, storage medium, intelligent agent and program product based on large model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment