WO2022135419A1 - Method and apparatus for voice interaction - Google Patents

Method and apparatus for voice interaction

Info

Publication number
WO2022135419A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice interaction
target
information
page jump
vehicle system
Prior art date
Application number
PCT/CN2021/140193
Other languages
English (en)
French (fr)
Inventor
易晖
张又亮
申众
赵鹏
史小凯
翁志伟
Original Assignee
广州橙行智动汽车科技有限公司
广州小鹏汽车科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州橙行智动汽车科技有限公司 and 广州小鹏汽车科技有限公司
Publication of WO2022135419A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Definitions

  • the present invention relates to the technical field of voice interaction, and in particular, to a method and device for voice interaction.
  • touch-screen interaction is usually used; that is, the user has to enter through the entrance on the large-screen main page and click links multiple times to reach the large-screen page or large-screen element that the user wants to enter.
  • VUI (Voice User Interface)
  • the hierarchical relationship between different pages can be described through pre-defined structured slots.
  • the page hierarchy relationships of different applications, different page types, and different elements are diverse, so a large number of pre-defined structured slots is required.
  • in addition, because the VUI is updated and upgraded together with the system while applications are updated more quickly, a page revision of the application client after an application update may change the page hierarchy.
  • in that case, the structured slots of the VUI no longer apply to the updated page hierarchy, so one-sentence direct interaction cannot be achieved, which degrades the user experience.
  • a method for voice interaction comprising:
  • when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event is determined from the knowledge graph data; wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information;
  • the in-vehicle system or the application in the in-vehicle system is controlled according to the target semantic representation.
  • the determining the target page jump information corresponding to the voice interaction event from the knowledge graph data includes:
  • according to the second entity information of the target functional entity corresponding to the voice interaction event, the target page jump information corresponding to the voice interaction event is determined from the knowledge graph data.
  • before the determining of the second entity information of the target functional entity corresponding to the voice interaction event, the method further includes:
  • determining, according to the voice interaction event, the first entity information of the in-vehicle system or of the target application in the in-vehicle system;
  • the determining, according to the second entity information, of the target page jump information corresponding to the voice interaction event from the knowledge graph data includes:
  • according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event is determined from the knowledge graph data.
  • before the determining of the target page jump information corresponding to the voice interaction event from the knowledge graph data, the method further includes:
  • when the target intent category information of the voice interaction event is the specified intent category information, the determining of the target page jump information corresponding to the voice interaction event from the knowledge graph data is performed.
  • the specified intent category information is intent category information for the in-vehicle system or a functional entity applied in the in-vehicle system.
  • controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation includes:
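  • an event control instruction for the voice interaction event is generated according to the target semantic representation, and the in-vehicle system or the application in the in-vehicle system is controlled according to the event control instruction.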
  • the page jump information includes page URL, or, the page jump information includes page URL and anchor point information.
  • a device for voice interaction comprising:
  • a target page jump information determination module, configured to determine, when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event from the knowledge graph data; wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information;
  • a target semantic representation generation module, configured to construct one or more dynamic slots according to the target page jump information, and fill the one or more dynamic slots with slot values to obtain the target semantic representation;
  • a control module, configured to control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
  • a vehicle includes a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program implementing the method of voice interaction as described above when executed by the processor.
  • a computer-readable storage medium stores a computer program on the computer-readable storage medium, and when the computer program is executed by a processor, implements the above-mentioned voice interaction method.
  • when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event is determined from knowledge graph data, wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information.
  • one or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain the target semantic representation, and the in-vehicle system or the application in the in-vehicle system is controlled according to the target semantic representation. Slot filling is thus driven by the target page jump information determined from the knowledge graph, so the slot structure is highly flexible and the slot structures of all page hierarchy relationships do not need to be defined.
  • FIG. 1a is a schematic diagram of jumping of a navigation application according to an embodiment of the present invention.
  • FIG. 1b is a schematic diagram of jumping of another navigation application provided by an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a knowledge graph provided by an embodiment of the present invention.
  • FIG. 3 is a flowchart of steps of a method for voice interaction provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an apparatus for voice interaction provided by an embodiment of the present invention.
  • a voice interaction module (such as a voice assistant) may be included in the vehicle system.
  • the voice interaction module can recognize the voice interaction information that the user inputs by voice, determine the intent information contained in the user's voice data, and perform the corresponding interactive control according to the intent.
  • the intent information may be the purpose of the user inputting the voice data.
  • for example, if the voice data input by the user is "jump to the volume control of the navigation", the user wants the page to jump to the volume control page of the navigation application. The voice data therefore corresponds to one-sentence direct intent information.
  • to improve the user's voice interaction experience, a one-sentence direct page jump method can be set in the voice interaction module.
  • the semantic representation can be defined in advance according to the page hierarchy relationship of different applications, different page types, and different elements.
  • the user can then jump to the specified large-screen page or large-screen page element according to the predefined semantic representation, without clicking links multiple times through touch-screen interaction.
  • the page level relationship of volume adjustment during navigation is shown in Figure 1a.
  • the user clicks the settings tab in the interface and then clicks the volume setting in the settings tab, and a volume setting pop-up window pops up.
  • the volume setting pop-up window interface includes a volume adjustment control.
  • a piece of voice interaction information (that is, a query) can be predefined in the voice interaction module as "jump to navigation volume adjustment", together with the corresponding semantic representation of the voice interaction information:
  • first-level page: navigation; open method: home page
  • third-level page: volume setting; open method: pop-up box
  • the system can determine the predefined semantic representation according to the voice interaction information and then open, in sequence, the navigation application interface, the settings interface, and the volume setting pop-up page; no anchoring is required during this jump.
  • the page hierarchy can change, and the voice interaction module needs to redefine the semantic expression in order to perform the corresponding voice interaction.
  • the page level relationship of volume adjustment during navigation is shown in Figure 1b.
  • the navigation application interface may include a volume adjustment control.
  • a piece of voice interaction information (that is, a query) can be predefined as "jump to navigation volume adjustment", together with the corresponding semantic representation of the voice interaction information:
  • the system can determine the semantic expression according to the voice interaction information, so that the system can open the navigation application interface, and then anchor and locate the volume adjustment control.
  • the voice interaction module can be updated with the update of the in-vehicle system, while the application update of the in-vehicle system is not synchronized with the voice interaction module, and the application update speed is faster.
  • the semantic representation of the original in-vehicle system or in-vehicle application is predefined in the voice interaction module and its slot information is fixed, so the slots and slot values in the semantic representation cannot change with the updated page hierarchy, making them incompatible with newer versions of the application.
  • the knowledge graph can be combined with dynamic slots: slot values are filled according to the application's current page hierarchy to generate the semantic representation corresponding to the voice interaction information. This makes the slots highly flexible and solves the problem that page hierarchy changes caused by application updates break one-sentence direct access, without requiring large code changes.
  • the system or system application can upload the updated, packaged data related to the page hierarchy, such as element name, category, parent node, anchor name, and page address, to the NLU (Natural Language Understanding) platform, so that a page structure diagram can be generated from the packaged data and the knowledge graph can then be updated.
  • the navigation application sends the page jump information of all its functional entities to the NLU platform, and the corresponding page structure relationship table is generated; Table 1 covers the volume adjustment function of the navigation application.
  • the corresponding knowledge graph can be obtained, and the knowledge graph includes functional entities and page jump information corresponding to the functional entities.
  • as shown in FIG. 2, a knowledge graph is generated according to Table 1.
  • volume switch and volume adjustment are included, and the page jump information may be: navigation application-volume switch-volume adjustment.
  • a dynamic slot can be constructed in the corresponding preset internal voice format according to the updated knowledge graph and filled with a slot value, so that the updated semantic representation of the current system or system application can be generated.
  • according to the semantic representation, the system controls the jump to the corresponding page.
  • referring to FIG. 3, a flowchart of steps of a method for voice interaction provided by an embodiment of the present invention is shown, which may specifically include the following steps:
  • Step 301, when a voice interaction event is detected, determine the target page jump information corresponding to the voice interaction event from knowledge graph data; wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information;
  • the page jump information includes a page URL, or the page jump information includes a page URL and anchor point information.
  • knowledge graph data may be stored in vehicle data, and knowledge graph data may be used to describe entities and relationships between entities, for example, entities corresponding to parent nodes and/or child nodes of an entity.
  • the page jump information of the functional entity corresponding to the in-vehicle system or the application of the in-vehicle system can be formed through the relationship between the entities and the application of the in-vehicle system.
  • the volume adjustment control in the navigation application is a functional entity and can be used to adjust the volume of the navigation voice.
  • the page jump information may further include a page URL, or the page jump information includes page URL and anchor point information.
  • the knowledge graph can also change accordingly, and the page jump information in the knowledge graph data will also change.
  • the vehicle system can detect the corresponding voice interaction event, so that the target page jump information corresponding to the voice interaction event can be determined in all the page jump information of the updated knowledge graph data.
  • Step 302 constructing one or more dynamic slots according to the target page jump information, and filling the one or more dynamic slots with slot values to obtain the target semantic representation;
  • one or more dynamic slots can be constructed according to the target page jump information, and the corresponding slot value can be filled for each dynamic slot, so that the target semantic expression can be obtained.
  • the dynamic slots can also be constructed according to the current page jump information of the in-vehicle system or of the in-vehicle application and filled with the corresponding slot values, which makes the slot structure highly flexible, so that one-sentence direct access still works after a page revision.
  • the page jump information is determined as navigation application interface, settings interface, volume setting pop-up page; the corresponding slots are constructed according to the page jump information and filled with slot values, and the semantic representation corresponding to the voice interaction information is generated as:
  • the page jump information at this time is: open the navigation application interface, then anchor to the volume adjustment control; the corresponding slots are constructed according to the page jump information and filled with slot values.
  • the semantic expression corresponding to the generated voice interaction information at this time is:
  • Step 303 Control the in-vehicle system or the application in the in-vehicle system according to the semantic representation of the target.
  • the in-vehicle system or the application in the in-vehicle system can be controlled according to the target semantic representation, so that the page jumps according to the semantic interaction information input by the user.
  • controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation includes:
  • according to the target semantic representation, an event control instruction for the voice interaction event is generated; according to the event control instruction, the in-vehicle system or the application in the in-vehicle system is controlled.
  • the knowledge graph data and the internal semantic representation can be combined to generate the event control instruction for the voice interaction event.
  • the event control instruction is a control instruction that the in-vehicle system can recognize and execute; the in-vehicle system or the application in the in-vehicle system can then be controlled according to the event control instruction, that is, the page jump corresponding to the voice interaction event can be executed to reach the specified page or page element.
  • the event control instruction may include a command of the application client, and the command may include a page name, a page address, and an anchor point name.
  • the code of the generated control instruction is as follows:
  • the target page jump information that cannot be executed by the system can be converted into control instructions that can be directly executed by the system, thereby realizing voice control.
  • when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event is determined from knowledge graph data, wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information.
  • one or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain the target semantic representation, and the in-vehicle system or the application in the in-vehicle system is controlled according to the target semantic representation. Slot filling is thus driven by the target page jump information determined from the knowledge graph, so the slot structure is highly flexible.
  • referring to FIG. 4, a flowchart of steps of another voice interaction method provided by an embodiment of the present invention is shown, which may specifically include the following steps:
  • Step 401 when a voice interaction event is detected, determine the second entity information of the target functional entity corresponding to the voice interaction event;
  • the vehicle-mounted system can detect a voice interaction event, the voice interaction event can correspond to a target functional entity, and the target functional entity can include a second entity information.
  • the second entity information may be the entity information corresponding to the target page in the target page jump information of the voice interaction event; for example, when the detected voice interaction event is "jump to navigation volume adjustment", "volume adjustment" may be the second entity information.
  • in an embodiment of the present invention, before the determining of the second entity information of the target functional entity corresponding to the voice interaction event, the method further includes:
  • determining, according to the voice interaction event, the first entity information of the in-vehicle system or of the target application in the in-vehicle system;
  • the first entity information of the in-vehicle system or the target application in the in-vehicle system can also be determined according to the voice interaction event.
  • for example, in "jump to navigation volume adjustment", "navigation" is the first entity information.
  • the determining, according to the second entity information, of the target page jump information corresponding to the voice interaction event from the knowledge graph data includes:
  • according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event is determined from the knowledge graph data.
  • the first entity information is the entity corresponding to the in-vehicle system or the in-vehicle application, that is, the first page of the page jump process, and the second entity information corresponds to the page that finally needs to be reached, so that the target page jump information corresponding to the voice interaction event can be determined from the knowledge graph data.
  • the type of page jump, the application corresponding to the page, the first entity information, the second entity information, and so on can be determined by performing speech recognition on the speech input by the user and determining the relevance of the context included in the speech.
  • in an embodiment of the present invention, before the determining of the target page jump information corresponding to the voice interaction event from the knowledge graph data, the method further includes:
  • the specified intent category information is intent category information for the in-vehicle system or a functional entity applied in the in-vehicle system.
  • voice interaction events can correspond to different intents.
  • the target intent category information of the user's voice interaction event can be determined through NLU arbitration, dialogue state tracking, and other methods.
  • when the target intent category information is the specified intent category information, the step of determining the target page jump information corresponding to the voice interaction event from the knowledge graph data can be executed.
  • the specified intent category information may be the intent category information for the in-vehicle system or the functional entity applied by the in-vehicle system.
  • the specified intent category information may be one-sentence direct access, that is, for a functional entity, jumping to the page or page element where the functional entity is located.
  • Step 402, according to the second entity information, determine the target page jump information corresponding to the voice interaction event from knowledge graph data.
  • the knowledge graph data includes the in-vehicle system or multiple functional entities applied in the in-vehicle system and their corresponding page jump information;
  • the page jump information in the knowledge graph data is searched according to the second entity information, and then the target page jump information corresponding to the voice interaction event can be determined from the knowledge graph data.
  • Step 403 constructing one or more dynamic slots according to the target page jump information, and filling the one or more dynamic slots with slot values to obtain the target semantic representation;
  • Step 404 controlling the in-vehicle system or the application in the in-vehicle system according to the semantic representation of the target.
  • when a voice interaction event is detected, the second entity information of the target functional entity corresponding to the voice interaction event is determined, and the target page jump information corresponding to the voice interaction event is determined from knowledge graph data according to the second entity information, wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information.
  • one or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain a target semantic representation, and the in-vehicle system or the application in the in-vehicle system is controlled according to the target semantic representation.
  • as a result, the slot structure is highly flexible, the slot structures of all page hierarchy relationships do not need to be defined, and the method can be applied to scenarios where the page jump information changes dynamically.
  • referring to FIG. 5, a schematic structural diagram of a voice interaction device provided by an embodiment of the present invention is shown, which may specifically include the following modules:
  • the target page jump information determination module 501 is configured to determine, when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event from knowledge graph data; wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information;
  • the target semantic representation generation module 502 is configured to construct one or more dynamic slots according to the target page jump information, and fill the one or more dynamic slots with slot values to obtain the target semantic representation;
  • the control module 503 is configured to control the in-vehicle system or the application in the in-vehicle system according to the semantic representation of the target.
  • the page jump information includes a page URL, or the page jump information includes a page URL and anchor point information.
  • the target page jump information determination module 501 may include:
  • the second entity information determination submodule is used to determine the second entity information of the target functional entity corresponding to the voice interaction event
  • the first target page jump information determination submodule is configured to determine target page jump information corresponding to the voice interaction event from the knowledge graph data according to the second entity information.
  • the target page jump information determination module 501 may include:
  • a first entity information determination sub-module configured to determine the first entity information of the in-vehicle system or the target application in the in-vehicle system according to the voice control event;
  • the target page jump information determination module 501 may include:
  • the second target page jump information determination submodule is configured to determine target page jump information corresponding to the voice interaction event from the knowledge graph data according to the first entity information and the second entity information.
  • the target page jump information determining module 501 may further include:
  • a target intent category information determination submodule configured to determine the target intent category information of the voice interaction event
  • the third target page jump information determination submodule is configured to execute the determining of the target page jump information corresponding to the voice interaction event from the knowledge graph data when the target intent category information is the specified intent category information.
  • the specified intent category information is intent category information for the in-vehicle system or a functional entity applied in the in-vehicle system.
  • control module 503 may include:
  • an event control instruction generation submodule configured to generate an event control instruction for the voice interaction event according to the target semantic representation
  • the control sub-module is used for controlling the in-vehicle system or the application in the in-vehicle system according to the event control instruction.
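  • when a voice interaction event is detected, the target page jump information corresponding to the voice interaction event is determined from knowledge graph data, wherein the knowledge graph data includes multiple functional entities of the in-vehicle system or of applications in the in-vehicle system, and their corresponding page jump information.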
  • An embodiment of the present invention also provides a vehicle, which may include a processor, a memory, and a computer program stored in the memory and capable of running on the processor, and the above voice interaction method is implemented when the computer program is executed by the processor.
  • An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the above voice interaction method is implemented.
  • embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media having computer-usable program code embodied therein, including but not limited to disk storage, CD-ROM, optical storage, and the like.
  • Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present invention. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal equipment create means for implementing the functions specified in the flow or flows of the flowcharts and/or the block or blocks of the block diagrams.
  • These computer program instructions may also be stored in a computer readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer readable memory result in an article of manufacture comprising instruction means, the instruction means implementing the functions specified in the flow or flows of the flowcharts and/or the block or blocks of the block diagrams.
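
The definitions above describe an application uploading page-hierarchy data (element name, category, parent node, anchor name, page address) from which the knowledge graph is rebuilt. The following is a minimal sketch of that idea, not the applicant's implementation. The `PageNode` field names, the `app://` address scheme, and the sample rows are all assumptions made for illustration, and the pre-update hierarchy is simplified so that the pop-up page itself stands for the "volume adjustment" entity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PageNode:
    """One row of the uploaded page-structure table (field names are assumed)."""
    name: str                     # element name, e.g. "volume adjustment"
    category: str                 # e.g. "app", "tab", "popup", "control"
    parent: Optional[str]         # parent node name; None for the application root
    page_url: str                 # page address (hypothetical app:// scheme)
    anchor: Optional[str] = None  # anchor name when the element is anchored on a page


def build_graph(rows: List[PageNode]) -> Dict[str, PageNode]:
    """Index nodes by name; the parent links act as the graph edges."""
    return {row.name: row for row in rows}


def jump_path(graph: Dict[str, PageNode], target: str) -> List[PageNode]:
    """Walk parent links from the target entity up to the root, then reverse."""
    path: List[PageNode] = []
    seen = set()
    current: Optional[str] = target
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in page hierarchy at {current!r}")
        seen.add(current)
        node = graph[current]
        path.append(node)
        current = node.parent
    path.reverse()
    return path


# Hierarchy before the application update: three levels, no anchoring.
OLD_ROWS = [
    PageNode("navigation", "app", None, "app://navigation"),
    PageNode("settings", "tab", "navigation", "app://navigation/settings"),
    PageNode("volume adjustment", "popup", "settings",
             "app://navigation/settings/volume"),
]

# Hierarchy after the update: the control is anchored on the navigation page.
NEW_ROWS = [
    PageNode("navigation", "app", None, "app://navigation"),
    PageNode("volume adjustment", "control", "navigation",
             "app://navigation", anchor="volume_adjustment"),
]
```

Under these assumptions, the same query target resolves to a three-step path against `OLD_ROWS` and a two-step anchored path against `NEW_ROWS`, with no change to the lookup code. That is the property the text attributes to combining the knowledge graph with dynamic slots.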

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

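A voice interaction method and apparatus. The method includes: when a voice interaction event is detected, determining, from knowledge graph data, target page jump information corresponding to the voice interaction event, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information (301); constructing one or more dynamic slots according to the target page jump information and filling the one or more dynamic slots with slot values to obtain a target semantic representation (302); and controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation (303). The method fills slots from the target page jump information determined by the knowledge graph, which makes the slot structure highly flexible, removes the need to define a slot structure for every page hierarchy, and suits scenarios in which page jump information changes dynamically.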

Description

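A Voice Interaction Method and Apparatus
The present invention claims priority to Chinese patent application No. 202011522703.2, entitled "A Voice Interaction Method and Apparatus", filed with the China Patent Office on December 21, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of voice interaction, and in particular to a voice interaction method and apparatus.
Background
When a user controls a vehicle through its large screen and wants to reach a particular page or element on that screen, touch interaction is normally used: the user has to start from the entry on the main page of the large screen and tap through several links before arriving at the desired page or element.
As smart vehicles develop, vehicles interact with users more and more. To improve the user experience, one-utterance direct access can be provided: the user states the intent clearly and immediately through the system's VUI (Voice User Interface) and is taken straight to the desired page or large-screen element, without tapping through multiple links.
To implement one-utterance direct access, predefined structured slots can describe the hierarchical relationships between pages. However, page hierarchies vary across applications, page types and elements, so a large number of structured slots must be predefined. In addition, the VUI is updated together with the system, while applications are updated more quickly. After an application update, a redesign of the client pages may change the page hierarchy. The VUI's structured slots then no longer fit the updated hierarchy, one-utterance direct access fails, and the user experience suffers.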
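Summary
In view of the above problems, a voice interaction method and apparatus are proposed that overcome the above problems or at least partially solve them, including:
A voice interaction method, the method including:
when a voice interaction event is detected, determining, from knowledge graph data, target page jump information corresponding to the voice interaction event, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
constructing one or more dynamic slots according to the target page jump information, and filling the one or more dynamic slots with slot values to obtain a target semantic representation;
controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
Optionally, determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event includes:
determining second entity information of a target functional entity corresponding to the voice interaction event;
determining, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
Optionally, before determining the second entity information of the target functional entity corresponding to the voice interaction event, the method further includes:
determining, according to the voice interaction event, first entity information of the in-vehicle system or of a target application in the in-vehicle system;
and determining, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data includes:
determining, according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
Optionally, before determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event, the method further includes:
determining target intent category information of the voice interaction event;
when the target intent category information is specified intent category information, performing the step of determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event.
Optionally, the specified intent category information is intent category information directed at a functional entity of the in-vehicle system or of an application in the in-vehicle system.
Optionally, controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation includes:
generating, according to the target semantic representation, an event control instruction for the voice interaction event;
controlling the in-vehicle system or the application in the in-vehicle system according to the event control instruction.
Optionally, the page jump information includes a page URL, or the page jump information includes a page URL and anchor information.
A voice interaction apparatus, the apparatus including:
a target page jump information determination module, configured to determine, when a voice interaction event is detected, target page jump information corresponding to the voice interaction event from knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
a target semantic representation generation module, configured to construct one or more dynamic slots according to the target page jump information and fill the one or more dynamic slots with slot values to obtain a target semantic representation;
a control module, configured to control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
A vehicle, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the voice interaction method described above.
A computer-readable storage medium storing a computer program that, when executed by a processor, implements the voice interaction method described above.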
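Embodiments of the present invention have the following advantages:
In embodiments of the present invention, when a voice interaction event is detected, target page jump information corresponding to the event is determined from knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information. One or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain a target semantic representation, and the in-vehicle system or the application is controlled according to that representation. Dynamic slots are thus filled from the target page jump information determined by the knowledge graph, which makes the slot structure highly flexible and removes the need to define a slot structure for every page hierarchy, so the approach applies to scenarios in which page jump information changes dynamically. Because the knowledge graph makes the approach data-driven and flexible, changes in page hierarchy and jump behavior caused by client page redesigns or similar reasons can be supported without modifying code, so one-utterance direct access keeps working and the user experience improves.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.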
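Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1a is a schematic diagram of page jumps in a navigation application according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of page jumps in another navigation application according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a knowledge graph according to an embodiment of the present invention;
FIG. 3 is a flowchart of the steps of a voice interaction method according to an embodiment of the present invention;
FIG. 4 is a flowchart of the steps of another voice interaction method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present invention.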
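Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort fall within the scope of protection of the present invention.
In practice, a vehicle system may include a voice interaction module (such as a voice assistant). The voice interaction module recognizes the voice interaction information that the user inputs by voice, determines the intent information contained in the user's voice data, and performs the corresponding interaction control according to that intent.
The intent information is the purpose for which the user input the voice data. For example, if the voice data is "jump to the volume control of navigation", the user wants the page to jump to the navigation volume control page, so the voice data corresponds to a one-utterance direct access intent.
In the voice interaction module, a one-utterance direct access page jump mode can be provided to improve the user's voice interaction experience. With one-utterance direct access, semantic representations can be predefined according to the page hierarchies of different applications, page types and elements. When the user inputs voice interaction information, the system follows the predefined semantic representation to jump to the specified large-screen page or page element, so the user does not need to tap through multiple links by touch.
For example, in version V01 of the vehicle's navigation application, the page hierarchy for adjusting the volume during navigation is as shown in FIG. 1a. With touch interaction, the user taps the navigation application on the home page to open the navigation application interface, taps the settings tab in that interface, and then taps volume settings in the settings tab. A volume settings pop-up then appears, which contains a volume adjustment control.
With one-utterance direct access, a piece of voice interaction information (that is, a query) "jump to the volume adjustment of navigation" can be predefined in the voice interaction module, together with the corresponding semantic representation:
Domain & intent: one-utterance direct access
Slots:
Jump type: control
Application: navigation
Pages to open: three levels
Level-1 page: navigation; opening method: home page
Level-2 page: settings; opening method: tab page
Level-3 page: volume settings; opening method: pop-up
Thus, when the user says "jump to the volume adjustment of navigation", the system determines the predefined semantic representation from the voice interaction information and opens, in order: the navigation application interface, the settings interface, and the volume settings pop-up page. No anchor positioning is needed during this jump.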
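As a rough illustration of the predefined approach described above, the sketch below hard-codes the V01 semantic representation for a single query and expands it into the ordered list of pages to open. The dictionary layout and the names `PREDEFINED` and `pages_to_open` are assumptions made for illustration; the source does not specify a data format.

```python
from typing import Dict, List

# Hypothetical, hard-coded semantic representation for navigation app V01.
PREDEFINED: Dict[str, dict] = {
    "jump to the volume adjustment of navigation": {
        "domain_intent": "one-utterance direct access",
        "jump_type": "control",
        "application": "navigation",
        "pages": [
            {"page": "navigation", "opening_method": "home page"},
            {"page": "settings", "opening_method": "tab page"},
            {"page": "volume settings", "opening_method": "pop-up"},
        ],
    }
}


def pages_to_open(query: str) -> List[str]:
    """Return the page names to open in order, or [] if the query is unknown."""
    representation = PREDEFINED.get(query)
    if representation is None:
        return []
    return [level["page"] for level in representation["pages"]]
```

Because the page chain lives in code, any change to the application's page hierarchy requires editing this table by hand.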
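In practice, after the in-vehicle system or one of its applications is updated, the page hierarchy may change, and the voice interaction module must redefine the semantic representation before it can perform the corresponding voice interaction.
For example, in version V02 of the vehicle's navigation application, the page hierarchy for adjusting the volume during navigation is as shown in FIG. 1b. With touch interaction, the user taps the navigation application, and the navigation application interface itself may contain a volume adjustment control.
With one-utterance direct access, a piece of voice interaction information (that is, a query) "jump to the volume adjustment of navigation" can be predefined, together with the corresponding semantic representation:
Domain & intent: one-utterance direct access
Slots:
Jump type: control
Application: navigation
Page to open: navigation home page
Anchor positioning: volume adjustment
Thus, when the user says "jump to the volume adjustment of navigation", the system determines the semantic representation from the voice interaction information, opens the navigation application interface, and then anchors to the volume adjustment control.
The voice interaction module can be updated together with the in-vehicle system, but application updates are not synchronized with the voice interaction module; applications are updated more quickly.
After a page redesign of the in-vehicle system or one of its applications, the semantic representations in the voice interaction module, which were predefined for the original version, have fixed slot information. Their slots and slot values cannot follow the updated page hierarchy, so they are incompatible with the new version of the application.
Redefining the semantic representations for every updated page hierarchy would require modifying a large amount of code, which is impractical.
In the present invention, a knowledge graph can be combined with dynamic slots: slot values are filled according to the application's current page hierarchy to generate the semantic representation corresponding to the voice interaction information. This makes the slots highly flexible and solves the problem, in one-utterance direct access, of page hierarchy changes caused by application updates, without modifying large amounts of code.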
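In the offline process, after the system or an application is updated, the system or application can upload packaged data describing the updated page hierarchy to the NLU (Natural Language Understanding) platform, such as element name, category, parent node, anchor name and page address. A page structure relationship diagram can be generated from this packaged data, and the knowledge graph can then be updated.
For example, in version V01 of the vehicle's navigation application, the navigation application sends the page jump information of all functional entities to the NLU platform, from which the corresponding page structure relationship tables are generated. Table 1 is the page structure relationship table for the volume adjustment functional entity of the navigation application:
Element name        Category   Parent node              Anchor name    pageurl
Sound               Page       Navigation application   volume         xiaopeng://setting/volume
Volume switch       Element    Sound page               volumeSwitch
Volume adjustment   Element    Sound page               volumeSet
From Table 1, the corresponding knowledge graph can be obtained. The knowledge graph includes functional entities and the page jump information corresponding to them. FIG. 2 shows a knowledge graph generated from Table 1.
This knowledge graph includes functional entities such as the volume switch and volume adjustment, and the page jump information may be: navigation application - volume switch - volume adjustment.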
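The offline step of turning the uploaded page structure rows into a graph could be sketched as follows. This is a minimal sketch under stated assumptions: the `Node` class, the `build_graph` and `children` helpers, and the extra root row for the navigation application are hypothetical, and the "Sound page" parent in Table 1 is assumed to refer to the "Sound" row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    category: str            # "application", "page" or "element"
    parent: Optional[str]    # name of the parent node, None for the root
    anchor: str
    pageurl: str = ""


# Rows mirror Table 1; the root row is an added assumption.
ROWS = [
    ("navigation application", "application", None, "", ""),
    ("sound", "page", "navigation application", "volume", "xiaopeng://setting/volume"),
    ("volume switch", "element", "sound", "volumeSwitch", ""),
    ("volume adjustment", "element", "sound", "volumeSet", ""),
]


def build_graph(rows) -> Dict[str, Node]:
    """Index every row by element name so parent links can be followed."""
    return {name: Node(name, category, parent, anchor, url)
            for name, category, parent, anchor, url in rows}


def children(graph: Dict[str, Node], name: str) -> List[str]:
    """List the names of the nodes whose parent is `name`."""
    return [node.name for node in graph.values() if node.parent == name]
```

Rebuilding the graph from a fresh upload is then a matter of calling `build_graph` again with the new rows; no query-specific code changes.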
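In the online process, after the user inputs voice interaction information by voice, dynamic slots can be constructed in the corresponding preset internal semantic format according to the updated knowledge graph and filled with slot values. This yields a semantic representation that matches the updated system or application, and the system controls the corresponding page jump according to that semantic representation.
Referring to FIG. 3, a flowchart of the steps of a voice interaction method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 301: when a voice interaction event is detected, determine, from knowledge graph data, target page jump information corresponding to the voice interaction event, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
In an embodiment of the present invention, the page jump information includes a page URL, or the page jump information includes a page URL and anchor information.
In practice, knowledge graph data may be stored in the vehicle data. Knowledge graph data describes entities and the relationships between them, for example the entities corresponding to the parent node and/or child nodes of a given entity.
The relationships between entities form the page jump information of the functional entities of the in-vehicle system or its applications. A functional entity is an entity of the in-vehicle system or its applications that implements a specific vehicle function. For example, the volume adjustment control in the navigation application is a functional entity that adjusts the volume of the navigation voice.
The page jump information may also include a page URL, or a page URL and anchor information.
When the page hierarchy of the in-vehicle system or one of its applications changes, the knowledge graph can change with it, and so does the page jump information in the knowledge graph data. When the user speaks, the vehicle system detects the corresponding voice interaction event and can determine, among all the page jump information in the updated knowledge graph data, the target page jump information corresponding to the event.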
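One way to picture step 301 is a walk from the target functional entity up through its parent nodes, collecting the page URL and anchor on the way. The sketch below is illustrative only: the `GRAPH` layout and the `page_jump_info` function are hypothetical, and the source does not state how the lookup is implemented.

```python
from typing import Dict, List, Optional

# Hypothetical knowledge graph: name -> (parent, anchor, pageurl).
GRAPH: Dict[str, tuple] = {
    "navigation application": (None, "", ""),
    "sound": ("navigation application", "volume", "xiaopeng://setting/volume"),
    "volume adjustment": ("sound", "volumeSet", ""),
}


def page_jump_info(graph: Dict[str, tuple], target: str) -> Optional[dict]:
    """Walk from the target entity up to the root, collecting URL and anchor."""
    if target not in graph:
        return None
    path: List[str] = []
    url, anchor = "", ""
    node: Optional[str] = target
    while node is not None:
        parent, _node_anchor, node_url = graph[node]
        path.append(node)
        if not url and node_url:
            # Nearest node (the target or an ancestor) that owns a page URL.
            url = node_url
            # An anchor is only needed when the target sits inside that page.
            anchor = graph[target][1] if node != target else ""
        node = parent
    return {"path": list(reversed(path)), "pageurl": url, "anchor": anchor}
```

When the graph is rebuilt after an application update, the same call returns the new path, URL and anchor without any change to the lookup code.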
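Step 302: construct one or more dynamic slots according to the target page jump information, and fill the one or more dynamic slots with slot values to obtain a target semantic representation;
After the target page jump information is determined, one or more dynamic slots can be constructed according to it, and each dynamic slot can be filled with its corresponding slot value to obtain the target semantic representation.
When a page redesign of the in-vehicle system or one of its applications changes the page hierarchy and the user inputs the same voice interaction information, dynamic slots can still be constructed and filled according to the current page jump information of the in-vehicle system or application. The slot structure is therefore highly flexible, and one-utterance direct access keeps working after a page redesign.
In an example, different internal semantic representation formats can be predefined in a skill dictionary during the offline process. An internal semantic representation may look as follows:
Domain & intent: one-utterance direct access
Slots:
Jump type: control | page
Application: string
Page jump information (dynamic slot): json
The target semantic representation can then be generated based on the predefined internal semantic representation format.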
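The internal format above has two fixed slots and one dynamic slot that carries JSON. A minimal sketch of filling it is shown below; the function name `build_semantic_representation` and the key names are assumptions, not identifiers from the source.

```python
import json
from typing import Dict


def build_semantic_representation(application: str, jump_type: str,
                                  page_jump_info: Dict[str, str]) -> dict:
    """Fill the fixed slots, then add one dynamic slot holding JSON text."""
    if jump_type not in ("control", "page"):
        raise ValueError("jump_type must be 'control' or 'page'")
    return {
        "domain_intent": "one-utterance direct access",
        "slots": {
            "jump_type": jump_type,
            "application": application,
            # Dynamic slot: its content comes from the knowledge graph, not from code.
            "page_jump_info": json.dumps(page_jump_info, sort_keys=True),
        },
    }
```

Keeping the dynamic slot as opaque JSON means the format itself does not need to change when the page hierarchy does.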
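For example, when the user inputs by voice the query "jump to the volume adjustment of navigation":
(1) When the navigation application is version V01, the page jump information determined from the knowledge graph is: navigation application interface - settings interface - volume settings pop-up page. The corresponding slots are constructed and filled according to this page jump information, and the semantic representation generated for the voice interaction information is:
Domain & intent: one-utterance direct access
Slots:
Jump type: control
Application: navigation
Page jump information (dynamic slot):
Figure PCTCN2021140193-appb-000001
(2) For version V02 of the navigation application, after the knowledge graph is updated according to V02, the page jump information from the updated knowledge graph is: open the navigation application interface, then anchor to the volume adjustment control. The corresponding slots are constructed and filled according to this page jump information, and the semantic representation now generated for the same voice interaction information is:
Domain & intent: one-utterance direct access
Slots:
Jump type: control
Application: navigation
Page jump information (dynamic slot):
Figure PCTCN2021140193-appb-000002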
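The point of the two cases is that the fixed slots stay the same while only the dynamic slot differs between versions. The sketch below illustrates that idea with made-up values; `V01`, `V02`, `fill` and `slot_names` are hypothetical, and the dictionaries are not a reconstruction of the figures referenced above.

```python
import json

# Hypothetical page jump information as it might be derived from each graph version.
V01 = {"pages": [
    {"page": "navigation", "opening_method": "home page"},
    {"page": "settings", "opening_method": "tab page"},
    {"page": "volume settings", "opening_method": "pop-up"},
]}
V02 = {"pages": [{"page": "navigation home page"}], "anchor": "volume adjustment"}


def fill(page_jump_info: dict) -> dict:
    """Same fixed slots for every version; only the dynamic slot differs."""
    return {
        "jump_type": "control",
        "application": "navigation",
        "page_jump_info": json.dumps(page_jump_info, ensure_ascii=False),
    }


def slot_names(representation: dict) -> list:
    """Return the slot names in sorted order for comparison."""
    return sorted(representation)
```

Both versions produce a representation with the same slot names, so downstream code that reads the fixed slots is unaffected by the redesign.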
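Step 303: control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
After the target semantic representation is determined, the in-vehicle system or the application can be controlled according to it, so that the page jump requested in the user's voice interaction information is performed.
In an embodiment of the present invention, controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation includes:
generating, according to the target semantic representation, an event control instruction for the voice interaction event; and controlling the in-vehicle system or the application in the in-vehicle system according to the event control instruction.
After the target semantic representation is generated, an event control instruction for the voice interaction event can be generated by combining the knowledge graph data with the internal semantic representation.
The event control instruction is a control instruction that the vehicle system can recognize and execute. The in-vehicle system or the application can then be controlled according to it; that is, the page jump corresponding to the voice interaction event is performed to reach the specified page or page element.
In an example, the event control instruction may include a command for the application client, and the command may include a page name, a page address and an anchor name.
For example, when the voice interaction event input by the user is "jump to the volume adjustment of navigation", the code of the generated control instruction is as follows:
Figure PCTCN2021140193-appb-000003
Generating a control instruction converts target page jump information that the system cannot execute into a control instruction that the system can execute directly, thereby achieving voice control.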
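A command carrying a page name, page address and anchor name, as described in the example above, might be modeled like this. The `Command` class and `to_command` function are assumptions for illustration; the actual command format appears only in the referenced figure and is not reproduced here.

```python
from dataclasses import dataclass, asdict


@dataclass
class Command:
    page_name: str
    page_url: str
    anchor_name: str = ""   # empty when the target is a whole page


def to_command(page_name: str, page_jump_info: dict) -> Command:
    """Turn page jump information into a command an app client could execute."""
    url = page_jump_info.get("pageurl", "")
    if not url:
        raise ValueError("page jump information has no page URL")
    return Command(page_name, url, page_jump_info.get("anchor", ""))
```

For instance, `asdict(to_command("sound", {"pageurl": "xiaopeng://setting/volume", "anchor": "volumeSet"}))` yields a plain dictionary that could be serialized and sent to the client.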
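In embodiments of the present invention, when a voice interaction event is detected, target page jump information corresponding to the event is determined from knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information. One or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain a target semantic representation, and the in-vehicle system or the application is controlled according to that representation. Slots are thus filled from the target page jump information determined by the knowledge graph, which makes the slot structure highly flexible and removes the need to define a slot structure for every page hierarchy, so the approach applies to scenarios in which page jump information changes dynamically. Because the knowledge graph makes the approach data-driven and flexible, changes in page hierarchy and jump behavior caused by client page redesigns or similar reasons can be supported without modifying code, so one-utterance direct access keeps working and the user experience improves.
Referring to FIG. 4, a flowchart of the steps of another voice interaction method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 401: when a voice interaction event is detected, determine second entity information of the target functional entity corresponding to the voice interaction event;
In practice, when the user speaks, the in-vehicle system detects a voice interaction event. The event may correspond to a target functional entity, and the target functional entity may include second entity information.
In an example, the second entity information may be the entity information of the target page in the target page jump information of the voice interaction event. For example, if the detected voice interaction event is "jump to the volume adjustment of navigation", "volume adjustment" may be the second entity information.
In an embodiment of the present invention, before determining the second entity information of the target functional entity corresponding to the voice interaction event, the method further includes:
determining, according to the voice interaction event, first entity information of the in-vehicle system or of a target application in the in-vehicle system;
Before the second entity information is determined, first entity information of the in-vehicle system or of the target application in the in-vehicle system can also be determined from the voice interaction event. For example, if the detected voice interaction event is the user's voice input "jump to the volume adjustment of navigation", "navigation" is the first entity information.
Determining, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data includes:
determining, according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
In practice, the first entity information is the entity corresponding to the in-vehicle system or the in-vehicle application, that is, the first page in the page jump process, while the second entity information corresponds to the page that is finally to be reached. The target page jump information corresponding to the voice interaction event can therefore be determined from the knowledge graph data according to the first entity information and the second entity information.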
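Using both entities matters when the same functional entity name exists in more than one application. The sketch below shows one possible lookup keyed on the pair; the `INDEX` table, the `lookup` function and the `example://music/settings` address are hypothetical and not taken from the source.

```python
from typing import Dict, Optional, Tuple

# Hypothetical index: (application entity, functional entity) -> page jump information.
INDEX: Dict[Tuple[str, str], dict] = {
    ("navigation", "volume adjustment"): {
        "pageurl": "xiaopeng://setting/volume", "anchor": "volumeSet"},
    ("music", "volume adjustment"): {
        "pageurl": "example://music/settings", "anchor": "volume"},
}


def lookup(second: str, first: Optional[str] = None) -> Optional[dict]:
    """Resolve by functional entity; use the application entity to disambiguate."""
    matches = [info for (app, func), info in INDEX.items()
               if func == second and (first is None or app == first)]
    # An ambiguous or missing match yields no result rather than a guess.
    return matches[0] if len(matches) == 1 else None
```

With only the second entity, "volume adjustment" is ambiguous in this sample data; adding the first entity "navigation" selects a single entry.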
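In an example, speech recognition can be performed on the user's voice input, and the contextual relevance within the speech can be used to determine the page jump type, the application the page belongs to, the first entity information, the second entity information, and so on.
In an embodiment of the present invention, before determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event, the method further includes:
determining target intent category information of the voice interaction event; and, when the target intent category information is specified intent category information, performing the step of determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event.
In an embodiment of the present invention, the specified intent category information is intent category information directed at a functional entity of the in-vehicle system or of an application in the in-vehicle system.
In practice, voice interaction events can correspond to different intents. The target intent category information of the user's voice interaction event can be determined by methods such as NLU arbitration and dialogue state tracking. When the target intent category information is the specified intent category information, the step of determining the target page jump information corresponding to the voice interaction event from the knowledge graph data can be performed.
The specified intent category information may be intent category information directed at a functional entity of the in-vehicle system or of an application in the in-vehicle system. For example, the specified intent category may be one-utterance direct access, which targets a functional entity and jumps to the page or page element where that functional entity is located.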
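The intent check acts as a gate in front of the knowledge graph query. A minimal sketch, assuming the intent has already been classified upstream, is shown below; `SPECIFIED_INTENT` and `handle_event` are hypothetical names, and the classifier itself (NLU arbitration, dialogue state tracking) is out of scope here.

```python
from typing import Callable, Optional

SPECIFIED_INTENT = "one-utterance direct access"


def handle_event(intent: str, entity: str,
                 query_graph: Callable[[str], Optional[dict]]) -> Optional[dict]:
    """Query the knowledge graph only for the specified intent category."""
    if intent != SPECIFIED_INTENT:
        return None          # other intents are handled elsewhere
    return query_graph(entity)
```

Passing the graph query in as a function keeps the gate independent of how the graph is stored.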
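Step 402: determine, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
After the second entity information is determined, it is used to search the page jump information in the knowledge graph data, and the target page jump information corresponding to the voice interaction event is thereby determined from the knowledge graph data.
Step 403: construct one or more dynamic slots according to the target page jump information, and fill the one or more dynamic slots with slot values to obtain a target semantic representation;
Step 404: control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
In embodiments of the present invention, when a voice interaction event is detected, second entity information of the target functional entity corresponding to the event is determined, and target page jump information corresponding to the event is determined from knowledge graph data according to the second entity information, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information. One or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain a target semantic representation, and the in-vehicle system or the application is controlled according to that representation. The target jump information is thus looked up in the knowledge graph by entity information, the slot structure is highly flexible, no slot structure has to be defined for every page hierarchy, and the approach applies to scenarios in which page jump information changes dynamically.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of action combinations. Those skilled in the art should understand that embodiments of the present invention are not limited by the described order of actions, because according to embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.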
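Referring to FIG. 5, a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
a target page jump information determination module 501, configured to determine, when a voice interaction event is detected, target page jump information corresponding to the voice interaction event from knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
a target semantic representation generation module 502, configured to construct one or more dynamic slots according to the target page jump information and fill the one or more dynamic slots with slot values to obtain a target semantic representation;
a control module 503, configured to control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
In an embodiment of the present invention, the page jump information includes a page URL, or the page jump information includes a page URL and anchor information.
In an embodiment of the present invention, the target page jump information determination module 501 may include:
a second entity information determination submodule, configured to determine second entity information of the target functional entity corresponding to the voice interaction event;
a first target page jump information determination submodule, configured to determine, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
In an embodiment of the present invention, the target page jump information determination module 501 may include:
a first entity information determination submodule, configured to determine, according to the voice interaction event, first entity information of the in-vehicle system or of a target application in the in-vehicle system;
In an embodiment of the present invention, the target page jump information determination module 501 may include:
a second target page jump information determination submodule, configured to determine, according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
In an embodiment of the present invention, the target page jump information determination module 501 may further include:
a target intent category information determination submodule, configured to determine target intent category information of the voice interaction event;
a third target page jump information determination submodule, configured to perform, when the target intent category information is specified intent category information, the step of determining the target page jump information corresponding to the voice interaction event from the knowledge graph data.
In an embodiment of the present invention, the specified intent category information is intent category information directed at a functional entity of the in-vehicle system or of an application in the in-vehicle system.
In an embodiment of the present invention, the control module 503 may include:
an event control instruction generation submodule, configured to generate, according to the target semantic representation, an event control instruction for the voice interaction event;
a control submodule, configured to control the in-vehicle system or the application in the in-vehicle system according to the event control instruction.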
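The three modules above form a simple pipeline. The sketch below wires hypothetical stand-ins for modules 501, 502 and 503 together; the class names, method names and the `run` helper are assumptions for illustration and do not come from the source.

```python
from typing import Callable, Dict, Optional


class PageJumpInfoModule:            # stands in for module 501
    def __init__(self, graph: Dict[str, dict]):
        self.graph = graph

    def determine(self, entity: str) -> Optional[dict]:
        return self.graph.get(entity)


class SemanticRepresentationModule:  # stands in for module 502
    def generate(self, info: dict) -> dict:
        # Each key of the page jump information becomes one dynamic slot.
        return {"intent": "one-utterance direct access", "slots": dict(info)}


class ControlModule:                 # stands in for module 503
    def __init__(self, execute: Callable[[dict], None]):
        self.execute = execute

    def control(self, representation: dict) -> None:
        self.execute(representation["slots"])


def run(entity: str, m501: PageJumpInfoModule,
        m502: SemanticRepresentationModule, m503: ControlModule) -> bool:
    """Return True if a page jump was issued, False if the entity is unknown."""
    info = m501.determine(entity)
    if info is None:
        return False
    m503.control(m502.generate(info))
    return True
```

Only module 501 touches the knowledge graph, so swapping in an updated graph leaves modules 502 and 503 unchanged.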
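In embodiments of the present invention, when a voice interaction event is detected, target page jump information corresponding to the event is determined from knowledge graph data, where the knowledge graph data includes multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information. One or more dynamic slots are constructed according to the target page jump information and filled with slot values to obtain a target semantic representation, and the in-vehicle system or the application is controlled according to that representation. Slots are thus filled from the target page jump information determined by the knowledge graph, which makes the slot structure highly flexible and removes the need to define a slot structure for every page hierarchy, so the approach applies to scenarios in which page jump information changes dynamically. Because the knowledge graph makes the approach data-driven and flexible, changes in page hierarchy and jump behavior caused by client page redesigns or similar reasons can be supported without modifying code, so one-utterance direct access keeps working and the user experience improves.
An embodiment of the present invention further provides a vehicle, which may include a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the voice interaction method described above.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the voice interaction method described above.
Because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding parts of the method embodiments.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be cross-referenced.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
Embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal equipment create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal equipment, so that a series of operational steps is performed on the computer or other programmable terminal equipment to produce computer-implemented processing. The instructions executed on the computer or other programmable terminal equipment thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device that includes the element.
The voice interaction method and apparatus provided above have been described in detail. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.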

Claims (10)

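  1. A voice interaction method, characterized in that the method comprises:
    when a voice interaction event is detected, determining, from knowledge graph data, target page jump information corresponding to the voice interaction event, wherein the knowledge graph data comprises multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
    constructing one or more dynamic slots according to the target page jump information, and filling the one or more dynamic slots with slot values to obtain a target semantic representation;
    controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
  2. The method according to claim 1, characterized in that determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event comprises:
    determining second entity information of a target functional entity corresponding to the voice interaction event;
    determining, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
  3. The method according to claim 2, characterized in that, before determining the second entity information of the target functional entity corresponding to the voice interaction event, the method further comprises:
    determining, according to the voice interaction event, first entity information of the in-vehicle system or of a target application in the in-vehicle system;
    and determining, according to the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data comprises:
    determining, according to the first entity information and the second entity information, the target page jump information corresponding to the voice interaction event from the knowledge graph data.
  4. The method according to claim 2 or 3, characterized in that, before determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event, the method further comprises:
    determining target intent category information of the voice interaction event;
    when the target intent category information is specified intent category information, performing the step of determining, from the knowledge graph data, the target page jump information corresponding to the voice interaction event.
  5. The method according to claim 4, characterized in that the specified intent category information is intent category information directed at a functional entity of the in-vehicle system or of an application in the in-vehicle system.
  6. The method according to claim 1, characterized in that controlling the in-vehicle system or the application in the in-vehicle system according to the target semantic representation comprises:
    generating, according to the target semantic representation, an event control instruction for the voice interaction event;
    controlling the in-vehicle system or the application in the in-vehicle system according to the event control instruction.
  7. The method according to claim 1, characterized in that the page jump information comprises a page URL, or the page jump information comprises a page URL and anchor information.
  8. A voice interaction apparatus, characterized in that the apparatus comprises:
    a target page jump information determination module, configured to determine, when a voice interaction event is detected, target page jump information corresponding to the voice interaction event from knowledge graph data, wherein the knowledge graph data comprises multiple functional entities of an in-vehicle system or of an application in the in-vehicle system and their corresponding page jump information;
    a target semantic representation generation module, configured to construct one or more dynamic slots according to the target page jump information and fill the one or more dynamic slots with slot values to obtain a target semantic representation;
    a control module, configured to control the in-vehicle system or the application in the in-vehicle system according to the target semantic representation.
  9. A vehicle, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the voice interaction method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the voice interaction method according to any one of claims 1 to 7.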
PCT/CN2021/140193 2020-12-21 2021-12-21 一种语音交互的方法和装置 WO2022135419A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011522703.2A CN112882679B (zh) 2020-12-21 2020-12-21 一种语音交互的方法和装置
CN202011522703.2 2020-12-21

Publications (1)

Publication Number Publication Date
WO2022135419A1 true WO2022135419A1 (zh) 2022-06-30

Family

ID=76043359

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/140193 WO2022135419A1 (zh) 2020-12-21 2021-12-21 一种语音交互的方法和装置

Country Status (2)

Country Link
CN (1) CN112882679B (zh)
WO (1) WO2022135419A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116092494A (zh) * 2023-04-07 2023-05-09 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN116129551A (zh) * 2022-12-09 2023-05-16 浙江凌骁能源科技有限公司 汽车故障根因分析方法、装置、计算机设备和存储介质

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634888A (zh) * 2020-12-11 2021-04-09 广州橙行智动汽车科技有限公司 语音交互方法、服务器、语音交互系统和可读存储介质
CN112882679B (zh) * 2020-12-21 2022-07-01 广州橙行智动汽车科技有限公司 一种语音交互的方法和装置
CN113978328B (zh) * 2021-10-29 2022-08-16 广州小鹏汽车科技有限公司 控制方法及装置、车辆及存储介质
CN114489557B (zh) * 2021-12-15 2024-03-22 青岛海尔科技有限公司 语音交互方法、装置、设备及存储介质
CN113990301B (zh) * 2021-12-28 2022-05-13 广州小鹏汽车科技有限公司 语音交互方法及其装置、服务器和可读存储介质
CN114461170A (zh) * 2022-01-27 2022-05-10 山东省城市商业银行合作联盟有限公司 手机银行应用程序的页面朗读方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473521A (zh) * 2019-02-26 2019-11-19 北京蓦然认知科技有限公司 一种任务模型的训练方法、装置、设备
CN111736738A (zh) * 2020-06-30 2020-10-02 广州小鹏车联网科技有限公司 一种车载系统的控件对象查询方法和装置
CN111966939A (zh) * 2020-09-18 2020-11-20 北京百度网讯科技有限公司 页面跳转方法及装置
CN111986673A (zh) * 2020-07-24 2020-11-24 北京奇保信安科技有限公司 一种用于语音识别的槽值填充方法、装置和电子设备
CN112882679A (zh) * 2020-12-21 2021-06-01 广州橙行智动汽车科技有限公司 一种语音交互的方法和装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111078844B (zh) * 2018-10-18 2023-03-14 上海交通大学 软件众包的任务型对话系统及方法
CN110111787B (zh) * 2019-04-30 2021-07-09 华为技术有限公司 一种语义解析方法及服务器
CN110222162A (zh) * 2019-05-10 2019-09-10 天津中科智能识别产业技术研究院有限公司 一种基于自然语言处理和知识图谱的智能问答方法

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129551A (zh) * 2022-12-09 2023-05-16 浙江凌骁能源科技有限公司 汽车故障根因分析方法、装置、计算机设备和存储介质
CN116092494A (zh) * 2023-04-07 2023-05-09 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN116092494B (zh) * 2023-04-07 2023-08-25 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质

Also Published As

Publication number Publication date
CN112882679A (zh) 2021-06-01
CN112882679B (zh) 2022-07-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 21909415; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 21909415; Country of ref document: EP; Kind code of ref document: A1