CN117311862A - Voice interaction method and system for vehicle-mounted device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN117311862A
Authority
CN
China
Prior art keywords
execution
execution interface
currently displayed
interface
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210729288.0A
Other languages
Chinese (zh)
Inventor
高晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pateo Connect Nanjing Co Ltd
Original Assignee
Pateo Connect Nanjing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pateo Connect Nanjing Co Ltd filed Critical Pateo Connect Nanjing Co Ltd
Priority to CN202210729288.0A priority Critical patent/CN117311862A/en
Publication of CN117311862A publication Critical patent/CN117311862A/en
Pending legal-status Critical Current


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 — Arrangements for executing specific programs
    • G06F 9/451 — Execution arrangements for user interfaces
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 — Sound input; sound output
    • G06F 3/167 — Audio in a user interface, e.g. using voice commands for navigating, audio feedback

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the present application disclose a voice interaction method and system for a vehicle head unit, an electronic device, and a storage medium, relating to the field of computer technology. The method comprises the following steps: in response to receiving a first task instruction, generating an execution interface corresponding to the first task instruction and sending the execution interface to the head unit so that it is displayed on the screen of the head unit; acquiring and saving, in real time, the execution interface currently displayed on the screen; and, in response to the currently displayed execution interface being switched away and a call instruction related to that interface being received, sending the saved execution interface back to the head unit so that it is redisplayed on the screen. The embodiments of the present application allow the most recent task to be quickly brought back, ensure that a switched-away execution interface is not lost, improve the user experience, and make the fullest use of cloud resources, exploiting their performance advantages and raising their utilization.

Description

Voice interaction method and system for vehicle-mounted device, electronic equipment and storage medium
Technical Field
Embodiments of the present application relate to the field of computer technology, and in particular to a voice interaction method and system for a vehicle head unit, an electronic device, and a storage medium.
Background
With the development of artificial intelligence, natural language processing technology has advanced rapidly. Recognizing and understanding single sentences no longer satisfies users, who want to hold more intelligent conversations with computers; this demand gave rise to multi-turn dialog technology. Multi-turn dialog technology can carry on a continuous, back-and-forth conversation with a user based on key voice information combined with background, context, and so on, making the interaction closer to a natural conversation.
However, if the current task finishes or is interrupted during a multi-turn dialog, the execution interface currently displayed on the screen is switched away. Once the execution interface has been switched, the voice interaction corresponding to it is over and the user cannot resume it later; that is, the screen cannot subsequently redisplay that execution interface.
Summary
The voice interaction method and system for a vehicle head unit, the electronic device, and the storage medium provided by the embodiments of the present application can solve, at least in part, the above and other defects in the prior art.
A voice interaction method for a vehicle head unit, provided according to a first aspect of the present application, comprises the following steps:
in response to receiving a first task instruction, generating an execution interface corresponding to the first task instruction and sending the execution interface to the head unit so that it is displayed on the screen of the head unit;
acquiring and saving, in real time, the execution interface currently displayed on the screen; and
in response to the currently displayed execution interface being switched away and a call instruction related to that interface being received, sending the saved execution interface to the head unit so that it is redisplayed on the screen.
A voice interaction method for a vehicle head unit, provided according to a second aspect of the present application, comprises the following steps:
in response to receiving a first task instruction from a user, uploading the first task instruction to a cloud server;
in response to receiving the execution interface generated by the cloud server for the first task instruction, controlling the screen of the head unit to display the execution interface;
transmitting the execution interface currently displayed on the screen to the cloud server;
in response to the currently displayed execution interface being switched away and a call instruction being received from the user, uploading the call instruction to the cloud server, wherein the call instruction is related to the currently displayed execution interface; and
in response to receiving the execution interface saved by the cloud server, controlling the screen to redisplay it.
A voice interaction system for a vehicle head unit, provided according to a third aspect of the present application, comprises:
a screen configured to display an execution interface; and
a cloud server configured to perform the voice interaction method described in the first aspect of the present application.
An electronic device provided according to a fourth aspect of the present application includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the voice interaction method for a vehicle head unit according to the first aspect of the present application.
According to a fifth aspect of the present application, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the voice interaction method for a vehicle head unit according to the first aspect of the present application.
With the voice interaction method and system for a vehicle head unit, the electronic device, and the storage medium of the present application, a cloud server generates the execution interface and saves, in real time, the execution interface currently displayed on the screen. After the displayed execution interface has been switched away, the saved interface, i.e. the one just switched away, can be sent back to the head unit on the user's call instruction and redisplayed on the screen. The most recent task can thus be quickly brought back, a switched-away execution interface is not lost, the user experience improves, and cloud resources are used to the greatest extent, fully exploiting their performance advantages and raising their utilization.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings. The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is an exemplary system architecture diagram to which the present application may be applied;
FIG. 2 is a flow chart of a voice interaction method for a vehicle head unit according to an embodiment of the present application;
FIG. 3 is a schematic sub-flow diagram of step S100 according to an embodiment of the present application;
FIG. 4 is a schematic sub-flow diagram of step S120 according to an embodiment of the present application;
FIG. 5 is a schematic sub-flow diagram of step S300 according to an embodiment of the present application;
FIG. 6 is a flow chart of another voice interaction method for a vehicle head unit according to an embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for implementing the voice interaction method according to an embodiment of the present application.
Reference numerals:
100. a system architecture; 101. a server; 102. a head unit; 103. a network;
200. an electronic device; 201. a computing unit; 202. a read-only memory (ROM);
203. a random access memory (RAM); 204. a bus; 205. an I/O interface;
206. an input unit; 207. an output unit; 208. a storage unit;
209. a communication unit.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, embodiments and features of embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the voice interaction method for a vehicle head unit of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include a server 101, a head unit 102, and a network 103. The network 103 is the medium providing a communication link between the server 101 and the head unit 102 and may include various connection types, such as wired links, wireless communication links, or fiber optic cables. It should be understood that the numbers of head units 102, servers 101, and networks 103 in FIG. 1 are merely illustrative; there may be any number of each, as the implementation requires.
It should be noted that the server 101 may be a cloud server or a local server, and may be hardware or software. When the server 101 is hardware, it may be implemented as a distributed cluster formed by a plurality of servers or as a single server. When the server 101 is software, it may be implemented as a plurality of software modules (for example, to provide distributed services) or as a single software module. No particular limitation is imposed here.
During a multi-turn dialog with the head unit 102, the screen of the head unit 102 displays, in real time, the execution interface corresponding to the user's task instruction. For example, when the user's task instruction is "find nearby hotels", the execution interface currently displayed on the screen of the head unit 102 is a nearby-hotel information interface. However, if the user utters task-irrelevant sounds such as "oh" or "uh", or if a network abnormality occurs, the currently displayed execution interface, i.e. the nearby-hotel information interface, is switched to the initial standby interface. Once the execution interface has been switched, the voice interaction corresponding to it is over and cannot be resumed. Thus, if the user later wants to review the nearby-hotel information and utters the voice task "view the last execution interface", the screen of the head unit 102 cannot display the nearby-hotel information interface it just showed. For another example, when the user's task instruction is "book an XXX hotel room", the interface currently displayed on the screen of the head unit 102 is the XXX hotel room booking interface. After the user completes the booking in that interface, it is switched to the initial standby interface. Likewise, once the interface has been switched, the corresponding voice interaction is over: if the user then wants to change the room type, there is no way to return to the XXX hotel room booking interface to do so.
For another example, when the user's task instruction is "find a parking lot near XXX scenic spot", the execution interface currently displayed on the screen of the head unit 102 is an information interface for parking lots near XXX scenic spot. If the user issues a cancel instruction such as "cancel" or "never mind", that interface is switched to the initial standby interface. If the user later wants to review the parking information and utters the voice task "view the last interface", the screen of the head unit 102 cannot display the parking information interface again. It should be appreciated that the foregoing merely illustrates some situations that often occur during a user's voice dialog with the head unit 102; the present application is applicable to other situations without departing from the teachings of the present disclosure.
To solve the above problems, an embodiment of the present application provides a voice interaction method for a vehicle head unit, generally executed by a cloud server. FIG. 2 shows a flow diagram of one embodiment of the method, which comprises the following steps:
S100, in response to receiving a first task instruction, generating an execution interface corresponding to the first task instruction and sending it to the head unit 102 so that it is displayed on the screen of the head unit 102;
S200, acquiring and saving, in real time, the execution interface currently displayed on the screen; and
S300, in response to the currently displayed execution interface being switched away and a call instruction related to that interface being received, sending the saved execution interface to the head unit 102 so that it is redisplayed on the screen.
During a multi-turn dialog between the user and the head unit 102, the cloud server receives a first task instruction issued by the user through the voice device of the head unit 102 and generates the corresponding execution interface. The cloud server sends the generated interface to the head unit 102, which displays it on its screen. Meanwhile, the cloud server acquires and saves, in real time, the execution interface currently displayed on the screen. If the first task instruction finishes or is interrupted during the dialog, the displayed execution interface is switched away. If the user then wants to review the interface that was just switched away, the user can issue a call instruction related to it to the head unit 102. On receiving the call instruction through the voice device of the head unit 102, the cloud server sends the saved interface, i.e. the one just switched away, back to the head unit 102, and the user can review it on the screen.
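The generate-save-restore cycle described above can be sketched in a few lines. This is a minimal illustration only: the class and method names (HeadUnit, CloudServer, on_task_instruction, and so on) are invented for the example and do not come from the patent, and the "interfaces" are plain strings standing in for rendered screens.

```python
class HeadUnit:
    """Stands in for head unit 102: it only records what its screen shows."""
    def __init__(self):
        self.displayed = None

    def display(self, interface):
        self.displayed = interface


class CloudServer:
    """Stands in for the cloud server executing steps S100-S300."""
    def __init__(self, head_unit):
        self.head_unit = head_unit
        self.saved_interface = None  # S200: saved copy of the displayed interface

    def on_task_instruction(self, instruction):
        # S100: generate the execution interface and send it for display
        interface = f"execution interface for: {instruction}"
        self.head_unit.display(interface)
        # S200: save the execution interface currently displayed
        self.saved_interface = self.head_unit.displayed
        return interface

    def on_call_instruction(self):
        # S300: the displayed interface was switched away; resend the saved one
        self.head_unit.display(self.saved_interface)
        return self.saved_interface


unit = HeadUnit()
cloud = CloudServer(unit)
cloud.on_task_instruction("find nearby hotels")  # hotel interface displayed
unit.display("initial standby interface")        # the interface gets switched
restored = cloud.on_call_instruction()           # user: "view the last interface"
print(restored)  # → execution interface for: find nearby hotels
```

The key design point the patent relies on is that saving (S200) happens continuously on the cloud side, so the restore in S300 needs no cooperation from whatever caused the switch.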
Thus, in the voice interaction method for a vehicle head unit of the present application, the cloud server generates the execution interface and saves, in real time, the one currently displayed on the screen. After the displayed interface has been switched away, the saved interface, i.e. the one just switched away, can be sent back to the head unit 102 on the user's call instruction and redisplayed. The most recent task can therefore be quickly brought back, a switched-away execution interface is not lost, the user experience improves, and cloud resources are used to the greatest extent, fully exploiting their performance advantages and raising their utilization.
The following describes each step of the voice interaction method of the vehicle-mounted device in the embodiment of the application in detail.
Step S100:
In step S100, in response to receiving the first task instruction, an execution interface corresponding to it is generated and sent to the head unit 102 for display on its screen. The first task instruction may be a voice segment corresponding to a single task, for example "navigate to Nanjing Station". It may also comprise a plurality of voice segments corresponding to different tasks, for example "navigate to Nanjing Station, but first show today's weather".
As shown in FIG. 3, when the first task instruction includes a plurality of voice segments corresponding to different tasks, step S100 includes:
S110, in response to receiving the first task instruction, determining the execution order of the plurality of voice segments; and
S120, generating the execution interfaces corresponding to the voice segments one by one in the execution order, sending each interface to the head unit 102 as it is generated so that the screen displays it in turn.
In some embodiments, as shown in FIG. 4, step S120 specifically includes:
S121, generating the execution interface corresponding to the voice segment that is current in the execution order and sending it to the head unit 102 for display on the screen;
S122, in response to the task corresponding to the current voice segment being completed, sending a query instruction asking whether to continue with the task corresponding to the next voice segment; and
S123, in response to receiving a continue instruction, generating the execution interface corresponding to the next voice segment in the execution order and sending it to the head unit 102, so that the screen switches from the current segment's execution interface to the next segment's.
The method is further described below taking the first task instruction "navigate to Nanjing Station, but first show today's weather" as an example:
The cloud server receives the first task instruction through the voice device of the head unit 102 and determines the execution order of the voice segment "show today's weather" and the voice segment "navigate to Nanjing Station". It then generates, first in that order, the execution interface corresponding to "show today's weather", namely a weather interface containing today's weather information, and sends it to the head unit 102, which displays it on the screen. Meanwhile, the cloud server acquires and saves the currently displayed execution interface, i.e. the weather interface, from the head unit 102. When that task is completed, the cloud server sends a query instruction to the head unit 102 asking the user whether the navigation task to Nanjing Station should proceed. If the cloud server receives a continue instruction such as "continue" or "yes" through the voice device of the head unit 102, it generates the execution interface corresponding to "navigate to Nanjing Station", namely a navigation interface containing a navigation path, and sends it to the head unit 102, whose screen switches from the weather interface to the navigation interface. If the user wants to browse the weather interface again, a call instruction such as "view the last execution interface" can be issued to the head unit 102.
On receiving the call instruction through the voice device of the head unit 102, the cloud server sends the saved weather interface back to the head unit 102, and the user can browse the just-switched weather interface again on the screen.
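Steps S121-S123 amount to executing the ordered segments one at a time, pausing between them for the user's confirmation. The sketch below assumes the segments have already been sorted per S110; `run_segments` and the `ask_user` callback are illustrative names, not from the patent.

```python
def run_segments(segments, ask_user):
    """Execute voice segments in order (S121), asking between tasks
    whether to continue with the next one (S122/S123).

    segments: voice segments already sorted into execution order (S110).
    ask_user: callback returning True if the user says to continue.
    """
    displayed = []
    for i, segment in enumerate(segments):
        # S121: generate and display the interface for the current segment
        displayed.append(f"interface for: {segment}")
        # S122: after the task completes, ask whether to run the next one;
        # S123 only proceeds when a continue instruction is received
        if i + 1 < len(segments) and not ask_user(segments[i + 1]):
            break
    return displayed


# "navigate to Nanjing Station, but first show today's weather"
order = ["show today's weather", "navigate to Nanjing Station"]
shown = run_segments(order, ask_user=lambda nxt: True)  # user always continues
print(shown)
```

If the user declines at the query step, the loop simply stops after the current segment, matching the behaviour where no continue instruction arrives.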
Of course, when the first task instruction includes a plurality of voice segments corresponding to different tasks, the cloud server may instead first generate the execution interfaces for all the voice segments, then determine the execution order, and send the generated interfaces to the head unit 102 one by one in that order.
Step S200:
In step S200, the execution interface currently displayed on the screen is acquired and saved in real time. All the saved execution interfaces may be stored in the form of a list, arranged in a preset order; the preset order may be, but is not limited to, chronological or by interface type. For example, the saved interfaces may be arranged in the list from top to bottom in the order in which they were saved.
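One possible "preset order" is newest-first by time of saving, which makes the just-switched interface the head of the list. A minimal sketch, with the list and function names invented for illustration:

```python
saved = []  # list of saved execution interfaces, most recent first

def save_interface(interface):
    # newest saved interface is ranked first in the list
    saved.insert(0, interface)

save_interface("weather interface")
save_interface("navigation interface")
print(saved)  # → ['navigation interface', 'weather interface']
```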
Step S300:
In step S300, in response to the currently displayed execution interface being switched away and a call instruction related to it being received, the saved execution interface is sent to the head unit 102. The displayed interface can be switched for many reasons: for example, the head unit 102 receives another task instruction from the user, or it receives a non-task instruction or a cancel instruction.
Step S300 is described further for each of these cases:
First case: the head unit 102 receives another task instruction from the user. In this case, as shown in FIG. 5, step S300 includes:
S310, in response to receiving a second task instruction, sending a retention prompt;
S320, in response to receiving the second task instruction again, generating the execution interface corresponding to it and sending that interface to the head unit 102, so that the screen switches from the currently displayed execution interface to the one corresponding to the second task instruction; and
S330, in response to receiving the call instruction, sending the saved execution interface to the head unit 102, so that the screen switches from the second task instruction's interface back to the saved one.
As can be seen, in the embodiment of the present application, after receiving the second task instruction the cloud server first sends a retention prompt to the head unit 102 to encourage the user to stay on the currently displayed execution interface; this avoids the voice interaction being disrupted by execution interfaces switched on a passing whim. Moreover, the cloud server generates the execution interface for the second task instruction only when it receives that instruction a second time, so that the screen of the head unit 102 switches only then. This both ensures the user's real intention is carried out and avoids annoying the user through excessive retention.
For example, when the first task instruction is "book an XXX hotel room", the execution interface currently displayed on the screen of the head unit 102 is the XXX hotel room booking interface, which the cloud server acquires and saves from the head unit 102. If the cloud server then receives a second task instruction "navigate to XXX hotel" through the voice device of the head unit 102, it sends the head unit 102 a retention prompt, such as "the room has not been booked yet" or "are you sure you want to switch the current interface". When the cloud server receives the second task instruction again, it generates the corresponding execution interface, namely a navigation interface containing a navigation path, and sends it to the head unit 102, whose screen switches from the booking interface to the navigation interface. Thereafter, if the user wants to continue booking the room, a call instruction to bring back the XXX hotel room booking interface can be issued to the head unit 102. On receiving it, the cloud server sends the saved booking interface back to the head unit 102, and the user can return to the just-switched booking interface on the screen and continue the booking there.
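The confirm-before-switch behaviour of S310-S330 can be sketched as a small state machine. All names here (SwitchGuard, on_second_task, etc.) are invented for the example, and saving is simplified to happen once per display rather than continuously:

```python
class SwitchGuard:
    """Sketch of S310-S330: a second task instruction first draws a
    retention prompt; only a repeated instruction switches the screen,
    and the saved interface remains callable afterwards."""
    def __init__(self):
        self.pending = None          # second task instruction awaiting confirmation
        self.saved_interface = None
        self.displayed = None

    def show(self, interface):
        self.displayed = interface
        self.saved_interface = interface  # saved before any switch (S200)

    def on_second_task(self, instruction):
        if self.pending != instruction:
            self.pending = instruction
            return "retention prompt"             # S310: ask the user to confirm
        self.displayed = f"interface for: {instruction}"  # S320: switch on repeat
        return self.displayed

    def on_call_instruction(self):
        self.displayed = self.saved_interface     # S330: restore saved interface
        return self.displayed


g = SwitchGuard()
g.show("XXX hotel room booking interface")
print(g.on_second_task("navigate to XXX hotel"))  # → retention prompt
print(g.on_second_task("navigate to XXX hotel"))  # → interface for: navigate to XXX hotel
print(g.on_call_instruction())                    # → XXX hotel room booking interface
```

The two-step confirmation is what lets the system distinguish a passing whim from a genuine intent to switch.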
To improve efficiency and save resources, in some embodiments the step of generating the execution interface corresponding to the second task instruction in step S320 includes: in response to receiving the second task instruction again, matching the second task instruction against all the saved execution interfaces; in response to none of the saved interfaces matching the second task instruction, generating a new execution interface for it; and in response to one of the saved interfaces matching the second task instruction, using the matched interface as the execution interface corresponding to the second task instruction.
Where all the saved execution interfaces are stored as a list, and considering that a user may habitually browse certain interfaces, such as a weather interface, the method further includes, in order to speed up the matching of the second task instruction against the saved interfaces: in response to one of the saved execution interfaces being redisplayed, moving that interface to the first position in the list.
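Together, the matching step and the move-to-front rule mean frequently revisited interfaces are found first and reused instead of regenerated. The sketch below uses a naive keyword check as the match rule purely for illustration; the patent does not specify how the matching is performed, and all names are invented:

```python
saved = ["navigation interface", "weather interface", "hotel booking interface"]

def interface_for(instruction):
    """Reuse a matching saved interface (promoting it to rank first),
    otherwise generate a new one."""
    for i, interface in enumerate(saved):
        if interface.split()[0] in instruction:   # illustrative match rule only
            saved.insert(0, saved.pop(i))         # redisplayed -> first in list
            return interface
    new = f"interface for: {instruction}"         # no match: generate a new one
    saved.insert(0, new)
    return new

reused = interface_for("show the weather")
print(reused)     # reuses the saved weather interface
print(saved[0])   # → weather interface (promoted to the head of the list)
```

Because habitual interfaces drift to the front, later matches over the same list terminate sooner on average, which is the efficiency gain the embodiment is after.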
Second, the vehicle machine 102 receives a non-task instruction sent by the user. In this case, step S300 includes: in response to receiving the non-task instruction, sending a retention instruction; in response to not receiving a hold instruction, sending an initial standby interface to the vehicle machine 102, so that the display screen switches the currently displayed execution interface to the initial standby interface; and in response to receiving a retrieval instruction, sending the saved currently displayed execution interface to the vehicle machine 102, so that the display screen switches the initial standby interface to the saved currently displayed execution interface.
As can be seen from the above, after receiving a non-task instruction, the cloud server in the embodiment of the present application sends a retention instruction to the vehicle machine 102 to guide the user to stay on the currently displayed execution interface. This avoids the frequent interruption of multi-round dialogues caused by user hesitation, interjections from bystanders, or voice misrecognition, improves robustness against such anomalies, and improves the user experience.
For example, when the first task instruction is "find nearby hotels", the execution interface currently displayed on the display screen of the vehicle machine 102 is a nearby-hotel information interface. During this process, the cloud server acquires and saves, from the vehicle machine 102, the execution interface currently displayed on the display screen, i.e. the nearby-hotel information interface. Then, if the cloud server receives a non-task instruction, such as "oh" or "uh", sent by the user through the voice device of the vehicle machine 102, the cloud server sends a retention instruction to the vehicle machine 102, such as "The query has not been completed" or "Are you sure you want to switch the current interface?". When the cloud server receives a hold instruction, such as "stay on the current interface" or "do not switch", sent by the user through the voice device of the vehicle machine 102, it performs no action, so that the display screen continues to display the nearby-hotel information interface. When the cloud server does not receive a hold instruction sent by the user through the voice device of the vehicle machine 102, it sends the initial standby interface to the vehicle machine 102, and the display screen of the vehicle machine 102 switches the currently displayed nearby-hotel information interface to the initial standby interface. Then, if the user wants to continue viewing nearby hotels, a retrieval instruction for recalling the nearby-hotel information interface may be issued to the vehicle machine 102. After receiving the retrieval instruction sent by the user through the voice device of the vehicle machine 102, the cloud server sends the saved nearby-hotel information interface to the vehicle machine 102. The user can thus review the nearby-hotel information interface that was just switched away on the display screen of the vehicle machine 102.
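The non-task-instruction branch can be sketched as follows: after the retention instruction, the display stays put if a hold instruction arrives, falls back to the initial standby interface otherwise, and the saved interface can be retrieved later. All names here are illustrative assumptions.

```python
STANDBY = "initial standby interface"

def retention_instruction():
    """Prompt sent after a non-task instruction is recognized."""
    return "The query has not been completed. Are you sure you want to switch?"

def after_retention(current_interface, hold_received):
    """Interface shown once the hold instruction arrives (or fails to arrive):
    stay on the current interface, or fall back to standby."""
    return current_interface if hold_received else STANDBY

def on_retrieval(saved_interface):
    # The cloud server resends the saved currently displayed interface.
    return saved_interface
```

In the hotel example, `after_retention("nearby-hotel information interface", False)` yields the standby interface, and `on_retrieval` restores the saved hotel interface when the user asks for it again.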
Third, the vehicle machine 102 receives a cancel instruction sent by the user. In this case, step S300 includes: in response to receiving the cancel instruction, sending an initial standby interface to the vehicle machine 102, so that the display screen switches the currently displayed execution interface to the initial standby interface; and in response to receiving a retrieval instruction, sending the saved currently displayed execution interface to the vehicle machine 102, so that the display screen switches the initial standby interface to the saved currently displayed execution interface.
For example, when the first task instruction is "find a parking lot near XXX scenic spot", the execution interface currently displayed on the display screen of the vehicle machine 102 is a parking-lot information interface for the area near XXX scenic spot. During this process, the cloud server acquires and saves, from the vehicle machine 102, the execution interface currently displayed on the display screen, i.e. this parking-lot information interface. Then, if the cloud server receives a cancel instruction, such as "cancel" or "never mind", sent by the user through the voice device of the vehicle machine 102, the cloud server sends the initial standby interface to the vehicle machine 102, and the display screen of the vehicle machine 102 switches the currently displayed parking-lot information interface to the initial standby interface. Then, if the user wants to continue viewing the parking-lot information, a retrieval instruction for recalling the parking-lot information interface may be issued to the vehicle machine 102. After receiving the retrieval instruction sent by the user through the voice device of the vehicle machine 102, the cloud server sends the saved parking-lot information interface to the vehicle machine 102. The user can thus review the parking-lot information interface that was just switched away on the display screen of the vehicle machine 102.
Of course, in addition to the above causes, other causes, such as network anomalies, program crashes, or task completion, may also lead to the currently displayed execution interface being switched. A person skilled in the art can, in combination with common general knowledge, adapt the above examples to such other situations without departing from the teachings of the present disclosure.
The embodiment of the application also provides a vehicle-machine voice interaction system, which comprises a display screen and a cloud server, wherein the cloud server executes the vehicle-machine voice interaction method described above.
The embodiment of the application also provides an electronic device, which comprises at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the vehicle-machine voice interaction method described above.
The embodiment of the application also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the vehicle-machine voice interaction method described above.
In addition, the embodiment of the application also provides another vehicle-machine voice interaction method, which is generally executed by the vehicle machine 102. As shown in fig. 6, this vehicle-machine voice interaction method comprises the following steps:
S101, in response to receiving a first task instruction of a user, uploading the first task instruction to a cloud server;
S102, in response to receiving the execution interface corresponding to the first task instruction generated by the cloud server, controlling the display screen of the vehicle machine 102 to display the execution interface;
S103, sending the execution interface currently displayed by the display screen to the cloud server;
S104, in response to the execution interface currently displayed by the display screen being switched and a retrieval instruction of the user being received, uploading the retrieval instruction to the cloud server, wherein the retrieval instruction is related to the currently displayed execution interface; and
S105, in response to receiving the currently displayed execution interface saved by the cloud server, controlling the display screen to redisplay the currently displayed execution interface.
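Steps S101 to S105 on the head-unit side can be sketched as below. The `CloudStub` class stands in for the real cloud server, and all names are assumptions for illustration.

```python
class CloudStub:
    """Illustrative stand-in for the cloud server in steps S101-S105."""

    def __init__(self):
        self.saved = None

    def run_task(self, instruction):
        return f"interface:{instruction}"

    def save(self, interface):
        self.saved = interface      # server keeps the currently shown interface

    def retrieve(self):
        return self.saved


class VehicleMachine:
    """Illustrative head-unit logic for steps S101-S105."""

    def __init__(self, cloud):
        self.cloud = cloud
        self.screen = None          # what the display screen currently shows

    def on_task(self, instruction):
        # S101: upload the task; S102: display the returned interface;
        # S103: report the currently displayed interface to the server.
        self.screen = self.cloud.run_task(instruction)
        self.cloud.save(self.screen)
        return self.screen

    def on_retrieval(self):
        # S104: upload the retrieval instruction; S105: redisplay the
        # interface saved by the cloud server.
        self.screen = self.cloud.retrieve()
        return self.screen
```

After the displayed interface is switched away, `on_retrieval` restores the interface the server saved, mirroring the paragraph that follows.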
During multiple rounds of dialogue between the user and the vehicle machine 102, when the vehicle machine 102 receives a first task instruction from the user, it uploads the first task instruction to the cloud server. The cloud server generates an execution interface corresponding to the first task instruction and sends it to the vehicle machine 102. After receiving the execution interface, the vehicle machine 102 controls its display screen to display it. Meanwhile, the cloud server acquires and saves, in real time, the execution interface currently displayed by the display screen from the vehicle machine 102. If the first task instruction is executed to completion or interrupted during the dialogue, the currently displayed execution interface is switched. When the vehicle machine 102 then receives a retrieval instruction from the user, it uploads the retrieval instruction to the cloud server, and the cloud server sends the saved currently displayed execution interface, i.e. the interface that was just switched away, back to the vehicle machine 102. After receiving this execution interface, the vehicle machine 102 controls its display screen to redisplay it. The user can thus review, on the display screen of the vehicle machine 102, the execution interface that was just switched away.
The embodiment of the application also provides another electronic device, which comprises at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the vehicle-machine voice interaction method described above.
The embodiment of the application also provides another computer readable storage medium storing a computer program which, when executed by a processor, implements the vehicle-machine voice interaction method described above.
Fig. 7 shows a schematic block diagram of an example electronic device 200 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the electronic device 200 includes a computing unit 201 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 202 or a computer program loaded from a storage unit 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data required for the operation of the electronic apparatus 200 can also be stored. The computing unit 201, ROM 202, and RAM 203 are connected to each other through a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
Various components in the electronic device 200 are connected to the I/O interface 205, including: an input unit 206 such as a keyboard, a mouse, etc.; an output unit 207 such as various types of displays, speakers, and the like; a storage unit 208 such as a magnetic disk, an optical disk, or the like; and a communication unit 209 such as a network card, modem, wireless communication transceiver, etc. The communication unit 209 allows the electronic device 200 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 201 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 201 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 201 performs the various methods and processes described above, such as the vehicle-machine voice interaction method. For example, in some embodiments, the vehicle-machine voice interaction method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 200 via the ROM 202 and/or the communication unit 209. When the computer program is loaded into the RAM 203 and executed by the computing unit 201, one or more steps of the vehicle-machine voice interaction method described above may be performed. Alternatively, in other embodiments, the computing unit 201 may be configured to perform the vehicle-machine voice interaction method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (12)

1. A vehicle-machine voice interaction method, characterized by comprising the following steps:
in response to receiving a first task instruction, generating an execution interface corresponding to the first task instruction, and sending the execution interface to a vehicle machine to display the execution interface on a display screen of the vehicle machine;
acquiring and saving, in real time, the execution interface currently displayed by the display screen; and
in response to the execution interface currently displayed by the display screen being switched and a retrieval instruction related to the currently displayed execution interface being received, sending the saved currently displayed execution interface to the vehicle machine so as to redisplay the saved currently displayed execution interface on the display screen.
2. The vehicle-machine voice interaction method according to claim 1, wherein the first task instruction comprises a plurality of speech segments respectively corresponding to different tasks; and
the step of, in response to receiving the first task instruction, generating the execution interface corresponding to the first task instruction and sending the execution interface to the vehicle machine to display the execution interface on the display screen of the vehicle machine comprises:
in response to receiving the first task instruction, determining an execution order of the plurality of speech segments; and
sequentially generating the execution interfaces corresponding to the speech segments according to the execution order, and synchronously sending the execution interfaces to the vehicle machine so as to synchronously display the execution interfaces on the display screen.
3. The vehicle-machine voice interaction method according to claim 2, wherein the step of sequentially generating the execution interfaces corresponding to the speech segments according to the execution order and synchronously sending the execution interfaces to the vehicle machine comprises:
generating, according to the execution order, an execution interface corresponding to the currently ranked speech segment, and sending the execution interface corresponding to the currently ranked speech segment to the vehicle machine so as to display it on the display screen;
in response to completion of the task corresponding to the currently ranked speech segment, sending an inquiry instruction inquiring whether to continue executing the task corresponding to the next-ranked speech segment; and
in response to receiving a continue-execution instruction, generating, according to the execution order, an execution interface corresponding to the next-ranked speech segment, and sending the execution interface corresponding to the next-ranked speech segment to the vehicle machine so that the display screen switches the execution interface corresponding to the currently ranked speech segment to the execution interface corresponding to the next-ranked speech segment.
4. The vehicle-machine voice interaction method according to claim 1, wherein the step of, in response to the execution interface currently displayed by the display screen being switched and the retrieval instruction related to the currently displayed execution interface being received, sending the saved currently displayed execution interface to the vehicle machine comprises:
in response to receiving a second task instruction, sending a retention instruction;
in response to receiving the second task instruction again, generating an execution interface corresponding to the second task instruction, and sending the execution interface corresponding to the second task instruction to the vehicle machine so that the display screen switches the currently displayed execution interface to the execution interface corresponding to the second task instruction; and
in response to receiving the retrieval instruction, sending the saved currently displayed execution interface to the vehicle machine, so that the display screen switches the execution interface corresponding to the second task instruction to the saved currently displayed execution interface.
5. The vehicle-machine voice interaction method according to claim 4, wherein the step of generating the execution interface corresponding to the second task instruction comprises:
in response to receiving the second task instruction again, pairing the second task instruction with all saved currently displayed execution interfaces;
in response to none of the saved currently displayed execution interfaces matching the second task instruction, generating the execution interface corresponding to the second task instruction; and
in response to one of the saved currently displayed execution interfaces matching the second task instruction, using the matched execution interface as the execution interface corresponding to the second task instruction.
6. The vehicle-machine voice interaction method according to claim 1, wherein the step of, in response to the execution interface currently displayed by the display screen being switched and the retrieval instruction related to the currently displayed execution interface being received, sending the saved currently displayed execution interface to the vehicle machine comprises:
in response to receiving a non-task instruction, sending a retention instruction;
in response to not receiving a hold instruction, sending an initial standby interface to the vehicle machine so that the display screen switches the currently displayed execution interface to the initial standby interface; and
in response to receiving the retrieval instruction, sending the saved currently displayed execution interface to the vehicle machine, so that the display screen switches the initial standby interface to the saved currently displayed execution interface.
7. The vehicle-machine voice interaction method according to claim 1, wherein the step of, in response to the execution interface currently displayed by the display screen being switched and the retrieval instruction related to the currently displayed execution interface being received, sending the saved currently displayed execution interface to the vehicle machine comprises:
in response to receiving a cancel instruction, sending an initial standby interface to the vehicle machine so that the display screen switches the currently displayed execution interface to the initial standby interface; and
in response to receiving the retrieval instruction, sending the saved currently displayed execution interface to the vehicle machine, so that the display screen switches the initial standby interface to the saved currently displayed execution interface.
8. The vehicle-machine voice interaction method according to any one of claims 1 to 6, wherein all the currently displayed execution interfaces are saved in the form of a list; and
the vehicle-machine voice interaction method further comprises:
in response to one of the saved currently displayed execution interfaces being redisplayed, moving the redisplayed execution interface to the first position in the list.
9. A vehicle-machine voice interaction method, characterized by comprising the following steps:
in response to receiving a first task instruction of a user, uploading the first task instruction to a cloud server;
in response to receiving an execution interface corresponding to the first task instruction generated by the cloud server, controlling a display screen of a vehicle machine to display the execution interface;
sending the execution interface currently displayed by the display screen to the cloud server;
in response to the execution interface currently displayed by the display screen being switched and a retrieval instruction of the user being received, uploading the retrieval instruction to the cloud server, wherein the retrieval instruction is related to the currently displayed execution interface; and
in response to receiving the currently displayed execution interface saved by the cloud server, controlling the display screen to redisplay the currently displayed execution interface.
10. A vehicle-machine voice interaction system, characterized by comprising:
a display screen configured to display an execution interface; and
a cloud server configured to perform the vehicle-machine voice interaction method according to any one of claims 1 to 8.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the vehicle-machine voice interaction method according to any one of claims 1 to 8.
12. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the vehicle-machine voice interaction method according to any one of claims 1 to 8.
CN202210729288.0A 2022-06-24 2022-06-24 Voice interaction method and system for vehicle-mounted device, electronic equipment and storage medium Pending CN117311862A (en)

Priority Applications (1)

Application Number: CN202210729288.0A; Priority Date: 2022-06-24; Filing Date: 2022-06-24; Title: Voice interaction method and system for vehicle-mounted device, electronic equipment and storage medium


Publications (1)

Publication Number: CN117311862A; Publication Date: 2023-12-29

Family

ID=89241393

Family Applications (1)

Application Number: CN202210729288.0A; Status: Pending; Publication: CN117311862A; Title: Voice interaction method and system for vehicle-mounted device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117311862A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination