CN115240664A - Man-machine interaction method and electronic equipment - Google Patents

Man-machine interaction method and electronic equipment

Info

Publication number
CN115240664A
Authority
CN
China
Prior art keywords
information
user
electronic device
intention
human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210639589.4A
Other languages
Chinese (zh)
Inventor
魏巍
许翔
吴金娴
李秀岳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202210639589.4A
Publication of CN115240664A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223: Execution procedure of a spoken command
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M 2201/00: Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M 2201/40: Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H04M 2250/00: Details of telephonic subscriber devices
    • H04M 2250/74: Details of telephonic subscriber devices with voice recognition means

Abstract

The application provides a man-machine interaction method and electronic equipment, and relates to the field of artificial intelligence, in particular to the field of natural language processing. The method comprises the following steps: the electronic equipment obtains a first sentence input by a user; the electronic equipment parses the first sentence to obtain first information, where the first information is used to indicate the intention of the user and corresponds to one or more pieces of second information, the one or more pieces of second information being used to realize the intention of the user; when at least one piece of the second information is missing, the electronic equipment searches for the missing second information in the content memorized by the human-computer interaction application; and the electronic equipment performs an operation related to the user's intention according to the first information and the one or more pieces of second information. The man-machine interaction method helps improve the efficiency of human-computer interaction.

Description

Man-machine interaction method and electronic equipment
Technical Field
The present application relates to the field of electronic devices, and more particularly, to a method for human-computer interaction and an electronic device.
Background
Voice assistants are now widely used on electronic devices such as mobile phones, tablet computers and smart speakers, providing users with an intelligent voice interaction mode.
An existing voice assistant that merely implements a memory function may still need frequent interaction with the user in scenarios where the user's intention has to be realized, so the efficiency of human-computer interaction is low and the user experience is poor.
Disclosure of Invention
The application provides a human-computer interaction method and an electronic device, with the aim of improving the efficiency of human-computer interaction.
In a first aspect, a method for human-computer interaction is provided. The method is applied to an electronic device and includes: the electronic device (specifically, a human-computer interaction application in the electronic device) acquires a first sentence input by a user; the electronic device (specifically, the human-computer interaction application in the electronic device) parses the first sentence to obtain first information, where the first information is used to indicate an intention of the user, the first information corresponds to one or more pieces of second information, and the one or more pieces of second information are information used to realize the intention of the user; when at least one piece of the one or more pieces of second information is missing, the electronic device (specifically, the human-computer interaction application in the electronic device) searches for the missing at least one piece of second information in the content memorized by the human-computer interaction application; and the electronic device (which may specifically be the human-computer interaction application in the electronic device) performs an operation related to the user's intention according to the first information and the one or more pieces of second information.
According to this man-machine interaction method, when the human-computer interaction application finds that information is missing while executing the user's intention, it can search for the missing information in the content it has memorized (also called stored content). This avoids frequent interaction between the human-computer interaction application and the user, improves the efficiency of human-computer interaction, and improves the user experience.
In this embodiment of the application, when the human-computer interaction application searches for missing information, the places it searches include the content it has memorized. Illustratively, the user has previously entered some information while interacting with the human-computer interaction application, and this information may have been stored.
The human-computer interaction application may also search content saved in other applications. Illustratively, the human-computer interaction application looks for the missing information in the notepad application of the electronic device.
In some possible implementations, after determining the intention of the user, the human-computer interaction application may analyze the information stored in the notepad in real time, determine whether the at least one second information exists in the information stored in the notepad, and perform an operation related to the intention of the user after finding the at least one second information from the information stored in the notepad.
In some possible implementations, after the electronic device detects that the user inputs relevant information in the notepad, the electronic device may analyze the information in the notepad in advance, and store the analyzed information in a storage space corresponding to the notepad application, or in a storage space corresponding to the human-computer interaction application, or may also store the analyzed information in another storage space (for example, on the cloud side or in the server).
It should be understood that the notepad is only used above as an example of such other applications; they may also be a short message application, a chat application, and the like. When searching for the at least one piece of missing information, the content in the other applications may be analyzed in real time; alternatively, the content in the other applications may be analyzed in advance, and the information obtained from that analysis stored in a corresponding storage space.
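As a minimal sketch of the flow described in the first aspect, assuming a dict-based memory and hypothetical helpers (parse_intent, ask_user, execute); none of these names are taken from the application:

```python
# Minimal sketch, assuming a parse_intent() helper and a simple dict-based memory.
# None of these names come from the patent; they only illustrate the described flow.

MEMORY = {}          # content memorized by the human-computer interaction application
OTHER_APPS = {}      # e.g. information parsed in advance from the notepad application

def required_slots(intent):
    # Second information needed to realize each intent (illustrative values only).
    return {"book_flight": ["departure_time", "origin", "destination"]}.get(intent, [])

def handle_sentence(sentence, parse_intent, ask_user, execute):
    intent, slots = parse_intent(sentence)        # first information + any second information
    for name in required_slots(intent):
        if name in slots:
            continue
        if name in MEMORY:                        # first look in the memorized content
            slots[name] = MEMORY[name]
        elif name in OTHER_APPS:                  # then in content saved by other applications
            slots[name] = OTHER_APPS[name]
        else:                                     # otherwise ask the user and remember the answer
            slots[name] = ask_user(f"Please provide the {name}.")
            MEMORY[name] = slots[name]
    return execute(intent, slots)                 # operation related to the user's intention
```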
With reference to the first aspect, in some possible implementations of the first aspect, before the electronic device (specifically, the human-computer interaction application in the electronic device) acquires the first sentence input by the user, the method further includes: the electronic equipment (specifically, the human-computer interaction application in the electronic equipment) acquires a second sentence input by the user; the electronic device (specifically, the human-computer interaction application in the electronic device) parses the second sentence to obtain the at least one piece of second information; the electronic device (which may specifically be the human-computer interaction application in the electronic device) saves the at least one second information.
According to this human-computer interaction method, the human-computer interaction application can automatically store some information during its interaction with the user, so that missing information can be searched for in the stored information once the user's intention is obtained. This avoids frequent interaction between the human-computer interaction application and the user, improves the efficiency of human-computer interaction, and improves the user experience.
In some possible implementations, the man-machine-interaction application stores the at least one second information in a content memorized by the man-machine-interaction application.
In some possible implementations, before the human-computer interaction application saves the at least one second information, the method further includes: the man-machine interaction application determines the type of information to be stored; wherein the human-computer interaction application saves the at least one piece of second information, including: and under the condition that the type of each piece of information in the at least one piece of second information meets the type of the information needing to be stored, the man-machine interaction application stores the at least one piece of second information.
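A minimal sketch of this saving step, assuming a hypothetical extract_facts() parser and a configurable set of information types to remember (neither name comes from the application):

```python
# Hypothetical sketch: save second information parsed from an earlier sentence,
# but only when its type matches the types the application is configured to store.
SAVED_TYPES = {"passport_number", "home_address", "frequent_destination"}  # illustrative

def remember(sentence, extract_facts, memory):
    # extract_facts() is assumed to return (info_type, value) pairs parsed from the sentence
    for info_type, value in extract_facts(sentence):
        if info_type in SAVED_TYPES:     # only store information of the required types
            memory[info_type] = value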
With reference to the first aspect, in some possible implementations of the first aspect, the second statement includes a memory instruction initiated by a user.
With reference to the first aspect, in some possible implementations of the first aspect, the first information corresponds to multiple pieces of second information, at least two pieces of the second information are missing, and the searching, by the electronic device (specifically, the human-computer interaction application in the electronic device), for the missing at least one piece of second information in the content memorized by the human-computer interaction application includes: the electronic device (specifically, the human-computer interaction application in the electronic device) finds one part of the missing at least two pieces of second information in the content memorized by the human-computer interaction application. The method further includes: the electronic device (specifically, the human-computer interaction application in the electronic device) generates a dialog used to prompt the user to input the other part of the at least two pieces of second information; the electronic device (specifically, the human-computer interaction application in the electronic device) sends the dialog information to the user; the electronic device (specifically, the human-computer interaction application in the electronic device) acquires a third sentence input by the user; and the electronic device (specifically, the human-computer interaction application in the electronic device) parses the third sentence, where the third sentence includes the other part of the at least two pieces of second information.
When the human-computer interaction application finds only part of the missing at least two pieces of second information from the memorized contents, the human-computer interaction application can inquire the user about the other part of the missing at least two pieces of second information, so as to obtain one or more pieces of second information realizing the intention of the user.
With reference to the first aspect, in some possible implementations of the first aspect, the method further includes: the human-computer interaction application stores the other part of the at least two pieces of second information.
According to this human-computer interaction method, the human-computer interaction application can automatically store some information during its interaction with the user, so that the next time a certain intention of the user is realized, that information can be reused. This avoids frequent interaction between the human-computer interaction application and the user, improves the efficiency of human-computer interaction, and improves the user experience.
With reference to the first aspect, in some possible implementations of the first aspect, the performing, according to the first information and the one or more second information, an operation related to the user's intention includes: the electronic device (specifically, the human-computer interaction application in the electronic device) generates an instruction according to the first information, the found missing at least one piece of second information, and information other than the at least one piece of second information in the one or more pieces of second information; the electronic device (which may specifically be the human-computer interaction application in the electronic device) performs an operation related to the instruction according to the instruction.
In this embodiment of the application, the first sentence may include only the first information, in which case the human-computer interaction application needs to search the memorized content for the one or more pieces of second information. The human-computer interaction application may find all of the one or more pieces of second information in the memorized content, and in this case it may directly generate an instruction; or it may find only part of the one or more pieces of second information, and it then needs to query the user to obtain the other part before generating the instruction.
Alternatively, the first sentence includes the first information and part of the one or more pieces of second information used to realize the user's intention, and the human-computer interaction application searches the memorized content for the other part of that information. The human-computer interaction application may find that other part in the memorized content, in which case it may directly generate an instruction; or it may find only some of it, and the rest needs to be obtained by querying the user before the instruction is generated.
With reference to the first aspect, in some possible implementations of the first aspect, before the generating the instruction, the method includes: the electronic device (specifically, the human-computer interaction application in the electronic device) fills the one or more second information into the slot corresponding to the first information.
In this embodiment, before the human-computer interaction application generates the instruction, the one or more second information may be filled in the slot corresponding to the first information.
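As a minimal illustration of this slot-filling step (the intent-to-slot mapping and the instruction format below are hypothetical, not taken from the application):

```python
# Hypothetical illustration of filling the second information into the slots that
# correspond to the first information before generating an instruction.
INTENT_SLOTS = {"book_flight": ["departure_time", "origin", "destination"]}

def fill_slots_and_generate(intent, second_info):
    slots = {name: second_info[name] for name in INTENT_SLOTS[intent]}  # fill each slot
    return {"action": intent, "parameters": slots}                      # the generated instruction
```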
In a second aspect, a method for human-computer interaction is provided. The method is applied to an electronic device and includes: the electronic device (specifically, the human-computer interaction application in the electronic device) detects a first sentence input by a user, where the first sentence includes at least one piece of first information; in response to the first sentence input by the user, the electronic device displays or broadcasts first dialog information, where the first dialog information is a response to the first sentence; in response to the first sentence input by the user, the electronic device (which may specifically be the human-computer interaction application in the electronic device) stores the at least one piece of first information; the electronic device (specifically, the human-computer interaction application in the electronic device) detects a second sentence input by the user, where the second sentence includes second information and does not include the at least one piece of first information, the second information is used to indicate the intention of the user, and the at least one piece of first information is at least part of the information used to realize the intention of the user; and in response to the second sentence input by the user, the electronic device (which may specifically be the human-computer interaction application in the electronic device) performs an operation related to the user's intention at least according to the second information and the at least one piece of first information.
In some possible implementations, the human-computer interaction application performs an operation related to the user's intention at least according to the second information and the at least one first information, including: the man-machine interaction application generates an instruction at least according to the second information and the at least one piece of first information; and the man-machine interaction application executes the operation related to the instruction according to the instruction.
In some possible implementations, the electronic device stores the at least one first information in a content remembered by the human interaction application.
In some possible implementations, prior to performing the operation related to the user's intent, the method further includes: and searching the at least one first message from the content memorized by the man-machine interaction application.
With reference to the second aspect, in some possible implementations of the second aspect, the at least one piece of first information is a part of information used for realizing the intention of the user, and the second sentence does not include another part of information used for realizing the intention of the user, and in response to the second sentence input by the user, the electronic device (which may be specifically the human-computer interaction application in the electronic device) performs an operation related to the intention of the user according to at least the second information and the at least one piece of first information, including: the electronic device (specifically, the human-computer interaction application in the electronic device) displays or broadcasts second dialogue information, wherein the second dialogue information is used for reminding a user to input third information, and the third information is another part of information in information for realizing the intention of the user; the electronic device (specifically, the human-computer interaction application in the electronic device) detects a third sentence input by the user, where the third sentence includes the third information; in response to a third sentence input by the user, the electronic device (which may specifically be the human-computer interaction application in the electronic device) performs an operation related to the user's intention according to the third information, the second information and the at least one first information.
In some possible implementations, the human-computer interaction application performs an operation related to the user's intention according to the third information, the second information and the at least one first information, including: the man-machine interaction application generates an instruction according to the third information, the second information and the at least one first information; and the man-machine interaction application executes the operation related to the instruction according to the instruction.
In some possible implementations, the electronic device stores the other part of the information in the content memorized by the human-computer interaction application.
In a third aspect, the present technical solution provides a human-computer interaction apparatus, where the apparatus is included in an electronic device, and the apparatus has a function of implementing behaviors of the electronic device in the above aspect and possible implementation manners of the above aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above-described functions. Such as a display module or unit, a detection module or unit, etc.
In a fourth aspect, the present technical solution provides an electronic device, including: one or more processors; a memory; a plurality of application programs; and one or more computer programs. Wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions. The instructions, when executed by the electronic device, cause the electronic device to perform a method of human-computer interaction in any one of the possible implementations of any of the aspects above.
In a fifth aspect, the present disclosure provides an electronic device comprising one or more processors and one or more memories. The one or more memories are coupled to the one or more processors for storing computer program code comprising computer instructions that, when executed by the one or more processors, cause the electronic device to perform a method of human-computer interaction in any of the possible implementations of the above aspects.
In a sixth aspect, the present disclosure provides a computer storage medium including computer instructions, which, when executed on an electronic device, cause the electronic device to perform a method for human-computer interaction in any one of the possible implementations of any one of the above aspects.
In a seventh aspect, the present disclosure provides a computer program product, which when run on an electronic device, causes the electronic device to perform the method for human-computer interaction in any one of the possible designs of the above aspects.
Drawings
Fig. 1 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a software structure of an electronic device according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a set of display interfaces provided in an embodiment of the present application.
Fig. 4 is a schematic diagram of another group of display interfaces provided in an embodiment of the present application.
Fig. 5 is a schematic view of another set of display interfaces provided in the embodiments of the present application.
Fig. 6 is a schematic view of another set of display interfaces provided in the embodiments of the present application.
Fig. 7 is a schematic flowchart of a process of memory capture in a human-computer interaction process according to an embodiment of the present application.
Fig. 8 is a schematic flowchart of a flow of memory writing in a human-computer interaction process according to an embodiment of the present application.
Fig. 9 is a schematic flowchart of a method for human-computer interaction provided in an embodiment of the present application.
FIG. 10 is another schematic flow chart of a method for human-computer interaction provided by an embodiment of the present application.
Fig. 11 is a schematic block diagram of an electronic device provided in an embodiment of the present application.
Fig. 12 is another schematic block diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of this application and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of the present application, "at least one" and "one or more" mean one, two or more. The term "and/or" describes the association relationship of the associated objects and means that three relationships may exist; for example, A and/or B may represent: only A exists, both A and B exist, or only B exists, where A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Embodiments of an electronic device, a user interface for such an electronic device, and a method for using such an electronic device provided by embodiments of the present application are described below. In some embodiments, the electronic device may be a portable electronic device that also includes other functionality, such as personal digital assistant and/or music player functionality, for example a cell phone, a tablet, or a wearable electronic device with wireless communication capability (e.g., a smart watch). Exemplary embodiments of the portable electronic device include, but are not limited to, portable electronic devices equipped with various operating systems. The portable electronic device may also be another portable electronic device, such as a laptop computer. It should also be understood that in other embodiments, the electronic device may not be a portable electronic device but a desktop computer.
By way of example, fig. 1 shows a schematic structural diagram of an electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identity Module (SIM) card interface 195, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the electronic device 100. In other embodiments of the present application, electronic device 100 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 110 may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. Wherein the different processing units may be separate components or may be integrated in one or more processors. In some embodiments, the electronic device 101 may also include one or more processors 110. The controller can generate an operation control signal according to the instruction operation code and the time sequence signal to finish the control of instruction fetching and instruction execution. In other embodiments, a memory may also be provided in processor 110 for storing instructions and data. Illustratively, the memory in the processor 110 may be a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. This avoids repeated accesses and reduces the latency of the processor 110, thereby increasing the efficiency with which the electronic device 101 processes data or executes instructions.
In some embodiments, processor 110 may include one or more interfaces. The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit audio source (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface, etc. The USB interface 130 is an interface conforming to a USB standard specification, and may be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 101, and may also be used to transmit data between the electronic device 101 and a peripheral device. The USB interface 130 may also be used to connect to a headset to play audio through the headset.
It should be understood that the interface connection relationship between the modules illustrated in the embodiments of the present application is only an illustration, and does not limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The charging management module 140 is configured to receive a charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive charging input from a wired charger via the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100. The charging management module 140 may also supply power to the electronic device through the power management module 141 while charging the battery 142.
The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 141 may also be disposed in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may be disposed in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution including wireless communication of 2G/3G/4G/5G, etc. applied to the electronic device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 150 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the electromagnetic wave to the modem processor for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the same device as at least some of the modules of the processor 110.
The wireless communication module 160 may provide a solution for wireless communication applied to the electronic device 100, including Wireless Local Area Networks (WLANs) (e.g., wireless fidelity (Wi-Fi) networks), bluetooth (bluetooth, BT), global Navigation Satellite System (GNSS), frequency Modulation (FM), near Field Communication (NFC), infrared (IR), and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, perform frequency modulation and amplification on the signal, and convert the signal into electromagnetic waves via the antenna 2 to radiate the electromagnetic waves.
The electronic device 100 implements display functions via the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), and the like. In some embodiments, the electronic device 100 may include 1 or more display screens 194.
The display screen 194 of the electronic device 100 may be a flexible screen, which is currently attracting attention for its unique characteristics and great potential. Compared with a traditional screen, a flexible screen is highly flexible and bendable, can provide the user with new interaction modes based on this bendable characteristic, and can meet more of the user's requirements for the electronic device. For an electronic device equipped with a foldable display screen, the foldable display screen can be switched at any time between a small screen in the folded state and a large screen in the unfolded state. Therefore, users use the split-screen function more and more frequently on electronic devices equipped with foldable display screens.
The electronic device 100 may implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The ISP is used to process the data fed back by the camera 193. For example, when a user takes a picture, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, an optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing and converting into an image visible to the naked eye. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to be converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, electronic device 100 may include 1 or more cameras 193.
The digital signal processor is used for processing digital signals, and can process digital image signals and other digital signals. For example, when the electronic device 100 selects a frequency bin, the digital signal processor is used to perform fourier transform or the like on the frequency bin energy.
Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. Applications such as intelligent recognition of the electronic device 100 can be implemented by the NPU, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in the external memory card.
The internal memory 121 may be used to store one or more computer programs, including instructions. The processor 110 may execute the instructions stored in the internal memory 121, so as to enable the electronic device 101 to perform the off-screen display method provided in some embodiments of the present application, as well as various applications, data processing, and the like. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store an operating system; it may also store one or more applications (e.g., gallery, contacts, etc.), and the like. The data storage area may store data (such as photos and contacts) created during use of the electronic device 101, and the like. Further, the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic disk storage components, flash memory components, Universal Flash Storage (UFS), and the like. In some embodiments, the processor 110 may cause the electronic device 101 to execute the off-screen display method provided in the embodiments of the present application, and other applications and data processing, by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor 110.
The electronic device 100 may implement audio functions, such as music playing and recording, via the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
The pressure sensor 180A is used for sensing a pressure signal, and converting the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. The pressure sensor 180A can be of a wide variety, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like. The capacitive pressure sensor may be a sensor comprising at least two parallel plates having an electrically conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes. The electronic device 100 determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display screen 194, the electronic apparatus 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic apparatus 100 may also calculate the touched position from the detection signal of the pressure sensor 180A. In some embodiments, the touch operations that are applied to the same touch position but different touch operation intensities may correspond to different operation instructions. For example: and when the touch operation with the touch operation intensity smaller than the first pressure threshold value acts on the short message application icon, executing an instruction for viewing the short message. And when the touch operation with the touch operation intensity larger than or equal to the first pressure threshold value acts on the short message application icon, executing an instruction of newly building the short message.
The gyro sensor 180B may be used to determine the motion attitude of the electronic device 100. In some embodiments, the angular velocity of electronic device 100 about three axes (i.e., the X, Y, and Z axes) may be determined by gyroscope sensor 180B. The gyro sensor 180B may be used for photographing anti-shake. For example, when the shutter is pressed, the gyro sensor 180B detects a shake angle of the electronic device 100, calculates a distance to be compensated for by the lens module according to the shake angle, and allows the lens to counteract the shake of the electronic device 100 through a reverse movement, thereby achieving anti-shake. The gyroscope sensor 180B may also be used for navigation, somatosensory gaming scenes.
The acceleration sensor 180E may detect the magnitude of acceleration of the electronic device 100 in various directions (typically three axes). The magnitude and direction of gravity can be detected when the electronic device 100 is stationary. The method can also be used for recognizing the posture of the electronic equipment, and is applied to horizontal and vertical screen switching, pedometers and other applications.
The ambient light sensor 180L is used to sense ambient light brightness. Electronic device 100 may adaptively adjust the brightness of display screen 194 based on the perceived ambient light level. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touches.
The fingerprint sensor 180H is used to collect a fingerprint. The electronic device 100 can utilize the collected fingerprint characteristics to unlock the fingerprint, access the application lock, photograph the fingerprint, answer an incoming call with the fingerprint, and so on.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 implements a temperature-handling strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, the electronic device 100 heats the battery 142 when the temperature is below another threshold, to prevent the low temperature from causing the electronic device 100 to shut down abnormally. In still other embodiments, when the temperature is lower than a further threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown due to low temperature.
The touch sensor 180K is also referred to as a "touch panel". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation applied thereto or nearby. The touch sensor may communicate the detected touch operation to the application processor to determine the touch event type. Visual output associated with the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may be disposed on a surface of the electronic device 100, different from the position of the display screen 194.
The keys 190 include a power-on key, a volume key, and the like. The keys 190 may be mechanical keys. Or may be touch keys. The electronic apparatus 100 may receive a key input, and generate a key signal input related to user setting and function control of the electronic apparatus 100.
Fig. 2 is a block diagram of a software structure of the electronic device 100 according to the embodiment of the present application. The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom. The application layer may include a series of application packages.
As shown in FIG. 2, the application packages may include human interactive applications, galleries, calendars, calls, maps, navigation applications, and the like.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions.
As shown in FIG. 2, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and answered, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide communication functions of the electronic device 100. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables an application to display notification information in the status bar and can be used to convey notification-type messages, which can disappear automatically after a short stay without requiring user interaction. For example, the notification manager is used to notify that a download is complete, to give message alerts, and so on. The notification manager may also present notifications in the form of a chart or scroll-bar text in the status bar at the top of the system, such as notifications from applications running in the background, or notifications that appear on the screen in the form of a dialog window, for example prompting text information in the status bar, sounding a prompt tone, vibrating the electronic device, flashing an indicator light, and the like.
The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part consists of the functions that the Java language needs to call, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application layer and the application framework layer as binary files. The virtual machine is used for performing the functions of object life cycle management, stack management, thread management, safety and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), media libraries (media libraries), three-dimensional graphics processing libraries (e.g., openGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide a fusion of the 2D and 3D layers for multiple applications.
The media library supports a variety of commonly used audio, video format playback and recording, and still image files, among others. The media library may support a variety of audio-video encoding formats such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, etc.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
The human-computer interaction application in the application package may be a system-level application. The human-computer interaction application may also be called a human-computer interaction robot, a human-machine dialog robot, a chatbot (ChatBOT), and the like. The voice assistant application is one kind of human-computer interaction application and may also be called an intelligent assistant application, and the like. Human-computer interaction applications are now widely used on various electronic devices such as mobile phones, tablet computers and smart speakers, providing users with an intelligent voice interaction mode. The human-computer interaction robot is the core of human-computer interaction.
The full flow of a human-machine dialog can be realized by an automatic speech recognition (ASR) module, a natural language understanding (NLU) module, a dialog state tracking (DST) module, a dialog manager (DM) module, a natural language generation (NLG) module, and a text-to-speech (TTS) broadcasting module. The functions of these modules are as follows:
(1) ASR module
The primary role of the ASR module is to recognize the user's speech as textual content.
In the human-computer interaction application of the application layer shown in fig. 2, the leftmost element represents a segment of speech, which is processed by the ASR module and turned into the corresponding text. Thanks to the development of machine learning in recent years, the recognition accuracy of the ASR module has greatly improved, making speech interaction between a person and a machine possible; ASR is the true starting point of voice interaction. However, although the ASR module can determine what the user is saying, it cannot understand what the user means; understanding the semantics is handed over to the NLU module.
(2) NLU module
The main function of the NLU module is to understand the intention (intent) of the user and perform slot (slot) analysis.
Illustratively, the user says: "Book me a flight ticket from Beijing to Shanghai at 10 am tomorrow."
From this, the NLU module can parse the contents shown in table 1.
TABLE 1
Intent: book an air ticket
Slot "departure time": 10 am tomorrow
Slot "origin": Beijing
Slot "destination": Shanghai
The above example mentions two concepts, intent and slot, which are explained in detail below.
Intent
An intent can be understood as a classifier that determines what type of request the user is expressing, so that the request can then be handled by the program corresponding to that type. In one implementation, the "program corresponding to that type" may be a robot (Bot). For example, the user says: "Play a happy song for me." The NLU module classifies the user's intent as music, so a music robot (Bot) is summoned to recommend and play a song for the user. If the user listens, is not satisfied, and says "play another one", this music robot continues to serve the user, until the user expresses another request whose intent is no longer music, at which point another robot is switched in to serve the user.
Slot
After determining the user's intent, the NLU module needs to further understand the content of the dialog. For simplicity, only the most essential parts may be selected for understanding and the rest ignored; those most important parts are called slots.
In the "air ticket booking" example, three core slots are defined: "departure time", "origin" and "destination". If all the content the user needs to input to book an air ticket is considered comprehensively, more slots come to mind, such as the number of passengers, the airline, the departure airport, the landing airport, and so on. For the designer of a voice interaction, the defined slots are the starting point of the design.
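As a toy illustration of intent classification and slot extraction for the flight-booking example (real systems use trained classifiers and sequence taggers; the patterns below are assumptions made only for this sketch):

```python
# Toy, rule-based sketch of the NLU step: classify the intent and extract slot values.
import re

def nlu(text):
    intent = "book_flight" if ("ticket" in text or "flight" in text) else "unknown"
    slots = {}
    if m := re.search(r"from (\w+)", text):       # origin slot
        slots["origin"] = m.group(1)
    if m := re.search(r"to (\w+)", text):         # destination slot
        slots["destination"] = m.group(1)
    if m := re.search(r"\d{1,2} ?[ap]m\b.*", text):  # departure-time slot
        slots["departure_time"] = m.group(0)
    return intent, slots

# nlu("Book me a ticket from Beijing to Shanghai at 10 am tomorrow")
# -> ("book_flight", {"origin": "Beijing", "destination": "Shanghai",
#                     "departure_time": "10 am tomorrow"})
```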
(3) DST module and DM module
The DST module is mainly responsible for slot checking and merging, and the DM module is mainly responsible for sequential slot filling, clarification and disambiguation.
For example, the user says "help me book an air ticket from Beijing to Shanghai tomorrow". The NLU module can determine that the user's intent is "book an air ticket" and that the slots related to this intent are "departure time", "origin" and "destination". However, the sentence expressed by the user contains only two pieces of slot information, the origin and the destination, so the DST module finds that the slot information for the departure time is missing. The DST module can send the missing slot information to the DM module, and the DM module controls the NLG module to generate a dialog that asks the user for the missing slot information.
Exemplary, the user: i want to order an air ticket;
BOT: where is the destination asked?
The user: shanghai;
BOT: ask you about what time to order a flight to take off?
When the user supplements all slot position information completely in the intention of 'booking air ticket', the DM module can firstly skip the slots of all slot position information according to a preset sequence. For example, the order of filling the slots may be "departure time", "starting place", and "destination", wherein the corresponding slot information is "10 am", "beijing", and "shanghai", respectively.
After the slot filling is completed, the DM module may control the command execution module to perform an operation of "ordering an air ticket". For example, the command execution module may open the air ticket ordering App and display flight information from beijing to shanghai at 10 am (or around 10 am).
It should be understood that the calling and design of each module of the dialog manager are different in different dialog systems, and the DST module and the DM module can be considered as a whole to perform dialog state control and management. For example, if the user expresses a requirement of "booking tickets" but nothing is said, we need the dialogue system to ask the user for slot information that the user must know.
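For illustration, the following is a minimal sketch of the DST/DM behavior described above: check which required slots are still empty, and either ask the user to clarify the first missing one or hand the completed request over to the command execution module. The prompt texts and the instruction string are assumptions.

```python
# A minimal sketch of slot checking (DST-style) and slot-filling decisions (DM-style).

REQUIRED_SLOTS = ["departure time", "origin", "destination"]
PROMPTS = {
    "departure time": "At what time would you like the flight to take off?",
    "origin": "Where will you depart from?",
    "destination": "Where is the destination?",
}

def missing_slots(slots: dict) -> list:
    """DST-style check: return the required slots that have no value yet."""
    return [name for name in REQUIRED_SLOTS if not slots.get(name)]

def next_action(slots: dict) -> str:
    """DM-style decision: ask for the first missing slot, or execute the intent."""
    missing = missing_slots(slots)
    if missing:
        return PROMPTS[missing[0]]                     # clarification dialog for NLG/TTS
    return "EXECUTE: book air ticket " + str(slots)    # hand over to the command execution module

print(next_action({"origin": "Beijing", "destination": "Shanghai"}))
print(next_action({"departure time": "10 a.m.", "origin": "Beijing", "destination": "Shanghai"}))
```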
(4) NLG module
The main role of the NLG module is to generate dialogs.
For example, when the DM module determines that the "departure time" slot is missing, it may control the NLG module to generate the corresponding dialog "At what time would you like the flight to take off?".
Illustratively, when the command execution module completes the "book an air ticket" operation, it may notify the DM module that the operation is completed. The DM module may then control the NLG module to generate the corresponding dialog "The flight from Beijing to Shanghai at 10 a.m. tomorrow has been booked …".
(5) TTS module
The main role of the TTS module is to broadcast the dialog to the user.
TTS is a speech synthesis and broadcasting technology. Its main task is to handle the "prosody" of the broadcast, which requires judging and jointly considering symbols, polyphonic characters, sentence patterns and other information, and determining how the characters in the broadcast should be pronounced. It also needs to pay attention to timbre, to suit the different preferences of listeners. In general, both prosody and timbre need to be handled well.
To improve the quality of TTS broadcasting, a real person may be invited to record the standard template parts, so that the whole dialog system sounds more natural.
The core concern of a human-computer interaction system is the understanding of user semantics (i.e., NLU), which mainly solves the following problems:
Intent recognition: understand the intent expressed by the user and determine the type of requirement;
Slot parsing: understand the key information in the user's expression and determine the details of the requirement;
Dialog realization: design the dialog logic that satisfies the user's requirement, so that human-computer interaction is simple and smooth.
An existing human-computer interaction application only implements a memorization function and a recall function; it cannot use the already memorized information when generating an execution instruction according to the user's intent.
For example, the human-computer interaction application may extract and record some key user information from a user dialog, or record information through a memorization instruction actively initiated by the user.
BOT: "What should I call you?";
The user: "I am Catherine";
BOT: "I have remembered it. Nice to meet you, Catherine."
Illustratively, the user: "Remember that my car is parked at space B306 on level 3 of the underground parking lot";
BOT: "I have remembered it."
When the user asks the human-computer interaction application about some memorized information, the human-computer interaction application can answer it, for example:
The user: "Where is my car parked?";
BOT: "Your car is parked at space B306 on level 3 of the underground parking lot."
For example, the human-computer interaction application has already memorized the user's home address. When the user opens the application next time and asks for the home address, the electronic device can tell the user the address; but when the user says "navigate home", the application still needs to ask the user "Where is your home?".
It should be understood that the human-computer interaction application in the embodiment of the present application may also be understood as a human-computer interaction robot, a human-computer conversation robot, a voice assistant application, and the like.
According to the embodiments of this application, the memorization function of the human-computer interaction system is used to improve interaction efficiency: the human-computer interaction application can automatically generate an instruction from the already memorized content without asking the user again to clarify information related to the user's intent, which avoids frequent interaction between the electronic device and the user and improves human-computer interaction efficiency.
FIG. 3 illustrates a set of GUIs provided by an embodiment of the present application.
Referring to the GUI shown in (a) of fig. 3, the GUI is a desktop of a mobile phone. The GUI includes a plurality of application icons including a voice assistant icon 301. When the mobile phone detects that the user clicks the voice assistant icon 301 on the desktop, the voice assistant application may be launched to display a GUI as shown in (b) of fig. 3.
Referring to the GUI shown in (b) of fig. 3, the GUI is an interactive interface of a voice assistant. When the mobile phone detects that the user clicks the control 302, the mobile phone may detect the voice information of the user.
Referring to the GUI shown in (c) of FIG. 3, the GUI is another interactive interface of the voice assistant. When the mobile phone detects that the user's voice input is "navigate home", the mobile phone can convert the voice information into text.
Referring to the GUI shown in (d) of FIG. 3, the GUI is an interface of a map App. When the mobile phone detects the user's voice information, it can automatically obtain the user's home address, automatically open the map App and automatically navigate to the specific location of the home. Illustratively, the current location of the user is "No. 43 Zhaobawu Road" and the user's home address is "Kaixuan City".
It should be appreciated that the user's home address may have been saved before the mobile phone automatically obtains it.
Illustratively, the user has previously saved the home address in a conversation with the human-computer interaction application.
Illustratively, the mobile phone may use the home address saved in the map App, or request the home address from the server corresponding to the map App.
For example, the mobile phone may obtain the home address from keywords in the chat records of the chat App or in the contents of short messages in the messaging application.
For example, the mobile phone may obtain the home address from the user information stored on the cloud side by sending a query request for the home address to the cloud-side device. In one embodiment, the mobile phone can also remind the user, by text or by voice broadcast, that "A navigation route has been generated for you; the destination is Kaixuan City".
In one embodiment, after automatically obtaining the user's home address, the mobile phone may ask the user to confirm it, for example "Is your home address xxx?". When the mobile phone detects that the user has confirmed the home address, it automatically opens the map App and automatically navigates to the specific location of the home.
In one embodiment, after the mobile phone detects the user's voice information, it can automatically obtain the specific location of the home. After obtaining the specific location of the home, it can automatically generate an instruction and, according to the instruction, perform the operation of navigating to the specific location of the home.
In one embodiment, the specific location of the home has been stored in the mobile phone in advance; when the user says "navigate home", the mobile phone may first obtain the previously stored specific location of the home.
In one embodiment, the user may also wake up the voice assistant directly by voice, without opening the voice assistant application. For example, after the user opens the map App, the user says the wake-up word "Xiaozhi"; after detecting it, the mobile phone starts interacting with the user, and the voice assistant may respond "I am here. What can I help you with?". When the mobile phone detects that the user's voice input is "navigate home", the map App can display a navigation route from the current location of the mobile phone to the home address.
FIG. 4 illustrates another set of GUIs provided by an embodiment of the present application.
Referring to the GUI shown in (a) of FIG. 4, the GUI is another interactive interface of the voice assistant. When the mobile phone detects that the user says "my home address is Kaixuan City", the mobile phone can convert the voice information into text and display it on the interactive interface of the voice assistant.
It should be understood that the process of converting the voice information into text can be performed by the ASR module.
It should also be understood that the user may express "my home address is Kaixuan City" in text form or in voice form. If the information is expressed by voice, the BOT needs to convert the voice information into text through the ASR module; if it is expressed as text, no conversion through the ASR module is needed.
Referring to the GUI shown in (b) of FIG. 4, the GUI is another interactive interface of the voice assistant. When the mobile phone determines that the user's expression is "my home address is Kaixuan City", it can save the home address in the mobile phone, generate the dialog "OK, I have remembered it" through the NLG module, and broadcast the dialog by voice through the TTS module.
It should be understood that the mobile phone can also remind the user "OK, I have remembered it" in text form. When text is used, the dialog generated by the NLG module can be displayed to the user directly; when voice broadcast is used, the dialog generated by the NLG module needs to be broadcast to the user through the TTS module.
In the embodiments of this application, the mobile phone may obtain the information to be memorized from the conversation with the user. Illustratively, it may be obtained in the following two conversation scenarios:
Scenario 1: in an ordinary human-computer conversation, the human-computer interaction application determines which information needs to be memorized.
For example, the mobile phone may be preconfigured with the types of information to be memorized, such as the mobile phone number, ID card number, home address and company address. When such information appears in the user's expressions during the conversation with the human-computer interaction application, the mobile phone can record the information.
Scenario 2: the user actively initiates a memorization instruction.
For example, when the user says "please note that my home address is Kaixuan City", the mobile phone can determine from the words "please note" in the user's expression that the user is actively initiating a memorization instruction, and the mobile phone then records the home address information.
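For illustration, the following is a minimal sketch covering both scenarios: information whose type is preconfigured as memorable (Scenario 1) and information marked by an explicit "please note" style instruction (Scenario 2). The regular-expression patterns and item names are assumptions made for the example sentences above.

```python
# A minimal sketch of deciding what to memorize during a conversation.

import re

MEMORABLE_PATTERNS = {
    "user's mobile phone number": re.compile(r"my (mobile )?phone number is (?P<value>[\dxX]+)"),
    "user's home address": re.compile(r"my home address is (?P<value>.+)"),
}

def extract_memory_items(utterance: str) -> dict:
    items = {}
    explicit = utterance.lower().startswith("please note")          # Scenario 2: explicit instruction
    for name, pattern in MEMORABLE_PATTERNS.items():                # Scenario 1: preconfigured types
        match = pattern.search(utterance)
        if match:
            items[name] = match.group("value").strip().rstrip(".")
    # An explicit instruction whose type is not recognized could still be stored as raw text.
    if explicit and not items:
        items["user note"] = utterance
    return items

print(extract_memory_items("please note that my home address is Kaixuan City"))
print(extract_memory_items("my mobile phone number is 187xxxx"))
```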
Referring to the GUI shown in (c) of FIG. 4, the GUI is another interactive interface of the voice assistant. At some time after the mobile phone has saved the user's home address, the mobile phone detects that the user opens the voice assistant application again, and detects that the user expresses "navigate home" in voice or text form.
Referring to the GUI shown in (d) of FIG. 4, the GUI is an interface of the map App. The mobile phone can determine the user's intent and the slot information related to the intent through the NLU module. For example, from the user's expression "navigate home", the mobile phone determines that the user's intent is "navigation" and that the slot related to the intent is "destination". The DST module can determine that no slot information corresponding to the intent is missing. After the intent and the related slot information are confirmed to be complete, the mobile phone can automatically open the map App and automatically generate a navigation route from the user's current location to the user's home.
In one embodiment, after the DST module determines that the intent and the related slot information are complete, it sends the intent and the related slot information to the DM module. The DM module combines the intent with the related slot information, automatically generates a navigation instruction and sends it to the command execution module. The command execution module may automatically open the map App and display a navigation route from the user's current location to home. Meanwhile, the NLG module generates the dialog "A navigation route home has been generated for you; the destination is Kaixuan City", which is shown to the user as text or broadcast to the user through the TTS module.
It should be understood that in the above embodiment, when the mobile phone determines that the user's intent is "navigation", the slot information related to the intent may be only "destination", or it may include both "origin" and "destination". When the only slot corresponding to the intent is "destination", the mobile phone may by default automatically obtain the current location of the phone, for example by positioning, and use this automatically obtained location as the "origin". The navigation route from the current location of the phone to the home address is then displayed using the automatically obtained current location and the home address stored in the phone.
When the slots corresponding to the intent include "origin" and "destination", the user's expression contains only one slot related to the intent (the "destination"), and the "origin" slot is missing. The DST module may send the missing information to the DM module, and the DM module controls the NLG module to generate a corresponding dialog. Illustratively, the dialog generated by the NLG module is "Where will you depart from?", and the TTS module may broadcast it to the user by voice. When the mobile phone detects that the user's expression is "my departure place is my company", the NLU module may determine that the other slot, "origin", is the user's company. If the mobile phone has previously stored the user's company address, the DST module may determine that the slot information related to the intent is complete and notify the DM module. The DM module may notify the command execution module to generate a navigation instruction, so that the mobile phone can automatically open the map App and display a navigation route from the user's company to the user's home.
In one embodiment, the slots related to the user's intent "navigation" may include "origin", "destination" and "transportation mode". The "origin" slot may be obtained by reading the current location of the phone by default; the "transportation mode" may default to driving or be obtained by asking the user for clarification; the "destination" slot is obtained by searching for the user's home address stored locally on the phone or saved in the map App.
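For illustration, the following is a minimal sketch of this slot filling with defaults for the "navigation" intent; get_current_location(), the default transportation mode and the memory lookup are hypothetical placeholders.

```python
# A minimal sketch of filling navigation slots with defaults and memorized content.

def get_current_location() -> str:
    return "No. 43 Zhaobawu Road"          # e.g., obtained by positioning; placeholder value

MEMORY = {"user's home address": "Kaixuan City"}

def fill_navigation_slots(parsed_slots: dict) -> dict:
    slots = dict(parsed_slots)
    # "origin": read the phone's current location by default if the user did not say it.
    slots.setdefault("origin", get_current_location())
    # "transportation mode": default to driving; a real DM could instead ask the user.
    slots.setdefault("transportation mode", "driving")
    # "destination": look up the memorized home address when the user said "home".
    if slots.get("destination") == "home":
        slots["destination"] = MEMORY.get("user's home address", "home")
    return slots

print(fill_navigation_slots({"destination": "home"}))
```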
A set of GUIs of an embodiment of this application is described above with reference to FIG. 4. The GUI shown in (a) of FIG. 4 may correspond to a memorized-content acquisition process, which may include:
(1) The user initiates a human-computer conversation and interacts with the human-computer interaction application;
(2) The human-computer interaction application extracts from the conversation the user-related information that needs to be memorized, such as the user's mobile phone number, ID card number, home address and company address;
(3) The human-computer interaction application records the user-related information.
The GUI shown in (d) of FIG. 4 may correspond to a memorized-content usage process, which may include:
(1) The user initiates a human-computer conversation and interacts with the human-computer interaction application;
(2) The human-computer interaction application recognizes the user's intent and then searches the memorized content;
(3) The human-computer interaction application finds that the memorized content contains user information related to the user's intent, and extracts this information from the memory;
(4) The human-computer interaction application performs an operation related to the intent.
In one embodiment, the human-computer interaction application performing the operation related to the intent includes the following (a possible sketch is given after this list):
the human-computer interaction application combines the user's information with the user's intent to generate an instruction, for example by filling the information into the interface corresponding to the user's intent and generating an instruction;
the human-computer interaction application executes the operation related to the instruction according to the instruction.
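For illustration, the following is a minimal sketch of combining the retrieved user information with the intent into an instruction for the command execution module; the instruction format and field names are assumptions made here.

```python
# A minimal sketch of generating an instruction from the intent plus the filled-in user information.

def build_instruction(intent: str, info: dict) -> dict:
    """Fill the retrieved information into the fields required by the intent."""
    if intent == "navigation":
        return {"action": "open_map_and_navigate",
                "origin": info["origin"],
                "destination": info["destination"]}
    if intent == "send mobile phone number":
        return {"action": "open_chat_and_send",
                "recipient": info["recipient"],
                "content": info["user's mobile phone number"]}
    raise ValueError("unsupported intent: " + intent)

def execute(instruction: dict) -> None:
    # Stand-in for the command execution module; it would call the target App here.
    print("executing", instruction)

execute(build_instruction("navigation",
                          {"origin": "No. 43 Zhaobawu Road", "destination": "Kaixuan City"}))
```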
It should be understood that, during memory retrieval, if the user information related to the intent includes multiple items and the human-computer interaction application has memorized only some of them, the application needs to obtain the unmemorized items from the user, for example by asking the user for clarification again.
FIG. 5 illustrates another set of GUIs provided by an embodiment of the present application.
Referring to the GUI shown in (a) of FIG. 5, the GUI is another interactive interface of the voice assistant. The mobile phone detects that the user clicks the control 401 and that the user says "my mobile phone number is 187xxxx"; after detecting the user's voice information, the mobile phone can convert it into text and display it on the interactive interface of the voice assistant.
Referring to the GUI shown in (b) of FIG. 5, the GUI is another interactive interface of the voice assistant. When the mobile phone determines that the user's expression is "my mobile phone number is 187xxxx", it can save the user's mobile phone number in the phone, generate the dialog "OK, I have remembered it" through the NLG module, and broadcast the dialog by voice through the TTS module.
Referring to the GUI shown in (c) of FIG. 5, the GUI is another interactive interface of the voice assistant. At some time after the mobile phone has saved the user's mobile phone number, the mobile phone detects that the user opens the voice assistant application again, and detects that the user expresses "send my mobile phone number to Xiaoming through the chat App" in voice or text form.
Referring to the GUI shown in (d) of FIG. 5, the GUI is an interface of the chat App. The mobile phone can determine the user's intent and the related slot information through the NLU module. For example, from the user's expression "send my mobile phone number to Xiaoming through the chat App", the mobile phone determines that the intent is "send mobile phone number" and that the related slots are "mobile phone number", "sending method" and "recipient". The DST module may determine that the "sending method" and "recipient" slots are not missing, and the "mobile phone number" slot can be obtained from the memorized content. After the intent and the related slot information are confirmed to be complete, the mobile phone can automatically open the chat App, find the recipient and automatically send the user's mobile phone number to the recipient.
In one embodiment, after the DST module determines that the intent and the related slot information are complete, it sends them to the DM module. The DM module combines the intent with the related slot information, automatically generates a "send mobile phone number" instruction and sends it to the command execution module. The command execution module automatically opens the chat App, finds the recipient Xiaoming in the contact list of the chat App, and automatically sends the user's mobile phone number to Xiaoming.
In one embodiment, the NLG module may also generate the dialog "Your mobile phone number has been sent to Xiaoming", which is shown to the user as text or broadcast to the user through the TTS module.
FIG. 6 illustrates another set of GUIs provided by embodiments of the present application.
Referring to the GUI shown in (a) of FIG. 6, the GUI is another interactive interface of the voice assistant. When the mobile phone detects that the user says "my home address is No. 6 Software Avenue", the mobile phone can convert the voice information into text and display it on the interactive interface of the voice assistant. The mobile phone can also display "OK, I have remembered it" as text, or broadcast "OK, I have remembered it" by voice.
Referring to the GUI shown in (b) of FIG. 6, the GUI is another interactive interface of the voice assistant. The user may move house after some time. At this time, the mobile phone detects that the user says "my home address is Kaixuan City", converts the voice information into text and displays it on the interactive interface of the voice assistant. The mobile phone can also display "OK, I have remembered it" as text, or broadcast it by voice.
Referring to the GUI shown in (c) of FIG. 6, the GUI is another interactive interface of the voice assistant. The mobile phone again detects that the user opens the voice assistant application, and detects that the user expresses "navigate home" in voice or text form. Since the mobile phone has previously stored two home addresses ("Kaixuan City" and "No. 6 Software Avenue"), it may display a reminder window 601 containing the text "Two home addresses have been retrieved for you; please select one". When the mobile phone detects that the user clicks the control 602, or detects that the user says "my home address is Kaixuan City", the mobile phone displays the GUI shown in (d) of FIG. 6.
Referring to the GUI shown in (d) of FIG. 6, the GUI is an interface of the map App. The mobile phone can automatically open the map App and automatically generate a navigation route from the user's current location to Kaixuan City.
In one embodiment, when the mobile phone detects that the user clicks the control 603, it may determine that neither "Kaixuan City" nor "No. 6 Software Avenue" is the "destination". At this time, the mobile phone may remind the user by text or by voice broadcast: "Where is your home?". After obtaining the home address entered by the user in text or voice, the mobile phone can automatically open the map App and automatically generate a navigation route from the user's current location to the home.
FIG. 3 to FIG. 6 show several sets of GUIs of embodiments of this application, which illustrate that the human-computer interaction application can use the saved or memorized content during human-computer interaction, without asking the user again to clarify information related to the intent, thereby improving the efficiency of human-computer interaction. FIG. 3 to FIG. 6 are merely illustrative; Table 2 shows other examples of memorization instructions for user-related information and of scenarios that use the memorized content.
TABLE 2 Examples of memorization instructions for user-related information and of intelligent scenarios using the memorized content
According to the human-computer interaction method of the embodiments of this application, the human-computer interaction application can automatically generate an instruction using the saved content or the existing memorized content, without asking the user again to clarify information related to the user's intent, thereby improving human-computer interaction efficiency.
In the sets of GUIs shown in FIG. 3 to FIG. 6, the user's information is memorized and saved during the interaction between the human-computer interaction application (or the voice assistant) and the user. When a certain intent of the user is to be fulfilled later, if user information corresponding to the intent is missing, the human-computer interaction application can search for the missing information in the memorized content. The embodiments of this application are not limited to information saved during interaction between the human-computer interaction application (or the voice assistant) and the user; the missing user information can also be found in other ways.
Illustratively, the user has previously received flight information for a booked flight (e.g., via a short message or another application). When the mobile phone detects that the user says "navigate to the airport", the human-computer interaction application can automatically search the short messages or other applications. If the corresponding airport information is found, the map App can be opened automatically and a navigation route from the current location to the airport displayed. If the human-computer interaction application finds only the departure city in the flight information (assuming the city has multiple airports), the user may be prompted to select one of the airports in that city.
Illustratively, the user has recorded Zhang San's home address as "No. 8 Science and Technology Road" in the notepad application. When the mobile phone detects that the user says "navigate to Zhang San's home", the human-computer interaction application can automatically search the information saved in the notepad application. If "Zhang San's home address" is found, the map App can be opened automatically and a navigation route from the current location to Zhang San's home displayed.
In one embodiment, after determining the user's intent, the human-computer interaction application may analyze the information saved in other applications in real time, determine whether that information contains the missing user information, and perform the operation related to the user's intent after finding the missing user information in the information saved by the other applications.
In one embodiment, after the electronic device detects information related to another application (for example, a short message received by the messaging application, or information entered by the user in the notepad application), the electronic device may analyze that information in advance, and store the analysis result in the storage space corresponding to the other application, in the storage space corresponding to the human-computer interaction application, or in another storage space (for example, on the cloud side or in a server).
The internal implementation by which the human-computer interaction application automatically generates an instruction using the saved content or the existing memorized content is described below.
Fig. 7 is a schematic flowchart of a process 700 of memory retrieval during human-computer interaction according to an embodiment of this application. As shown in FIG. 7, the process 700 includes:
S710: it is detected that the user initiates a human-computer conversation.
For example, after detecting that the user clicks the icon of the human-computer interaction application (e.g., the voice assistant) on the desktop of the electronic device, the electronic device opens the human-computer interaction application and displays its interface. Specifically, after the electronic device detects the click, the human-computer interaction application at the application layer sends the label corresponding to the application (e.g., a process identifier (PID)) and the process name corresponding to the application to the system service module at the framework layer, and the system service module can determine from the label and the process name which App has been started. For example, the electronic device determines from the process identifier and the process name that the human-computer interaction application has been started, and thereby determines that the user has initiated a human-computer conversation.
The technical solutions of the embodiments of this application can be applied to human-computer conversations in voice interaction mode, in text interaction mode, and in mixed interaction mode (for example, one party uses voice and the other uses text).
S720: if the human-computer conversation uses voice interaction, the ASR module converts the user's voice into text.
It should be understood that this step is not required if it is a man-machine conversation for text interaction.
S730: the NLU module performs semantic recognition and outputs first information and one or more pieces of second information related to the first information.
For example, the first information may indicate the user's intent, and the second information may indicate user information related to the user's intent. For example, the second information may be the slot information described above.
Illustratively, the dialog text may be input to the NLU module for semantic recognition, and the NLU module may output the first information and the one or more pieces of second information related to it.
Table 3 shows, by way of example, correspondence of several kinds of dialog texts, the first information, and the second information.
TABLE 3 correspondence of dialog text, first information and second information
S740: after obtaining the first information and the one or more pieces of second information related to it, the dialog system determines whether any second information is missing.
S741: if no second information is missing, an instruction can be generated directly from the first information and the related second information, and sent to the command execution module for execution.
S742: if some second information is missing, a retrieval request is sent to the memory management module, requesting retrieval of the missing second information.
It should be understood that the dialog system may include the DST module, the DM module, the NLG module, and the TTS module.
Illustratively, the user may say "please send my mobile phone number to Xiaoming through WeChat". The dialog system may determine that the first information is "send mobile phone number" and that the pieces of second information are "sending method", "recipient" and "user's mobile phone number". From the user's expression, the dialog system can determine that the "sending method" is "WeChat" and the "recipient" is "Xiaoming", while the "user's mobile phone number" is missing. The dialog system may retrieve the missing user information from the memory management module; for example, it may send the memory management module a retrieval request for the "user's mobile phone number".
S743: the memory management module searches the memory items in the memory database for the missing user information.
Table 4 shows information of memory items held in a memory database.
TABLE 4 Memory items saved in the memory database
Memory item ID | Memory item name | Memory item content
1 | User's name | Li Si
2 | User's home address | Kaixuan City
3 | User's company address | No. 43 Zhaobawu Road
4 | User's mobile phone number | 187xxxx
5 | Zhang San's nickname | Brother Qiang
6 | Zhang San's mobile phone number | 182xxxx
7 | Zhang San's home address | No. 20 West Avenue
For example, after receiving the retrieval request, the memory management module may determine that the request asks it to retrieve the "user's mobile phone number" from the memory database, and the memory management module can obtain this user information by looking up the memory database.
S744: the memory management module sends a retrieval response to the dialog system; the response contains the queried user-related information.
For example, the memory management module may send the content of the memory item corresponding to the "mobile phone number of the user" to the dialog system.
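For illustration, the following is a minimal sketch of this retrieval step, using the memory items of Table 4; storing the items as a list of dictionaries is an assumption made here rather than the actual schema of the memory database.

```python
# A minimal sketch of the memory management module handling a retrieval request.

MEMORY_DB = [
    {"id": 1, "name": "user's name",                "content": "Li Si"},
    {"id": 2, "name": "user's home address",        "content": "Kaixuan City"},
    {"id": 4, "name": "user's mobile phone number", "content": "187xxxx"},
]

def retrieve(requested_names: list) -> dict:
    """Return the memory items found for the requested names; missing names are omitted."""
    found = {}
    for item in MEMORY_DB:
        if item["name"] in requested_names:
            found[item["name"]] = item["content"]
    return found

# The dialog system asks for the missing slot; the response carries the stored content.
response = retrieve(["user's mobile phone number"])
print(response)  # the dialog system can now fill the missing second information
```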
S750: after the dialog system obtains the saved memory item content sent by the memory management module, it determines again whether any second information is still missing.
It should be understood that S750 is optional. For example, if the dialog system in S742 asks the memory management module to retrieve two missing items and the response in S744 carries only one of them, the dialog system can directly ask the user for the other missing item.
S751: if some second information is still missing, the user may be asked again for clarification.
If the dialog system determines that some second information is still missing, the NLG module may generate a dialog asking the user for the missing second information, and remind the user in text form or broadcast the dialog to the user through the TTS module.
S752: the ASR module converts the user's voice information into text.
S753: the NLU module parses the text content to obtain the missing second information.
S754: the dialog system obtains the missing second information sent by the NLU module, and thus obtains the complete first information and the second information related to the first information.
After seeing the text prompt or hearing the query voice of the human-computer interaction application, the user can answer with the queried second information. After the human-computer interaction application detects the user's voice answer, the ASR module can convert the corresponding voice information into text and send the text to the NLU module; the NLU module can parse the text to obtain the missing second information and send it to the dialog system.
Illustratively, the user may say "please send my mobile phone number and ID card number to Xiaoming". The dialog system may determine that the first information is "send mobile phone number" and that the pieces of second information are "recipient", "user's ID card number" and "user's mobile phone number". From the user's expression, the dialog system can determine that the "recipient" is "Xiaoming", while the "user's ID card number" and the "user's mobile phone number" are missing. The dialog system may retrieve the missing user information from the memory management module; for example, it may send the memory management module a retrieval request for the two items "user's ID card number" and "user's mobile phone number". However, the memory database stores only the "user's mobile phone number" and not the "user's ID card number". After receiving the "user's mobile phone number" returned by the memory management module, the dialog system can determine that the "user's ID card number" is still missing.
The dialog system can control the NLG module to generate the dialog "What is your ID card number?", and remind the user in text form or broadcast the dialog to the user through the TTS module. When the user says "my ID card number is 123xxxx", the NLU module can determine that the missing user information is "123xxxx".
Alternatively, when the dialog system determines that the "user's ID card number" is still missing, it may control the NLG module to generate the dialog "What are your mobile phone number and ID card number?", and remind the user in text form or broadcast the dialog to the user through the TTS module. In this case, because the user information is insufficient, the dialog system must ask the user anyway; the query can then also cover other user information (for example, the "user's mobile phone number" already stored in the memory database), so as to ensure the accuracy of the user information.
S755: if no second information is missing, the dialog system generates an instruction and sends the instruction to the command execution module.
S760: the command execution module executes the operation related to the first information according to the instruction.
After the dialog system obtains the complete first information and the second information related to the first information, it may generate a corresponding instruction and send it to the command execution module, and the command execution module may execute the operation related to the first information according to the instruction.
Illustratively, the instruction generated by the dialog system may include three parts: (1) open WeChat; (2) find the recipient "Xiaoming" in the WeChat contact list; (3) send the user's mobile phone number and ID card number to Xiaoming. After receiving the instruction, the command execution module automatically opens WeChat, finds the contact Xiaoming through the WeChat contact list, and sends the user's mobile phone number and ID card number to Xiaoming on the chat interface with Xiaoming.
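For illustration, the following is a minimal sketch of such a three-part instruction and of how a command execution module might run it step by step; open_app(), find_contact() and send_message() are hypothetical placeholders for calls into the target App, not real APIs.

```python
# A minimal sketch of executing a three-part "send mobile phone number" instruction.

def open_app(name: str):
    print("opening", name)

def find_contact(app: str, contact: str):
    print("looking up", contact, "in the", app, "contact list")
    return contact

def send_message(app: str, contact: str, content: str):
    print("sending to", contact, "via", app, ":", content)

instruction = [
    ("open_app", {"name": "WeChat"}),
    ("find_contact", {"app": "WeChat", "contact": "Xiaoming"}),
    ("send_message", {"app": "WeChat", "contact": "Xiaoming",
                      "content": "mobile phone number 187xxxx, ID card number 123xxxx"}),
]

HANDLERS = {"open_app": open_app, "find_contact": find_contact, "send_message": send_message}

for step, args in instruction:
    # Dispatch each part of the instruction to the corresponding operation.
    HANDLERS[step](**args)
```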
In one embodiment, after retrieving the missing user information in the normal way (e.g., by asking the user for clarification again), the dialog system may send this information to the memory management module, and the memory management module adds it to the memory database.
Table 5 shows information of memory items held in another memory database.
TABLE 5 Memory items saved in the memory database
Memory item ID | Memory item name | Memory item content
1 | User's name | Li Si
2 | User's home address | Kaixuan City
3 | User's company address | No. 43 Zhaobawu Road
4 | User's mobile phone number | 187xxxx
5 | User's ID card number | 123xxxx
6 | Zhang San's nickname | Brother Qiang
7 | Zhang San's mobile phone number | 182xxxx
8 | Zhang San's home address | No. 20 West Avenue
The dialog system inputs the collected user information to the memory management module, and the memory management module identifies the user-related information to be memorized and saves it in the memory database. Illustratively, compared with Table 4, the memory database now also contains the user information "user's ID card number".
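For illustration, the following is a minimal sketch of this memory-writing step: information obtained by asking the user for clarification is handed to the memory management module and added as a new memory item (compare Tables 4 and 5). The dictionary-based storage is an assumption made here.

```python
# A minimal sketch of writing a newly clarified item into the memory database.

MEMORY_DB = {
    "user's mobile phone number": "187xxxx",
}

def write_memory(name: str, content: str) -> None:
    """Add or update a memory item so later conversations can reuse it without asking again."""
    MEMORY_DB[name] = content

# After the user answers "my ID card number is 123xxxx", the dialog system stores it.
write_memory("user's ID card number", "123xxxx")
print(MEMORY_DB)
```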
According to the human-computer interaction method of the embodiments of this application, the human-computer interaction application can automatically generate an instruction using the saved content or the existing memorized content, without asking the user again to clarify information related to the user's intent, thereby improving human-computer interaction efficiency.
Fig. 8 is a schematic flowchart of a process 800 of memory writing during human-computer interaction according to an embodiment of this application. As shown in FIG. 8, the process 800 includes:
S810: it is detected that the user initiates a human-computer conversation.
It should be understood that S810 may refer to the process of S710; for brevity, it is not described again here.
S820: if the human-computer conversation uses voice interaction, the ASR module converts the user's voice into text.
S830: the NLU module performs semantic recognition and outputs one or more pieces of second information, or outputs first information and one or more pieces of second information related to the first information.
Illustratively, the user's expression during the dialog may not contain an intent. In that case the NLU module may output only second information, and the second information may be user information.
Illustratively, the user's expression is "please note that my home address is Kaixuan City". The NLU module can determine that the user's home address is Kaixuan City.
Illustratively, the user's expression is "Zhang San's mobile phone number is 182xxxx". The NLU module can determine that Zhang San's mobile phone number is 182xxxx.
Illustratively, the user's expression during the dialog may contain an intent. In that case the NLU module may output first information indicating the user's intent and one or more pieces of second information related to the first information, and the second information may be user information related to the user's intent.
Illustratively, the user's expression is "please send my mobile phone number to Xiaoming by WeChat". The NLU module may determine that the user's intent is "send mobile phone number" and that the user information related to the intent is the "user's mobile phone number". At this time, if the "user's mobile phone number" needs to be clarified, the dialog system may control the NLG module to generate a corresponding dialog, for example "What is your mobile phone number?", which can be shown to the user as text or broadcast to the user by the TTS module. When the user says "my mobile phone number is 187xxxx", the ASR module can convert the voice information into text and send it to the NLU module, and the NLU module can determine the user information "user's mobile phone number".
S840: if the NLU module outputs first information and one or more pieces of second information, the dialog system may determine whether any second information is missing.
S841: if any second information is missing, the dialog system may continue to ask the user for clarification.
S842: if the user answers by voice, the ASR module converts the voice information into text.
S843: the ASR module sends the text to the NLU module, and the NLU module performs semantic recognition on the text to obtain the missing second information.
S844: the NLU module sends the missing second information to the dialog system, and the dialog system obtains the complete first information and the second information related to the first information.
S850: the dialog management module inputs the collected second information to the memory management module.
S860: the memory management module identifies the second information to be memorized and saves it in the memory database.
It should be understood that after S844 the dialog system may also generate a corresponding instruction and send the instruction to the command execution module (not shown in fig. 8).
According to the human-computer interaction method of the embodiments of this application, the human-computer interaction application saves some user information in advance and uses the saved user information to automatically generate an instruction, without querying the user or asking for clarification again, thereby improving human-computer interaction efficiency.
With reference to the foregoing embodiments and the accompanying drawings, the embodiments of this application provide a human-computer interaction method, which can be implemented in an electronic device (e.g., a mobile phone or a tablet computer) as shown in FIG. 1 and FIG. 2. As shown in FIG. 9, the method may include the following steps:
S910: the human-computer interaction application obtains a first sentence input by the user.
Illustratively, referring to (c) in fig. 3, the human-computer interaction application (voice assistant) acquires the first sentence "navigate home" input by the user.
For example, the user may interact with the human-computer interaction application by text input; for example, the human-computer interaction application may detect that the user types "navigate home" on the keyboard.
For example, the user may interact with the human-computer interaction application by voice; for example, the human-computer interaction application may detect that the user's voice input is "navigate home".
Illustratively, referring to (c) of FIG. 5, the human-computer interaction application (voice assistant) obtains the first sentence "send my mobile phone number to Xiaoming through the chat App" input by the user.
S920: the human-computer interaction application parses the first sentence to obtain first information; the first information indicates the user's intent, the first information corresponds to one or more pieces of second information, and the one or more pieces of second information are used to realize the user's intent.
For example, after the human-computer interaction application (voice assistant) obtains the first sentence "navigate home" input by the user, it may determine that the user's intent is "navigation", and that realizing the "navigation" intent requires the "home address"; alternatively, it requires the two pieces of information "origin" and "home address", and the "origin" can be obtained by default.
For example, after the human-computer interaction application (voice assistant) obtains the first sentence "send my mobile phone number to Xiaoming through the chat App" input by the user, it may determine that the user's intent is "send mobile phone number", and that realizing the "send mobile phone number" intent requires the three pieces of information "recipient", "sending method" and "user's mobile phone number".
S930: when at least one of the one or more pieces of second information is missing, the human-computer interaction application searches for the missing second information in the content memorized by the human-computer interaction application.
Illustratively, after obtaining the first sentence "navigate home" input by the user, the human-computer interaction application (voice assistant) determines that the "home address" is missing, and the application can search for the "home address" in the previously memorized content.
Illustratively, as shown in (b) of FIG. 4, the "home address" was previously saved as "Kaixuan City" during the interaction between the human-computer interaction application and the user.
Illustratively, after obtaining the first sentence "send my mobile phone number to Xiaoming through the chat App" input by the user, the human-computer interaction application (voice assistant) determines that the "user's mobile phone number" is missing, and the application can search for the "user's mobile phone number" in the previously memorized content.
Illustratively, as shown in (b) of FIG. 5, the information that the user's mobile phone number is "187xxxx" was previously saved during the interaction between the human-computer interaction application and the user.
S940: the human-computer interaction application performs the operation related to the user's intent according to the first information and the one or more pieces of second information.
For example, as shown in (d) of FIG. 3 and (d) of FIG. 4, the human-computer interaction application may automatically open the map App and automatically display a navigation route from the current location to Kaixuan City.
For example, as shown in (d) of FIG. 5, the human-computer interaction application may automatically open the chat App and automatically send the user's mobile phone number to Xiaoming.
In some possible implementations, before the human-computer interaction application obtains the first sentence input by the user, the method further includes:
the human-computer interaction application obtains a second sentence input by the user;
the human-computer interaction application parses the second sentence to obtain at least one piece of second information;
the human-computer interaction application saves the at least one piece of second information.
For example, as shown in (b) of FIG. 4, the human-computer interaction application obtains the sentence "my home address is Kaixuan City" input by the user in text or voice; the application can parse the sentence to obtain the user information, namely that the home address is Kaixuan City, and may save this user information in the memory database.
For example, as shown in (b) of FIG. 5, the human-computer interaction application obtains the sentence "my mobile phone number is 187xxxx" input by the user in text or voice; the application can parse the sentence to obtain the user information, namely that the "user's mobile phone number" is "187xxxx", and may save this user information in the memory database.
In some possible implementations, the second sentence includes a memorization instruction actively initiated by the user.
Illustratively, as shown in (b) of FIG. 4, the second sentence may also be "please note that my home address is Kaixuan City".
Illustratively, as shown in (b) of FIG. 5, the second sentence may also be "please note that my mobile phone number is 187xxxx".
In the embodiments of this application, the human-computer interaction application may memorize the user information after receiving a memorization instruction initiated by the user; alternatively, the types of user information to be memorized may be preset in the human-computer interaction application, and when the user information in the second sentence matches one of the preset types, the human-computer interaction application memorizes the user information.
In some possible implementations, the first information corresponds to multiple pieces of second information, at least two of which are missing, and the human-computer interaction application searching the content memorized by the human-computer interaction application for the missing second information includes:
the human-computer interaction application finds part of the at least two missing pieces of second information in the content memorized by the human-computer interaction application;
the method further includes:
the human-computer interaction application generates a dialog for reminding the user to input the other part of the at least two pieces of second information;
the human-computer interaction application sends the dialog information to the user;
the human-computer interaction application obtains a third sentence input by the user;
the human-computer interaction application parses the third sentence, and the third sentence includes the other part of the at least two pieces of second information.
Illustratively, the user's expression is "please send my mobile phone number and ID card number to Xiaoming". The human-computer interaction application may determine that the first information is "send mobile phone number" and that the pieces of second information are "recipient", "user's ID card number" and "user's mobile phone number". From the user's expression, the application can determine that the "recipient" is "Xiaoming", while the "user's ID card number" and the "user's mobile phone number" are missing. The application can retrieve the missing user information from the memorized content, for example by searching the memory database for the two items "user's ID card number" and "user's mobile phone number". However, the memory database stores only the "user's mobile phone number" and not the "user's ID card number". After the search is completed, the human-computer interaction application can determine that the "user's ID card number" is still missing.
The human-computer interaction application generates the dialog "What is your ID card number?", and reminds the user in text form or broadcasts the dialog by voice. When the user says "my ID card number is 123xxxx", the human-computer interaction application can determine that the missing user information is "123xxxx".
In some possible implementations, the method further includes:
the man-machine interaction application stores another part of the at least two pieces of information.
For example, the human-computer interaction application may save "user's identification number" as "123 xxx" in the memory database.
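The retrieve-then-ask behaviour described above can be sketched in a few lines of code. The snippet below is only a minimal illustration under assumed names: the dictionary-based memory store, the slot names, and the prompt wording are not taken from this application, and a real dialog system would rely on its DST/DM modules rather than a plain dictionary.

```python
# Minimal sketch of slot completion: missing second information is first looked
# up in the memorized content, the user is asked only for what the lookup cannot
# supply, and the newly provided value is memorized for later use.
# memory_db, the slot names, and the prompt wording are illustrative assumptions.

memory_db = {"user_phone_number": "187 xxx"}  # previously memorized second information

def complete_slots(slots: dict) -> dict:
    """Fill missing slots from memory, then ask the user for the rest."""
    for name in slots:
        if slots[name] is None:                   # missing from the user's sentence
            slots[name] = memory_db.get(name)     # search the memorized content first
    for name in slots:
        if slots[name] is None:                   # still missing after the lookup
            answer = input(f"What is your {name.replace('_', ' ')}? ")  # generated dialog
            slots[name] = answer
            memory_db[name] = answer              # memorize the newly provided value
    return slots

# "please send my mobile phone number and identification number to xiaoming"
slots = {"send_object": "xiaoming", "user_phone_number": None, "user_id_number": None}
print(complete_slots(slots))
```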
The embodiment of the present application further provides a human-computer interaction method, which can be implemented in an electronic device (e.g., a mobile phone, a tablet computer, etc.) shown in fig. 1 and fig. 2. As shown in fig. 10, the method may include the steps of:
S1010, a first sentence input by a user is detected, and the first sentence comprises at least one piece of first information.
Illustratively, as shown in fig. 4 (b), the human-computer interaction application obtains a sentence "the address of my home is kelvin" input by the user through text or voice, and the sentence includes the information that the "address of home" is "kelvin".
Illustratively, as shown in fig. 5 (b), the human-computer interaction application obtains a statement that "my mobile phone number is 187 xxx" input by the user through text or voice, and the statement includes information that "the mobile phone number of the user" is 187 xxx ".
S1020, in response to the first sentence input by the user, displaying or broadcasting first dialog information, where the first dialog information is a response to the first sentence.
For example, as shown in fig. 4 (b), after acquiring the sentence "the address of my home is kelvin" input by the user through text or voice, the human-computer interaction application may generate the dialog message "OK, I have remembered it", and remind the user of it in text form or broadcast it by voice.
For example, as shown in fig. 5 (b), after acquiring the sentence "my mobile phone number is 187 xxx" input by the user through text or voice, the human-computer interaction application may generate the dialog message "OK, I have remembered it", and remind the user of it in text form or broadcast it by voice.
S1030, in response to the first sentence input by the user, the human-computer interaction application stores the at least one piece of first information.
Illustratively, as shown in fig. 4 (b), the man-machine interaction application stores information that "address of home" is "kelvin" in the memory database.
Illustratively, as shown in fig. 5 (b), the human-computer interaction application stores the information that "the user's mobile phone number" is "187 xxx" in the memory database.
It should be understood that there is no strict execution order between S1020 and S1030.
S1040, a second sentence input by the user is detected, the second sentence includes second information and does not include the at least one piece of first information, the second information is used for indicating the intention of the user, and the at least one piece of first information is at least part of information used for realizing the intention of the user.
Illustratively, after the human-computer interaction application (voice assistant) acquires the second sentence "navigate home" input by the user, it determines that the second information, i.e. the user's intention, is "navigate", and that the information for realizing the intention "navigate" includes the "address of home". However, this information is not included in the sentence "navigate home".
Illustratively, after acquiring the second sentence "send my mobile phone number to xiaoming through chat App" input by the user, the human-computer interaction application (voice assistant) determines that the second information, i.e. the user's intention, is "send mobile phone number", and that the information for realizing this intention includes three pieces of information, namely the "send object", the "send mode", and "the user's mobile phone number". However, "the user's mobile phone number" is not included in the sentence "send my mobile phone number to xiaoming through chat App".
S1050, in response to the second sentence input by the user, performing an operation related to the user's intention at least according to the second information and the at least one piece of first information.
For example, since the human-computer interaction application previously remembered that the "address of home" is "kelvin", it may automatically perform an operation of opening the map App and automatically display a navigation route from the current location to kelvin.
For example, as shown in fig. 5 (d), since the human-computer interaction application previously remembered that "the user's mobile phone number" is "187 xxx", it may automatically perform an operation of opening the chat App and automatically send the user's mobile phone number to xiaoming.
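Taken together, steps S1010 to S1050 amount to: store the user information carried by an earlier sentence, and use it later to complete a sentence that expresses an intention but omits that information. The following is a hedged sketch of this flow; the regular-expression parsing, the single hard-coded intention, and the reply strings are simplified assumptions, not the application's actual ASR/NLU pipeline.

```python
import re

# Sketch of the S1010-S1050 flow under simplified assumptions: a first sentence
# deposits user information into a memory store, and a later sentence that omits
# that information is completed from memory before the operation is executed.

memory_db = {}

def handle_sentence(text: str) -> str:
    m = re.match(r"the address of my home is (.+)", text)
    if m:                                        # S1010/S1030: detect and store first information
        memory_db["home_address"] = m.group(1)
        return "OK, I have remembered it."       # S1020: first dialog information
    if text == "navigate home":                  # S1040: second sentence indicating the intention
        address = memory_db.get("home_address")
        if address is None:                      # neither in the sentence nor in memory
            return "What is the address of your home?"
        return f"Opening the map App and navigating to {address}."  # S1050: perform the operation
    return "Sorry, I did not understand that."

print(handle_sentence("the address of my home is kelvin"))  # -> OK, I have remembered it.
print(handle_sentence("navigate home"))                     # -> Opening the map App and navigating to kelvin.
```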
It will be appreciated that, in order to implement the above functions, the electronic device comprises corresponding hardware and/or software modules for performing the respective functions. In combination with the exemplary algorithm steps described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In this embodiment, the electronic device may be divided into functional modules according to the above method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware. It should be noted that the division of the modules in this embodiment is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
In the case of dividing each functional module by corresponding functions, fig. 11 shows a possible composition diagram of the electronic device 1100 involved in the above embodiment, and as shown in fig. 11, the electronic device 1100 may include: an acquisition unit 1101, a parsing unit 1102, a lookup unit 1103, and an execution unit 1104.
Among other things, the acquisition unit 1101 may be used to enable the electronic device 1100 to perform the above-described steps 910, etc., and/or other processes for the techniques described herein. For example, the ASR module in fig. 2 may be used to implement the function of the obtaining unit 1101.
Parsing unit 1102 may be used to enable electronic device 1100 to perform steps 920, etc., described above, and/or other processes for the techniques described herein. Illustratively, the NLU module in fig. 2 may be used to implement the function of the parsing unit 1102.
Lookup unit 1103 may be used to support electronic device 1100 in performing steps 930 described above, etc., and/or other processes for the techniques described herein. Illustratively, the DST module and the DM module in fig. 2 may be used to implement the function of the lookup unit 1103.
Execution unit 1104 may be used to enable electronic device 1100 to perform, among other things, steps 940 described above, and/or other processes for the techniques described herein. Illustratively, the Action module in FIG. 2 may be used to implement the functionality of the execution unit 1104.
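As a rough illustration of how these four units could cooperate, the sketch below chains an acquisition step, a parsing step, a lookup step, and an execution step in one pipeline. The class name, the rule-based parsing, and the in-memory store are assumptions made for the example; beyond the ASR/NLU/DST-DM/Action mapping stated above, nothing here is taken from the actual implementation.

```python
# Illustrative composition of electronic device 1100: acquisition unit (ASR),
# parsing unit (NLU), lookup unit (DST/DM), execution unit (Action).
# Class, method, and slot names are assumptions for this sketch only.

class ElectronicDevice1100:
    def __init__(self, memory_db):
        self.memory_db = memory_db

    def acquire(self, user_input):       # acquisition unit, cf. step 910
        return user_input                 # speech recognition omitted in this sketch

    def parse(self, sentence):            # parsing unit, cf. step 920
        if sentence == "navigate home":
            return "navigate", {"home_address": None}
        return "unknown", {}

    def lookup(self, slots):              # lookup unit, cf. step 930
        return {k: (v if v is not None else self.memory_db.get(k)) for k, v in slots.items()}

    def execute(self, intent, slots):     # execution unit, cf. step 940
        return f"executing intention '{intent}' with information {slots}"

device = ElectronicDevice1100({"home_address": "kelvin"})
intent, slots = device.parse(device.acquire("navigate home"))
print(device.execute(intent, device.lookup(slots)))
```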
Fig. 12 shows a schematic diagram of a possible composition of the electronic device 1200 involved in the above embodiment, and as shown in fig. 12, the electronic device 1200 may include: a detection unit 1201, a display and broadcast unit 1202, a storage unit 1203, and an execution unit 1204.
Among other things, detection unit 1201 may be used to enable electronic device 1200 to perform steps 1010, 1040, etc., described above, and/or other processes for the techniques described herein.
Display and broadcast unit 1202 may be used to support electronic device 1200 in performing steps 1020, etc., described above, and/or other processes for the techniques described herein.
The storage unit 1203 may be used to support the electronic device 1200 in performing the above-described steps 1030, etc., and/or other processes for the techniques described herein.
Execution unit 1204 may be used to enable electronic device 1200 to perform steps 1050 described above, etc., and/or other processes for the techniques described herein.
It should be noted that all relevant contents of each step related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again.
The electronic device provided by the embodiment is used for executing the human-computer interaction method, so that the same effect as the implementation method can be achieved.
In the case where an integrated unit is employed, the electronic device may comprise a processing module, a storage module, and a communication module. The processing module may be configured to control and manage actions of the electronic device, for example, to support the electronic device in executing the steps performed by the above units. The storage module may be configured to store program code, data, and the like for the electronic device. The communication module may be configured to support communication between the electronic device and other devices.
The processing module may be a processor or a controller, and may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor. The storage module may be a memory. The communication module may specifically be a radio frequency circuit, a Bluetooth chip, a Wi-Fi chip, or another device that interacts with other electronic devices.
In an embodiment, when the processing module is a processor and the storage module is a memory, the electronic device according to this embodiment may be a device having a structure shown in fig. 1.
The present embodiment also provides a computer storage medium, where computer instructions are stored, and when the computer instructions are run on an electronic device, the electronic device is caused to execute the relevant method steps to implement the method for human-computer interaction in the foregoing embodiments.
The present embodiment also provides a computer program product, which when running on a computer, causes the computer to execute the relevant steps described above, so as to implement the human-computer interaction method in the foregoing embodiments.
In addition, embodiments of the present application also provide an apparatus, which may be specifically a chip, a component or a module, and may include a processor and a memory connected to each other; the memory is used for storing computer execution instructions, and when the device runs, the processor can execute the computer execution instructions stored in the memory, so that the chip can execute the human-computer interaction method in the above method embodiments.
The electronic device, the computer storage medium, the computer program product, or the chip provided in this embodiment are all configured to execute the corresponding method provided above, and therefore, the beneficial effects that can be achieved by the electronic device, the computer storage medium, the computer program product, or the chip may refer to the beneficial effects in the corresponding method provided above, and are not described herein again.
Through the description of the foregoing embodiments, those skilled in the art will understand that, for convenience and simplicity of description, the division into the above functional modules is used only as an example for illustration; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed to a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. A method for human-computer interaction, which is applied to an electronic device, is characterized in that the method comprises the following steps:
the electronic equipment acquires a first sentence input by a user;
the electronic equipment analyzes the first statement to obtain first information, wherein the first information is used for indicating the intention of a user, the first information corresponds to one or more pieces of second information, and the one or more pieces of second information are slot position information used for realizing the intention of the user;
when at least one piece of second information in the one or more pieces of second information is missing, the electronic equipment searches the missing second information from the content memorized by the man-machine interaction application;
the electronic device performs an operation related to the user's intention according to the first information and the one or more second information.
2. The method of claim 1, wherein prior to the electronic device obtaining the first sentence input by the user, the method further comprises:
the electronic equipment acquires a second sentence input by a user;
the electronic equipment analyzes the second statement to obtain the at least one piece of second information;
the electronic device saves the at least one second message.
3. The method of claim 2, wherein the second sentence comprises a memory instruction initiated by the user.
4. The method according to any one of claims 1 to 3, wherein the first information corresponds to a plurality of second information, and at least two pieces of the second information in the plurality of second information are missing;
the electronic equipment searches the missing at least one piece of second information from the content memorized by the man-machine interaction application, and the method comprises the following steps:
the electronic equipment searches for partial information in the missing at least two pieces of second information from the content memorized by the man-machine interaction application;
wherein the method further comprises:
the electronic equipment generates a dialog, and the dialog is used for reminding a user of inputting another part of the at least two pieces of second information;
the electronic equipment sends the dialogue information to a user;
the electronic equipment acquires a third sentence input by a user;
the electronic device analyzes the third sentence, and the third sentence comprises the other part of the at least two pieces of second information.
5. The method of claim 4, further comprising:
the electronic device stores the other part of the at least two pieces of second information.
6. The method of any of claims 1-5, wherein the electronic device performs operations related to the user's intent based on the first information and the one or more second information, comprising:
the electronic equipment generates an instruction according to the first information, the searched missing at least one piece of second information and information except the at least one piece of second information in the one or more pieces of second information;
and the electronic equipment executes the operation related to the instruction according to the instruction.
7. The method of any of claims 1 to 6, wherein, before the generating of an instruction, the method comprises:
the electronic equipment fills the one or more pieces of second information into the slot position corresponding to the first information.
8. A human-computer interaction method applied to electronic equipment is characterized by comprising the following steps:
the electronic equipment detects a first sentence input by a user, wherein the first sentence comprises at least one piece of first information;
responding to a first statement input by the user, and displaying or broadcasting first dialogue information by the electronic equipment, wherein the first dialogue information is a response to the first statement;
in response to a first sentence input by the user, the electronic device stores the at least one first information;
the electronic equipment detects a second sentence input by a user, wherein the second sentence comprises second information and does not comprise the at least one piece of first information, the second information is used for indicating the intention of the user, and the at least one piece of first information is at least part of information used for realizing the intention of the user;
in response to a second sentence input by the user, the electronic device performs an operation related to the user's intention at least according to the second information and the at least one first information.
9. The method according to claim 8, wherein the at least one piece of first information is a part of the information for realizing the user's intention, and another part of the information for realizing the user's intention is not included in the second sentence;
the electronic device, in response to the second sentence input by the user, performs an operation related to the user's intention according to at least the second information and the at least one first information, including:
the electronic equipment displays or broadcasts second dialogue information, wherein the second dialogue information is used for reminding a user to input third information, and the third information is the other part of information in the information for realizing the intention of the user;
the electronic equipment detects a third sentence input by a user, wherein the third sentence comprises the third information;
in response to a third sentence input by the user, the electronic device performs an operation related to the user's intention according to the third information, the second information, and the at least one first information.
10. An electronic device, comprising:
one or more processors;
one or more memories;
the one or more memories store one or more computer programs corresponding to a human-computer-interaction application, the one or more computer programs comprising instructions that, when executed by the one or more processors, cause the electronic device to perform the steps of:
acquiring a first sentence input by a user;
analyzing the first statement to obtain first information, wherein the first information is used for indicating the intention of a user, the first information corresponds to one or more second information, and the one or more second information is used for realizing the intention of the user;
when at least one piece of second information in the one or more pieces of second information is missing, searching the missing at least one piece of second information from the content memorized by the man-machine interaction application;
and according to the first information and the one or more second information, executing operation related to the user intention.
11. The electronic device of claim 10, wherein the instructions, when executed by the one or more processors, cause the electronic device to further perform the steps of:
acquiring a second sentence input by a user;
analyzing the second statement to obtain the at least one piece of second information;
the at least one second information is saved.
12. The electronic device of claim 11, wherein the second sentence comprises a memory instruction initiated by the user.
13. The electronic device according to any one of claims 10 to 12, wherein the first information corresponds to a plurality of second information, and at least two of the plurality of second information are missing;
when executed by the one or more processors, the instructions cause the electronic device to perform the step of searching the content memorized by the human-computer interaction application for the missing at least one second information, comprising:
searching partial information in the at least two pieces of missing second information from the content memorized by the man-machine interaction application;
the instructions, when executed by the one or more processors, cause the electronic device to further perform the steps of:
generating a dialog for prompting a user to input another part of the at least two pieces of second information;
sending the dialogue information to a user;
and acquiring and analyzing a third sentence input by the user, wherein the third sentence comprises another part of information in the at least two pieces of information.
14. The electronic device of claim 13, wherein the instructions, when executed by the one or more processors, cause the electronic device to further perform the steps of:
and saving another part of the at least two pieces of information.
15. The electronic device of any of claims 10 to 14, wherein the instructions, when executed by the one or more processors, cause the electronic device to perform the step of performing an operation related to the user's intention according to the first information and the one or more second information, comprising:
generating an instruction according to the first information, the searched missing at least one second information and information except the at least one second information in the one or more second information;
and executing the operation related to the instruction according to the instruction.
16. The electronic device of any of claims 10-15, wherein the instructions, when executed by the one or more processors, cause the electronic device to further perform the steps of:
before generating the instruction, filling the one or more second information into a slot corresponding to the first information.
17. An electronic device, comprising:
one or more processors;
one or more memories;
the one or more memories store one or more computer programs corresponding to a human-computer-interaction application, the one or more computer programs comprising instructions that, when executed by the one or more processors, cause the electronic device to perform the steps of:
detecting a first sentence input by a user, wherein the first sentence comprises at least one piece of first information;
responding to a first statement input by the user, and displaying or broadcasting first dialogue information, wherein the first dialogue information is a response to the first statement;
storing the at least one first message in response to a first sentence input by the user;
detecting a second sentence input by a user, wherein the second sentence comprises second information and does not comprise the at least one piece of first information, the second information is used for indicating the intention of the user, and the at least one piece of first information is at least part of information used for realizing the intention of the user;
in response to a second sentence input by the user, performing an operation related to the user's intention at least according to the second information and the at least one first information.
18. The electronic device according to claim 17, wherein the at least one first message is a part of the message for realizing the user's intention, and another part of the message for realizing the user's intention is not included in the second sentence;
wherein the instructions, when executed by the one or more processors, cause the electronic device to perform the step of, in response to the second sentence input by the user, performing an operation related to the user's intention at least according to the second information and the at least one first information, comprising:
displaying or broadcasting second dialogue information, wherein the second dialogue information is used for reminding a user to input third information, and the third information is another part of information in the information for realizing the intention of the user;
detecting a third sentence input by a user, wherein the third sentence comprises the third information;
in response to a third sentence input by the user, performing an operation related to the user's intention according to the third information, the second information, and the at least one first information.
19. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the method of human-computer interaction of any one of claims 1 to 9.
20. A computer program product, characterized in that it causes a computer to carry out the method of human-computer interaction according to any one of claims 1 to 9, when said computer program product is run on said computer.
CN202210639589.4A 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment Pending CN115240664A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210639589.4A CN115240664A (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910286477.3A CN110136705B (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment
CN202210639589.4A CN115240664A (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201910286477.3A Division CN110136705B (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment

Publications (1)

Publication Number Publication Date
CN115240664A true CN115240664A (en) 2022-10-25

Family

ID=67569583

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910286477.3A Active CN110136705B (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment
CN202210639589.4A Pending CN115240664A (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201910286477.3A Active CN110136705B (en) 2019-04-10 2019-04-10 Man-machine interaction method and electronic equipment

Country Status (1)

Country Link
CN (2) CN110136705B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110310641B (en) * 2019-02-26 2022-08-26 杭州蓦然认知科技有限公司 Method and device for voice assistant
CN112445902A (en) * 2019-09-04 2021-03-05 深圳Tcl数字技术有限公司 Method for identifying user intention in multi-turn conversation and related equipment
CN110798506B (en) * 2019-09-27 2023-03-10 华为技术有限公司 Method, device and equipment for executing command
CN110956958A (en) * 2019-12-04 2020-04-03 深圳追一科技有限公司 Searching method, searching device, terminal equipment and storage medium
CN111739529A (en) * 2020-06-05 2020-10-02 北京搜狗科技发展有限公司 Interaction method and device, earphone and server
CN111739528A (en) * 2020-06-05 2020-10-02 北京搜狗科技发展有限公司 Interaction method and device and earphone
CN111739530A (en) * 2020-06-05 2020-10-02 北京搜狗科技发展有限公司 Interaction method and device, earphone and earphone storage device
EP4250286A4 (en) * 2020-12-26 2023-12-27 Huawei Technologies Co., Ltd. Speech comprehension method and device
CN112820285A (en) * 2020-12-29 2021-05-18 北京搜狗科技发展有限公司 Interaction method and earphone equipment
CN112820286A (en) * 2020-12-29 2021-05-18 北京搜狗科技发展有限公司 Interaction method and earphone equipment
CN117203703A (en) * 2021-06-30 2023-12-08 华为技术有限公司 Method and device for generating broadcast text and electronic equipment
CN117373445A (en) * 2022-07-01 2024-01-09 华为技术有限公司 Voice instruction processing method, device and system and storage medium

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002297185A (en) * 2001-03-29 2002-10-11 Pioneer Electronic Corp Device and method for information processing
CN101162153A (en) * 2006-10-11 2008-04-16 丁玉国 Voice controlled vehicle mounted GPS guidance system and method for realizing same
CN101043674A (en) * 2007-03-09 2007-09-26 董崇军 Mobile telephone interactive information
CN101158584B (en) * 2007-11-15 2011-01-26 熊猫电子集团有限公司 Voice destination navigation realizing method of vehicle mounted GPS
US8140335B2 (en) * 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
DE102008027958A1 (en) * 2008-03-03 2009-10-08 Navigon Ag Method for operating a navigation system
CN101521858A (en) * 2009-04-01 2009-09-02 钟明 Network meal-ordering system and short message meal-ordering system and method thereof
CN103021403A (en) * 2012-12-31 2013-04-03 威盛电子股份有限公司 Voice recognition based selecting method and mobile terminal device and information system thereof
CN103200227A (en) * 2013-02-26 2013-07-10 刘维 Meal ordering terminal with automatic order collecting function and method thereof
US20150370787A1 (en) * 2014-06-18 2015-12-24 Microsoft Corporation Session Context Modeling For Conversational Understanding Systems
WO2016054230A1 (en) * 2014-10-01 2016-04-07 XBrain, Inc. Voice and connection platform
KR101623856B1 (en) * 2014-10-17 2016-05-24 현대자동차주식회사 Audio video navigation, vehicle and controlling method of the audio video navigation
CN104535071B (en) * 2014-12-05 2018-12-14 百度在线网络技术(北京)有限公司 A kind of phonetic navigation method and device
JP2016133378A (en) * 2015-01-19 2016-07-25 株式会社デンソー Car navigation device
US10018977B2 (en) * 2015-10-05 2018-07-10 Savant Systems, Llc History-based key phrase suggestions for voice control of a home automation system
EP3471924A4 (en) * 2016-06-15 2020-07-29 iRobot Corporation Systems and methods to control an autonomous mobile robot
CN106503156B (en) * 2016-10-24 2019-09-03 北京百度网讯科技有限公司 Man-machine interaction method and device based on artificial intelligence
CN107038220B (en) * 2017-03-20 2020-12-18 北京光年无限科技有限公司 Method, intelligent robot and system for generating memorandum
CN107578320A (en) * 2017-09-19 2018-01-12 拉扎斯网络科技(上海)有限公司 Booking method for meal and relevant apparatus based on interactive voice
CN108364646B (en) * 2018-02-08 2020-12-29 上海智臻智能网络科技股份有限公司 Embedded voice operation method, device and system
CN108510355A (en) * 2018-03-12 2018-09-07 拉扎斯网络科技(上海)有限公司 The implementation method and relevant apparatus that interactive voice is made a reservation
CN108509175B (en) * 2018-03-30 2021-10-22 联想(北京)有限公司 Voice interaction method and electronic equipment

Also Published As

Publication number Publication date
CN110136705A (en) 2019-08-16
CN110136705B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
CN110136705B (en) Man-machine interaction method and electronic equipment
CN110111787B (en) Semantic parsing method and server
CN110138959B (en) Method for displaying prompt of human-computer interaction instruction and electronic equipment
CN108874967B (en) Dialogue state determining method and device, dialogue system, terminal and storage medium
CN111724775B (en) Voice interaction method and electronic equipment
US20220214894A1 (en) Command execution method, apparatus, and device
CN110503959B (en) Voice recognition data distribution method and device, computer equipment and storage medium
WO2022052776A1 (en) Human-computer interaction method, and electronic device and system
WO2022100221A1 (en) Retrieval processing method and apparatus, and storage medium
CN111739517B (en) Speech recognition method, device, computer equipment and medium
CN111970401B (en) Call content processing method, electronic equipment and storage medium
CN111881315A (en) Image information input method, electronic device, and computer-readable storage medium
US20220116758A1 (en) Service invoking method and apparatus
CN109917988B (en) Selected content display method, device, terminal and computer readable storage medium
CN111835904A (en) Method for starting application based on context awareness and user portrait and electronic equipment
KR20190061824A (en) Electric terminal and method for controlling the same
CN114691839A (en) Intention slot position identification method
CN115658857A (en) Intelligent dialogue method, device, equipment and storage medium
CN113380240B (en) Voice interaction method and electronic equipment
CN113497835B (en) Multi-screen interaction method, electronic equipment and computer readable storage medium
CN113742460A (en) Method and device for generating virtual role
CN112742024A (en) Virtual object control method, device, equipment and storage medium
WO2023124849A1 (en) Speech recognition method and device
EP4145278A1 (en) Application module startup method and electronic device
WO2024078419A1 (en) Voice interaction method, voice interaction apparatus and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination