WO2022143048A1 - Dialogue task management method, apparatus, and electronic device - Google Patents

Dialogue task management method, apparatus, and electronic device

Info

Publication number
WO2022143048A1
Authority
WO
WIPO (PCT)
Prior art keywords
dialogue
processing
natural language
round
language processing
Prior art date
Application number
PCT/CN2021/136167
Other languages
English (en)
French (fr)
Inventor
何雄辉 (He Xionghui)
陈启蒙 (Chen Qimeng)
左利鹏 (Zuo Lipeng)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2022143048A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5027 Allocation of resources, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5038 Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a dialog task management method, device and electronic device.
  • a human-machine language dialogue system can be configured in an artificial intelligence device, through which the user can perform human-computer interaction with the device.
  • the user can orchestrate skills, so that when the user triggers an orchestrated skill, the artificial intelligence device processes the corresponding tasks in the sequence the user arranged.
  • for example, the user's skill is arranged as follows: the wake-up word is “good morning”, and the dialogue tasks to be processed include “time query”, “new version of weather”, “chicken soup for the soul” and “today's news”;
  • when the user says “good morning”, the artificial intelligence device (such as a smart speaker) will broadcast the current time, today's weather, today's chicken soup for the soul and today's news in turn.
  • the artificial intelligence device often pauses or waits between two adjacent dialogue tasks when processing dialogue tasks, and the user experience is poor.
  • the present application provides a dialogue task management method, device and electronic device, which can make the connection between various dialogue tasks in a human-machine language dialogue system smoother and improve user experience.
  • in a first aspect, the present application provides a dialogue task management method.
  • the dialogue task includes the processing of multiple rounds of dialogue; each round of dialogue, when processed, includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing.
  • the method includes: while processing the current round of dialogue, asynchronously executing the natural language processing of the next round of dialogue; and when the broadcast processing of the current round of dialogue is completed, acquiring the reply information obtained by that natural language processing and performing the broadcast processing of the next round of dialogue.
  • in this way, while the current round of dialogue is being processed, the natural language processing of the next round is executed asynchronously, so that the reply information required for the next round is obtained in advance and the next round does not need to execute natural language processing when the current round of dialogue is completed.
  • the reply information can instead be obtained directly from the cache, which makes the connection between the dialogue tasks in the human-machine language dialogue system smoother and improves the user experience.
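  • The pipelined scheme described above (broadcasting round k while asynchronously running the natural language processing of round k+1) can be sketched as follows. This is an illustrative Python sketch, not the application's implementation; the function names `nlp` and `broadcast` are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

def nlp(round_id: int) -> str:
    # Stand-in for natural language processing: produce the reply
    # information that the broadcast step will read out.
    return f"reply for round {round_id}"

def broadcast(reply: str) -> None:
    # Stand-in for the broadcast processing (e.g. TTS playback).
    print(reply)

def run_dialogue_task(num_rounds: int) -> list:
    """Process rounds so that round k+1's NLP overlaps round k's broadcast."""
    spoken = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(nlp, 0)  # NLP for the first round
        for k in range(num_rounds):
            reply = future.result()   # already cached, or wait if not done yet
            if k + 1 < num_rounds:
                # Asynchronously start the next round's NLP before broadcasting.
                future = pool.submit(nlp, k + 1)
            broadcast(reply)          # broadcast overlaps the next round's NLP
            spoken.append(reply)
    return spoken
```

Because the next round's NLP is submitted before the current broadcast starts, `future.result()` at the top of the next iteration usually returns immediately from the completed future rather than waiting.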
  • the method further includes: when the natural language processing of the next round of dialogue is completed, asynchronously executing the natural language processing of the round after the next. This speeds up the natural language processing of subsequent rounds of dialogue.
  • the reply information obtained by the natural language processing of each round of dialogue includes the reply information corresponding to that round, the identifier of the round of dialogue, and the identifier of the conversation window corresponding to the dialogue task.
  • the method further includes at least one of the following: the current round of dialogue is the first round of dialogue in the dialogue task; or processing the current round of dialogue includes performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round of dialogue is completed and the natural language processing of the next round of dialogue is completed, performing the broadcast processing of the next round of dialogue; or, when the broadcast processing of the current round of dialogue is completed and the natural language processing of the next round of dialogue is not completed, waiting for the natural language processing of the next round of dialogue to complete.
  • waiting for the natural language processing of the next round of dialogue to complete includes: controlling a first thread to be in a blocking state and controlling a second thread to be in a running state, where the first thread is used to perform the broadcast processing of the next round of dialogue and the second thread is used to perform the natural language processing of the next round of dialogue.
  • when the natural language processing of the next round of dialogue is completed, the second thread feeds back the execution result to the first thread, so that the first thread switches from the blocking state to the running state.
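  • The blocking-and-wakeup interplay between the two threads can be illustrated with Python's `threading.Event`. This is an illustrative sketch only; the function names and the placeholder reply text are assumptions, not content from this application:

```python
import threading

result_box = {}
nlp_done = threading.Event()

def second_thread_nlp() -> None:
    # Second thread (running state): perform the next round's natural
    # language processing, store the reply, then feed back the result.
    result_box["reply"] = "the weather today is sunny"  # placeholder reply
    nlp_done.set()  # wakes the blocked first thread

def first_thread_broadcast() -> str:
    # First thread: blocks here until the second thread signals completion,
    # then switches back to the running state and can broadcast the reply.
    nlp_done.wait()
    return result_box["reply"]

worker = threading.Thread(target=second_thread_nlp)
worker.start()
reply = first_thread_broadcast()
worker.join()
```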
  • in a second aspect, the present application provides a dialogue task management method.
  • the dialogue task includes the processing of multiple rounds of dialogue; each round of dialogue, when processed, includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing.
  • the method includes: while processing the current round of dialogue, asynchronously performing the natural language processing of at least two rounds among the remaining rounds of dialogue in the dialogue task, where the at least two rounds include the next round of dialogue after the current round; and when the broadcast processing of the current round of dialogue is completed, acquiring the reply information obtained by the natural language processing of the next round of dialogue and executing the broadcast processing of the next round of dialogue.
  • in this way, while the current round of dialogue is being processed, the natural language processing of at least two of the remaining rounds of dialogue in the dialogue task is executed asynchronously, so that the reply information required for those rounds is obtained in advance.
  • when those rounds are subsequently processed, the reply information can be obtained directly from the cache, which makes the connection between the dialogue tasks in the human-machine language dialogue system smoother and improves the user experience.
  • the at least two rounds of dialogue are those whose corresponding natural language processing has the highest execution efficiency among the remaining rounds of dialogue in the dialogue task.
  • the natural language processing corresponding to the at least two rounds of dialogue does not depend on permission settings and/or network quality.
  • the at least two rounds of dialogue include all of the rounds of dialogue remaining in the dialogue task.
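  • The variant in which the natural language processing of all remaining rounds is prefetched at once can be sketched as follows. This is an illustrative Python sketch with placeholder names, not the application's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def nlp(round_id: int) -> str:
    # Stand-in for one round's natural language processing.
    return f"reply {round_id}"

def process_task(num_rounds: int) -> list:
    """Prefetch NLP for every round, then broadcast the rounds in order."""
    with ThreadPoolExecutor() as pool:
        # Asynchronously execute NLP for all rounds of the task at once.
        futures = [pool.submit(nlp, k) for k in range(num_rounds)]
        replies = []
        for fut in futures:
            # Broadcast order is preserved: each reply is taken from the
            # completed future (the cache), or awaited if still running.
            replies.append(fut.result())
        return replies
```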
  • the reply information obtained by the natural language processing of each round of dialogue includes the reply information corresponding to that round, together with the identifier of the conversation window corresponding to the dialogue task and/or the identifier of the round of dialogue.
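  • A reply cache keyed by the conversation-window identifier and the round identifier, as described above, might look like the following sketch; the function and variable names are hypothetical:

```python
# Hypothetical reply cache keyed by (conversation-window id, round id),
# mirroring the identifiers described above.
reply_cache = {}

def store_reply(window_id, round_id, reply):
    # Called when a round's natural language processing completes.
    reply_cache[(window_id, round_id)] = reply

def fetch_reply(window_id, round_id):
    # Called by the broadcast processing; returns None on a cache miss
    # (i.e. the round's NLP has not produced a reply yet).
    return reply_cache.get((window_id, round_id))

store_reply("window-1", 2, "here is today's news")
```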
  • the method further includes at least one of the following: the current round of dialogue is the first round of dialogue in the dialogue task; or processing the current round of dialogue includes performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round of dialogue is completed and the natural language processing of the next round of dialogue is completed, performing the broadcast processing of the next round of dialogue; or, when the broadcast processing of the current round of dialogue is completed and the natural language processing of the next round of dialogue is not completed, waiting for the natural language processing of the next round of dialogue to complete.
  • waiting for the natural language processing of the next round of dialogue to complete includes: controlling the first thread to be in a blocking state and controlling the second thread to be in a running state, where the first thread is used to perform the broadcast processing of the next round of dialogue and the second thread is used to perform the natural language processing of the next round of dialogue.
  • when the natural language processing of the next round of dialogue is completed, the second thread feeds back the execution result to the first thread, so that the first thread switches from the blocking state to the running state.
  • the present application provides a dialogue task management apparatus, comprising: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program stored in the memory is executed, the processor is configured to perform the method provided in the first aspect or the second aspect.
  • the present application provides an electronic device, including the apparatus provided in the first aspect or the second aspect.
  • the present application provides a computer storage medium, where instructions are stored in the computer storage medium; when the instructions are run on a computer, they cause the computer to execute the method provided in the first aspect or the second aspect.
  • the present application provides a chip including at least one processor and an interface; the interface is used to provide program instructions or data for the at least one processor, and the at least one processor is used to execute the program instructions to implement the method provided in the first aspect or the second aspect.
  • FIG. 1 is a schematic diagram of skill arrangement in a man-machine language dialogue system provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a processing process of a dialogue task in a human-machine language dialogue system provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a dialog task management method provided by an embodiment of the present application.
  • FIG. 5a is a schematic diagram of a processing process of a dialogue task provided by an embodiment of the present application.
  • FIG. 5b is a schematic diagram of a processing process of another dialogue task provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a processing process of a dialogue task provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a processing process of a dialogue task provided by an embodiment of the present application.
  • FIG. 8a is a schematic diagram of a process of waiting for the completion of the natural language processing of the next round of dialogue in the processing process of a dialogue task provided by an embodiment of the present application;
  • FIG. 8b is a schematic diagram of a process of waiting for the completion of the natural language processing of the next round of dialogue in the processing process of another dialogue task provided by an embodiment of the present application;
  • FIG. 9 is a schematic flowchart of a dialog task management method provided by an embodiment of the present application.
  • FIG. 10a is a schematic diagram of a processing process of a dialogue task provided by an embodiment of the present application.
  • FIG. 10b is a schematic diagram of a processing process of another dialogue task provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a processing process of a dialogue task provided by an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of processing a dialogue task provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a dialogue task management apparatus provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • words such as “exemplary”, “such as” or “for example” are used to mean serving as an example, instance, or illustration. Any embodiments or designs described in the embodiments of the present application as “exemplary,” “such as,” or “by way of example” should not be construed as preferred or advantageous over other embodiments or designs. Rather, use of such words is intended to present the related concepts in a specific manner.
  • the term "and/or" is only an association relationship describing associated objects, indicating that three relationships may exist; for example, A and/or B may indicate three cases: A alone exists, B alone exists, or both A and B exist.
  • the term "plurality" means two or more. For example, multiple systems refer to two or more systems, and multiple terminals refer to two or more terminals.
  • "first" and "second" are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may expressly or implicitly include one or more of that feature.
  • the terms "including", "comprising", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.
  • each dialogue task is independent of the others, and the next dialogue task is often processed only after the current dialogue task has been fully processed; this often leaves a gap between two adjacent dialogue tasks, which in turn causes the human-machine voice dialogue system to pause and wait.
  • the dialogue tasks in the human-machine language dialogue system include: time query, new version of weather, chicken soup for the soul, and today's news, and the processing of each dialogue task includes a natural language processing period and a broadcast processing period.
  • the content that needs to be broadcasted for this dialogue task can be obtained through natural language processing.
  • when executing the "time query" task, the current-time content to be broadcast is first obtained through natural language processing, and the current time is then broadcast through the broadcast processing.
  • after the "time query" task is completed, the "new version of weather" task is executed; executing that task likewise requires first obtaining the current weather content to be broadcast through natural language processing. As a result, there is a period of waiting for the natural language processing when executing the "new version of weather" task, producing the pause-and-wait phenomenon.
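  • The cost of this sequential behavior, versus the pipelined scheme of the present application, can be illustrated with a back-of-the-envelope timing calculation; the per-step durations below are invented figures for illustration, not measurements:

```python
# Illustrative timing comparison; durations are invented figures.
nlp_time = 0.3        # seconds per task spent in natural language processing
broadcast_time = 1.0  # seconds per task spent broadcasting
num_tasks = 4         # time query, weather, chicken soup, today's news

# Sequential: every task waits for its own NLP before broadcasting.
sequential_total = num_tasks * (nlp_time + broadcast_time)

# Pipelined: each NLP overlaps the previous broadcast (valid when
# nlp_time <= broadcast_time), so only the first NLP is on the critical path.
pipelined_total = nlp_time + num_tasks * broadcast_time
```

Under these assumed figures the sequential total is 5.2 s while the pipelined total is 4.3 s; the gaps between tasks disappear entirely.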
  • the human-machine language dialogue system mentioned in this solution can be configured in an artificial intelligence device, and the artificial intelligence device can be an electronic device such as a smartphone or a smart speaker.
  • examples of the electronic device include, but are not limited to, electronic devices running iOS, Android, Windows, HarmonyOS, or other operating systems.
  • the electronic device described above may also be another electronic device, such as a laptop or the like having a touch-sensitive surface (e.g., a touch panel).
  • the embodiment of the present application does not specifically limit the type of the electronic device.
  • the electronic device may be the artificial intelligence device described above.
  • FIG. 3 shows a schematic diagram of the hardware structure of the electronic device.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown, or combine some components, or split some components, or use a different arrangement of components.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include one or more of an application processor (AP), a modem, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated in one or more processors.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is cache memory. This memory may hold instructions or data that have just been used or recycled by the processor 110 . If the processor 110 needs to use the instruction or data again, it can be directly called from the memory to avoid repeated access, reduce the waiting time of the processor 110, and improve the efficiency of the system.
  • the processor 110 may process dialogue tasks in the human-machine language dialogue system, such as performing natural language processing and the like.
  • processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), general-purpose input/output (GPIO) ports, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other examples, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 may provide wireless communication solutions including 2G/3G/4G/5G etc. applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves by at least two antennas including the antenna 1, filter, amplify, etc. the received electromagnetic waves, and transmit them to the modem for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem, and then convert it into electromagnetic waves for radiation through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the same device as at least part of the modules of the processor 110.
  • a modem may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low frequency baseband signal is processed by the baseband processor and passed to the application processor.
  • the application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 194 .
  • the modem may be a stand-alone device.
  • the modem may be independent of the processor 110 and provided in the same device as the mobile communication module 150 or other functional modules.
  • the mobile communication module 150 may be a module in a modem.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technologies.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2 .
  • the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), fifth-generation new radio (NR), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc.
  • the GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • Display screen 194 is used to display images, videos, and the like.
  • Display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), and so on.
  • electronic device 100 may include one or more display screens 194 .
  • the electronic device 100 may implement a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
  • the ISP is used to process the data fed back by the camera 193 .
  • when the shutter is opened, light is transmitted through the lens to the camera's photosensitive element, which converts the optical signal into an electrical signal; the photosensitive element transmits the electrical signal to the ISP for processing, and the ISP converts it into an image visible to the naked eye.
  • the ISP can also perform algorithm optimization on the noise, brightness, and skin tone of an image, and can also optimize parameters such as the exposure and color temperature of a shooting scene.
  • the ISP may be provided in the camera 193 .
  • the camera 193 is used for capturing still images or videos, for example, capturing the user's facial feature information, gesture feature information, and the like.
  • the object is projected through the lens to generate an optical image onto the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • electronic device 100 may include one or more cameras 193 .
  • a digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy, and so on.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs.
  • the electronic device 100 can play or record videos of various encoding formats, such as: Moving Picture Experts Group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100 .
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, files such as music and videos can be saved in the external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing the instructions stored in the internal memory 121 .
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), and the like.
  • the storage data area may store data (such as audio data, phone book, etc.) created during the use of the electronic device 100 and the like.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playback, recording, etc.
  • the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal. Audio module 170 may also be used to encode and decode audio signals. In some examples, the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
  • Speaker 170A, also referred to as a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the electronic device 100 can listen to music through the speaker 170A, or listen to a hands-free call, or broadcast the content required for the dialogue task in the human-machine language dialogue system.
  • the receiver 170B, also referred to as an "earpiece", is used to convert audio electrical signals into sound signals.
  • the voice can be answered by placing the receiver 170B close to the human ear.
  • the microphone 170C, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals.
  • the user can speak with the mouth close to the microphone 170C to input a sound signal into the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C.
  • the electronic device 100 may be provided with two microphones 170C, which may implement a noise reduction function in addition to collecting sound signals.
  • the electronic device 100 may further be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions.
  • the earphone jack 170D is used to connect wired earphones.
  • the earphone interface 170D can be the USB interface 130, or can be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the pressure sensor 180A is used to sense pressure signals, and can convert the pressure signals into electrical signals.
  • pressure sensor 180A may be provided on display screen 194 .
  • the capacitive pressure sensor may be composed of at least two parallel plates of conductive material. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
  • the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than the first pressure threshold acts on the short message application icon, the instruction for viewing the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, the instruction to create a new short message is executed.
  • the gyro sensor 180B may be used to determine the motion attitude of the electronic device 100 .
  • the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the angle at which the electronic device 100 shakes, calculates the distance that the lens module needs to compensate for according to the angle, and allows the lens to counteract the shaking of the electronic device 100 through reverse motion, thereby realizing image stabilization.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude from the air pressure value measured by the air pressure sensor 180C to assist in positioning and navigation.
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes).
  • the magnitude and direction of gravity can be detected when the electronic device 100 is stationary. The acceleration sensor can also be used to identify the posture of the electronic device, and can be applied to horizontal/vertical screen switching, pedometers, and other applications.
  • the electronic device 100 can measure the distance through infrared or laser. In some examples, when the electronic device is used to collect the user characteristic information of the user in the environment, the electronic device 100 may use the distance sensor 180F to measure the distance to achieve fast focusing.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, accessing application locks, taking pictures with fingerprints, answering incoming calls with fingerprints, and the like.
  • the temperature sensor 180J is used to detect the temperature.
  • the electronic device 100 utilizes the temperature detected by the temperature sensor 180J to execute a temperature handling strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J in order to reduce power consumption and implement thermal protection.
  • the electronic device 100 when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 caused by the low temperature.
  • the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
  • the touch sensor 180K is also called a "touch device".
  • the touch sensor 180K may be disposed on the display screen 194 , and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to touch operations may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the location where the display screen 194 is located.
  • the keys 190 include a power key, a volume key, an input keyboard, and the like. The keys 190 may be mechanical keys or touch keys.
  • the electronic device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 100 .
  • Motor 191 can generate vibrating cues.
  • the motor 191 can be used for vibrating alerts for incoming calls, and can also be used for touch vibration feedback.
  • touch operations acting on different applications may correspond to different vibration feedback effects.
  • the motor 191 can also correspond to different vibration feedback effects for touch operations on different areas of the display screen 194 .
  • Different application scenarios (for example, time reminders, receiving messages, alarm clocks, games, etc.) may correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 can be an indicator light, which can be used to indicate the charging state, the change of the power, and can also be used to indicate a message, a missed call, a notification, and the like.
  • the dialogue task in this solution may include the processing of multiple rounds of dialogue, wherein each round of dialogue, when processed, includes natural language processing and broadcast processing, and the natural language processing can be used to obtain the reply information to be broadcast by the broadcast processing.
  • Exemplarily, continuing to refer to Figure 1, "good morning" in Figure 1 can be understood as a dialogue task, and "time query", "new weather", "chicken soup for the soul" and "today's news" are rounds of dialogue in that dialogue task, where "time query" can be one round of dialogue.
  • the "natural language processing time" involved in each round of dialogue in Figure 2 is the time required to perform natural language processing when that round of dialogue is processed, and the "broadcast processing time" is the time required to broadcast the reply information.
  • the reply information can be understood as the information replied to the user. For example, after the user says "good morning" to the smart speaker, the smart speaker can reply with information such as the current time and today's weather; the information replied by the smart speaker is the reply information mentioned in this solution.
  • FIG. 4 is a schematic flowchart of a dialog task management method provided by an embodiment of the present application. As shown in Figure 4, the dialog task management method may include the following steps:
  • Step S101 When processing the current round of dialogue, asynchronously perform natural language processing on the next round of dialogue.
  • the natural language processing of the next round of dialogue when processing the current round of dialogue, can be asynchronously performed, so as to obtain the reply information to be broadcasted for the next round of dialogue in advance.
  • the natural language processing process of "new weather” can be asynchronously executed when the time information is broadcast, and then the weather information can be obtained in advance;
  • the natural language processing of "Chicken Soup for the Soul” can be asynchronously performed when the weather information is broadcast, and then the chicken soup for the soul information can be obtained in advance.
  • the reply information obtained by performing the natural language processing of each round of dialogue can be stored in, but is not limited to, the cache of the device where the human-machine language dialogue system is located.
  • the current round of dialogue may be the first round of dialogue in the dialogue task.
  • the current round of dialogue may be "time query”.
  • "when processing the current round of dialogue" may refer to when performing natural language processing of the current round of dialogue, or when performing broadcast processing of the current round of dialogue, which is not limited herein.
  • Step S102 When the broadcast processing of the current round of dialogue is completed, the reply information obtained by the natural language processing of the next round of dialogue is obtained, and the broadcast processing of the next round of dialogue is executed.
  • when the broadcast processing of the current round of dialogue is completed, it indicates that the current round of dialogue has been completed, that is, the execution of the next round of dialogue can be started.
  • for the next round of dialogue, since the reply information required for its broadcast processing has been obtained in advance, the reply information can be directly obtained, and the broadcast processing of this round of dialogue can be performed to broadcast the reply information.
  • in this way, the phenomenon of pausing and waiting between two rounds of dialogue in a dialogue task can be avoided.
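  • The pipelined flow of steps S101-S102 can be sketched as follows (a minimal illustration in Python, using `concurrent.futures` as a stand-in for the asynchronous natural language processing described above; the round names and the `nlp`/`broadcast` functions are hypothetical placeholders, not the patent's implementation):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def nlp(round_name):
    # Hypothetical natural language processing: produces the reply
    # information that the broadcast step for this round will need.
    time.sleep(0.05)  # stands in for NLU / information-retrieval latency
    return f"reply for {round_name}"

def broadcast(reply):
    # Hypothetical broadcast processing (e.g. TTS playback).
    print(reply)

rounds = ["time query", "new weather", "chicken soup for the soul", "today's news"]

with ThreadPoolExecutor() as pool:
    future = pool.submit(nlp, rounds[0])  # NLP for the first round
    for nxt in rounds[1:]:
        reply = future.result()           # reply for the current round is ready
        future = pool.submit(nlp, nxt)    # S101: prefetch next round's NLP
        broadcast(reply)                  # S102: broadcast the current round
    broadcast(future.result())            # broadcast the final round
```

  • Because each round's NLP overlaps with the previous round's broadcast, the reply is already cached by the time its broadcast turn arrives, which is the pause-avoidance effect described above.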
  • when the natural language processing of the next round of dialogue has been executed, the natural language processing of the round of dialogue after the next round may also be asynchronously executed, so as to further obtain in advance the reply information to be broadcast for that round.
  • Exemplarily, as shown in Figure 5b, if the current round of dialogue is "time query", then in the broadcast processing stage of this round of dialogue, once the natural language processing of "new version weather" has finished executing, the natural language processing of "Chicken Soup for the Soul" can be started asynchronously to obtain the chicken soup for the soul information in advance.
  • the reply information obtained by the natural language processing of each round of dialogue may include reply information corresponding to the identifier of the round of dialogue and the identifier of the conversation window corresponding to the dialogue task.
  • the broadcast processing of the next round of dialogue can be executed.
  • if the broadcast processing of the "time query" task has been completed, and the reply information required for broadcast processing has been obtained in the "new version weather" task, then the broadcast processing of "new version weather" can be started.
  • waiting for the natural language processing of the next round of dialogue to finish executing may include: controlling the first thread to be in a blocking state, and controlling the second thread to be in a running state.
  • the first thread can be used to perform broadcast processing of the next round of dialogue
  • the second thread can be used to perform natural language processing of the next round of dialogue
  • when the natural language processing of the next round of dialogue is completed, the second thread feeds the execution result back to the first thread, so that the first thread is switched from the blocking state to the running state.
  • this can be implemented by a callback function, such as the future.get function.
  • specifically, the first thread can call the second thread, and at this time the first thread can process other things; after that, the first thread can register the future.get function with the second thread; then, the first thread can process other things while waiting for the second thread to finish executing; when the second thread finishes executing, it returns the execution result to the first thread; after that, the first thread can end the waiting and continue processing the task.
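  • The future.get-style wait described above can be illustrated with Python's analogous `Future.result()` (a sketch under that assumption; the worker function is a placeholder, not the patent's actual processing):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def second_thread_work():
    # Stands in for the natural language processing run by the second thread.
    time.sleep(0.05)
    return "reply information"

with ThreadPoolExecutor(max_workers=1) as pool:
    # The first thread starts the second thread's work...
    future = pool.submit(second_thread_work)
    # ...and is free to process other things here while it runs.
    result = future.result()  # blocks until the second thread finishes,
                              # like the future.get wait described above
    print(result)             # the first thread resumes with the result
```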
  • the dialogue task can be ended once the round of dialogue preceding this round has been executed, that is, subsequent rounds of dialogue are not executed.
  • if the execution of the natural language processing of any round of dialogue fails, the execution of the natural language processing of subsequent rounds of dialogue can also be prohibited.
  • FIG. 9 is a schematic flowchart of another dialog task management method provided by an embodiment of the present application. As shown in Figure 9, the dialog task management method may include the following steps:
  • Step S201 When processing the current round of dialogue, asynchronously perform natural language processing on at least two rounds of dialogues in the remaining other rounds of dialogues in the dialogue task, where the at least two rounds of dialogues include the next round of dialogues of the current round of dialogues.
  • the natural language processing of at least two of the remaining rounds of dialogue in the dialogue task can be asynchronously performed at the same time, so as to obtain in advance the reply information to be broadcast for those at least two rounds of dialogue.
  • the current round of dialogue is "time query”
  • the natural language processing of "new weather", "chicken soup for the soul" and "today's news" can be asynchronously executed at the same time.
  • the at least two rounds of dialogue may be the rounds whose natural language processing is performed most efficiently among the remaining rounds of dialogue in the dialogue task.
  • the natural language processing corresponding to the at least two rounds of dialogue may not depend on at least one of permission settings and network quality.
  • for example, the dialogue task includes "time query" and "today's news"; the time information required by "time query" can be obtained from the device's own clock and does not depend on external conditions, while the news information required by "today's news" needs to be obtained by the device over a network connection, that is, it depends on external conditions. Therefore, the "time query" dialogue is considered the most efficient, since it does not depend on the quality of the network.
  • At least two rounds of dialogue may also be dialogues with relatively high priority in dialogue tasks.
  • if the priority of "time query" and "chicken soup for the soul" is higher than the priority of "today's news", then the at least two rounds of dialogue are "time query" and "chicken soup for the soul".
  • the natural language processing of "chicken soup for the soul” is performed.
  • the at least two rounds of dialogue include the remaining rounds of dialogue in the dialogue task.
  • the at least two rounds of dialogue are the unexecuted rounds of dialogue other than the current round of dialogue. As shown in Figure 10b, if the current round of dialogue is "time query", then the at least two rounds of dialogue are "new version of weather", "chicken soup for the soul" and "today's news"; if the current round of dialogue is "new version of weather", the at least two rounds of dialogue are "chicken soup for the soul" and "today's news".
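  • The multi-round prefetch of step S201 might be sketched as follows, submitting the natural language processing of all remaining rounds in parallel while the current round is processed (the `nlp` function and round names are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def nlp(round_name):
    # Hypothetical per-round natural language processing.
    return f"reply for {round_name}"

rounds = ["time query", "new weather", "chicken soup for the soul", "today's news"]
current, remaining = rounds[0], rounds[1:]

with ThreadPoolExecutor() as pool:
    # S201: while the current round is being processed, kick off the NLP
    # of at least two (here: all) of the remaining rounds in parallel.
    futures = {name: pool.submit(nlp, name) for name in remaining}
    # ... broadcast processing of the current round would happen here ...
    # S202: as each round's turn comes, its reply is already available.
    replies = {name: f.result() for name, f in futures.items()}

print(replies["new weather"])
```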
  • the current round of dialogue may be the first round of dialogue in the dialogue task.
  • the current round of dialogue can be "time query”.
  • "when processing the current round of dialogue" may refer to when performing natural language processing of the current round of dialogue, or when performing broadcast processing of the current round of dialogue, which is not limited herein.
  • Step S202 When the broadcast processing of the current round of dialogue is completed, the reply information obtained by the natural language processing of the next round of dialogue is obtained, and the broadcast processing of the next round of dialogue is executed.
  • when the broadcast processing of the current round of dialogue is completed, it indicates that the current round of dialogue has been completed, that is, the execution of the next round of dialogue can be started.
  • for the next round of dialogue, since the reply information required for its broadcast processing has been obtained in advance, the reply information can be directly obtained, and the broadcast processing of this round of dialogue can be performed to broadcast the reply information.
  • in this way, the phenomenon of pausing and waiting between two rounds of dialogue in a dialogue task can be avoided.
  • the reply information obtained by the natural language processing of each round of dialogue includes the reply information corresponding to the round of dialogue, and at least one of an identifier of the conversation window corresponding to the dialogue task and the identifier of the round of dialogue.
  • the identifier of the conversation window corresponding to the dialogue task can be used to find the corresponding dialogue task when the reply information is obtained from the cache; the identifier of this round of dialogue can be used to find the corresponding round of dialogue when the reply information is obtained.
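  • The identifier scheme described above suggests a cache keyed by the conversation-window identifier plus the round-of-dialogue identifier; a minimal sketch (the key names and helper functions are assumptions for illustration, not the patent's actual data layout):

```python
# Cache of prefetched reply information, keyed by
# (conversation window id, round-of-dialogue id).
reply_cache = {}

def store_reply(window_id, round_id, reply):
    # Called when a round's natural language processing completes.
    reply_cache[(window_id, round_id)] = reply

def fetch_reply(window_id, round_id):
    # window_id locates the dialogue task; round_id locates the round.
    # Returns None if that round's reply has not been prefetched yet.
    return reply_cache.get((window_id, round_id))

store_reply("win-1", "time query", "It is 8:00 a.m.")
print(fetch_reply("win-1", "time query"))
```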
  • the broadcast processing of the next round of dialogue can be executed.
  • if the broadcast processing of the "time query" task has been completed, and the reply information required for broadcast processing has been obtained in the "new version weather" task, then the broadcast processing of "new version weather" can be started.
  • waiting for the natural language processing of the next round of dialogue to finish executing may include: controlling the first thread to be in a blocking state, and controlling the second thread to be in a running state.
  • the first thread can be used to perform broadcast processing of the next round of dialogue
  • the second thread can be used to perform natural language processing of the next round of dialogue
  • when the natural language processing of the next round of dialogue is completed, the second thread feeds the execution result back to the first thread, so that the first thread is switched from the blocking state to the running state.
  • this can be implemented by a callback function, such as the future.get function.
  • specifically, the first thread can call the second thread, and at this time the first thread can process other things; after that, the first thread can register the future.get function with the second thread; then, the first thread can process other things while waiting for the second thread to finish executing; when the second thread finishes executing, it returns the execution result to the first thread; after that, the first thread can end the waiting and continue processing the task.
  • the dialogue task can be ended once the round of dialogue preceding this round has been executed, that is, subsequent rounds of dialogue are not executed.
  • if the execution of the natural language processing of any round of dialogue fails, the execution of the natural language processing of subsequent rounds of dialogue can also be prohibited.
  • user A can issue a voice command, such as a wake-up word, etc.
  • the smart terminal 21 can collect the voice command.
  • the automatic speech recognition (ASR) module 211 in the intelligent terminal 21 can recognize the voice instruction issued by user A, and convert the vocabulary content in the voice instruction into computer-readable input, such as keys, binary codes or character sequences, etc.
  • the Natural Language Understanding (NLU) module 212 in the smart terminal 21 can combine semantic understanding schemes such as text matching, semantic similarity matching, information retrieval, and multi-intent classification models to process the text converted by the ASR module 211 and recognize the user's intention; wherein, the NLU module 212 can perform information retrieval from the knowledge base 213, and the knowledge base 213 can be configured in the smart terminal 21 or on other devices, such as a server.
  • the Dialog Management (DM) module 214 in the smart terminal 21 can determine, based on the user's intention, the interaction information to be output, for example, querying the weather, etc., and obtain the required interaction information (i.e., the reply) from the server 22 or other devices.
  • the text-to-speech (TTS) module 215 in the smart terminal 21 can convert the interaction information acquired by the DM module 214 into speech.
  • the smart terminal 21 can broadcast the voice converted by the TTS module 215 to the user A.
  • the dialogue management DM module 214 can manage the dialogue tasks pre-arranged by the user, so that each round of dialogue in the dialogue task is executed sequentially.
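  • The ASR → NLU → DM → TTS flow described above can be summarized as a toy pipeline (every function here is a simplified placeholder for the corresponding module, assumed for illustration only):

```python
def asr(audio):
    # ASR module 211: voice instruction -> computer-readable text.
    return audio["transcript"]

def nlu(text):
    # NLU module 212: text -> user intention (stands in for the matching /
    # retrieval / classification schemes described above).
    return {"intent": "greeting"} if "good morning" in text else {"intent": "unknown"}

def dm(intent):
    # DM module 214: intention -> interaction information (the reply).
    return "Good morning! It is 8:00 a.m." if intent["intent"] == "greeting" else "Sorry?"

def tts(reply):
    # TTS module 215: reply text -> speech to broadcast (represented as a tag).
    return f"<speech:{reply}>"

audio = {"transcript": "good morning"}
print(tts(dm(nlu(asr(audio)))))
```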
  • FIG. 13 is a schematic structural diagram of a dialog task management apparatus provided by an embodiment of the present application. As shown in FIG. 13 , the dialog task management apparatus provided by the embodiment of the present application can be used to implement the method described in the foregoing method embodiment.
  • the dialogue task management apparatus includes at least one processor 1301, and the at least one processor 1301 can support the dialogue task management apparatus to implement the methods provided in the embodiments of this application.
  • the processor 1301 may be a general-purpose processor or a special-purpose processor.
  • the processor 1301 may include a central processing unit (CPU) and/or a baseband processor.
  • the baseband processor may be used for processing communication data (for example, determining a target screen terminal), and the CPU may be used for implementing corresponding control and processing functions, executing software programs, and processing data of software programs.
  • the dialogue task management apparatus may further include a transceiver unit 1305 to implement signal input (reception) and output (send).
  • the transceiver unit 1305 may include a transceiver or a radio frequency chip.
  • Transceiver unit 1305 may also include a communication interface.
  • the dialogue task management apparatus may further include an antenna 1306, which may be used to support the transceiver unit 1305 to implement the transceiver function of the dialogue task management apparatus.
  • the dialogue task management apparatus may include one or more memories 1302, on which programs (or instructions or codes) 1304 are stored; the programs 1304 may be executed by the processor 1301, so that the processor 1301 executes the methods described in the above method embodiments.
  • data may also be stored in the memory 1302 .
  • the processor 1301 may also read data stored in the memory 1302 (for example, pre-stored first feature information), the data may be stored at the same storage address as the program 1304, and the data may also be stored with the program 1304 at different storage addresses.
  • the processor 1301 and the memory 1302 can be provided separately or integrated together, for example, integrated on a single board or a system on chip (SoC).
  • the embodiments of the present application further provide an electronic device, where the electronic device includes the dialog task management device provided in the above embodiments.
  • FIG. 14 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the chip 1400 includes one or more processors 1401 and an interface circuit 1402 .
  • the chip 1400 may further include a bus 1403.
  • the processor 1401 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 1401 or by instructions in the form of software.
  • the above-mentioned processor 1401 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • the interface circuit 1402 can be used to send or receive data, instructions or information.
  • the processor 1401 may process the data, instructions or other information received through the interface circuit 1402, and may send the processed information out through the interface circuit 1402.
  • the chip 1400 further includes a memory, which may include a read-only memory and a random access memory, and provides operation instructions and data to the processor.
  • a portion of the memory may also include non-volatile random access memory (NVRAM).
  • the memory stores executable software modules or data structures
  • the processor may execute corresponding operations by calling operation instructions stored in the memory (the operation instructions may be stored in the operating system).
  • the interface circuit 1402 can be used to output the execution result of the processor 1401 .
  • the functions of the processor 1401 and the interface circuit 1402 may each be implemented by hardware design, by software design, or by a combination of software and hardware, which is not limited here.
  • the processor in the embodiments of the present application may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a general-purpose processor may be a microprocessor, or any conventional processor.
  • the method steps in the embodiments of the present application may be implemented in a hardware manner, or may be implemented in a manner in which a processor executes software instructions.
  • Software instructions may be composed of corresponding software modules, and the software modules may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor, such that the processor can read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage medium may reside in an ASIC.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present application are produced.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted over a computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes an integration of one or more available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state disks (SSDs)), and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A dialogue task management method and apparatus, and an electronic device, relating to the field of artificial intelligence and in particular to dialogue management. The dialogue task involved in the method includes the processing of multiple rounds of dialogue; the processing of each round includes natural language processing and broadcast processing, where the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing. The method includes: while processing the current round of dialogue, asynchronously performing natural language processing for the next round of dialogue (S101); and, when the broadcast processing of the current round is completed, obtaining the reply information produced by the natural language processing of the next round and performing the broadcast processing of the next round (S102). By asynchronously performing the natural language processing of other rounds while the current round is being executed, the reply information those rounds need for broadcast processing can be obtained in advance, so that the transitions between dialogue tasks are smoother and the user experience is improved.

Description

Dialogue task management method and apparatus, and electronic device
This application claims priority to Chinese Patent Application No. 202011638531.5, entitled "Dialogue task management method and apparatus, and electronic device", filed with the China National Intellectual Property Administration on December 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a dialogue task management method and apparatus, and an electronic device.
Background
With the rapid development of artificial intelligence (AI) technology, AI devices such as smartphones and smart speakers have emerged. An AI device may be configured with a human-machine spoken dialogue system, through which the user can interact with the device. Typically, when using such a system the user can orchestrate skills, so that when the user triggers a skill they have orchestrated, the AI device processes the tasks in the order the user arranged. For example, as shown in FIG. 1, a user's skill orchestration is: the wakeup word is "good morning", and the dialogue tasks to be processed include "time query", "new weather", "chicken soup for the soul" and "today's news"; after the user says the wakeup word "good morning", the AI device (for example, a smart speaker) broadcasts in sequence the current time, today's weather, today's chicken soup for the soul, and today's news. At present, however, when a user uses the human-machine spoken dialogue system on an AI device, pauses or waits often occur between two adjacent dialogue tasks, resulting in a poor user experience.
Summary
This application provides a dialogue task management method and apparatus, and an electronic device, which can make the transitions between dialogue tasks in a human-machine spoken dialogue system smoother and improve the user experience.
In a first aspect, this application provides a dialogue task management method. A dialogue task includes the processing of multiple rounds of dialogue; the processing of each round includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing. The method includes: while processing the current round of dialogue, asynchronously performing natural language processing for the next round of dialogue; and, when the broadcast processing of the current round is completed, obtaining the reply information produced by the natural language processing of the next round and performing the broadcast processing of the next round.
In this way, while the current round is being executed, the natural language processing of the next round is performed asynchronously, so that the reply information needed by the next round is obtained in advance. When the current round finishes, the natural language processing of the next round does not need to be performed again; instead, the reply information can be fetched directly from the cache, making the transitions between dialogue tasks in the human-machine spoken dialogue system smoother and improving the user experience.
In a possible implementation, the method further includes: when the natural language processing of the next round is completed, asynchronously performing the natural language processing of the round after the next round, so as to speed up the natural language processing of subsequent rounds.
In a possible implementation, the reply information obtained by the natural language processing of each round includes the reply information corresponding to the identifier of that round and the identifier of the session window corresponding to the dialogue task.
In a possible implementation, the method further includes at least one of the following: the current round is the first round of the dialogue task; or, "while processing the current round" means while performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is completed, the broadcast processing of the next round is performed; or, when the broadcast processing of the current round is completed but the natural language processing of the next round is not, the method waits for the natural language processing of the next round to complete.
In a possible implementation, waiting for the natural language processing of the next round to complete includes: keeping a first thread in a blocked state and a second thread in a running state, where the first thread is used to perform the broadcast processing of the next round and the second thread is used to perform the natural language processing of the next round; when the second thread completes the natural language processing of the next round, it feeds back the execution result to the first thread, so that the first thread switches from the blocked state to the running state.
In a second aspect, this application provides a dialogue task management method. A dialogue task includes the processing of multiple rounds of dialogue; the processing of each round includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing. The method includes: while processing the current round of dialogue, asynchronously and concurrently performing natural language processing for at least two of the remaining rounds of the dialogue task, the at least two rounds including the round following the current round; and, when the broadcast processing of the current round is completed, obtaining the reply information produced by the natural language processing of the next round and performing the broadcast processing of the next round.
In this way, while the current round is being executed, the natural language processing of at least two of the remaining rounds is performed asynchronously and concurrently, so that the reply information needed by those rounds is obtained in advance. When the current round finishes, their natural language processing does not need to be performed again; the reply information can be fetched directly from the cache, making the transitions between dialogue tasks smoother and improving the user experience.
In a possible implementation, the at least two rounds are those whose corresponding natural language processing executes most efficiently among the remaining rounds of the dialogue task.
In a possible implementation, the natural language processing corresponding to the at least two rounds does not depend on permission settings and/or network quality.
In a possible implementation, the at least two rounds include all remaining rounds of the dialogue task.
In a possible implementation, the reply information obtained by the natural language processing of each round includes the reply information corresponding to that round, as well as the identifier of the session window corresponding to the dialogue task and/or the identifier of that round.
In a possible implementation, the method further includes at least one of the following: the current round is the first round of the dialogue task; or, "while processing the current round" means while performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is completed, the broadcast processing of the next round is performed; or, when the broadcast processing of the current round is completed but the natural language processing of the next round is not, the method waits for the natural language processing of the next round to complete.
In a possible implementation, waiting for the natural language processing of the next round to complete includes: keeping a first thread in a blocked state and a second thread in a running state, where the first thread is used to perform the broadcast processing of the next round and the second thread is used to perform the natural language processing of the next round; when the second thread completes the natural language processing of the next round, it feeds back the execution result to the first thread, so that the first thread switches from the blocked state to the running state.
In a third aspect, this application provides a dialogue task management apparatus, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the program is executed, the processor is configured to perform the method provided in the first or second aspect.
In a fourth aspect, this application provides an electronic device including the apparatus provided in the first or second aspect.
In a fifth aspect, this application provides a computer storage medium storing instructions which, when run on a computer, cause the computer to perform the method provided in the first or second aspect.
In a sixth aspect, this application provides a chip including at least one processor and an interface; the interface is configured to provide program instructions or data for the at least one processor; and the at least one processor is configured to execute the program instructions to implement the method provided in the first or second aspect.
Brief Description of the Drawings
The accompanying drawings used in the description of the embodiments or the prior art are briefly introduced below.
FIG. 1 is a schematic diagram of skill orchestration in a human-machine spoken dialogue system according to an embodiment of this application;
FIG. 2 is a schematic diagram of the processing of dialogue tasks in a human-machine spoken dialogue system according to an embodiment of this application;
FIG. 3 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a dialogue task management method according to an embodiment of this application;
FIG. 5a is a schematic diagram of the processing of a dialogue task according to an embodiment of this application;
FIG. 5b is a schematic diagram of the processing of another dialogue task according to an embodiment of this application;
FIG. 6 is a schematic diagram of the processing of a dialogue task according to an embodiment of this application;
FIG. 7 is a schematic diagram of the processing of a dialogue task according to an embodiment of this application;
FIG. 8a is a schematic diagram of waiting for the natural language processing of the next round of dialogue to complete during the processing of a dialogue task according to an embodiment of this application;
FIG. 8b is a schematic diagram of waiting for the natural language processing of the next round of dialogue to complete during the processing of another dialogue task according to an embodiment of this application;
FIG. 9 is a schematic flowchart of a dialogue task management method according to an embodiment of this application;
FIG. 10a is a schematic diagram of the processing of a dialogue task according to an embodiment of this application;
FIG. 10b is a schematic diagram of the processing of another dialogue task according to an embodiment of this application;
FIG. 11 is a schematic diagram of the processing of a dialogue task according to an embodiment of this application;
FIG. 12 is a schematic flowchart of the processing of a dialogue task according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of a dialogue task management apparatus according to an embodiment of this application;
FIG. 14 is a schematic structural diagram of a chip according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described below with reference to the accompanying drawings.
In the description of the embodiments of this application, words such as "exemplary", "such as" or "for example" indicate an example, illustration or explanation. Any embodiment or design described with these words should not be construed as preferred over, or more advantageous than, other embodiments or designs; rather, these words are intended to present the related concepts in a concrete manner.
In the description of the embodiments of this application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, B alone, or both A and B. In addition, unless otherwise specified, "multiple" means two or more; for example, multiple systems means two or more systems, and multiple terminals means two or more terminals.
The terms "first" and "second" are used for description only and must not be understood as indicating or implying relative importance or implicitly specifying the indicated technical features; features qualified with "first" or "second" may explicitly or implicitly include one or more of those features. The terms "include", "comprise", "have" and their variants mean "including but not limited to", unless otherwise specifically emphasized.
An analysis of dialogue task processing in current human-machine spoken dialogue systems shows that the dialogue tasks are independent of one another: the next task usually starts only after the previous one has been fully processed, which leaves a gap between two tasks and causes the system to pause and wait. Specifically, as shown in FIG. 2, the dialogue tasks in the system include time query, new weather, chicken soup for the soul, and today's news, and each task involves natural language processing time and broadcast processing time. In each task, natural language processing produces the content to be broadcast; for example, when the "time query" task is executed, natural language processing produces the time content to be broadcast, after which broadcast processing broadcasts the current time. Still referring to FIG. 2, the "new weather" task starts only after the "time query" task has finished, and "new weather" also needs natural language processing to obtain the weather content to broadcast; this means that executing "new weather" involves waiting for its natural language processing, causing a pause and wait when "new weather" is executed. It is precisely this analysis of current dialogue task processing that reveals why current human-machine spoken dialogue systems frequently pause and wait, and the present solution is proposed to address this cause.
It can be understood that the human-machine spoken dialogue system mentioned in this solution may be configured in an AI device, which may be an electronic device such as a smartphone or a smart speaker. Exemplary embodiments of the electronic device include, but are not limited to, devices running iOS, Android, Windows, Harmony OS, or other operating systems. The electronic device may also be another device, such as a laptop with a touch-sensitive surface (for example, a touch panel). The embodiments of this application do not specifically limit the type of the electronic device.
The following describes the hardware structure of an electronic device in an embodiment of this application; the electronic device may be the AI device described above.
FIG. 3 shows a schematic diagram of the hardware structure of the electronic device. As shown in FIG. 3, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than illustrated, combine some components, split some components, or use a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units; for example, the processor 110 may include one or more of an application processor (AP), a modem, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent components or may be integrated in one or more processors.
A memory may also be provided in the processor 110 for storing instructions and data. In some examples, the memory in the processor 110 is a cache, storing instructions or data the processor 110 has just used or uses cyclically. If the processor 110 needs that instruction or data again, it can be called directly from the memory, avoiding repeated accesses, reducing the waiting time of the processor 110, and improving system efficiency. In some examples, the processor 110 may process dialogue tasks in the human-machine spoken dialogue system, for example by performing natural language processing.
In some examples, the processor 110 may include one or more interfaces, which may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), general-purpose input/output (GPIO) ports, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
The charging management module 140 is configured to receive charging input from a charger, which may be a wireless or a wired charger. In some wired charging examples, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130; in some wireless charging examples, it may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 may also supply power to other components through the power management module 141.
The power management module 141 connects the battery 142, the charging management module 140 and the processor 110. It receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and so on. It may also be used to monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance). In some other examples, the power management module 141 may be provided in the processor 110; in still others, the power management module 141 and the charging management module 140 may be provided in the same component.
The wireless communication function of the electronic device 100 may be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem, the baseband processor, and the like.
The antennas 1 and 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 may cover a single communication band or multiple communication bands, and different antennas may be multiplexed to improve utilization; for example, antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other examples, an antenna may be used in combination with a tuning switch.
The mobile communication module 150 may provide wireless communication solutions, including 2G/3G/4G/5G, applied on the electronic device 100. It may include at least one filter, switch, power amplifier, low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves via at least two antennas including antenna 1, filter and amplify the received waves, and transmit them to the modem for demodulation; it may also amplify signals modulated by the modem and radiate them as electromagnetic waves via antenna 1. In some examples, at least some functional modules of the mobile communication module 150 may be provided in the processor 110, or in the same component as at least some modules of the processor 110.
The modem may include a modulator and a demodulator. The modulator modulates a low-frequency baseband signal to be sent into a medium- or high-frequency signal; the demodulator demodulates a received electromagnetic wave signal into a low-frequency baseband signal and passes it to the baseband processor. After processing by the baseband processor, the low-frequency baseband signal is passed to the application processor, which outputs a sound signal through an audio device (not limited to the speaker 170A and the receiver 170B) or displays an image or video on the display 194. In some examples, the modem may be an independent component; in others, it may be independent of the processor 110 and provided in the same component as the mobile communication module 150 or other functional modules; in still others, the mobile communication module 150 may be a module within the modem.
The wireless communication module 160 may provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 160 may be one or more components integrating at least one communication processing module. It receives electromagnetic waves via antenna 2, performs frequency modulation and filtering on the signal, and sends the processed signal to the processor 110; it may also receive a signal to be sent from the processor 110, frequency-modulate and amplify it, and radiate it as electromagnetic waves via antenna 2.
In some examples, antenna 1 of the electronic device 100 is coupled to the mobile communication module 150 and antenna 2 to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), fifth-generation new radio (NR), BT, GNSS, WLAN, NFC, FM, and/or IR. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
The electronic device 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing that connects the display 194 and the application processor; it performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 194 is used to display images, videos, and so on, and includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc. In some examples, the electronic device 100 may include one or more displays 194.
The electronic device 100 may implement the photographing function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The ISP processes data fed back by the camera 193. For example, when taking a picture, the shutter opens, light passes through the lens onto the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which converts it into an image visible to the naked eye. The ISP may also perform algorithmic optimization of the noise, brightness and skin tone of the image, and optimize parameters such as the exposure and color temperature of the shooting scene. In some examples, the ISP may be provided in the camera 193.
The camera 193 captures still images or video, for example capturing the user's facial feature information or posture feature information. An object generates an optical image through the lens, which is projected onto the photosensitive element; the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP to be converted into a digital image signal; the ISP outputs the digital image signal to the DSP for processing, and the DSP converts it into an image signal in a standard format such as RGB or YUV. In some examples, the electronic device 100 may include one or more cameras 193.
The digital signal processor processes digital signals, including digital image signals and other digital signals; for example, when the electronic device 100 selects a frequency point, the digital signal processor performs a Fourier transform on the frequency point energy.
The video codec compresses or decompresses digital video. The electronic device 100 may support one or more video codecs, so it can play or record video in multiple encoding formats, for example moving picture experts group (MPEG) 1, MPEG-2, MPEG-3, and MPEG-4.
The external memory interface 120 may connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving files such as music and videos on the external memory card.
The internal memory 121 may store computer-executable program code including instructions. By running the instructions stored in the internal memory 121, the processor 110 executes the various functional applications and data processing of the electronic device 100. The internal memory 121 may include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created during use of the electronic device 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device, or universal flash storage (UFS).
The electronic device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone jack 170D, the application processor, and the like.
The audio module 170 converts digital audio information into an analog audio signal for output, and converts analog audio input into a digital audio signal; it may also encode and decode audio signals. In some examples, the audio module 170, or some of its functional modules, may be provided in the processor 110.
The speaker 170A, also called the "horn", converts an audio electrical signal into a sound signal. Through the speaker 170A, the electronic device 100 may play music, carry a hands-free call, or broadcast the content to be broadcast by the dialogue tasks in the human-machine spoken dialogue system.
The receiver 170B, also called the "earpiece", converts an audio electrical signal into a sound signal. When the electronic device 100 answers a call or a voice message, the receiver 170B can be held close to the ear to listen.
The microphone 170C, also called the "mic" or "mouthpiece", converts a sound signal into an electrical signal. When making a call or sending a voice message, the user can speak close to the microphone 170C to input the sound signal. The electronic device 100 may have at least one microphone 170C. In other examples, the electronic device 100 may have two microphones 170C to additionally implement noise reduction, or three, four or more microphones 170C to collect sound, reduce noise, identify the sound source, and implement directional recording.
The earphone jack 170D connects wired earphones. The earphone jack 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
The pressure sensor 180A senses a pressure signal and can convert it into an electrical signal. In some examples, the pressure sensor 180A may be provided on the display 194. There are many kinds of pressure sensors, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates of conductive material; when force acts on the sensor, the capacitance between the electrodes changes, and the electronic device 100 determines the pressure strength from the change in capacitance. When a touch operation acts on the display 194, the electronic device 100 detects the strength of the touch via the pressure sensor 180A, and can also calculate the touch position from its detection signal. In some examples, touch operations at the same position but of different strengths may correspond to different instructions: for example, when a touch whose strength is below a first pressure threshold acts on the SMS application icon, an instruction to view the message is executed; when the strength is at or above the first pressure threshold, an instruction to create a new message is executed.
The gyroscope sensor 180B may determine the motion posture of the electronic device 100. In some examples, the angular velocities of the electronic device 100 about three axes (i.e., the x, y and z axes) may be determined through the gyroscope sensor 180B. The gyroscope sensor 180B may be used for image stabilization: for example, when the electronic device 100 collects the user's feature information from the environment, the gyroscope sensor 180B detects the shaking angle of the device, calculates the distance the lens module needs to compensate, and lets the lens counteract the shaking through reverse motion.
The barometric pressure sensor 180C measures air pressure. In some examples, the electronic device 100 calculates the altitude from the pressure value measured by the sensor 180C to assist positioning and navigation.
The acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 100 in all directions (generally on three axes), and can detect the magnitude and direction of gravity when the device is stationary. It can also be used to identify the posture of the device, for applications such as landscape/portrait switching and pedometers.
The distance sensor 180F measures distance; the electronic device 100 may measure distance by infrared or laser. In some examples, when collecting the user's feature information from the environment, the electronic device 100 may use the distance sensor 180F to measure distance for fast focusing.
The ambient light sensor 180L senses the ambient light brightness; the electronic device 100 may adaptively adjust the brightness of the display 194 accordingly.
The fingerprint sensor 180H collects fingerprints; the electronic device 100 may use the collected fingerprint characteristics for fingerprint unlocking, accessing application locks, fingerprint photographing, answering calls with a fingerprint, and so on.
The temperature sensor 180J detects temperature. In some examples, the electronic device 100 executes a temperature processing strategy based on the temperature detected by the sensor 180J: when the reported temperature exceeds a threshold, the device lowers the performance of a processor near the sensor to reduce power consumption and implement thermal protection; in other embodiments, when the temperature is below another threshold, the device heats the battery 142 to avoid abnormal shutdown at low temperature; in still other embodiments, when the temperature is below yet another threshold, the device boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
The touch sensor 180K, also called a "touch device", may be provided on the display 194; together they form a touchscreen. The touch sensor 180K detects touch operations acting on or near it and may pass the detected operation to the application processor to determine the touch event type; visual output related to the touch operation may be provided through the display 194. In other embodiments, the touch sensor 180K may be provided on the surface of the electronic device 100 at a position different from the display 194.
The buttons 190 include a power button, volume buttons, an input keyboard, and so on; they may be mechanical or touch-type. The electronic device 100 may receive button input and generate key signal input related to user settings and function control of the electronic device 100.
The motor 191 can produce vibration prompts, for incoming-call vibration or touch vibration feedback. For example, touch operations on different applications (such as video playback or audio playback) may correspond to different vibration feedback effects, as may touch operations on different areas of the display 194 and different application scenarios (time reminders, receiving messages, alarms, games, etc.); custom vibration feedback effects are also supported.
The indicator 192 may be an indicator light, used to indicate the charging state and battery change, or to indicate messages, missed calls, notifications, etc.
Next, a dialogue task management method provided by this solution is introduced.
It should be noted that a dialogue task in this solution may include the processing of multiple rounds of dialogue, where the processing of each round includes natural language processing and broadcast processing, and the natural language processing may be used to obtain the reply information to be broadcast by the broadcast processing. For example, referring again to FIG. 1, "good morning" in FIG. 1 can be understood as a dialogue task, while "time query", "new weather", "chicken soup for the soul" and "today's news" are the dialogues in that task, with "time query" being one round of dialogue. Referring again to FIG. 2, the "natural language processing time" of each round in FIG. 2 is the time needed for the natural language processing when that round is processed, and the "broadcast processing time" is the time needed to broadcast the reply information. The reply information can be understood as the information replied to the user: for example, after the user says "good morning" to a smart speaker, the speaker may reply with the current time, today's weather and other information, and the information the speaker replies with is the reply information mentioned in this solution.
FIG. 4 is a schematic flowchart of a dialogue task management method according to an embodiment of this application. As shown in FIG. 4, the method may include the following steps:
Step S101: while processing the current round of dialogue, asynchronously perform natural language processing for the next round of dialogue.
In this solution, while the current round is being processed, the natural language processing of the next round can be performed asynchronously so that the reply information the next round needs to broadcast is obtained in advance. For example, as shown in FIG. 5a, if the current round is "time query", the natural language processing of "new weather" can be performed asynchronously while the time information is being broadcast, so that the weather information is obtained in advance; if the current round is "new weather", the natural language processing of "chicken soup for the soul" can be performed asynchronously while the weather information is being broadcast, so that the chicken-soup information is obtained in advance.
It can be understood that, in this solution, the reply information obtained by executing the natural language processing of each round may be, but is not limited to being, stored in the cache of the device where the human-machine spoken dialogue system resides.
In one example, the current round may be the first round of the dialogue task; for example, referring again to FIG. 5a, the current round may be "time query".
In one example, "while processing the current round" may mean while performing the natural language processing of the current round, or while performing its broadcast processing, which is not limited here.
Step S102: when the broadcast processing of the current round is completed, obtain the reply information produced by the natural language processing performed for the next round, and perform the broadcast processing of the next round.
When the broadcast processing of the current round is completed, the current round has finished and the next round can begin. Since the reply information needed by the next round's broadcast processing has already been obtained in advance, it can be fetched directly and the broadcast processing of that round can be performed to broadcast the reply information. This avoids pauses and waiting between two rounds of dialogue.
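The pacing of steps S101 and S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `nlp` is a hypothetical stand-in for the full natural-language-processing stage, and "broadcast" is represented by collecting the replies in order.

```python
from concurrent.futures import ThreadPoolExecutor

def nlp(round_name):
    # Hypothetical stand-in for natural language processing: produces the reply to broadcast.
    return f"reply for {round_name}"

def run_dialogue_task(rounds):
    """S101/S102 sketch: while 'broadcasting' round N, run NLP for round N+1 asynchronously."""
    broadcasts = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        reply = nlp(rounds[0])  # the first round has no earlier round to hide its NLP behind
        for i in range(len(rounds)):
            # S101: launch the next round's NLP before broadcasting the current round.
            future = pool.submit(nlp, rounds[i + 1]) if i + 1 < len(rounds) else None
            broadcasts.append(reply)          # broadcast processing of the current round
            if future is not None:
                reply = future.result()       # S102: reply was (typically) computed during the broadcast
    return broadcasts

print(run_dialogue_task(["time query", "new weather", "chicken soup", "today's news"]))
```

The round names here are only the examples from FIG. 1; the same loop works for any orchestrated sequence.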
In one example, when the natural language processing of the next round is completed, the natural language processing of the round after the next can be performed asynchronously, so that the reply information that round needs to broadcast is also obtained in advance. For example, as shown in FIG. 5b, if the current round is "time query", then during its broadcast processing, once the natural language processing of "new weather" has completed, the natural language processing of "chicken soup for the soul" can start asynchronously so that the chicken-soup information is obtained in advance.
It can be understood that if the broadcast processing of the current round has not yet finished when the asynchronous natural language processing of the round after the next ends, the natural language processing of yet another round can be performed asynchronously. As shown in FIG. 6, if the current round is "time query", then during its broadcast processing, once the natural language processing of "new weather" has completed, the natural language processing of "chicken soup for the soul" can start asynchronously; and if the broadcast processing of "time query" has still not finished when the natural language processing of "chicken soup for the soul" completes, the natural language processing of "today's news" can start asynchronously.
In one example, the reply information obtained by the natural language processing of each round may include the reply information corresponding to the identifier of that round and the identifier of the session window corresponding to the dialogue task.
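The cached reply structure described here can be sketched as a map keyed by the session-window identifier and the round identifier; all names below (`reply_cache`, `store_reply`, `fetch_reply`, the example identifiers) are illustrative assumptions, not from the source.

```python
# Hypothetical reply cache keyed by (session window id, round id).
reply_cache = {}

def store_reply(session_id, round_id, reply):
    # Called when a round's asynchronous NLP finishes.
    reply_cache[(session_id, round_id)] = reply

def fetch_reply(session_id, round_id):
    # Returns None when the round's NLP has not finished yet (caller must then wait).
    return reply_cache.get((session_id, round_id))

store_reply("morning-greeting", 2, "Sunny, 25 degrees")
assert fetch_reply("morning-greeting", 2) == "Sunny, 25 degrees"
assert fetch_reply("morning-greeting", 3) is None
```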
In one example, when the broadcast processing of the current round is completed and the natural language processing of the next round is also completed, the broadcast processing of the next round can be performed. For example, referring again to FIG. 5a, when the broadcast in the "time query" task has finished and the reply information needed by the broadcast processing in the "new weather" task has been obtained, the broadcast processing of "new weather" can begin.
In one example, when the broadcast processing of the current round is completed but the natural language processing of the next round is not, the method can wait for the natural language processing of the next round to complete. For example, as shown in FIG. 7, when the broadcast in the "time query" task has finished but the reply information needed by the broadcast processing in the "new weather" task has not yet been obtained (that is, its natural language processing has not ended), the method waits until the natural language processing of "new weather" completes, and then performs the broadcast processing of "new weather".
In one example, waiting for the natural language processing of the next round to complete may include: keeping a first thread in a blocked state and a second thread in a running state, where the first thread is used to perform the broadcast processing of the next round and the second thread is used to perform its natural language processing; when the second thread completes the natural language processing of the next round, it feeds back the execution result to the first thread so that the first thread switches from the blocked state to the running state. For example, this solution can be implemented with a callback function (such as a future.get function). As shown in FIG. 8a, the first thread may call the second thread and, in the meantime, handle other matters; the first thread then registers the future.get function with the second thread and continues handling other matters while waiting for the second thread to finish; when the second thread finishes, it returns the completed result to the first thread, which then ends its wait and starts processing the task. It can be understood that, as shown in FIG. 8b, in current techniques the first thread generally polls the second thread periodically after calling it, and the second thread reports back whether it has finished. Although this can also implement waiting for the next round's natural language processing to complete, the periodic polling prevents the first thread from handling other matters during the wait, whereas in this solution the first thread can handle other matters during the wait, improving the processing efficiency of the device.
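The blocking wait described above maps onto a standard future pattern. In Python's `concurrent.futures`, `Future.result()` plays the role the text assigns to the Java-style `future.get`: the first thread blocks only at the moment it actually needs the reply, while the worker thread runs the NLP. The function below is a hypothetical stub.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def nlp_next_round():
    # Worker thread: simulated natural language processing for the next round.
    time.sleep(0.05)
    return "weather reply"

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(nlp_next_round)   # second thread starts running immediately
    # ... the first thread may handle other work here instead of polling the second thread ...
    reply = future.result()                # first thread blocks here until the result is fed back
print(reply)
```

`result()` also accepts a timeout, which is one way a real system could bound how long the broadcast thread is willing to wait.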
It can be understood that, in this solution, if the natural language processing of any round fails, the dialogue task can be ended once the round preceding that round has finished executing, i.e. the subsequent dialogues are no longer executed. In addition, after the natural language processing of any round fails, the natural language processing of the rounds after it may also be prohibited.
FIG. 9 is a schematic flowchart of another dialogue task management method according to an embodiment of this application. As shown in FIG. 9, the method may include the following steps:
Step S201: while processing the current round of dialogue, asynchronously and concurrently perform natural language processing for at least two of the remaining rounds of the dialogue task, the at least two rounds including the round following the current round.
In this solution, while the current round is being processed, the natural language processing of at least two of the remaining rounds can be performed asynchronously and concurrently, so that the reply information those rounds need to broadcast is obtained in advance; the at least two rounds include the round following the current round. For example, as shown in FIG. 10a, if the current round is "time query", the natural language processing of "new weather", "chicken soup for the soul" and "today's news" can be performed asynchronously and concurrently while the natural language processing of the current round is being executed, so that the weather, chicken-soup and news information is obtained in advance; as shown in FIG. 10b, if the current round is "time query", the same three processes can be performed asynchronously and concurrently while the time information is being broadcast.
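Step S201's concurrent launch of several remaining rounds can be sketched with a thread pool; the round names and the `nlp` stub below are illustrative assumptions, not part of the source.

```python
from concurrent.futures import ThreadPoolExecutor

def nlp(round_name):
    # Hypothetical stand-in for the natural language processing of one round.
    return f"reply for {round_name}"

remaining = ["new weather", "chicken soup", "today's news"]  # rounds left after "time query"
with ThreadPoolExecutor(max_workers=len(remaining)) as pool:
    futures = {r: pool.submit(nlp, r) for r in remaining}    # S201: all launched concurrently
    replies = {r: f.result() for r, f in futures.items()}    # gathered as each completes
print(replies)
```

Each reply would then be cached under its round identifier so the broadcast step can fetch it without re-running the NLP.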
In one example, the at least two rounds may be those whose corresponding natural language processing executes most efficiently among the remaining rounds of the dialogue task. The natural language processing corresponding to the at least two rounds may not depend on at least one of permission settings and network quality. For example, suppose the dialogue task includes "time query" and "today's news": the time information needed by "time query" can be obtained from the device's own clock, without depending on external conditions, while the news information needed by "today's news" can only be obtained when the device is connected to the network, i.e. it depends on external conditions. Therefore the "time query" dialogue executes most efficiently, and it does not depend on network quality.
In addition, the at least two rounds may also be the dialogues with relatively higher priority in the dialogue task. For example, as shown in FIG. 11, if the priorities of "time query" and "chicken soup for the soul" are higher than that of "today's news", the at least two rounds are "time query" and "chicken soup for the soul"; and after the natural language processing of "time query" and "chicken soup for the soul" has been executed, the natural language processing of "today's news" is then executed.
In one example, the at least two rounds include all remaining rounds of the dialogue task; in other words, the at least two rounds are the not-yet-executed rounds other than the current round. For example, referring again to FIG. 10b, if the current round is "time query", the at least two rounds are "new weather", "chicken soup for the soul" and "today's news"; if the current round is "new weather", the at least two rounds are "chicken soup for the soul" and "today's news".
In one example, the current round may be the first round of the dialogue task; for example, referring again to FIG. 10a, the current round may be "time query".
In one example, "while processing the current round" may mean while performing the natural language processing of the current round, or while performing its broadcast processing, which is not limited here.
Step S202: when the broadcast processing of the current round is completed, obtain the reply information produced by the natural language processing performed for the next round, and perform the broadcast processing of the next round.
When the broadcast processing of the current round is completed, the current round has finished and the next round can begin. Since the reply information needed by the next round's broadcast processing has already been obtained in advance, it can be fetched directly and the broadcast processing of that round can be performed to broadcast the reply information. This avoids pauses and waiting between two rounds of dialogue.
In one example, the reply information obtained by the natural language processing of each round includes the reply information corresponding to that round, as well as at least one of the identifier of the session window corresponding to the dialogue task and the identifier of that round. The session-window identifier can be used to find the corresponding dialogue task when fetching reply information from the cache; the round identifier can be used to find the corresponding dialogue when fetching reply information.
In one example, when the broadcast processing of the current round is completed and the natural language processing of the next round is also completed, the broadcast processing of the next round can be performed. For example, referring again to FIG. 10a, when the broadcast in the "time query" task has finished and the reply information needed by the broadcast processing in the "new weather" task has been obtained, the broadcast processing of "new weather" can begin.
In one example, when the broadcast processing of the current round is completed but the natural language processing of the next round is not, the method can wait for the natural language processing of the next round to complete. For example, as shown in FIG. 6, when the broadcast in the "time query" task has finished but the reply information needed by the broadcast processing in the "new weather" task has not yet been obtained (that is, its natural language processing has not ended), the method waits until the natural language processing of "new weather" completes, and then performs the broadcast processing of "new weather".
In one example, waiting for the natural language processing of the next round to complete may include: keeping a first thread in a blocked state and a second thread in a running state, where the first thread is used to perform the broadcast processing of the next round and the second thread is used to perform its natural language processing; when the second thread completes the natural language processing of the next round, it feeds back the execution result to the first thread so that the first thread switches from the blocked state to the running state. For example, this solution can be implemented with a callback function (such as a future.get function). As shown in FIG. 8a, the first thread may call the second thread and, in the meantime, handle other matters; the first thread then registers the future.get function with the second thread and continues handling other matters while waiting for the second thread to finish; when the second thread finishes, it returns the completed result to the first thread, which then ends its wait and starts processing the task. It can be understood that, as shown in FIG. 8b, in current techniques the first thread generally polls the second thread periodically after calling it, and the second thread reports back whether it has finished. Although this can also implement waiting for the next round's natural language processing to complete, the periodic polling prevents the first thread from handling other matters during the wait, whereas in this solution the first thread can handle other matters during the wait, improving the processing efficiency of the device.
It can be understood that, in this solution, if the natural language processing of any round fails, the dialogue task can be ended once the round preceding that round has finished executing, i.e. the subsequent dialogues are no longer executed. In addition, after the natural language processing of any round fails, the natural language processing of the rounds after it may also be prohibited.
For ease of understanding, the natural language processing and broadcast processing involved in this solution are introduced below.
As shown in FIG. 12, a user A may issue a voice instruction, for example a wakeup word, which the smart terminal 21 collects. The automatic speech recognition (ASR) module 211 in the smart terminal 21 then recognizes the voice instruction issued by user A and converts the lexical content of the instruction into computer-readable input, such as key codes, binary codes or character sequences. Next, the natural language understanding (NLU) module 212 in the smart terminal 21 processes the text produced by the ASR module 211, combining semantic-understanding approaches such as text matching, semantic similarity matching, information retrieval and multi-intent classification models, and identifies the user's intent. The NLU module 212 may retrieve information from a knowledge base 213, which may be configured in the smart terminal 21 or on another device, such as a server. The dialog management (DM) module 214 in the smart terminal 21 then determines, based on the user's intent, the interaction information to be output (for example, a weather query) and obtains the interaction information to be output (i.e., the reply) from a server 22 or another device. Next, the text-to-speech (TTS) module 215 in the smart terminal 21 converts the interaction information obtained by the DM module 214 into speech. Finally, the smart terminal 21 broadcasts to user A the speech produced by the TTS module 215.
It can be understood that the dialog management (DM) module 214 can manage the dialogue tasks pre-orchestrated by the user, so that the rounds of dialogue in a dialogue task are executed in sequence.
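The four-stage flow of FIG. 12 (ASR → NLU → DM → TTS) can be summarized as a simple composition of stages. Every function body below is a hypothetical stub, since the actual recognition and understanding models are not part of this description; only the stage order and data flow mirror the text.

```python
def asr(audio):
    # Speech recognition: audio -> text (stubbed with a fixed transcription).
    return "good morning"

def nlu(text):
    # Language understanding: text -> intent (stubbed rule).
    return {"intent": "morning_routine"} if text == "good morning" else {"intent": "unknown"}

def dm(result):
    # Dialog management: intent -> reply text to output (stubbed lookup).
    return "It is 7:00 AM." if result["intent"] == "morning_routine" else "Sorry?"

def tts(reply):
    # Text-to-speech: reply text -> synthesized speech (represented as a tagged string).
    return f"<speech:{reply}>"

def handle_utterance(audio):
    # The pipeline of FIG. 12: ASR -> NLU -> DM -> TTS.
    return tts(dm(nlu(asr(audio))))

print(handle_utterance(b"raw-audio-bytes"))
```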
Based on the methods in the above embodiments, an embodiment of this application provides a dialogue task management apparatus. Referring to FIG. 13, FIG. 13 is a schematic structural diagram of a dialogue task management apparatus according to an embodiment of this application. As shown in FIG. 13, the dialogue task management apparatus can be used to implement the methods described in the above method embodiments.
The dialogue task management apparatus includes at least one processor 1301, which can support the apparatus in implementing the methods provided in the embodiments of this application.
The processor 1301 may be a general-purpose or a special-purpose processor. For example, the processor 1301 may include a central processing unit (CPU) and/or a baseband processor, where the baseband processor may be used to process communication data (for example, determining a target screen terminal) and the CPU may be used to implement the corresponding control and processing functions, execute software programs, and process the data of software programs.
Further, the dialogue task management apparatus may also include a transceiver unit 1305 to implement signal input (reception) and output (transmission). For example, the transceiver unit 1305 may include a transceiver or a radio frequency chip; the transceiver unit 1305 may also include a communication interface.
Optionally, the dialogue task management apparatus may also include an antenna 1306, which may be used to support the transceiver unit 1305 in implementing the transceiver function of the apparatus.
Optionally, the dialogue task management apparatus may include one or more memories 1302 storing a program (which may also be instructions or code) 1304 that can be run by the processor 1301, causing the processor 1301 to execute the methods described in the above method embodiments. Optionally, data may also be stored in the memory 1302. Optionally, the processor 1301 may also read data stored in the memory 1302 (for example, pre-stored first feature information); the data may be stored at the same storage address as the program 1304, or at a different storage address.
The processor 1301 and the memory 1302 may be provided separately or integrated together, for example on a single board or a system on chip (SOC).
For detailed descriptions of the operations performed by the dialogue task management apparatus in the various possible designs above, refer to the descriptions in the method embodiments provided by this application; they are not repeated here.
Based on the apparatus in the above embodiments, an embodiment of this application also provides an electronic device, which includes the dialogue task management apparatus provided in the above embodiments.
Based on the methods in the above embodiments, an embodiment of this application also provides a chip. Referring to FIG. 14, FIG. 14 is a schematic structural diagram of a chip according to an embodiment of this application. As shown in FIG. 14, the chip 1400 includes one or more processors 1401 and an interface circuit 1402. Optionally, the chip 1400 may also include a bus 1403, where:
The processor 1401 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above methods may be completed by integrated logic circuits of hardware in the processor 1401 or by instructions in the form of software. The processor 1401 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods and steps disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The interface circuit 1402 may be used to send or receive data, instructions or information; the processor 1401 may process the data, instructions or other information received through the interface circuit 1402, and may send the processed information out through the interface circuit 1402.
Optionally, the chip 1400 also includes a memory, which may include read-only memory and random access memory and provides operation instructions and data to the processor; part of the memory may also include non-volatile random access memory (NVRAM).
Optionally, the memory stores executable software modules or data structures, and the processor can perform the corresponding operations by calling the operation instructions stored in the memory (which may be stored in an operating system).
Optionally, the interface circuit 1402 may be used to output the execution results of the processor 1401.
It should be noted that the functions corresponding to the processor 1401 and the interface circuit 1402 may each be implemented by hardware design, by software design, or by a combination of software and hardware, which is not limited here.
It should be understood that the steps of the above method embodiments may be completed by logic circuits in the form of hardware in the processor, or by instructions in the form of software.
It can be understood that the processor in the embodiments of this application may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. A general-purpose processor may be a microprocessor, or any conventional processor.
The method steps in the embodiments of this application may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium; the storage medium may also be a component of the processor. The processor and the storage medium may reside in an ASIC.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product, which includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are produced. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted over such a medium, and may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (for example, floppy disks, hard disks, magnetic tapes), optical media (for example, DVDs), or semiconductor media (for example, solid state disks (SSDs)).
It can be understood that the various numerical designations involved in the embodiments of this application are merely distinctions made for convenience of description and are not intended to limit the scope of the embodiments.

Claims (16)

  1. A dialogue task management method, wherein the dialogue task includes the processing of multiple rounds of dialogue, the processing of each round of dialogue includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing;
    the method comprising:
    while processing the current round of dialogue, asynchronously performing natural language processing for the next round of dialogue;
    when the broadcast processing of the current round of dialogue is completed, obtaining the reply information produced by the natural language processing performed for the next round of dialogue, and performing the broadcast processing of the next round of dialogue.
  2. The method according to claim 1, further comprising:
    when the natural language processing of the next round of dialogue is completed, asynchronously performing the natural language processing of the round of dialogue after the next round.
  3. The method according to claim 1 or 2, wherein the reply information obtained by the natural language processing of each round of dialogue includes the reply information corresponding to the identifier of that round and the identifier of the session window corresponding to the dialogue task.
  4. The method according to any one of claims 1-3, further comprising at least one of the following:
    the current round of dialogue is the first round of the dialogue task; or, processing the current round of dialogue includes: performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is completed, performing the broadcast processing of the next round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is not completed, waiting for the natural language processing of the next round to complete.
  5. The method according to claim 4, wherein waiting for the natural language processing of the next round of dialogue to complete comprises:
    keeping a first thread in a blocked state and a second thread in a running state, the first thread being used to perform the broadcast processing of the next round of dialogue and the second thread being used to perform the natural language processing of the next round of dialogue, wherein the second thread, upon completing the natural language processing of the next round of dialogue, feeds back the execution result to the first thread, so that the first thread switches from the blocked state to the running state.
  6. A dialogue task management method, wherein the dialogue task includes the processing of multiple rounds of dialogue, the processing of each round of dialogue includes natural language processing and broadcast processing, and the natural language processing is used to obtain the reply information to be broadcast by the broadcast processing;
    the method comprising:
    while processing the current round of dialogue, asynchronously and concurrently performing natural language processing for at least two of the remaining rounds of dialogue of the dialogue task, the at least two rounds including the round following the current round;
    when the broadcast processing of the current round of dialogue is completed, obtaining the reply information produced by the natural language processing performed for the next round of dialogue, and performing the broadcast processing of the next round of dialogue.
  7. The method according to claim 6, wherein the at least two rounds of dialogue are those whose corresponding natural language processing executes most efficiently among the remaining rounds of the dialogue task.
  8. The method according to claim 7, wherein the natural language processing corresponding to the at least two rounds of dialogue does not depend on permission settings and/or network quality.
  9. The method according to any one of claims 6-8, wherein the at least two rounds of dialogue include the remaining rounds of the dialogue task.
  10. The method according to any one of claims 6-9, wherein the reply information obtained by the natural language processing of each round of dialogue includes the reply information corresponding to that round, as well as the identifier of the session window corresponding to the dialogue task and/or the identifier of that round.
  11. The method according to any one of claims 6-10, further comprising at least one of the following:
    the current round of dialogue is the first round of the dialogue task; or, processing the current round of dialogue includes: performing the natural language processing or the broadcast processing of the current round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is completed, performing the broadcast processing of the next round; or, when the broadcast processing of the current round is completed and the natural language processing of the next round is not completed, waiting for the natural language processing of the next round to complete.
  12. The method according to claim 11, wherein waiting for the natural language processing of the next round of dialogue to complete comprises:
    keeping a first thread in a blocked state and a second thread in a running state, the first thread being used to perform the broadcast processing of the next round of dialogue and the second thread being used to perform the natural language processing of the next round of dialogue, wherein the second thread, upon completing the natural language processing of the next round of dialogue, feeds back the execution result to the first thread, so that the first thread switches from the blocked state to the running state.
  13. A dialogue task management apparatus, comprising:
    a memory for storing a program; and
    a processor for executing the program stored in the memory, wherein, when the program stored in the memory is executed, the processor is configured to perform the method according to any one of claims 1-12.
  14. An electronic device, comprising the apparatus according to claim 13.
  15. A computer storage medium storing instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1-12.
  16. A chip, comprising at least one processor and an interface;
    the interface being configured to provide program instructions or data for the at least one processor;
    the at least one processor being configured to execute the program instructions to implement the method according to any one of claims 1-12.
PCT/CN2021/136167 2020-12-31 2021-12-07 对话任务管理方法、装置及电子设备 WO2022143048A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011638531.5A CN114691844A (zh) 2020-12-31 2020-12-31 对话任务管理方法、装置及电子设备
CN202011638531.5 2020-12-31

Publications (1)

Publication Number Publication Date
WO2022143048A1 true WO2022143048A1 (zh) 2022-07-07

Family

ID=82135401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/136167 WO2022143048A1 (zh) 2020-12-31 2021-12-07 对话任务管理方法、装置及电子设备

Country Status (2)

Country Link
CN (1) CN114691844A (zh)
WO (1) WO2022143048A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110202924A1 (en) * 2010-02-17 2011-08-18 Microsoft Corporation Asynchronous Task Execution
CN109086026A (zh) * 2018-07-17 2018-12-25 阿里巴巴集团控股有限公司 播报语音的确定方法、装置和设备
CN109741753A (zh) * 2019-01-11 2019-05-10 百度在线网络技术(北京)有限公司 一种语音交互方法、装置、终端及服务器
CN111091813A (zh) * 2019-12-31 2020-05-01 北京猎户星空科技有限公司 语音唤醒模型更新方法、装置、设备及介质
CN111916082A (zh) * 2020-08-14 2020-11-10 腾讯科技(深圳)有限公司 语音交互方法、装置、计算机设备和存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110202924A1 (en) * 2010-02-17 2011-08-18 Microsoft Corporation Asynchronous Task Execution
CN109086026A (zh) * 2018-07-17 2018-12-25 阿里巴巴集团控股有限公司 播报语音的确定方法、装置和设备
CN109741753A (zh) * 2019-01-11 2019-05-10 百度在线网络技术(北京)有限公司 一种语音交互方法、装置、终端及服务器
CN111091813A (zh) * 2019-12-31 2020-05-01 北京猎户星空科技有限公司 语音唤醒模型更新方法、装置、设备及介质
CN111916082A (zh) * 2020-08-14 2020-11-10 腾讯科技(深圳)有限公司 语音交互方法、装置、计算机设备和存储介质

Also Published As

Publication number Publication date
CN114691844A (zh) 2022-07-01

Similar Documents

Publication Publication Date Title
US20220223150A1 (en) Voice wakeup method and device
CN110784830B (zh) 数据处理方法、蓝牙模块、电子设备与可读存储介质
CN110347269B (zh) 一种空鼠模式实现方法及相关设备
CN112289313A (zh) 一种语音控制方法、电子设备及系统
CN113504851A (zh) 一种播放多媒体数据的方法及电子设备
CN111369988A (zh) 一种语音唤醒方法及电子设备
WO2020119492A1 (zh) 消息处理方法及相关装置
CN111522250B (zh) 智能家居系统及其控制方法与装置
WO2020073288A1 (zh) 一种触发电子设备执行功能的方法及电子设备
CN112806067B (zh) 语音切换方法、电子设备及系统
CN113973398B (zh) 无线网络连接方法、电子设备及芯片系统
CN114115770A (zh) 显示控制的方法及相关装置
WO2022161077A1 (zh) 语音控制方法和电子设备
WO2023273321A1 (zh) 一种语音控制方法及电子设备
CN113365274B (zh) 一种网络接入方法和电子设备
CN114077519B (zh) 一种系统服务恢复方法、装置和电子设备
CN109285563B (zh) 在线翻译过程中的语音数据处理方法及装置
WO2022143048A1 (zh) 对话任务管理方法、装置及电子设备
WO2021254294A1 (zh) 一种切换音频输出通道的方法、装置和电子设备
CN113380240B (zh) 语音交互方法和电子设备
WO2022062902A1 (zh) 一种文件传输方法和电子设备
WO2022007757A1 (zh) 跨设备声纹注册方法、电子设备及存储介质
CN116032942A (zh) 跨设备的导航任务的同步方法、装置、设备及存储介质
CN114116610A (zh) 获取存储信息的方法、装置、电子设备和介质
CN114079809A (zh) 终端及其输入方法与装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913789

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21913789

Country of ref document: EP

Kind code of ref document: A1