CN114041102A - Service providing method and device - Google Patents

Service providing method and device

Info

Publication number
CN114041102A
CN114041102A
Authority
CN
China
Prior art keywords
service
user
recognition result
electronic device
voiceprint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080003002.XA
Other languages
Chinese (zh)
Inventor
杨宇辰
高振东
孙凤宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN114041102A publication Critical patent/CN114041102A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification

Abstract

Embodiments of this application disclose a service providing method and apparatus, relating to the field of information processing and enabling an electronic device without a voiceprint recognition function to provide services based on voiceprint recognition results. The method includes the following step: after receiving a first recognition result from a first electronic device, a service device provides a first service to a first user. The service device does not have a voiceprint recognition function; the first recognition result includes a first voiceprint recognition result, which indicates the first user recognized by the first electronic device through its voiceprint recognition function. The method is applied to processes that provide services based on voiceprint recognition.

Description

Service providing method and device
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a service providing method and apparatus.
Background
Voiceprint recognition is a technique for identifying the sender of a speech signal from the speech signal itself. An electronic device supporting voiceprint recognition collects a speech signal, identifies the sender, and then provides or recommends customized services to that sender. Further, to provide better service to a user (for example, task collaboration or task migration), multiple electronic devices are required, each supporting voiceprint recognition.
However, voiceprint recognition places high demands on a device's chip computing power, resulting in high hardware cost. Moreover, a voiceprint mapping model must be established between every two of the electronic devices to resolve the inconsistency of voiceprint features caused by hardware differences between devices. As the number of electronic devices grows, both the hardware cost and the complexity of the voiceprint mapping models increase further.
Disclosure of Invention
The embodiments of this application provide a service providing method and apparatus that enable an electronic device without a voiceprint recognition function to provide services based on a voiceprint recognition result.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, an embodiment of this application provides a service providing apparatus that includes a first receiving unit and a processing unit. The first receiving unit is configured to receive a first recognition result from a first electronic device. The processing unit is configured to provide a first service to a first user. Here, the service providing apparatus is located in a service device, and the service device does not have a voiceprint recognition function. The first recognition result includes a first voiceprint recognition result, which indicates the first user recognized by the first electronic device through its voiceprint recognition function.
Therefore, even though the service providing apparatus has no voiceprint recognition function, once the first receiving unit receives the first voiceprint recognition result from the first electronic device, the processing unit can provide the first service to the first user indicated by that result. No voiceprint recognition hardware needs to be deployed on the service providing apparatus, which saves hardware cost; no voiceprint mapping model operations need to be executed; the range of device types that can provide the first service to the user is expanded; and the user's service experience is improved.
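The core flow of the first aspect can be sketched as follows. This is a minimal illustrative model, not the patent's implementation: the class and field names (`RecognitionResult`, `ServiceDevice`, `user_id`, `match_degree`) are assumptions introduced for illustration.

```python
# Hypothetical sketch: a service device with no voiceprint recognition of its own
# provides a service based on a recognition result received from another device.
from dataclasses import dataclass

@dataclass
class RecognitionResult:
    user_id: str          # the first user indicated by the voiceprint recognition result
    match_degree: float   # degree of match reported by the recognizing device

class ServiceDevice:
    """A device that cannot perform voiceprint recognition itself."""

    def on_recognition_result(self, result: RecognitionResult) -> str:
        # The device trusts the first electronic device's voiceprint result
        # and simply provides the first service to the indicated user.
        return f"first-service-for-{result.user_id}"

dev = ServiceDevice()
print(dev.on_recognition_result(RecognitionResult("user-1", 0.92)))
```

The point of the sketch is that `ServiceDevice` contains no recognition logic at all; it only consumes a result produced elsewhere.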
In one possible design, the first recognition result further includes a first semantic recognition result, which the first electronic device obtains by performing semantic recognition on a first voice request issued by the first user; the first semantic recognition result is used to request the first service. The processing unit is further configured to provide the first service to the first user according to the first semantic recognition result.
Therefore, even when the service providing apparatus does not have a semantic recognition function, it can obtain the first semantic recognition result from the first electronic device and provide the corresponding first service accordingly.
In a possible design, the service providing apparatus according to the embodiment of this application further includes a second receiving unit configured to receive a second voice request issued by the first user. The processing unit is further configured to perform semantic recognition on the second voice request to obtain a second semantic recognition result, and to provide the first service to the first user according to the second semantic recognition result. The second semantic recognition result is used to request the first service.
Therefore, under the condition that the service providing device has the semantic recognition function, the service providing device can also perform semantic recognition to obtain a second semantic recognition result, and then provides the corresponding first service according to the second semantic recognition result.
In one possible design, the processing unit is further configured to obtain user service information of the first user from the server, and provide the first user with the first service matching the user service information. Wherein the user service information includes at least one of preference information or history information. The preference information is used to indicate usage habits of the first user for the first service, and the history information is used to indicate at least one of an identification of a service device providing the first service for the first user, a service progress, or a service time.
That is, the service providing apparatus can also interact with the server to obtain the user service information of the first user and then provide the first service matching that information, such as playing the user's favorite music or resuming a movie the user has not finished watching.
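Matching a service against user service information fetched from the server could look like the sketch below. The dictionary structure and field names (`preference`, `history`, `progress_s`) are invented for illustration; the patent only specifies that preference information captures usage habits and history information captures device identification, service progress, and service time.

```python
# Illustrative user service information, as it might be returned by the server.
user_service_info = {
    "preference": {"genre": "jazz"},                    # usage habits
    "history": {"device": "tv-1", "progress_s": 1250},  # unfinished playback
}

def match_service(info: dict) -> str:
    # Provide a first service matching the user's preference, resuming
    # from the progress recorded in the history information.
    genre = info["preference"]["genre"]
    progress = info["history"]["progress_s"]
    return f"play {genre} from {progress}s"

print(match_service(user_service_info))
```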
In one possible design, the first receiving unit is further configured to receive a second recognition result from a second electronic device. The second recognition result includes a second voiceprint recognition result, which indicates a second user recognized by the second electronic device through its voiceprint recognition function. The processing unit is further configured to determine the first user as the service object from among the first user and the second user.
In this way, when the service providing apparatus can also acquire voiceprint recognition results from other electronic devices, the processing unit can autonomously determine the service object and accurately provide the first service to that object.
In one possible design, the first recognition result further includes a first degree of matching of the first voiceprint recognition result, and the second recognition result further includes a second degree of matching of the second voiceprint recognition result. The processing unit is specifically configured to determine, according to the first matching degree and the second matching degree, the first user as a service object from the first user and the second user. Wherein the first matching degree is not lower than the second matching degree.
In this way, when the service providing device can also obtain the voiceprint recognition results of other electronic devices, the processing unit determines the service object according to the matching degree of different voiceprint recognition results, so as to accurately provide the first service for the service object.
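Selecting the service object from several voiceprint recognition results by degree of match, as described above, reduces to picking the user with the highest matching degree. A minimal sketch (the tuple layout is an assumption for illustration):

```python
def pick_service_object(results: list[tuple[str, float]]) -> str:
    # results: (user, matching degree) pairs reported by different electronic
    # devices. The service device selects the user whose voiceprint recognition
    # result has the highest degree of match.
    return max(results, key=lambda r: r[1])[0]

candidates = [("first-user", 0.91), ("second-user", 0.74)]
print(pick_service_object(candidates))  # first-user
```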
In one possible design, the first service includes at least one of text, images, voice, video, or audio.
In a second aspect, an embodiment of this application provides a service providing method. The method includes: after receiving a first recognition result from a first electronic device, a service device provides a first service to a first user. The service device does not have a voiceprint recognition function. The first recognition result includes a first voiceprint recognition result, which indicates the first user recognized by the first electronic device through its voiceprint recognition function.
In one possible design, the first recognition result further includes a first semantic recognition result, which the first electronic device obtains by performing semantic recognition on a first voice request issued by the first user; the first semantic recognition result is used to request the first service. In this design, providing the first service to the first user includes: the service device provides the first service to the first user according to the first semantic recognition result.
In a possible design, the method further includes: after receiving a second voice request issued by the first user, the service device performs semantic recognition on the second voice request to obtain a second semantic recognition result, which is used to request the first service. In this design, providing the first service to the first user includes: the service device provides the first service to the first user according to the second semantic recognition result.
In a possible design, the method further includes: the service device acquires user service information of the first user from the server. In this design, providing the first service to the first user includes: the service device provides the first user with a first service that matches the user service information.
In one possible design, the user service information includes at least one of preference information indicating a usage habit of the first user for the first service or history information indicating at least one of an identification of a service device providing the first service for the first user, a service progress, or a service time.
In a possible design, the method further includes: after receiving a second recognition result from a second electronic device, the service device determines the first user as the service object from among the first user and the second user. The second recognition result includes a second voiceprint recognition result, which indicates a second user recognized by the second electronic device through its voiceprint recognition function.
In one possible design, the first recognition result further includes a first degree of matching of the first voiceprint recognition result, and the second recognition result further includes a second degree of matching of the second voiceprint recognition result. Determining the first user as the service object includes: the service device determines the first user as the service object from among the first user and the second user according to the first matching degree and the second matching degree, where the first matching degree is not lower than the second matching degree.
In one possible design, the first service includes at least one of text, images, voice, video, or audio.
In a third aspect, an embodiment of this application provides a service providing apparatus including a processor and an interface circuit. The processor is configured to communicate with other apparatuses through the interface circuit and to execute the service providing method of the second aspect or any possible design of the second aspect. There may be one or more processors.
In a fourth aspect, an embodiment of this application provides a service providing apparatus including a processor connected to a memory, the processor being configured to call a program stored in the memory to execute the service providing method of the second aspect or any possible design of the second aspect. The memory may be located inside or outside the service providing apparatus. There may be one or more processors.
In a fifth aspect, an embodiment of this application provides a service providing apparatus including at least one processor and at least one memory, the at least one processor being configured to execute the service providing method of the second aspect or any possible design of the second aspect.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium storing instructions that, when run on a computer, enable the computer to perform the service providing method of the second aspect or any possible design of the second aspect.
In a seventh aspect, an embodiment of this application provides a computer program product containing instructions that, when run on a computer, enable the computer to execute the service providing method of the second aspect or any possible design of the second aspect.
In an eighth aspect, an embodiment of this application provides a circuit system including a processing circuit configured to execute the service providing method of the second aspect or any possible design of the second aspect.
In a ninth aspect, an embodiment of this application provides a chip including a processor coupled to a memory. The memory stores program instructions which, when executed by the processor, implement the service providing method of the second aspect or any possible design of the second aspect.
In a tenth aspect, an embodiment of this application provides a service providing system. The service providing system includes the service device and the first electronic device of any of the above aspects; or the service device, the first electronic device, and the second electronic device; or the service device, the first electronic device, and the server; or the service device, the first electronic device, the second electronic device, and the server.
For the technical effects of any design of the second to tenth aspects, refer to the technical effects of the corresponding designs of the first aspect; details are not repeated here.
Drawings
Fig. 1 is a schematic diagram of a network architecture according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of a service providing method according to an embodiment of the present application;
fig. 4 is a schematic flowchart of another service providing method according to an embodiment of the present application;
fig. 5 is a schematic flowchart of another service providing method according to an embodiment of the present application;
fig. 6 is a schematic flowchart of another service providing method according to an embodiment of the present application;
fig. 7 is a schematic flowchart of another service providing method according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a service providing apparatus according to an embodiment of the present application.
Detailed Description
The terms "first" and "second" and the like in the description and drawings of this application are used to distinguish different objects, or different processes on the same object, and do not describe a particular order of the objects. Furthermore, the terms "including" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements, but may include other steps or elements not expressly listed or inherent to the process, method, article, or apparatus. In the embodiments of this application, "a plurality" means two or more. Words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration; any embodiment or design described as "exemplary" or "for example" is not to be construed as preferred or advantageous over other embodiments or designs. Rather, these words are intended to present related concepts in a concrete fashion.
First, technical terms involved in the related art are introduced:
1. task collaboration
Among a plurality of electronic devices, each device can determine, according to a pre-configured device selection rule, one device from the group to act as the service device that provides the best service to the user. For example, when providing an audio service, the electronic device closest to the user (e.g., a speaker) is used as the service device. In other words, the devices other than the one that finally provides the service give up the right to provide it, functionally breaking the isolation between the devices so as to offer the user the best-quality service.
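The pre-configured selection rule above ("closest device wins") can be sketched in a few lines. The distance map is an assumption for illustration; how devices actually measure distance to the user is outside the scope of this sketch.

```python
# Hedged sketch of a device selection rule for task collaboration: the device
# closest to the user becomes the service device; the others give up the right
# to provide the service.
devices = {"speaker": 1.2, "tv": 3.5, "phone": 0.8}  # device -> distance to user (m)

def select_service_device(distance_by_device: dict) -> str:
    return min(distance_by_device, key=distance_by_device.get)

print(select_service_device(devices))  # phone
```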
2. Task migration
Two electronic devices, an electronic device 1 and an electronic device 2, both store authentication information of a certain user. The electronic device 1 stores information about a service that the user has started but not completed. The electronic device 2 acquires that information from the electronic device 1 and continues to provide the unfinished service.
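The task migration above can be sketched as a handoff of an unfinished-service record between two devices. The class and record layout are invented for illustration; authentication is omitted.

```python
# Minimal sketch of task migration: device 1 records an unfinished service for a
# user; device 2 fetches that record and continues the service from where it
# left off.
class Device:
    def __init__(self, name: str):
        self.name = name
        self.unfinished = {}  # user -> (service, progress)

    def pause(self, user: str, service: str, progress: str) -> None:
        self.unfinished[user] = (service, progress)

    def migrate_from(self, other: "Device", user: str) -> str:
        service, progress = other.unfinished.pop(user)
        return f"{self.name} resumes {service} at {progress}"

d1, d2 = Device("device-1"), Device("device-2")
d1.pause("alice", "movie", "42:10")
print(d2.migrate_from(d1, "alice"))  # device-2 resumes movie at 42:10
```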
3. Voiceprint recognition
Voiceprint recognition is a biometric identification technique that uses a speech signal to identify the person who uttered it. For example, a speech signal is converted into a digital signal, and the sender of the speech signal is determined from the converted digital signal. The voiceprint recognition function can be text-independent or text-dependent. The algorithms employed may be, for example but not limited to: the Gaussian mixture model (GMM)-universal background model (UBM), i-vector, x-vector, or DNN-vector algorithms. The main processing steps of voiceprint recognition are as follows:
step one, the electronic equipment carries out audio processing on the collected voice signals.
And step two, the electronic equipment extracts the voiceprint characteristics of the voice signal after the audio processing to obtain the voiceprint characteristics of the voice signal.
And step three, the electronic equipment compares the voiceprint characteristics of the voice signal with the pre-registered voiceprint template to obtain a comparison result.
Wherein the electronic device stores the pre-registered voiceprint template. The pre-registered voiceprint templates can be one or more. Each pre-registered voiceprint template corresponds to a user identification. The comparison result may be a degree of match between the voiceprint features of the speech signal and each of the pre-registered voiceprint templates.
And step four, the electronic equipment judges and decides according to the comparison result and the voice print to obtain the sender of the voice signal.
For example, if the comparison result includes a plurality of matching degrees, the electronic device takes the user identified by the user identifier corresponding to the voiceprint template with the highest matching degree as the sender of the voice signal.
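Steps one to four can be sketched as follows. The feature vectors are toy placeholders, and cosine similarity stands in for the degree-of-match computation; a real system would derive embeddings with GMM-UBM, i-vector, or x-vector methods rather than the raw vectors shown here.

```python
# Sketch of the voiceprint comparison and decision steps: compare an extracted
# feature vector against every pre-registered template, then take the user of
# the best-matching template as the sender of the speech signal.
import math

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

templates = {  # user identification -> pre-registered voiceprint template
    "user-A": [0.9, 0.1, 0.3],
    "user-B": [0.2, 0.8, 0.5],
}

def identify(feature: list) -> tuple:
    # Step three: degree of match against each template.
    matches = {user: cosine(feature, tpl) for user, tpl in templates.items()}
    # Step four: the user of the template with the highest degree of match.
    best = max(matches, key=matches.get)
    return best, matches[best]

user, degree = identify([0.85, 0.15, 0.25])
print(user)  # user-A
```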
In the related art, in order to provide better service to a user (such as task collaboration or task migration), a plurality of electronic devices must be provided, each supporting voiceprint recognition, and a voiceprint mapping model must be constructed for every pair of electronic devices. However, voiceprint recognition places high demands on chip computing power, resulting in high hardware costs. If the number of electronic devices is large, the hardware cost and the complexity of the voiceprint mapping models increase further.
In view of the above, this embodiment provides a service providing method. Referring to fig. 1, a communication system to which the method applies includes a plurality of electronic devices 10. The electronic devices 10 are wirelessly connected, for example for data transmission via Bluetooth, wireless fidelity (Wi-Fi), or infrared. Only four electronic devices 10 are shown in fig. 1: a television, a mobile phone, an air conditioner, and a sound box. Optionally, the communication system further includes a cloud-side device 20. Fig. 1 is a schematic diagram and does not limit the applicable scenarios of the service providing method of this embodiment.
The electronic device 10 may or may not have a voiceprint recognition function. The electronic device 10 may be a mobile phone, a television, a sound box, an air conditioner, smart glasses, a tablet computer, a wearable device, an in-vehicle device, an Augmented Reality (AR)/Virtual Reality (VR) device, a desktop, a laptop, a handheld notebook, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), a camera, or the like. In practical applications, the electronic device 10 may be any one of information appliances. The information appliance refers to a product combining a computer (computer), communication (communication) and consumer electronics (consumer electronics), and is also called a 3C product. The embodiment of the present application does not set any limit to the specific form of the electronic device. In the embodiment of the present application, an electronic device that does not have a voiceprint recognition function and provides a service to a user is described as a "service device".
The cloud-side device 20 is configured to store user service information, such as a user's preference information or history information. The preference information indicates the user's usage habits for a certain service. The history information indicates at least one of the identification of a service device that provided a certain service to the user, the service progress, or the service time. The cloud-side device 20 may be a server or a server cluster. The algorithms it employs may be, for example but not limited to: content-based recommendation, collaborative filtering recommendation, or knowledge-based recommendation algorithms. The cloud-side device 20 can be regarded as a backend server, i.e., a service device that does not directly face the user.
The network architecture and service scenarios described in the embodiments of this application are intended to illustrate the technical solutions more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios appear, the technical solutions provided here remain applicable to similar technical problems.
In the communication system to which the service providing method according to the embodiment of the present application is applied, the specific structures of the different electronic devices 10 may be the same or different. Next, a mobile phone is taken as an example of the above electronic device, and fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identification Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It is to be understood that the illustrated structure of the present embodiment does not constitute a specific limitation to the electronic device. In other embodiments, an electronic device may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 110 may include one or more processing units, such as: the processor 110 may include one or more of an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, a neural-Network Processing Unit (NPU), and the like. The different processing units may be separate devices or may be integrated into one or more processors.
The controller may be a neural center and a command center of the electronic device. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.
In some embodiments, processor 110 may include one or more interfaces. The interface may include one or more of an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, a USB interface, and the like.
The charging management module 140 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive charging input from a wired charger via the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device. The charging management module 140 may also supply power to the electronic device through the power management module 141 while charging the battery 142.
The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives the input of the battery 142 and the charge management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 141 may also be disposed in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may be disposed in the same device.
The wireless communication function of the electronic device may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in an electronic device may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution for wireless communication applied to electronic devices, including the second generation mobile communication technology (2G), third generation mobile communication technology (3G), fourth generation mobile communication technology (4G), fifth generation mobile communication technology (5G), and the like. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 may further amplify signals modulated by the modem processor, which are then converted into electromagnetic waves by the antenna 1 and radiated. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the same device as at least some of the modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.) or displays an image or video through the display screen 194. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional modules, independent of the processor 110.
The wireless communication module 160 may provide solutions for wireless communication applied to electronic devices, including Wireless Local Area Networks (WLANs) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), Global Navigation Satellite Systems (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering processing on electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, perform frequency modulation and amplification on the signal, and convert the signal into electromagnetic waves through the antenna 2 to radiate the electromagnetic waves.
In some embodiments, antenna 1 of the electronic device is coupled to the mobile communication module 150 and antenna 2 is coupled to the wireless communication module 160, so that the electronic device can communicate with the network and other devices through wireless communication technologies. For example, the electronic device may perform a video call or a video conference with other electronic devices through the antenna 1 and the mobile communication module 150. The wireless communication technology may include one or more of global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, IR technology, and the like. The GNSS may include one or more of the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), a satellite based augmentation system (SBAS), and the like.
The electronic device implements the display function through the GPU, the display screen 194, and the application processor, etc. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a mini-LED, a micro-LED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device may include 1 or N display screens 194, N being a positive integer greater than 1. For example, in the present embodiment, the display screen 194 may display a video interface.
The electronic device may implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which processes it and converts it into an image visible to the naked eye. The ISP can also perform algorithmic optimization on the noise, brightness, and skin color of the image, and can optimize parameters such as the exposure and color temperature of a shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. For example, in the embodiment of the present application, the camera 193 may be used to capture video images near the electronic device, such as the process of a user making a voice request. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then passed to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device may include 1 or N cameras 193, N being a positive integer greater than 1. In this embodiment, the camera 193 may or may not be disposed in the electronic device in a hidden manner; this embodiment is not limited in this respect.
The digital signal processor is used for processing the digital signal. For example, human body features, five sense organs features, and the like are extracted from the digital video image to determine the user who made the voice request.
Video codecs are used to compress or decompress digital video. The electronic device may support one or more video codecs. In this way, the electronic device can play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The NPU is a neural-network (NN) computing processor. By borrowing the structure of biological neural networks, for example the transfer mode between neurons of a human brain, it rapidly processes input information and can also continuously learn by itself. The NPU can implement intelligent cognition applications of the electronic device, such as image recognition, face recognition, speech recognition, and text understanding.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes the various functional applications and data processing of the electronic device by executing the instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function (such as a sound playing function or an image playing function). The data storage area may store data created during use of the electronic device (such as audio data and a phone book). In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
The electronic device may implement audio functions via the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as conversation, voice collection, music playing, recording, etc.
The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also called a "horn", is used to convert the audio electrical signal into an acoustic signal. The electronic apparatus can listen to music through the speaker 170A or listen to a handsfree call.
The receiver 170B, also called "earpiece", is used to convert the electrical audio signal into an acoustic signal. When the electronic device answers a call or voice information, it can answer the voice by placing the receiver 170B close to the ear of the person.
The microphone 170C, also called a "mic", is used to convert sound signals into electrical signals. When making a voice request, the user can speak with the mouth close to the microphone 170C to input the voice request into it. The electronic device may be provided with at least one microphone 170C. In other embodiments, the electronic device may be provided with two microphones 170C, which can implement a noise reduction function in addition to collecting voice requests. In still other embodiments, the electronic device may be provided with three, four, or more microphones 170C to collect voice requests, reduce noise, identify sound sources, perform directional recording, and so on.
The headphone interface 170D is used to connect a wired headphone. The headset interface 170D may be the USB interface 130, or may be a 3.5mm open mobile electronic device platform (OMTP) standard interface, a cellular telecommunications industry association (cellular telecommunications industry association of the USA, CTIA) standard interface.
The pressure sensor 180A is used to sense a pressure signal and convert it into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. There are many types of pressure sensors 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates of electrically conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the electronic device determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display screen 194, the electronic device detects the intensity of the touch operation through the pressure sensor 180A. The electronic device may also calculate the position of the touch from the detection signal of the pressure sensor 180A.
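As a minimal sketch (not from the patent) of how a capacitive pressure sensor maps a capacitance change to touch strength: for parallel plates, C = εA/d, so pressing the plates closer together raises the capacitance, and the relative change can serve as a proxy for pressure. All numeric values below are illustrative assumptions.

```python
EPSILON = 8.854e-12  # vacuum permittivity in F/m (relative permittivity folded in for simplicity)

def capacitance(area_m2: float, gap_m: float) -> float:
    """Parallel-plate capacitance C = epsilon * A / d."""
    return EPSILON * area_m2 / gap_m

def touch_strength(c_rest: float, c_pressed: float) -> float:
    """Relative capacitance change, used here as a proxy for touch pressure."""
    return (c_pressed - c_rest) / c_rest

c0 = capacitance(1e-4, 1e-4)    # plates at rest
c1 = capacitance(1e-4, 0.8e-4)  # plates pushed 20% closer by a touch
print(touch_strength(c0, c1))   # ≈ 0.25: capacitance rose by about 25%
```

A real controller would calibrate this mapping per device; the point is only that pressure strength is derived from the capacitance change, as the paragraph above describes.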
The gyro sensor 180B may be used to determine the motion pose of the electronic device. The gyroscope sensor 180B may also be used for navigation, somatosensory gaming scenes.
The air pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device calculates altitude, aiding in positioning and navigation, from barometric pressure values measured by barometric pressure sensor 180C.
The magnetic sensor 180D includes a hall sensor. The electronic device may detect the opening and closing of the flip holster using the magnetic sensor 180D.
The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device in various directions (typically three axes), and can detect the magnitude and direction of gravity when the electronic device is at rest. It can also be used to recognize the posture of the electronic device, for applications such as landscape/portrait switching and pedometers.
A distance sensor 180F for measuring a distance. The electronic device may measure distance by infrared or laser.
The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared LED. The electronic device emits infrared light outward through the LED and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the electronic device can determine that there is an object nearby; when insufficient reflected light is detected, it can determine that there is none. Using the proximity light sensor 180G, the electronic device can detect that the user is holding it close to the ear for a call and automatically turn off the screen to save power. The proximity light sensor 180G may also be used in holster mode and pocket mode to automatically unlock and lock the screen.
The ambient light sensor 180L is used to sense the ambient light level. The electronic device may adaptively adjust the brightness of the display screen 194 based on the perceived ambient light level. The ambient light sensor 180L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to detect whether the electronic device is in a pocket to prevent accidental touches.
The fingerprint sensor 180H is used to collect a fingerprint. The electronic equipment can utilize the collected fingerprint characteristics to realize fingerprint unlocking, access to an application lock, fingerprint photographing, fingerprint incoming call answering and the like.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device implements a temperature processing strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, the electronic device heats the battery 142 to avoid an abnormal shutdown caused by low temperature. In still other embodiments, when the temperature is below a further threshold, the electronic device boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
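The threshold-based temperature strategy above can be sketched as follows. The threshold values and action names are illustrative assumptions, not values from the patent:

```python
# Hypothetical thresholds for the three-tier temperature processing strategy.
THROTTLE_ABOVE_C = 45.0   # above this: reduce nearby processor performance
HEAT_BELOW_C = 0.0        # below this: heat the battery
BOOST_BELOW_C = -10.0     # below this: also boost the battery output voltage

def temperature_policy(temp_c: float) -> list:
    """Return the protective actions to take for a reported temperature."""
    actions = []
    if temp_c > THROTTLE_ABOVE_C:
        actions.append("throttle_processor")    # thermal protection, lower power draw
    if temp_c < HEAT_BELOW_C:
        actions.append("heat_battery")          # avoid abnormal low-temperature shutdown
    if temp_c < BOOST_BELOW_C:
        actions.append("boost_battery_voltage") # same goal at even lower temperatures
    return actions

print(temperature_policy(50.0))   # ['throttle_processor']
print(temperature_policy(-15.0))  # ['heat_battery', 'boost_battery_voltage']
```

Note that the two low-temperature thresholds are cumulative here: at very low temperatures the device both heats the battery and boosts its output voltage.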
The touch sensor 180K is also referred to as a "touch panel". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation applied thereto or nearby. The touch sensor can communicate the detected touch operation to the application processor to determine the touch event type. Visual output associated with the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may be disposed on a surface of the electronic device at a different position than the display screen 194.
The bone conduction sensor 180M may acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire the vibration signal produced when the vocal part of a human body vibrates the bone mass. The bone conduction sensor 180M may also contact the human pulse to receive a blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may also be disposed in a headset to form a bone conduction headset. The audio module 170 may parse out a voice signal from the bone-vibration signal acquired by the bone conduction sensor 180M, implementing a voice function. The application processor may parse heart rate information from the blood pressure pulsation signal acquired by the bone conduction sensor 180M, implementing a heart rate detection function.
The keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys. The electronic device may receive a key input and generate a key signal input related to user settings and function control of the electronic device.
The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration cues, as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also respond to different vibration feedback effects for touch operations applied to different areas of the display screen 194. Different application scenes (such as time reminding, receiving information, alarm clock, game and the like) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.
Indicator 192 may be an indicator light that may be used to indicate a state of charge, a change in charge, or a message, missed call, notification, etc.
The SIM card interface 195 is used to connect a SIM card. The SIM card can be attached to or detached from the electronic device by inserting it into or pulling it out of the SIM card interface 195. The electronic device can support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 195 may support a Nano-SIM card, a Micro-SIM card, a SIM card, and the like. Multiple cards can be inserted into the same SIM card interface 195 at the same time, and the types of the cards may be the same or different. The SIM card interface 195 may also be compatible with different types of SIM cards and with external memory cards. The electronic device implements functions such as calls and data communication through the interaction between the SIM card and the network. In some embodiments, the electronic device uses an eSIM, i.e., an embedded SIM card. The eSIM card can be embedded in the electronic device and cannot be separated from it.
The methods in the following embodiments may be implemented in an electronic device having the above hardware structure. The service providing method provided by the embodiment of the present application is specifically described below.
It should be noted that, in the following embodiments of the present application, names of messages between electronic devices or names of parameters in messages, etc. are only examples, and other names may also be used in specific implementations, which are described in a unified manner herein and are not described in detail below.
The embodiment of the application provides a service providing method, which is applied to a service providing process based on voiceprint recognition. Referring to fig. 3, the service providing method includes the steps of:
S300, the first electronic device receives a first voice request sent by a first user.
The first electronic device has a voiceprint recognition function. The first voice request is used to request a first service. The first service may be a multimedia service, such as at least one of text, image, voice, video, or audio. The first service may also be a service provided by an air conditioner, such as cooling.
For example, referring to fig. 1, the television, the air conditioner, the speaker, and the like do not have a voiceprint recognition function, while the mobile phone does. The mobile phone is therefore the first electronic device. Within a certain area (indicated by the dashed oval in fig. 1), user 1 makes the first voice request, while the other users (user 2, user 3, and user 4) do not make voice requests. User 1 is the first user. Any electronic device within the area that has a voice collection function can receive the first voice request. The first electronic device with the voiceprint recognition function, here the mobile phone, receives the first voice request sent by user 1 through its microphone. Taking a first electronic device with a voiceprint recognition function as an example, the first electronic device then executes S301:
S301, the first electronic device recognizes the first voice request to obtain a first recognition result.
The first electronic device has multiple identification functions, which may be, for example, but not limited to: a voiceprint recognition function and a semantic recognition function.
When the first electronic device has the voiceprint recognition function, it performs voiceprint recognition on the first voice request to obtain a first voiceprint recognition result. The first voiceprint recognition result is used to indicate the first user as recognized by the first electronic device through the voiceprint recognition function. For the specific implementation of the voiceprint recognition function, refer to the prior art; details are not described here again.
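The patent defers the voiceprint recognition implementation to the prior art. One common prior-art approach, sketched below purely for illustration, compares a fixed-length voiceprint embedding of the request against enrolled user embeddings by cosine similarity; the embedding values, user names, and threshold are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def identify_speaker(request_emb, enrolled, threshold=0.7):
    """Return the enrolled user whose voiceprint best matches the request,
    or None if no match clears the acceptance threshold."""
    best_user, best_score = None, threshold
    for user, emb in enrolled.items():
        score = cosine_similarity(request_emb, emb)
        if score > best_score:
            best_user, best_score = user, score
    return best_user

enrolled = {"user1": [0.9, 0.1, 0.4], "user2": [0.1, 0.8, 0.2]}
print(identify_speaker([0.85, 0.15, 0.35], enrolled))  # 'user1'
```

In this sketch the identified user name plays the role of the first voiceprint recognition result.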
When the first electronic device has the semantic recognition function, it performs semantic recognition on the first voice request to obtain a first semantic recognition result, where the first semantic recognition result is used to request the first service. For example, if the first voice request is "turn on the speaker and play pop music", after the first electronic device performs semantic recognition on it, the first semantic recognition result includes information of the first service device and information of the first service: the information of the first service device is "speaker", and the information of the first service is "play pop music". For the specific implementation of the semantic recognition function, refer to the prior art; details are not described here again.
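A minimal illustration (not the patent's implementation) of splitting such a request into service-device information and service information might look as follows; the device vocabulary and the comma/"and" splitting rule are assumptions for the example phrase only.

```python
import re

# Hypothetical vocabulary of devices the recognizer knows about.
KNOWN_DEVICES = ("speaker", "television", "air conditioner")

def parse_request(text):
    """Extract the target service device (if named) and the requested service."""
    device = next((d for d in KNOWN_DEVICES if d in text.lower()), None)
    service = text
    if device:
        # Treat everything after the device mention as the service description.
        service = re.split(r",| and ", text, maxsplit=1)[-1].strip()
    return {"service_device": device, "service": service}

print(parse_request("turn on the speaker and play pop music"))
# {'service_device': 'speaker', 'service': 'play pop music'}
```

When no known device is named, `service_device` is None, which corresponds to the "second mode" below where the first semantic recognition result does not indicate the first service device.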
Here, the first recognition result includes the first voiceprint recognition result, or includes both the first voiceprint recognition result and the first semantic recognition result. After the first electronic device obtains the first recognition result, it performs S302. Specifically, if the first recognition result includes the first voiceprint recognition result but not the first semantic recognition result, the first electronic device performs S302, as shown in fig. 3. If the first recognition result includes both the first voiceprint recognition result and the first semantic recognition result, the first electronic device performs S302, or performs S303 and S304, as shown in fig. 4. The specific descriptions of S302 to S304 are as follows:
S302, the first electronic device sends the first recognition result to the electronic devices within a first preset range. Correspondingly, the electronic devices within the first preset range receive the first recognition result from the first electronic device. Sending the first recognition result to the electronic devices within the first preset range may be regarded as a broadcast that informs at least one electronic device within the first preset range of the first recognition result.
The first preset range may be an area range determined based on a position where the first electronic device is located and a preset length. For example, a circular area with a radius of a predetermined length and centered on the first electronic device is used as the first predetermined range. The number of the electronic devices within the first preset range may be one or more, and only two electronic devices, namely, the electronic device 1 and the electronic device 2, are shown in fig. 3. The electronic device within the first preset range may or may not have a voiceprint recognition function, which is not limited in the embodiment of the present application.
In the embodiments of the present application, an electronic device that provides the first service but does not have a voiceprint recognition function is referred to as a "service device". Here, since the first electronic device has not determined the service device, that is, has not determined which electronic device is to provide the first service, the first electronic device may send the first recognition result to all electronic devices within the first preset range, so that all of them can receive it. If the first service device is within the first preset range, the first service device providing the first service also receives the first recognition result. In this case, the electronic devices within the first preset range determine the first service device that provides the first service, and S305 is performed by the first service device based on the first recognition result, as shown in fig. 3. Fig. 3 takes the electronic device 1 as the first service device as an example. The process by which an electronic device within the first preset range determines the first service device may refer to the process by which the first electronic device determines it; that is, the electronic device within the first preset range performs the relevant processing of the first electronic device in S303.
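S302 can be sketched as a distance-filtered broadcast: the first electronic device delivers the first recognition result to every device inside a circular first preset range centered on itself. Device names, positions, and the result's field names are hypothetical.

```python
import math

def broadcast_recognition_result(center, devices, radius, result):
    """Deliver the first recognition result to each device whose position lies
    within the circular first preset range of the given radius."""
    return {name: result for name, pos in devices.items()
            if math.dist(center, pos) <= radius}

# Hypothetical device positions (meters) relative to the first electronic device.
devices = {"electronic_device_1": (1.0, 2.0),
           "electronic_device_2": (3.0, 1.0),
           "far_device": (50.0, 50.0)}
first_recognition_result = {"voiceprint": "user1", "semantic": "play pop music"}
received = broadcast_recognition_result((0.0, 0.0), devices, 10.0,
                                        first_recognition_result)
print(sorted(received))  # ['electronic_device_1', 'electronic_device_2']
```

Only devices inside the preset radius receive the result, matching the description that all electronic devices within the first preset range, including the eventual first service device, can receive it.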
S303, the first electronic device determines the first service device according to the first semantic recognition result.
The first service device is used for providing a first service. Here, there are various specific implementations of S303, which may be, for example and without limitation, as follows:
In the first mode, the first semantic recognition result further indicates the first service device. For example, still taking the first voice request "turn on the speaker and play pop music" as an example, after the first electronic device performs semantic recognition on it, the first semantic recognition result includes information of the first service device, that is, the device name "speaker". In this case, the first electronic device takes the "speaker" as the first service device.
In the second mode, when the first semantic recognition result does not indicate the first service device, the first electronic device exchanges information with the electronic devices within the first preset range and determines the first service device based on the exchanged information. The specific process is as follows:
Step 1, the first electronic device sends a service quality request to the electronic devices within the first preset range. Accordingly, the electronic devices within the first preset range receive the service quality request from the first electronic device.
Wherein the quality of service request is for requesting a quality of service of the first service.
Step 2, the electronic devices within the first preset range send service quality indication information to the first electronic device. Correspondingly, the first electronic device receives the service quality indication information from the electronic devices within the first preset range.
The service quality indication information is used to indicate an electronic device capable of providing the first service. Here, the service quality indication information sent by an electronic device is determined by that device according to its own service capability, or according to information exchanged with other electronic devices within the first preset range.
Illustratively, the service quality indication information sent by the electronic device 1 is used to indicate that the electronic device 1 is capable of providing the first service. The service quality indication information sent by the electronic device 2 is used for indicating that the electronic device 2 can provide the first service.
Step 3, the first electronic device determines the first service device according to the service quality indication information.
If the first electronic device receives one piece of service quality indication information, it takes the electronic device indicated by that information as the first service device. If the first electronic device receives multiple pieces of service quality indication information, it selects one of the multiple electronic devices capable of providing the first service as the first service device.
For example, according to a certain device selection rule, such as the order in which the service quality indication information is received, the first electronic device takes the electronic device indicated by the first received service quality indication information as the first service device.
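The arrival-order rule can be sketched as below: the first capable device whose service quality indication arrives is chosen. The message fields ("device_id", "can_provide_first_service") are assumptions for illustration, not field names from the patent.

```python
def select_by_arrival_order(indications):
    """indications is a list in reception order; return the first device that
    reports it can provide the first service, or None if no device can."""
    for ind in indications:
        if ind.get("can_provide_first_service"):
            return ind["device_id"]
    return None

# Indications listed in the order they were received.
received_indications = [
    {"device_id": "electronic_device_2", "can_provide_first_service": True},
    {"device_id": "electronic_device_1", "can_provide_first_service": True},
]
print(select_by_arrival_order(received_indications))  # 'electronic_device_2'
```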
For another example, the service quality indication information further indicates the separation distance between the electronic device capable of providing the first service and the first user. The first electronic device takes, as the first service device, an electronic device that is capable of providing the first service and whose separation distance from the first user satisfies a preset condition. Here, the preset condition may be, for example but not limited to, one of the following two items: the separation distance between the first service device and the first user is the smallest, or the separation distance between the first service device and the first user does not exceed a threshold value. In this case, the first electronic device stores a device information table including information such as the electronic device identifier and the sound intensity, as shown in Table 1. In a piece of service quality indication information, the electronic device identifier is the identifier of the electronic device indicated by that piece of service quality indication information, and the sound intensity is the sound intensity of the first voice request as determined by that electronic device.
TABLE 1

    Electronic device identifier    Sound intensity (unit: decibel)
    Electronic device 1             20
    Electronic device 2             35
    Electronic device 3             30
Referring to Table 1, for the electronic device 1: after the electronic device 1 collects the first voice request, it determines that the sound intensity of the first voice request is 20 decibels. Here, the sound intensity can indicate the separation distance between the corresponding electronic device and the first user: a smaller sound intensity indicates a larger separation distance between the two. The first electronic device may therefore compare the sound intensities corresponding to the different electronic devices to determine the distance between each electronic device and the first user.
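The selection rule illustrated by Table 1 can be sketched in code as follows. This is a minimal illustrative sketch, not part of the embodiment: the function name, the tuple layout, and the optional threshold parameter are assumptions introduced here for clarity.

```python
# Illustrative sketch of the Table 1 device-selection rule: among the devices
# that can provide the first service, pick the one with the highest sound
# intensity (i.e., the smallest separation distance from the first user).
# The record layout and the threshold option are assumptions for illustration.

def select_first_service_device(device_table, min_intensity_db=None):
    """device_table: list of (device_id, sound_intensity_db) tuples."""
    candidates = device_table
    if min_intensity_db is not None:
        # Alternative preset condition: keep only devices whose sound
        # intensity exceeds a threshold (distance does not exceed a bound).
        candidates = [d for d in device_table if d[1] > min_intensity_db]
    if not candidates:
        return None
    # Default preset condition: minimum separation distance, i.e., the
    # device that measured the loudest first voice request.
    return max(candidates, key=lambda d: d[1])[0]

table_1 = [("electronic device 1", 20),
           ("electronic device 2", 35),
           ("electronic device 3", 30)]
print(select_first_service_device(table_1))  # electronic device 2
```

With the Table 1 values, electronic device 2 (35 dB) is closest to the first user and is chosen as the first service device.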
In a third mode, the first electronic device determines the first service device in combination with an image recognition function. The specific implementation process includes:
Step one, the first electronic device obtains device indication information.
The device indication information is used for indicating the device that the first user is facing.
For example, the first electronic device itself has an image capture function and an image recognition function. After the first electronic device captures an image, it performs image recognition on the image related to the first user to obtain the device that the first user is facing.
For another example, the first electronic device acquires the device indication information from another electronic device that has the image capture and image recognition functions. After that other electronic device captures an image, it performs image recognition on the image related to the first user to obtain the device that the first user is facing.
And step two, the first electronic device takes the device indicated by the device indication information as the first service device. Here, the first user usually faces the first service device when making the first voice request. In this manner, the first electronic device determines, based on the image processing result, the first service device that provides the first service.
S304, the first electronic device sends the first recognition result to the first service device. Correspondingly, the first service device receives the first recognition result from the first electronic device.
The first recognition result comprises the first voiceprint recognition result, or the first voiceprint recognition result and a first semantic recognition result. In this case, after receiving the first recognition result, the first service device may provide the first service based on the first recognition result, without needing to determine on its own whether to provide the first service. That is, if the first service device receives the first voiceprint recognition result from the first electronic device and does not receive a voiceprint recognition result from any other electronic device, the first service device performs S305. If the first service device receives the first voiceprint recognition result from the first electronic device and also receives voiceprint recognition results from other electronic devices, the first service device first performs S309 and S310, and then performs S305. The relevant descriptions of S305, S309, and S310 are as follows:
S305, the first service device provides the first service for the first user.
The first user is a user who has passed voiceprint recognition, and the first service device provides the first service for the first user, such as playing music. Conversely, if a user has not passed voiceprint recognition, the first service device does not receive a first voiceprint recognition result indicating that user, and therefore does not provide the first service for that user.
The first service may be a service determined by the first service device based on its own service provision rule. For example, the first service device is a speaker, and the first service is playing audio. Here, the audio played by the speaker may be a segment of audio pre-stored locally, or may be the audio that the speaker played last time. The first service may also be a service associated with the preference information and/or history information of the first user.
In practical application, the specific implementation of S305 differs under different scenarios. Three possible scenarios are introduced below:
the first scene identification result comprises a first semantic identification result. In this case, the specific implementation process of S305 includes: and the first service equipment provides a first service for the first user according to the first semantic recognition result.
For example, still taking the first voice request "turn on the speaker and play pop music" as an example: after the first electronic device performs semantic recognition on the first voice request, the first semantic recognition result includes information of the first service, for example, the first service is "play pop music". In this case, the first service device plays pop music according to the first semantic recognition result.
In the second scenario, referring to fig. 5, the first service device itself has a semantic recognition function; that is, the first service device performs S306 and S307 and then performs S305:
S306, the first service device receives a second voice request sent by the first user.
The second voice request is used to request the first service. Here, the second voice request may be the same voice request as the first voice request, or a different voice request. For the first service, reference may be made to the related description of S300, which is not repeated here.
S307, the first service device performs semantic recognition on the second voice request to obtain a second semantic recognition result.
Wherein the second semantic identification result is used for requesting the first service.
For example, the second voice request is "turn on the speaker, play pop music". After the first service device performs semantic recognition on the second voice request, the second semantic recognition result includes the information of the first service, namely "play pop music". For the specific implementation of the semantic recognition function, reference may be made to the prior art, and details are not repeated here.
After the first service device performs S306 and S307, the specific implementation of S305 includes: the first service device provides the first service for the first user according to the second semantic recognition result. For example, in the case that the second voice request is "turn on the speaker, play pop music", the first service device plays pop music according to the second semantic recognition result.
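The step of mapping a semantic recognition result to a concrete service can be sketched as a small dispatcher. This is a hypothetical illustration only: the intent names, dictionary layout, and handler functions are assumptions, not anything defined by the embodiment.

```python
# Hypothetical sketch: dispatch a semantic recognition result (here modeled
# as a dict with an "intent" field) to a service action. Intent names and
# the result format are illustrative assumptions.

def provide_service(semantic_result):
    """semantic_result: e.g. {"intent": "play_music", "genre": "pop"}."""
    handlers = {
        "play_music": lambda r: f"playing {r.get('genre', 'default')} music",
        "turn_on_ac": lambda r: "air conditioner on",
    }
    handler = handlers.get(semantic_result.get("intent"))
    if handler is None:
        return "unsupported service"
    return handler(semantic_result)

print(provide_service({"intent": "play_music", "genre": "pop"}))
# playing pop music
```

In this sketch, either the first semantic recognition result (scenario one) or the second semantic recognition result (scenario two) could be passed to the same dispatcher.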
It should be noted that, in this embodiment of the application, the first electronic device may first perform S300 to S304 and the first service device may then perform S306 and S307; or the first service device may first perform S306 and S307 and the first electronic device may then perform S300 to S304; or the first electronic device may perform S300 to S304 while the first service device performs S306 and S307. The execution order is not limited in this embodiment.
In the third scenario, referring to fig. 6, the first service device performs information interaction with the cloud-side device to obtain user service information of the first user, so as to provide a customized service for the first user. That is, the first service device performs S308 and then performs S305:
S308, the first service device acquires the user service information of the first user from the cloud-side device.
The cloud-side device stores relevant information of the first user, for example, the first user's account, age, gender, preference information for different services, history information of providing different services for the first user, and the like.
The user service information is information related to the first service, and includes at least one of preference information or history information. The preference information indicates the usage habit of the first user for the first service; for example, if the first service is a music service, the preference information includes the type of music the first user prefers. The history information indicates at least one of an identifier of a service device that provided the first service for the first user, a service progress, or a service time; for example, if the first service is a movie service, the history information includes information such as the identifier of the service device that played a movie for the first user, the playing progress of the movie, and the playing time of the movie.
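The shape of the user service information described above can be sketched with simple data classes. All field and class names here are illustrative assumptions introduced for clarity; the embodiment does not prescribe any concrete data format.

```python
# Illustrative data shapes for the user service information (preference
# information and history information). Field names are assumptions.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HistoryInfo:
    service_device_id: Optional[str] = None   # device that last provided the service
    service_progress_s: Optional[int] = None  # e.g., movie playback position in seconds
    service_time: Optional[str] = None        # when the service was last provided


@dataclass
class UserServiceInfo:
    preference: dict = field(default_factory=dict)  # e.g., {"music_genre": "pop"}
    history: Optional[HistoryInfo] = None


# Hypothetical example: the first user's movie history on a car smart assistant.
info = UserServiceInfo(
    preference={"music_genre": "pop"},
    history=HistoryInfo(service_device_id="car smart assistant",
                        service_progress_s=2732,
                        service_time="yesterday evening"),
)
```

A service device could then match the first service against `info.preference` or resume from `info.history`, as in the examples that follow.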
After the first service device performs S308, a specific implementation procedure of S305 includes: the first service equipment provides a first service matched with the user service information for the first user.
For example, in the case that the first service is "play pop music", when playing pop music the first service device plays the pop music preferred by the first user, such as pop music from the 1980s that the first user prefers, so as to provide a customized service and improve the service experience of the user.
For another example, when the first service is "turn on the air conditioner", the first service device is an air conditioner, and when the air conditioner is in the operating state, the cooling temperature is the temperature the first user has historically set most often. If the first user often sets the cooling temperature to twenty-five degrees, the first service device also sets the cooling temperature to twenty-five degrees, so as to improve the service experience of the user.
For another example, in the case that the first service is "continue playing the movie from just now", the first service device continues playing the movie the first user was just watching, based on the service time and service progress of the movie in the history information. For instance, the first service device determines from the history information the name of the movie the first user just played on the car smart assistant, with a playing progress of 45 minutes and 32 seconds. The first service device is a television, and the television plays the movie from 45 minutes and 32 seconds, thereby realizing task migration and improving the service experience of the user.
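The resume step in the task-migration example above amounts to converting the stored playing progress into a playback position. A minimal sketch, with the dictionary layout and function name as assumptions:

```python
# Sketch of the task-migration resume step: convert the playing progress
# stored in the history information (45 minutes 32 seconds in the example)
# into a playback position in seconds. Field names are assumptions.

def resume_position_seconds(history):
    minutes, seconds = history["progress"]
    return minutes * 60 + seconds

history = {"device": "car smart assistant",
           "movie": "movie just watched",
           "progress": (45, 32)}
print(resume_position_seconds(history))  # 2732
```

The television would then start playback of the same movie at this position (2732 seconds).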
Referring to fig. 7, if the first service device receives the first recognition result from the first electronic device and also receives recognition results from other electronic devices, the first service device first performs S310 and then performs S305, as described below:
s309, the second electronic device sends the second identification result to the first service device. Accordingly, the first service device receives the second identification result from the second electronic device.
And the second identification result comprises a second voiceprint identification result, and the second voiceprint identification result is used for indicating a second user identified by the second electronic equipment through the voiceprint identification function.
Here, the process of the second electronic device obtaining the second recognition result may refer to the relevant descriptions in S300 and S301, that is, the second electronic device performs the relevant processing process, and details are not described here.
S310, the first service device determines the first user as the service object from among the first user and the second user. Here, the first service device can provide the first service to any user recognized through the voiceprint recognition function. In the case of multiple users, the first service device determines the service object based on a preset user selection rule, for example by selecting randomly or according to a rule such as the matching-degree rule described below.
For example, if the first recognition result further includes a first matching degree of the first voiceprint recognition result, and the second recognition result further includes a second matching degree of the second voiceprint recognition result, the specific implementation of S310 includes: the first service device determines, according to the first matching degree and the second matching degree, the first user as the service object from among the first user and the second user, where the first matching degree is greater than or equal to the second matching degree. Here, "the first recognition result further includes a first matching degree of the first voiceprint recognition result" specifically means: the matching degree between the first user's voiceprint template and the first voiceprint feature, where the first voiceprint feature is extracted from the first voice request. "The second recognition result further includes a second matching degree of the second voiceprint recognition result" specifically means: the matching degree between the second user's voiceprint template and the second voiceprint feature, where the second voiceprint feature is extracted from the voice request received by the second electronic device. In this case, the first service device stores a device information table including information such as the electronic device identifier, the voiceprint recognition result, and the matching degree, as shown in Table 2.
TABLE 2

    Electronic device identifier    Voiceprint recognition result    Matching degree
    First electronic device         First user                       85%
    Second electronic device        Second user                      70%
Referring to Table 2, the first voiceprint recognition result of the first electronic device indicates the first user, and the matching degree between the first user's voiceprint template and the first voiceprint feature is 85%. The second voiceprint recognition result of the second electronic device indicates the second user, and the matching degree between the second user's voiceprint template and the second voiceprint feature is 70%.
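The Table 2 rule, choosing the service object with the highest matching degree, can be sketched as follows. The record layout is an assumption for illustration, not part of the embodiment:

```python
# Illustrative sketch of the Table 2 rule: among the voiceprint recognition
# results received from different electronic devices, take as the service
# object the user whose result has the highest matching degree.
# The tuple layout (device_id, user_id, matching_degree) is an assumption.

def select_service_object(results):
    """results: list of (device_id, user_id, matching_degree) tuples."""
    if not results:
        return None
    return max(results, key=lambda r: r[2])[1]

table_2 = [("first electronic device", "first user", 0.85),
           ("second electronic device", "second user", 0.70)]
print(select_service_object(table_2))  # first user
```

With the Table 2 values, the first user (85% ≥ 70%) is determined as the service object.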
In addition, to further improve the accuracy of the determined service object, after the first service device determines the first user as the service object, the first service device may also verify whether the determined service object is correct according to an image recognition result. The image recognition result indicates the facial features, body features, and the like of the user who made the first voice request. Specifically, if the first service device has an image capture function and an image recognition function, after the first service device captures an image, it recognizes the captured image to determine the facial and body features of the user who made the first voice request, and then determines whether that user is the first user. In this way, the first service device verifies whether the user indicated by the voiceprint recognition result is correct, so that the determined service object is more accurate and the first service provided by the first service device matches the first user, which is beneficial to improving the service experience of the user.
In this way, in the case of multiple users, the first service device is also able to autonomously determine the service object, so as to provide the first service to the corresponding service object. In the case where the service object is the first user, the first service device performs S305. In the case where the service object is another user, the first service device provides the first service to that other user; for the specific process, refer to S305 with "the first user" replaced by "the other user", which is not repeated here.
In addition, even when no voice request is made, the electronic devices still perform information interaction with one another at a certain period (such as every minute or every two minutes) to determine whether other devices exist around them, and which devices those are. In this way, when a voiceprint recognition result is transmitted between electronic devices, the device serving as the sending end can accurately determine the size of the required transmission resource, avoiding waste of transmission resources.
According to the service providing method provided in this embodiment of the application, the first electronic device has a voiceprint recognition function, can recognize a user through that function, and provides the recognized user to the first service device. Therefore, even for a first service device without a voiceprint recognition function, after receiving the first voiceprint recognition result from the first electronic device, the first service device can still provide the first service for the user indicated by that result: no hardware for voiceprint recognition needs to be deployed on the first service device, which saves hardware cost, and no operations related to a voiceprint mapping model need to be executed on it. The first service device and the first electronic device use the first voiceprint recognition result as a pivot to achieve interconnection between device sides, realizing an intelligent inter-device service function based on the first voiceprint recognition result. Even without a voiceprint recognition function, the first service device can provide the first service for the first user, thereby increasing the number of service devices able to provide the first service and improving the service experience of the user.
The first service may be any of various types of services directly provided to the user. The corresponding service device should therefore not be understood as a background server, but as a foreground device providing a service the user can come into contact with, and the first service can be intuitively perceived by the user. For example, the first service includes at least one of text, image, voice, video, or audio services, and may also include an air-conditioning and refrigeration service, or various intelligent device services. The intelligent device includes but is not limited to various electromechanical devices, such as an electric cooker, a refrigerator, a vehicle, or a sweeping robot. Therefore, the service device providing the service in this embodiment may be regarded as a terminal-class device, rather than the background cloud-side device 20.
In the embodiment of the present application, the electronic device having the voiceprint recognition function may or may not have the capability of providing the first service. When the electronic device has the voiceprint recognition function and can provide the first service, the electronic device can recognize the sender of the voice request based on the voiceprint recognition function of the electronic device, and further provide the first service for the sender of the voice request.
The solution provided in the embodiments of the application is mainly introduced from the perspective of the electronic device. It can be understood that, to realize the above functions, the electronic device comprises corresponding hardware structures and/or software modules for performing each function. Those skilled in the art will readily appreciate that, in connection with the examples described for the embodiments disclosed herein, the present application can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as going beyond the scope of the present application.
In the embodiments of the present application, the electronic device may be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical function division; there may be other division manners in actual implementation.
In the case of dividing each functional module corresponding to each function, as shown in fig. 8, a service providing apparatus 80 provided in the embodiment of the present application is used for implementing the function of the service device in the above method. The service providing apparatus 80 may be an electronic device, an apparatus in an electronic device, or an apparatus that can be used in cooperation with an electronic device. The service providing device 80 may be a chip system. In the embodiment of the present application, the chip system may include at least one chip, and may also include a chip and other discrete devices. As shown in fig. 8, the service providing apparatus 80 includes a first receiving unit 801 and a processing unit 802. The first receiving unit 801 is configured to receive a first identification result from a first electronic device. The processing unit 802 is configured to provide a first service to a first user. Here, the service providing apparatus 80 is located in a service device, which does not have a voiceprint recognition function. The first recognition result comprises a first voiceprint recognition result, and the first voiceprint recognition result is used for indicating a first user recognized by the first electronic device through a voiceprint recognition function.
In one possible design, the first recognition result further includes a first semantic recognition result, the first semantic recognition result is obtained by performing semantic recognition on a first voice request sent by a first user by the first electronic device, and the first semantic recognition result is used for requesting a first service. The processing unit 802 is further configured to provide a first service to the first user according to the first semantic recognition result.
In a possible design, the service providing apparatus according to the embodiment of the present application further includes a second receiving unit 803, where the second receiving unit 803 is configured to receive a second voice request sent by the first user. The processing unit 802 is further configured to perform semantic recognition on the second voice request to obtain a second semantic recognition result, and provide the first service for the first user according to the second semantic recognition result. Wherein the second semantic identification result is used for requesting the first service.
In one possible design, the processing unit 802 is further configured to obtain user service information of the first user from the server, and provide the first user with the first service matching the user service information.
In one possible design, the first receiving unit 801 is further configured to receive a second recognition result from the second electronic device. And the second identification result comprises a second voiceprint identification result, and the second voiceprint identification result is used for indicating a second user identified by the second electronic equipment through the voiceprint identification function. The processing unit 802 is further configured to determine the first user as a service object from the first user and the second user.
In one possible design, the first recognition result further includes a first degree of matching of the first voiceprint recognition result, and the second recognition result further includes a second degree of matching of the second voiceprint recognition result. The processing unit 802 is specifically configured to determine, according to the first matching degree and the second matching degree, the first user as a service object from the first user and the second user. Wherein the first matching degree is not lower than the second matching degree.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. At least one of the above units may be implemented in software, hardware, or a combination of both. The software includes computer instructions. The hardware may include logic circuits, processors, analog circuits.
The processing unit 802 may be a processor or a controller or a software unit running thereon, among others. The processing unit 802 may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein. A processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, a DSP and a microprocessor, or the like. The first receiving unit 801 may be a transceiver, a transceiving circuit or a communication interface, etc. The storage unit of the service providing apparatus may be a memory. Alternatively, the second receiving unit 803 may be a microphone or a corresponding audio driver software unit.
When the processing unit 802 is a processor, the first receiving unit 801 may be a communication interface or a communication driver software unit, and the storage unit may be a memory or a storage driver software unit. When the first receiving unit 801 is a communication interface, the processing unit 802 is a processor, and the second receiving unit 803 is a microphone, the service providing apparatus according to the embodiment of the present application may be the electronic device shown in fig. 2.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor (processor) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. A service providing apparatus, comprising:
    a first receiving unit, configured to receive a first recognition result from a first electronic device; the service providing device is positioned in service equipment, and the service equipment does not have a voiceprint recognition function; the first recognition result comprises a first voiceprint recognition result, and the first voiceprint recognition result is used for indicating a first user recognized by the first electronic device through a voiceprint recognition function;
    and the processing unit is used for providing a first service for the first user.
  2. The service providing apparatus according to claim 1, wherein the first recognition result further comprises a first semantic recognition result, the first semantic recognition result is obtained by the first electronic device performing semantic recognition on a first voice request issued by the first user, and the first semantic recognition result is used for requesting the first service;
    the processing unit is further configured to provide the first service for the first user according to the first semantic recognition result.
  3. The service providing apparatus according to claim 1, wherein the apparatus further comprises:
    a second receiving unit, configured to receive a second voice request sent by the first user;
    the processing unit is further configured to perform semantic recognition on the second voice request to obtain a second semantic recognition result, and provide the first service for the first user according to the second semantic recognition result; the second semantic recognition result is used for requesting the first service.
  4. The service providing apparatus according to any one of claims 1 to 3,
    the processing unit is further configured to acquire user service information of the first user from a server, and provide the first service matched with the user service information for the first user.
  5. The service providing apparatus according to claim 4, wherein the user service information includes at least one of preference information or history information;
    the preference information is used for indicating the use habit of the first user for the first service;
    the history information is used for indicating at least one of an identifier of a service device that provides the first service for the first user, a service progress, or a service time.
  6. The service providing apparatus according to any one of claims 1 to 5,
    the first receiving unit is further configured to receive a second recognition result from a second electronic device; the second recognition result comprises a second voiceprint recognition result, and the second voiceprint recognition result is used for indicating a second user recognized by the second electronic device through a voiceprint recognition function;
    the processing unit is further configured to determine the first user as a service object from the first user and the second user.
  7. The service providing apparatus according to claim 6, wherein the first recognition result further includes a first matching degree of the first voiceprint recognition result; the second recognition result further comprises a second matching degree of the second voiceprint recognition result;
    the processing unit is specifically configured to determine, according to the first matching degree and the second matching degree, the first user as a service object from the first user and the second user, where the first matching degree is not lower than the second matching degree.
  8. The service providing apparatus according to any one of claims 1 to 7, wherein the first service includes at least one of text, image, voice, video, or audio.
  9. A service providing method, comprising:
    the service equipment receives a first recognition result from a first electronic device; wherein the service equipment does not have a voiceprint recognition function; the first recognition result comprises a first voiceprint recognition result, and the first voiceprint recognition result is used for indicating a first user recognized by the first electronic device through a voiceprint recognition function;
    the service equipment provides a first service for the first user.
  10. The service providing method according to claim 9, wherein the first recognition result further comprises a first semantic recognition result, the first semantic recognition result is obtained by the first electronic device performing semantic recognition on a first voice request issued by the first user, and the first semantic recognition result is used for requesting the first service;
    the providing, by the service equipment, of a first service for the first user comprises:
    providing, by the service equipment, the first service for the first user according to the first semantic recognition result.
  11. The service providing method according to claim 9, wherein the method further comprises:
    the service equipment receives a second voice request sent by the first user;
    the service equipment carries out semantic recognition on the second voice request to obtain a second semantic recognition result; the second semantic recognition result is used for requesting the first service;
    the providing, by the service equipment, of a first service for the first user comprises:
    providing, by the service equipment, the first service for the first user according to the second semantic recognition result.
  12. The service providing method according to any one of claims 9 to 11, wherein the method further comprises:
    the service equipment acquires user service information of the first user from a server;
    the providing, by the service equipment, of a first service for the first user comprises:
    providing, by the service equipment, for the first user, the first service matched with the user service information.
  13. The service providing method according to claim 12, wherein the user service information includes at least one of preference information or history information;
    the preference information is used for indicating the use habit of the first user for the first service;
    the history information is used for indicating at least one of an identifier of a service device that provides the first service for the first user, a service progress, or a service time.
  14. The service providing method according to any one of claims 9 to 13, wherein the method further comprises:
    the service equipment receives a second recognition result from a second electronic device; the second recognition result comprises a second voiceprint recognition result, and the second voiceprint recognition result is used for indicating a second user recognized by the second electronic device through a voiceprint recognition function;
    the service equipment determines the first user as a service object from the first user and the second user.
  15. The service providing method according to claim 14, wherein the first recognition result further includes a first matching degree of the first voiceprint recognition result; the second recognition result further comprises a second matching degree of the second voiceprint recognition result;
    the determining, by the service equipment, of the first user as a service object from the first user and the second user comprises:
    determining, by the service equipment, the first user as the service object from the first user and the second user according to the first matching degree and the second matching degree, wherein the first matching degree is not lower than the second matching degree.
  16. The service providing method according to any one of claims 9 to 15, wherein the first service includes at least one of text, image, voice, video, or audio.
  17. A service providing apparatus, comprising: a processor for calling a program in a memory to cause the service providing apparatus to execute the service providing method of any one of claims 9 to 16.
  18. A service providing apparatus, comprising: a processor and interface circuitry, wherein the interface circuitry is configured to communicate with other devices, and the processor is configured to perform the service providing method of any one of claims 9 to 16.
  19. A computer-readable storage medium, wherein the computer-readable storage medium stores a program that, when invoked by a processor, performs the service providing method of any one of claims 9 to 16.
  20. A computer program, wherein, when the computer program is invoked by a processor, the service providing method of any one of claims 9 to 16 is performed.
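The recognition-result structure and the service-object selection described in the claims (a voiceprint recognition result indicating a user, an optional semantic recognition result requesting a service, and a matching degree used to pick the user whose match is not lower than any other) can be sketched as follows. This is an illustrative sketch only, not the patented implementation; all names (`RecognitionResult`, `choose_service_target`, the field names) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionResult:
    """Recognition result reported by an electronic device (illustrative).

    Mirrors the claimed contents: a voiceprint recognition result that
    indicates a user, a matching degree for that voiceprint result, and an
    optional semantic recognition result describing the requested service.
    """
    user_id: str                            # user indicated by the voiceprint recognition result
    matching_degree: float = 0.0            # confidence of the voiceprint match
    semantic_result: Optional[str] = None   # e.g. the requested first service


def choose_service_target(results: list[RecognitionResult]) -> RecognitionResult:
    """Pick the service object: the result whose matching degree is not
    lower than any other (as in claims 7 and 15). On a tie, the first
    received result wins, since max() keeps the earliest maximum."""
    return max(results, key=lambda r: r.matching_degree)


# Example: the service device receives results from two electronic devices.
first = RecognitionResult(user_id="user-A", matching_degree=0.92,
                          semantic_result="play music")
second = RecognitionResult(user_id="user-B", matching_degree=0.71)
target = choose_service_target([first, second])
# target is the first result, so user-A becomes the service object
```

The service device itself never runs voiceprint recognition here; it only compares the matching degrees reported by the electronic devices, which is the division of labor the claims describe.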
CN202080003002.XA 2020-03-27 2020-03-27 Service providing method and device Pending CN114041102A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/081662 WO2021189418A1 (en) 2020-03-27 2020-03-27 Service providing method and apparatus

Publications (1)

Publication Number Publication Date
CN114041102A 2022-02-11

Family

ID=77889864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080003002.XA Pending CN114041102A (en) 2020-03-27 2020-03-27 Service providing method and device

Country Status (2)

Country Link
CN (1) CN114041102A (en)
WO (1) WO2021189418A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130317827A1 (en) * 2012-05-23 2013-11-28 Tsung-Chun Fu Voice control method and computer-implemented system for data management and protection
CN102930868A (en) * 2012-10-24 2013-02-13 北京车音网科技有限公司 Identity recognition method and device
CN106293064A (en) * 2016-07-25 2017-01-04 乐视控股(北京)有限公司 A kind of information processing method and equipment
CN107391983B (en) * 2017-03-31 2020-10-16 创新先进技术有限公司 Information processing method and device based on Internet of things

Also Published As

Publication number Publication date
WO2021189418A1 (en) 2021-09-30

Similar Documents

Publication Publication Date Title
CN112289313A (en) Voice control method, electronic equipment and system
CN111742361B (en) Method for updating wake-up voice of voice assistant by terminal and terminal
CN112312366B (en) Method, electronic equipment and system for realizing functions through NFC (near field communication) tag
CN111835907A (en) Method, equipment and system for switching service across electronic equipment
CN113039822A (en) Method and equipment for establishing data channel
CN113676339B (en) Multicast method, device, terminal equipment and computer readable storage medium
CN112651510A (en) Model updating method, working node and model updating system
CN114610193A (en) Content sharing method, electronic device, and storage medium
CN113225661A (en) Loudspeaker identification method and device and electronic equipment
WO2022022319A1 (en) Image processing method, electronic device, image processing system and chip system
CN113438364B (en) Vibration adjustment method, electronic device, and storage medium
CN113645622A (en) Device authentication method, electronic device, and storage medium
CN115514844A (en) Volume adjusting method, electronic equipment and system
CN113467735A (en) Image adjusting method, electronic device and storage medium
CN109285563B (en) Voice data processing method and device in online translation process
CN115119336B (en) Earphone connection system, earphone connection method, earphone, electronic device and readable storage medium
WO2022206825A1 (en) Method and system for adjusting volume, and electronic device
CN113467747B (en) Volume adjusting method, electronic device and storage medium
CN114120987B (en) Voice wake-up method, electronic equipment and chip system
CN113572798B (en) Device control method, system, device, and storage medium
CN114698078A (en) Transmission power adjustment method, electronic device, and storage medium
US11977946B2 (en) Method for automatically activating NFC application and terminal
CN115525366A (en) Screen projection method and related device
CN115393676A (en) Gesture control optimization method and device, terminal and storage medium
CN114822525A (en) Voice control method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination