CN113380251A - Mobile voice interaction method and device based on intelligent earphone - Google Patents

Mobile voice interaction method and device based on intelligent earphone

Info

Publication number: CN113380251A
Authority: CN (China)
Prior art keywords: voice, local end, local, result, feedback
Prior art date: 2021-06-22
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202110694527.9A
Other languages: Chinese (zh)
Inventors: 张光强 (Zhang Guangqiang), 王珂 (Wang Ke)
Current assignee: Unisyou Technology Shenzhen Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Unisyou Technology Shenzhen Co., Ltd.
Priority date: 2021-06-22 (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Filing date: 2021-06-22
Publication date: 2021-09-10
Application filed by Unisyou Technology Shenzhen Co., Ltd.
Priority to CN202110694527.9A
Publication of CN113380251A

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 - Execution procedure of a spoken command
    • G10L2015/225 - Feedback of the input speech
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04B - TRANSMISSION
    • H04B5/00 - Near-field transmission systems, e.g. inductive or capacitive transmission systems
    • H04B5/70 - Near-field transmission systems, e.g. inductive or capacitive transmission systems specially adapted for specific purposes
    • H04B5/72 - Near-field transmission systems, e.g. inductive or capacitive transmission systems specially adapted for specific purposes for local intradevice communication
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/02 - Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 - Protocols based on web technology, e.g. hypertext transfer protocol [HTTP], for remote control or remote monitoring of applications
    • H04R - LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 - Details of transducers, loudspeakers or microphones
    • H04R1/10 - Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1025 - Accumulators or arrangements for charging
    • H04R3/00 - Circuits for transducers, loudspeakers or microphones

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

The invention provides a mobile voice interaction method and device based on an intelligent earphone, in the technical field of voice interaction. The method comprises: collecting a voice instruction; transmitting the voice instruction to a local end and preprocessing it through the local end to obtain a matched processing signal; judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal; if it is a direct feedback response signal, performing secondary voice feedback through the local end; and if it is an indirect feedback response signal, packing the voice instruction and transmitting it to a cloud server, obtaining an application feedback result through the cloud server, transmitting the application feedback result to the local end, performing voice conversion through the local end, and outputting a playing result. Because acquisition and preprocessing of the voice instruction take place at the local end, no external mobile phone or tablet is needed, and the earphone and charging box can be used independently.

Description

Mobile voice interaction method and device based on intelligent earphone
Technical Field
The invention relates to the technical field of voice interaction, in particular to a mobile voice interaction method and device based on an intelligent earphone.
Background
Current intelligent earphones are all accessory products of mobile phones: their voice interaction functions become available only after a phone is connected.
Moreover, a TWS earphone user today must carry both the charging box and the earbuds. In actual use, the earbuds are taken out and paired with the mobile phone, after which they serve only as the phone's external audio input and output device. This design makes the earphone an accessory of the phone: to use the headset, the user must also carry the phone and maintain a Bluetooth connection. No intelligent earphone in the prior art can be used independently or centralize these functions in itself; the result is a poor user experience, strong usage limitations, and an obstacle to intelligent development.
Disclosure of Invention
The invention aims to provide a mobile voice interaction method and device based on an intelligent earphone whose advantage is independence: the earphone can be used on its own, without a mobile phone or tablet, which improves the convenience of voice interaction.
Embodiments of the invention realize this as follows:
one embodiment of the present invention provides a mobile voice interaction method based on an intelligent headset, including:
collecting a voice instruction;
transmitting the voice instruction to a local end, and preprocessing the voice instruction through the local end to obtain a matched processing signal;
judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal;
if the processing signal is a direct feedback response signal, performing secondary voice feedback through the local end;
and if the processing signal is an indirect feedback response signal, packing the voice instruction and transmitting it to a cloud server, obtaining an application feedback result through the cloud server, transmitting the application feedback result to the local end, performing voice conversion through the local end, and outputting a playing result.
When the method is used for voice interaction, a voice instruction is collected first, transmitted to the local end, and preprocessed through the local end to obtain a matched processing signal; the method then judges whether the processing signal is a direct feedback response signal or an indirect feedback response signal. If it is a direct feedback response signal, secondary voice feedback is performed through the local end. If it is an indirect feedback response signal, the voice instruction is packed and transmitted to a cloud server, an application feedback result is obtained through the cloud server and transmitted back to the local end, and the local end performs voice conversion and outputs a playing result. Because acquisition and preprocessing of the voice instruction take place at the local end, no external mobile phone or tablet is needed, and the earphone and charging box can be used independently.
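For concreteness, the judging step can be read as a two-way branch on the preprocessed signal. The following is a minimal sketch only: the signal type, handler names, and sample payloads are illustrative assumptions, not an interface defined by this application.

```python
# Minimal sketch of the direct/indirect branch; every name is hypothetical.
from typing import NamedTuple

class ProcessingSignal(NamedTuple):
    direct: bool   # True models the "direct feedback response signal"
    payload: str   # matched local answer, or the raw command for the cloud

def local_feedback(answer: str) -> None:
    print("local end plays:", answer)       # secondary voice feedback

def cloud_round_trip(command: str) -> str:
    # Stub for: pack command -> cloud server -> application feedback result.
    return f"cloud result for {command!r}"

def interact(signal: ProcessingSignal) -> None:
    if signal.direct:
        local_feedback(signal.payload)      # answered entirely at the local end
    else:
        local_feedback(cloud_round_trip(signal.payload))

interact(ProcessingSignal(True, "It is 10:32."))
interact(ProcessingSignal(False, "what is the weather today"))
```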
In some embodiments of the present invention, the step of transmitting the voice command to the local end includes:
and respectively packaging the voice commands according to the time sequence and sending the voice commands to the local end through a Bluetooth protocol.
In some embodiments of the present invention, the step of preprocessing the voice instruction at the local end to obtain the matched processing signal includes the following (a sketch follows the list):
establishing a preset semantic instruction library and a processing signal library matched with it;
performing local analysis on the voice instruction to obtain a local semantic result, and judging whether the local semantic result matches a semantic resource in the preset semantic instruction library;
if a matching semantic resource exists, outputting the corresponding processing signal;
and if no matching semantic resource exists, outputting the indirect feedback response signal.
In some embodiments of the present invention, the step of packing the voice instruction, transmitting it to a cloud server, and obtaining an application feedback result through the cloud server includes:
transcoding the data packet at the local end, forwarding the transcoded data packet to the cloud server through a wireless protocol, unpacking the data packet at the cloud server, performing cloud service processing to obtain a processing result, and converting the processing result into an application feedback result.
In some embodiments of the present invention, the step of transmitting the application feedback result to the local end, performing voice conversion through the local end, and outputting the playing result includes:
packing the application feedback results separately according to their time sequence to obtain feedback voice result data packets, forwarding the data packets to the local end through a wireless protocol, unpacking them at the local end, transcoding the voice data, and finally performing voice conversion and playback.
One embodiment of the present invention provides a mobile voice interaction device based on an intelligent earphone, including:
a voice acquisition module, configured to collect voice instructions;
a local module, configured to receive the voice instruction and preprocess it to obtain a matched processing signal;
a matching module, configured to match the voice instruction and the processing signal with each other and judge whether the processing signal is a direct feedback response signal or an indirect feedback response signal; if it is a direct feedback response signal, secondary voice feedback is performed through the local module; if it is an indirect feedback response signal, the voice instruction is packed and transmitted to a cloud server, an application feedback result is obtained through the cloud server and transmitted to the local module, and the local module performs voice conversion and outputs a playing result.
When the device performs voice interaction, the voice acquisition module collects a voice instruction, and the local module receives and preprocesses it to obtain a matched processing signal; the matching module matches the voice instruction with the processing signal and judges whether the processing signal is a direct feedback response signal or an indirect feedback response signal. If direct, secondary voice feedback is performed through the local module; if indirect, the voice instruction is packed and transmitted to a cloud server, an application feedback result is obtained through the cloud server and transmitted to the local module, and the local module performs voice conversion and outputs a playing result.
In some embodiments of the present invention, the voice acquisition module includes a voice collecting unit, a sensor unit, a power supply unit and a power amplifier unit, with the power supply unit electrically connected to the charging bin body unit.
In some embodiments of the present invention, the local module includes:
the charging bin body unit is used for accommodating the voice acquisition module and charging the voice acquisition module;
the charging bin Bluetooth voice receiving unit is used for receiving the voice command;
the charging bin communication unit is used for transmitting information with the cloud server;
and the processor unit is used for preprocessing the voice command to obtain a matched processing signal.
An embodiment of the present invention provides an electronic device, including:
a memory for storing one or more programs;
a processor;
when executed by the processor, the one or more programs implement the method described above.
One embodiment of the present invention provides a computer-readable storage medium storing computer instructions; when a computer reads the instructions from the storage medium, it runs the mobile voice interaction method based on an intelligent earphone described above.
Compared with the prior art, the embodiments of the invention have at least the following advantages or beneficial effects:
1) The earphone charging bin (the local end) has application processing capability and network connection capability. Once taken out of the charging box, the earphone connects to the box through Bluetooth, and the user can issue voice commands through the earphone to direct the charging bin to perform the corresponding application operations; because the charging bin can also access the network, the earphone and charging bin can move and operate independently.
2) The device lets users enjoy internet voice interaction functions with the earphone alone, without using a mobile phone, completing a series of functions such as hailing a ride, ordering takeout, voice communication, and news briefings.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the invention and therefore should not be regarded as limiting its scope; for those skilled in the art, other related drawings can be derived from these drawings without inventive effort.
Fig. 1 is a schematic block diagram of a mobile voice interaction device based on smart headset according to some embodiments of the present invention;
fig. 2 is a schematic flow chart of a mobile voice interaction method based on smart headsets according to some embodiments of the present invention;
FIG. 3 is a schematic block diagram of an electronic device provided in some embodiments of the invention;
in the figure, 100-a mobile voice interaction device based on intelligent earphones; 110-a voice acquisition module; 120-local module; 130-a matching module; 600-an electronic device; 610-a memory; 620-a processor; 630 — a communication interface.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "device," "apparatus," "unit" and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions or assemblies at different levels. Other words may be substituted if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In general, the terms "comprise" and "include" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Additionally, the order in which the elements and sequences are processed, the use of alphanumeric characters, or the use of other designations in this specification is not intended to limit the order of the processes and methods in this specification, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the apparatus components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described apparatus on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Finally, it should be understood that the examples in this specification are only intended to illustrate the principles of the examples in this specification. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.
Examples
Fig. 1 is a schematic block diagram of a mobile voice interaction device 100 based on smart headset according to some embodiments of the present invention.
As shown in fig. 1, in some embodiments the smart headset-based mobile voice interaction device 100 may include a voice acquisition module 110, a local module 120, and a matching module 130.
a voice acquisition module 110, configured to collect voice instructions;
a local module 120, configured to receive the voice instruction and preprocess it to obtain a matched processing signal;
a matching module 130, configured to match the voice instruction and the processing signal with each other and judge whether the processing signal is a direct feedback response signal or an indirect feedback response signal; if it is a direct feedback response signal, secondary voice feedback is performed through the local module 120; if it is an indirect feedback response signal, the voice instruction is packed and transmitted to the cloud server, an application feedback result is obtained through the cloud server and transmitted to the local module 120, and the local module 120 performs voice conversion and outputs a playing result.
It should be noted that the above description of the smart headset-based mobile voice interaction device 100 and its modules is only for convenience of description and does not limit this specification to the scope of the illustrated embodiments. Those skilled in the art, having understood the principle of the device, may combine the modules arbitrarily or connect sub-devices to other modules without departing from that principle. In some embodiments, the voice acquisition module 110, the local module 120, and the matching module 130 disclosed in fig. 1 may be different modules in one device, or one module may implement the functions of two or more of them. For example, the modules may share one memory module, or each module may have its own memory module. Such variations are all within the scope of the present disclosure.
Fig. 2 is a flowchart illustrating a mobile voice interaction method based on smart headsets according to some embodiments of the present invention.
In some embodiments, the smart headset-based mobile voice interaction method may be performed by the smart headset-based mobile voice interaction device 100. For example, the method may be stored in a storage device in the form of a program or instructions which, when executed by the device 100, implement the method. The operational description presented below is illustrative: in some embodiments the process may be accomplished with one or more additional operations not described and/or without one or more of the operations discussed. Additionally, the order of the operations illustrated in fig. 2 and described below is not intended to be limiting.
S210, collecting a voice instruction;
In some embodiments, voice collection is based on the earphone, which serves as the voice acquisition module 110. The earphone may include an earphone charging circuit to maintain its battery level and prolong its service life; an earphone sensor circuit that monitors whether the earphone is in a playing or charging state, so that its functions can be adjusted; an earphone power supply circuit that cooperates with the charging circuit to carry out charging; a digital processor and an audio processor to collect speech; a Bluetooth wireless interface for a signal connection with the charging bin; and a power amplifier module for playback and the final voice interaction.
S220, transmitting the voice instruction to the local end, and preprocessing the voice instruction through the local end to obtain a matched processing signal;
In the step of transmitting the voice instruction to the local end, the voice instructions may be packed separately according to their time sequence and sent to the local end through a Bluetooth protocol.
"Time sequence" in this embodiment simply means time order. Earlier operations and states influence later ones, while later operations and states do not affect earlier results; packing the voice instructions separately in time order therefore keeps the packed data consistent and prevents out-of-order data from corrupting the cloud processing result. For example, a user's instructions are executed in time order: weather broadcasting is executed first, a state callback confirms that it has finished, and only then is the requested audio played. In other words, when streaming broadcasts conflict in time, the next audio is broadcast only after the previous broadcast completes and its state callback returns; a sketch of this rule follows.
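A minimal sketch of this ordering rule; the queue class, the callback name, and the synchronous print-based playback are assumptions for illustration, not structures named by the application.

```python
# Sketch: the next audio starts only after the previous broadcast finishes
# and its state callback fires.
from collections import deque

class BroadcastQueue:
    def __init__(self):
        self.pending = deque()
        self.busy = False

    def submit(self, audio):
        self.pending.append(audio)
        if not self.busy:              # idle: start broadcasting immediately
            self._play_next()

    def _play_next(self):
        if not self.pending:
            self.busy = False
            return
        self.busy = True
        print("broadcasting:", self.pending.popleft())
        self._on_state_callback()      # fired when the broadcast completes

    def _on_state_callback(self):
        self._play_next()              # only now may the next audio start

q = BroadcastQueue()
q.submit("Today's weather: sunny, 32 degrees.")  # earlier instruction first
q.submit("Now playing your requested audio.")    # waits for the callback
```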
In some embodiments, the step of preprocessing the voice instruction through the local end to obtain the matched processing signal includes:
establishing a preset semantic instruction library and a processing signal library matched with it;
performing local analysis on the voice instruction to obtain a local semantic result, and judging whether the local semantic result matches a semantic resource in the preset semantic instruction library;
if a matching semantic resource exists, outputting the corresponding processing signal;
and if no matching semantic resource exists, outputting an indirect feedback response signal.
As an example of the preset semantic instruction library and its matched processing signal library: the user issues the instruction "what time is it now"; the local end preprocesses and locally parses it into the local semantic result "time query"; "time query" is judged to match the real-time clock in the local end, and the corresponding processing signal is then emitted, for example playing the current time.
If, on the other hand, there is no matching semantic resource, the instruction is deferred to the cloud. For example, the user takes the headset (voice acquisition module 110) out of the charging bin (local module 120); the headset automatically connects to the charging bin through Bluetooth; the user wakes the voice assistant with the headset and issues the voice instruction: "what is the weather today". The headset receives the instruction and conveys it to the charging bin communication module, which sends the instruction to the cloud voice processing platform. The cloud platform processes the voice and returns the result to the charging bin communication module, and the charging bin transmits the result to the headset over Bluetooth. At the headset the user hears: "Hello, today's Shenzhen weather is 32 degrees."
In some embodiments, the step of packing the voice instruction, transmitting it to the cloud server, and obtaining the application feedback result through the cloud server includes:
transcoding the data packet at the local end, forwarding the transcoded data packet to the cloud server through a wireless protocol, unpacking the data packet at the cloud server, performing cloud service processing to obtain a processing result, and converting the processing result into an application feedback result.
This mode of operation means the earphone and the earphone bin need no complex processing capability, greatly reducing the computation and power consumption that edge computing would require. Through wireless communication, the earphone bin extends the earphone to the rich internet; via the cloud service server the earphone then gains rich internet application capabilities, such as diverse content ecosystems and internet artificial intelligence. User data is kept entirely in the cloud, which eases data migration, terminal switching, and similar operations.
S230, judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal;
S231, if the processing signal is a direct feedback response signal, performing secondary voice feedback through the local end;
For example, the user issues the instruction "what time is it now"; the local end preprocesses and locally parses it into the local semantic result "time query"; "time query" is judged to match the real-time clock in the local end, and the corresponding processing signal is emitted, for example playing the current time.
S232, if the processing signal is an indirect feedback response signal, packing the voice instruction, transmitting it to the cloud server, obtaining an application feedback result through the cloud server, transmitting the result to the local end, performing voice conversion through the local end, and outputting a playing result.
For example, the user takes the headset (voice acquisition module 110) out of the charging bin (local module 120); the headset automatically connects to the charging bin through Bluetooth; the user wakes the voice assistant with the headset and issues the voice instruction: "what is the weather today". The headset receives the instruction and conveys it to the charging bin communication module, which sends the instruction to the cloud voice processing platform. The cloud platform processes the voice and returns the result to the charging bin communication module, and the charging bin transmits the result to the headset over Bluetooth. At the headset the user hears: "Hello, today's Shenzhen weather is 32 degrees."
In some embodiments, the local module 120 includes a charging bin body unit, a charging bin Bluetooth voice receiving unit, a charging bin communication unit, and a processor unit. The charging bin body unit accommodates the voice acquisition module 110 and charges it. The charging bin Bluetooth voice receiving unit receives voice instructions. The charging bin communication unit exchanges information with the cloud server. The processor unit preprocesses the voice instruction to obtain a matched processing signal.
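A minimal sketch of this composition, with each unit reduced to an illustrative dataclass; every class, field, and the endpoint URL below are assumptions, not structures specified by the application.

```python
# Hypothetical composition of the local module (charging bin) from the
# four units named above.
from dataclasses import dataclass, field

@dataclass
class ChargingBinBody:            # accommodates and charges the earphone
    earphone_docked: bool = True

@dataclass
class BluetoothVoiceReceiver:     # receives voice instructions
    paired_earphone: str = "earphone-0"

@dataclass
class CommunicationUnit:          # exchanges information with the cloud
    server_url: str = "https://cloud.example/api"   # placeholder endpoint

@dataclass
class ProcessorUnit:              # preprocesses instructions into signals
    semantic_library_loaded: bool = True

@dataclass
class LocalModule:
    body: ChargingBinBody = field(default_factory=ChargingBinBody)
    receiver: BluetoothVoiceReceiver = field(default_factory=BluetoothVoiceReceiver)
    comms: CommunicationUnit = field(default_factory=CommunicationUnit)
    processor: ProcessorUnit = field(default_factory=ProcessorUnit)

print(LocalModule())
```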
Fig. 3 is a schematic block diagram of an electronic device 600 according to some embodiments of the invention.
The electronic device 600 includes a memory 610, a processor 620, and a communication interface 630; these components are electrically connected to one another, directly or indirectly, to enable the transfer and interaction of data. For example, they may be electrically connected through one or more communication buses or signal lines. The memory 610 may store software programs and modules, such as the program instructions/modules corresponding to the smart headset-based mobile voice interaction method provided in the embodiments of the present application; the processor 620 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 610. The communication interface 630 may be used to communicate signaling or data with other node devices.
The memory 610 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 620 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
It will be appreciated that the configuration shown in FIG. 3 is merely illustrative and that electronic device 600 may include more or fewer components than shown in FIG. 3 or have a different configuration than shown in FIG. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In summary, the mobile voice interaction method and device based on an intelligent earphone provided by the embodiments of the application comprise: collecting a voice instruction; transmitting it to the local end and preprocessing it there to obtain a matched processing signal; judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal; if direct, performing secondary voice feedback through the local end; if indirect, packing the voice instruction and transmitting it to a cloud server, obtaining an application feedback result through the cloud server, transmitting the result to the local end, performing voice conversion through the local end, and outputting a playing result. Because acquisition and preprocessing of the voice instruction take place at the local end, no external mobile phone or tablet is needed, and the earphone and charging box can be used independently.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A mobile voice interaction method based on an intelligent earphone is characterized by comprising the following steps:
collecting a voice instruction;
transmitting the voice instruction to a local end, and preprocessing the voice instruction through the local end to obtain a matched processing signal;
judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal;
if the processing signal is a direct feedback response signal, performing secondary voice feedback through the local end;
and if the processing signal is an indirect feedback response signal, packing the voice instruction and transmitting it to a cloud server, obtaining an application feedback result through the cloud server, transmitting the application feedback result to the local end, performing voice conversion by the local end, and outputting a playing result.
2. The method for mobile voice interaction based on intelligent earphones according to claim 1, wherein the step of transmitting the voice command to the local end comprises:
and respectively packaging the voice commands according to time sequence and sending the voice commands to the local end through a Bluetooth protocol.
3. The method of claim 2, wherein the step of obtaining the matching processing signal by preprocessing the voice command at the local end comprises:
establishing a preset semantic instruction library and a processing signal library matched with the preset semantic instruction library;
performing local analysis on the voice instruction to obtain a local semantic result, and judging whether the local semantic result matches a semantic resource in the preset semantic instruction library;
if a matching semantic resource exists, outputting the corresponding processing signal;
and if no matching semantic resource exists, outputting the indirect feedback response signal.
4. The method of claim 3, wherein the step of packaging the voice command, transmitting the voice command to a cloud server, and obtaining an application feedback result through the cloud server comprises:
and transcoding the data packet in the local end, forwarding the transcoded data packet to a cloud server through a wireless protocol, after the data packet is packaged by the cloud server, performing cloud service processing to obtain a processing result, and converting the processing result into an application feedback result.
5. A mobile voice interaction method based on intelligent earphones according to any one of claims 1-4, wherein the steps of transmitting the application feedback result to the local end, performing voice conversion by the local end and outputting the playing result comprise:
and respectively packaging the application feedback results according to a time sequence to obtain feedback voice result data packets, then forwarding the feedback voice result data packets to the local end through a wireless protocol, and after data packet packaging is carried out in the local end, carrying out voice data transcoding, and finally realizing voice conversion and playing.
6. A mobile voice interaction device based on intelligent earphones is characterized by comprising:
a voice acquisition module configured to acquire voice instructions;
the local module is used for receiving the voice instruction and preprocessing the voice instruction to obtain a matched processing signal;
the matching module is used for matching the voice command with the processing signal and judging whether the processing signal is a direct feedback response signal or an indirect feedback response signal, and if the processing signal is the direct feedback response signal, secondary voice feedback is carried out through the local module; and if the indirect feedback response signal is received, packaging the voice command, transmitting the voice command to a cloud server, obtaining an application feedback result through the cloud server, transmitting the application feedback result to a local module, performing voice conversion by the local module, and outputting a playing result.
7. The mobile voice interaction device based on intelligent earphones according to claim 6, wherein the voice collection module comprises:
a voice collecting unit, a sensor unit, a power supply unit and a power amplifier unit, the power supply unit being electrically connected with the charging bin body unit.
8. A smart headset-based mobile voice interaction device according to claim 7, wherein the local module comprises:
the charging bin body unit is used for accommodating the voice acquisition module and charging the voice acquisition module;
a charging bin Bluetooth voice receiving unit configured to receive the voice instruction;
a charging bin communication unit configured to communicate with the cloud server;
a processor unit configured to pre-process the voice instruction to obtain a matched processed signal.
9. An electronic device, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-5.
10. A computer-readable storage medium storing computer instructions, wherein when the computer instructions in the storage medium are read by a computer, the computer performs the method of any one of claims 1 to 5.
CN202110694527.9A (priority date 2021-06-22, filing date 2021-06-22) - Mobile voice interaction method and device based on intelligent earphone - Pending - CN113380251A (en)

Priority Applications (1)

Application number: CN202110694527.9A · Priority date: 2021-06-22 · Filing date: 2021-06-22 · Title: Mobile voice interaction method and device based on intelligent earphone

Applications Claiming Priority (1)

Application number: CN202110694527.9A · Priority date: 2021-06-22 · Filing date: 2021-06-22 · Title: Mobile voice interaction method and device based on intelligent earphone

Publications (1)

Publication number: CN113380251A · Publication date: 2021-09-10

Family

ID: 77578540

Family Applications (1)

Application number: CN202110694527.9A · Title: Mobile voice interaction method and device based on intelligent earphone · Priority date: 2021-06-22 · Filing date: 2021-06-22

Country Status (1)

Country: CN - CN113380251A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708863A (en) * 2011-03-28 2012-10-03 德信互动科技(北京)有限公司 Voice dialogue equipment, system and voice dialogue implementation method
CN107204187A (en) * 2017-05-20 2017-09-26 广州雷豹科技有限公司 A kind of Xiang Shangyun intelligent sounds assistance system
CN108564949A (en) * 2018-05-18 2018-09-21 深圳傲智天下信息科技有限公司 A kind of TWS earphones, Wrist belt-type AI voice interaction devices and system
CN111276135A (en) * 2018-12-03 2020-06-12 华为终端有限公司 Network voice recognition method, network service interaction method and intelligent earphone
CN111862975A (en) * 2020-07-15 2020-10-30 百度在线网络技术(北京)有限公司 Intelligent terminal control method, device, equipment, storage medium and system
CN112511944A (en) * 2020-12-03 2021-03-16 歌尔科技有限公司 Multifunctional earphone charging box

Similar Documents

Publication Publication Date Title
US11202300B2 (en) Method and system for adjusting sound quality, and host terminal
CN109246671A (en) Data transmission method, apparatus and system
US9311920B2 (en) Voice processing method, apparatus, and system
EP2134059B1 (en) Content distributing system, content distributing apparatus, terminal device and content distributing method
CN105264926A (en) Method and system for using wi-fi display transport mechanisms to accomplish voice and data communications
US20180293987A1 (en) Speech recognition method, device and system based on artificial intelligence
CN118102279A (en) Communication processing method, communication processing device and storage medium
CN107274882A (en) Data transmission method and device
CN110971685B (en) Content processing method, content processing device, computer equipment and storage medium
CN110224904B (en) Voice processing method, device, computer readable storage medium and computer equipment
CN113380251A (en) Mobile voice interaction method and device based on intelligent earphone
CN111147582B (en) Voice interaction method and device, computer equipment and storage medium
CN113518297A (en) Sound box interaction method, device and system and sound box
CN111432384B (en) Large-data-volume audio Bluetooth real-time transmission method for equipment with recording function
CN112367654B (en) TWS equipment team forming method and device, electronic equipment and storage medium
CN110034858B (en) Data packet retransmission method and device, mobile terminal and storage medium
CN107018508B (en) Data transmission method and device
WO2024131149A1 (en) Speech rate adjustment method, system and apparatus
CN113098931B (en) Information sharing method and multimedia session terminal
WO2019063019A1 (en) Method and device for generating system information
CN116708551B (en) Proxy internet surfing method and device
CN113747100B (en) Audio and video call method and device, storage medium and electronic equipment
CN114731211B (en) Data transmission method, device, communication equipment and storage medium
CN112188244B (en) Front-end camera real-time video-on-demand method and device and electronic equipment
WO2016197626A1 (en) Control method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination